[ 
https://issues.apache.org/jira/browse/PARQUET-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128100#comment-17128100
 ] 

Ben Watson commented on PARQUET-1870:
-------------------------------------

Thanks for the useful information, Gabor. I'm not sure I want to add as much 
code as there is in that UUID PR - that's quite a lot of code for an 
unsupported, deprecated field. 

I will take a look at other libraries, but I like parquet-avro because it lets 
me share code across my Avro and Parquet implementations, and deals with the 
JSON conversion nicely. I've had trouble finding an implementation that doesn't 
rely heavily on Hadoop Path objects or Spark, both of which cause a lot of 
problems within my plugin.

I have enabled INT96 support in my plugin by compiling parquet-avro myself with 
the one change I mentioned above, and I'm happy to stick to that solution going 
forwards. Thanks again.

> Handle INT96 more gracefully in parquet-avro
> --------------------------------------------
>
>                 Key: PARQUET-1870
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1870
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-avro
>    Affects Versions: 1.11.0
>            Reporter: Ben Watson
>            Priority: Minor
>
> The parquet-avro library does not support INT96 columns (PARQUET-323), and 
> any attempt to process a file containing such a column results in:
> {code:java}
> throw new IllegalArgumentException("INT96 not implemented and is deprecated");
> {code}
> INT96 is still used in many legacy datasets, and so it would be useful to be 
> able to process Parquet files containing these records, even if the INT96 
> values themselves aren't rendered.
> The same functionality has already been re-added into parquet-pig 
> (PARQUET-1133).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)