[ 
https://issues.apache.org/jira/browse/PARQUET-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419833#comment-16419833
 ] 

ASF GitHub Bot commented on PARQUET-1143:
-----------------------------------------

scottcarey commented on issue #430: PARQUET-1143: Update to Parquet format 
2.4.0.
URL: https://github.com/apache/parquet-mr/pull/430#issuecomment-377383897
 
 
   FWIW, I tested the current master code by overriding the Parquet version in my 
Spark projects.  I could not write zstandard Parquet files, because spark-sql's 
`ParquetOptions` class intercepts the compression config string and maps it to a 
`CompressionCodecName` from parquet-hadoop itself, rather than delegating the name 
lookup to parquet-hadoop.  As a result, it does not recognize the string 'zstd'.
   
   This coupling means that using the new codecs from Spark will require a new 
version of spark-sql.  Honestly, the code here in parquet-mr should be responsible 
for converting a simple name to the codec, not Spark.  Then one could upgrade only 
the Parquet version and gain access to new compression codecs without recompiling 
or releasing Spark.
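
   To make the coupling concrete, here is a minimal Java sketch of the two 
approaches. It does not use the real Spark or Parquet classes: the `Codec` enum, 
`callerSideLookup`, and `libraryLookup` are hypothetical stand-ins, and the 
contents of the caller-side table are an assumption chosen for illustration.

```java
import java.util.Locale;
import java.util.Map;

final class CodecLookupSketch {

    // Hypothetical stand-in for the library's codec enum.
    // Assume ZSTD was added in a newer library release.
    enum Codec { UNCOMPRESSED, SNAPPY, GZIP, LZO, ZSTD }

    // Caller-side table (the role ParquetOptions plays in the comment above).
    // It is fixed when the caller is compiled, so it knows nothing about
    // codecs the library gained afterwards.
    private static final Map<String, Codec> CALLER_TABLE = Map.of(
            "none", Codec.UNCOMPRESSED,
            "uncompressed", Codec.UNCOMPRESSED,
            "snappy", Codec.SNAPPY,
            "gzip", Codec.GZIP,
            "lzo", Codec.LZO);

    // Coupled approach: the caller resolves the name with its own table.
    static Codec callerSideLookup(String name) {
        Codec codec = CALLER_TABLE.get(name.toLowerCase(Locale.ROOT));
        if (codec == null) {
            throw new IllegalArgumentException("Unknown codec: " + name);
        }
        return codec;
    }

    // Delegating approach: the library resolves the name against its own enum,
    // so a new enum constant is usable without changing any caller.
    static Codec libraryLookup(String name) {
        if (name == null) {
            return Codec.UNCOMPRESSED;
        }
        return Codec.valueOf(name.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println("library lookup: " + libraryLookup("zstd"));
        try {
            callerSideLookup("zstd");
        } catch (IllegalArgumentException e) {
            System.out.println("caller lookup failed: " + e.getMessage());
        }
    }
}
```

   With the delegating version, adding a constant to the enum is enough for 
callers to pick it up by name; with the caller-side table, every caller has to 
be updated and released as well.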

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Update Java for format 2.4.0 changes
> ------------------------------------
>
>                 Key: PARQUET-1143
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1143
>             Project: Parquet
>          Issue Type: Task
>          Components: parquet-mr
>    Affects Versions: 1.9.0, 1.8.2
>            Reporter: Ryan Blue
>            Assignee: Ryan Blue
>            Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
