[ https://issues.apache.org/jira/browse/PARQUET-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419831#comment-16419831 ]
ASF GitHub Bot commented on PARQUET-1143:
-----------------------------------------
scottcarey commented on issue #430: PARQUET-1143: Update to Parquet format 2.4.0.
URL: https://github.com/apache/parquet-mr/pull/430#issuecomment-377383897
FWIW, I built master locally under a custom version number, overrode the
Parquet dependency in my Spark projects, and still could not write zstandard
Parquet files. spark-sql's `ParquetOptions` class intercepts the config strings
and maps them to a `CompressionCodecName` in parquet-hadoop itself, rather than
just delegating the name lookup to parquet-hadoop.
This coupling means that using the new codecs from Spark will require a new
version of spark-sql. Honestly, the code here (parquet-mr) should be responsible
for converting a simple name to a codec, not Spark. Then one could upgrade only
the Parquet version and gain access to new compression codecs without
recompiling or re-releasing Spark.
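The coupling described above can be illustrated with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in: `LibraryCodec` plays the role of the codec enum owned by the library, `CALLER_TABLE` plays the role of a name table hard-coded in the calling project, and none of it is the actual parquet-hadoop or spark-sql code.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class CodecLookupSketch {

    // Hypothetical stand-in for the library-side codec enum. Assume a newer
    // library release added ZSTD without the caller being rebuilt.
    enum LibraryCodec {
        UNCOMPRESSED, SNAPPY, GZIP, ZSTD;

        // Library-owned lookup: turns a plain config string into a codec.
        static LibraryCodec fromName(String name) {
            if (name == null) {
                return UNCOMPRESSED;
            }
            return valueOf(name.trim().toUpperCase(Locale.ROOT));
        }
    }

    // Coupled approach (hypothetical): the caller keeps its own table of
    // accepted names, frozen at the time the caller was compiled.
    static final Map<String, LibraryCodec> CALLER_TABLE = new HashMap<>();
    static {
        CALLER_TABLE.put("none", LibraryCodec.UNCOMPRESSED);
        CALLER_TABLE.put("snappy", LibraryCodec.SNAPPY);
        CALLER_TABLE.put("gzip", LibraryCodec.GZIP);
    }

    // Rejects any name the caller did not know about, even when the
    // library on the classpath already supports it.
    static LibraryCodec resolveWithCallerTable(String name) {
        LibraryCodec codec = CALLER_TABLE.get(name.toLowerCase(Locale.ROOT));
        if (codec == null) {
            throw new IllegalArgumentException("Codec [" + name + "] is not available");
        }
        return codec;
    }

    // Decoupled approach: the caller passes the string through and lets the
    // library decide, so new codecs work after a library upgrade alone.
    static LibraryCodec resolveByDelegation(String name) {
        return LibraryCodec.fromName(name);
    }

    public static void main(String[] args) {
        System.out.println("delegated lookup: " + resolveByDelegation("zstd"));
        try {
            resolveWithCallerTable("zstd");
        } catch (IllegalArgumentException e) {
            System.out.println("caller table rejected it: " + e.getMessage());
        }
    }
}
```

In this sketch the delegated lookup accepts `zstd` as soon as the library enum contains it, while the caller-side table keeps rejecting the name until the caller itself is changed and re-released.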
> Update Java for format 2.4.0 changes
> ------------------------------------
>
> Key: PARQUET-1143
> URL: https://issues.apache.org/jira/browse/PARQUET-1143
> Project: Parquet
> Issue Type: Task
> Components: parquet-mr
> Affects Versions: 1.9.0, 1.8.2
> Reporter: Ryan Blue
> Assignee: Ryan Blue
> Priority: Major
>