[ https://issues.apache.org/jira/browse/SPARK-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052473#comment-17052473 ]
Dongjoon Hyun commented on SPARK-20901:
---------------------------------------
Hi, [~FelixKJose]. The Apache Spark community is trying to provide a seamless
user experience, and this issue aims to track these kinds of differences. You
may want to link additional items here. Basically, new features are developed
at different speeds in Apache Parquet and Apache ORC. For example, the ORC
bloom filter is already used in Apache Spark, but the Parquet bloom filter is
not applicable yet. For ZStandard support, it's the opposite. Please note that
Apache Spark also has not adopted the latest versions yet, namely Apache ORC
1.6.2 and Apache Parquet 1.11.0. The set of differences changes from time to
time, so you should evaluate both formats for your use cases.
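
As a rough illustration of the ORC bloom filter point, here is a minimal PySpark
sketch. It assumes pyspark is installed and that the ORC writer options
`orc.bloom.filter.columns` and `orc.bloom.filter.fpp` are passed through by the
Spark ORC data source. The helper names `orc_bloom_filter_options` and
`write_orc_with_bloom_filter` are hypothetical and exist only for this example.

```python
from typing import Dict, Iterable


def orc_bloom_filter_options(columns: Iterable[str], fpp: float = 0.05) -> Dict[str, str]:
    """Build ORC writer options that request bloom filters on the given columns.

    Hypothetical helper; the option keys are ORC writer settings.
    """
    cols = list(columns)
    if not cols:
        raise ValueError("at least one column is required")
    if not 0.0 < fpp < 1.0:
        raise ValueError("fpp must be between 0 and 1 (exclusive)")
    return {
        "orc.bloom.filter.columns": ",".join(cols),  # comma-separated column names
        "orc.bloom.filter.fpp": str(fpp),            # target false-positive probability
    }


def write_orc_with_bloom_filter(df, path: str, columns: Iterable[str]) -> None:
    """Write a Spark DataFrame as ORC with bloom filters (assumes pyspark)."""
    df.write.options(**orc_bloom_filter_options(columns)).orc(path)
```

With an existing DataFrame `df`, a call such as
`write_orc_with_bloom_filter(df, "/tmp/users_orc", ["favorite_color"])` would
request a bloom filter on that column. The path and column name here are
placeholders.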
> Feature parity for ORC with Parquet
> -----------------------------------
>
> Key: SPARK-20901
> URL: https://issues.apache.org/jira/browse/SPARK-20901
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Dongjoon Hyun
> Priority: Major
>
> This issue aims to track the feature parity for ORC with Parquet.