[ 
https://issues.apache.org/jira/browse/SPARK-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17052473#comment-17052473
 ] 

Dongjoon Hyun commented on SPARK-20901:
---------------------------------------

Hi, [~FelixKJose]. The Apache Spark community is trying to provide a seamless
user experience, and this issue aims to track those kinds of differences. Feel
free to link anything else that is relevant. Basically, new features are
developed at different speeds in Apache Parquet and Apache ORC. For example,
the ORC bloom filter is already used in Apache Spark, but the Parquet bloom
filter is not applicable yet. For ZStandard support, it is the opposite. Please
note that Apache Spark also has not yet adopted the latest versions, Apache ORC
1.6.2 and Apache Parquet 1.11.0. The list of differences changes from time to
time, so it is best to evaluate both formats against your own use cases.

> Feature parity for ORC with Parquet
> -----------------------------------
>
>                 Key: SPARK-20901
>                 URL: https://issues.apache.org/jira/browse/SPARK-20901
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: Dongjoon Hyun
>            Priority: Major
>
> This issue aims to track the feature parity for ORC with Parquet.



