[ 
https://issues.apache.org/jira/browse/PARQUET-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17609928#comment-17609928
 ] 

ASF GitHub Bot commented on PARQUET-2196:
-----------------------------------------

wgtmac opened a new pull request, #1000:
URL: https://github.com/apache/parquet-mr/pull/1000

   This PR implements the LZ4_RAW codec, which was introduced in parquet format 
v2.9.0. Since there is a lot of common logic between the LZ4_RAW and SNAPPY 
codecs, this patch moves it into NonBlockedCompressor and 
NonBlockedDecompressor and makes the specific codecs extend them.
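
As a rough illustration of this kind of refactoring (not the code in the PR), the sketch below shows a shared base class that owns the input buffering while each codec supplies only its own compression step. All class and method names here are hypothetical, and the two toy codecs (identity and run-length) merely stand in for SNAPPY and LZ4_RAW; they do not use the real libraries.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical base class (not the PR's NonBlockedCompressor):
// it buffers all input, then compresses it in a single call.
abstract class OneShotCompressorSketch {
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    // Shared logic: append each chunk handed to the compressor.
    final void setInput(byte[] chunk, int offset, int length) {
        pending.write(chunk, offset, length);
    }

    // Shared logic: compress everything buffered so far, then reset.
    final byte[] finish() {
        byte[] all = pending.toByteArray();
        pending.reset();
        return compressWhole(all);
    }

    // Codec-specific hook: the only part a concrete codec implements.
    protected abstract byte[] compressWhole(byte[] input);
}

// Toy codec 1 (assumption for illustration): returns the input unchanged.
final class IdentityCodecSketch extends OneShotCompressorSketch {
    @Override
    protected byte[] compressWhole(byte[] input) {
        return input.clone();
    }
}

// Toy codec 2 (assumption for illustration): run-length encoding
// as (count, value) byte pairs, with runs capped at 255.
final class RunLengthCodecSketch extends OneShotCompressorSketch {
    @Override
    protected byte[] compressWhole(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < input.length) {
            int run = 1;
            while (i + run < input.length && run < 255
                    && input[i + run] == input[i]) {
                run++;
            }
            out.write(run);      // run length, 1..255
            out.write(input[i]); // the repeated byte
            i += run;
        }
        return out.toByteArray();
    }
}
```

With this shape, adding another codec means writing one small subclass; the buffering and reset behavior is written and tested once in the base class.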
   
   Added TestLz4RawCodec test to make sure the new codec itself is correct.




> Support LZ4_RAW codec
> ---------------------
>
>                 Key: PARQUET-2196
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2196
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>            Reporter: Gang Wu
>            Priority: Major
>
> There is a long history of LZ4 interoperability problems for parquet files 
> between parquet-mr and parquet-cpp (which is now part of Apache Arrow). 
> The attached links are the evidence. In short, a new LZ4_RAW codec type was 
> introduced in parquet format v2.9.0. However, only parquet-cpp supports 
> LZ4_RAW. The parquet-mr library still uses the old Hadoop-provided LZ4 codec 
> and cannot read parquet files compressed with LZ4_RAW.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
