[ 
https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinli Shang updated PARQUET-1872:
---------------------------------
    Description: 
As ZSTD becomes more popular, there is a need to convert existing data to ZSTD 
compression, which can achieve a higher compression ratio. It would be useful 
to have a tool that converts a Parquet file directly by just decompressing and 
recompressing each page, without decoding/encoding the values or assembling 
records, because skipping those steps is much faster. Initial results show it 
is ~5 times faster. 



  was:
When ZSTD becomes more popular, there is a need to translate existing data ZSTD 
compressed which can achieve a higher compression ratio. It would be useful if 
we can have a tool to convert a Parquet file directly by just 
decompressing/compressing each page without decoding/encoding or assembling the 
record because it is much faster. The initial result shows it is ~5 times 
faster. 




> Add TransCompression command 
> -----------------------------
>
>                 Key: PARQUET-1872
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1872
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>    Affects Versions: 1.12.0
>            Reporter: Xinli Shang
>            Assignee: Xinli Shang
>            Priority: Major
>
> As ZSTD becomes more popular, there is a need to convert existing data to 
> ZSTD compression, which can achieve a higher compression ratio. It would be 
> useful to have a tool that converts a Parquet file directly by just 
> decompressing and recompressing each page, without decoding/encoding the 
> values or assembling records, because skipping those steps is much faster. 
> Initial results show it is ~5 times faster. 
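
The page-level idea described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions, not the parquet-mr implementation: the `transcompress_pages` helper and the list-of-bytes "pages" are hypothetical, and because ZSTD is not assumed to be available in the Python standard library, `zlib` stands in for the old codec and `lzma` for the new one. A real tool would also need to rewrite page headers and file metadata (compressed sizes, offsets, codec name), which this sketch omits.

```python
import lzma
import zlib
from typing import Callable, List

# Hypothetical stand-in for a column chunk: a list of independently
# compressed page payloads. Real Parquet pages also carry headers and
# footer metadata that a real tool would have to update.
Page = bytes


def transcompress_pages(
    pages: List[Page],
    decompress: Callable[[bytes], bytes],
    compress: Callable[[bytes], bytes],
) -> List[Page]:
    """Recompress each page payload without interpreting its contents.

    The bytes between decompress() and compress() are treated as opaque:
    no values are decoded and no records are assembled.
    """
    return [compress(decompress(page)) for page in pages]


# Stand-in codecs (assumption): zlib plays the existing codec,
# lzma plays the target codec in place of ZSTD.
raw_pages = [bytes([i % 7]) * 4096 for i in range(3)]
old_pages = [zlib.compress(p) for p in raw_pages]
new_pages = transcompress_pages(old_pages, zlib.decompress, lzma.compress)
```

Because the helper only takes a decompress and a compress callable, swapping in a real ZSTD binding would not change its structure; the encoded page bytes pass through untouched either way.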



--
This message was sent by Atlassian Jira
(v8.3.4#803005)