[jira] [Resolved] (PARQUET-2071) Encryption translation tool

2022-01-14 Thread Xinli Shang (Jira)


 [ 
https://issues.apache.org/jira/browse/PARQUET-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinli Shang resolved PARQUET-2071.
--
Resolution: Fixed

> Encryption translation tool 
> 
>
> Key: PARQUET-2071
> URL: https://issues.apache.org/jira/browse/PARQUET-2071
> Project: Parquet
>  Issue Type: New Feature
>  Components: parquet-mr
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
>
> When translating existing data to encryption state, we could develop a tool 
> like TransCompression to translate the data at page level to encryption state 
> without reading to record and rewrite. This will speed up the process a lot. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (PARQUET-1872) Add TransCompression Feature

2022-01-14 Thread Xinli Shang (Jira)


 [ 
https://issues.apache.org/jira/browse/PARQUET-1872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinli Shang resolved PARQUET-1872.
--
Resolution: Fixed

> Add TransCompression Feature 
> -
>
> Key: PARQUET-1872
> URL: https://issues.apache.org/jira/browse/PARQUET-1872
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-mr
>Affects Versions: 1.12.0
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
>
> When ZSTD becomes more popular, there is a need to translate existing data to 
> ZSTD compressed which can achieve a higher compression ratio. It would be 
> useful if we can have a tool to convert a Parquet file directly by just 
> decompressing/compressing each page without decoding/encoding or assembling 
> the record because it is much faster. The initial result shows it is ~5 times 
> faster. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (PARQUET-2105) Refactor the test code of creating the test file

2022-01-14 Thread Xinli Shang (Jira)


 [ 
https://issues.apache.org/jira/browse/PARQUET-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xinli Shang resolved PARQUET-2105.
--
Resolution: Fixed

> Refactor the test code of creating the test file 
> -
>
> Key: PARQUET-2105
> URL: https://issues.apache.org/jira/browse/PARQUET-2105
> Project: Parquet
>  Issue Type: Improvement
>  Components: parquet-mr
>Reporter: Xinli Shang
>Assignee: Xinli Shang
>Priority: Major
>
> In the tests, there are many places that need to create a test parquet file 
> with different settings. Currently, each test file just creates its own code. 
> It would be better to have a test file builder to create that. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)