[
https://issues.apache.org/jira/browse/PARQUET-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402670#comment-17402670
]
Xinli Shang commented on PARQUET-2071:
--------------------------------------
I just drafted the tool and asked [~gershinsky] to take an early look (thanks,
Gidon!). It is working now, and I compared it with a regular tool (I simply
wrote a tool that reads each record and writes it back immediately). The
result is promising: it is 20X faster than the regular tool.
[~gszadovszky] Are you open to merging the tool first and then refactoring
all the existing similar tools into the universal tool? If yes, I will open a
PR shortly.
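
To illustrate why the page-level approach can be so much faster, here is a toy
sketch of the two strategies compared above. It is not the parquet-mr code:
the JSON-plus-zlib "page" layout, the function names, and the XOR placeholder
cipher (standing in for real encryption such as AES-GCM) are all assumptions
made only for illustration. The point is that the regular tool decompresses,
decodes, re-encodes, and re-compresses every record, while the translation
tool only encrypts the already-compressed page bytes.

```python
import hashlib
import json
import zlib


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    """Placeholder cipher: XOR with a SHA-256 keystream.

    NOT secure. It only stands in for a real cipher so the sketch
    stays within the standard library. XOR is its own inverse, so the
    same function also decrypts.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))


def make_page(records):
    """Encode and compress a batch of records into one hypothetical 'page'."""
    return zlib.compress(json.dumps(records).encode("utf-8"))


def record_level_rewrite(pages, key):
    """Regular tool: decompress, decode to records, re-encode, re-compress, encrypt."""
    out = []
    for page in pages:
        records = json.loads(zlib.decompress(page))
        out.append(toy_encrypt(make_page(records), key))
    return out


def page_level_translate(pages, key):
    """Translation tool: encrypt the compressed page bytes as they are."""
    return [toy_encrypt(page, key) for page in pages]


def read_back(encrypted_pages, key):
    """Decrypt, decompress, and decode pages back into records."""
    return [json.loads(zlib.decompress(toy_encrypt(p, key))) for p in encrypted_pages]


if __name__ == "__main__":
    key = b"hypothetical-key"
    batches = [[{"id": i, "name": f"row-{i}"} for i in range(start, start + 100)]
               for start in (0, 100, 200)]
    pages = [make_page(batch) for batch in batches]
    # Both strategies must yield the same records once decrypted.
    assert read_back(page_level_translate(pages, key), key) == batches
    assert read_back(record_level_rewrite(pages, key), key) == batches
```

Both functions produce output that decrypts to the same records; the
page-level version simply skips the decode and re-encode work for every
record, which is where the time savings would be expected to come from.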
> Encryption translation tool
> ----------------------------
>
> Key: PARQUET-2071
> URL: https://issues.apache.org/jira/browse/PARQUET-2071
> Project: Parquet
> Issue Type: New Feature
> Components: parquet-mr
> Reporter: Xinli Shang
> Assignee: Xinli Shang
> Priority: Major
>
> When translating existing data to an encrypted state, we could develop a tool
> like TransCompression that translates the data at the page level, without
> decoding pages into records and rewriting them. This would speed up the
> process a lot.