[ https://issues.apache.org/jira/browse/PARQUET-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402670#comment-17402670 ]
Xinli Shang edited comment on PARQUET-2071 at 8/21/21, 5:40 PM:
----------------------------------------------------------------

I have drafted the tool and asked [~gershinsky] to take an early look (thanks, Gidon!). It is working now, and I compared it against a regular tool: a simple tool I wrote that reads each record and writes it back immediately. The code example is in the [doc|https://docs.google.com/document/d/1-XdE8-QyDHnBsYrClwNsR8X3ks0JmKJ1-rXq7_th0hc/edit]. The result is promising: the new tool is 20X faster than the regular tool.

[~gszadovszky] Are you open to merging this tool first and then refactoring the existing similar tools into one universal tool? If so, I will open a PR shortly.


was (Author: sha...@uber.com):
I have drafted the tool and asked [~gershinsky] to take an early look (thanks, Gidon!). It is working now, and I compared it against a regular tool: a simple tool I wrote that reads each record and writes it back immediately. The result is promising: the new tool is 20X faster than the regular tool.

[~gszadovszky] Are you open to merging this tool first and then refactoring the existing similar tools into one universal tool? If so, I will open a PR shortly.

> Encryption translation tool
> ----------------------------
>
>                 Key: PARQUET-2071
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2071
>             Project: Parquet
>          Issue Type: New Feature
>          Components: parquet-mr
>            Reporter: Xinli Shang
>            Assignee: Xinli Shang
>            Priority: Major
>
> When translating existing data to an encrypted state, we could develop a tool
> like TransCompression that translates the data at the page level, without
> reading the data into records and rewriting them. This will speed up the
> process a lot.
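
To make the page-level idea from the issue description concrete, here is a minimal Java sketch. It is not the tool discussed in this issue and it does not use the parquet-mr API. Pages are modeled as opaque byte arrays, and the names (PageLevelTranslationSketch, encryptPage, decryptPage, translate) and the nonce-then-ciphertext layout are assumptions made purely for illustration. The JDK's AES-GCM cipher stands in for the encryption step. What it shows is that each page is encrypted as-is, with no decoding into records.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical illustration only; not the parquet-mr API. */
class PageLevelTranslationSketch {
    private static final int NONCE_LEN = 12;   // bytes, a common GCM nonce size
    private static final int TAG_BITS = 128;   // GCM authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Encrypts one opaque page; output layout (assumed) is nonce || ciphertext+tag. */
    public static byte[] encryptPage(byte[] key, byte[] page) {
        try {
            byte[] nonce = new byte[NONCE_LEN];
            RANDOM.nextBytes(nonce);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, nonce));
            byte[] ciphertext = cipher.doFinal(page);
            byte[] out = new byte[NONCE_LEN + ciphertext.length];
            System.arraycopy(nonce, 0, out, 0, NONCE_LEN);
            System.arraycopy(ciphertext, 0, out, NONCE_LEN, ciphertext.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encryptPage; used here only to check the round trip. */
    public static byte[] decryptPage(byte[] key, byte[] encrypted) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                    new GCMParameterSpec(TAG_BITS, encrypted, 0, NONCE_LEN));
            return cipher.doFinal(encrypted, NONCE_LEN, encrypted.length - NONCE_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Translates pages one by one without ever interpreting their contents. */
    public static List<byte[]> translate(byte[] key, List<byte[]> pages) {
        List<byte[]> encrypted = new ArrayList<>();
        for (byte[] page : pages) {
            encrypted.add(encryptPage(key, page));
        }
        return encrypted;
    }

    public static void main(String[] args) {
        byte[] key = new byte[16]; // all-zero demo key; never do this with real data
        List<byte[]> pages = Arrays.asList(
                "page-0 bytes".getBytes(StandardCharsets.UTF_8),
                "page-1 bytes".getBytes(StandardCharsets.UTF_8));
        List<byte[]> encrypted = translate(key, pages);
        for (int i = 0; i < pages.size(); i++) {
            boolean ok = Arrays.equals(pages.get(i), decryptPage(key, encrypted.get(i)));
            System.out.println("page " + i + " round trip ok: " + ok);
        }
    }
}
```

A record-level rewrite would instead decode every page into records and re-encode them on the way out. Skipping that work is where a page-level approach would be expected to save time.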