[ 
https://issues.apache.org/jira/browse/PARQUET-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17402670#comment-17402670
 ] 

Xinli Shang edited comment on PARQUET-2071 at 8/21/21, 5:40 PM:
----------------------------------------------------------------

I just drafted the tool and asked [~gershinsky] to take an early look (thanks, 
Gidon!). It is working now, and I compared it with a regular tool (I simply 
wrote a tool that reads each record and writes it back immediately; the code 
example is in the 
[doc|https://docs.google.com/document/d/1-XdE8-QyDHnBsYrClwNsR8X3ks0JmKJ1-rXq7_th0hc/edit]). 
The result is promising: the new tool is 20X faster than the regular tool. 
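
For illustration only, the page-level idea can be sketched with a toy model in 
which a page is an opaque byte array that is encrypted as-is, without being 
decoded into records. This is not the drafted tool and not the parquet-mr API: 
the class and method names, the nonce-then-ciphertext layout, and the AAD 
string are all assumptions of the sketch, and AES-GCM from the JDK is used 
simply because it is readily available.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Toy model: a "page" is an opaque byte[] (already encoded and compressed).
// All names here are hypothetical; nothing is taken from parquet-mr.
final class PageLevelEncryptionSketch {
    private static final int NONCE_LEN = 12; // nonce length in bytes
    private static final int TAG_BITS = 128; // GCM authentication tag length

    // Builds an AES-GCM cipher whose nonce is the first NONCE_LEN bytes of buf.
    private static Cipher initCipher(int mode, byte[] key, byte[] buf, byte[] aad)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"),
                new GCMParameterSpec(TAG_BITS, buf, 0, NONCE_LEN));
        cipher.updateAAD(aad);
        return cipher;
    }

    // Encrypts the page bytes directly; the page is never decoded into records.
    // Output layout (an assumption of this sketch): nonce || ciphertext+tag.
    static byte[] encryptPage(byte[] page, byte[] key, byte[] aad) {
        try {
            byte[] nonce = new byte[NONCE_LEN];
            new SecureRandom().nextBytes(nonce);
            byte[] body = initCipher(Cipher.ENCRYPT_MODE, key, nonce, aad).doFinal(page);
            byte[] out = Arrays.copyOf(nonce, NONCE_LEN + body.length);
            System.arraycopy(body, 0, out, NONCE_LEN, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("page encryption failed", e);
        }
    }

    // Reverses encryptPage; fails if the key, the AAD, or the bytes do not match.
    static byte[] decryptPage(byte[] encrypted, byte[] key, byte[] aad) {
        try {
            return initCipher(Cipher.DECRYPT_MODE, key, encrypted, aad)
                    .doFinal(encrypted, NONCE_LEN, encrypted.length - NONCE_LEN);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("page decryption failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16]; // all-zero demo key; never use this for real data
        byte[] page = "encoded+compressed page bytes".getBytes(StandardCharsets.UTF_8);
        byte[] aad = "column-0/page-0".getBytes(StandardCharsets.UTF_8);

        byte[] encrypted = encryptPage(page, key, aad);
        byte[] decrypted = decryptPage(encrypted, key, aad);
        System.out.println("round trip ok: " + Arrays.equals(page, decrypted));
    }
}
```

A record-level rewrite would instead decompress and decode every page, 
materialize each record, and then re-encode and re-compress it before 
encrypting, which is the extra work the page-level approach avoids.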

[~gszadovszky] Are you open to merging the tool first and then refactoring 
the existing similar tools into one universal tool? If yes, I will open a PR 
shortly. 


> Encryption translation tool 
> ----------------------------
>
>                 Key: PARQUET-2071
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2071
>             Project: Parquet
>          Issue Type: New Feature
>          Components: parquet-mr
>            Reporter: Xinli Shang
>            Assignee: Xinli Shang
>            Priority: Major
>
> To convert existing data to the encrypted state, we could develop a tool 
> like TransCompression that encrypts the data at the page level, without 
> decoding it into records and rewriting them. This would speed up the process 
> considerably. 



