[ https://issues.apache.org/jira/browse/PARQUET-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393788#comment-17393788 ]
Gabor Szadovszky commented on PARQUET-2071:
-------------------------------------------
I think it is a great idea to skip unnecessary deserialization/serialization
steps in such cases. We already have some tools that take a similar approach,
such as trans-compression and column pruning. How about implementing a more
universal tool where you can configure the projection schema and the
configuration of the target file? The tool could then decide which level of
deserialization/serialization is required. For example, for trans-compression
you need to decompress the pages, while for encryption you don't. What do you
think?
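
To make the idea concrete, here is a minimal sketch of how such a tool might
pick the processing level from the target configuration. Everything in it is
hypothetical: RewritePlanner, TargetConfig, Level and the rewriteValues flag
are illustrative names, not existing parquet-mr classes, and the rule that
column pruning needs no page unpacking is an assumption.

```java
/** Hypothetical sketch; none of these names exist in parquet-mr. */
final class RewritePlanner {

    /** How far a page must be unpacked before it can be written to the target file. */
    enum Level {
        COPY_PAGES,        // page bytes are reused as they are
        DECOMPRESS_PAGES,  // pages are decompressed and recompressed
        DECODE_RECORDS     // full record-level read and rewrite
    }

    /** Requested properties of the target file (illustrative fields only). */
    static final class TargetConfig {
        final String sourceCodec;
        final String targetCodec;
        final boolean encrypt;        // encrypt the target file
        final boolean pruneColumns;   // projection schema drops some columns
        final boolean rewriteValues;  // assumed flag: some change needs decoded values

        TargetConfig(String sourceCodec, String targetCodec, boolean encrypt,
                     boolean pruneColumns, boolean rewriteValues) {
            this.sourceCodec = sourceCodec;
            this.targetCodec = targetCodec;
            this.encrypt = encrypt;
            this.pruneColumns = pruneColumns;
            this.rewriteValues = rewriteValues;
        }
    }

    /** Returns the cheapest level that satisfies every requested change. */
    static Level requiredLevel(TargetConfig cfg) {
        if (cfg.rewriteValues) {
            return Level.DECODE_RECORDS;
        }
        if (!cfg.sourceCodec.equalsIgnoreCase(cfg.targetCodec)) {
            // Trans-compression: pages must be decompressed first.
            return Level.DECOMPRESS_PAGES;
        }
        // Encryption and (by assumption) column pruning do not raise the level,
        // so cfg.encrypt and cfg.pruneColumns are intentionally not consulted.
        return Level.COPY_PAGES;
    }

    public static void main(String[] args) {
        TargetConfig encryptOnly = new TargetConfig("SNAPPY", "SNAPPY", true, false, false);
        TargetConfig transCompress = new TargetConfig("SNAPPY", "ZSTD", false, false, false);
        System.out.println(requiredLevel(encryptOnly));
        System.out.println(requiredLevel(transCompress));
    }
}
```

A real tool would of course also have to carry out the work at the chosen
level; the sketch only covers the decision step.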
> Encryption translation tool
> ----------------------------
>
> Key: PARQUET-2071
> URL: https://issues.apache.org/jira/browse/PARQUET-2071
> Project: Parquet
> Issue Type: New Feature
> Components: parquet-mr
> Reporter: Xinli Shang
> Assignee: Xinli Shang
> Priority: Major
>
> When converting existing data to an encrypted state, we could develop a tool
> like TransCompression that translates the data at the page level, without
> decoding it into records and rewriting them. This will speed up the process
> a lot.