[ 
https://issues.apache.org/jira/browse/PARQUET-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17112247#comment-17112247
 ] 

ASF GitHub Bot commented on PARQUET-1229:
-----------------------------------------

ggershinsky commented on a change in pull request #776:
URL: https://github.com/apache/parquet-mr/pull/776#discussion_r427843291



##########
File path: parquet-hadoop/src/main/java/org/apache/parquet/crypto/AesCipher.java
##########
@@ -68,19 +67,32 @@
 
   public static byte[] createModuleAAD(byte[] fileAAD, ModuleType moduleType, 
       short rowGroupOrdinal, short columnOrdinal, short pageOrdinal) {

Review comment:
       Also - like with the page numbers in the previous comment, having too 
many row groups will adversely affect encryption performance. There are 
per-rowgroup encryption operations, always performed on small  buffers - 
therefore, very slow (no hardware acceleration, etc). Having dozens/hundreds of 
thousands (or more) of them will significantly affect the overall encryption 
time of a file. Moreover, having lots of row groups might lead to having 
smaller data pages, which decreases the performance further.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> parquet-mr code changes for encryption support
> ----------------------------------------------
>
>                 Key: PARQUET-1229
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1229
>             Project: Parquet
>          Issue Type: Sub-task
>          Components: parquet-mr
>            Reporter: Gidon Gershinsky
>            Assignee: Gidon Gershinsky
>            Priority: Major
>              Labels: pull-request-available
>
> Addition of encryption/decryption support to the existing Parquet classes and 
> APIs



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to