ggershinsky commented on a change in pull request #776:
URL: https://github.com/apache/parquet-mr/pull/776#discussion_r427843291
##########
File path: parquet-hadoop/src/main/java/org/apache/parquet/crypto/AesCipher.java
##########
@@ -68,19 +67,32 @@
public static byte[] createModuleAAD(byte[] fileAAD, ModuleType moduleType,
short rowGroupOrdinal, short columnOrdinal, short pageOrdinal) {
Review comment:
Also, as with the page numbers in the previous comment, having too
many row groups will adversely affect encryption performance. Some
encryption operations are performed once per row group, always on small
buffers, and are therefore very slow (no hardware acceleration, etc.). Having
tens or hundreds of thousands (or more) of row groups will significantly
increase the overall encryption time of a file. Moreover, having many row
groups might lead to smaller data pages, which degrades performance further.
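
To make the small-buffer cost concrete, here is a rough standalone sketch (not the AesCipher code in this PR). It assumes AES-GCM with a fresh 12-byte nonce and a 16-byte tag per encrypted buffer; the class name, the method names, the all-zero demo key and the 2-byte ordinal AAD are hypothetical. It encrypts the same payload as one buffer and as many small ones, so the fixed per-call work (cipher init, nonce generation, tag computation) can be compared.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class SmallBufferCost {
  static final int NONCE_LENGTH = 12;      // bytes, assumed per-buffer nonce size
  static final int TAG_LENGTH_BITS = 128;  // 16-byte GCM tag
  static final SecureRandom RANDOM = new SecureRandom();

  // Encrypts one buffer with AES-GCM and returns nonce || ciphertext || tag.
  static byte[] encryptModule(byte[] key, byte[] plaintext, byte[] aad) {
    try {
      byte[] nonce = new byte[NONCE_LENGTH];
      RANDOM.nextBytes(nonce);
      Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
      cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
          new GCMParameterSpec(TAG_LENGTH_BITS, nonce));
      cipher.updateAAD(aad);
      byte[] ciphertext = cipher.doFinal(plaintext); // includes the tag
      byte[] out = new byte[NONCE_LENGTH + ciphertext.length];
      System.arraycopy(nonce, 0, out, 0, NONCE_LENGTH);
      System.arraycopy(ciphertext, 0, out, NONCE_LENGTH, ciphertext.length);
      return out;
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  // Splits totalBytes into `modules` equal buffers, encrypts each one
  // separately, and returns the total number of output bytes.
  static long totalEncryptedBytes(byte[] key, int totalBytes, int modules) {
    int chunk = totalBytes / modules;
    long total = 0;
    for (int i = 0; i < modules; i++) {
      // Hypothetical AAD: just a 2-byte ordinal of the buffer.
      byte[] aad = new byte[] {(byte) (i >> 8), (byte) i};
      total += encryptModule(key, new byte[chunk], aad).length;
    }
    return total;
  }

  public static void main(String[] args) {
    byte[] key = new byte[16]; // all-zero demo key, never use in practice
    int totalBytes = 1 << 20;
    for (int modules : new int[] {1, 1024, 16384}) {
      long start = System.nanoTime();
      long size = totalEncryptedBytes(key, totalBytes, modules);
      long micros = (System.nanoTime() - start) / 1000;
      System.out.println(modules + " buffers: " + size + " bytes, " + micros + " us");
    }
  }
}
```

Timings depend on the JVM, warm-up and hardware, so treat them only as an indication. The size overhead is deterministic under the assumptions above: 28 bytes (nonce plus tag) per encrypted buffer.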