ggershinsky commented on code in PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#discussion_r1021589628
##########
parquet-hadoop/src/test/java/org/apache/parquet/crypto/TestPropertiesDrivenEncryption.java:
##########
@@ -196,13 +205,13 @@ public static Collection<Object[]> data() {
.append(COLUMN_MASTER_KEY_IDS[5]).append(": ").append(SingleRow.FIXED_LENGTH_BINARY_FIELD_NAME)
.toString();
- private static final int NUM_THREADS = 4;
+ private static final int NUM_THREADS = 1;
Review Comment:
Why remove the multiple threads?
##########
parquet-format-structures/src/main/java/org/apache/parquet/format/BlockCipher.java:
##########
@@ -51,17 +52,26 @@
* @param AAD - Additional Authenticated Data for the decryption (ignored in case of CTR cipher)
* @return plaintext - starts at offset 0 of the output value, and fills up the entire byte array.
*/
- public byte[] decrypt(byte[] lengthAndCiphertext, byte[] AAD);
+ byte[] decrypt(byte[] lengthAndCiphertext, byte[] AAD);
/**
+ * Convenience decryption method that reads the length and ciphertext from a ByteBuffer
+ *
+ * @param from ByteBuffer with length and ciphertext.
+ * @param AAD - Additional Authenticated Data for the decryption (ignored in case of CTR cipher)
+ * @return plaintext - starts at offset 0 of the output, and fills up the entire byte array.
Review Comment:
nit: please update this javadoc for the byte buffer variant (the @return still says "byte array").
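
For reference, a minimal sketch of what the updated javadoc might say. The wording is only a suggestion; the `ByteBuffer decrypt(ByteBuffer, byte[])` signature itself is taken from how the PR calls it in ColumnChunkPageReadStore.

```java
/**
 * Convenience decryption method that reads the length and ciphertext from a ByteBuffer
 *
 * @param from ByteBuffer with length and ciphertext.
 * @param AAD - Additional Authenticated Data for the decryption (ignored in case of CTR cipher)
 * @return plaintext - starts at position 0 of the returned ByteBuffer, up to its limit.
 */
ByteBuffer decrypt(ByteBuffer from, byte[] AAD);
```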
##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageReadStore.java:
##########
@@ -133,11 +135,33 @@ public DataPage readPage() {
public DataPage visit(DataPageV1 dataPageV1) {
try {
BytesInput bytes = dataPageV1.getBytes();
- if (null != blockDecryptor) {
- bytes = BytesInput.from(blockDecryptor.decrypt(bytes.toByteArray(), dataPageAAD));
+ ByteBuffer byteBuffer = bytes.toByteBuffer();
Review Comment:
Can you call this after the "if (options.useOffHeapDecryptBuffer())" check? That way the ByteBuffer is materialized only when the off-heap path is enabled.
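
A minimal sketch of the reordering this asks for, assuming the rest of the branch stays as in the diff (a non-direct buffer would still need to fall back to the on-heap path):

```java
BytesInput bytes = dataPageV1.getBytes();
BytesInput decompressed;

if (options.useOffHeapDecryptBuffer()) {
  // toByteBuffer() is now called only when off-heap decryption is enabled.
  ByteBuffer byteBuffer = bytes.toByteBuffer();
  if (byteBuffer.isDirect()) {
    if (blockDecryptor != null) {
      byteBuffer = blockDecryptor.decrypt(byteBuffer, dataPageAAD);
    }
    // ... rest of the off-heap decompression branch, unchanged from the diff ...
  }
}
```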
##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageReadStore.java:
##########
@@ -133,11 +135,33 @@ public DataPage readPage() {
public DataPage visit(DataPageV1 dataPageV1) {
try {
BytesInput bytes = dataPageV1.getBytes();
- if (null != blockDecryptor) {
- bytes = BytesInput.from(blockDecryptor.decrypt(bytes.toByteArray(), dataPageAAD));
+ ByteBuffer byteBuffer = bytes.toByteBuffer();
+ BytesInput decompressed;
+
+ if (byteBuffer.isDirect() && options.useOffHeapDecryptBuffer()) {
+ if (blockDecryptor != null) {
+ byteBuffer = blockDecryptor.decrypt(byteBuffer, dataPageAAD);
+ }
+ long compressedSize = byteBuffer.limit();
+
+ ByteBuffer decompressedBuffer = ByteBuffer.allocateDirect(dataPageV1.getUncompressedSize());
Review Comment:
This is expensive. Can we use a buffer pool here, or some other means to reduce the allocations?
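
One possible shape for this, sketched under the assumption that a page reader is single-threaded. The `ReusableDirectBuffer` helper below is hypothetical, not something in the PR or in Parquet:

```java
import java.nio.ByteBuffer;

// Hypothetical helper: keeps one direct buffer per reader and grows it only
// when a page needs more capacity than any page seen before.
final class ReusableDirectBuffer {
  private ByteBuffer buffer;

  ByteBuffer get(int capacity) {
    if (buffer == null || buffer.capacity() < capacity) {
      buffer = ByteBuffer.allocateDirect(capacity);
    }
    buffer.clear();         // reset position/limit for reuse
    buffer.limit(capacity); // expose exactly the bytes this page needs
    return buffer;
  }
}
```

The hunk above would then call something like `reusable.get(dataPageV1.getUncompressedSize())` instead of `ByteBuffer.allocateDirect(...)`. If pages can be read concurrently, a real buffer pool (e.g. Hadoop's ElasticByteBufferPool) would be the safer choice.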