[GitHub] [parquet-mr] shangxinli commented on a diff in pull request #1008: PARQUET-2212: Add ByteBuffer api for decryptors to allow direct memory to be decrypted
shangxinli commented on code in PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#discussion_r1065346689

##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageReadStore.java:
##
@@ -133,11 +135,36 @@ public DataPage readPage() {
         public DataPage visit(DataPageV1 dataPageV1) {
           try {
             BytesInput bytes = dataPageV1.getBytes();
-            if (null != blockDecryptor) {
-              bytes = BytesInput.from(blockDecryptor.decrypt(bytes.toByteArray(), dataPageAAD));
+            BytesInput decompressed;
+
+            if (options.getAllocator().isDirect() && options.useOffHeapDecryptBuffer()) {
+              ByteBuffer byteBuffer = bytes.toByteBuffer();
+              if (!byteBuffer.isDirect()) {
+                throw new ParquetDecodingException("Expected a direct buffer");
+              }
+              if (blockDecryptor != null) {
+                byteBuffer = blockDecryptor.decrypt(byteBuffer, dataPageAAD);
+              }
+              long compressedSize = byteBuffer.limit();
+
+              ByteBuffer decompressedBuffer =
+                  options.getAllocator().allocate(dataPageV1.getUncompressedSize());
+              decompressor.decompress(byteBuffer, (int) compressedSize, decompressedBuffer,
+                  dataPageV1.getUncompressedSize());
+
+              // HACKY: sometimes we need to do `flip` because the position of output bytebuffer is

Review Comment:
Ok I see

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
shangxinli commented on code in PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#discussion_r1038821325

##
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageReadStore.java:
##
@@ -133,11 +135,36 @@ public DataPage readPage() {
         public DataPage visit(DataPageV1 dataPageV1) {
           try {
             BytesInput bytes = dataPageV1.getBytes();
-            if (null != blockDecryptor) {
-              bytes = BytesInput.from(blockDecryptor.decrypt(bytes.toByteArray(), dataPageAAD));
+            BytesInput decompressed;
+
+            if (options.getAllocator().isDirect() && options.useOffHeapDecryptBuffer()) {
+              ByteBuffer byteBuffer = bytes.toByteBuffer();
+              if (!byteBuffer.isDirect()) {
+                throw new ParquetDecodingException("Expected a direct buffer");
+              }
+              if (blockDecryptor != null) {
+                byteBuffer = blockDecryptor.decrypt(byteBuffer, dataPageAAD);
+              }
+              long compressedSize = byteBuffer.limit();
+
+              ByteBuffer decompressedBuffer =
+                  options.getAllocator().allocate(dataPageV1.getUncompressedSize());
+              decompressor.decompress(byteBuffer, (int) compressedSize, decompressedBuffer,
+                  dataPageV1.getUncompressedSize());
+
+              // HACKY: sometimes we need to do `flip` because the position of output bytebuffer is

Review Comment:
Do we know in what scenario the position of the output byte buffer is not set?
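As background for the question above, here is a minimal sketch of how `java.nio.ByteBuffer` behaves under two writing styles. It does not reproduce any parquet-mr decompressor: the class `OutputBufferFlipDemo` and the helpers `writeRelative`, `writeAbsolute`, and `prepareForRead` are hypothetical names made up for illustration. The assumption is that some writers fill the output with relative `put` calls (which advance the position, so a `flip` is needed before reading), while others write at absolute indexes or through native code and leave the position at 0 (in which case `flip` would set the limit to 0).

```java
import java.nio.ByteBuffer;

/** Illustrates when flip() is needed on a decompression output buffer. */
final class OutputBufferFlipDemo {

  /** Hypothetical writer using relative put(): the position advances. */
  static void writeRelative(ByteBuffer out, byte[] data) {
    out.put(data);
  }

  /** Hypothetical writer using absolute put(index, b): the position does not move. */
  static void writeAbsolute(ByteBuffer out, byte[] data) {
    for (int i = 0; i < data.length; i++) {
      out.put(i, data[i]);
    }
  }

  /** Makes the first expectedSize bytes readable for either writing style. */
  static ByteBuffer prepareForRead(ByteBuffer out, int expectedSize) {
    if (out.position() != 0) {
      // Relative writer: flip() sets limit = position and position = 0.
      out.flip();
    } else {
      // Absolute writer: position never advanced, so flip() would give limit = 0.
      out.limit(expectedSize);
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] payload = {1, 2, 3, 4};

    ByteBuffer a = ByteBuffer.allocateDirect(16);
    writeRelative(a, payload);
    System.out.println("relative: position=" + a.position() + " limit=" + a.limit());
    System.out.println("readable: " + prepareForRead(a, payload.length).remaining());

    ByteBuffer b = ByteBuffer.allocateDirect(16);
    writeAbsolute(b, payload);
    System.out.println("absolute: position=" + b.position() + " limit=" + b.limit());
    System.out.println("readable: " + prepareForRead(b, payload.length).remaining());
  }
}
```

Checking the position before deciding whether to flip is only one possible way to handle both cases; whether the decompressors used in this PR behave like either writer is not established in this thread.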
shangxinli commented on code in PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#discussion_r1038820534

##
parquet-hadoop/src/main/java/org/apache/parquet/ParquetReadOptions.java:
##
@@ -44,6 +44,8 @@ public class ParquetReadOptions {
   private static final int ALLOCATION_SIZE_DEFAULT = 8388608; // 8MB
   private static final boolean PAGE_VERIFY_CHECKSUM_ENABLED_DEFAULT = false;
   private static final boolean BLOOM_FILTER_ENABLED_DEFAULT = true;
+  // Default to true if JDK 17 or newer.

Review Comment:
I don't quite understand this comment. Where is it set to true?
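For context on what a "default to true if JDK 17 or newer" rule could look like in code, here is a rough sketch. It is not taken from the PR and does not answer where the PR sets the value: the class `OffHeapDecryptDefault` and the methods `featureVersion` and `defaultFor` are hypothetical. It assumes the default is derived from the standard `java.specification.version` system property, which reports `1.8` on JDK 8 and the plain feature number (for example `17`) on JDK 9 and later.

```java
/** Sketch: derive a boolean default from the running JDK's feature version. */
final class OffHeapDecryptDefault {

  /** Parses "1.8" as 8, "11" as 11, "17" as 17. */
  static int featureVersion(String specVersion) {
    String[] parts = specVersion.split("\\.");
    int first = Integer.parseInt(parts[0]);
    // JDK 8 and older report "1.x"; newer JDKs report the feature number directly.
    return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
  }

  /** Hypothetical default: true only on JDK 17 or newer. */
  static boolean defaultFor(String specVersion) {
    return featureVersion(specVersion) >= 17;
  }

  public static void main(String[] args) {
    String running = System.getProperty("java.specification.version");
    System.out.println("JDK " + running + " -> default " + defaultFor(running));
  }
}
```

If the constant in `ParquetReadOptions` is a fixed literal instead, the code comment in the diff would describe an intent that is not implemented at that line, which may be what the review comment is pointing at.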