[ https://issues.apache.org/jira/browse/PARQUET-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17690111#comment-17690111 ]
ASF GitHub Bot commented on PARQUET-2247:
-----------------------------------------

wgtmac commented on code in PR #1031:
URL: https://github.com/apache/parquet-mr/pull/1031#discussion_r1109249438

##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageWriteStore.java:
##########
@@ -160,7 +160,7 @@
   public void writePage(BytesInput bytes, Encoding valuesEncoding) throws IOException {
     pageOrdinal++;
     long uncompressedSize = bytes.size();
-    if (uncompressedSize > Integer.MAX_VALUE) {
+    if (uncompressedSize > Integer.MAX_VALUE || uncompressedSize < 0) {

Review Comment:
   The exception message below should also be changed to reflect the negative check.

> Fail-fast if CapacityByteArrayOutputStream write overflow
> ---------------------------------------------------------
>
>           Key: PARQUET-2247
>           URL: https://issues.apache.org/jira/browse/PARQUET-2247
>       Project: Parquet
>    Issue Type: Bug
>    Components: parquet-mr
>      Reporter: dzcxzl
>      Priority: Critical
>
> The bytesUsed counter in CapacityByteArrayOutputStream may overflow when writing
> large amounts of byte data, resulting in a corrupted Parquet file.
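As a quick illustration of the failure mode the quoted issue describes (a standalone sketch using plain JDK arithmetic, not code taken from parquet-mr), an int byte counter such as bytesUsed silently wraps negative once it passes Integer.MAX_VALUE, which is exactly the condition the new `uncompressedSize < 0` guard in the diff above catches:

```java
// Standalone demonstration of int wrap-around; not parquet-mr code.
public class OverflowDemo {
    public static void main(String[] args) {
        int bytesUsed = Integer.MAX_VALUE; // counter already at the int ceiling
        bytesUsed += 1024;                 // one more large write wraps it negative
        System.out.println(bytesUsed);     // prints -2147482625, no error raised
    }
}
```

Because the wrap-around raises no error on its own, a corrupted size can be written out silently, which is why the issue asks for a fail-fast check at write time.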
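And a minimal sketch of the guard together with wgtmac's message suggestion, assuming it throws ParquetEncodingException as other write-path checks in parquet-mr do; the exact wording and the helper name validateUncompressedSize are illustrative assumptions, not the text merged in PR #1031:

```java
import org.apache.parquet.io.ParquetEncodingException;

// Sketch of the guard with a message covering both branches; the wording is an
// illustrative assumption, not the text actually committed in PR #1031.
final class PageSizeCheck {
    static void validateUncompressedSize(long uncompressedSize) {
        // Reject both int overflow and a size that has already wrapped negative
        // upstream (e.g. in CapacityByteArrayOutputStream).
        if (uncompressedSize > Integer.MAX_VALUE || uncompressedSize < 0) {
            throw new ParquetEncodingException(
                "Cannot write page: uncompressed size " + uncompressedSize
                    + " overflows an int or is negative");
        }
    }
}
```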