[ https://issues.apache.org/jira/browse/PARQUET-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17690111#comment-17690111 ]

ASF GitHub Bot commented on PARQUET-2247:
-----------------------------------------

wgtmac commented on code in PR #1031:
URL: https://github.com/apache/parquet-mr/pull/1031#discussion_r1109249438


##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageWriteStore.java:
##########
@@ -160,7 +160,7 @@ public void writePage(BytesInput bytes,
                           Encoding valuesEncoding) throws IOException {
       pageOrdinal++;
       long uncompressedSize = bytes.size();
-      if (uncompressedSize > Integer.MAX_VALUE) {
+      if (uncompressedSize > Integer.MAX_VALUE || uncompressedSize < 0) {

Review Comment:
   The exception message below should also be updated to reflect the new
negative-size check.
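
   For illustration, the updated guard and message might look like the sketch
below (only a plausible shape; the exact wording adopted in the PR may
differ):

       // Sketch only: the message text here is an assumption, not the
       // PR's final wording.
       if (uncompressedSize > Integer.MAX_VALUE || uncompressedSize < 0) {
         throw new ParquetEncodingException(
             "Cannot write page larger than Integer.MAX_VALUE or negative bytes: "
                 + uncompressedSize);
       }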





> Fail-fast if CapacityByteArrayOutputStream write overflow
> ---------------------------------------------------------
>
>                 Key: PARQUET-2247
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2247
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>            Reporter: dzcxzl
>            Priority: Critical
>
> The bytesUsed counter of CapacityByteArrayOutputStream may overflow when
> writing large amounts of byte data, resulting in a corrupted Parquet file.
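
As a minimal, self-contained illustration of that overflow mode (class and
variable names are hypothetical, not the actual CapacityByteArrayOutputStream
internals):

    // Hypothetical demo: a 32-bit counter wraps to a negative value once it
    // passes Integer.MAX_VALUE, which is why a negative-size check catches
    // the overflow downstream.
    public class OverflowDemo {
      public static void main(String[] args) {
        int bytesUsed = Integer.MAX_VALUE;  // counter near its limit
        bytesUsed += 1024;                  // wraps around: now negative
        System.out.println(bytesUsed);      // prints -2147482625
      }
    }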



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
