[ https://issues.apache.org/jira/browse/PARQUET-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17690111#comment-17690111 ]
ASF GitHub Bot commented on PARQUET-2247:
-----------------------------------------
wgtmac commented on code in PR #1031:
URL: https://github.com/apache/parquet-mr/pull/1031#discussion_r1109249438
##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ColumnChunkPageWriteStore.java:
##########
@@ -160,7 +160,7 @@ public void writePage(BytesInput bytes,
Encoding valuesEncoding) throws IOException {
pageOrdinal++;
long uncompressedSize = bytes.size();
- if (uncompressedSize > Integer.MAX_VALUE) {
+ if (uncompressedSize > Integer.MAX_VALUE || uncompressedSize < 0) {
Review Comment:
The exception message below should also be updated to reflect the new
negative-size check.
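
For illustration only, a hedged sketch of how the guard and its message might
read once the comment is addressed (the exact wording and exception type kept
by the PR may differ; ParquetEncodingException is what the surrounding writer
code conventionally throws):

    long uncompressedSize = bytes.size();
    if (uncompressedSize > Integer.MAX_VALUE || uncompressedSize < 0) {
      // Message now covers both the oversized and the negative (wrapped) case.
      throw new ParquetEncodingException(
          "Cannot write page larger than Integer.MAX_VALUE or negative bytes: "
              + uncompressedSize);
    }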
> Fail-fast if CapacityByteArrayOutputStream write overflow
> ---------------------------------------------------------
>
> Key: PARQUET-2247
> URL: https://issues.apache.org/jira/browse/PARQUET-2247
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Reporter: dzcxzl
> Priority: Critical
>
> The bytesUsed counter in CapacityByteArrayOutputStream may overflow when
> writing large amounts of byte data, resulting in a corrupted Parquet file.
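
A minimal, self-contained sketch of the failure mode described above
(illustrative only, not actual parquet-mr code; it assumes the byte counter is
an int): the counter silently wraps past Integer.MAX_VALUE, so the size later
observed as a long is negative, which is exactly what the new fail-fast guard
catches:

    public class OverflowSketch {
      public static void main(String[] args) {
        int bytesUsed = Integer.MAX_VALUE - 10; // counter near the int limit
        bytesUsed += 100;                       // silently wraps to a negative value
        long size = bytesUsed;                  // widening after the wrap keeps the bad value
        System.out.println(size);               // prints -2147483559
        // A fail-fast check such as (size > Integer.MAX_VALUE || size < 0)
        // rejects the write instead of corrupting the output file.
      }
    }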