eadwright opened a new pull request #902: URL: https://github.com/apache/parquet-mr/pull/902
This PR addresses https://issues.apache.org/jira/browse/PARQUET-1633.

I have not added unit tests: checking the overflow conditions would require test data over 2GB in size (on disk, compressed), which is considerably larger in memory and would therefore need significant CI resources.

The issue was the use of an `int` for a length field. For parquet files with a very large `row_group_size` (row groups over 2GB), this caused a silent integer overflow, which showed up as a negative length and an attempt to create an ArrayList with a negative length.
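For readers who want to see the failure mode in isolation, here is a minimal sketch of how a byte count above 2GB wraps to a negative `int` and is then rejected by `ArrayList`. It is not taken from parquet-mr: the class `LengthOverflowSketch` and the helpers `truncatedLength` and `checkedLength` are hypothetical names used only for illustration, and the 3 GiB figure is an assumed example size.

```java
import java.util.ArrayList;

// Illustrative only: hypothetical class and helpers, not parquet-mr code.
class LengthOverflowSketch {

    // Narrows a 64-bit byte count to int, as an int-typed length field would.
    // Values above Integer.MAX_VALUE silently wrap around.
    public static int truncatedLength(long byteCount) {
        return (int) byteCount;
    }

    // Alternative that fails loudly: Math.toIntExact throws
    // ArithmeticException when the value does not fit in an int.
    public static int checkedLength(long byteCount) {
        return Math.toIntExact(byteCount);
    }

    public static void main(String[] args) {
        // Assumed example size: 3 GiB, which exceeds Integer.MAX_VALUE.
        long rowGroupBytes = 3L * 1024 * 1024 * 1024;

        int length = truncatedLength(rowGroupBytes);
        System.out.println("long value: " + rowGroupBytes);
        System.out.println("as int:     " + length); // negative after wrap-around

        try {
            // A negative initial capacity is rejected by ArrayList.
            new ArrayList<Byte>(length);
        } catch (IllegalArgumentException e) {
            System.out.println("ArrayList rejected it: " + e.getMessage());
        }

        try {
            checkedLength(rowGroupBytes);
        } catch (ArithmeticException e) {
            System.out.println("checked conversion failed: " + e.getMessage());
        }
    }
}
```

Keeping the value in a `long`, or converting with a checked call such as `Math.toIntExact`, turns the silent wrap-around into either a correct value or an explicit error.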
