Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5021: Fix count(*) remaining rows overflow in Parquet.
......................................................................
IMPALA-5021: Fix count(*) remaining rows overflow in Parquet.

Zero-slot scans of Parquet files that have num_rows > MAX_INT32
in the footer metadata used to run forever due to an overflow when
calculating the remaining number of rows to process.

Testing:
- Added a regression test using a file with num_rows = 2*MAX_INT32.
- Locally ran test_scanners.py which succeeded.
- Private core/hdfs run succeeded

Change-Id: Ib9f8a6b83f8f621451d5977423ef81a6e4b124bd
Reviewed-on: http://gerrit.cloudera.org:8080/6286
Reviewed-by: Alex Behm <[email protected]>
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-parquet-scanner.cc
M testdata/data/README
A testdata/data/huge_num_rows.parquet
M tests/query_test/test_scanners.py
4 files changed, 22 insertions(+), 2 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/6286
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib9f8a6b83f8f621451d5977423ef81a6e4b124bd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong <[email protected]>
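
Illustration of the overflow described in the commit message. This message
does not include the diff, so the following is only a minimal sketch of the
class of bug, not the code in hdfs-parquet-scanner.cc. All names
(kHugeNumRows, TruncateToInt32, NextBatchRowsNarrow, NextBatchRows,
CountAllRows) are hypothetical, and the assumption is that the remaining row
count was held in a 32-bit int before the fix and in a 64-bit int after it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Row count used by the regression test in the commit message: 2 * MAX_INT32.
constexpr int64_t kHugeNumRows =
    2 * static_cast<int64_t>(std::numeric_limits<int32_t>::max());

// Keeps only the low 32 bits of v. This mimics, in a fully defined way, what
// typically happens when a 64-bit value is stored into a 32-bit int.
int32_t TruncateToInt32(int64_t v) {
  uint32_t low = static_cast<uint32_t>(v);  // well-defined: reduction modulo 2^32
  int32_t out;
  std::memcpy(&out, &low, sizeof(out));     // reinterpret the same 32 bits as signed
  return out;
}

// Hypothetical "before" logic: the remaining row count is narrowed to 32 bits,
// so a large num_rows can produce a zero or negative batch size.
int32_t NextBatchRowsNarrow(int64_t num_rows, int64_t rows_read,
                            int32_t batch_capacity) {
  int32_t rows_remaining = TruncateToInt32(num_rows - rows_read);
  return std::min(batch_capacity, rows_remaining);
}

// Hypothetical "after" logic: the subtraction and the comparison both stay in
// 64 bits, so the batch size is always min(capacity, true remaining rows).
int64_t NextBatchRows(int64_t num_rows, int64_t rows_read,
                      int32_t batch_capacity) {
  int64_t rows_remaining = num_rows - rows_read;
  return std::min<int64_t>(batch_capacity, rows_remaining);
}

// Simulates a zero-slot scan: no column data is read, rows are only counted
// in batches until the footer's num_rows has been reached.
int64_t CountAllRows(int64_t num_rows, int32_t batch_capacity) {
  assert(batch_capacity > 0);
  int64_t rows_read = 0;
  while (rows_read < num_rows) {
    rows_read += NextBatchRows(num_rows, rows_read, batch_capacity);
  }
  return rows_read;
}
```

For num_rows = 2*MAX_INT32 and a batch capacity of 1024, the narrow variant
yields a batch size of -2 on the first call, so a loop driven by it would not
make forward progress, which is consistent with the "run forever" symptom.
The 64-bit variant yields 1024 and the counting loop terminates with the full
row count.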
