Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/19471 )
Change subject: IMPALA-11081: Fix incorrect results in partition key scan
......................................................................

IMPALA-11081: Fix incorrect results in partition key scan

This patch fixes incorrect results caused by the short-circuit partition
key scan when a Parquet/ORC file contains multiple blocks.

IMPALA-8834 introduced an optimization that generates only one scan
range per file, corresponding to the file's first block. Backends only
issue footer ranges for Parquet/ORC files for file-metadata-only queries
(see HdfsScanner::IssueFooterRanges()), which leads to incorrect results
if the first block doesn't include the file footer.

The bug is fixed by returning a scan range corresponding to the last
block for Parquet/ORC files, which ensures that it contains the file
footer.

Testing:
- Added e2e tests to verify the fix.

Change-Id: I17331ed6c26a747e0509dcbaf427cd52808943b1
Reviewed-on: http://gerrit.cloudera.org:8080/19471
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/common/test_dimensions.py
M tests/metadata/test_partition_metadata.py
M tests/query_test/test_queries.py
4 files changed, 70 insertions(+), 9 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/19471
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I17331ed6c26a747e0509dcbaf427cd52808943b1
Gerrit-Change-Number: 19471
Gerrit-PatchSet: 17
Gerrit-Owner: Yifan Zhang <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Yifan Zhang <[email protected]>
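
Illustrative note (not part of the patch): the block-selection rule described in the commit message can be sketched roughly as follows. This is a minimal Python sketch, not the actual Java change in HdfsScanNode.java. The names Block, FOOTER_FORMATS and pick_partition_key_scan_block are hypothetical and exist only for this example. It assumes that a file's blocks can be ordered by offset and that the footer of a Parquet/ORC file lies in the block with the highest offset.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical names for illustration only; this is not Impala's planner API.
# Formats whose metadata lives in a footer at the end of the file.
FOOTER_FORMATS = {"PARQUET", "ORC"}


@dataclass
class Block:
    offset: int  # byte offset of the block within the file
    length: int  # block length in bytes


def pick_partition_key_scan_block(file_format: str, blocks: List[Block]) -> Block:
    """Pick the single block to scan per file for a partition key scan.

    For footer-based formats the last block is chosen so that the scan
    range covers the file footer; other formats keep using the first block.
    """
    if not blocks:
        raise ValueError("file has no blocks")
    ordered = sorted(blocks, key=lambda b: b.offset)
    if file_format.upper() in FOOTER_FORMATS:
        return ordered[-1]
    return ordered[0]


# Example: a three-block file.
example_blocks = [Block(0, 128), Block(128, 128), Block(256, 40)]
parquet_choice = pick_partition_key_scan_block("PARQUET", example_blocks)
text_choice = pick_partition_key_scan_block("TEXT", example_blocks)
```

With a single-block file both branches return the same block, which is consistent with the bug only showing up for multi-block Parquet/ORC files.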
