Quanlong Huang has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/18887 )

Change subject: IMPALA-11346: Migrated partitioned Iceberg tables might return 
ERROR when WHERE condition is used on partition column
......................................................................

IMPALA-11346: Migrated partitioned Iceberg tables might return ERROR when WHERE 
condition is used on partition column

Identity-partitioned columns are not necessarily stored in the data
files. E.g. when we migrate a legacy partitioned table to Iceberg
without rewriting the data files, the partition columns won't be
present in the files.

The Parquet scanner does a few optimizations to eliminate row groups,
e.g. filtering based on stats or bloom filters. When a column that has
a predicate on it is not present in the data file, the scanner assumes
that the whole row group doesn't pass the filtering criteria.

But for Iceberg some files might contain partition columns, while
other files don't, so we need to prepare the scanners to handle
such cases.
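
A minimal sketch of the decision described above, not the actual patch.
FileSchema and CanSkipRowGroup are hypothetical names invented for
illustration, and the sketch assumes that a missing identity-partition
column gets its value from table metadata, so file-level stats must not
be used to drop the row group:

```cpp
#include <set>
#include <string>

// Hypothetical, simplified stand-in for a data file's schema.
// This is not one of Impala's real types.
struct FileSchema {
  std::set<std::string> columns;  // columns physically stored in the file
  bool Contains(const std::string& col) const { return columns.count(col) > 0; }
};

// Returns true if a row group may be eliminated for a predicate on
// `pred_column`.
//  - is_identity_partition_col: the column is an identity-partitioned
//    Iceberg column (assumed here to be known from table metadata).
//  - stats_may_match: stands in for the outcome of the stats / bloom
//    filter checks; only meaningful when the column is in the file.
bool CanSkipRowGroup(const FileSchema& file, const std::string& pred_column,
                     bool is_identity_partition_col, bool stats_may_match) {
  if (file.Contains(pred_column)) {
    // Column is stored in the file: trust the file-level filters.
    return !stats_may_match;
  }
  if (is_identity_partition_col) {
    // Column is absent from the file, but its value comes from metadata.
    // The file tells us nothing, so the row group must be kept.
    return false;
  }
  // Column is absent and nothing else supplies it: the pre-existing
  // assumption applies, i.e. no row in the group passes the predicate.
  return true;
}
```

With this shape, a migrated file without the partition column and a
rewritten file with it are both handled by the same check.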

The ORC scanner doesn't have that many optimizations so it didn't
run into this issue.

Testing:
 * e2e tests

Merge conflicts due to missing 23d09638d:
 * file-metadata-utils.cc resolves trivial conflicts
 * hdfs-parquet-scanner.cc removes usage of NeedDataInFile()

Change-Id: Ie706317888981f634d792fb570f3eab1ec11a4f4
Reviewed-on: http://gerrit.cloudera.org:8080/18605
Reviewed-by: Csaba Ringhofer <[email protected]>
Reviewed-by: Tamas Mate <[email protected]>
Reviewed-by: <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-on: http://gerrit.cloudera.org:8080/18887
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
Tested-by: Quanlong Huang <[email protected]>
---
M be/src/exec/file-metadata-utils.cc
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-tables.test
3 files changed, 142 insertions(+), 1 deletion(-)

Approvals:
  Zoltan Borok-Nagy: Looks good to me, approved
  Quanlong Huang: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/18887
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: branch-4.1.1
Gerrit-MessageType: merged
Gerrit-Change-Id: Ie706317888981f634d792fb570f3eab1ec11a4f4
Gerrit-Change-Number: 18887
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
