Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/20010 )
Change subject: IMPALA-11996: Scanner change for Iceberg metadata querying
......................................................................
IMPALA-11996: Scanner change for Iceberg metadata querying
This commit adds a scan node for querying Iceberg metadata tables. The
scan node creates a Java scanner object that creates and scans the
metadata table. The scanner uses the Iceberg API to scan the table;
after that, the scan node fetches the rows one by one and materializes
them into RowBatches. The Iceberg row reader on the backend translates
between Iceberg and Impala types.
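The fetch-and-materialize loop described above can be sketched roughly as follows. This is an illustrative Python model only, not the code in this change: `translate_value`, `fetch_batches`, the sample rows, and the batch capacity are hypothetical, and the type mapping is a toy stand-in for what the Iceberg row reader does.

```python
from typing import Any, Iterable, Iterator, List, Tuple

Row = Tuple[Any, ...]


def translate_value(value: Any) -> Any:
    """Toy stand-in for the type translation done by the row reader."""
    if isinstance(value, (list, tuple, dict)):
        # Nested values are not handled in this sketch; the slot becomes NULL.
        return None
    if isinstance(value, str):
        # Assumption for illustration: strings are handed over as UTF-8 bytes.
        return value.encode("utf-8")
    return value


def fetch_batches(scanner: Iterable[Row], capacity: int) -> Iterator[List[Row]]:
    """Pull rows one by one and group them into batches of at most `capacity`."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    batch: List[Row] = []
    for row in scanner:
        batch.append(tuple(translate_value(v) for v in row))
        if len(batch) == capacity:
            yield batch
            batch = []
    if batch:
        # Emit the final, partially filled batch.
        yield batch


# Hypothetical metadata rows: (id, path, nested summary).
rows = [(i, f"manifest-{i}.avro", {"added": i}) for i in range(5)]
batches = list(fetch_batches(rows, capacity=2))
```

Because the batches are produced lazily, the consumer pulls only as many rows from the scanner as it needs to fill the next batch.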
Only one fragment is created to query the Iceberg metadata table. It is
meant to be executed on the coordinator node, which already has the
Iceberg table loaded, so no further table loading is needed on the
executor side.
This change does not cover nested column types; their slots are set to
NULL. Support for them will be added in IMPALA-12205.
Testing:
- Added e2e tests for querying metadata tables
- Updated planner tests
Performance testing:
Created a table and inserted ~5500 rows one by one, which generated
~270000 ALL_MANIFESTS metadata table records. This table is quite wide
and has a String column as well.
I only mention the count(*) test on ALL_MANIFESTS, because every row is
currently materialized in every scenario:
- Cold cache: 15.76s
  - IcebergApiScanTime: 124.407ms
  - MaterializeTupleTime: 8s368ms
- Warm cache: 7.56s
  - IcebergApiScanTime: 3.646ms
  - MaterializeTupleTime: 7s477ms
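As a rough reading of these numbers, the per-row materialization cost and its share of the total runtime can be worked out as below. The row count is the approximate ~270000 figure quoted above, so the results are estimates; the variable names are illustrative.

```python
ROWS = 270_000  # approximate ALL_MANIFESTS record count quoted above

# (total seconds, IcebergApiScanTime seconds, MaterializeTupleTime seconds)
runs = {
    "cold": (15.76, 0.124407, 8.368),
    "warm": (7.56, 0.003646, 7.477),
}

summary = {}
for name, (total_s, scan_s, materialize_s) in runs.items():
    summary[name] = {
        # Average materialization time per row, in microseconds.
        "us_per_row": materialize_s / ROWS * 1e6,
        # Fraction of the total runtime spent materializing tuples.
        "materialize_share": materialize_s / total_s,
        # Fraction of the total runtime spent in the Iceberg API scan.
        "scan_share": scan_s / total_s,
    }
    print(
        f"{name}: {summary[name]['us_per_row']:.1f} us/row, "
        f"materialize {summary[name]['materialize_share']:.1%}, "
        f"scan {summary[name]['scan_share']:.2%}"
    )
```

With these inputs, materialization comes to roughly 28 to 31 microseconds per row and accounts for nearly all of the warm-cache runtime, while the Iceberg API scan itself stays below 1% of the total in both runs.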
Change-Id: I0e943cecd77f5ef7af7cd07e2b596f2c5b4331e7
Reviewed-on: http://gerrit.cloudera.org:8080/20010
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
---
M be/CMakeLists.txt
M be/src/exec/CMakeLists.txt
M be/src/exec/exec-node.cc
A be/src/exec/iceberg-metadata/CMakeLists.txt
A be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.cc
A be/src/exec/iceberg-metadata/iceberg-metadata-scan-node.h
A be/src/exec/iceberg-metadata/iceberg-row-reader.cc
A be/src/exec/iceberg-metadata/iceberg-row-reader.h
M be/src/scheduling/scheduler.cc
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/service/impalad-main.cc
M be/src/util/jni-util.cc
M be/src/util/jni-util.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/IcebergMetadataTableRef.java
M fe/src/main/java/org/apache/impala/analysis/Path.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergMetadataTable.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/IcebergMetadataScanNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
A fe/src/main/java/org/apache/impala/util/IcebergMetadataScanner.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/iceberg-metadata-table-scan.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-metadata-tables.test
M tests/authorization/test_ranger.py
M tests/query_test/test_iceberg.py
32 files changed, 1,419 insertions(+), 167 deletions(-)
Approvals:
Impala Public Jenkins: Looks good to me, approved; Verified
--
To view, visit http://gerrit.cloudera.org:8080/20010
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0e943cecd77f5ef7af7cd07e2b596f2c5b4331e7
Gerrit-Change-Number: 20010
Gerrit-PatchSet: 19
Gerrit-Owner: Tamas Mate <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>