Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/24042 )
Change subject: IMPALA-14592: Read Row Lineage of Iceberg tables ...................................................................... Patch Set 3: (7 comments) Thanks for the comments! http://gerrit.cloudera.org:8080/#/c/24042/2/be/src/exec/hdfs-table-writer.cc File be/src/exec/hdfs-table-writer.cc: http://gerrit.cloudera.org:8080/#/c/24042/2/be/src/exec/hdfs-table-writer.cc@39 PS2, Line 39: if (!table_desc_->IsIcebergTable() || table_desc_->IcebergFormatVersion() < 3) { > We'll be skipping this check for V1/V2 tables too which have been checked u We haven't sent the format-version until now. I added it in the new patch set. http://gerrit.cloudera.org:8080/#/c/24042/2/fe/src/main/java/org/apache/impala/analysis/AlterTableDropColStmt.java File fe/src/main/java/org/apache/impala/analysis/AlterTableDropColStmt.java: http://gerrit.cloudera.org:8080/#/c/24042/2/fe/src/main/java/org/apache/impala/analysis/AlterTableDropColStmt.java@63 PS2, Line 63: // mappings along with the table columns. > I think we should check if the column is null to prevent NPE. Its being don Good catch, done. http://gerrit.cloudera.org:8080/#/c/24042/2/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java: http://gerrit.cloudera.org:8080/#/c/24042/2/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@418 PS2, Line 418: return super.getColumnsInHiveOrder().stream() : .filter(col -> !col.isHidden()) : .collect(Collectors.toList()) > nit: Shouldn't the dots be at the start of the line Done http://gerrit.cloudera.org:8080/#/c/24042/2/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java File fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java: http://gerrit.cloudera.org:8080/#/c/24042/2/fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java@188 PS2, Line 188: return super.getColumnsInHiveOrder().stream() : .filter(col -> !col.isHidden()) > nit: Shouldn't the dots be at the start of the line: Done http://gerrit.cloudera.org:8080/#/c/24042/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-row-lineage.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-row-lineage.test: PS2: > Can we also add some test cases with partitioned tables. I added an Impala-written partitioned table (which doesn't have row-id / last-updated-sequence-number in the files). Once we unblock OPTIMIZE, UPDATE, MERGE, it will be easier to test all edge cases. http://gerrit.cloudera.org:8080/#/c/24042/2/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-row-lineage.test@335 PS2, Line 335: d_seque > Why is this parquet, the table name suggests it should be orc. Here we output the value of tblproperty 'write.format.default', but in an Iceberg table different data files can have different file formats. Anyway, updated how we create this table and now it's showing 'ORC' to avoid confusion. http://gerrit.cloudera.org:8080/#/c/24042/2/tests/query_test/test_iceberg.py File tests/query_test/test_iceberg.py: http://gerrit.cloudera.org:8080/#/c/24042/2/tests/query_test/test_iceberg.py@2343 PS2, Line 2343: u > flake8: U100 Unused argument 'vector' Done -- To view, visit http://gerrit.cloudera.org:8080/24042 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I71b1076b25c9e7a0a6c9428b24abc986f5382c71 Gerrit-Change-Number: 24042 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Arnab Karmakar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Fri, 06 Mar 2026 16:35:37 +0000 Gerrit-HasComments: Yes
