Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/19148 )
Change subject: IMPALA-11666: Modify the warning message when table statistics may be corrupt
......................................................................


Patch Set 3:

(6 comments)

I addressed some minor comments but could not address the other major comments from Quanlong and Qifan. Will try to see how to address Qifan's comment.

http://gerrit.cloudera.org:8080/#/c/19148/2/fe/src/main/java/org/apache/impala/planner/Planner.java
File fe/src/main/java/org/apache/impala/planner/Planner.java:

http://gerrit.cloudera.org:8080/#/c/19148/2/fe/src/main/java/org/apache/impala/planner/Planner.java@332
PS2, Line 332: number of row
> Done
Done


http://gerrit.cloudera.org:8080/#/c/19148/2/fe/src/main/java/org/apache/impala/planner/Planner.java@333
PS2, Line 333: or
> Done
Done


http://gerrit.cloudera.org:8080/#/c/19148/2/fe/src/main/java/org/apache/impala/planner/Planner.java@333
PS2, Line 333: e size of all
> nit remove
Done


http://gerrit.cloudera.org:8080/#/c/19148/2/fe/src/main/java/org/apache/impala/planner/Planner.java@333
PS2, Line 333: e
> Done
Done


http://gerrit.cloudera.org:8080/#/c/19148/2/fe/src/main/java/org/apache/impala/planner/Planner.java@336
PS2, Line 336: transactional
> Probably iceberg-v2 tables can also hit this case?
Thanks Quanlong! I guess we may hit this in the case of iceberg-v2 tables, but I am not sure. In Impala shell, I tried the following. It seems Impala does not yet support modifying a non-Kudu table.

 1. create table test_db.iceberg_v2_parq (id int, user string) STORED BY ICEBERG location '/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_v2_partitioned_position_deletes' tblproperties ('format-version'='2', 'write.format.default'='parquet');
 2. insert into test_db.iceberg_v2_parq values (1, "Alex");
 3. delete from test_db.iceberg_v2_parq where id = 1;

I encountered the following error after executing the delete statement.
Query submitted at: 2022-10-19 11:00:59 (Coordinator: http://fangyu-upstream-dev.gce.cloudera.com:25000)
ERROR: AnalysisException: Impala does not support modifying a non-Kudu table: test_db.iceberg_v2_parq

On the other hand, I was not able to use Hive to perform the following steps because the class HiveIcebergInputFormat was missing.

 1. In Impala shell: create table test_db.iceberg_v2_orc (id int, user string) STORED BY ICEBERG location '/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_v2_partitioned_position_deletes' tblproperties ('format-version'='2', 'write.format.default'='orc');
 2. In beeline: insert into test_db.iceberg_v2_orc values (1, "Alex");
    Caused by: java.lang.ClassNotFoundException: org.apache.iceberg.mr.hive.HiveIcebergInputFormat


http://gerrit.cloudera.org:8080/#/c/19148/2/fe/src/main/java/org/apache/impala/planner/Planner.java@335
PS2, Line 335: "The latter case does not necessarily imply the existence of corrupt \n" +
             : "statistics when the corresponding tables are transactional
> I wonder if it is possible to exclude transactional table names from the
> above list.
>
> In this way, we can report the warning message for tables with corrupted
> stats only.
Thanks Qifan! I will think about how to do this. I have not read the corresponding code.

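As a rough illustration of Qifan's suggestion, the filtering might look something like the Java sketch below. This is only a sketch under assumptions and is not taken from Planner.java: the class StatsWarningFilter, the TableInfo type, its fields, and the method tablesToWarnAbout are all hypothetical names, and it assumes the planner already knows, per table, whether the table is transactional and whether its statistics look corrupt.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names exist in Impala's Planner.java.
class StatsWarningFilter {
  // Minimal stand-in for whatever per-table information the planner has.
  static final class TableInfo {
    final String name;
    final boolean transactional;     // assumed to be known for each table
    final boolean statsLookCorrupt;  // assumed result of the existing check

    TableInfo(String name, boolean transactional, boolean statsLookCorrupt) {
      this.name = name;
      this.transactional = transactional;
      this.statsLookCorrupt = statsLookCorrupt;
    }
  }

  // Returns the names of tables whose stats look corrupt, skipping
  // transactional tables so that they are left out of the warning.
  static List<String> tablesToWarnAbout(List<TableInfo> tables) {
    List<String> names = new ArrayList<>();
    for (TableInfo t : tables) {
      if (t.statsLookCorrupt && !t.transactional) {
        names.add(t.name);
      }
    }
    return names;
  }
}
```

With such a filter, the warning would list only non-transactional tables, and the sentence about transactional tables might no longer be needed in the message.
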
--
To view, visit http://gerrit.cloudera.org:8080/19148
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I404b00db97f1fa0f6e67995c6e85a124ccf242ef
Gerrit-Change-Number: 19148
Gerrit-PatchSet: 3
Gerrit-Owner: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Wed, 19 Oct 2022 18:20:29 +0000
Gerrit-HasComments: Yes
