Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/23889 )
Change subject: IMPALA-12027: Support additional details for DataSink nodes in ExecSummary ...................................................................... Patch Set 3: (6 comments) Thanks for including MultiDataSink and the testing! http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG@9 PS3, Line 9: we currently show nit: 'we show' http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG@10 PS3, Line 10: different nit: 'some' It could express that this is/was not available for all types. http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG@13 PS3, Line 13: This field is currently unavailable for any type of `DataSink` nodes. : : With this change, we support displaying table names and other details : for table sink nodes and other such `DataSink` nodes nit: Please avoid using 'currently' when you refer to the old behavior. You can say something like this instead: "This change introduces support for displaying table names and other details for table sink nodes and other such `DataSink` nodes as well" http://gerrit.cloudera.org:8080/#/c/23889/3/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java: http://gerrit.cloudera.org:8080/#/c/23889/3/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@111 PS3, Line 111: String.join(" ", table_sinks_set); Currently this returns the table name twice, concatenated: the 'table_name' (for the insert sink) and 'table_name-DELETE' (for the delete sink). We should only return 'table_name'. http://gerrit.cloudera.org:8080/#/c/23889/3/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/23889/3/tests/query_test/test_observability.py@150 PS3, Line 150: # INSERT query. : query = "create table {0}.tmp as select count(*) from functional.alltypes".format( : unique_database) : result = self.client.execute(query, fetch_exec_summary=True) : # Sanity-check the HDFS writer sink. : assert result.exec_summary[0]['operator'] == 'F01:HDFS WRITER' : assert result.exec_summary[0]['max_time'] >= 0 : assert result.exec_summary[0]['num_rows'] == -1 : assert result.exec_summary[0]['est_num_rows'] == -1 : assert result.exec_summary[0]['peak_mem'] >= 0 : assert result.exec_summary[0]['est_peak_mem'] >= 0 Maybe you could check for the table sink label detail here instead of adding a new test. http://gerrit.cloudera.org:8080/#/c/23889/3/tests/query_test/test_observability.py@268 PS3, Line 268: "select count(*) from functional.alltypes" Or alternatively, you could modify and extend this test case to check for the label details. -- To view, visit http://gerrit.cloudera.org:8080/23889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2652dd896f72c5c6bbe7e76facdede2a237808d5 Gerrit-Change-Number: 23889 Gerrit-PatchSet: 3 Gerrit-Owner: Surya Hebbar <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Surya Hebbar <[email protected]> Gerrit-Comment-Date: Wed, 25 Feb 2026 11:38:23 +0000 Gerrit-HasComments: Yes
