Noemi Pap-Takacs has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23889 )

Change subject: IMPALA-12027: Support additional details for DataSink nodes in 
ExecSummary
......................................................................


Patch Set 3:

(6 comments)

Thanks for including MultiDataSink and the testing!

http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG@9
PS3, Line 9: we currently show
nit: 'we show'


http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG@10
PS3, Line 10: different
nit: 'some'
It could express that this is/was not available for all types.


http://gerrit.cloudera.org:8080/#/c/23889/3//COMMIT_MSG@13
PS3, Line 13: This field is currently unavailable for any type of `DataSink` 
nodes.
            :
            : With this change, we support displaying table names and other 
details
            : for table sink nodes and other such `DataSink` nodes
nit: Please avoid using 'currently' when you refer to the old behavior. You can 
say something like this instead:
"This change introduces support for displaying table names and other details 
for table sink nodes and other such `DataSink` nodes as well"


http://gerrit.cloudera.org:8080/#/c/23889/3/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java
File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java:

http://gerrit.cloudera.org:8080/#/c/23889/3/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@111
PS3, Line 111: String.join(" ", table_sinks_set);
Currently this returns the table name twice, concatenated: the 'table_name' 
(for the insert sink) and 'table_name-DELETE' (for the delete sink).
We should only return 'table_name'.


http://gerrit.cloudera.org:8080/#/c/23889/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/23889/3/tests/query_test/test_observability.py@150
PS3, Line 150: # INSERT query.
             :     query = "create table {0}.tmp as select count(*) from 
functional.alltypes".format(
             :         unique_database)
             :     result = self.client.execute(query, fetch_exec_summary=True)
             :     # Sanity-check the HDFS writer sink.
             :     assert result.exec_summary[0]['operator'] == 'F01:HDFS 
WRITER'
             :     assert result.exec_summary[0]['max_time'] >= 0
             :     assert result.exec_summary[0]['num_rows'] == -1
             :     assert result.exec_summary[0]['est_num_rows'] == -1
             :     assert result.exec_summary[0]['peak_mem'] >= 0
             :     assert result.exec_summary[0]['est_peak_mem'] >= 0
Maybe you could check for the table sink label detail here instead of adding a 
new test.


http://gerrit.cloudera.org:8080/#/c/23889/3/tests/query_test/test_observability.py@268
PS3, Line 268: "select count(*) from functional.alltypes"
Or alternatively, you could modify and extend this test case to check for the 
label details.



--
To view, visit http://gerrit.cloudera.org:8080/23889
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2652dd896f72c5c6bbe7e76facdede2a237808d5
Gerrit-Change-Number: 23889
Gerrit-PatchSet: 3
Gerrit-Owner: Surya Hebbar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
Gerrit-Comment-Date: Wed, 25 Feb 2026 11:38:23 +0000
Gerrit-HasComments: Yes

Reply via email to