Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24353 )

Change subject: IMPALA-12956: Add profile summary output
......................................................................


Patch Set 4:

(9 comments)

Thanks for working on this!

http://gerrit.cloudera.org:8080/#/c/24353/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/24353/4//COMMIT_MSG@24
PS4, Line 24: option.
It'd be nice to add tests on failed/cancelled queries, especially queries that 
failed due to insufficient memory and already had some progress.


http://gerrit.cloudera.org:8080/#/c/24353/4/be/src/util/impala-profile-tool.cc
File be/src/util/impala-profile-tool.cc:

http://gerrit.cloudera.org:8080/#/c/24353/4/be/src/util/impala-profile-tool.cc@239
PS4, Line 239:  
nit: use 4-space indent


http://gerrit.cloudera.org:8080/#/c/24353/4/be/src/util/impala-profile-tool.cc@245
PS4, Line 245: static string ProgressString(int64_t completed, int64_t total) {
             :   stringstream ss;
             :   ss << completed << " / " << total << " (" << std::setw(4);
             :   if (completed == 0 || total == 0) {
             :     ss << "0%)";
             :   } else {
             :     ss << (100.0 * completed / static_cast<double>(total)) << 
"%)";
             :   }
             :   return ss.str();
             : }
This duplicates ImpalaHttpHandler::ProgressToString(). Let's move it to util 
files (e.g. be/src/util/summary-util.[h/cc]) and share it instead.


http://gerrit.cloudera.org:8080/#/c/24353/4/be/src/util/impala-profile-tool.cc@281
PS4, Line 281:   if (!GetTimelineDuration(summary_node, "Query Timeline", 
&duration)) {
To be consistent with what we show in the WebUI, we should try using 
(end_time_us - start_time_us) first. If "End Time" is missing in the profile, 
using "Query Timeline" is a good fallback.


http://gerrit.cloudera.org:8080/#/c/24353/4/be/src/util/impala-profile-tool.cc@304
PS4, Line 304: static bool ParseBytesValue(const string& value, int64_t* bytes) 
{
Can we reuse ParseUtil::ParseMemSpec()?
https://github.com/apache/impala/blob/11b5fd6c1f5201af5710342e9c793b586664d6b1/be/src/util/parse-util.cc#L39


http://gerrit.cloudera.org:8080/#/c/24353/4/be/src/util/impala-profile-tool.cc@378
PS4, Line 378:   // The profile log does not keep total scan ranges, but 
finished queries are 100% done.
We can sum values in the "File Formats" row for the total scan ranges. E.g. 
this has 2 scan ranges: "File Formats: PARQUET/GZIP:1 PARQUET/SNAPPY:1".
This has 150 scan ranges: "File Formats: PARQUET/Unknown(Skipped):150"


http://gerrit.cloudera.org:8080/#/c/24353/4/be/src/util/impala-profile-tool.cc@390
PS4, Line 390:   if (query_state == "FINISHED"
Can we handle failed/cancelled queries here? The progress is not that helpful 
for completed queries.


http://gerrit.cloudera.org:8080/#/c/24353/4/be/src/util/impala-profile-tool.cc@461
PS4, Line 461:       GetInfoString(summary_node, "End Time"),
"End Time" could be missing in failed/cancelled queries. We show an empty 
string here instead of "N/A", which is inconsistent with other fields.

How about adding a default parameter for GetInfoString and use 
SUMMARY_VALUE_UNAVAILABLE here?


http://gerrit.cloudera.org:8080/#/c/24353/4/be/src/util/impala-profile-tool.cc@471
PS4, Line 471:       query_state,
"Query Status" is also helpful. "Query State" just shows EXCEPTION for failed 
queries. "Query Status" shows more details, e.g. "Session closed", 
"AnalysisException: ...", "Query xxx:xxx expired due to client inactivity 
(timeout is 5m)".

We can truncate it when it's too long like sql statement.



--
To view, visit http://gerrit.cloudera.org:8080/24353
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I35da958106ffb35b14ef867da50ed23ffda0a45b
Gerrit-Change-Number: 24353
Gerrit-PatchSet: 4
Gerrit-Owner: Aleksandr Efimov <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jason Fehr <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Thu, 28 May 2026 06:25:09 +0000
Gerrit-HasComments: Yes

Reply via email to