Surya Hebbar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23154 )

Change subject: IMPALA-9846: Enable Aggregated Runtime Profile by Default
......................................................................


Patch Set 31:

(4 comments)

Hi Aleksandr, thank you for the comments, they were helpful. I also wanted to 
note one thing: a discussion on the main test infrastructure change proposed 
here is currently going on, so I would request you to defer the comments until 
the direction this change is moving towards is resolved and stable.

http://gerrit.cloudera.org:8080/#/c/23154/30/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/23154/30/be/src/util/runtime-profile.cc@2327
PS30, Line 2327:   if (verbosity <= Verbosity::LEGACY) {
> This still prints `total=` for derived counters such as `BYTES_PER_SECOND`,
I'd like to keep each of these separate.

On ratios, agreed, summing DOUBLE_VALUE ratios has meaning depending on what it 
represents, they are not always ratios or percentages. But I agree it has no 
sensible meaning for ratios/percentages(per-instance percentages don't add), so 
those shouldn't carry total.

For rates, summing per-instance throughput closely approximates aggregate 
throughput, i.e. such as read throughput in GB/s. I realize the case for 
non-concurrent execution, but when the same type of node (SCAN/AGGREGATION) is 
distributed, they are distributed equally, so these instances almost always run 
concurrently. This seems more meaningful than printing the mean for it.

I'll classify ratio counters and other relevant counters as non-summable via 
TCounterCategory so total is suppressed for them. I had already mentioned the 
ratio problem in an earlier comment, so I will address it soon.


http://gerrit.cloudera.org:8080/#/c/23154/30/be/src/util/runtime-profile.cc@2336
PS30, Line 2336:
> This changes the legacy text profile format as well. With 
> `--aggregated_profile=false`, averaged counters that pass `print_total` now 
> emit `total=...` and return before printing the old legacy shape.
> That seems risky for compatibility with existing tooling that parses 
> non-aggregated text profiles.

Yes. total= should only be in the V2 / aggregated profile, V1 stays unchanged. 
This was the intent from PS21, where total had leaked into the non-aggregated 
averaged fragment and I'd said I would revert it. It came back during the test 
fixes by mistake.

> Can we keep the legacy output unchanged and only add the new `total` field 
> for the aggregated/default profile format?

Aggregated profile is part of the traditional profile and cannot be separated. 
It is represented as the "Averaged Fragment" in traditional profile. The 
existing tools rely on the text from the root profile rather than the 
aggregated (Averaged) fragment. So the concern isn't really external-tool 
compatibility, it's that V2-style total= is currently leaking into the V1 
averaged fragment, which is exactly what I'm removing so the legacy output is 
unchanged.

On mean vs total for V2 itself: printing just the bare value without a 
"mean"/"total" marker is confusing for any interpreter, human or machine, so 
the aggregated profile labels them. For counters like RowsRead/RowsReturned, 
mean across instances isn't very useful, total is the tangible number, while 
for time/duration the total is misleading (sum across instances reads as more 
time than actually elapsed), so those stay mean/max. So it's a deliberate 
per-counter choice in V2, stated explicitly. This also gives the regex tests a 
clean marker (mean / total) to tell a single counter apart from an averaged/agg 
counter.


http://gerrit.cloudera.org:8080/#/c/23154/30/tests/common/test_result_verifier.py
File tests/common/test_result_verifier.py:

http://gerrit.cloudera.org:8080/#/c/23154/30/tests/common/test_result_verifier.py@658
PS30, Line 658:     if re.search(r"([^\\]|^)\([^\?]", pattern):
> This check does not catch capturing groups at the beginning of the pattern,
Good catch, thanks. Updated so a group at position 0 is caught:

r"([^\\]|^)\([^?]"


http://gerrit.cloudera.org:8080/#/c/23154/30/tests/common/test_result_verifier.py@685
PS30, Line 685:       pattern = r"{0}: total=\d+(?:.\d+)? ?(?:[KMGB]+)? 
\((\d+(?:.\d+)?)\)".format(field)
> This regex only allows `[KMB]+` as the human-readable unit, so it does not
TB isn't currently produced by pretty-printer.h, but so I've left it out, but 
you're right about GB.

In the previous test infrastructure the regex also did not contain "G", and no 
counter tests seem to have matched bytes at GB scale (it becomes quite variable 
at that scale, probably why it never came up), so I was caught off-guard when I 
reused that form. I have corrected it now.



--
To view, visit http://gerrit.cloudera.org:8080/23154
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If41d6322361fba82c946efd614cc7d28cb1c36e8
Gerrit-Change-Number: 23154
Gerrit-PatchSet: 31
Gerrit-Owner: Surya Hebbar <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Aleksandr Efimov <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
Gerrit-Comment-Date: Mon, 22 Jun 2026 12:58:12 +0000
Gerrit-HasComments: Yes

Reply via email to