Surya Hebbar has posted comments on this change. ( http://gerrit.cloudera.org:8080/23154 )
Change subject: IMPALA-9846: Enable Aggregated Runtime Profile by Default ...................................................................... Patch Set 19: (18 comments) Sorry for the late response. I was on PTO for more than a week due to my health. Thank you for the comments. http://gerrit.cloudera.org:8080/#/c/23154/13//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/23154/13//COMMIT_MSG@16 PS13, Line 16: are substan > I see 3 kind of changes in the patch, and it may be useful to split them to Thank you for suggesting this. I will try to split the change into separate commits. For rewriting row_regex as aggregate(SUM,..), we would need to manually search such row_regex and sum up the actual values for rewriting the tests. Both searching for these row_regex and manually summing these values will cause increasing delays for enabling aggregated runtime profile patch. After enabling aggregated runtime profile, in later changes we can improve the tests to use aggregate instead of row_regex, as this will be a delaying, verbose task and is not required for enabling aggregated profile. I can combine 1 + 2 into one change and 3 into another change. This way the tests have to run for traditional profile first, then with the next change, the tests will run for aggregated profile. But, separating 1 and 2 would not be possible. As all the tests will break, when the profile output changes due to 1. Hence, 1 and 2 are inseparable for passing the tests successfully. So, finally the change can be broken into two patches. A. 1 + 2 - For testing traditional profile mode as default B. 3 - For testing aggregated profile mode as default Please let me know, if I have misunderstood this. Thank you again for the comments. http://gerrit.cloudera.org:8080/#/c/23154/13//COMMIT_MSG@20 PS13, Line 20: by transforming instance-level counters into operator level arrays : or maps. > It would be useful the list of common reasons why the change was needed. Done. I have enlisted the reasons now. http://gerrit.cloudera.org:8080/#/c/23154/18//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/23154/18//COMMIT_MSG@17 PS18, Line 17: 'mt_dop' values. > I would prefer just aggregated_profile. Done. I have updated the name now. http://gerrit.cloudera.org:8080/#/c/23154/13/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/23154/13/be/src/util/runtime-profile.cc@2320 PS13, Line 2320: total > This seems like actually changing the text profile format (and also json in Done. I have explained it now. http://gerrit.cloudera.org:8080/#/c/23154/18/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/23154/18/be/src/util/runtime-profile.cc@2294 PS18, Line 2294: result.total = std::accumulate(vals.begin(), vals.end(), 0.0L); > Total is not a good metric to present in many cases. We should insead print I reverted to using 'total' as the preferred statistic. Please contemplate the following reasons. My initial approach was also to implement 'count'. This implementation was inefficient, verbose and problematic at times. 1. Printing count instead of total for all counters would result in a very repetitive profile. Also, this would be unnecessarily too verbose for the user. For example, - BytesRead: count=3 (3) mean=81.31 KB (85258) - RowsRead: count=3 (3) mean=2.00K (2005) - BytesReadLocal: count=3 (3) mean=81.31 KB (85258) In this way count would be repeated multiple times. On the other hand, when using 'total' as a statistic, 'total bytes read' and 'total memory consumed' are useful statistics. 2. Also majorly, with regard to "total" metric, it's not just "some" tests that need it. It's actually more than 2500+ tests, which mandatorily require this and would fail without this. We would be making an inefficient choice, if we were to repeatedly capture more regex groups(count and mean), while also adding unnecessary verbosity to profile. 3. Also, we can filter the type of values for which "total" can be printed. i.e. time values would not have 'total' bytes and row counts would have 'total' This was already implemented in one of the previous patches. If out approach is to use 'num instances', which is already present in profile(at times not same as count), we would need to be in context, while searching and matching in that specific context. This way, we would also be loosing the 10x faster, efficient regex searching, where we search all at once, instead of line by line. For these reasons, I believe it would be better to use 'total' instead of 'count' each time. http://gerrit.cloudera.org:8080/#/c/23154/18/be/src/util/runtime-profile.cc@2320 PS18, Line 2320: (*s) << prefix << " - " << name << ": total=" > Looking at the test changes, tHis makes the output confusing. Stats should As suggested in the previous comment... We can filter the type of values for which aggregated "total" can be printed. i.e. time values would not have 'total' metric aggregated while bytes and row counts would be aggregated into 'total' In one of the previous patches, a similar method was already implemented once. http://gerrit.cloudera.org:8080/#/c/23154/18/be/src/util/runtime-profile.cc@2397 PS18, Line 2397: counter_json.AddMember("mean", stats.mean, document.GetAllocator()); > See above. Done. This mistakenly remained from one of the previous patches mentioned above. I have made it consistent now. http://gerrit.cloudera.org:8080/#/c/23154/18/be/src/util/runtime-profile.cc@3355 PS18, Line 3355: } > EXPANDED maybe better than UNAGGREGATED? I believe, 'EXPANDED' might confuse users with 'EXTENDED' from the profile-verbosity attribute. It would be easier for the user to see how UNAGGREGATED would be opposite of AGGREGATED. http://gerrit.cloudera.org:8080/#/c/23154/13/testdata/workloads/functional-query/queries/QueryTest/binary-type.test File testdata/workloads/functional-query/queries/QueryTest/binary-type.test: http://gerrit.cloudera.org:8080/#/c/23154/13/testdata/workloads/functional-query/queries/QueryTest/binary-type.test@167 PS13, Line 167: aggregation(SUM, TotalKuduScanRoundTrips): 0 > This changes the semantics of the regexp, it is still correct though becaus Done http://gerrit.cloudera.org:8080/#/c/23154/13/testdata/workloads/functional-query/queries/QueryTest/compute-stats-incremental.test File testdata/workloads/functional-query/queries/QueryTest/compute-stats-incremental.test: http://gerrit.cloudera.org:8080/#/c/23154/13/testdata/workloads/functional-query/queries/QueryTest/compute-stats-incremental.test@544 PS13, Line 544: row_regex: Partition: year=2010 /day=__HIVE_DEFAULT_PARTITION__\nNumModifiedRows: 1000 > I see several tests rewritten to use row_regex - what is the reason for thi The same lines were trying to be matched or searched multiple times, these are not unique lines or searches. In this case, "NumModifiedRows: 1000" was being repeatedly searched Although, the intention of the test was to match NumModifiedRows, when /day=__HIVE_DEFAULT_PARITION__. This is the reason the lines were combined. Mentioning the reason for this and other such test fixes would be too verbose for this large change. I'm sorry for not including them in the commit message. http://gerrit.cloudera.org:8080/#/c/23154/13/tests/common/test_result_verifier.py File tests/common/test_result_verifier.py: http://gerrit.cloudera.org:8080/#/c/23154/13/tests/common/test_result_verifier.py@601 PS13, Line 601: # aggregation(function, field_name): expected_value > This is the same as line 597 Done http://gerrit.cloudera.org:8080/#/c/23154/13/tests/common/test_result_verifier.py@624 PS13, Line 624: # <field_name>: val.(<value>) ... : # : # Note: Fields witho > I see several UNAGGREGATED in test files. What does this mean? I assumed th I was told to mantain support for both aggregated and unaggregated profiles by @Kurt Deschler. So, I have worked to maintain compatibility and successfully pass all the 2700+ tests for both aggregated and unaggregated profiles. http://gerrit.cloudera.org:8080/#/c/23154/13/tests/common/test_result_verifier.py@647 PS13, Line 647: "value": 0, : "subgroups": 0, : "function_operator": None, > Is there a reason for not allowing this besides being non useful? To search multiple counter fields for aggregation and row_regex in a really efficient) all at once search(20-50x faster), we are already using multiple capture groups for this. So, additional capture groups would unnecessarily complicate the logic, while not adding functionality. http://gerrit.cloudera.org:8080/#/c/23154/13/tests/common/test_result_verifier.py@650 PS13, Line 650: "expected_value": 0, : "comparison_operator": None, > This may make the change even bigger, but it would make more sense to me to Done http://gerrit.cloudera.org:8080/#/c/23154/13/tests/common/test_result_verifier.py@664 PS13, Line 664: fication: > typo Done http://gerrit.cloudera.org:8080/#/c/23154/13/tests/common/test_result_verifier.py@698 PS13, Line 698: if search_objec > this looks like a debugging print() Done. It was actually for testing purposes. http://gerrit.cloudera.org:8080/#/c/23154/13/tests/common/test_result_verifier.py@700 PS13, Line 700: return search_object : > Does to code actually support non-aggregated profiles? Does it need to supp Both aggregated and non-aggregated profiles are supported and compatible with tests. We only check and consider "Profile Type: AGGREGATED" to know if profile is aggregated. If this is not present, it means the one of the following 2 cases, where it is better to assume profile as non-aggregated. 1. Profile Type: UNAGGREGATED is present, so unaggregated profile 2. Profile Type is not present at all, so it is a traditional/aggregated profile from before In both cases, we can only assume it is an unaggregated profile. http://gerrit.cloudera.org:8080/#/c/23154/13/tests/common/test_result_verifier.py@723 PS13, Line 723: if negate_regex: > this looks like a debugging print() Done -- To view, visit http://gerrit.cloudera.org:8080/23154 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If41d6322361fba82c946efd614cc7d28cb1c36e8 Gerrit-Change-Number: 23154 Gerrit-PatchSet: 19 Gerrit-Owner: Surya Hebbar <[email protected]> Gerrit-Reviewer: Abhishek Rawat <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Reviewer: Surya Hebbar <[email protected]> Gerrit-Comment-Date: Mon, 10 Nov 2025 10:48:06 +0000 Gerrit-HasComments: Yes
