Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21683 )

Change subject: IMPALA-13304: Include aggregate instance-level metrics within 
experimental profile(V2)
......................................................................


Patch Set 14:

(11 comments)

The test code looks OK to me. Consider parameterizing it and select some 
deterministic scenario to test so that it is easier to narrow down and 
reproduce when a test scenario failed.

I think you can leave some more explanation comments on what each test section 
is trying to assert. I left some example comments suggestion.

http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile-test.cc
File be/src/util/runtime-profile-test.cc:

http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile-test.cc@1904
PS14, Line 1904:   
aggregated_profile->ToJsonSubclass(RuntimeProfile::Verbosity::DEFAULT, 
&plan_node_profile, &doc);
> line too long (99 > 90)
Break this line.


http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile-test.cc@1935
PS14, Line 1935:   const uint64_t BUCKET_SIZE = 
FLAGS_aggregate_event_sequence_timestamps_granularity_count;
> line too long (91 > 90)
Break this line.


http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile-test.cc@1986
PS14, Line 1986:   const string NODE_LIFECYCLE_EVENTS[] = {"Open Started", 
"Open Finished",
               :       "First Batch Requested", "First Batch Returned", "Last 
Batch Returned", "Closed"};
               :   const string NODE_LIFECYCLE_EVENT_TIMELINE = "Node Lifecycle 
Event Timeline";
Duplicated constant in both CompleteEventsTest and MissingEventsTest. Consider 
moving up this constant.


http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile-test.cc@2059
PS14, Line 2059:     EXPECT_TRUE(events_json[0].HasMember("ts_list"));
Leave comment inside this branch block.

// There should be no event sequence grouping if num reporting fragment 
instances is less than bucket size.


http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile-test.cc@2071
PS14, Line 2071:     EXPECT_TRUE(events_json[0].HasMember("ts_stat"));
Leave comment inside this branch block.

// Event sequences should be in a grouped form.


http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile-test.cc@2090
PS14, Line 2090: TEST(AggregatedEventSequenceToJson, MissingEventsTest) {
Consider parameterizing this test so you can define a deterministic test 
scenario. For example:

  void RunMissingEventTest(int bucket_size, int num_profiles, double 
missing_event_probability);

  TEST(AggregatedEventSequenceToJson, MissingEvents_5_10_5) {
    RunMissingEventTest(5, 10, 0.5);
  }

  TEST(AggregatedEventSequenceToJson, MissingEvents_2_1000_8) {
    RunMissingEventTest(2, 1000, 0.8);
  }

And so on. Same comment for CompleteEventsTest.


http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile-test.cc@2135
PS14, Line 2135: 2000
Create a constant for this.


http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile-test.cc@2135
PS14, Line 2135: NUM_PROFILES
Should this be NODE_LIFECYCLE_EVENTS?


http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile.cc@66
PS14, Line 66: aggregate_event_sequence_timestamps_granularity_count
Please shorten the flag name. 'json_profile_num_event_grouping' sounds better.


http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile.cc@2986
PS14, Line 2986:       info_strings_json = (*parent)["info_strings"];
I think you should put back the agg_info_strings_lock_ to where it used to be. 
Otherwise, you can have concurrent initialization in parent here. Otherwise, 
move everything down inside CollectInfoStringIntoJson, so here you only have:

  // 3. Info strings
  CollectInfoStringIntoJson("Table Name", parent, d);


http://gerrit.cloudera.org:8080/#/c/21683/14/be/src/util/runtime-profile.cc@3037
PS14, Line 3037:   if (order_found) {
               :     for (size_t instance_idx = 0; instance_idx < 
num_instances; ++instance_idx) {
               :       if (timestamps[instance_idx].size() < events_count) {
The order of this branch-loop-branch can be clearer if it is loop-branch-branch.


  for (size_t instance_idx = 0; instance_idx < num_instances; ++instance_idx) {
    if (timestamps[instance_idx].size() < events_count) {
      if (order_found) {



--
To view, visit http://gerrit.cloudera.org:8080/21683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49e18a7a7e1288e3e674e15b6fc86aad60a08214
Gerrit-Change-Number: 21683
Gerrit-PatchSet: 14
Gerrit-Owner: Surya Hebbar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Surya Hebbar <[email protected]>
Gerrit-Comment-Date: Thu, 16 Jan 2025 18:07:35 +0000
Gerrit-HasComments: Yes

Reply via email to