Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12069 )

Change subject: IMPALA-7694: Add host resource usage metrics to profile
......................................................................


Patch Set 11:

(12 comments)

Still wrapping my head around this big change. Some initial comments for now.

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/coordinator-backend-state.h@67
PS11, Line 67: RuntimeProfile* host_profiles
nit: Please add a comment on the role of this parameter.


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/coordinator-backend-state.h@249
PS11, Line 249: host_profile_ = nullptr;
nit: A brief statement of what this contains.


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/query-state.cc@296
PS11, Line 296:   // Add profile to report
              :   host_profile_->ToThrift(&profiles_forest->host_profile);
              :   profiles_forest->__isset.host_profile = true;
              :   // Free resources in chunked counters in the profile
              :   host_profile_->ClearChunkedTimeSeriesCounters();
I wonder how this may reconcile with https://gerrit.cloudera.org/#/c/12049/ ? 
In particular, with the mentioned patch, on a failure to send the profile 
report, the query state thread will back off, and retry sending a new profile.

So, the host profile is potentially another non-idempotent state added to the 
report. That said, how tolerant is the coordinator's side logic on handling 
missing chunks in this time series ? In other words, if we fail to send a 
report, how bad is it to let the host profile of this epoch to be dropped ?


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/runtime/runtime-state.cc@76
PS11, Line 76: profile_(RuntimeProfile::Create(obj_pool(), "<fragment 
instance>")),
Why this change ?


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile-counters.h
File be/src/util/runtime-profile-counters.h:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile-counters.h@443
PS11, Line 443: previous_sample_count_ = 0;
This deserves a comment.


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.h
File be/src/util/runtime-profile.h:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.h@419
PS11, Line 419:  Samples
              :   /// are not re-sampled into larger intervals, instead owners 
of this profile can call
              :   /// ClearChunkedTimeSeriesCounters() to reset the sample 
buffers of all chunked time
              :   /// series counters
So, misuse of this kind of counter (e.g. never calling 
ClearChunkedTimeSeriesCounters) may lead to large untracked memory usage, right 
?


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.cc@1216
PS11, Line 1216:
nit: blank space


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.cc@1231
PS11, Line 1231:
nit: blank space


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/runtime-profile.cc@1279
PS11, Line 1279: DCHECK(
DCHECK_EQ


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/system-state-info.h
File be/src/util/system-state-info.h:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/system-state-info.h@66
PS11, Line 66:   /// CaptureSystemStateSnapshot().
'cpu_ratios_' is updated after returning from this function.


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/system-state-info.h@69
PS11, Line 69: NUM_CPU_VALUES = 5;
Seems more future proof to have this as the last value in the enum below like:

   enum PROC_STAT_CPU_VALUES {
    CPU_USER = 0,
    CPU_NICE,
    CPU_SYSTEM,
    CPU_IDLE,
    CPU_IOWAIT,
    NUM_CPU_VALUES
  };


http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/system-state-info.cc
File be/src/util/system-state-info.cc:

http://gerrit.cloudera.org:8080/#/c/12069/11/be/src/util/system-state-info.cc@44
PS11, Line 44: int64_t SystemStateInfo::ParseInt64(const string& val) const {
             :   StringParser::ParseResult result;
             :   int64_t parsed = 
StringParser::StringToInt<int64_t>(val.c_str(), val.size(), &result);
             :   if (result == StringParser::PARSE_SUCCESS) return parsed;
             :   return -1;
             : }
Wonder if this fits better as a utility function in StringParser ?



--
To view, visit http://gerrit.cloudera.org:8080/12069
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3aedc20c553ab8d7ed50f72a1a936eba151487d9
Gerrit-Change-Number: 12069
Gerrit-PatchSet: 11
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Thu, 24 Jan 2019 06:01:09 +0000
Gerrit-HasComments: Yes

Reply via email to