Lars Volker has posted comments on this change.
Change subject: IMPALA-3342, IMPALA-3920: Adding thread counters to obtain
thread stats per scanner thread, and to measure time spent in
per-fragment-executor Open() and ProcessBuildInputAsync calls
..
Patch Set 1:
(13 comments)
http://gerrit.cloudera.org:8080/#/c/4561/1//COMMIT_MSG
Commit Message:
Nit: You might want to set your git user name, so it shows up in your commits.
git config --global user.name "Anuj Phadke"
Line 11: Initial draft. Will add a detailed commit message later.
I couldn't figure out how this addresses IMPALA-3920. Can you explain in the
commit message?
http://gerrit.cloudera.org:8080/#/c/4561/1/be/src/exec/blocking-join-node.h
File be/src/exec/blocking-join-node.h:
Line 68: RuntimeProfile::ThreadCounters* process_build_input_async_counters_;
Can this be protected?
http://gerrit.cloudera.org:8080/#/c/4561/1/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:
Line 358: string thread_name = "ScannerThread-" +
std::to_string(syscall(SYS_gettid))+ "-";
Something similar is done in L348. Why are the names different? Also, I think
we often prefer to use stringstream<< or Substitute() to build strings.
Maybe even better, pass in the name from the caller, similar to the kudu scan
node.
http://gerrit.cloudera.org:8080/#/c/4561/1/be/src/exec/hdfs-scan-node.h
File be/src/exec/hdfs-scan-node.h:
Line 25: #include
these don't seem used here. Can you move them to the .cc file where they are
used?
http://gerrit.cloudera.org:8080/#/c/4561/1/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:
PS1, Line 274: thread_name
I think you already get the thread name in the 'name' parameter.
Line 275: scanner_thread_counters_ =
I understand that this adds one counter per thread. Couldn't this be a lot of
counters and thus lines we add to the profile? Have you considered adding a
histogram of counters instead of one per thread?
http://gerrit.cloudera.org:8080/#/c/4561/1/be/src/exec/kudu-scan-node.h
File be/src/exec/kudu-scan-node.h:
Line 24: #include
these don't seem used here. Can you move them to the .cc file where they are
used?
http://gerrit.cloudera.org:8080/#/c/4561/1/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:
Line 306: plan_fragment_counters_ = ADD_THREAD_COUNTERS(profile(),
Does this have to be a member variable, i.e. does it have to be valid past the
scope of this method?
http://gerrit.cloudera.org:8080/#/c/4561/1/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:
Line 150: ///Name of the plan fragment counter
Describe what the counter tracks. The comment and name don't seem to match.
Line 153: ///Fragment thread counters
Can you explain what these count?
Line 154: RuntimeProfile::ThreadCounters* plan_fragment_counters_;
Make this protected?
Line 156: RuntimeProfile::ThreadCounters* plan_fragment_counters() const {
the variable seems to be used internally only, so I don't see why this getter
is needed. However, I saw that pattern in other files, too, and if you want to
follow it, maybe it's best to also have a setter to initialize the variable.
Currently it's set by direct assignment and then referenced with the getter,
which I think looks a bit confusing.
--
To view, visit http://gerrit.cloudera.org:8080/4561
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b148eafb76319e727e8d0ef33c49b300b0c887f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke
Gerrit-Reviewer: Lars Volker
Gerrit-Reviewer: Tim Armstrong
Gerrit-HasComments: Yes