Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
......................................................................


Patch Set 27:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/16220/27//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16220/27//COMMIT_MSG@77
PS27, Line 77: loggerd
nit: typo


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.h@448
PS27, Line 448:   /// memory tracked by query memory trackers. The top element 
in the queue is the
nit: by all children query mem trackers.


http://gerrit.cloudera.org:8080/#/c/16220/23/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/16220/23/be/src/runtime/mem-tracker.cc@481
PS23, Line 481: MemTracker* MemTracker::GetRootMemTracker() {
              :   MemTracker* ancestor = this;
              :   while (ancestor && ancestor->parent()) {
              :     ancestor = ancestor->parent();
              :   }
              :   return ancestor;
              : }
is this used anywhere?


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc@422
PS27, Line 422: UpdatePoolStatsForQueries
can you also add a test for this in mem-tracker-test


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc@458
PS27, Line 458: else {
Add a DCHECK(tracker->is_query_mem_tracker_) to make sure these stats are 
collected only for query memtrackers since they dont make sense for trackers 
lower in the mem tracker hierarchy.

Also mention in the method comment for UpdatePoolStatsForQueries() that it 
should only be called for mem-trackers that are either query mem trackers or 
higher in the mem tracker hierarchy.


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller-test.cc
File be/src/scheduling/admission-controller-test.cc:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller-test.cc@972
PS27, Line 972: void hook_me(const char* hook)
              : {
              :   bool echo = false;
              :   FILE* fp = nullptr;
              :
              :   while (!(fp = fopen(hook, "r"))) {
              :      if ( !echo ) {
              :        std::cout << "gdb -pid " << getpid() << ", then touch " 
<< hook << std::endl;
              :        echo = true;
              :      }
              :   }
              :   fclose(fp);
              : }
remove?


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h@735
PS27, Line 735: contains
nit: contain


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h@1055
PS27, Line 1055:   /// A helper type to glue information together to compute 
the topN queries
               :   /// out of <n> topM queries.
               :   typedef std::tuple<int64_t, string, TUniqueId, const 
TPoolStats*> Item;
               :   const int64_t& getMemConsumed(const Item& item) const { 
return std::get<0>(item); }
               :   const string& getName(const Item& item) const { return 
std::get<1>(item); }
               :   const TUniqueId& getTUniqueId(const Item& item) const { 
return std::get<2>(item); }
               :   const TPoolStats* getTPoolStats(const Item& item) const { 
return std::get<3>(item); }
Guess i am a bit biased for using structs since i am kinda used to using them 
in this codebase, but this seems good too.


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@272
PS27, Line 272: DebugPoolStatsForConsumedMemory
nit: AppendStatsForConsumedMemory


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@303
PS27, Line 303: // Return a debug string for memory consumption part of the 
pool stats.
              : string 
AdmissionController::PoolStats::DebugPoolStatsForConsumedMemory(
              :     const TPoolStats& stats) const {
              :   stringstream ss;
              :   DebugPoolStatsForConsumedMemory(ss, stats);
              :   return ss.str();
              : }
since DebugPoolStats is the only method using this, we can probably get rid of 
it and pass the ss object in DebugPoolStats directly, since this would generate 
the string twice (2 seperate calls to ss.str())


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@361
PS27, Line 361: DebugTopNQueriesForAllPoolsInHost
nit:  GetDebugStringForTopNQueriesOnHost() or GetLogStringForTopNQueriesOnHost


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@368
PS27, Line 368: map
nit: directly use pool_stats_ here


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@401
PS27, Line 401: std::get<1>(current) == std::get<1>(next)
nit: please use the method you defined in the header file getName()


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@405
PS27, Line 405: DebugHeavyMemoryQueriesForAPoolInHost
nit: how about LogQueriesForAPoolInHostAtIndices?


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@416
PS27, Line 416: // Return a string reporting top 5 queries with most memory 
consumed among all
              : // hosts in a pool.
can you add an example string here too, since the format here will be a bit 
different


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@418
PS27, Line 418: DebugTopNQueriesForAllHostsInPool
nit: GetDebugStringForTopNQueriesInPool() or GetLogStringForTopNQueriesInPool


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@428
PS27, Line 428:   const TPoolStats& local = pool_stats->local_stats();
              :   for (auto& query : local.heavy_memory_queries) {
              :     listOfTopNs.emplace_back(
              :       Item(query.memory_consumed, host_id_, query.queryId, 
nullptr));
              :   }
              :
              :   // Collect for all remote stats
              :   for (auto& it : pool_stats->remote_stats()) {
              :     const TPoolStats& remote_stats = it.second;
              :     for (auto& query : remote_stats.heavy_memory_queries) {
              :       listOfTopNs.emplace_back(
              :           Item(query.memory_consumed, it.first /*host id*/, 
query.queryId, nullptr));
              :     }
              :   }
              :
              :   // If the number of items is 0, no need to go any further.
              :   if (listOfTopNs.size() == 0) return "";
              :
              :   // Sort the list in descending order of memory consumed, host 
Ids and qid.
              :   sort(listOfTopNs.begin(), listOfTopNs.end(), 
std::greater<Item>());
for my original comment about this. It seems like you are already iterating 
through all the entries in heavy_memory_queries, you can easily aggregate 
memory per query id for all these entries and then sort them at the end to get 
the real top N memory consumption.


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@469
PS27, Line 469: DebugHeavyMemoryQueriesForAHostInPool
nit: LogQueriesForAHostInPoolAtIndices


http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@819
PS27, Line 819:     if ( not_admitted_details )
nit: enclosing brackets for multi line if statement



--
To view, visit http://gerrit.cloudera.org:8080/16220
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Fri, 07 Aug 2020 00:46:08 +0000
Gerrit-HasComments: Yes

Reply via email to