Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 )
Change subject: IMPALA-9989 Improve admission control pool stats logging ...................................................................... Patch Set 27: (19 comments) http://gerrit.cloudera.org:8080/#/c/16220/27//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16220/27//COMMIT_MSG@77 PS27, Line 77: loggerd nit: typo http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.h File be/src/runtime/mem-tracker.h: http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.h@448 PS27, Line 448: /// memory tracked by query memory trackers. The top element in the queue is the nit: by all children query mem trackers. http://gerrit.cloudera.org:8080/#/c/16220/23/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: http://gerrit.cloudera.org:8080/#/c/16220/23/be/src/runtime/mem-tracker.cc@481 PS23, Line 481: MemTracker* MemTracker::GetRootMemTracker() { : MemTracker* ancestor = this; : while (ancestor && ancestor->parent()) { : ancestor = ancestor->parent(); : } : return ancestor; : } is this used anywhere? http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc@422 PS27, Line 422: UpdatePoolStatsForQueries can you also add a test for this in mem-tracker-test http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc@458 PS27, Line 458: else { Add a DCHECK(tracker->is_query_mem_tracker_) to make sure these stats are collected only for query memtrackers since they dont make sense for trackers lower in the mem tracker hierarchy. Also mention in the method comment for UpdatePoolStatsForQueries() that it should only be called for mem-trackers that are either query mem trackers or higher in the mem tracker hierarchy. http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller-test.cc File be/src/scheduling/admission-controller-test.cc: http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller-test.cc@972 PS27, Line 972: void hook_me(const char* hook) : { : bool echo = false; : FILE* fp = nullptr; : : while (!(fp = fopen(hook, "r"))) { : if ( !echo ) { : std::cout << "gdb -pid " << getpid() << ", then touch " << hook << std::endl; : echo = true; : } : } : fclose(fp); : } remove? http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h@735 PS27, Line 735: contains nit: contain http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h@1055 PS27, Line 1055: /// A helper type to glue information together to compute the topN queries : /// out of <n> topM queries. : typedef std::tuple<int64_t, string, TUniqueId, const TPoolStats*> Item; : const int64_t& getMemConsumed(const Item& item) const { return std::get<0>(item); } : const string& getName(const Item& item) const { return std::get<1>(item); } : const TUniqueId& getTUniqueId(const Item& item) const { return std::get<2>(item); } : const TPoolStats* getTPoolStats(const Item& item) const { return std::get<3>(item); } Guess i am a bit biased for using structs since i am kinda used to using them in this codebase, but this seems good too. http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@272 PS27, Line 272: DebugPoolStatsForConsumedMemory nit: AppendStatsForConsumedMemory http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@303 PS27, Line 303: // Return a debug string for memory consumption part of the pool stats. : string AdmissionController::PoolStats::DebugPoolStatsForConsumedMemory( : const TPoolStats& stats) const { : stringstream ss; : DebugPoolStatsForConsumedMemory(ss, stats); : return ss.str(); : } since DebugPoolStats is the only method using this, we can probably get rid of it and pass the ss object in DebugPoolStats directly, since this would generate the string twice (2 seperate calls to ss.str()) http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@361 PS27, Line 361: DebugTopNQueriesForAllPoolsInHost nit: GetDebugStringForTopNQueriesOnHost() or GetLogStringForTopNQueriesOnHost http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@368 PS27, Line 368: map nit: directly use pool_stats_ here http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@401 PS27, Line 401: std::get<1>(current) == std::get<1>(next) nit: please use the method you defined in the header file getName() http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@405 PS27, Line 405: DebugHeavyMemoryQueriesForAPoolInHost nit: how about LogQueriesForAPoolInHostAtIndices? http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@416 PS27, Line 416: // Return a string reporting top 5 queries with most memory consumed among all : // hosts in a pool. can you add an example string here too, since the format here will be a bit different http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@418 PS27, Line 418: DebugTopNQueriesForAllHostsInPool nit: GetDebugStringForTopNQueriesInPool() or GetLogStringForTopNQueriesInPool http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@428 PS27, Line 428: const TPoolStats& local = pool_stats->local_stats(); : for (auto& query : local.heavy_memory_queries) { : listOfTopNs.emplace_back( : Item(query.memory_consumed, host_id_, query.queryId, nullptr)); : } : : // Collect for all remote stats : for (auto& it : pool_stats->remote_stats()) { : const TPoolStats& remote_stats = it.second; : for (auto& query : remote_stats.heavy_memory_queries) { : listOfTopNs.emplace_back( : Item(query.memory_consumed, it.first /*host id*/, query.queryId, nullptr)); : } : } : : // If the number of items is 0, no need to go any further. : if (listOfTopNs.size() == 0) return ""; : : // Sort the list in descending order of memory consumed, host Ids and qid. : sort(listOfTopNs.begin(), listOfTopNs.end(), std::greater<Item>()); for my original comment about this. It seems like you are already iterating through all the entries in heavy_memory_queries, you can easily aggregate memory per query id for all these entries and then sort them at the end to get the real top N memory consumption. http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@469 PS27, Line 469: DebugHeavyMemoryQueriesForAHostInPool nit: LogQueriesForAHostInPoolAtIndices http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@819 PS27, Line 819: if ( not_admitted_details ) nit: enclosing brackets for multi line if statement -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 27 Gerrit-Owner: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Fri, 07 Aug 2020 00:46:08 +0000 Gerrit-HasComments: Yes