Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17743 )
Change subject: KUDU-1954 [maintenance] Add an optional flush-only thread pool ...................................................................... Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/17743/3/src/kudu/util/maintenance_manager.cc File src/kudu/util/maintenance_manager.cc: http://gerrit.cloudera.org:8080/#/c/17743/3/src/kudu/util/maintenance_manager.cc@535 PS3, Line 535: return {low_io_most_logs_retained_bytes_op, : std::move(notes), : LAUNCH_POOL(low_io_most_logs_retained_bytes_op->type(), flush_pool, general_pool)}; > In my opinion now FindBestOp not only to find best op, and also find the th I do have a few points about FindBestOp/RunSchedulerThread's clarity if going about this approach. - It requires readers to know that FindBestOp() actually returns more than just the op, but also the threadpool it should run in. - While the LAUNCH_POOL macro is fine for now, it's difficult to extend if we think there will be further improvements to our threadpool selection policy. - Now, FindBestOp() is more complex for a reason that doesn't at all relate to finding the best operation. The only reason FindBestOp() would need to know about threadpools is to ensure we have threads available, which is a much simpler task than determining what pools to run work in. Similar points can be made about HasFreeThreads(), which is now replaced with the less intuitive *LaunchPool() methods, which also add complexity to CouldNotLaunchNewOp(). Something like the following would alleviate these concerns: (Leaving MaintenanceManager::FindBestOp() as is...) We could rewrite LAUNCH_POOL as something like: MaintenanceManager::FindPoolForOp(Op* op) { if (thread_pools_.size() == 1) { return thread_pools_[0].get(); } // this will also need to check the currently running ops... switch (op->type) { case PerfImprovementOpType::FLUSH_OP: return thread_pools_[1].get(); default: return thread_pools_[0].get(); } __builtin_unreachable(); } ... and in the scheduler thread, have auto best_op_and_why = FindBestOp(); op = std::get<0>(best_op_and_why); op_note = std::move(std::get<1>(best_op_and_why)); op_launch_pool = FindPoolForOp(op); In that way, FindPoolForOp() could also encapsulate some of the complexity that is now in FindBestOp() that is dedicated to determining what pools are available. We could then do away with 'general_pool' and 'flush_pool' in FindBestOp(), and revert HasFreeThreads() to just checking the number of threads being used. Overall, this decouples the complexity of finding an op and finding a threadpool for an op, which are entirely separate concerns; separating them out into separate functions like this makes them easier to maintain and extend in the long run. -- To view, visit http://gerrit.cloudera.org:8080/17743 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dca9d87e7287cc04482a1c18b1e0eefc5f9bcb1 Gerrit-Change-Number: 17743 Gerrit-PatchSet: 4 Gerrit-Owner: Yingchun Lai <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Bankim Bhavsar <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Yingchun Lai <[email protected]> Gerrit-Comment-Date: Mon, 16 Aug 2021 22:26:24 +0000 Gerrit-HasComments: Yes
