Andrew Wong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17743 )

Change subject: KUDU-1954 [maintenance] Add an optional flush-only thread pool
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17743/3/src/kudu/util/maintenance_manager.cc
File src/kudu/util/maintenance_manager.cc:

http://gerrit.cloudera.org:8080/#/c/17743/3/src/kudu/util/maintenance_manager.cc@535
PS3, Line 535:     return {low_io_most_logs_retained_bytes_op,
             :             std::move(notes),
             :             
LAUNCH_POOL(low_io_most_logs_retained_bytes_op->type(), flush_pool, 
general_pool)};
> In my opinion now FindBestOp not only to find best op, and also find the th
I do have a few points about FindBestOp/RunSchedulerThread's clarity if going 
about this approach.
- It requires readers to know that FindBestOp() actually returns more than just 
the op, but also the threadpool it should run in.
- While the LAUNCH_POOL macro is fine for now, it's difficult to extend if we 
think there will be further improvements to our threadpool selection policy.
- Now, FindBestOp() is more complex for a reason that doesn't at all relate to 
finding the best operation. The only reason FindBestOp() would need to know 
about threadpools is to ensure we have threads available, which is a much 
simpler task than determining what pools to run work in. Similar points can be 
made about HasFreeThreads(), which is now replaced with the less intuitive 
*LaunchPool() methods, which also add complexity to CouldNotLaunchNewOp().

Something like the following would alleviate these concerns:

(Leaving MaintenanceManager::FindBestOp() as is...) We could rewrite 
LAUNCH_POOL as something like:

MaintenanceManager::FindPoolForOp(Op* op) {
  if (thread_pools_.size() == 1) {
    return thread_pools_[0].get();
  }
  // this will also need to check the currently running ops...
  switch (op->type) {
    case PerfImprovementOpType::FLUSH_OP:
      return thread_pools_[1].get();
    default:
      return thread_pools_[0].get();
  }
  __builtin_unreachable();
}

 ... and in the scheduler thread, have
 auto best_op_and_why = FindBestOp();
 op = std::get<0>(best_op_and_why);
 op_note = std::move(std::get<1>(best_op_and_why));
 op_launch_pool = FindPoolForOp(op);

In that way, FindPoolForOp() could also encapsulate some of the complexity that 
is now in FindBestOp() that is dedicated to determining what pools are 
available. We could then do away with 'general_pool' and 'flush_pool' in 
FindBestOp(), and revert HasFreeThreads() to just checking the number of 
threads being used. Overall, this decouples the complexity of finding an op and 
finding a threadpool for an op, which are entirely separate concerns; 
separating them out into separate functions like this makes them easier to 
maintain and extend in the long run.



--
To view, visit http://gerrit.cloudera.org:8080/17743
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dca9d87e7287cc04482a1c18b1e0eefc5f9bcb1
Gerrit-Change-Number: 17743
Gerrit-PatchSet: 4
Gerrit-Owner: Yingchun Lai <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Bankim Bhavsar <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai <[email protected]>
Gerrit-Comment-Date: Mon, 16 Aug 2021 22:26:24 +0000
Gerrit-HasComments: Yes

Reply via email to