keith-turner commented on PR #2665:
URL: https://github.com/apache/accumulo/pull/2665#issuecomment-1156635752

   > The client side dispatcher concept is very different from the executor 
dispatching that is done in the tserver, but has a very similar name. It might 
be helpful to have this named completely differently... like "server chooser" 
or "tablet scanner server type selector"
   
   I kinda like `ScanServerChooser`
   
   > So, that does imply a new kind of SPI or configuration to do the server 
selection inside the client (or... a different client entirely rather than 
modify the existing client).
   
   I am not sure whether you are implying that the client-side plugin should 
control the choice of both tservers and scan servers.  If so, I would like to 
avoid that and keep the plugin narrowly scoped to choosing scan servers, for 
the following reasons:
   
    * Any scan server can be chosen to service a query for a tablet.  Only one 
tserver can be chosen to service a tablet scan.
    * Scan servers have a busy timeout and tservers do not.  The plugin 
specifies the busy timeout to use.
    * History of busy timeout events is given to the plugin.  This allows it to 
possibly choose a different scan server based on past events.
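
To make the three points above concrete, here is a minimal sketch of what such a narrowly scoped plugin could look like. Everything in it is hypothetical and is not the interface in this PR: the names `ScanServerChooser`, `Choice`, and `LeastBusyChooser`, the `busyCounts` map standing in for the busy timeout history, and the 100 ms base timeout are all assumptions made for illustration.

```java
import java.time.Duration;
import java.util.List;
import java.util.Map;

public class ScanServerChooserSketch {

  /** The plugin's answer: which scan server to use and which busy timeout to apply. */
  public static final class Choice {
    public final String server;
    public final Duration busyTimeout;

    public Choice(String server, Duration busyTimeout) {
      this.server = server;
      this.busyTimeout = busyTimeout;
    }
  }

  /** Hypothetical plugin contract, assumed for this sketch only. */
  public interface ScanServerChooser {
    /**
     * @param busyCounts past busy timeout events per scan server for this tablet
     *                   (a stand-in for whatever history the real plugin receives)
     */
    Choice choose(String tabletId, List<String> scanServers, Map<String, Integer> busyCounts);
  }

  /** Starts at a hash-derived server and prefers the one with the fewest busy events. */
  public static final class LeastBusyChooser implements ScanServerChooser {
    @Override
    public Choice choose(String tabletId, List<String> scanServers,
        Map<String, Integer> busyCounts) {
      if (scanServers.isEmpty()) {
        throw new IllegalArgumentException("no scan servers available");
      }
      // Any scan server may serve any tablet, so the starting point is arbitrary.
      int start = Math.floorMod(tabletId.hashCode(), scanServers.size());
      String best = null;
      int bestBusy = Integer.MAX_VALUE;
      for (int i = 0; i < scanServers.size(); i++) {
        String candidate = scanServers.get((start + i) % scanServers.size());
        int busy = busyCounts.getOrDefault(candidate, 0);
        if (busy < bestBusy) {
          best = candidate;
          bestBusy = busy;
        }
      }
      // Assumed policy: double the busy timeout per past busy event, capped at 32x.
      long millis = 100L << Math.min(bestBusy, 5);
      return new Choice(best, Duration.ofMillis(millis));
    }
  }
}
```

Nothing comparable applies to tservers: there is only one candidate per tablet and no busy timeout to pick.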
   
   How we choose a tserver and how we choose a scan server are very different, 
and I don't think it would be good to try to have one plugin do both.  Also, 
the logic for choosing a tserver is not flexible; there is basically only one 
way to do it ATM.
   
   Working on this, I have realized that if we did have anything pluggable for 
tservers, it would probably not be about choosing a tserver but about backoff 
strategies in the case of failures. I think that would be another narrowly 
scoped plugin that makes very specific decisions.
   
   > Those are specifically scan executor hints, and should be used only by the 
dispatcher inside the server, because the dispatcher inside the server 
dispatches to an executor.
   
   I think it makes sense to pass the scan exec hints to the 
ScanServerChooser/ScanServerDispatcher plugin in addition to the plugins 
dealing with caching, prioritization, and thread pool selection on the server 
side.  Consider the case where in the code I set scan_hints to either 
`scan_type=gold`, `scan_type=silver`, or `scan_type=iron`. I could start off 
configuring multiple runtime plugins to do the following (on tserver and scan 
server):
   
     * When we see scan_type=gold enable full caching, use a dedicated thread 
pool A with 32 threads
     * When we see scan_type=silver enable opportunistic caching, use a thread 
pool B with 8 threads, set the scan prio to 1 in the thread pool queue
     * When we see scan_type=iron disable caching, and use a thread pool B 
with 8 threads, set the scan prio to 2 in the thread pool queue
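
As a rough illustration of that first configuration, the mapping from hint to server-side behavior could be sketched like this. It is not Accumulo's configuration mechanism; the `ScanHintProfiles` class, the `Profile` type, and the `CacheMode` enum are made up for the example, and the priority of 0 for gold and the fallback for an unknown or missing `scan_type` are assumptions, since neither is stated above.

```java
import java.util.Map;

public class ScanHintProfiles {

  public enum CacheMode { FULL, OPPORTUNISTIC, DISABLED }

  /** Server-side settings selected by a scan_type hint (hypothetical type). */
  public static final class Profile {
    public final CacheMode cache;
    public final String pool;
    public final int threads;
    public final int priority;

    public Profile(CacheMode cache, String pool, int threads, int priority) {
      this.cache = cache;
      this.pool = pool;
      this.threads = threads;
      this.priority = priority;
    }
  }

  // Assumption: scans with no recognized scan_type are treated like iron.
  private static final Profile DEFAULT = new Profile(CacheMode.DISABLED, "B", 8, 2);

  private static final Map<String, Profile> BY_TYPE = Map.of(
      // Gold gets dedicated pool A; priority 0 is an assumption.
      "gold", new Profile(CacheMode.FULL, "A", 32, 0),
      // Silver and iron share pool B and differ in caching and queue priority.
      "silver", new Profile(CacheMode.OPPORTUNISTIC, "B", 8, 1),
      "iron", new Profile(CacheMode.DISABLED, "B", 8, 2));

  /** Looks up the profile for the scan_type carried in the scan's hints. */
  public static Profile profileFor(Map<String, String> hints) {
    String scanType = hints.get("scan_type");
    return scanType == null ? DEFAULT : BY_TYPE.getOrDefault(scanType, DEFAULT);
  }
}
```

The client code only ever sets `scan_type`; what each value means lives entirely in this table, which is the part that can change at runtime.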
   
   Then later I could change config at runtime to react to the scan types 
differently, like:
   
   * When we see scan_type gold and it's eventual, use a dedicated group of 
scan servers with large memory and full caching enabled.
   * When we see scan_type silver and it's eventual, use the default set of 
scan servers.  On the scan server enable caching, use a thread pool B with 8 
threads, set the scan prio to 1 in the thread pool queue for this scan type.
   * When we see scan_type iron and it's eventual, use the default set of scan 
servers.  On the scan server disable caching, use a thread pool B with 8 
threads, set the scan prio to 2 in the thread pool queue for this scan type.
   
   So by passing the hints to any plugin involved in scan execution, we can 
change runtime config to respond to those hints in different ways over time 
(using feedback from metrics), including scan server selection.
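
A small sketch of the client-side half of that later configuration, assuming the chooser plugin is handed the same hints plus the scan's consistency level. The class name, the `Consistency` enum, and the group names `dedicated-large-memory` and `default` are all hypothetical; an empty result here just means the scan is not eventual and so does not go to scan servers at all.

```java
import java.util.Map;
import java.util.Optional;

public class ScanServerGroupSketch {

  /** Stand-in for the scan's consistency setting (hypothetical enum). */
  public enum Consistency { IMMEDIATE, EVENTUAL }

  /**
   * Picks a scan server group from the scan_type hint. Returns empty when the
   * scan is not eventual, meaning scan servers are not used for it.
   */
  public static Optional<String> groupFor(Map<String, String> hints, Consistency consistency) {
    if (consistency != Consistency.EVENTUAL) {
      return Optional.empty();
    }
    // Gold scans get the dedicated large-memory group; everything else the default set.
    boolean gold = "gold".equals(hints.get("scan_type"));
    return Optional.of(gold ? "dedicated-large-memory" : "default");
  }
}
```

The scan code that sets `scan_type=gold` is unchanged between the two configurations; only the plugins' reaction to the hint differs.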
   