keith-turner commented on PR #2665:
URL: https://github.com/apache/accumulo/pull/2665#issuecomment-1156635752
> The client side dispatcher concept is very different from the executor
> dispatching that is done in the tserver, but has a very similar name. It might
> be helpful to have this named completely differently... like "server chooser"
> or "tablet scanner server type selector"

I kinda like `ScanServerChooser`
> So, that does imply a new kind of SPI or configuration to do the server
> selection inside the client (or... a different client entirely rather than
> modify the existing client).

I am not sure if you are implying the client side plugin should have control
over choosing tservers and sservers. If so, I would like to avoid that and
keep the plugin narrowly scoped to choosing scan servers because of the
following:
* Any scan server can be chosen to service a query for a tablet. Only one
tserver can be chosen to service a tablet scan.
* Scan servers have a busy timeout and tservers do not. The plugin
specifies the busy timeout to use.
* History of busy timeout events is given to the plugin. This allows it to
possibly choose a different scan server based on past events.
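
To make the three points above concrete, here is a minimal sketch of what a narrowly scoped scan server chooser could look like. Everything in it is hypothetical (`ScanServerChooserSketch`, `BusyEvent`, `Choice`, the 100 ms base timeout, the doubling rule); it is not the SPI proposed in the PR, only an illustration of a plugin that receives busy timeout history and returns both a scan server and a busy timeout.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical types for illustration only; this is not the Accumulo SPI.
final class ScanServerChooserSketch {

  /** A past attempt where a scan server reported busy for a tablet. */
  static final class BusyEvent {
    final String tabletId;
    final String server;

    BusyEvent(String tabletId, String server) {
      this.tabletId = tabletId;
      this.server = server;
    }
  }

  /** The plugin's decision: which scan server to try and which busy timeout to use. */
  static final class Choice {
    final String server;
    final Duration busyTimeout;

    Choice(String server, Duration busyTimeout) {
      this.server = server;
      this.busyTimeout = busyTimeout;
    }
  }

  /**
   * Any scan server may serve any tablet, so prefer the candidate with the
   * fewest past busy events for this tablet (ties go to the earliest in the list).
   */
  static Choice choose(String tabletId, List<String> scanServers, List<BusyEvent> history) {
    if (scanServers.isEmpty()) {
      throw new IllegalArgumentException("no scan servers available");
    }
    // Count this tablet's past busy events per server.
    Map<String, Integer> busyCounts = new HashMap<>();
    int totalBusy = 0;
    for (BusyEvent e : history) {
      if (e.tabletId.equals(tabletId)) {
        busyCounts.merge(e.server, 1, Integer::sum);
        totalBusy++;
      }
    }
    String best = scanServers.get(0);
    for (String server : scanServers) {
      if (busyCounts.getOrDefault(server, 0) < busyCounts.getOrDefault(best, 0)) {
        best = server;
      }
    }
    // Assumed policy: start at 100 ms and double per past busy event, at most 5 times.
    long timeoutMs = 100L << Math.min(totalBusy, 5);
    return new Choice(best, Duration.ofMillis(timeoutMs));
  }
}
```

A tserver equivalent would have nothing to choose, since only one tserver hosts a given tablet, which is part of why the two decisions do not fit well in one plugin.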
Choosing a tserver and choosing a scan server work very differently, and I
don't think it would be good to try to have one plugin do both. Also, the logic
for choosing a tserver is not flexible; there is basically only one way to
do it ATM.
Working on this I have realized if we did have anything pluggable for
tservers, it would probably not be around choosing a tserver but more about
backoff strategies in the case of failures. I think that would be another
narrowly scoped plugin that makes very specific decisions.
> Those are specifically scan executor hints, and should be used only by the
> dispatcher inside the server, because the dispatcher inside the server
> dispatches to an executor.

I think it makes sense to pass the scan exec hints to the
ScanServerChooser/ScanServerDispatcher plugin in addition to the plugins dealing
with caching, prioritization, and thread pool selection on the server side.
Consider the case where, in the code, I set scan_hints to one of
`scan_type=gold`, `scan_type=silver`, or `scan_type=iron`. I could start off
by configuring multiple runtime plugins to do the following (on tserver and
scan server):
* When we see scan_type=gold enable full caching, use a dedicated thread
pool A with 32 threads
* When we see scan_type=silver enable opportunistic caching, use a thread
pool B with 8 threads, set the scan prio to 1 in the thread pool queue
* When we see scan_type=iron disable caching, and use a thread pool
B with 8 threads, set the scan prio to 2 in the thread pool queue
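
As a rough illustration of the list above, the server-side reaction to `scan_type` can be thought of as a lookup from the hint value to a small policy. The class, enum, and field names below are hypothetical and do not correspond to Accumulo configuration properties or plugin interfaces; only the values (pools A and B, 32 and 8 threads, priorities 1 and 2) come from the list.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; these names are illustrative, not Accumulo configuration or SPI.
final class ScanTypePolicySketch {

  enum CacheMode { FULL, OPPORTUNISTIC, DISABLED }

  static final class Policy {
    final CacheMode cacheMode;
    final String threadPool;
    final int poolThreads;
    final Integer priority; // null when the list above does not specify a priority

    Policy(CacheMode cacheMode, String threadPool, int poolThreads, Integer priority) {
      this.cacheMode = cacheMode;
      this.threadPool = threadPool;
      this.poolThreads = poolThreads;
      this.priority = priority;
    }
  }

  // One entry per scan_type value from the list above.
  static final Map<String, Policy> POLICIES = Map.of(
      "gold", new Policy(CacheMode.FULL, "A", 32, null),
      "silver", new Policy(CacheMode.OPPORTUNISTIC, "B", 8, 1),
      "iron", new Policy(CacheMode.DISABLED, "B", 8, 2));

  /** Looks up the policy for the scan_type hint; empty if the hint is absent or unknown. */
  static Optional<Policy> policyFor(Map<String, String> hints) {
    return Optional.ofNullable(hints.get("scan_type")).map(POLICIES::get);
  }
}
```

Because the scan only carries the hint and the table lives in configuration, the mapping can change at runtime without touching the code that sets `scan_type`.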
Then later I could change config at runtime to react to the scan types
differently, like:
* When we see scan_type gold and it's eventual, use a dedicated group of
scan servers with large memory and full caching enabled
* When we see scan_type silver and it's eventual, use the default set of scan
servers. On the scan server enable caching, use a thread pool B with 8
threads, set the scan prio to 1 in the thread pool queue for this scan type.
* When we see scan_type iron and it's eventual, use the default set of scan
servers. On the scan server disable caching, use a thread pool B with 8
threads, set the scan prio to 2 in the thread pool queue for this scan type.
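
The client-side half of that later configuration could be sketched as below. The group names `large_memory` and `default`, the `Consistency` enum, and the method are all hypothetical stand-ins, and the sketch assumes that scans which are not eventual keep going to the tserver hosting the tablet, so no scan server group is chosen for them.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of client-side scan server group selection; not an Accumulo API.
final class ScanServerGroupSketch {

  enum Consistency { IMMEDIATE, EVENTUAL }

  /**
   * Returns the scan server group for a scan, or empty when the scan is not
   * eventual (assumed here to mean it is served by the tablet's tserver).
   */
  static Optional<String> groupFor(Map<String, String> hints, Consistency consistency) {
    if (consistency != Consistency.EVENTUAL) {
      return Optional.empty();
    }
    // gold gets the dedicated large-memory group; every other scan type uses the default set.
    String group = "gold".equals(hints.get("scan_type")) ? "large_memory" : "default";
    return Optional.of(group);
  }
}
```

The same `scan_type` hint drives this choice and the server-side caching, thread pool, and priority decisions, so the code that sets the hint does not need to change when the policy does.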
So by passing the hints to any plugin involved in scan execution we can
change runtime config to respond to those hints in different ways over time
(using feedback from metrics) including scan server selection.