ctubbsii commented on PR #2665:
URL: https://github.com/apache/accumulo/pull/2665#issuecomment-1152861564

   > > I'm thinking it would be better to leverage the scan hints to control a 
ScanServer-aware dispatcher, rather than add a new API for the consistency 
level.
   > 
   > I think this goes against the purpose of scan execution hints. They were 
created to modify execution behavior like priority, caching, and thread pool 
selection. They were never intended to change anything about data returned, it 
says so in their javadocs.
   
   That's only because, until now, there was no possibility of returning data 
that wasn't immediately consistent: the servers that returned data never had 
the option of returning stale data. But now we have a whole new server type 
that we can dispatch to. It's not the scan execution hints that are modifying 
the behavior... it's the configured dispatcher. And the scan hints are still 
not affecting the data returned... it's the server the scan was dispatched to 
that is doing that.
   
   A scan hint that explicitly says eventual consistency is tolerable seems 
perfectly reasonable to me. It fits very naturally into the whole design of 
scan hints affecting dispatching. And the ability for a dispatcher to use 
ScanServers instead of TabletServers also seems perfectly natural. No 
fundamental design changes at all, and no special-purpose APIs needed to 
support the feature. The whole feature works with the existing design elements 
wired together in a particular way.
   
   We can easily update the javadoc to clarify that the scan hints affect only 
how the scan is dispatched, not the data, but that the dispatcher could 
dispatch to a server that provides stale data (in the case of scan servers) if 
the scan hint says to do so.
   
   > 
   > 
https://github.com/apache/accumulo/blob/d5f81877fcc794c8158f38b840d02331e3c563dc/core/src/main/java/org/apache/accumulo/core/client/ScannerBase.java#L342-L361
   > 
   > Slightly related: I created a new default scan server dispatcher. It's 
currently a PR against Dave's branch: 
[dlmarion#29](https://github.com/dlmarion/accumulo/pull/29). When running 100+ 
test scenarios, this is what I realized I wanted. I wish I had had it when 
running all of those tests; I could have run a few more tests that I wanted to 
but could not. This new dispatcher is completely configuration driven 
(replacing the algorithm the previous default dispatcher had) and can be 
influenced by scan execution hints. If we merge this PR, I could close the PR 
on Dave's fork and make a PR on the main Accumulo GH.
   
   The idea of creating a custom dispatcher that would work with the scan 
servers is exactly what I had in mind. However, I don't think it should be the 
default.
   
   I think in order to leverage scan servers, the user should:
   1. Run some ScanServers,
   2. Set `table.scan.dispatcher` to an implementation that is 
ScanServer-aware, and
   3. Configure individual scans with the scan execution hint recognized by 
that dispatcher to instruct it to dispatch to ScanServers.
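
To illustrate how those three steps could fit together, here is a minimal sketch in plain Java. It deliberately does not use the real Accumulo dispatcher SPI: the class `HintDispatchSketch`, the enum `ServerType`, the hint key `consistency`, and the value `eventual` are all hypothetical names chosen for the example. The only point is that the hint expresses tolerance for stale data, and the configured dispatcher decides where the scan goes.

```java
import java.util.Map;

// Hypothetical sketch; none of these names come from the Accumulo API.
class HintDispatchSketch {

    // Stand-ins for the two kinds of servers a scan could be sent to.
    enum ServerType { TABLET_SERVER, SCAN_SERVER }

    // Assumed hint key and value; a real dispatcher would define its own.
    static final String CONSISTENCY_HINT = "consistency";
    static final String EVENTUAL = "eventual";

    // Chooses a server type from the scan's execution hints.
    // Scans without the hint, or run while no ScanServers are available
    // (step 1 not done), stay on TabletServers.
    static ServerType dispatch(Map<String, String> hints, boolean scanServersAvailable) {
        boolean staleDataTolerated = EVENTUAL.equals(hints.get(CONSISTENCY_HINT));
        if (staleDataTolerated && scanServersAvailable) {
            return ServerType.SCAN_SERVER;
        }
        return ServerType.TABLET_SERVER;
    }

    public static void main(String[] args) {
        // No hint: the scan is dispatched as it always was.
        System.out.println(dispatch(Map.of(), true));
        // Hint present and ScanServers running: dispatched to a ScanServer.
        System.out.println(dispatch(Map.of(CONSISTENCY_HINT, EVENTUAL), true));
        // Hint present but no ScanServers: falls back to a TabletServer.
        System.out.println(dispatch(Map.of(CONSISTENCY_HINT, EVENTUAL), false));
    }
}
```

In this sketch the hint never touches the data itself. It only changes the dispatch decision, and any staleness comes from the server type that was selected. Falling back to a TabletServer when no ScanServers are running is a design choice made for the example, not something stated in this thread.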
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
