ctubbsii commented on PR #2665: URL: https://github.com/apache/accumulo/pull/2665#issuecomment-1152861564
> > I'm thinking it would be better to leverage the scan hints to control a ScanServer-aware dispatcher, rather than add a new API for the consistency level.
>
> I think this goes against the purpose of scan execution hints. They were created to modify execution behavior like priority, caching, and thread pool selection. They were never intended to change anything about data returned; it says so in their javadocs.

That's only because we didn't have even the possibility of returning data that wasn't immediately consistent before. Servers never previously had the option of returning stale data. But, now we have a whole new server type that we can dispatch to. It's not the scan execution hints that are modifying the behavior... it's the configured dispatcher. And, the scan hints are still not affecting the data returned... it's the server that it was dispatched to that is doing that.

A scan hint that explicitly says that eventual consistency is tolerable seems perfectly reasonable to me. It fits very naturally into the whole design of scan hints affecting dispatching. And the ability for a dispatcher to use ScanServers instead of TabletServers also seems perfectly natural. No fundamental design changes at all, and no special-purpose APIs needed to support the feature. The feature all works with the existing design elements wired together in a particular way.

We can easily update the javadoc to clarify that the scan hints affect only how the scan is dispatched, and not the data, but that the dispatcher could dispatch to a server that provides stale data (in the case of scan servers) if the scan hint says to do so.

> https://github.com/apache/accumulo/blob/d5f81877fcc794c8158f38b840d02331e3c563dc/core/src/main/java/org/apache/accumulo/core/client/ScannerBase.java#L342-L361
>
> Slightly related I created a new default scan server dispatcher. It's currently a PR against Dave's branch: [dlmarion#29](https://github.com/dlmarion/accumulo/pull/29). When running 100+ test scenarios this is what I realized I wanted. I wish I had had it when running all of those tests; I could have run a few more tests that I wanted to but could not. This new dispatcher is completely configuration driven (replacing the algorithm the previous default dispatcher had) and can be influenced by scan execution hints. If we merge this PR, I could close the PR on Dave's fork and make a PR on the main Accumulo GH.

The idea of creating a custom dispatcher that would work with the scan servers is exactly what I had in mind. However, I don't think it should be the default. I think in order to leverage scan servers, the user should:

1. Run some ScanServers,
2. Configure a `table.scan.dispatcher` to an implementation that is ScanServer-aware, and
3. Configure individual scans with the scan execution hint recognized by that dispatcher to instruct it to dispatch to ScanServers
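To make the three steps listed above concrete, here is a minimal, self-contained sketch of the decision a ScanServer-aware dispatcher might make. It deliberately does not use the real Accumulo dispatcher SPI. The class name `HintDispatchSketch`, the `ServerType` enum, and the hint key/value `consistency=eventual` are all hypothetical names chosen for illustration, not anything defined by this PR. The point is only that the hint states what the scan tolerates, while the configured dispatcher decides where the scan goes.

```java
import java.util.Map;

// Hypothetical sketch only: none of these names come from the Accumulo API.
final class HintDispatchSketch {

  // The two kinds of server a scan could be dispatched to.
  enum ServerType { TABLET_SERVER, SCAN_SERVER }

  // Assumed hint key and value; a real dispatcher would document its own.
  static final String HINT_KEY = "consistency";
  static final String EVENTUAL = "eventual";

  // Chooses a server type from the scan's execution hints.
  // Scans go to a ScanServer only when the hint explicitly opts in and
  // ScanServers are actually running; everything else stays on TabletServers.
  static ServerType chooseServerType(Map<String, String> hints, boolean scanServersAvailable) {
    boolean staleDataTolerated = EVENTUAL.equals(hints.get(HINT_KEY));
    if (staleDataTolerated && scanServersAvailable) {
      return ServerType.SCAN_SERVER;
    }
    return ServerType.TABLET_SERVER;
  }

  public static void main(String[] args) {
    // No hint: the scan keeps its immediately consistent behavior.
    System.out.println(chooseServerType(Map.of(), true)); // TABLET_SERVER
    // Hint set and ScanServers running: stale reads were requested explicitly.
    System.out.println(chooseServerType(Map.of(HINT_KEY, EVENTUAL), true)); // SCAN_SERVER
    // Hint set but no ScanServers running: fall back to TabletServers.
    System.out.println(chooseServerType(Map.of(HINT_KEY, EVENTUAL), false)); // TABLET_SERVER
  }
}
```

In this sketch a scan that sets no hint is never sent to a ScanServer, which matches the opt-in behavior argued for above. Whether a real dispatcher should fall back to TabletServers when no ScanServers are available is a separate design choice that this sketch simply assumes.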
