Github user keith-turner commented on the pull request:
https://github.com/apache/accumulo/pull/25#issuecomment-91069467
I was discussing the use case I mentioned offline w/ @ctubbsii. This use
case was a large number of ranges that cannot be generated by a function. We
determined that the function approach could handle this case well by storing
the ranges somewhere other than the job conf. For example, one could do the
following:
* Store 10,000,000 sorted ranges in a file in the distributed cache (assume
thousands of tablets)
* Using the provided function, each mapper opens the file and reads the
ranges for the tablet it's working on.
* The ranges returned by the function are used to initialize the batch
scanner for each mapper.
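
As a rough illustration of the steps above, here is a minimal Java sketch of what such a function might look like. Everything in it is an assumption made for illustration, not part of this PR or of the Accumulo API: the `TabletRangeLookup` class, the `RowRange` stand-in for an Accumulo range, the one-range-per-line `start<TAB>end` file layout, and the convention that a tablet covers rows in `(prevEndRow, endRow]` with `null` meaning unbounded. It only shows how a mapper could filter a sorted list of ranges down to the ones that overlap its tablet.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class TabletRangeLookup {

    /** Hypothetical stand-in for an Accumulo range: [startRow, endRow], both inclusive. */
    public static final class RowRange {
        public final String startRow;
        public final String endRow;

        public RowRange(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }

        @Override
        public String toString() {
            return "[" + startRow + "," + endRow + "]";
        }
    }

    /**
     * Returns the ranges that overlap the tablet (prevEndRow, endRow].
     * Assumes the input is sorted by start row, one "start<TAB>end" per line.
     * A null prevEndRow or endRow means the tablet is unbounded on that side.
     */
    public static List<RowRange> rangesForTablet(BufferedReader sortedRanges,
                                                 String prevEndRow,
                                                 String endRow) throws IOException {
        List<RowRange> result = new ArrayList<>();
        String line;
        while ((line = sortedRanges.readLine()) != null) {
            if (line.isEmpty()) {
                continue;
            }
            String[] parts = line.split("\t", 2);
            RowRange range = new RowRange(parts[0], parts[1]);

            // Input is sorted by start row, so once a range starts after the
            // tablet's end row, no later range can overlap the tablet.
            if (endRow != null && range.startRow.compareTo(endRow) > 0) {
                break;
            }
            // Skip ranges that end at or before the tablet's exclusive lower bound.
            if (prevEndRow != null && range.endRow.compareTo(prevEndRow) <= 0) {
                continue;
            }
            result.add(range);
        }
        return result;
    }
}
```

This sketch scans the file from the beginning for every tablet. With 10,000,000 ranges, a real implementation would probably want an index or a seekable file format so each mapper can jump close to its tablet's first range.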
It seems like all of the use cases that the current implementation
satisfies could also be satisfied with the functor implementation.
If this PR does not follow the functor approach discussed, that's OK w/ me.
I can open a follow-on issue to record the discussion if it's not pursued
here.