[GitHub] accumulo pull request: ACCUMULO-3602 BatchScanner optimization for...

keith-turner Wed, 08 Apr 2015 16:43:06 -0700

Github user keith-turner commented on the pull request:

    https://github.com/apache/accumulo/pull/25#issuecomment-91069467
  
    I was discussing the use case I mentioned offline w/ @ctubbsii.  This use 
case was a large number of ranges that cannot be generated by a function.  We 
determined that the function could handle this case well by storing the ranges 
somewhere other than the job conf.  For example, one could do the following.
    
     * Store 10,000,000 sorted ranges in a file in the distributed cache 
(assume thousands of tablets).
     * Using the provided function, each mapper opens the file and reads the 
ranges for the tablet it's working on.
     * The ranges returned by the function are used to initialize the batch 
scanner for each mapper.
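
    The per-mapper step above could be sketched roughly as follows. This is 
only an illustration, not the PR's code: `TabletRangeFilter`, `RowRange`, and 
`rangesForTablet` are hypothetical names, rows are plain strings rather than 
Accumulo `Range` objects, both range endpoints are assumed inclusive, and the 
tablet is assumed to cover (prevEndRow, endRow].  Reading the file from the 
distributed cache is left out; the sorted ranges are passed in as a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the Accumulo API.
public class TabletRangeFilter {

    /** A row range with inclusive start and end rows (a simplifying assumption). */
    public static final class RowRange {
        public final String startRow;
        public final String endRow;

        public RowRange(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }
    }

    /**
     * Selects the ranges that overlap a tablet covering (prevEndRow, endRow].
     * The input must be sorted by start row. A null prevEndRow means the
     * tablet has no lower bound; a null endRow means it has no upper bound.
     */
    public static List<RowRange> rangesForTablet(
            List<RowRange> sortedRanges, String prevEndRow, String endRow) {
        List<RowRange> result = new ArrayList<>();
        for (RowRange r : sortedRanges) {
            // Sorted by start row: once a range starts past the tablet's
            // end row, no later range can overlap the tablet either.
            if (endRow != null && r.startRow.compareTo(endRow) > 0) {
                break;
            }
            // Skip ranges that end at or before the exclusive lower bound.
            if (prevEndRow != null && r.endRow.compareTo(prevEndRow) <= 0) {
                continue;
            }
            result.add(r);
        }
        return result;
    }
}
```

    Each mapper would then pass the returned ranges to its batch scanner.  
With 10,000,000 ranges a linear pass per mapper may be too slow; since the 
file is sorted, a binary search or an index to find the first candidate range 
would be a natural refinement.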
    
    It seems like all of the use cases that the current implementation 
satisfies could also be satisfied with the functor implementation.  
    
    If this PR does not follow the functor approach discussed, that's OK w/ me.  
 I can open a follow-on issue to record the discussion if it's not pursued 
here.



