[ 
https://issues.apache.org/jira/browse/ACCUMULO-501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13241184#comment-13241184
 ] 

Keith Turner commented on ACCUMULO-501:
---------------------------------------

One thing we have discussed before is storing a key count in the index for each 
block.  With that, a scan of the index over the region of the file that 
overlaps the tablet would give a fairly accurate count.
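
A minimal sketch of that idea, not RFile's actual index format: it assumes a 
hypothetical index whose entries hold the last key of each data block plus the 
number of keys in that block, and a tablet range of (start, end] where None 
means unbounded.  The names IndexEntry and estimate_keys_in_range are made up 
for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IndexEntry:
    # Hypothetical index entry: last key in the data block and how many keys it holds.
    last_key: str
    key_count: int


def estimate_keys_in_range(index: List[IndexEntry],
                           start: Optional[str],
                           end: Optional[str]) -> int:
    """Sum the counts of every block that may hold keys in (start, end]."""
    total = 0
    prev_last = None  # last key of the previous block
    for entry in index:
        # This block covers keys in (prev_last, entry.last_key].
        before_range = start is not None and entry.last_key <= start
        after_range = (end is not None and prev_last is not None
                       and prev_last >= end)
        if not before_range and not after_range:
            total += entry.key_count
        prev_last = entry.last_key
    return total


# Sample index: five blocks of 100 keys each.
sample_index = [IndexEntry("d", 100), IndexEntry("h", 100),
                IndexEntry("m", 100), IndexEntry("s", 100),
                IndexEntry("z", 100)]
```

Blocks at the edges of the tablet are counted in full even when they only 
partly overlap it, so the estimate can be high by at most about two blocks' 
worth of keys.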
                
> RFile should store the key count in metadata
> --------------------------------------------
>
>                 Key: ACCUMULO-501
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-501
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>             Fix For: 1.5.0
>
>
> BulkImport estimates the number of keys in a file to be zero.  We store the 
> largest and smallest key in metadata; I think we can afford to store the key 
> count and use it to provide an estimate when we load the file into the tablet.  
> Perhaps if we know the start key is "a" and the end key is "z" and the tablet's 
> range is "a->m" we can just estimate 50% of the key count.
> When a bulk file fits completely in a range, the key count estimate will be 
> accurate.
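
The proportional estimate described in the issue could look roughly like the 
sketch below.  It assumes keys are byte strings spread evenly between the 
file's first and last key, and it maps each key to a number using its first 
few bytes.  The helper names key_to_fraction and estimate_key_count are 
hypothetical, not part of Accumulo.

```python
def key_to_fraction(key: bytes, width: int = 8) -> float:
    """Map the first `width` bytes of a key to a number in [0, 1)."""
    padded = key[:width].ljust(width, b"\x00")
    return int.from_bytes(padded, "big") / float(256 ** width)


def estimate_key_count(total_keys, file_first, file_last,
                       tablet_start=None, tablet_end=None):
    """Scale the file's key count by the share of its key range
    that falls inside the tablet (None means unbounded)."""
    lo = file_first if tablet_start is None else max(file_first, tablet_start)
    hi = file_last if tablet_end is None else min(file_last, tablet_end)
    if lo > hi:
        return 0  # the file and the tablet do not overlap
    span = key_to_fraction(file_last) - key_to_fraction(file_first)
    if span <= 0:
        return total_keys  # single-key file that lies inside the tablet
    fraction = (key_to_fraction(hi) - key_to_fraction(lo)) / span
    return round(total_keys * fraction)
```

With a file spanning "a" to "z" holding 1000 keys and a tablet covering "a" to 
"m", this gives 480, close to the 50% guess above.  When the file fits 
entirely inside the tablet it returns the full count.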

