[
https://issues.apache.org/jira/browse/ACCUMULO-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15308834#comment-15308834
]
Keith Turner commented on ACCUMULO-4314:
----------------------------------------
I just pushed the changes. Unfortunately, I noticed that I put the wrong issue
in the commit message: it references ACCUMULO-4318 (another issue I am
currently working on) instead of ACCUMULO-4314. The commit in 1.6 is 63a8a5d.
The merge commit in 1.7 is d33b2a0.
> Use statistics to choose better keys for RFile index
> ----------------------------------------------------
>
> Key: ACCUMULO-4314
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4314
> Project: Accumulo
> Issue Type: Improvement
> Reporter: Keith Turner
> Assignee: Keith Turner
> Priority: Blocker
> Fix For: 1.6.6, 1.7.2
>
>
> The commit for ACCUMULO-1124 makes two changes:
> * Generates shorter keys, which may not exist in the data, to place in the RFile index.
> * Uses statistics to make better choices about which keys to place in the index.
> These changes look for keys of average size or smaller and exclude large keys
> (keys more than 3 standard deviations above the average).
> The change to generate shorter keys cannot be made in 1.7.X and 1.6.X
> because it would generate RFiles that may not work properly with older 1.6
> and 1.7 versions. However, the changes to use statistics to pick better keys
> could be made in 1.6 and 1.7.
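
The statistics-based selection described above can be illustrated with a rough sketch. This is not the Accumulo implementation; the class `KeySizeStats`, the function `classify`, and the three labels are hypothetical names chosen for illustration. It assumes that "average or below" and "> 3 std dev" both refer to key size in bytes, measured against a running mean and standard deviation.

```python
import math


class KeySizeStats:
    """Running mean and standard deviation of key sizes (Welford's method).

    Hypothetical helper for illustration; not an Accumulo class.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, size):
        """Record the size (in bytes) of one key that was written."""
        self.count += 1
        delta = size - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (size - self.mean)

    def std_dev(self):
        """Population standard deviation of the sizes seen so far."""
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / self.count)


def classify(size, stats):
    """Rate a key of the given size as an index candidate.

    Assumed policy: keys at or below the average size are preferred,
    and keys more than 3 standard deviations above it are excluded.
    """
    if size <= stats.mean:
        return "preferred"
    if size > stats.mean + 3 * stats.std_dev():
        return "excluded"
    return "acceptable"


if __name__ == "__main__":
    stats = KeySizeStats()
    for key_size in [10] * 20 + [1000]:  # twenty small keys, one very large key
        stats.add(key_size)
    for candidate in (10, 100, 1000):
        print(candidate, classify(candidate, stats))
```

Because this kind of selection only changes which existing keys are chosen for the index, not the form of the keys stored, it is consistent with the description's point that it could be backported to 1.6 and 1.7.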