[ https://issues.apache.org/jira/browse/HBASE-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658282#action_12658282 ]
Andrew Purtell commented on HBASE-1071:
---------------------------------------
One way to approach this is to estimate the size of the index on the heap from
the key count and key lengths. Then pick a size limit and increase the index
interval as necessary until the estimated index size falls below that limit.
This is simple and exposes only one knob, easy enough to tweak, that gets
directly at the desired effect. A suitable default can then be found by testing
some educated guesses with PE.
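
A rough sketch of the idea in Java, to make the proposal concrete. Everything
named here is hypothetical and not existing HBase code: the class and method
names, the 64-byte per-entry overhead guess, and the choice to double the
interval on each step (any increasing schedule would do).

```java
// Hypothetical sketch only; not part of HBase.
final class IndexIntervalEstimator {

    // Assumed fixed heap cost per index entry (object headers, references,
    // file offset). A guess for illustration, to be tuned by measurement.
    static final long PER_ENTRY_OVERHEAD_BYTES = 64;

    // Estimate the heap size of the index when every interval-th key is indexed.
    static long estimateIndexBytes(long keyCount, long avgKeyLengthBytes, int interval) {
        long indexedKeys = (keyCount + interval - 1) / interval; // ceiling division
        return indexedKeys * (avgKeyLengthBytes + PER_ENTRY_OVERHEAD_BYTES);
    }

    // Grow the interval until the estimate fits under the limit (the one knob).
    static int chooseInterval(long keyCount, long avgKeyLengthBytes,
                              int startInterval, long maxIndexBytes) {
        if (startInterval < 1) {
            throw new IllegalArgumentException("startInterval must be >= 1");
        }
        int interval = startInterval;
        // The upper bound keeps the doubling from overflowing an int when the
        // limit is too small to ever be met.
        while (estimateIndexBytes(keyCount, avgKeyLengthBytes, interval) > maxIndexBytes
                && interval < Integer.MAX_VALUE / 2) {
            interval *= 2;
        }
        return interval;
    }
}
```

At flush time this would be called with the key count and average key length
of the data being written, e.g. chooseInterval(keyCount, avgKeyLen, 1, limit),
where limit is the single configurable threshold.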
> Set index interval at flush time based off count of keys and key attributes
> ---------------------------------------------------------------------------
>
> Key: HBASE-1071
> URL: https://issues.apache.org/jira/browse/HBASE-1071
> Project: Hadoop HBase
> Issue Type: Improvement
> Reporter: stack
>
> From Andrew Purtell's note up on the list:
> "Later, maybe it would make sense to dynamically set the index
> interval based on the distribution of cell sizes in the
> mapfile at some future time, according to some parameterized
> formula that could be adjusted with config variable(s). This
> could be done during compaction. Would make sense to also
> consider the distribution of key lengths. Or there could be
> other similar tricks implemented to keep index sizes down."