[
https://issues.apache.org/jira/browse/HBASE-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930363#action_12930363
]
Nicolas Spiegelberg commented on HBASE-3209:
--------------------------------------------
St^Ack_: nspiegelberg: what config. would I set so it favored less files and
kept the old read performance?
[3:36pm] nspiegelberg: you have 2 options
[3:37pm] nspiegelberg: 1) set compactionThreshold == 2
[3:38pm] nspiegelberg: 2) make minCompactSize configurable and set it high
[3:39pm] nspiegelberg: basically, before this algo, we would unconditionally
compact 4 files, but the compactionThreshold == 3
[3:40pm] nspiegelberg: this means that we would never use the compaction
algorithm unless our cluster was stressed out
[3:41pm] jdcryans: it used to not be like that tho
[3:42pm] jdcryans: it's a hack that we compact everything
[3:42pm] nspiegelberg: the only downside to the current algorithm is that
sum(storefiles) doesn't take dedupe into account, which can have a snowball
effect of compacting too aggressively during load. this can be mitigated by
lowering hbase.hstore.compaction.max
[3:43pm] nspiegelberg: in reality, this hasn't proved to be an issue for us.
lowering the max compact files will fix it. we can also add on some simple
dedupe heuristics to fix this issue
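
To make the two options and the mitigation above concrete, here is a minimal
sketch of the relevant settings applied through a client-side Configuration.
The property keys shown (hbase.hstore.compactionThreshold,
hbase.hstore.compaction.min.size, hbase.hstore.compaction.max) are assumptions
based on how these knobs are exposed in later HBase releases; 0.90 may name
minCompactSize differently, or only gain it once this patch lands. The values
are illustrative, not recommendations.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    public class CompactionTuning {
      public static Configuration favorFewerFiles() {
        Configuration conf = HBaseConfiguration.create();
        // Option 1: start a minor compaction as soon as 2 store files exist,
        // trading write throughput for fewer files (the old read performance).
        conf.setInt("hbase.hstore.compactionThreshold", 2);
        // Option 2: raise the "always include" cutoff so even fairly large
        // files are swept into minor compactions unconditionally.
        // (Key name assumed; 0.90 may not expose minCompactSize yet.)
        conf.setLong("hbase.hstore.compaction.min.size", 128L * 1024 * 1024);
        // Mitigation for the snowball effect under heavy load: cap how many
        // files a single minor compaction may pull in.
        conf.setInt("hbase.hstore.compaction.max", 7);
        return conf;
      }
    }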
> New Compaction Heuristic
> ------------------------
>
> Key: HBASE-3209
> URL: https://issues.apache.org/jira/browse/HBASE-3209
> Project: HBase
> Issue Type: Improvement
> Reporter: Nicolas Spiegelberg
> Assignee: Nicolas Spiegelberg
>
> We have a whole bunch of compaction improvements in our internal 0.89 branch.
> Porting this to 0.90:
> 1) don't unconditionally compact 4 files. have a min threshold
> 2) intelligently upgrade minors to majors
> 3) new compaction algo (derived in HBASE-2462 )
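
For readers who want a feel for point 3, the following is a simplified sketch
of the size-ratio selection idea derived in HBASE-2462, not the actual
StoreFile code from the patch: raw file sizes stand in for store files, and
the parameter and method names are illustrative only.

    import java.util.ArrayList;
    import java.util.List;

    /** Sketch of ratio-based compaction file selection (assumed shape). */
    public class CompactionSelectionSketch {

      static List<Long> selectFilesToCompact(List<Long> sizes,    // newest last
                                             int minFiles,        // compactionThreshold
                                             int maxFiles,        // hbase.hstore.compaction.max
                                             long minCompactSize, // always-include cutoff
                                             double ratio) {
        List<Long> selected = new ArrayList<>();
        long sumOfSelected = 0;
        // Walk from the newest file backwards, keeping a running total of
        // what has been accepted so far.
        for (int i = sizes.size() - 1; i >= 0 && selected.size() < maxFiles; i--) {
          long size = sizes.get(i);
          // A small file is always worth rewriting; a large file is included
          // only if it is not much bigger than everything already selected,
          // so one huge file is not rewritten over and over (the old behavior
          // of unconditionally compacting 4 files had exactly that problem).
          if (size < minCompactSize || size <= (long) (sumOfSelected * ratio)) {
            selected.add(size);
            sumOfSelected += size;
          } else {
            break; // older files are larger still, so none of them qualify
          }
        }
        // Only compact once the minimum-file threshold is cleared.
        return selected.size() >= minFiles ? selected : new ArrayList<>();
      }
    }

Point 2 of the list corresponds to a check not shown here: when the selected
set turns out to cover every store file in the store, the minor compaction can
simply be promoted to a major one.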