[
https://issues.apache.org/jira/browse/HBASE-14263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717404#comment-14717404
]
Vladimir Rodionov commented on HBASE-14263:
-------------------------------------------
I think we need to change the wording/definition of these config parameters in
the HBase book; they are misleading:
{quote}
hbase.hstore.compaction.min.size
Description
A StoreFile smaller than this size will always be eligible for minor
compaction. HFiles this size or larger are evaluated by
hbase.hstore.compaction.ratio to determine if they are eligible. Because this
limit represents the "automatic include" limit for all StoreFiles smaller than
this value, this value may need to be reduced in write-heavy environments where
many StoreFiles in the 1-2 MB range are being flushed, because every StoreFile
will be targeted for compaction and the resulting StoreFiles may still be under
the minimum size and require further compaction. If this parameter is lowered,
the ratio check is triggered more quickly. This addressed some issues seen in
earlier versions of HBase but changing this parameter is no longer necessary in
most situations. Default: 128 MB expressed in bytes.
Default
134217728
hbase.hstore.compaction.max.size
Description
A StoreFile larger than this size will be excluded from compaction. The effect
of raising hbase.hstore.compaction.max.size is fewer, larger StoreFiles that do
not get compacted often. If you feel that compaction is happening too often
without much benefit, you can try raising this value. Default: the value of
LONG.MAX_VALUE, expressed in bytes.
{quote}
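Read literally, the two descriptions above define a rule that is evaluated for each StoreFile on its own. The following is a minimal sketch of that per-file reading, not HBase code: the class `DocumentedSemantics`, the enum `Eligibility`, and the method `classify` are hypothetical names introduced only for illustration, and the 150 MB maximum used in `main` is an arbitrary example value.

```java
final class DocumentedSemantics {
    /** Hypothetical outcome of the per-file reading of the book text. */
    enum Eligibility { ALWAYS_ELIGIBLE, CHECK_RATIO, EXCLUDED }

    /**
     * Classifies ONE file the way the documentation wording suggests:
     * larger than max.size    -> excluded from compaction,
     * smaller than min.size   -> always eligible,
     * otherwise               -> left to hbase.hstore.compaction.ratio.
     */
    static Eligibility classify(long fileSize, long minSize, long maxSize) {
        if (fileSize > maxSize) {
            return Eligibility.EXCLUDED;
        }
        if (fileSize < minSize) {
            return Eligibility.ALWAYS_ELIGIBLE;
        }
        return Eligibility.CHECK_RATIO;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long min = 134217728L;   // documented default, 128 MB in bytes
        long max = 150L * mb;    // assumed non-default value, for illustration only

        System.out.println(classify(2L * mb, min, max));    // a freshly flushed small file
        System.out.println(classify(128L * mb, min, max));  // exactly at the minimum
        System.out.println(classify(200L * mb, min, max));  // above the assumed maximum
    }
}
```

Under this reading, a 2 MB file is always a candidate regardless of which other files are selected with it, which is the expectation the book wording creates.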
> ExploringCompactionPolicy logic around file selection is broken
> ---------------------------------------------------------------
>
> Key: HBASE-14263
> URL: https://issues.apache.org/jira/browse/HBASE-14263
> Project: HBase
> Issue Type: Bug
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Fix For: 2.0.0
>
> Attachments: HBASE-14263.patch
>
>
> It seems that the logic around the selection of store file candidates is broken:
> {code}
> // Compute the total size of files that will
> // have to be read if this set of files is compacted.
> long size = getTotalStoreSize(potentialMatchFiles);
> // Store the smallest set of files. This stored set of files will be used
> // if it looks like the algorithm is stuck.
> if (mightBeStuck && size < smallestSize) {
>   smallest = potentialMatchFiles;
>   smallestSize = size;
> }
> if (size > comConf.getMaxCompactSize()) {
>   continue;
> }
> ++opts;
> if (size >= comConf.getMinCompactSize()
>     && !filesInRatio(potentialMatchFiles, currentRatio)) {
>   continue;
> }
> {code}
> This is from the applyCompactionPolicy method. As you can see, both the min
> compaction size and the max compaction size are applied to a *selection* of
> files and not to individual files. It mostly works as expected only because
> nobody seems to be using a non-default hbase.hstore.compaction.max.size (the
> default is Long.MAX_VALUE), and it is not that easy to figure out what is
> going on at the opposite end (why do small files not get included?).
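The gap between a per-file check and a per-selection check can be made concrete with a small standalone sketch. This is not the HBase implementation: the class `SelectionVsFile` and its methods are hypothetical, the two selection-level methods only mirror the comparisons `size > comConf.getMaxCompactSize()` and `size >= comConf.getMinCompactSize()` from the snippet above, and the 150 MB maximum is an assumed non-default value chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class SelectionVsFile {
    /** Sum of the sizes in a candidate selection (what the snippet calls `size`). */
    static long totalSize(List<Long> fileSizes) {
        return fileSizes.stream().mapToLong(Long::longValue).sum();
    }

    /** Selection-level max check, as in the snippet: the TOTAL is compared to max. */
    static boolean selectionSkippedByMax(List<Long> fileSizes, long maxCompactSize) {
        return totalSize(fileSizes) > maxCompactSize;
    }

    /** Selection-level min check, as in the snippet: the TOTAL decides whether the ratio applies. */
    static boolean selectionNeedsRatioCheck(List<Long> fileSizes, long minCompactSize) {
        return totalSize(fileSizes) >= minCompactSize;
    }

    /** Per-file reading of max.size: is any single file too large? */
    static boolean anyFileExceedsMax(List<Long> fileSizes, long maxCompactSize) {
        return fileSizes.stream().anyMatch(s -> s > maxCompactSize);
    }

    /** Per-file reading of min.size: is every single file below the minimum? */
    static boolean allFilesBelowMin(List<Long> fileSizes, long minCompactSize) {
        return fileSizes.stream().allMatch(s -> s < minCompactSize);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        List<Long> selection = Arrays.asList(60L * mb, 60L * mb, 60L * mb);
        long min = 128L * mb;  // default hbase.hstore.compaction.min.size
        long max = 150L * mb;  // assumed non-default hbase.hstore.compaction.max.size

        System.out.println("all files below min:      " + allFilesBelowMin(selection, min));
        System.out.println("selection needs ratio:    " + selectionNeedsRatioCheck(selection, min));
        System.out.println("any file exceeds max:     " + anyFileExceedsMax(selection, max));
        System.out.println("selection skipped by max: " + selectionSkippedByMax(selection, max));
    }
}
```

With three 60 MB files, every file is individually below the 128 MB minimum and below the assumed 150 MB maximum, yet the 180 MB total is subject to the ratio check and exceeds the maximum, so the two readings disagree on both settings.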
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)