[
https://issues.apache.org/jira/browse/HBASE-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857705#action_12857705
]
Todd Lipcon commented on HBASE-2457:
------------------------------------
This issue got me thinking about why the heuristic is the way it is, and I
don't quite follow.
I stepped away for a minute and tried to come up with a cost model for why it
is we do minor compactions. Let me run this by everyone:
- The cost of doing a compaction is the cost of reading all of the input files plus
the cost of writing the newly compacted file. The new file might be a bit
smaller than the sum of the originals, so the total cost = sum(input file
sizes) + write cost factor * reduction factor * sum(input file sizes). Let's
call this (1 + WF)*sum(file sizes)
- The cost of *not* doing a compaction is just that reads are slower because we
have to seek more. Let's define R as some unknown factor which describes how
important read performance is to us, and S as the seek time added by each
additional store. So the cost of not doing the compaction is R*S*num stores.
So if we combine these, we want to basically minimize (1 + WF)*size(files to
compact) - R*S*count(files to compact).
So here's a simple algorithm to minimize that:
{noformat}
sort files by increasing size
for each file:
  if (1+WF)*size(file) < R*S:
    add file to compaction set
  else:
    break
{noformat}
If you rearrange that inequality, it becomes: compact the file if size(file) <
(R*S)/(1+WF). Basically, what this is saying is that we should just make file
size the heuristic, and tune the minimum file size based on the ratio between
how much we care about having a small number of stores vs saving sequential IO
on compactions.
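To make that concrete, here's a minimal Python sketch of the greedy selection (the values of WF, R, and S are purely illustrative placeholders, not tuned constants):

```python
# Greedy selection: compact every file whose size falls below the
# threshold (R*S)/(1+WF), scanning from smallest to largest.
WF = 0.1   # write cost factor (illustrative)
R = 100.0  # weight on read performance (illustrative)
S = 1.0    # seek cost per additional store (illustrative)

def pick_compaction_set(file_sizes):
    threshold = (R * S) / (1 + WF)
    chosen = []
    for size in sorted(file_sizes):
        if size < threshold:
            chosen.append(size)
        else:
            break  # every remaining file is at least this large
    return chosen
```

With the placeholder constants the threshold is about 91, so a store with files of sizes 200, 10, 50, and 120 would compact only the 10 and 50.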
The one flaw, if you look closely, is that we can't actually sort by increasing
size and compact some set of the smallest ones, because we can only compact
a set of files that are contiguous in the sequence. I think we can slightly
tweak the algorithm, though, to optimize the same objective but take that
restriction into account.
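For example (this is just one hypothetical way to honor the contiguity restriction, not a worked-out proposal): score every contiguous run of files in sequence order with the objective above, and compact the run with the lowest score, if any run scores below zero:

```python
WF = 0.1   # write cost factor (illustrative)
R = 100.0  # weight on read performance (illustrative)
S = 1.0    # seek cost per additional store (illustrative)

def best_contiguous_run(file_sizes):
    """Return (start, end) indices of the contiguous run minimizing
    (1+WF)*size(run) - R*S*count(run), or None if no run scores
    below zero (i.e. compacting wouldn't pay off)."""
    best, best_cost = None, 0.0
    n = len(file_sizes)
    for i in range(n):
        total = 0.0
        for j in range(i, n):
            total += file_sizes[j]
            cost = (1 + WF) * total - R * S * (j - i + 1)
            if cost < best_cost:
                best, best_cost = (i, j), cost
    return best
```

For a sequence like [500, 10, 20, 400], this picks the run covering the 10 and 20 rather than the cheapest non-contiguous subset.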
> RS gets stuck compacting region ad infinitum
> --------------------------------------------
>
> Key: HBASE-2457
> URL: https://issues.apache.org/jira/browse/HBASE-2457
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.20.4
> Reporter: Todd Lipcon
> Priority: Critical
> Attachments: log.gz, stack
>
>
> Testing 0.20_pre_durabil...@934643, I ended up in a state where one region
> server got stuck compacting a single region over and over again forever. This
> was with a special config with very low flush threshold in order to stress
> test flush/compact code.