[
https://issues.apache.org/jira/browse/HBASE-2457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857705#action_12857705
]
Todd Lipcon commented on HBASE-2457:
------------------------------------
This issue got me thinking about why the heuristic is the way it is, and I
don't quite follow.
I stepped away for a minute and tried to come up with a cost model for why it
is we do minor compactions. Let me run this by everyone:
- The cost of doing a compaction is the cost of reading all of the input files plus
the cost of writing the newly compacted file. The new file might be a bit
smaller than the sum of the originals, so the total cost = sum(input file
sizes) + write cost factor * reduction factor * sum(input file sizes). Let's
call this (1 + WF)*sum(file sizes)
- The cost of *not* doing a compaction is just that reads are slower because we
have to seek more. Let's define R as some unknown factor which describes how
important read performance is to us, and S as the seek time added by each
additional store. So the cost of not doing the compaction is R*S*num stores.
So if we combine these, we want to basically minimize (1 + WF)*size(files to
compact) - R*S*count(files to compact).
So here's a simple algorithm to minimize that:
{noformat}
sort files by increasing size
for each file:
  if (1+WF)*size(file) < R*S:
    add file to compaction set
  else:
    break
{noformat}
If you rearrange that inequality, it becomes: compact the file if size(file) <
(R*S)/(1+WF). Basically, what this is saying is that we should just make file
size the heuristic, and tune the minimum file size based on the ratio between
how much we care about having a small number of stores vs saving sequential IO
on compactions.
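To make that concrete, here's a minimal Python sketch of the greedy selection (the values of WF, R, and S are purely illustrative placeholders, not tuned constants):

```python
# Greedy selection: compact every file whose size falls below the
# threshold (R*S)/(1+WF), scanning from smallest to largest.
WF = 0.1   # write cost factor (illustrative)
R = 100.0  # weight on read performance (illustrative)
S = 1.0    # seek cost per additional store (illustrative)

def pick_compaction_set(file_sizes):
    threshold = (R * S) / (1 + WF)
    chosen = []
    for size in sorted(file_sizes):
        if size < threshold:
            chosen.append(size)
        else:
            break  # every remaining file is at least this large
    return chosen
```

With the placeholder constants the threshold is about 91, so a store with files of sizes 200, 10, 50, and 120 would compact only the 10 and 50.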
The one flaw, if you look closely, is that we can't actually sort by increasing
size and compact some set of the smallest ones, because we can only compact
a set of files that are contiguous in the sequence. I think we can slightly
tweak the algorithm, though, to optimize the same objective but take that
restriction into account.
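For example (this is just one hypothetical way to honor the contiguity restriction, not a worked-out proposal): score every contiguous run of files in sequence order with the objective above, and compact the run with the lowest score, if any run scores below zero:

```python
WF = 0.1   # write cost factor (illustrative)
R = 100.0  # weight on read performance (illustrative)
S = 1.0    # seek cost per additional store (illustrative)

def best_contiguous_run(file_sizes):
    """Return (start, end) indices of the contiguous run minimizing
    (1+WF)*size(run) - R*S*count(run), or None if no run scores
    below zero (i.e. compacting wouldn't pay off)."""
    best, best_cost = None, 0.0
    n = len(file_sizes)
    for i in range(n):
        total = 0.0
        for j in range(i, n):
            total += file_sizes[j]
            cost = (1 + WF) * total - R * S * (j - i + 1)
            if cost < best_cost:
                best, best_cost = (i, j), cost
    return best
```

For a sequence like [500, 10, 20, 400], this picks the run covering the 10 and 20 rather than the cheapest non-contiguous subset.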
> RS gets stuck compacting region ad infinitum
> --------------------------------------------
>
> Key: HBASE-2457
> URL: https://issues.apache.org/jira/browse/HBASE-2457
> Project: Hadoop HBase
> Issue Type: Bug
> Affects Versions: 0.20.4
> Reporter: Todd Lipcon
> Priority: Critical
> Attachments: log.gz, stack
>
>
> Testing 0.20_pre_durabil...@934643, I ended up in a state where one region
> server got stuck compacting a single region over and over again forever. This
> was with a special config with very low flush threshold in order to stress
> test flush/compact code.