[ 
https://issues.apache.org/jira/browse/HBASE-11695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089376#comment-14089376
 ] 

Lars Hofhansl commented on HBASE-11695:
---------------------------------------

That is true, we do not actually flush the same region multiple times (except 
for a race Nicolas mentions), we just request the flush multiple times. In our 
case the storm was caused by many regions being eligible for periodic flushing 
at the same time, i.e. they had all been written to slowly and had not filled 
up within an hour.

I also want to increase the jitter. It is still pointless to wake up the 
flusher thread every 10s when the jitter is 20s (or more) and the requested 
flush interval is 3600s.
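
To make the arithmetic concrete, here is a rough sketch (not HBase code; the 
class and method names are made up for illustration). It assumes the flusher 
issues another request on every wake-up for as long as the region's delayed 
flush has not completed, which is what the log excerpt in the description 
suggests, and it counts how many requests a single region receives.

```java
// Hypothetical model of the periodic flusher, for illustration only.
class FlushJitterSketch {

    /**
     * Counts flush requests issued for one region that becomes eligible at t=0.
     * Assumption: the flusher wakes at 0, w, 2w, ... and requests a flush each
     * time until the delayed flush (random delay plus flush duration) is done.
     */
    static int countFlushRequests(long wakeFrequencyMs, long delayMs, long flushDurationMs) {
        if (wakeFrequencyMs <= 0) {
            throw new IllegalArgumentException("wakeFrequencyMs must be positive");
        }
        long flushDoneAtMs = delayMs + flushDurationMs;
        int requests = 1; // the wake-up at t=0 always issues the first request
        for (long now = wakeFrequencyMs; now < flushDoneAtMs; now += wakeFrequencyMs) {
            requests++; // region still not flushed, so it is requested again
        }
        return requests;
    }

    public static void main(String[] args) {
        // Delay taken from the first log line in the description (13449 ms).
        System.out.println(countFlushRequests(10_000L, 13_449L, 0L)); // 2
        // Waking no more often than the 20s jitter leaves a single request.
        System.out.println(countFlushRequests(20_000L, 13_449L, 0L)); // 1
    }
}
```

Under these assumptions, a wake-up period at least as long as the jitter plus 
the flush time yields one request per region, which is the motivation for tying 
the period to the jitter.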


> PeriodicFlusher and WakeFrequency issues
> ----------------------------------------
>
>                 Key: HBASE-11695
>                 URL: https://issues.apache.org/jira/browse/HBASE-11695
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.21
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Critical
>             Fix For: 0.99.0, 2.0.0, 0.94.23, 0.98.6
>
>
> We just ran into a flush storm caused by the PeriodicFlusher.
> Many memstores became eligible for flushing at exactly the same time. The 
> effect we've seen is that the exact same region was flushed multiple times, 
> because the flusher wakes up too often (every 10s). The jitter of 20s is 
> larger than that and it takes some time to actually flush the memstore.
> Here's one example. We've seen hundreds of these, monopolizing the flush 
> queue and preventing "important" flushes from happening.
> {code}
> 06-Aug-2014 20:11:56  [regionserver60020.periodicFlusher] INFO  
> org.apache.hadoop.hbase.regionserver.HRegionServer[1397]-- 
> regionserver60020.periodicFlusher requesting flush for region 
> tsdb,\x00\x00\x0AO\xCF* 
> \x00\x00\x01\x00\x01\x1F\x00\x00\x03\x00\x00\x0C,1340147003629.ef4a680b962592de910d0fdeb376dfc2.
>  after a delay of 13449
> 06-Aug-2014 20:12:06  [regionserver60020.periodicFlusher] INFO  
> org.apache.hadoop.hbase.regionserver.HRegionServer[1397]-- 
> regionserver60020.periodicFlusher requesting flush for region 
> tsdb,\x00\x00\x0AO\xCF* 
> \x00\x00\x01\x00\x01\x1F\x00\x00\x03\x00\x00\x0C,1340147003629.ef4a680b962592de910d0fdeb376dfc2.
>  after a delay of 14060
> {code}
> So we need to increase the period of the PeriodicFlusher to at least the 
> random jitter, and also increase the default random jitter (20s does not 
> help with many regions).



--
This message was sent by Atlassian JIRA
(v6.2#6252)
