[ 
https://issues.apache.org/jira/browse/HBASE-11695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14089955#comment-14089955
 ] 

Lars Hofhansl commented on HBASE-11695:
---------------------------------------

Based on the observation that a 128 MB memstore can be sent in about 1s across a 
1 GbE link, maybe the current jitter of 3-23s is good enough. 10-70s still seems 
a safer bet, but anything between 5s and 5 mins should be OK. :)
I still want to adjust the wait time to be greater than the expected jitter in 
order to avoid confusion for folks in the future.
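
For reference, the "about 1s" figure can be checked with a back-of-the-envelope calculation. The sketch below is only an illustration: it assumes 128 MB means 128 * 1024 * 1024 bytes and a raw link rate of 10^9 bit/s, and it ignores protocol overhead and disk I/O. The class and method names are made up for this example and are not part of HBase.

```java
import java.util.Locale;

public class MemstoreTransferEstimate {

    /**
     * Seconds needed to push sizeBytes over a link running at
     * linkBitsPerSecond. Ignores protocol overhead, so this is a lower bound.
     */
    static double transferSeconds(long sizeBytes, long linkBitsPerSecond) {
        return (sizeBytes * 8.0) / linkBitsPerSecond;
    }

    public static void main(String[] args) {
        long memstoreBytes = 128L * 1024 * 1024; // assumed meaning of "128 MB"
        long gigabitLink = 1_000_000_000L;       // assumed raw 1 GbE rate in bit/s

        // Prints "1.07 s"
        System.out.printf(Locale.ROOT, "%.2f s%n",
                transferSeconds(memstoreBytes, gigabitLink));
    }
}
```

So a single full memstore takes roughly one second on the wire, which is small compared to a jitter window of 3-23s.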


> PeriodicFlusher and WakeFrequency issues
> ----------------------------------------
>
>                 Key: HBASE-11695
>                 URL: https://issues.apache.org/jira/browse/HBASE-11695
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.21
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.99.0, 2.0.0, 0.94.23, 0.98.6
>
>         Attachments: 11695-trunk.txt
>
>
> We just ran into a flush storm caused by the PeriodicFlusher.
> Many memstores became eligible for flushing at exactly the same time. The 
> effect we've seen is that the exact same region was flushed multiple times, 
> because the flusher wakes up too often (every 10s). The jitter of 20s is 
> larger than that, and it takes some time to actually flush the memstore.
> Here's one example. We've seen hundreds of these, monopolizing the flush queue 
> and preventing "important" flushes from happening.
> {code}
> 06-Aug-2014 20:11:56  [regionserver60020.periodicFlusher] INFO  
> org.apache.hadoop.hbase.regionserver.HRegionServer[1397]-- 
> regionserver60020.periodicFlusher requesting flush for region 
> tsdb,\x00\x00\x0AO\xCF* 
> \x00\x00\x01\x00\x01\x1F\x00\x00\x03\x00\x00\x0C,1340147003629.ef4a680b962592de910d0fdeb376dfc2.
>  after a delay of 13449
> 06-Aug-2014 20:12:06  [regionserver60020.periodicFlusher] INFO  
> org.apache.hadoop.hbase.regionserver.HRegionServer[1397]-- 
> regionserver60020.periodicFlusher requesting flush for region 
> tsdb,\x00\x00\x0AO\xCF* 
> \x00\x00\x01\x00\x01\x1F\x00\x00\x03\x00\x00\x0C,1340147003629.ef4a680b962592de910d0fdeb376dfc2.
>  after a delay of 14060
> {code}
> So we need to increase the period of the PeriodicFlusher to at least the 
> random jitter, and also increase the default random jitter (20s does not help 
> with many regions).
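
The interaction described above between wake period and jitter can be sketched with a toy model. This is not the HBase implementation: it assumes a region becomes eligible at t=0, stays eligible until its delayed flush has finished, and that every wake-up in that window issues another flush request with no de-duplication. All names are hypothetical.

```java
public class FlushRequestModel {

    /**
     * Counts wake-ups (at t = 0, period, 2 * period, ...) that occur while
     * the region is still waiting for its delayed flush to complete.
     * Assumption: each such wake-up requests a flush again.
     */
    static int requestsBeforeFlushCompletes(long wakePeriodMs, long delayMs,
                                            long flushDurationMs) {
        if (wakePeriodMs <= 0) {
            throw new IllegalArgumentException("wakePeriodMs must be positive");
        }
        long eligibleUntilMs = delayMs + flushDurationMs;
        int requests = 0;
        for (long t = 0; t < eligibleUntilMs; t += wakePeriodMs) {
            requests++;
        }
        return requests;
    }

    public static void main(String[] args) {
        // 10s wake period, 13449 ms delay as in the log excerpt: prints 2
        System.out.println(requestsBeforeFlushCompletes(10_000, 13_449, 0));
        // A wake period longer than delay plus flush time: prints 1
        System.out.println(requestsBeforeFlushCompletes(30_000, 13_449, 2_000));
    }
}
```

Under these assumptions, any delay longer than the wake period yields at least two requests for the same region, which is why the period should be at least as large as the maximum jitter.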



--
This message was sent by Atlassian JIRA
(v6.2#6252)
