[jira] [Updated] (HBASE-2236) Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)

Sean Busbey (JIRA) Thu, 21 Aug 2014 12:59:52 -0700

     [ https://issues.apache.org/jira/browse/HBASE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Sean Busbey updated HBASE-2236:
-------------------------------

    Component/s: wal
                 regionserver

> Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-2236
>                 URL: https://issues.apache.org/jira/browse/HBASE-2236
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, wal
>            Reporter: stack
>            Priority: Critical
>              Labels: moved_from_0_20_5
>
> So HBASE-2053 is not aggressive enough.  WALs can still overrun the upper 
> limit on log count.  The code added by HBASE-2053 will, once it completes, 
> ensure we let go of the oldest WAL, but to do that we might have to flush 
> many regions.  E.g.:
> {code}
> 2010-02-15 14:20:29,351 INFO org.apache.hadoop.hbase.regionserver.HLog: Too 
> many hlogs: logs=45, maxlogs=32; forcing flush of 5 regions(s): 
> test1,193717,1266095474624, test1,194375,1266108228663, 
> test1,195690,1266095539377, test1,196348,1266095539377, 
> test1,197939,1266069173999
> {code}
> This takes time.  If we are taking on edits at a furious rate, we might have 
> rolled the log again in the meantime, maybe more than once.
> Also, log rolls happen inline with a put/delete as soon as the log hits the 
> 64MB (default) boundary, whereas the necessary flushing is done in the 
> background by a single thread, and the memstore can overrun its (default) 
> 64MB size.  Flushes needed to release logs will be mixed in with "natural" 
> flushes as memstores fill.  Flushes may take longer than the writing of an 
> HLog because they can be larger.
> So, on an RS that is struggling, the tendency would seem to be for a slight 
> rise in WALs.  Only if the RS gets a breather will the flusher catch up.
> If HBASE-2087 happens, then the count of WALs gets a boost.
> Ideas to fix this for good:
> + Priority queue for queuing up flushes, with those that are queued to free 
> up WALs having priority
> + Improve the HBASE-2053 code so that it will free more than just the last 
> WAL, maybe even queuing flushes so we clear all WALs such that we are back 
> under the maximum WALs threshold again.
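
The two ideas above could be combined roughly as follows. This is only a minimal sketch under stated assumptions, not the HBase implementation: WalFlushSketch, FlushRequest and regionsToFlush are hypothetical names, and the map from each WAL to the regions that still have unflushed edits in it is assumed to be maintained elsewhere, ordered oldest WAL first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

/** Illustrative sketch only; these are hypothetical names, not HBase classes. */
final class WalFlushSketch {

    /** A queued flush. Requests that free a WAL sort ahead of "natural" ones. */
    static final class FlushRequest implements Comparable<FlushRequest> {
        final String region;
        final boolean releasesWal;
        final long order; // enqueue order; keeps FIFO within each priority class

        FlushRequest(String region, boolean releasesWal, long order) {
            this.region = region;
            this.releasesWal = releasesWal;
            this.order = order;
        }

        @Override
        public int compareTo(FlushRequest other) {
            if (releasesWal != other.releasesWal) {
                return releasesWal ? -1 : 1; // WAL-releasing flushes come first
            }
            return Long.compare(order, other.order);
        }
    }

    /**
     * Picks the regions to flush so that enough of the oldest WALs can be
     * released to bring the log count down to maxLogs.
     * Assumption: walToRegions iterates oldest WAL first and maps each WAL
     * to the regions that still have unflushed edits in it.
     */
    static List<String> regionsToFlush(Map<String, List<String>> walToRegions, int maxLogs) {
        int excess = walToRegions.size() - maxLogs;
        Set<String> regions = new LinkedHashSet<>(); // de-duplicates, keeps order
        for (List<String> pinned : walToRegions.values()) {
            if (excess <= 0) {
                break;
            }
            regions.addAll(pinned);
            excess--;
        }
        return new ArrayList<>(regions);
    }

    public static void main(String[] args) {
        Map<String, List<String>> wals = new LinkedHashMap<>();
        wals.put("wal-1", Arrays.asList("r1", "r2"));
        wals.put("wal-2", Arrays.asList("r2", "r3"));
        wals.put("wal-3", Arrays.asList("r4"));

        PriorityQueue<FlushRequest> queue = new PriorityQueue<>();
        long order = 0;
        queue.add(new FlushRequest("r9", false, order++)); // natural flush, queued first
        for (String region : regionsToFlush(wals, 1)) {
            queue.add(new FlushRequest(region, true, order++));
        }
        while (!queue.isEmpty()) {
            FlushRequest next = queue.poll();
            System.out.println(next.region + (next.releasesWal ? " (frees WAL)" : " (natural)"));
        }
    }
}
```

With three WALs and a limit of one, the sketch selects r1, r2 and r3 (every region holding edits in the two oldest WALs) and dequeues them ahead of the natural flush of r9, even though r9 was queued first. Whether a strict priority like this could starve natural flushes on a busy RS is an open design question the sketch does not address.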



--
This message was sent by Atlassian JIRA
(v6.2#6252)
