[jira] Updated: (HBASE-2236) Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)

stack (JIRA) Tue, 05 Oct 2010 15:04:58 -0700

     [ 
https://issues.apache.org/jira/browse/HBASE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


stack updated HBASE-2236:
-------------------------

    Fix Version/s:     (was: 0.90.0)

Moving out of 0.90.

Chatting with Ryan and J-D, this is a difficult issue and the count of logs is 
not the right metric; rather, we should be looking at the size of all edits out 
in WALs as it relates to the size of content up in memstores.

Also, splitting needs to be made to run faster so we can afford to carry fatter 
WALs.

> Upper bound of outstanding WALs can be overrun; take 2 (take 1 was hbase-2053)
> ------------------------------------------------------------------------------
>
>                 Key: HBASE-2236
>                 URL: https://issues.apache.org/jira/browse/HBASE-2236
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Critical
>
> So hbase-2053 is not aggressive enough.  WALs can still overwhelm the upper 
> limit on log count.  The code added by HBASE-2053, when done, will ensure we 
> let go of the oldest WAL, but to do so we might have to flush many regions.  
> E.g.:
> {code}
> 2010-02-15 14:20:29,351 INFO org.apache.hadoop.hbase.regionserver.HLog: Too 
> many hlogs: logs=45, maxlogs=32; forcing flush of 5 regions(s): 
> test1,193717,1266095474624, test1,194375,1266108228663, 
> test1,195690,1266095539377, test1,196348,1266095539377, 
> test1,197939,1266069173999
> {code}
> This takes time.  If we are taking on edits at a furious rate, we might have 
> rolled the log again in the meantime, maybe more than once.
> Also log rolls happen inline with a put/delete as soon as it hits the 64MB 
> (default) boundary whereas the necessary flushing is done in background by a 
> single thread and the memstore can overrun the (default) 64MB size.  Flushes 
> needed to release logs will be mixed in with "natural" flushes as memstores 
> fill.  Flushes may take longer than the writing of an HLog because they can 
> be larger.
> So, on an RS that is struggling the tendency would seem to be for a slight 
> rise in WALs.  Only if the RS gets a breather will the flusher catch up.
> If HBASE-2087 happens, then the count of WALs gets a boost.
> Ideas to fix this for good would be:
> + Priority queue for queuing up flushes with those that are queued to free up 
> WALs having priority
> + Improve the HBASE-2053 code so that it will free more than just the last 
> WAL, maybe even queuing flushes so we clear all WALs such that we are back 
> under the maximum WALs threshold again.
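
The two ideas in the quoted description could be sketched roughly as below.
This is a minimal illustration in Java, not HBase code: FlushQueueSketch,
FlushRequest, newQueue and regionsToFlush are hypothetical names, and the
model of a WAL as "the set of regions with unflushed edits in it" is an
assumption made for the example. The first part orders queued flushes so that
those requested to free up WALs are taken before "natural" ones; the second
collects the regions holding edits in every WAL beyond the limit, rather than
only the oldest one.

```java
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch, not HBase code: every name here is an assumption.
final class FlushQueueSketch {
    private FlushQueueSketch() {}

    /** A queued flush; walRelease marks flushes requested to free up WALs. */
    static final class FlushRequest {
        final String region;
        final boolean walRelease;
        final long seq; // enqueue order, used as a FIFO tie-breaker

        FlushRequest(String region, boolean walRelease, long seq) {
            this.region = region;
            this.walRelease = walRelease;
            this.seq = seq;
        }
    }

    /** WAL-release flushes sort first; within each class, oldest request first. */
    static final Comparator<FlushRequest> ORDER =
        Comparator.comparing((FlushRequest r) -> !r.walRelease)
                  .thenComparingLong(r -> r.seq);

    /** Queue the flusher thread would poll instead of a plain FIFO queue. */
    static PriorityBlockingQueue<FlushRequest> newQueue() {
        return new PriorityBlockingQueue<>(11, ORDER);
    }

    /**
     * Assumed model: each WAL is represented by the set of regions that still
     * have unflushed edits in it, and WALs are listed oldest first. Returns
     * the regions to flush so that every WAL beyond maxLogs can be released,
     * not just the oldest one.
     */
    static Set<String> regionsToFlush(List<? extends Set<String>> walsOldestFirst,
                                      int maxLogs) {
        Set<String> regions = new LinkedHashSet<>();
        int excess = walsOldestFirst.size() - maxLogs;
        for (int i = 0; i < excess; i++) {
            regions.addAll(walsOldestFirst.get(i));
        }
        return regions;
    }
}
```

With the numbers from the log line quoted above (logs=45, maxlogs=32), this
sketch would gather regions from the 13 oldest WALs in one pass. Flushing
them may still take longer than the next log roll, so it does not by itself
remove the race described in the issue.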

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
