[jira] Commented: (HBASE-2053) Upper bound of outstanding WALs can be overrun

stack (JIRA) Mon, 28 Dec 2009 16:25:01 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-2053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12794973#action_12794973
 ]


stack commented on HBASE-2053:
------------------------------

Billy, looking at your log, it seems you have set 
hbase.regionserver.logroll.period to 5 minutes instead of the default hour, is 
that right?

What's happening when the log roll period is so short is that you get a log 
roll even if the log has only one edit in it:

{code}
2009-12-22 00:01:38,102 INFO org.apache.hadoop.hbase.regionserver.HLog: Roll 
/hbase/.logs/server-2,60020,1261340217387/hlog.dat.1261461398009, entries=1, 
calcsize=0, filesize=303. New hlog 
/hbase/.logs/server-2,60020,1261340217387/hlog.dat.1261461698084
{code}
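
One way to check the roll period from the log itself is to parse the roll lines and compare the numeric suffixes of the old and new hlog names. The sketch below is only an illustration: it assumes the suffix is a millisecond timestamp and that each roll message sits on a single line in the real log file (it is wrapped above by the mailer). The names ROLL_RE and parse_roll are made up for this example and are not part of HBase.

```python
import re

# Pattern for the "Roll ..." message shown above; assumes the message is on
# one line in the actual regionserver log.
ROLL_RE = re.compile(
    r"HLog: Roll (?P<old>\S+), entries=(?P<entries>\d+), "
    r"calcsize=(?P<calcsize>\d+), filesize=(?P<filesize>\d+)\. "
    r"New hlog (?P<new>\S+)"
)


def parse_roll(line):
    """Return entries, filesize and the gap between hlog suffixes, or None."""
    m = ROLL_RE.search(line)
    if m is None:
        return None
    # Assumption: the trailing number in the hlog name is epoch milliseconds.
    old_ts = int(m.group("old").rsplit(".", 1)[1])
    new_ts = int(m.group("new").rsplit(".", 1)[1])
    return {
        "entries": int(m.group("entries")),
        "filesize": int(m.group("filesize")),
        "gap_seconds": (new_ts - old_ts) / 1000.0,
    }


# The roll message from the comment above, rejoined onto one line.
SAMPLE = (
    "2009-12-22 00:01:38,102 INFO org.apache.hadoop.hbase.regionserver.HLog: "
    "Roll /hbase/.logs/server-2,60020,1261340217387/hlog.dat.1261461398009, "
    "entries=1, calcsize=0, filesize=303. New hlog "
    "/hbase/.logs/server-2,60020,1261340217387/hlog.dat.1261461698084"
)

if __name__ == "__main__":
    print(parse_roll(SAMPLE))
```

For the line quoted above, the two suffixes differ by 300075, which under the millisecond assumption is about five minutes, and the roll carried a single entry.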



> Upper bound of outstanding WALs can be overrun
> ----------------------------------------------
>
>                 Key: HBASE-2053
>                 URL: https://issues.apache.org/jira/browse/HBASE-2053
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>         Attachments: hbase-root-regionserver-server-2.log.2009-12-22.gz
>
>
> Kevin Peterson up on hbase-user posted the following.  Of interest is the 
> link on the end which is logs of WAL rolls and removals.  In one place we 
> remove 70-plus logs because the outstanding edits have moved past the 
> outstanding sequence numbers -- so our basic WAL removal mechanism is working 
> -- but if you study the log, the tendency is a steady climb in the number of 
> logs.   HLog#cleanOldLogs needs to notice such an upward tendency and work 
> more aggressively at cleaning the old logs in this case.  Here is Kevin's note:
> {code}
> On Tue, Dec 15, 2009 at 3:17 PM, Kevin Peterson <x...@y.com> wrote:
> This makes some sense now. I currently have 2200 regions across 3 tables. My
> largest table accounts for about 1600 of those regions and is mostly active
> at one end of the keyspace -- our key is based on date, but data only
> roughly arrives in order. I also write to two secondary indexes, which have
> no pattern to the key at all. One of these secondary tables has 488 regions
> and the other has 96 regions.
> We write about 10M items per day to the main table (articles). All of these
> get written to one of the secondary indexes (article-ids). About a third get
> written to the other secondary index. Total volume of data is about 10GB /
> day written.
> I think the key is as you say that the regions aren't filled enough to
> flush. The articles table gets mostly written to near one end and I see
> splits happening regularly. The index tables have no pattern so the 10
> million writes get scattered across the different regions. I've looked more
> closely at a log file (linked below), and if I forget about my main table
> (which would tend to get flushed), and look only at the indexes, this seems
> to be what's happening:
> 1. Up to maxLogs HLogs, it doesn't do any flushes.
> 2. Once it gets above maxLogs, it will start flushing one region each time
> it creates a new HLog.
> 3. If the first HLog had edits for say 50 regions, it will need to flush the
> region with oldest edits 50 times before the HLog can be removed.
> If N is the number of regions getting written to, but not getting enough
> writes to flush on their own, then I think this converges to maxLogs + N
> logs on average. If I think of maxLogs as "number of logs to start flushing
> regions at" this makes sense.
> http://kdpeterson.net/paste/hbase-hadoop-regionserver-mi-prod-app35.ec2.biz360.com.log.2009-12-14
> {code}
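
The three steps in Kevin's note can be played out in a toy simulation. This is a rough sketch under strong assumptions, not HBase's actual logic: every one of the N regions receives at least one edit between rolls, no region ever flushes on its own, and exactly one region (the one with the oldest unflushed edit) is flushed per roll once the log count exceeds max_logs. The function name simulate_log_count and its parameters are hypothetical.

```python
def simulate_log_count(num_regions, max_logs, rolls):
    """Return the number of outstanding logs observed at each roll."""
    # oldest[r] is the roll period of region r's oldest unflushed edit,
    # or None if the region has nothing unflushed.
    oldest = [None] * num_regions
    logs = []      # roll periods of logs that have not been removed yet
    history = []
    for period in range(1, rolls + 1):
        # Assumption: every region gets at least one edit in each period.
        for r in range(num_regions):
            if oldest[r] is None:
                oldest[r] = period
        # Roll the log and record how many logs exist at that moment.
        logs.append(period)
        history.append(len(logs))
        # Step 2: above max_logs, flush the region with the oldest edits.
        if len(logs) > max_logs:
            r = min(range(num_regions), key=lambda i: oldest[i])
            oldest[r] = None
        # A log can go once no region holds unflushed edits from its period.
        pending = [p for p in oldest if p is not None]
        floor = min(pending) if pending else period + 1
        logs = [p for p in logs if p >= floor]
    return history


if __name__ == "__main__":
    history = simulate_log_count(num_regions=5, max_logs=2, rolls=30)
    print("peak:", max(history), "last:", history[-1])
```

In this simplified model the count at roll time climbs to max_logs + N before the first log can be removed, at which point a whole batch of logs is dropped at once. What the count does after that depends on how edits are really distributed across regions, which the model does not attempt to capture.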

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
