[ 
https://issues.apache.org/jira/browse/HBASE-9873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811575#comment-13811575
 ] 

Jeffrey Zhong commented on HBASE-9873:
--------------------------------------

{quote}
Support running multiple hlog splitters on a single RS
{quote}
We can make this configurable (HBASE-9736). In many cases recovery happens 
while the cluster is serving live traffic, so you normally don't want recovery 
traffic to affect that live traffic too much. Making the number of log 
splitters configurable mostly helps when the cluster has free IO capacity 
(e.g. SSD clusters) or in distributedLogReplay mode, where there are no extra 
small random writes from recovered.edits operations. 

When splitting a single WAL takes 30+ seconds, I suspect that opening more 
splitters may be counterproductive, because log splitting is normally 
bottlenecked on the write side while the reader sits idle waiting for writes 
to finish. Opening more splitters basically adds more write load to the 
cluster, so it could even slow down the current split task.

{quote}
Try to clean old hlogs after each memstore flush to avoid splitting 
unnecessary hlogs during failover. Currently hlog cleaning only runs when the 
hlog writer is rolled.
{quote}
I have a different idea in this area: we could be smarter about log cleaning. 
For example, we can keep in memory the last flushed sequence number of each 
region and the set of regions with edits in each WAL, so that a log cleaner 
can clean WALs out of order instead of checking the global smallest flushed 
sequence number.
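
A minimal sketch of that bookkeeping, under stated assumptions: WalTracker and 
its methods are hypothetical names, not existing HBase classes, and the sketch 
additionally assumes we record the highest sequence number each region wrote 
into each WAL. It ignores the currently active WAL, which must never be 
cleaned.

```python
class WalTracker:
    """In-memory bookkeeping for out-of-order WAL cleaning (illustrative)."""

    def __init__(self):
        # wal name -> {region: highest sequence id that region wrote to this WAL}
        self.max_seq_in_wal = {}
        # region -> highest sequence id known to be flushed to hfiles
        self.last_flushed = {}

    def record_append(self, wal, region, seq_id):
        regions = self.max_seq_in_wal.setdefault(wal, {})
        regions[region] = max(seq_id, regions.get(region, -1))

    def record_flush(self, region, flushed_seq_id):
        current = self.last_flushed.get(region, -1)
        self.last_flushed[region] = max(flushed_seq_id, current)

    def cleanable_wals(self):
        """A WAL is cleanable once every region in it has flushed past its edits."""
        result = []
        for wal, regions in self.max_seq_in_wal.items():
            if all(self.last_flushed.get(r, -1) >= s for r, s in regions.items()):
                result.append(wal)
        return result


tracker = WalTracker()
tracker.record_append("wal1", "regionA", 1)
tracker.record_append("wal1", "regionB", 2)  # rarely written, not yet flushed
tracker.record_append("wal2", "regionA", 3)
tracker.record_flush("regionA", 3)
print(tracker.cleanable_wals())  # wal2 can go even though the older wal1 cannot
```

A check based on the global smallest flushed sequence number would keep both 
files here, because regionB has not flushed yet.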

{quote}
5) Enable multiple splitters on a 'big' hlog file by (logically) splitting the 
hlog into slices (configurable size, e.g. the HDFS block size, 64M)
{quote}
I'd wait for our multiple-WAL solution. This suggestion basically assumes we 
have spare IO capacity but too few worker slots; with multiple splitters per 
RS and a limit on WAL size, it doesn't seem to be needed.

{quote}
7) Consider hlog data locality when scheduling hlog split tasks. Schedule 
the hlog to a splitter that is near the hlog data.
{quote}
We have a JIRA HBASE-6772 on this.
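
For illustration only, a locality-aware choice could look like the sketch 
below. This is not what HBASE-6772 implements; pick_splitter and its input 
shape (a list of block lengths with their replica hosts) are assumptions made 
for the example.

```python
def pick_splitter(blocks, idle_splitters):
    """Pick the idle splitter host that holds the most bytes of the hlog.

    blocks: list of (block_length, [hosts holding a replica of that block])
    idle_splitters: list of host names that can take a split task right now
    """
    if not idle_splitters:
        return None
    local_bytes = {host: 0 for host in idle_splitters}
    for length, hosts in blocks:
        for host in hosts:
            if host in local_bytes:
                local_bytes[host] += length
    # max() keeps the first host on ties, so with no local data at all
    # this falls back to the first idle splitter.
    return max(idle_splitters, key=lambda host: local_bytes[host])


blocks = [(64, ["rs1", "rs2", "rs3"]), (32, ["rs2", "rs4", "rs5"])]
print(pick_splitter(blocks, ["rs1", "rs2", "rs9"]))  # rs2 holds both blocks
```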

In general, RS failure recovery spends a huge percentage of its time on 
failure detection. It'd be better if we could look into that as well. Thanks.





> Some improvements in hlog and hlog split
> ----------------------------------------
>
>                 Key: HBASE-9873
>                 URL: https://issues.apache.org/jira/browse/HBASE-9873
>             Project: HBase
>          Issue Type: Improvement
>          Components: MTTR, wal
>            Reporter: Liu Shaohui
>            Priority: Critical
>              Labels: failover, hlog
>
> Some improvements in hlog and hlog split
> 1) Try to clean old hlogs after each memstore flush to avoid splitting 
> unnecessary hlogs during failover. Currently hlog cleaning only runs when 
> the hlog writer is rolled. 
> 2) Add a background hlog compaction thread to compact the hlogs: remove the 
> hlog entries whose data has been flushed to hfiles. The scenario is that in a 
> shared cluster, write requests to a table may be few and periodic, so a 
> lot of hlogs cannot be cleaned because they contain entries for this table.
> 3) Rely on the smallest of all the biggest hfile seqIds of previously served 
> regions to ignore some entries. Facebook has implemented this in HBASE-6508 
> and we backported it to HBase 0.94 in HBASE-9568.
> 4) Support running multiple hlog splitters on a single RS and on the 
> master (the latter can boost split efficiency for tiny clusters).
> 5) Enable multiple splitters on a 'big' hlog file by (logically) splitting 
> the hlog into slices (configurable size, e.g. the HDFS block size, 64M); 
> support multiple concurrent split tasks on a single hlog file slice. 
> 6) Do not cancel a timed-out split task until one task reports success 
> (this avoids the scenario where the split of an hlog file fails because no 
> task can succeed within the timeout period), and reschedule the same split 
> task to reduce split time (to avoid stragglers in hlog split).
> 7) Consider hlog data locality when scheduling hlog split tasks. 
> Schedule the hlog to a splitter that is near the hlog data.
> 8) Support multiple hlog writers and switching to another hlog writer when 
> write latency to the current hlog is high, possibly due to a temporary 
> network spike? 
> This is a draft that lists the hlog improvements we will try to implement 
> in the near future. Comments and discussion are welcome.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
