[ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16268304#comment-16268304
 ] 

Yu Li commented on HBASE-19358:
-------------------------------

Would be great to know:
1. How to decide the value of 
{{hbase.regionserver.hlog.splitlog.writer.threads}}, i.e. how to make full use 
of HDFS capacity without overloading it.
2. The performance numbers, i.e. the effect on recovery time before/after the 
patch.

And the same question for your similar JIRA, boss, if it has a similar design 
(smile) [~Apache9]

> Improve the stability of splitting log when do fail over
> --------------------------------------------------------
>
>                 Key: HBASE-19358
>                 URL: https://issues.apache.org/jira/browse/HBASE-19358
>             Project: HBase
>          Issue Type: Improvement
>          Components: MTTR
>    Affects Versions: 0.98.24
>            Reporter: Jingyun Tian
>         Attachments: newLogic.jpg, previousLogic.jpg
>
>
> The way we split the log now is shown in the following figure:
> !https://issues.apache.org/jira/secure/attachment/12899558/previousLogic.jpg!
> The problem is that the OutputSink writes the recovered edits while the log 
> is being split, which means it creates one WriterAndPath for each region. If 
> the cluster is small and the number of regions per rs is large, it creates 
> too many HDFS streams at the same time. Splitting is then prone to failure, 
> since each datanode needs to handle too many streams.
> So I came up with a new way to split the log.
> !https://issues.apache.org/jira/secure/attachment/12899557/newLogic.jpg!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end of the log; then a thread pool does the rest: it writes the 
> edits to files and moves the files to their destination.
> The biggest benefit is that we can control the number of streams we create 
> while splitting the log: 
> it will not exceed *_hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads_*, whereas before it was 
> *_hbase.regionserver.wal.max.splitters * the number of regions the hlog 
> contains_*.
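
The stream bound described above can be illustrated with a rough sketch. This is not the code from the patch: the class name BoundedSplitSketch and its methods are hypothetical, the sleep is only a stand-in for writing a recovered-edits file, and the numbers passed in main are illustrative values rather than measured or recommended settings. It only shows that flushing buffered edits through a fixed-size pool caps the number of concurrently open writers at the pool size, whatever the region count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; names are illustrative and not taken from HBase.
final class BoundedSplitSketch {

    // New design: streams <= max.splitters * writer.threads
    static int maxStreamsNew(int maxSplitters, int writerThreads) {
        return maxSplitters * writerThreads;
    }

    // Old design: streams <= max.splitters * regions contained in the hlog
    static int maxStreamsOld(int maxSplitters, int regionsInLog) {
        return maxSplitters * regionsInLog;
    }

    // Flushes buffered edits (region -> edits) through a fixed pool and
    // returns the peak number of writers that were open at the same time.
    static int flushAndMeasurePeak(Map<String, List<String>> buffered, int writerThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(writerThreads);
        AtomicInteger open = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (Map.Entry<String, List<String>> entry : buffered.entrySet()) {
            pool.submit(() -> {
                int now = open.incrementAndGet();       // "open" one stream
                peak.accumulateAndGet(now, Math::max);  // record the high-water mark
                try {
                    // Stand-in for writing entry.getValue() to a file and moving it.
                    Thread.sleep(2L + entry.getValue().size());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    open.decrementAndGet();             // "close" the stream
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        Map<String, List<String>> buffered = new HashMap<>();
        for (int r = 0; r < 50; r++) {
            List<String> edits = new ArrayList<>();
            edits.add("edit-for-region-" + r);
            buffered.put("region-" + r, edits);
        }
        System.out.println("new bound: " + maxStreamsNew(2, 3));
        System.out.println("old bound: " + maxStreamsOld(2, 50));
        System.out.println("peak open writers: " + flushAndMeasurePeak(buffered, 3));
    }
}
```

In this sketch the peak never exceeds the pool size passed to flushAndMeasurePeak, which is the per-splitter half of the bound; multiplying by the number of concurrent splitters gives the per-region-server figure.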



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
