[ 
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingyun Tian updated HBASE-19358:
---------------------------------
    Attachment:     (was: newLogic.png)

> Improve the stability of splitting log when do fail over
> --------------------------------------------------------
>
>                 Key: HBASE-19358
>                 URL: https://issues.apache.org/jira/browse/HBASE-19358
>             Project: HBase
>          Issue Type: Improvement
>          Components: MTTR
>    Affects Versions: 0.98.24
>            Reporter: Jingyun Tian
>
> Currently, log splitting works as shown in the following figure:
> !previous-logic.png|thumbnail!
> The problem is that the OutputSink writes the recovered edits while the log 
> is being split, which means it creates one WriterAndPath for each region. If 
> the cluster is small and the number of regions per RS is large, it creates 
> too many HDFS streams at the same time. Splitting is then prone to failure, 
> since each datanode needs to handle too many streams.
> Thus I came up with a new way to split the log.
> !newLogic.png|thumbnail!
> We cache the recovered edits until they exceed the memory limit we set or we 
> reach the end of the log; then a thread pool does the rest: it writes the 
> edits to files and moves them to their destination.
> The biggest benefit is that we can control the number of streams created 
> during log splitting: 
> it will not exceed hbase.regionserver.wal.max.splitters * 
> hbase.regionserver.hlog.splitlog.writer.threads, whereas before it was 
> hbase.regionserver.wal.max.splitters * the number of regions the HLog contains.
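
The idea described above can be illustrated with a minimal Java sketch. This is not the patch attached to the issue and it does not use any HBase classes: the class name BoundedSplitSketch, its methods, and the simulated "stream" counter are all hypothetical. For brevity it flushes only once the end of the log is reached and leaves out the memory-limit trigger. Its purpose is to show why a fixed-size writer pool caps the number of concurrently open streams per splitter, regardless of how many regions the log contains.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in HBase.
public class BoundedSplitSketch {

    // Bound under the proposed scheme: splitters * writer threads.
    static long maxStreamsNew(int maxSplitters, int writerThreads) {
        return (long) maxSplitters * writerThreads;
    }

    // Bound under the previous scheme: splitters * regions in the log.
    static long maxStreamsOld(int maxSplitters, int regionsInLog) {
        return (long) maxSplitters * regionsInLog;
    }

    // Simulates one splitter. Each edit is {regionName, payload}.
    // Returns {peak number of concurrently open "streams", edits written}.
    static int[] splitOneLog(List<String[]> edits, int writerThreads)
            throws Exception {
        // Step 1: buffer the recovered edits per region in memory.
        Map<String, List<String>> buffered = new HashMap<>();
        for (String[] edit : edits) {
            buffered.computeIfAbsent(edit[0], k -> new ArrayList<>())
                    .add(edit[1]);
        }

        AtomicInteger open = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger written = new AtomicInteger();

        // Step 2: a fixed-size pool writes one region's edits per task,
        // so at most writerThreads "streams" are open at any moment.
        ExecutorService pool = Executors.newFixedThreadPool(writerThreads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<String> regionEdits : buffered.values()) {
                pending.add(pool.submit(() -> {
                    int now = open.incrementAndGet();   // "open" a stream
                    peak.accumulateAndGet(now, Math::max);
                    try {
                        Thread.sleep(5);                // stand-in for the write
                        written.addAndGet(regionEdits.size());
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                    } finally {
                        open.decrementAndGet();         // "close" the stream
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get();                                // wait, surface failures
            }
        } finally {
            pool.shutdown();
        }
        return new int[] { peak.get(), written.get() };
    }
}
```

With illustrative values of 2 splitters, 3 writer threads, and a log that contains edits for 20 regions, maxStreamsNew(2, 3) gives 6, while maxStreamsOld(2, 20) gives 40. These numbers are chosen for the example and are not taken from the issue.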



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
