[
https://issues.apache.org/jira/browse/HBASE-19358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yu Li resolved HBASE-19358.
---------------------------
Resolution: Fixed
Hadoop Flags: Reviewed
Release Note: HBASE-19358 introduces a new property,
hbase.split.writer.creation.bounded, to limit the number of writers each
WALSplitter opens. If set to true, no writer for recovered.edits is opened
until the entries accumulated in memory reach
hbase.regionserver.hlog.splitlog.buffersize (128M by default), and the file is
then written and closed in one go instead of the writer being kept open. The
property is false by default, and we recommend setting it to true if your
cluster has a high region load (say, more than 300 regions per RS), especially
if you have observed an obvious NN/HDFS slowdown during HBase (single-RS or
whole-cluster) failover.
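For reference, here is a minimal sketch of reading these settings from an HBase Configuration. The class below is illustrative only; the property names and defaults come from the release note above, not from the actual WALSplitter code.
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class SplitWriterSettings {
  public static void main(String[] args) {
    // Loads hbase-default.xml and hbase-site.xml; in practice these properties
    // are set in hbase-site.xml on the region servers.
    Configuration conf = HBaseConfiguration.create();

    // false by default; when true, a recovered.edits writer is only opened once
    // a buffer is ready to be written and closed in one go.
    boolean boundedCreation =
        conf.getBoolean("hbase.split.writer.creation.bounded", false);

    // How many bytes of entries to accumulate in memory before writing; 128M by default.
    long bufferSize =
        conf.getLong("hbase.regionserver.hlog.splitlog.buffersize", 128 * 1024 * 1024L);

    System.out.println("bounded writer creation: " + boundedCreation
        + ", split buffer size: " + bufferSize);
  }
}
{code}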
Thanks for the note, boss [[email protected]], and I will take note next time
(the branch-2-v2 patch was quite close to commit but I was interrupted by
something else and left it hanging, my bad...)
Pushed into branch-2 and added a release note. Please check the release note
and feel free to amend it if necessary [~tianjingyun] [~Apache9], thanks.
Closing the issue; thanks all for the review.
> Improve the stability of splitting log when do fail over
> --------------------------------------------------------
>
> Key: HBASE-19358
> URL: https://issues.apache.org/jira/browse/HBASE-19358
> Project: HBase
> Issue Type: Improvement
> Components: MTTR
> Affects Versions: 0.98.24
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Fix For: 1.4.1, 1.5.0, 2.0.0-beta-2
>
> Attachments: HBASE-18619-branch-2-v2.patch,
> HBASE-19358-branch-1-v2.patch, HBASE-19358-branch-1-v3.patch,
> HBASE-19358-branch-1.patch, HBASE-19358-branch-2-v3.patch,
> HBASE-19358-v1.patch, HBASE-19358-v4.patch, HBASE-19358-v5.patch,
> HBASE-19358-v6.patch, HBASE-19358-v7.patch, HBASE-19358-v8.patch,
> HBASE-19358.patch
>
>
> The way we currently split the log is shown in the following figure:
> !https://issues.apache.org/jira/secure/attachment/12904506/split-logic-old.jpg!
> The problem is that the OutputSink writes the recovered edits while the log
> is being split, which means it creates one WriterAndPath for each region and
> retains it until the end. If the cluster is small and the number of regions
> per RS is large, it creates too many HDFS streams at the same time, and
> splitting becomes prone to failure because each datanode has to handle too
> many streams.
> Thus I came up with a new way to split the log.
> !https://issues.apache.org/jira/secure/attachment/12904507/split-logic-new.jpg!
> We try to cache all the recovered edits in memory, but if they exceed the
> maximum heap usage, we pick the largest EntryBuffer and write it to a file
> (closing the writer once it is finished). Then, after all entries have been
> read into memory, we start a writeAndCloseThreadPool, which uses a fixed
> number of threads to write all remaining buffers to files. Thus we never
> create more HDFS streams than the
> *_hbase.regionserver.hlog.splitlog.writer.threads_* we set (see the sketch
> after this description).
> The biggest benefit is that we can control the number of streams created
> during log splitting: it will not exceed
> *_hbase.regionserver.wal.max.splitters * hbase.regionserver.hlog.splitlog.writer.threads_*,
> whereas before it was
> *_hbase.regionserver.wal.max.splitters * the number of regions the hlog
> contains_*. For example, with 3 concurrent splitters and 3 writer threads the
> new bound is 9 streams, whereas previously 3 splitters over hlogs touching
> 300 regions each could open up to 900 streams.
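A minimal, self-contained sketch of the bounded write-and-close idea described above. The class, field, and method names below are simplified stand-ins, not the actual OutputSink/EntryBuffer implementation.
{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for the bounded splitting flow: buffer edits per region,
// flush the largest buffer whenever the in-memory total exceeds the limit,
// and drain the remaining buffers with a fixed-size writer pool at the end.
public class BoundedSplitSketch {
  private final long maxHeapUsage;  // cf. hbase.regionserver.hlog.splitlog.buffersize
  private final int writerThreads;  // cf. hbase.regionserver.hlog.splitlog.writer.threads
  private final Map<String, List<byte[]>> buffers = new HashMap<>();
  private long totalSize = 0;

  public BoundedSplitSketch(long maxHeapUsage, int writerThreads) {
    this.maxHeapUsage = maxHeapUsage;
    this.writerThreads = writerThreads;
  }

  // Called for every WAL entry read during splitting.
  public void append(String region, byte[] edit) throws IOException {
    buffers.computeIfAbsent(region, r -> new ArrayList<>()).add(edit);
    totalSize += edit.length;
    if (totalSize > maxHeapUsage) {
      flushLargestBuffer();
    }
  }

  // Pick the largest per-region buffer, write it out, and close the writer immediately.
  private void flushLargestBuffer() throws IOException {
    String largest = null;
    long largestSize = -1;
    for (Map.Entry<String, List<byte[]>> e : buffers.entrySet()) {
      long size = e.getValue().stream().mapToLong(b -> b.length).sum();
      if (size > largestSize) {
        largestSize = size;
        largest = e.getKey();
      }
    }
    if (largest != null) {
      writeAndClose(largest, buffers.remove(largest));
      totalSize -= largestSize;
    }
  }

  // After all WAL entries have been read, drain the remaining buffers with a
  // bounded pool, so at most writerThreads streams are open at the same time.
  public void finish() throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(writerThreads);
    for (Map.Entry<String, List<byte[]>> e : buffers.entrySet()) {
      pool.submit(() -> {
        try {
          writeAndClose(e.getKey(), e.getValue());
        } catch (IOException ex) {
          throw new RuntimeException(ex);
        }
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.HOURS);
    buffers.clear();
  }

  // Placeholder: in HBase this step would create the region's recovered.edits
  // writer, append all entries, and close the stream in one go.
  private void writeAndClose(String region, List<byte[]> edits) throws IOException {
    System.out.println("wrote " + edits.size() + " edits for region " + region);
  }
}
{code}
The key property of this shape is that a writer only exists for the duration of a single writeAndClose call, so the number of simultaneously open streams is bounded by the writer pool size rather than by the number of regions the hlog touches.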
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)