[
https://issues.apache.org/jira/browse/HBASE-13396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14394629#comment-14394629
]
stack commented on HBASE-13396:
-------------------------------
Excellent.
+1 for all branches.
Nit, remove this line and beef up the other LOG.warn with more on what is
happening:
+ LOG.warn("There are " + unclosedWriters.size() + " unclosed writers");
As I read it, the above line will be emitted on each log roll... and as-is it
could freak out operators. Better to do the beefy LOG.warn at the ten-minute
mark.
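To illustrate the nit, here is a hedged sketch of what a beefier warning might look like, built as a plain String so it is logger-agnostic. The message text and names (`WalCloseWarning`, `build`, the WAL path) are illustrative only and not taken from the patch:

```java
// Illustrative only: a more informative warning than a bare count of
// unclosed writers, explaining what happened and what will be done next.
public class WalCloseWarning {
    public static String build(int unclosedCount, String walPath) {
        return "Failed to close WAL " + walPath + "; riding over the error. There are now "
            + unclosedCount + " unclosed writer(s); close will be retried on later log rolls.";
    }
}
```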
Nice one.
> Cleanup unclosed writers in later writer rolling
> ------------------------------------------------
>
> Key: HBASE-13396
> URL: https://issues.apache.org/jira/browse/HBASE-13396
> Project: HBase
> Issue Type: Bug
> Reporter: Liu Shaohui
> Assignee: Liu Shaohui
> Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-13396-v1.diff
>
>
> Currently, the default value of hbase.regionserver.logroll.errors.tolerated
> is 2, which means the regionserver can tolerate at most two consecutive
> failures to close writers. Temporary network or namenode problems may cause
> such failures. After them, the HDFS client in the RS may keep renewing the
> lease on the writer's HLog, so the namenode will not recover that lease. The
> last block of the HLog then stays in RBW (replica being written) state until
> the regionserver goes down, and blocks in this state prevent datanode
> decommissioning and other operations in HDFS.
> So I think we need a mechanism to clean up those unclosed writers afterwards.
> A simple solution is to record the unclosed writers and keep attempting to
> close them until the close succeeds.
> Discussion and suggestions are welcome. Thanks
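The proposed mechanism above can be sketched as follows. This is a minimal, self-contained illustration, not the actual patch (see HBASE-13396-v1.diff); the class and method names (`UnclosedWriterCleaner`, `recordUnclosed`, `retryClose`) are hypothetical, and `Closeable` stands in for the real WAL writer type:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: record writers whose close failed during a log roll,
// then retry closing them on each subsequent roll until the close succeeds.
public class UnclosedWriterCleaner {
    private final List<Closeable> unclosedWriters = new ArrayList<>();

    // Called when closing a writer fails during a log roll.
    public void recordUnclosed(Closeable writer) {
        unclosedWriters.add(writer);
    }

    // Called on each later log roll: retry every pending close,
    // dropping a writer only once its close succeeds.
    public void retryClose() {
        Iterator<Closeable> it = unclosedWriters.iterator();
        while (it.hasNext()) {
            Closeable writer = it.next();
            try {
                writer.close();
                it.remove(); // closed cleanly; the lease can now be released
            } catch (IOException e) {
                // Still failing (e.g. transient network/namenode issue);
                // keep the writer and try again on the next roll.
            }
        }
    }

    public int pendingCount() {
        return unclosedWriters.size();
    }
}
```

A writer that fails once stays in the list and is closed on a later roll, so its HLog's last block does not linger in RBW state for the lifetime of the regionserver.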
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)