[jira] [Commented] (HBASE-7937) Retry log rolling to support HA NN scenario

Himanshu Vashishtha (JIRA) Wed, 27 Feb 2013 17:40:12 -0800

    [ https://issues.apache.org/jira/browse/HBASE-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13589034#comment-13589034 ]


Himanshu Vashishtha commented on HBASE-7937:
--------------------------------------------

Thanks for taking a look.


bq. + private int logRollRetryCount;
Yes, they are set in the constructor; I will make them final, and fall back to 
the default values if the configured value is <= 0.
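
For illustration, a minimal sketch of that shape. The class name, accessor 
names, and the default retry count are hypothetical and not taken from the 
attached patch; the 1000 ms pause mirrors the 1 sec default mentioned for 
HConstants#DEFAULT_HBASE_SERVER_PAUSE.

```java
/** Hypothetical holder for the log-roll retry settings; all names are illustrative. */
final class LogRollRetrySettings {
  // Assumed defaults for this sketch only; the real values belong to the patch.
  static final int DEFAULT_RETRY_COUNT = 3;
  static final long DEFAULT_PAUSE_MS = 1000L;

  // Set once in the constructor, hence final.
  private final int logRollRetryCount;
  private final long pauseMs;

  LogRollRetrySettings(int configuredRetryCount, long configuredPauseMs) {
    // Fall back to the default whenever the configured value is <= 0.
    this.logRollRetryCount =
        configuredRetryCount > 0 ? configuredRetryCount : DEFAULT_RETRY_COUNT;
    this.pauseMs = configuredPauseMs > 0 ? configuredPauseMs : DEFAULT_PAUSE_MS;
  }

  int getLogRollRetryCount() {
    return logRollRetryCount;
  }

  long getPauseMs() {
    return pauseMs;
  }
}
```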

bq. // there may be a case when fs has just become available; one can do one 
more retry
I was considering the case where the HA NN recovers between the moment an op 
fails and the subsequent FSUtils#checkFSAvailable call. In that case we end up 
in a state where, e.g., fs.rename() threw an exception but the fs is healthy, 
so the exception is rethrown to the caller. Ideally, it should have done one 
more retry.
I tried to cover that case with the fsOk variable. If you think this is not 
needed, I will remove it.
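
A rough sketch of the idea, not the patch itself: FsRetry, FsOp, and 
FsHealthCheck are hypothetical stand-ins for the retry loop, the filesystem 
operation, and the availability check. While the fs is reported unavailable 
the op is retried after a pause; if the fs turns out to be healthy right after 
a failure, one extra attempt is made before the exception is rethrown.

```java
import java.io.IOException;

/** Hypothetical retry helper; it only illustrates the "one more retry" case. */
final class FsRetry {
  /** Stand-in for an operation such as a rename or a writer creation. */
  interface FsOp {
    void run() throws IOException;
  }

  /** Stand-in for the filesystem availability check. */
  interface FsHealthCheck {
    boolean isAvailable();
  }

  /** Runs op and returns the number of attempts made, or rethrows the last failure. */
  static int runWithRetries(FsOp op, FsHealthCheck check, int maxRetries, long pauseMs)
      throws IOException, InterruptedException {
    int attempts = 0;
    int outageRetries = 0;
    boolean fsOk = false; // true once a failure was followed by a healthy check
    while (true) {
      attempts++;
      try {
        op.run();
        return attempts;
      } catch (IOException e) {
        if (check.isAvailable()) {
          // The fs may have recovered between the failure and the check:
          // allow exactly one more attempt, then give up.
          if (fsOk) {
            throw e;
          }
          fsOk = true;
          continue;
        }
        // The fs is still down: pause and retry, up to maxRetries times.
        if (++outageRetries > maxRetries) {
          throw e;
        }
        Thread.sleep(pauseMs);
      }
    }
  }
}
```

If the fsOk branch is dropped, a failure followed by a healthy check is 
rethrown immediately, which is the behaviour described above.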

bq. incrementing twice. 
Sorry about that. I will fix this.

bq. Default pause time:
1 sec; as defined in HConstants#DEFAULT_HBASE_SERVER_PAUSE

bq. Are we holding up all writes when we are paused like this?
I don't think we are. We retry in two places here:
a) Creating a new log writer
b) Archiving old logs
Until the new writer has been created, we don't replace the old log writer, 
so we are still pointing to the old hlog.
Archiving old logs shouldn't be a blocking call. If it is, it is a bug.
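
To make point (a) concrete, here is a simplified sketch. RollingLog, Writer, 
and WriterFactory are hypothetical names, and the real HLog also handles 
locking, pausing between attempts, and closing the old writer, all omitted 
here. The current writer is swapped only after a new one has been created, so 
appends keep going to the old writer while creation is being retried.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical, heavily simplified log; only the writer swap is modelled. */
final class RollingLog {
  interface Writer {
    void append(String entry) throws IOException;
  }

  interface WriterFactory {
    Writer create() throws IOException;
  }

  private final AtomicReference<Writer> current;

  RollingLog(Writer initial) {
    this.current = new AtomicReference<>(initial);
  }

  /** Appends always go to whichever writer is current at the time of the call. */
  void append(String entry) throws IOException {
    current.get().append(entry);
  }

  /** Tries to create a new writer; returns true if the roll succeeded. */
  boolean roll(WriterFactory factory, int maxRetries) {
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        Writer next = factory.create();
        current.set(next); // swap only after the new writer exists
        return true;
      } catch (IOException e) {
        // Creation failed: the old writer stays in place, so try again.
      }
    }
    return false;
  }
}
```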

bq. refactoring..
Will do.

TestHLogSplit passes locally. I didn't change the LogSplitter code; I tried to 
keep the scope of the patch minimal.
                
> Retry log rolling to support HA NN scenario
> -------------------------------------------
>
>                 Key: HBASE-7937
>                 URL: https://issues.apache.org/jira/browse/HBASE-7937
>             Project: HBase
>          Issue Type: Bug
>          Components: wal
>    Affects Versions: 0.94.5
>            Reporter: Himanshu Vashishtha
>            Assignee: Himanshu Vashishtha
>             Fix For: 0.95.0
>
>         Attachments: HBASE-7937-trunk.patch, HBASE-7937-v1.patch
>
>
> A failure in log rolling causes regionserver abort. In case of HA NN, it will 
> be good if there is a retry mechanism to roll the logs.
> A corresponding jira for MemStore retries is HBASE-7507.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
