[
https://issues.apache.org/jira/browse/HBASE-8321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628287#comment-13628287
]
Jeffrey Zhong commented on HBASE-8321:
--------------------------------------
[~jxiang] I think we have a heartbeat mechanism inside SplitLogWorker, which is
implemented by HLogSplitter#reportProgressIfIsDistributedLogSplitting. You can
check the code inside SplitLogWorker#grabTask to see the detailed heartbeat
implementation.
In addition, we fixed HBASE-6738, which dealt with the too-short 25-second
wait by SplitLogManager; in 0.94 we increased the default to 5 minutes.
{code}
...
status = splitTaskExecutor.exec(ZKSplitLog.getFileName(currentTask),
    new CancelableProgressable() {
      @Override
      public boolean progress() {
        if (!attemptToOwnTask(false)) {
          LOG.warn("Failed to heartbeat the task" + currentTask);
          return false;
        }
        return true;
      }
    });
...
{code}
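To make the pattern above concrete, here is a minimal, self-contained sketch of how a long-running split loop might call such a progress callback and stop once the heartbeat fails. This is not the HBase implementation: HeartbeatSketch, splitWithHeartbeat, totalEdits and reportEvery are hypothetical names for illustration, and the nested CancelableProgressable interface is only a stand-in for the real one.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class HeartbeatSketch {

    /** Hypothetical stand-in for the callback type used in the snippet above. */
    public interface CancelableProgressable {
        /** Returns false when the task is no longer owned and work should stop. */
        boolean progress();
    }

    /**
     * Simulates splitting a log of totalEdits edits, heartbeating once every
     * reportEvery edits. Returns how many edits were processed before the
     * loop finished or the heartbeat reported that ownership was lost.
     */
    public static int splitWithHeartbeat(int totalEdits, int reportEvery,
                                         CancelableProgressable reporter) {
        int processed = 0;
        while (processed < totalEdits) {
            processed++; // placeholder for the real per-edit work
            if (processed % reportEvery == 0 && !reporter.progress()) {
                return processed; // heartbeat failed: give up the task
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Heartbeat always succeeds: the whole log is processed.
        AtomicInteger beats = new AtomicInteger();
        int done = splitWithHeartbeat(10, 2, () -> {
            beats.incrementAndGet();
            return true;
        });
        System.out.println("processed=" + done + " heartbeats=" + beats.get());

        // Heartbeat fails on the third call: the worker stops early.
        AtomicInteger calls = new AtomicInteger();
        int partial = splitWithHeartbeat(10, 2, () -> calls.incrementAndGet() < 3);
        System.out.println("processed before losing the task=" + partial);
    }
}
```

The point of the design is that the worker, not the manager, decides how often to refresh ownership, so a slow recoverLease only causes a timeout if the heartbeats themselves stop.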
> Log split worker should heartbeat to avoid timeout
> --------------------------------------------------
>
> Key: HBASE-8321
> URL: https://issues.apache.org/jira/browse/HBASE-8321
> Project: HBase
> Issue Type: Bug
> Components: wal
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
>
> Currently, the hlog splitter could spend quite some time splitting a log in
> case of any HDFS issue where recoverLease/retry opening is needed. If the
> distributed log split manager times out the log worker, the other log worker
> that takes over will run into the same issue.
> Ideally, we should not need a timeout monitor. Since we have a timeout
> monitor for DSL now, the worker should heartbeat to avoid wrong/unneeded
> timeouts.