[ https://issues.apache.org/jira/browse/HBASE-4007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091295#comment-13091295 ]
stack commented on HBASE-4007:
------------------------------
OK. Here are a few comments, if they'll help.
Why do this:
{code}
+ private Set<String> deadWorkers = null;
+ private Object deadWorkersLock = new Object();
{code}
Does deadWorkers have to be null? Can it not just be empty? (I see in the
patch that you set it to null once the work has been done; could "processed"
be represented by an empty deadWorkers Set instead?) Then you could allocate
deadWorkers where you declare it, make it final while you are at it, and
lock on deadWorkers itself rather than on deadWorkersLock, which would then
become unnecessary.
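Something along these lines, as a rough sketch only. The class and method
names here (DeadWorkerTracker, addDeadWorker, drainDeadWorkers) are made up
for illustration and are not from the patch; the point is just the final,
never-null Set that doubles as its own lock:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical holder class; in the patch this state lives in the manager itself.
class DeadWorkerTracker {
    // Allocated once and never null: "nothing pending" is simply an empty set.
    private final Set<String> deadWorkers = new HashSet<String>();

    /** Record a worker that has died (hypothetical method name). */
    void addDeadWorker(String workerName) {
        // The set is final, so it is safe to use as its own monitor.
        synchronized (deadWorkers) {
            deadWorkers.add(workerName);
        }
    }

    /** Return the pending dead workers and leave the shared set empty. */
    Set<String> drainDeadWorkers() {
        synchronized (deadWorkers) {
            // Copy under the lock so the caller can iterate without holding it.
            Set<String> snapshot = new HashSet<String>(deadWorkers);
            deadWorkers.clear();
            return snapshot;
        }
    }
}
```

The caller processes the returned snapshot outside the lock, and an empty
snapshot means there is nothing to do, so no null check is needed.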
Minor: rename registerHeartbeat to 'heartbeat'? The method handles the
heartbeat by updating data members; it is not keeping track of (registering)
the event.
... more to follow..
> distributed log splitting can get indefinitely stuck
> ----------------------------------------------------
>
> Key: HBASE-4007
> URL: https://issues.apache.org/jira/browse/HBASE-4007
> Project: HBase
> Issue Type: Bug
> Reporter: Prakash Khemani
> Assignee: Prakash Khemani
> Priority: Critical
> Attachments:
> 0001-HBASE-4007-distributed-log-splitting-can-get-indefin.patch
>
>
> After the configured number of retries SplitLogManager is not going to
> resubmit log-split tasks. In this situation even if the splitLogWorker that
> owns the task dies the task will not get resubmitted.
> When a regionserver goes away then all the split-log tasks that it owned
> should be resubmitted by the SplitLogMaster.