[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13184533#comment-13184533 ]
stack commented on HBASE-5179:
------------------------------
bq. I think the reason Chunhui introduced a new Set for the dead servers being
processed is that DeadServer is supposed to remember dead servers
Yeah, I seem to remember such a need, but I think we should doc it up some
more in DeadServer so the next person in here looking at the code has a chance
of figuring out what's up.
On v3:
{code}
getDeadServersUnderProcessing
{code}
is still public, and I think it should be named getDeadServersBeingProcessed,
or ...BeingHandled, or better, getDeadServersInProgress so that it matches
areDeadServersInProgress. These servers are in the process of being made into
DeadServers! It is also missing javadoc explaining what this method returns, at
least relative to getDeadServers: that these are the servers currently going
through ServerShutdownHandler processing.
Does this method need to be in the Interface for ServerManager (The less in the
Interface the better)?
knownServers should be onlineServers, which makes me think that this check for
DeadServersInProgress should be made inside ServerManager, so that what comes
out of getOnlineServers has already had the InProgress servers stripped?
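To make that concrete, here is a rough sketch of what I mean. It is not the
patch and not the real ServerManager: the class name OnlineServerView, the
plain String server names, and the recordOnline/markInProgress methods are all
made up for illustration. The only point is that the InProgress check lives
behind getOnlineServers, so callers never have to do it themselves.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for ServerManager; server names are plain Strings.
class OnlineServerView {
    private final Set<String> onlineServers = new LinkedHashSet<>();
    // Servers currently going through shutdown processing (assumed bookkeeping).
    private final Set<String> deadServersInProgress = new HashSet<>();

    synchronized void recordOnline(String serverName) {
        onlineServers.add(serverName);
    }

    synchronized void markInProgress(String serverName) {
        deadServersInProgress.add(serverName);
    }

    // Returns a copy of the online servers with the in-progress ones stripped,
    // so the caller does not need a separate in-progress check.
    synchronized Set<String> getOnlineServers() {
        Set<String> result = new LinkedHashSet<>(onlineServers);
        result.removeAll(deadServersInProgress);
        return result;
    }
}
```

With something like this, the failover code would just iterate over
getOnlineServers() and could not see a server that is still being handled.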
Do you think the new Collection deadServersUnderProcessing should instead be
called inProgress, so that a server is either in inProgress or in the
deadServers list? On remove, it gets moved (under synchronize) from one
list to the other.
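A minimal sketch of that two-collection idea, assuming String server names and
a hypothetical DeadServerTracker class (the startProcessing and
finishProcessing names are mine, not from the patch or from DeadServer). A
server is in at most one of the two sets at any time, and the move happens
under the object's lock:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch only; not the real DeadServer class.
class DeadServerTracker {
    private final Set<String> inProgress = new HashSet<>();
    private final Set<String> deadServers = new HashSet<>();

    // Called when shutdown handling for a server begins.
    synchronized void startProcessing(String serverName) {
        // Keep the invariant: a server is never in both sets.
        deadServers.remove(serverName);
        inProgress.add(serverName);
    }

    // Called when shutdown handling completes. Moves the server from
    // inProgress to deadServers in one synchronized step. Returns false if
    // the server was not being processed.
    synchronized boolean finishProcessing(String serverName) {
        if (!inProgress.remove(serverName)) {
            return false;
        }
        deadServers.add(serverName);
        return true;
    }

    synchronized boolean areDeadServersInProgress() {
        return !inProgress.isEmpty();
    }

    // Copies are returned so callers cannot mutate internal state.
    synchronized Set<String> getDeadServersInProgress() {
        return new HashSet<>(inProgress);
    }

    synchronized Set<String> getDeadServers() {
        return new HashSet<>(deadServers);
    }
}
```

Because both sets are guarded by the same lock, a reader never observes a
server that has left inProgress but has not yet landed in deadServers.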
> Concurrent processing of processFaileOver and ServerShutdownHandler may
> cause region is assigned before completing split log, it would cause data loss
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-5179
> URL: https://issues.apache.org/jira/browse/HBASE-5179
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Attachments: 5179-90.txt, 5179-v2.txt, 5179-v3.txt, hbase-5179.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing
> happen concurrently, the following case may occur:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
> 3. The master starts rebuildUserRegions, and RegionserverA is considered a
> dead server.
> 4. The master starts to assign the regions of RegionserverA because it was
> considered a dead server in step 3.
> However, while step 4 (assigning regions) is under way, ServerShutdownHandler
> may still be splitting the log, so data loss may result.