[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13189326#comment-13189326 ]
Zhihong Yu commented on HBASE-5179:
-----------------------------------
In patch v10:
{code}
+ this.fileSystemManager.splitLog(metaServerInfo.getServerName());
+ this.serverManager.expireServer(metaServerInfo);
{code}
In the latest patch, the fileSystemManager.splitLog() call is gone.
I think what might have happened is that waitUntilNoLogDir() returned too soon:
the log splitting was carried out in ServerShutdownHandler, which had barely had
a chance to start the split.
Patch v13 adds DEBUG logging to waitUntilNoLogDir() so that we can tell whether
the log dir exists on entry and how long the method takes before returning.
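
To make the intent of that logging concrete, here is a minimal sketch of a poll-and-log wait loop. It is not the code from the patch: the class name LogDirWaiter, the parameters pollMillis and timeoutMillis, and the boolean return value are all hypothetical. It also uses java.nio.file on the local filesystem as a stand-in for the Hadoop FileSystem that the master actually uses, and java.util.logging's FINE level as a stand-in for DEBUG.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration only; not the HBase implementation.
public class LogDirWaiter {
    private static final Logger LOG = Logger.getLogger(LogDirWaiter.class.getName());

    /**
     * Blocks until logDir no longer exists or timeoutMillis elapses.
     * Returns true if the directory is gone, false if the wait timed out.
     */
    public static boolean waitUntilNoLogDir(Path logDir, long pollMillis, long timeoutMillis)
            throws InterruptedException {
        long start = System.nanoTime();
        // Record whether the directory is present on entry, so a suspiciously
        // fast return can be told apart from a completed split.
        LOG.log(Level.FINE, "waitUntilNoLogDir entry: {0} exists={1}",
                new Object[] {logDir, Files.exists(logDir)});
        boolean gone = true;
        while (Files.exists(logDir)) {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis >= timeoutMillis) {
                gone = false;
                break;
            }
            Thread.sleep(pollMillis);
        }
        // Record how long the wait took before returning.
        LOG.log(Level.FINE, "waitUntilNoLogDir exit: {0} gone={1} after {2} ms",
                new Object[] {logDir, gone, (System.nanoTime() - start) / 1_000_000});
        return gone;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("logdir-demo");
        // Simulated log splitter that removes the directory after a short delay.
        Thread splitter = new Thread(() -> {
            try {
                Thread.sleep(200);
                Files.delete(dir);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
        splitter.start();
        boolean gone = waitUntilNoLogDir(dir, 50, 5000);
        splitter.join();
        System.out.println("log dir gone: " + gone);
    }
}
```

With logging like this, an entry line reporting exists=false followed by an immediate exit would suggest that the wait started before the split had created or touched the directory, which is the case suspected above.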
> Concurrent processing of processFaileOver and ServerShutdownHandler may cause
> region to be assigned before log splitting is completed, causing data loss
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-5179
> URL: https://issues.apache.org/jira/browse/HBASE-5179
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.6
>
> Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch,
> 5179-90v12.patch, 5179-90v2.patch, 5179-90v3.patch, 5179-90v4.patch,
> 5179-90v5.patch, 5179-90v6.patch, 5179-90v7.patch, 5179-90v8.patch,
> 5179-90v9.patch, 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt,
> 5179-v4.txt, Errorlog, hbase-5179.patch, hbase-5179v10.patch,
> hbase-5179v12.patch, hbase-5179v5.patch, hbase-5179v6.patch,
> hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing
> happen concurrently, the following case may occur:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
> 3. The master starts rebuildUserRegions, and RegionserverA is considered a
> dead server.
> 4. The master starts to assign the regions of RegionserverA because it was
> marked as a dead server in step 3.
> However, while step 4 (assigning regions) is running, ServerShutdownHandler
> may still be splitting the log, so data loss may result.