[ https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13188207#comment-13188207 ]
chunhui shen commented on HBASE-5179:
-------------------------------------
@Zhihong
{code}
+ if (currentMetaServer != null
+ && this.serverManager.isServerOnline(currentMetaServer)) {
+ // Current meta server is dead, we first split its log and then expire
{code}
I think this is right. Could you describe the problem you found?
If distributedLogSplitting is true, the following code is called:
{code}
if (distributedLogSplitting) {
splitLogManager.handleDeadWorkers(serverNames);
splitTime = EnvironmentEdgeManager.currentTimeMillis();
splitLogSize = splitLogManager.splitLogDistributed(logDirs);
splitTime = EnvironmentEdgeManager.currentTimeMillis() - splitTime;
}
{code}
and the Javadoc of splitLogDistributed says:
{code}
/**
* The caller will block until all the log files of the given region server
* have been processed - successfully split or an error is encountered - by an
* available worker region server. This method must only be called after the
* region servers have been brought online.
*
* @param logDirs
* @throws IOException
* if there was an error while splitting any log file
* @return cumulative size of the logfiles split
*/
public long splitLogDistributed(final List<Path> logDirs) throws IOException
{code}
So the call blocks until log splitting has completed.
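To illustrate the point about blocking, here is a minimal standalone sketch. It is not HBase code: the class and method names (BlockingSplitSketch, splitLogsBlocking, failoverSequence) are hypothetical, and a CountDownLatch stands in for whatever mechanism the real split log manager uses to wait for its workers. It only shows that when the split call blocks until every worker has finished, the assignment step that follows it in the same thread cannot run early.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for the master-side sequence; none of these names are HBase APIs.
class BlockingSplitSketch {

    // Hands each log directory to a worker thread and blocks until all are done.
    static long splitLogsBlocking(List<String> logDirs, List<String> events)
            throws InterruptedException {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        CountDownLatch done = new CountDownLatch(logDirs.size());
        try {
            for (String dir : logDirs) {
                workers.submit(() -> {
                    events.add("split " + dir); // pretend to split this directory
                    done.countDown();
                });
            }
            done.await(); // the caller is parked here until every worker has counted down
        } finally {
            workers.shutdown();
        }
        return logDirs.size();
    }

    // Split first, then assign; assignment is recorded only after the blocking call returns.
    static List<String> failoverSequence(List<String> logDirs) {
        List<String> events = Collections.synchronizedList(new ArrayList<String>());
        try {
            splitLogsBlocking(logDirs, events);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while splitting logs", e);
        }
        events.add("assign regions");
        return events;
    }

    public static void main(String[] args) {
        System.out.println(failoverSequence(Arrays.asList("logs/rsA", "logs/rsB")));
    }
}
```

The order of the two "split" entries can vary between runs, but "assign regions" is always recorded last, because it is added only after the blocking call returns. The sketch covers a single caller only; it says nothing about a second thread that starts assignment independently.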
> Concurrent processing of processFaileOver and ServerShutdownHandler may cause
> region to be assigned before log splitting is completed, causing data loss
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-5179
> URL: https://issues.apache.org/jira/browse/HBASE-5179
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.90.2
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.92.0, 0.94.0, 0.90.6
>
> Attachments: 5179-90.txt, 5179-90v2.patch, 5179-90v3.patch,
> 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 5179-90v7.patch,
> 5179-90v8.patch, 5179-90v9.patch, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt,
> hbase-5179.patch, hbase-5179v5.patch, hbase-5179v6.patch, hbase-5179v7.patch,
> hbase-5179v8.patch, hbase-5179v9.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing
> happen concurrently, the following case may occur:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler starts processing it.
> 3. The master starts rebuildUserRegions, and RegionserverA is considered a
> dead server.
> 4. The master starts to assign the regions of RegionserverA because it was
> marked as a dead server in step 3.
> However, while step 4 (assigning regions) is running, ServerShutdownHandler
> may still be splitting the log, so data loss may result.