Umesh Agashe created HBASE-18261:
------------------------------------
Summary: [AMv2] Create new RecoverMetaProcedure and use it from
ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
Key: HBASE-18261
URL: https://issues.apache.org/jira/browse/HBASE-18261
Project: HBase
Issue Type: Improvement
Components: amv2
Affects Versions: 2.0.0-alpha-1
Reporter: Umesh Agashe
Assignee: Umesh Agashe
Fix For: 2.0.0-alpha-2
When the unit test
hbase.master.procedure.TestServerCrashProcedure#testRecoveryAndDoubleExecutionOnRsWithMeta()
is enabled and run several times, it fails intermittently. The cause is that
meta recovery is done in two different places:
* ServerCrashProcedure.processMeta()
* HMaster.finishActiveMasterInitialization()
and the two are not coordinated.
When HMaster.finishActiveMasterInitialization() submits splitMetaLog() first,
the call from ServerCrashProcedure.processMeta() fails while that split is
still running, causing the step to be retried in a loop.
When ServerCrashProcedure.processMeta() submits splitMetaLog() after the
splitMetaLog() from HMaster.finishActiveMasterInitialization() has finished,
success is returned without doing any work.
But if ServerCrashProcedure.processMeta() submits the splitMetaLog() request
and, while it is in progress, HMaster.finishActiveMasterInitialization()
submits its own, the test fails with an exception.
[~stack] and I discussed a possible solution:
Create RecoverMetaProcedure and call it where required. The procedure
framework provides mutual exclusion and requires idempotence, which should fix
the problem.
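
To illustrate why mutual exclusion plus idempotence removes the race, here is a
minimal standalone sketch. It does not use the HBase procedure framework: the
class RecoverMetaSketch, its lock, and the placeholder splitMetaLog() body are
all hypothetical stand-ins, and the real RecoverMetaProcedure may look quite
different.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the proposed RecoverMetaProcedure; not HBase code.
class RecoverMetaSketch {
    // Mutual exclusion: only one caller recovers meta at a time.
    private final ReentrantLock metaLock = new ReentrantLock();
    // Guarded by metaLock; records that recovery has already completed.
    private boolean metaRecovered = false;
    // Counts how often the (placeholder) split actually ran.
    private final AtomicInteger splits = new AtomicInteger();

    // Safe to call from several places, any number of times.
    void recoverMeta() {
        metaLock.lock();
        try {
            if (metaRecovered) {
                return; // idempotent: an earlier call already did the work
            }
            splitMetaLog();
            metaRecovered = true;
        } finally {
            metaLock.unlock();
        }
    }

    // Placeholder for the real meta log split (assumption for illustration).
    private void splitMetaLog() {
        splits.incrementAndGet();
    }

    int splitCount() {
        return splits.get();
    }

    public static void main(String[] args) throws InterruptedException {
        RecoverMetaSketch sketch = new RecoverMetaSketch();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(sketch::recoverMeta); // stands in for ServerCrashProcedure.processMeta()
        pool.execute(sketch::recoverMeta); // stands in for HMaster.finishActiveMasterInitialization()
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("splitMetaLog ran " + sketch.splitCount() + " time(s)");
    }
}
```

Whichever caller arrives second waits on the lock and then returns without
redoing the split, so neither the retry loop nor the exception described above
can occur in this simplified model.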