Umesh Agashe created HBASE-18261:
------------------------------------

             Summary: [AMv2] Create new RecoverMetaProcedure and use it from 
ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
                 Key: HBASE-18261
                 URL: https://issues.apache.org/jira/browse/HBASE-18261
             Project: HBase
          Issue Type: Improvement
          Components: amv2
    Affects Versions: 2.0.0-alpha-1
            Reporter: Umesh Agashe
            Assignee: Umesh Agashe
             Fix For: 2.0.0-alpha-2


When unit test 
hbase.master.procedure.TestServerCrashProcedure#testRecoveryAndDoubleExecutionOnRsWithMeta()
 is enabled and run several times, it fails intermittently. Cause is meta 
recovery is done at two different places:
* ServerCrashProcedure.processMeta()
* HMaster.finishActiveMasterInitialization()
and its not coordinated.

When HMaster.finishActiveMasterInitialization() gets to submit splitMetaLog() 
first and while its running call from ServerCrashProcedure.processMeta() fails 
causing step to be retried again in a loop.

When ServerCrashProcedure.processMeta() submits splitMetaLog after splitMetaLog 
from HMaster.finishActiveMasterInitialization() is finished, success is 
returned without doing any work.

But if ServerCrashProcedure.processMeta() submits splitMetaLog request and 
while its going HMaster.finishActiveMasterInitialization() submits it test 
fails with exception.

[~stack] and I discussed the possible solution:
Create RecoverMetaProcedure and call it where required. Procedure framework 
provides mutual exclusion and requires idempotence, which should fix the 
problem.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to