[ https://issues.apache.org/jira/browse/HBASE-19554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360401#comment-16360401 ]
Duo Zhang commented on HBASE-19554:
-----------------------------------

https://builds.apache.org/job/HBASE-Flaky-Tests/25832/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.TestDLSFSHLog-output.txt/*view*/

{noformat}
2018-02-12 04:56:20,895 DEBUG [PEWorker-16] procedure.ServerCrashProcedure(192): pid=132, state=RUNNABLE:SERVER_CRASH_PROCESS_META; ServerCrashProcedure server=asf911.gq1.ygridcore.net,41715,1518411358686, splitWal=true, meta=true; Processing hbase:meta that was on asf911.gq1.ygridcore.net,41715,1518411358686
2018-02-12 04:56:20,895 INFO [PEWorker-16] procedure2.ProcedureExecutor(1498): Initialized subprocedures=[{pid=135, ppid=132, state=RUNNABLE:RECOVER_META_SPLIT_LOGS; RecoverMetaProcedure failedMetaServer=asf911.gq1.ygridcore.net,41715,1518411358686, splitWal=true}]
{noformat}

Then there is no progress, so at last we time out. Let me add a thread dump when we are about to time out to see if we can find something.

> AbstractTestDLS.testThreeRSAbort sometimes fails in pre commit
> --------------------------------------------------------------
>
>                 Key: HBASE-19554
>                 URL: https://issues.apache.org/jira/browse/HBASE-19554
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Recovery, wal
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.0.0-beta-2
>
>         Attachments: HBASE-19554.patch
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/10554/artifact/patchprocess/patch-unit-hbase-server.txt
> The error message is a bit strange:
> {quote}
> [ERROR] testThreeRSAbort(org.apache.hadoop.hbase.master.TestDLSAsyncFSWAL)
> Time elapsed: 20.627 s <<< ERROR!
> org.apache.hadoop.hbase.TableNotFoundException: Region of
> 'hbase:namespace,,1513320505933.451650152885a3b41d0b1110deca513c.' is
> expected in the table of 'testThreeRSAbort', but hbase:meta says it is in the
> table of 'hbase:namespace'. hbase:meta might be damaged.
> {quote}
> It fails for both FSHLog and AsyncFSWAL. Need to dig more.