[ https://issues.apache.org/jira/browse/HBASE-19554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360401#comment-16360401 ]
Duo Zhang commented on HBASE-19554:
-----------------------------------
https://builds.apache.org/job/HBASE-Flaky-Tests/25832/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.master.TestDLSFSHLog-output.txt/*view*/
{noformat}
2018-02-12 04:56:20,895 DEBUG [PEWorker-16] procedure.ServerCrashProcedure(192): pid=132, state=RUNNABLE:SERVER_CRASH_PROCESS_META; ServerCrashProcedure server=asf911.gq1.ygridcore.net,41715,1518411358686, splitWal=true, meta=true; Processing hbase:meta that was on asf911.gq1.ygridcore.net,41715,1518411358686
2018-02-12 04:56:20,895 INFO [PEWorker-16] procedure2.ProcedureExecutor(1498): Initialized subprocedures=[{pid=135, ppid=132, state=RUNNABLE:RECOVER_META_SPLIT_LOGS; RecoverMetaProcedure failedMetaServer=asf911.gq1.ygridcore.net,41715,1518411358686, splitWal=true}]
{noformat}
After that there is no further progress, so the test eventually times out. Let
me add a thread dump just before the timeout to see if we can find something.
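For illustration, a minimal sketch of dumping threads when a task is about to time out. It assumes only plain JDK APIs (ThreadMXBean), not whatever utility the HBase test code actually uses; the class and method names (ThreadDumps, dumpThreads, runWithDump) are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of HBase: prints a thread dump on timeout.
final class ThreadDumps {

  /** Builds a textual dump of all live threads using only JDK APIs. */
  static String dumpThreads() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    StringBuilder sb = new StringBuilder();
    // false, false: skip locked monitor/synchronizer details.
    for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
      sb.append('"').append(info.getThreadName()).append("\" state=")
        .append(info.getThreadState()).append('\n');
      for (StackTraceElement frame : info.getStackTrace()) {
        sb.append("    at ").append(frame).append('\n');
      }
    }
    return sb.toString();
  }

  /**
   * Runs the task on a daemon thread and waits up to timeoutMs (must be > 0,
   * since join(0) would wait forever). If the task is still running after
   * the wait, prints a thread dump to stderr and returns false.
   */
  static boolean runWithDump(Runnable task, long timeoutMs) {
    Thread worker = new Thread(task, "worker-under-test");
    worker.setDaemon(true);
    worker.start();
    try {
      worker.join(timeoutMs);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return false;
    }
    if (worker.isAlive()) {
      System.err.println(dumpThreads());
      return false;
    }
    return true;
  }
}
```

With a dump like this in the test output, the stack of the stuck procedure worker would show where it is blocked, which is the information missing from the log above.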
> AbstractTestDLS.testThreeRSAbort sometimes fails in pre commit
> --------------------------------------------------------------
>
> Key: HBASE-19554
> URL: https://issues.apache.org/jira/browse/HBASE-19554
> Project: HBase
> Issue Type: Sub-task
> Components: Recovery, wal
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19554.patch
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/10554/artifact/patchprocess/patch-unit-hbase-server.txt
> The error message is a bit strange:
> {quote}
> [ERROR] testThreeRSAbort(org.apache.hadoop.hbase.master.TestDLSAsyncFSWAL)
> Time elapsed: 20.627 s <<< ERROR!
> org.apache.hadoop.hbase.TableNotFoundException: Region of
> 'hbase:namespace,,1513320505933.451650152885a3b41d0b1110deca513c.' is
> expected in the table of 'testThreeRSAbort', but hbase:meta says it is in the
> table of 'hbase:namespace'. hbase:meta might be damaged.
> {quote}
> It fails for both FSHLog and AsyncFSWAL. Need to dig more.