[ https://issues.apache.org/jira/browse/HBASE-19554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371024#comment-16371024 ]
Duo Zhang commented on HBASE-19554:
-----------------------------------

OK, the numEntries is incorrect. I will file an issue to address it; it is not critical, since it only affects logging. And I think this is the problem:

{noformat}
2018-02-20 16:33:41,183 INFO [regionserver/asf903:0.logRoller] wal.AbstractFSWAL(690): Rolled WAL /user/jenkins/test-data/46c46c01-4dc8-4506-a228-a4460f2a28e9/WALs/asf903.gq1.ygridcore.net,52039,1519144411613/asf903.gq1.ygridcore.net%2C52039%2C1519144411613.1519144421143 with entries=0, filesize=35.46 KB; new WAL /user/jenkins/test-data/46c46c01-4dc8-4506-a228-a4460f2a28e9/WALs/asf903.gq1.ygridcore.net,52039,1519144411613/asf903.gq1.ygridcore.net%2C52039%2C1519144411613.1519144421171
2018-02-20 16:33:41,184 DEBUG [regionserver/asf903:0.logRoller] wal.AbstractFSWAL(757): Create new FSHLog writer with pipeline: [DatanodeInfoWithStorage[127.0.0.1:60898,DS-fe8309dd-a316-4ef5-a474-7167557c5c76,DISK], DatanodeInfoWithStorage[127.0.0.1:53328,DS-0f9fbb3d-3cd1-47ea-9867-52f1ecac6f19,DISK], DatanodeInfoWithStorage[127.0.0.1:57523,DS-fc2097ed-4d98-40d7-a4a0-adfd31a5bf68,DISK]]
2018-02-20 16:33:41,184 INFO [regionserver/asf903:0.logRoller] wal.AbstractFSWAL(652): Archiving hdfs://localhost:55625/user/jenkins/test-data/46c46c01-4dc8-4506-a228-a4460f2a28e9/WALs/asf903.gq1.ygridcore.net,52039,1519144411613/asf903.gq1.ygridcore.net%2C52039%2C1519144411613.1519144421143 to hdfs://localhost:55625/user/jenkins/test-data/46c46c01-4dc8-4506-a228-a4460f2a28e9/oldWALs/asf903.gq1.ygridcore.net%2C52039%2C1519144411613.1519144421143
{noformat}

Here the WAL file is moved to oldWALs right after it is rolled. We write to the WAL directly in the test, so this should not happen. Let me dig more.
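
To check other test logs for the same pattern, a small script along these lines might help. It is only a sketch: the two regular expressions are modelled on the "Rolled WAL" and "Archiving" messages quoted in the comment and are not taken from the HBase source, the names `scan`, `wal_name` and the finding labels are made up for illustration, and treating "entries=0 with a non-zero filesize" as the numEntries symptom is an assumption.

```python
import re

# Patterns modelled on the two AbstractFSWAL messages quoted in the comment
# (assumed shapes, not copied from the HBase source).
ROLLED = re.compile(
    r"Rolled WAL (?P<old>\S+) with entries=(?P<entries>\d+), "
    r"filesize=(?P<size>[\d.]+) (?P<unit>\w+); new WAL (?P<new>\S+)"
)
ARCHIVING = re.compile(r"Archiving (?P<src>\S+) to (?P<dst>\S+)")


def wal_name(path):
    """Return the last path component, i.e. the WAL file name."""
    return path.rsplit("/", 1)[-1]


def scan(lines):
    """Return (label, wal_file_name) findings in the order they appear."""
    rolled = set()
    findings = []
    for line in lines:
        m = ROLLED.search(line)
        if m:
            name = wal_name(m.group("old"))
            rolled.add(name)
            # Zero entries reported for a file that is not empty.
            if int(m.group("entries")) == 0 and float(m.group("size")) > 0:
                findings.append(("entries-mismatch", name))
            continue
        m = ARCHIVING.search(line)
        # A WAL that was rolled earlier in this log is now moved to oldWALs.
        if m and wal_name(m.group("src")) in rolled and "/oldWALs/" in m.group("dst"):
            findings.append(("archived-after-roll", wal_name(m.group("src"))))
    return findings


# The first and third log lines from the comment, rebuilt from their parts.
ROOT = "/user/jenkins/test-data/46c46c01-4dc8-4506-a228-a4460f2a28e9"
SERVER_DIR = "asf903.gq1.ygridcore.net,52039,1519144411613"
PREFIX = "asf903.gq1.ygridcore.net%2C52039%2C1519144411613"
OLD = PREFIX + ".1519144421143"
NEW = PREFIX + ".1519144421171"
SAMPLE = [
    "2018-02-20 16:33:41,183 INFO [regionserver/asf903:0.logRoller] "
    f"wal.AbstractFSWAL(690): Rolled WAL {ROOT}/WALs/{SERVER_DIR}/{OLD} "
    f"with entries=0, filesize=35.46 KB; new WAL {ROOT}/WALs/{SERVER_DIR}/{NEW}",
    "2018-02-20 16:33:41,184 INFO [regionserver/asf903:0.logRoller] "
    f"wal.AbstractFSWAL(652): Archiving hdfs://localhost:55625{ROOT}/WALs/{SERVER_DIR}/{OLD} "
    f"to hdfs://localhost:55625{ROOT}/oldWALs/{OLD}",
]

if __name__ == "__main__":
    for label, name in scan(SAMPLE):
        print(label, name)
```

On the two lines above it reports both findings for the WAL ending in .1519144421143. It only looks at a single log and says nothing about why the file became archivable.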

> AbstractTestDLS.testThreeRSAbort sometimes fails in pre commit
> --------------------------------------------------------------
>
> Key: HBASE-19554
> URL: https://issues.apache.org/jira/browse/HBASE-19554
> Project: HBase
> Issue Type: Sub-task
> Components: Recovery, wal
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 2.0.0-beta-2
>
> Attachments: HBASE-19554-thread-dump.patch, HBASE-19554.patch
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/10554/artifact/patchprocess/patch-unit-hbase-server.txt
> The error message is a bit strange:
> {quote}
> [ERROR] testThreeRSAbort(org.apache.hadoop.hbase.master.TestDLSAsyncFSWAL) Time elapsed: 20.627 s <<< ERROR!
> org.apache.hadoop.hbase.TableNotFoundException: Region of 'hbase:namespace,,1513320505933.451650152885a3b41d0b1110deca513c.' is expected in the table of 'testThreeRSAbort', but hbase:meta says it is in the table of 'hbase:namespace'. hbase:meta might be damaged.
> {quote}
> It fails for both FSHLog and AsyncFSWAL. Need to dig more.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)