[ 
https://issues.apache.org/jira/browse/HBASE-19554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16371024#comment-16371024
 ] 

Duo Zhang commented on HBASE-19554:
-----------------------------------

OK, the numEntries value is incorrect. I will file an issue to address it; it 
is not critical, since it only affects logging.

And I think this is the problem:
{noformat}
2018-02-20 16:33:41,183 INFO  [regionserver/asf903:0.logRoller] wal.AbstractFSWAL(690): Rolled WAL /user/jenkins/test-data/46c46c01-4dc8-4506-a228-a4460f2a28e9/WALs/asf903.gq1.ygridcore.net,52039,1519144411613/asf903.gq1.ygridcore.net%2C52039%2C1519144411613.1519144421143 with entries=0, filesize=35.46 KB; new WAL /user/jenkins/test-data/46c46c01-4dc8-4506-a228-a4460f2a28e9/WALs/asf903.gq1.ygridcore.net,52039,1519144411613/asf903.gq1.ygridcore.net%2C52039%2C1519144411613.1519144421171
2018-02-20 16:33:41,184 DEBUG [regionserver/asf903:0.logRoller] wal.AbstractFSWAL(757): Create new FSHLog writer with pipeline: [DatanodeInfoWithStorage[127.0.0.1:60898,DS-fe8309dd-a316-4ef5-a474-7167557c5c76,DISK], DatanodeInfoWithStorage[127.0.0.1:53328,DS-0f9fbb3d-3cd1-47ea-9867-52f1ecac6f19,DISK], DatanodeInfoWithStorage[127.0.0.1:57523,DS-fc2097ed-4d98-40d7-a4a0-adfd31a5bf68,DISK]]
2018-02-20 16:33:41,184 INFO  [regionserver/asf903:0.logRoller] wal.AbstractFSWAL(652): Archiving hdfs://localhost:55625/user/jenkins/test-data/46c46c01-4dc8-4506-a228-a4460f2a28e9/WALs/asf903.gq1.ygridcore.net,52039,1519144411613/asf903.gq1.ygridcore.net%2C52039%2C1519144411613.1519144421143 to hdfs://localhost:55625/user/jenkins/test-data/46c46c01-4dc8-4506-a228-a4460f2a28e9/oldWALs/asf903.gq1.ygridcore.net%2C52039%2C1519144411613.1519144421143
{noformat}

Here the WAL file is moved to oldWALs right after it is rolled. The test 
writes to the WAL directly, so this should not happen. Let me dig more.
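
For anyone scanning test logs for the same symptom, here is a minimal sketch of a check for "Rolled WAL" lines that report entries=0 together with a non-zero filesize, as in the log above. The class and method names (RolledWalCheck, looksInconsistent) are hypothetical and not part of HBase, and the regex only assumes the message shape shown in this comment. A non-zero size alone does not prove that entries were written, since the file may contain header data, so treat a match as a hint only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: flags "Rolled WAL" log lines
// that report zero entries but a non-zero file size.
public class RolledWalCheck {
    // Assumes the message shape "... with entries=N, filesize=X UNIT; ..."
    private static final Pattern ROLLED =
        Pattern.compile("with entries=(\\d+), filesize=(\\d+(?:\\.\\d+)?)");

    /** Returns true if the line reports entries=0 and a file size above zero. */
    public static boolean looksInconsistent(String logLine) {
        Matcher m = ROLLED.matcher(logLine);
        if (!m.find()) {
            return false; // not a "Rolled WAL" line in the assumed shape
        }
        long entries = Long.parseLong(m.group(1));
        double size = Double.parseDouble(m.group(2));
        return entries == 0 && size > 0.0;
    }

    public static void main(String[] args) {
        // Placeholder paths; only the entries/filesize part matters here.
        String line = "Rolled WAL /path/old with entries=0, filesize=35.46 KB; new WAL /path/new";
        System.out.println(looksInconsistent(line)); // prints "true"
    }
}
```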

> AbstractTestDLS.testThreeRSAbort sometimes fails in pre commit
> --------------------------------------------------------------
>
>                 Key: HBASE-19554
>                 URL: https://issues.apache.org/jira/browse/HBASE-19554
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Recovery, wal
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 2.0.0-beta-2
>
>         Attachments: HBASE-19554-thread-dump.patch, HBASE-19554.patch
>
>
> https://builds.apache.org/job/PreCommit-HBASE-Build/10554/artifact/patchprocess/patch-unit-hbase-server.txt
> The error message is a bit strange:
> {quote}
> [ERROR] testThreeRSAbort(org.apache.hadoop.hbase.master.TestDLSAsyncFSWAL) 
> Time elapsed: 20.627 s <<< ERROR!
> org.apache.hadoop.hbase.TableNotFoundException: Region of 
> 'hbase:namespace,,1513320505933.451650152885a3b41d0b1110deca513c.' is 
> expected in the table of 'testThreeRSAbort', but hbase:meta says it is in the 
> table of 'hbase:namespace'. hbase:meta might be damaged.
> {quote}
> It fails for both FSHLog and AsyncFSWAL. Need to dig more.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
