[ https://issues.apache.org/jira/browse/HBASE-20006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367958#comment-16367958 ]
stack commented on HBASE-20006:
-------------------------------
Ok, this is some read replica mess. I don't want to spend time figuring out
the file naming done for read replicas. Will leave it to a read replicas
person, if any are around. And I don't want this messing up our test runs, so
for now I'm disabling this test.
Other exceptions seen are:
{code}
java.io.IOException: java.io.IOException: java.io.FileNotFoundException: File does not exist: /user/jenkins/test-data/463e63dc-23bb-44ff-a32c-033c390552a6/data/default/testRestoreSnapshotAfterSplittingRegions-1518810548820/1c8eb80ac0831f0f27074b953eb647bb/cf/testRestoreSnapshotAfterSplittingRegions-1518810548820=1c8eb80ac0831f0f27074b953eb647bb-bfe5320da17b47e4b1553a14bacbc532
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1836)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1808)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1723)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:588)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:366)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213)
at org.apache.hadoop.hbase.regionserver.HRegion.initializeStores(HRegion.java:1040)
at org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:903)
at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:871)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:7017)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6974)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6945)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6901)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:6852)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:284)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:109)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}
... which then makes for failed assigns.
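For whoever picks this up: the missing file's name in the exception above
appears to follow the HFileLink-style layout, table=region-hfile. Below is a
minimal sketch of pulling such a name apart so the pieces can be checked
against what is actually on the filesystem. The class name, the method, and
the 32-hex-character assumption for the region and hfile parts are mine, for
illustration only; this is not HBase's HFileLink code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not HBase's HFileLink implementation.
class LinkNameSketch {
    // Assumed layout: <table>=<encodedRegion>-<hfile>,
    // with region and hfile each being 32 lowercase hex characters.
    private static final Pattern LINK_NAME =
        Pattern.compile("^(.+)=([0-9a-f]{32})-([0-9a-f]{32})$");

    /** Returns {table, region, hfile}, or null if the name does not match the assumed layout. */
    static String[] parse(String fileName) {
        Matcher m = LINK_NAME.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        // File name taken from the FileNotFoundException in this comment.
        String name = "testRestoreSnapshotAfterSplittingRegions-1518810548820"
            + "=1c8eb80ac0831f0f27074b953eb647bb-bfe5320da17b47e4b1553a14bacbc532";
        String[] parts = parse(name);
        if (parts == null) {
            System.out.println("not a link-style name: " + name);
            return;
        }
        System.out.println("table  = " + parts[0]);
        System.out.println("region = " + parts[1]);
        System.out.println("hfile  = " + parts[2]);
    }
}
```

Worth noting when reading the result: the region encoded in the file name is
the same as the region directory the file is being looked up in.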
> TestRestoreSnapshotFromClientWithRegionReplicas is flakey
> ---------------------------------------------------------
>
> Key: HBASE-20006
> URL: https://issues.apache.org/jira/browse/HBASE-20006
> Project: HBase
> Issue Type: Sub-task
> Reporter: stack
> Priority: Major
> Fix For: 2.0.0
>
> Attachments: HBASE-20006.branch-2.001.patch
>
>
> Failing 10% of the time. Interestingly, it is the below that causes the
> failure. We go to split but the region is already split. We then fail the
> split with an internal assert, which messes up procedures; at a minimum we
> should just not split (this is in the prepare stage).
> {code}
> 2018-02-15 23:21:42,162 INFO [PEWorker-12]
> procedure.MasterProcedureScheduler(571): pid=105,
> state=RUNNABLE:SPLIT_TABLE_REGION_PREPARE; SplitTableRegionProcedure
> table=testOnlineSnapshotAfterSplittingRegions-1518736887838,
> parent=3f850cea7d71a7ebd019f2f009efca4d,
> daughterA=06b5e6366efbef155d70e56cfdf58dc9,
> daughterB=8c175de1b33765a5683ac1e502edb0bd,
> table=testOnlineSnapshotAfterSplittingRegions-1518736887838,
> testOnlineSnapshotAfterSplittingRegions-1518736887838,,1518736887882.3f850cea7d71a7ebd019f2f009efca4d.
> 2018-02-15 23:21:42,162 INFO [PEWorker-12]
> assignment.SplitTableRegionProcedure(440): Split of {ENCODED =>
> 3f850cea7d71a7ebd019f2f009efca4d, NAME =>
> 'testOnlineSnapshotAfterSplittingRegions-1518736887838,,1518736887882.3f850cea7d71a7ebd019f2f009efca4d.',
> STARTKEY => '', ENDKEY => '1'} skipped; state is already SPLIT
> 2018-02-15 23:21:42,163 ERROR [PEWorker-12]
> procedure2.ProcedureExecutor(1480): CODE-BUG: Uncaught runtime exception:
> pid=105, state=RUNNABLE:SPLIT_TABLE_REGION_PREPARE; SplitTableRegionProcedure
> table=testOnlineSnapshotAfterSplittingRegions-1518736887838,
> parent=3f850cea7d71a7ebd019f2f009efca4d,
> daughterA=06b5e6366efbef155d70e56cfdf58dc9,
> daughterB=8c175de1b33765a5683ac1e502edb0bd
> java.lang.AssertionError: split region should have an exception here
> at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:228)
> at org.apache.hadoop.hbase.master.assignment.SplitTableRegionProcedure.executeFromState(SplitTableRegionProcedure.java:89)
> at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:180)
> at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:845)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1455)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1224)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:78)
> at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1734)
> {code}
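
On the description's point that at a minimum we should just not split: a rough
sketch of a prepare stage that treats an already-split parent as a clean skip
instead of tripping an assert. Every name here (SplitPrepareSketch,
RegionState, Outcome, prepare) is made up for illustration; this is not the
actual SplitTableRegionProcedure code and says nothing about how the real fix
should look.

```java
// Hypothetical names throughout; not the SplitTableRegionProcedure implementation.
class SplitPrepareSketch {
    enum RegionState { OPEN, SPLIT }

    enum Outcome { PROCEED, SKIPPED }

    /** Decides in the prepare stage whether the split should go ahead. */
    static Outcome prepare(RegionState parentState) {
        if (parentState == RegionState.SPLIT) {
            // Already split: report a clean skip so the caller can finish
            // without expecting a recorded failure and asserting on it.
            return Outcome.SKIPPED;
        }
        return Outcome.PROCEED;
    }

    public static void main(String[] args) {
        System.out.println("OPEN  -> " + prepare(RegionState.OPEN));
        System.out.println("SPLIT -> " + prepare(RegionState.SPLIT));
    }
}
```

The point of the explicit SKIPPED outcome is that "nothing to do" and "failed"
stay distinguishable, so the caller does not need an exception to be present
in order to stop.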