[ 
https://issues.apache.org/jira/browse/HBASE-8764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-8764:
-------------------------

      Resolution: Fixed
    Release Note: 
Add retrying of Master operations; helps when running hbase-it and chaos monkey 
kills
Master or Master is not yet up ready to take on operations.

Refactors ServerCallable.  ServerCallable had a public call() method and then 
beside
it a withRetries() and also a withoutRetries().  Confusing.  Also the rpc 
retrying
with its specific handling of server exception returns was not reusable buried 
down
in ServerCallable guts.

This patch moves the rpc retrying code out of ServerCallable into a utility
RpcRetryingCaller class (A 'Caller' runs the 'Callable'). ServerCallable 
shrinks,
implements a new RetryingCallable Interface, and becomes RegionServerCallable, 
a class
that is just about Calling -- no rpc nor retries, a Callable class with added 
details
on where the Callable is to be applied (table name and row), -- etc.

This pattern is then applied to Master operations.  Master operations were not 
retried
previously.  The Master operation Callables are now like RegionServerCallable 
(though
they need to carry way less detail), implement RetryingCallable, and are passed 
to
RpcRetryingCaller so they are retried.  Changed some exceptions so they now 
implement
DoNotRetryException because not all master operations should be retried.
    Hadoop Flags: Reviewed
          Status: Resolved  (was: Patch Available)

Committed to trunk and 0.95.  Thanks for review E.
                
> Some MasterMonitorCallable should retry
> ---------------------------------------
>
>                 Key: HBASE-8764
>                 URL: https://issues.apache.org/jira/browse/HBASE-8764
>             Project: HBase
>          Issue Type: Bug
>          Components: IPC/RPC
>    Affects Versions: 0.95.1
>            Reporter: Elliott Clark
>            Assignee: stack
>             Fix For: 0.95.2
>
>         Attachments: 8764.txt, 8764v2.txt, 8764v3.txt, 8764v4.txt, 
> 8796v5.txt, 8796v7.txt
>
>
> Calls in the admin that only get status should re-try.
> got a call stack like:
> {code}
> org.apache.hadoop.hbase.exceptions.PleaseHoldException: 
> org.apache.hadoop.hbase.exceptions.PleaseHoldException: Master is initializing
>       at 
> org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2266)
>       at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1610)
>       at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1646)
>       at 
> org.apache.hadoop.hbase.protobuf.generated.MasterAdminProtos$MasterAdminService$2.callBlockingMethod(MasterAdminProtos.java:20930)
>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2122)
>       at 
> org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1829)
>       at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>       at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>       at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
>       at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
>       at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
>       at 
> org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:230)
>       at 
> org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:2705)
>       at 
> org.apache.hadoop.hbase.client.HBaseAdmin.execute(HBaseAdmin.java:2674)
>       at 
> org.apache.hadoop.hbase.client.HBaseAdmin.createTableAsync(HBaseAdmin.java:524)
>       at 
> org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:417)
>       at 
> org.apache.hadoop.hbase.client.HBaseAdmin.createTable(HBaseAdmin.java:349)
>       at 
> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator.createSchema(IntegrationTestBigLinkedList.java:437)
>       at 
> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator.runGenerator(IntegrationTestBigLinkedList.java:471)
>       at 
> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Generator.run(IntegrationTestBigLinkedList.java:505)
>       at 
> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Loop.runGenerator(IntegrationTestBigLinkedList.java:698)
>       at 
> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedList$Loop.run(IntegrationTestBigLinkedList.java:748)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at 
> org.apache.hadoop.hbase.test.IntegrationTestBigLinkedListWithChaosMonkey.testContinuousIngest(IntegrationTestBigLinkedListWithChaosMonkey.java:80)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:601)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>       at org.junit.runners.Suite.runChild(Suite.java:127)
>       at org.junit.runners.Suite.runChild(Suite.java:26)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:160)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:138)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:117)
>       at 
> org.apache.hadoop.hbase.IntegrationTestsDriver.doWork(IntegrationTestsDriver.java:111)
>       at 
> org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:108)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>       at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>       at 
> org.apache.hadoop.hbase.IntegrationTestsDriver.main(IntegrationTestsDriver.java:47)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to