[ 
https://issues.apache.org/jira/browse/HBASE-18525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116032#comment-16116032
 ] 

Umesh Agashe commented on HBASE-18525:
--------------------------------------

Thanks [~stack]! yes, I would like to consider this together with other related 
JIRAs linked to HBASE-18366. Let discuss this further.

Thanks [~tedyu] for the V2 patch! I'm looking at the patch. It looks better. 
You have mentioned that the test is hanging with this patch. Let me check the 
logs and see why its hanging.


> TestAssignmentManager#testSocketTimeout fails in master branch
> --------------------------------------------------------------
>
>                 Key: HBASE-18525
>                 URL: https://issues.apache.org/jira/browse/HBASE-18525
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 18525.v1.txt, 18525.v2.txt
>
>
> Toward the end of the test output, I saw:
> {code}
> 2017-08-05 03:30:16,591 INFO  [Time-limited test] 
> assignment.TestAssignmentManager(446): ExecutionException
> java.util.concurrent.ExecutionException: 
> org.apache.hadoop.hbase.master.procedure.ServerCrashException: 
> ServerCrashProcedure pid=3, server=localhost,103,1
>   at 
> org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait$ProcedureFuture.get(ProcedureSyncWait.java:104)
>   at 
> org.apache.hadoop.hbase.master.procedure.ProcedureSyncWait$ProcedureFuture.get(ProcedureSyncWait.java:62)
>   at 
> org.apache.hadoop.hbase.master.assignment.TestAssignmentManager.waitOnFuture(TestAssignmentManager.java:444)
>   at 
> org.apache.hadoop.hbase.master.assignment.TestAssignmentManager.testSocketTimeout(TestAssignmentManager.java:255)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.hadoop.hbase.master.procedure.ServerCrashException: 
> ServerCrashProcedure pid=3, server=localhost,103,1
>   at 
> org.apache.hadoop.hbase.master.assignment.UnassignProcedure.updateTransition(UnassignProcedure.java:169)
>   at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:274)
>   at 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure.execute(RegionTransitionProcedure.java:57)
>   at 
> org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:847)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1440)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1209)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$800(ProcedureExecutor.java:79)
>   at 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1719)
> {code}
> This test failure seems to happen after HBASE-18491 was checked in.
> Looking at the change in UnassignProcedure, it seems we should handle the two 
> conditions differently:
> {code}
>      if (serverCrashed.get() || !isServerOnline(env, regionNode)) {
> {code}
> With attached patch, TestAssignmentManager#testSocketTimeout and 
> TestServerCrashProcedure#testRecoveryAndDoubleExecutionOnRsWithMeta pass.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to