Ted Yu created SLIDER-276:
-----------------------------

             Summary: Inaccurate assertion in NodeEntry#release()
                 Key: SLIDER-276
                 URL: https://issues.apache.org/jira/browse/SLIDER-276
             Project: Slider
          Issue Type: Bug
            Reporter: Ted Yu
            Assignee: Ted Yu
            Priority: Minor


I issued a flex command to reduce the number of region servers by 1:
{code}
14/08/04 18:14:52 INFO state.AppState: RoleStatus{name='HBASE_REGIONSERVER', 
key=2, desired=1, actual=2, requested=0, releasing=0, failed=0, started=2, 
startFailed=0, completed=0, failureMessage=''}
14/08/04 18:14:52 INFO state.AppState: HBASE_REGIONSERVER: Asking for 1 fewer 
node(s) for a total of 1
14/08/04 18:14:52 INFO state.AppState: RoleStatus{name='HBASE_MASTER', key=1, 
desired=1, actual=1, requested=0, releasing=0, failed=0, started=1, 
startFailed=0, completed=0, failureMessage=''}
14/08/04 18:14:52 INFO state.AppState: RoleStatus{name='HBASE_REST', key=3, 
desired=1, actual=1, requested=0, releasing=0, failed=0, started=1, 
startFailed=0, completed=0, failureMessage=''}
14/08/04 18:14:52 INFO appmaster.SliderAppMaster: onContainersCompleted([1]
14/08/04 18:14:52 INFO appmaster.SliderAppMaster: Container Completion for 
containerID=container_1405721039692_0013_01_000004, state=COMPLETE, 
exitStatus=-100, diagnostics=Container released by application
14/08/04 18:14:52 INFO state.AppState: Container was queued for release
14/08/04 18:14:52 INFO state.AppState: decrementing role count for role 
HBASE_REGIONSERVER
14/08/04 18:14:53 INFO state.AppState: RoleStatus{name='HBASE_REGIONSERVER', 
key=2, desired=1, actual=1, requested=0, releasing=0, failed=0, started=2, 
startFailed=0, completed=1, failureMessage=''}
14/08/04 18:14:53 INFO state.AppState: RoleStatus{name='HBASE_MASTER', key=1, 
desired=1, actual=1, requested=0, releasing=0, failed=0, started=1, 
startFailed=0, completed=0, failureMessage=''}
14/08/04 18:14:53 INFO state.AppState: RoleStatus{name='HBASE_REST', key=3, 
desired=1, actual=1, requested=0, releasing=0, failed=0, started=1, 
startFailed=0, completed=0, failureMessage=''}
14/08/04 18:16:18 WARN agent.HeartbeatMonitor: Component 
container_1405721039692_0013_01_000004___HBASE_REGIONSERVER marked UNHEALTHY. 
Last heartbeat received at 1407176092207 approx. 86129 ms. back.
14/08/04 18:17:18 WARN agent.HeartbeatMonitor: Component 
container_1405721039692_0013_01_000004___HBASE_REGIONSERVER marked 
HEARTBEAT_LOST. Last heartbeat received at 1407176092207 approx. 146130 ms. 
back.
14/08/04 18:17:18 INFO appmaster.SliderAppMaster: Refreshing container 
container_1405721039692_0013_01_000004 per provider request.
14/08/04 18:17:18 WARN agent.HeartbeatMonitor: ERROR
java.lang.AssertionError: no live nodes to release
        at org.apache.slider.server.appmaster.state.NodeEntry.release(NodeEntry.java:172)
        at org.apache.slider.server.appmaster.state.RoleHistory.onContainerReleaseSubmitted(RoleHistory.java:656)
        at org.apache.slider.server.appmaster.state.AppState.containerReleaseSubmitted(AppState.java:919)
        at org.apache.slider.server.appmaster.state.AppState.releaseContainer(AppState.java:1491)
        at org.apache.slider.server.appmaster.SliderAppMaster.refreshContainer(SliderAppMaster.java:1444)
        at org.apache.slider.providers.agent.AgentProviderService.releaseContainer(AgentProviderService.java:391)
        at org.apache.slider.providers.agent.HeartbeatMonitor.doWork(HeartbeatMonitor.java:109)
        at org.apache.slider.providers.agent.HeartbeatMonitor.run(HeartbeatMonitor.java:69)
        at java.lang.Thread.run(Thread.java:722)
{code}
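As the log above shows, the NodeEntry#containerCompleted() event was received 
before NodeEntry#release() was called: the container completed at 18:14:52, 
and HeartbeatMonitor requested its release again at 18:17:18 after the 
heartbeat was lost.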

This triggered the following assertion:
{code}
  public synchronized void release() {
    assert live > 0 : "no live nodes to release";
{code}
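
To illustrate the ordering problem, here is a minimal sketch using a hypothetical, simplified stand-in for NodeEntry. The class name NodeEntrySketch, the counter semantics, and the boolean return value are all assumptions made for illustration; this is neither the actual Slider code nor a proposed patch. It only shows that a release arriving after the completion event finds nothing live, and that this case can be reported to the caller instead of tripping an assertion.

```java
// Hypothetical, simplified stand-in for NodeEntry; NOT the Slider class.
final class NodeEntrySketch {
    private int live;       // containers assumed to be running on this node
    private int releasing;  // containers whose release has been submitted

    /** A container started successfully on this node. */
    synchronized void onStartCompleted() {
        live++;
    }

    /** Completion event: settle a pending release first, otherwise a live container. */
    synchronized void containerCompleted() {
        if (releasing > 0) {
            releasing--;
        } else if (live > 0) {
            live--;
        }
    }

    /**
     * Release request. Returns false when no live container is left,
     * which is the case where the completion event arrived first.
     */
    synchronized boolean release() {
        if (live <= 0) {
            return false;
        }
        live--;
        releasing++;
        return true;
    }

    synchronized int getLive() {
        return live;
    }

    synchronized int getReleasing() {
        return releasing;
    }

    public static void main(String[] args) {
        NodeEntrySketch node = new NodeEntrySketch();
        node.onStartCompleted();
        node.containerCompleted();          // completion event arrives first
        boolean released = node.release();  // late release, as from HeartbeatMonitor
        System.out.println("released=" + released + " live=" + node.getLive());
    }
}
```

With this ordering the sketch prints released=false live=0, so the late release is refused rather than raising an AssertionError. Whether the real NodeEntry should ignore, log, or reject such a call is a separate design decision.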



--
This message was sent by Atlassian JIRA
(v6.2#6252)
