[ 
https://issues.apache.org/jira/browse/HDDS-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056664#comment-17056664
 ] 

Siddharth Wagle edited comment on HDDS-3104 at 3/11/20, 5:40 AM:
-----------------------------------------------------------------

[~adoroszlai] I think this happens because:
1. HddsDatanodeService#stop calls DatanodeStateMachine#stopDaemon, which sets 
StateContext.state = SHUTDOWN
2. The DatanodeStateMachine still calls StateContext.execute because it read a 
stale value of the state
3. StateContext.execute sets shutDownOnError to true when it sees this state, 
even though there was no error

I have a proposed patch for this. I am attaching it here before making a PR and 
would like to know your thoughts.
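
To make the interleaving in steps 1-3 concrete, here is a minimal, self-contained sketch. It is only a model of the sequence described above, not the actual Ozone code and not the attached patch: the class SketchContext, the stopRequested flag, and the methods executeNaive, executeGuarded and replay are all hypothetical names introduced for illustration. The guard shown is just one possible way to tell a requested stop apart from a real error.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model of the race; none of these names exist in Ozone.
class StaleStateSketch {
    enum State { RUNNING, SHUTDOWN }

    /** Simplified stand-in for the state context described in the comment. */
    static class SketchContext {
        final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
        final AtomicBoolean shutdownOnError = new AtomicBoolean(false);
        // Assumption: an extra flag that records an explicit stop request.
        final AtomicBoolean stopRequested = new AtomicBoolean(false);

        /** Step 1: a regular stop flips the state to SHUTDOWN. */
        void stop() {
            stopRequested.set(true);
            state.set(State.SHUTDOWN);
        }

        /** Step 3 as described: SHUTDOWN seen in execute is treated as an error. */
        void executeNaive() {
            if (state.get() == State.SHUTDOWN) {
                shutdownOnError.set(true);
            }
        }

        /** One possible guard: ignore SHUTDOWN that came from a requested stop. */
        void executeGuarded() {
            if (state.get() == State.SHUTDOWN && !stopRequested.get()) {
                shutdownOnError.set(true);
            }
        }
    }

    /** Replays the interleaving deterministically on a single thread. */
    static boolean replay(boolean guarded) {
        SketchContext ctx = new SketchContext();
        State observed = ctx.state.get();   // step 2: the loop reads RUNNING
        ctx.stop();                         // step 1: stop happens in between
        if (observed != State.SHUTDOWN) {   // the loop acts on its stale read
            if (guarded) {
                ctx.executeGuarded();
            } else {
                ctx.executeNaive();
            }
        }
        return ctx.shutdownOnError.get();
    }

    public static void main(String[] args) {
        System.out.println("naive flags error:   " + replay(false)); // true
        System.out.println("guarded flags error: " + replay(true));  // false
    }
}
```

In the naive variant a normal stop ends up looking like a critical error, which matches the "setting shutDownMachine" line in the test log. The guarded variant leaves shutdownOnError untouched for the same interleaving.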


was (Author: swagle):
[~adoroszlai] I think this happens because:
1. HddsDatanodeService#stop, calls DatanodeStateMachine#stopDaemon which sets 
the StateContext.state = SHUTDOWN
2. The DatanodeStateMachine is calling StateContext.execute because the stale 
state it read
3. StateContext.execute will set shutDownOnError to true when it sees this state 
even though there was no error

So, I have a proposed patch for this, attaching it here before making a PR, 
would like to know your thoughts.

> Integration test crashes due to critical error in datanode
> ----------------------------------------------------------
>
>                 Key: HDDS-3104
>                 URL: https://issues.apache.org/jira/browse/HDDS-3104
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Attila Doroszlai
>            Assignee: Siddharth Wagle
>            Priority: Major
>         Attachments: HDDS-3104.patch, 
> org.apache.hadoop.ozone.client.rpc.TestContainerStateMachineFailureOnRead-output.txt
>
>
> {code:title=test log}
> 2020-02-28 07:36:17,759 [Datanode State Machine Thread - 0] ERROR 
> statemachine.StateContext (StateContext.java:execute(420)) - Critical error 
> occurred in StateMachine, setting shutDownMachine
> ...
> 2020-02-28 07:36:21,216 [Datanode State Machine Thread - 0] INFO  
> util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: 
> ExitException
> {code}
> {code:title=build output}
> [ERROR] ExecutionException The forked VM terminated without properly saying 
> goodbye. VM crash or System.exit called?
> {code}
> https://github.com/adoroszlai/hadoop-ozone/runs/474218807
> https://github.com/adoroszlai/hadoop-ozone/suites/487650271/artifacts/2327174



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
