[ 
https://issues.apache.org/jira/browse/RATIS-209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388775#comment-16388775
 ] 

Tsz Wo Nicholas Sze commented on RATIS-209:
-------------------------------------------

Hey [~msingh], this is a NPE in ContainerStateMachine.java:265 in Ozone (not 
Ratis)
{code:java}
ava.lang.NullPointerException
        at 
org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.applyTransaction(ContainerStateMachine.java:265)
 {code}

> StateMachine updater may miss writeLog Request after a new Leader is chosen.
> ----------------------------------------------------------------------------
>
>                 Key: RATIS-209
>                 URL: https://issues.apache.org/jira/browse/RATIS-209
>             Project: Ratis
>          Issue Type: Bug
>    Affects Versions: 0.2.0-alpha
>            Reporter: Mukul Kumar Singh
>            Assignee: Mukul Kumar Singh
>            Priority: Major
>
> This issue is happening on ozone for write chunk request.
> 1) currently write chunk request is processed in two phases, in the first 
> phase the user data is written to the follower as part of 
> {{writeStateMachineData}} and then the entry is committed to the follower as 
> part of {{commit}}.
> 2) The issue which is hit right now is the case where a) 
> {{writeStateMachineData}} didn't happen for a particular chunk however b) the 
> commit entry is still processed. this leads to a case where a corresponding 
> stateMachineFuture is not present in the hashmap.
> {code}
> 2018-02-12 00:26:30,097 INFO org.apache.ratis.server.impl.FollowerState: 
> 172.26.32.228_9858 changes to CANDIDATE, lastRpcTime:1773, 
> electionTimeout:873ms
> 2018-02-12 00:26:30,098 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 172.26.32.228_9858 changes role from FOLLOWER to CANDIDATE at term 3 for 
> changeToCandidate
> 2018-02-12 00:26:30,100 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 172.26.32.228_9858: change Leader from 172.26.32.232_9858 to null at term 3 
> for initElection
> 2018-02-12 00:26:32,869 INFO org.apache.ratis.server.impl.LeaderElection: 
> 172.26.32.228_9858: begin an election in Term 4
> 2018-02-12 00:26:32,901 INFO 
> org.apache.ratis.grpc.server.RaftServerProtocolService: 172.26.32.228_9858: 
> appendEntries completed
> 2018-02-12 00:26:33,217 INFO org.apache.ratis.server.impl.LeaderElection: 
> 172.26.32.228_9858: Election REJECTED; received 2 response(s) 
> [172.26.32.228_9858<-172.26.32.230_9858#0:FAIL-t4, 
> 172.26.32.228_9858<-172.26.32.232_9858#0:FAIL-t4] and 0 exception(s); 
> 172.26.32.228_9858:t4, leader=null, voted=172.26.32.228_9858, raftlog=[(t:3, 
> i:10711)], conf=[172.26.32.228_9858:172.26.32.228:9858, 
> 172.26.32.230_9858:172.26.32.230:9858, 
> 172.26.32.232_9858:172.26.32.232:9858], old=null
> 2018-02-12 00:26:33,217 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 172.26.32.228_9858 changes role from CANDIDATE to FOLLOWER at term 4 for 
> changeToFollower
> 2018-02-12 00:26:39,518 INFO org.apache.ratis.server.impl.FollowerState: 
> 172.26.32.228_9858 changes to CANDIDATE, lastRpcTime:5624, 
> electionTimeout:975ms
> 2018-02-12 00:26:39,518 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 172.26.32.228_9858 changes role from FOLLOWER to CANDIDATE at term 5 for 
> changeToCandidate
> 2018-02-12 00:26:39,518 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 172.26.32.228_9858 changes role from CANDIDATE to FOLLOWER at term 5 for 
> changeToFollower
> 2018-02-12 00:26:39,520 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 172.26.32.228_9858: change Leader from null to 172.26.32.232_9858 at term 5 
> for appendEntries
> {code}
> {code}
> 2018-02-12 00:31:12,400 ERROR 
> org.apache.ratis.server.impl.StateMachineUpdater: Terminating with exit 
> status 2: StateMachineUpdater-172.26.32.228_9858: the StateMachineUpdater 
> hits Throwable
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.applyTransaction(ContainerStateMachine.java:254)
>         at 
> org.apache.ratis.server.impl.RaftServerImpl.applyLogToStateMachine(RaftServerImpl.java:1001)
>         at 
> org.apache.ratis.server.impl.StateMachineUpdater.run(StateMachineUpdater.java:151)
>         at java.lang.Thread.run(Thread.java:745)
> 2018-02-12 00:31:12,406 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> SHUTDOWN_MSG: 
> /************************************************************
> SHUTDOWN_MSG: Shutting down DataNode at 
> y128.l42scl.hortonworks.com/172.26.32.228
> ************************************************************/
> *** shutting down gRPC server since JVM is shutting down
> *** server shut down
> ***
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to