Nilotpal Nandi created HDDS-904:
-----------------------------------

             Summary: RATIS group not found thrown on datanodes while leader election
                 Key: HDDS-904
                 URL: https://issues.apache.org/jira/browse/HDDS-904
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
          Components: Ozone Datanode, SCM
            Reporter: Nilotpal Nandi
         Attachments: datanode_1.log, datanode_2.log, datanode_3.log, scm.log

The following exception was seen in datanode.log of one of the docker nodes:
---------------------------------------------------------------------------------------------

{noformat}
2018-12-06 09:32:11 INFO LeaderElection:127 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: begin an election in Term 1
2018-12-06 09:32:12 INFO LeaderElection:46 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: Election TIMEOUT; received 0 response(s) [] and 0 exception(s); 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:t1, leader=null, voted=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500, raftlog=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500-SegmentedRaftLog:OPENED, conf=-1: [76153aab-4681-40b6-bc32-cc9ed5ef1daf:192.168.0.7:9858, 79ca7251-7514-4c53-968c-ade59d6df07b:192.168.0.6:9858, 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:192.168.0.4:9858], old=null
2018-12-06 09:32:13 INFO LeaderElection:127 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: begin an election in Term 2
2018-12-06 09:32:13 INFO LeaderElection:230 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500 got exception when requesting votes: {}
java.util.concurrent.ExecutionException: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: 76153aab-4681-40b6-bc32-cc9ed5ef1daf: group-41B8C34A6DE4 not found.
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:214)
    at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:146)
    at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:102)
Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: 76153aab-4681-40b6-bc32-cc9ed5ef1daf: group-41B8C34A6DE4 not found.
    at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:222)
    at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:203)
    at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:132)
    at org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
    at org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:63)
    at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:150)
    at org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$0(LeaderElection.java:188)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2018-12-06 09:32:14 INFO LeaderElection:46 - 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500: Election TIMEOUT; received 0 response(s) [] and 1 exception(s); 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:t2, leader=null, voted=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500, raftlog=0e3aa95d-ab51-4b20-9bff-3f7bd7df0500-SegmentedRaftLog:OPENED, conf=-1: [76153aab-4681-40b6-bc32-cc9ed5ef1daf:192.168.0.7:9858, 79ca7251-7514-4c53-968c-ade59d6df07b:192.168.0.6:9858, 0e3aa95d-ab51-4b20-9bff-3f7bd7df0500:192.168.0.4:9858], old=null
{noformat}

cc - [~ljain]. All logs are attached.