[ 
https://issues.apache.org/jira/browse/HDDS-1102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16809635#comment-16809635
 ] 

Shashikant Banerjee commented on HDDS-1102:
-------------------------------------------

Thanks [~elek] for successfully testing it. Yes, the issue happens after 
the pipeline is destroyed and the restarted datanode sees that the group no 
longer exists.
The error logs come from Ratis, and we can just make them WARN instead of INFO 
in Ratis, as the group removal is enforced by Ozone; it is not a Ratis behaviour.
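
As a rough illustration of the log-level idea, here is a minimal sketch. It is 
not the actual Ratis code: the class VoteErrorLogging, its methods, and the 
string match on "group-... not found" are all hypothetical, and it uses 
java.util.logging only to stay self-contained. It logs the expected "group not 
found" case as a single WARN line without a stack trace and leaves other 
failures logged in full.

```java
import java.util.concurrent.ExecutionException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of Ratis or Ozone.
public class VoteErrorLogging {
  private static final Logger LOG =
      Logger.getLogger(VoteErrorLogging.class.getName());

  /** Walks the cause chain looking for a "group-... not found" message. */
  static boolean isGroupNotFound(Throwable t) {
    for (Throwable c = t; c != null; c = c.getCause()) {
      String msg = c.getMessage();
      // Assumption: the message text is the only signal available here.
      if (msg != null && msg.contains("group-") && msg.contains("not found")) {
        return true;
      }
    }
    return false;
  }

  /** Logs a vote-request failure and returns the level that was used. */
  static Level logVoteFailure(String peerId, Throwable t) {
    if (isGroupNotFound(t)) {
      // Expected after a pipeline is destroyed: one line, no stack trace.
      LOG.log(Level.WARNING,
          "{0}: peer no longer hosts the group (pipeline probably destroyed)",
          peerId);
      return Level.WARNING;
    }
    // Anything else is unexpected, so keep the full stack trace.
    LOG.log(Level.INFO, peerId + " got exception when requesting votes", t);
    return Level.INFO;
  }

  public static void main(String[] args) {
    Throwable expected = new ExecutionException(new RuntimeException(
        "INTERNAL: a3d1dd2d-554e-4e87-a2cf-076a229af352: "
            + "group-FD6FA533F1FB not found."));
    logVoteFailure("6a0522ba-019e-4b77-ac1f-a9322cd525b8", expected);
    logVoteFailure("6a0522ba-019e-4b77-ac1f-a9322cd525b8",
        new RuntimeException("connection refused"));
  }
}
```

Matching on the message text is fragile; a real change would more likely key 
off a dedicated exception type or status code if one is available.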

> Confusing error log when datanode tries to connect to a destroyed pipeline
> --------------------------------------------------------------------------
>
>                 Key: HDDS-1102
>                 URL: https://issues.apache.org/jira/browse/HDDS-1102
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Nilotpal Nandi
>            Assignee: Shashikant Banerjee
>            Priority: Critical
>              Labels: pushed-to-craterlake, test-badlands
>         Attachments: allnode.log, datanode.log
>
>
> steps taken:
> --------------------
>  # created 5 datanode cluster.
>  # shutdown 2 datanodes
>  # started the datanodes again.
> One of the datanodes was shut down.
> exception seen :
>  
> {noformat}
> 2019-02-14 07:37:26 INFO LeaderElection:230 - 
> 6a0522ba-019e-4b77-ac1f-a9322cd525b8 got exception when requesting votes: {}
> java.util.concurrent.ExecutionException: 
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: 
> a3d1dd2d-554e-4e87-a2cf-076a229af352: group-FD6FA533F1FB not found.
>  at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>  at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>  at 
> org.apache.ratis.server.impl.LeaderElection.waitForResults(LeaderElection.java:214)
>  at 
> org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:146)
>  at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:102)
> Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: 
> INTERNAL: a3d1dd2d-554e-4e87-a2cf-076a229af352: group-FD6FA533F1FB not found.
>  at 
> org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.toStatusRuntimeException(ClientCalls.java:233)
>  at 
> org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.getUnchecked(ClientCalls.java:214)
>  at 
> org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls.blockingUnaryCall(ClientCalls.java:139)
>  at 
> org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$RaftServerProtocolServiceBlockingStub.requestVote(RaftServerProtocolServiceGrpc.java:265)
>  at 
> org.apache.ratis.grpc.server.GrpcServerProtocolClient.requestVote(GrpcServerProtocolClient.java:83)
>  at org.apache.ratis.grpc.server.GrpcService.requestVote(GrpcService.java:187)
>  at 
> org.apache.ratis.server.impl.LeaderElection.lambda$submitRequests$0(LeaderElection.java:188)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> 2019-02-14 07:37:26 INFO LeaderElection:46 - 
> 6a0522ba-019e-4b77-ac1f-a9322cd525b8: Election PASSED; received 1 response(s) 
> [6a0522ba-019e-4b77-ac1f-a9322cd525b8<-61ad3bf3-e9b1-48e5-90e3-3b78c8b5bba5#0:OK-t7]
>  and 1 exception(s); 6a0522ba-019e-4b77-ac1f-a9322cd525b8:t7, leader=null, 
> voted=6a0522ba-019e-4b77-ac1f-a9322cd525b8, 
> raftlog=6a0522ba-019e-4b77-ac1f-a9322cd525b8-SegmentedRaftLog:OPENED, conf=3: 
> [61ad3bf3-e9b1-48e5-90e3-3b78c8b5bba5:172.20.0.8:9858, 
> 6a0522ba-019e-4b77-ac1f-a9322cd525b8:172.20.0.6:9858, 
> 0f377918-aafa-4d8a-972a-6ead54048fba:172.20.0.3:9858], old=null
> 2019-02-14 07:37:26 INFO LeaderElection:52 - 0: 
> java.util.concurrent.ExecutionException: 
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: INTERNAL: 
> a3d1dd2d-554e-4e87-a2cf-076a229af352: group-FD6FA533F1FB not found.
> 2019-02-14 07:37:26 INFO RoleInfo:130 - 6a0522ba-019e-4b77-ac1f-a9322cd525b8: 
> shutdown LeaderElection
> 2019-02-14 07:37:26 INFO RaftServerImpl:161 - 
> 6a0522ba-019e-4b77-ac1f-a9322cd525b8 changes role from CANDIDATE to LEADER at 
> term 7 for changeToLeader
> 2019-02-14 07:37:26 INFO RaftServerImpl:258 - 
> 6a0522ba-019e-4b77-ac1f-a9322cd525b8: change Leader from null to 
> 6a0522ba-019e-4b77-ac1f-a9322cd525b8 at term 7 for becomeLeader, leader 
> elected after 1066ms
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - 
> raft.server.staging.catchup.gap = 1000 (default)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - raft.server.rpc.sleep.time 
> = 25ms (default)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - raft.server.watch.timeout 
> = 10s (default)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - 
> raft.server.watch.timeout.denomination = 1s (default)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - 
> raft.server.log.appender.snapshot.chunk.size.max = 16MB (=16777216) (default)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - 
> raft.server.log.appender.buffer.byte-limit = 33554432 (custom)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - 
> raft.server.log.appender.buffer.element-limit = 1 (custom)
> 2019-02-14 07:37:26 INFO GrpcConfigKeys$Server:43 - 
> raft.grpc.server.leader.outstanding.appends.max = 128 (default)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - 
> raft.server.rpc.request.timeout = 3000ms (default)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - 
> raft.server.log.appender.snapshot.chunk.size.max = 16MB (=16777216) (default)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - 
> raft.server.log.appender.buffer.byte-limit = 33554432 (custom)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - 
> raft.server.log.appender.buffer.element-limit = 1 (custom)
> 2019-02-14 07:37:26 INFO GrpcConfigKeys$Server:43 - 
> raft.grpc.server.leader.outstanding.appends.max = 128 (default)
> 2019-02-14 07:37:26 INFO RaftServerConfigKeys:43 - 
> raft.server.rpc.request.timeout = 3000ms (default)
> 2019-02-14 07:37:26 INFO RoleInfo:139 - 6a0522ba-019e-4b77-ac1f-a9322cd525b8: 
> start LeaderState
> 2019-02-14 07:37:26 INFO RaftLogWorker:303 - 
> 6a0522ba-019e-4b77-ac1f-a9322cd525b8-RaftLogWorker: Rolling segment log-3_4 
> to index:4
> 2019-02-14 07:37:26 INFO RaftLogWorker:403 - 
> 6a0522ba-019e-4b77-ac1f-a9322cd525b8-RaftLogWorker: Rolled log segment from 
> /data/metadata/ratis/134f574e-c1b0-4556-a206-fd6fa533f1fb/current/log_inprogress_3
>  to /data/metadata/ratis/134f574e-c1b0-4556-a206-fd6fa533f1fb/current/log_3-4
> 2019-02-14 07:37:26 INFO RaftServerImpl:354 - 
> 6a0522ba-019e-4b77-ac1f-a9322cd525b8: set configuration 5: 
> [61ad3bf3-e9b1-48e5-90e3-3b78c8b5bba5:172.20.0.8:9858, 
> 6a0522ba-019e-4b77-ac1f-a9322cd525b8:172.20.0.6:9858, 
> 0f377918-aafa-4d8a-972a-6ead54048fba:172.20.0.3:9858], old=null at 5
> /opt/starter.sh: line 162: 13 Killed $@
>  
>  
> {noformat}
>  


