[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto
[ https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukul Kumar Singh updated HDDS-433:
----------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Thanks for the contribution [~ljain] and thanks for the review [~hanishakoneru]. I have committed this to trunk.

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> ------------------------------------------------------------------------------
>
>                 Key: HDDS-433
>                 URL: https://issues.apache.org/jira/browse/HDDS-433
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>            Priority: Blocker
>             Fix For: 0.2.1
>
>         Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with its index
> set to 0. This leads to an exception in Ratis. The returned LogEntryProto
> should be built from the input LogEntryProto.
> The following exception was seen using Ozone, where the leader sent incorrect
> append entries to a follower.
> {code}
> 2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index to:20312
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, lastRpcTime:1182, electionTimeout:990ms
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14 for changeToCandidate
> 2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 for changeToFollower
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, lastRpcTime:2167, electionTimeout:976ms
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14 for changeToCandidate
> 2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for initElection
> 2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15
> 2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 for changeToFollower
> 2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for appendEntries
> 2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8: Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, lastRpcElapsed=0ms
> 2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 response(s) [2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] and 0 exception(s); 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, voted=2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858, raftlog=[(t:14, i:20374)], conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:172.26.32.228:9858], old=null
> 2018-08-20 07:54:31,227 WARN org.apache.ratis.grpc.server.RaftServerProtocolService: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1
> java.lang.IllegalStateException: Unexpected Index: previous is (t:14, i:20374) but entries[0].getIndex()=0
>     at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
>     at org.apache.ratis.server.impl.RaftServerImpl.validateEntries(RaftServerImpl.java:786)
>     at
[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto
[ https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lokesh Jain updated HDDS-433:
----------------------------
    Description:
ContainerStateMachine#readStateMachineData returns a LogEntryProto with its index set to 0. This leads to an exception in Ratis. The returned LogEntryProto should be built from the input LogEntryProto.

The following exception was seen using Ozone, where the leader sent incorrect append entries to a follower.

{code}
2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index to:20312
2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, lastRpcTime:1182, electionTimeout:990ms
2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14 for changeToCandidate
2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 for changeToFollower
2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, lastRpcTime:2167, electionTimeout:976ms
2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14 for changeToCandidate
2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for initElection
2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15
2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 for changeToFollower
2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for appendEntries
2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8: Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, lastRpcElapsed=0ms
2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 response(s) [2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] and 0 exception(s); 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, voted=2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858, raftlog=[(t:14, i:20374)], conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:172.26.32.228:9858], old=null
2018-08-20 07:54:31,227 WARN org.apache.ratis.grpc.server.RaftServerProtocolService: 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1
java.lang.IllegalStateException: Unexpected Index: previous is (t:14, i:20374) but entries[0].getIndex()=0
    at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
    at org.apache.ratis.server.impl.RaftServerImpl.validateEntries(RaftServerImpl.java:786)
    at org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:859)
    at org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:824)
    at org.apache.ratis.server.impl.RaftServerProxy.appendEntriesAsync(RaftServerProxy.java:247)
    at org.apache.ratis.grpc.server.RaftServerProtocolService$1.onNext(RaftServerProtocolService.java:76)
    at org.apache.ratis.grpc.server.RaftServerProtocolService$1.onNext(RaftServerProtocolService.java:66)
    at org.apache.ratis.shaded.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
    at org.apache.ratis.shaded.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:252)
    at
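
The exception message in the trace above suggests that the follower checks whether the first appended entry directly follows its previous entry. The following is only a minimal sketch of such a check, not the Ratis code: `TermIndexSketch` and `AppendEntriesCheckSketch` are hypothetical stand-ins, and the real logic in `RaftServerImpl.validateEntries` may do more than this.

```java
import java.util.List;

/** Hypothetical stand-in for a (term, index) pair; not the Ratis class. */
final class TermIndexSketch {
  final long term;
  final long index;

  TermIndexSketch(long term, long index) {
    this.term = term;
    this.index = index;
  }

  @Override
  public String toString() {
    return "(t:" + term + ", i:" + index + ")";
  }
}

/** Hypothetical sketch of the follower-side check implied by the log message. */
final class AppendEntriesCheckSketch {
  /** Throws if the first new entry does not directly follow the previous one. */
  static void validateEntries(TermIndexSketch previous, List<TermIndexSketch> entries) {
    if (entries.isEmpty()) {
      return; // nothing to validate
    }
    long index0 = entries.get(0).index;
    if (previous != null && index0 != previous.index + 1) {
      throw new IllegalStateException("Unexpected Index: previous is " + previous
          + " but entries[0].getIndex()=" + index0);
    }
  }

  public static void main(String[] args) {
    TermIndexSketch previous = new TermIndexSketch(14, 20374);

    // An entry at index 20375 directly follows index 20374, so this passes.
    validateEntries(previous, List.of(new TermIndexSketch(15, 20375)));

    // An entry whose index was left at 0 fails the check.
    try {
      validateEntries(previous, List.of(new TermIndexSketch(15, 0)));
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Under these assumptions, an entry rebuilt with index 0 is rejected however far the follower's log has advanced, which matches the `entries[0].getIndex()=0` in the reported message.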
[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto
[ https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lokesh Jain updated HDDS-433:
----------------------------
    Affects Version/s:     (was: 0.2.1)

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> ------------------------------------------------------------------------------
>
>                 Key: HDDS-433
>                 URL: https://issues.apache.org/jira/browse/HDDS-433
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>            Priority: Blocker
>             Fix For: 0.2.1
>
>         Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with its index
> set to 0. This leads to an exception in Ratis. The returned LogEntryProto
> should be built from the input LogEntryProto.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
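
The description says the returned entry should be built from the input entry. As a rough illustration of why that matters with protobuf-style builders, the sketch below uses a hypothetical, simplified `LogEntrySketch` class in place of the generated `LogEntryProto`. An empty builder leaves numeric fields at their default of 0, while a builder seeded from an existing message keeps its fields. The method names and the `data` field are assumptions made for the example; this is not the code from HDDS-433.001.patch.

```java
/** Hypothetical, simplified stand-in for a protobuf-style log entry message. */
final class LogEntrySketch {
  final long term;
  final long index;
  final String data;

  private LogEntrySketch(Builder b) {
    this.term = b.term;
    this.index = b.index;
    this.data = b.data;
  }

  /** Empty builder: numeric fields default to 0, as with protobuf messages. */
  static Builder newBuilder() {
    return new Builder();
  }

  /** Builder seeded with every field of an existing entry. */
  static Builder newBuilder(LogEntrySketch prototype) {
    Builder b = new Builder();
    b.term = prototype.term;
    b.index = prototype.index;
    b.data = prototype.data;
    return b;
  }

  static final class Builder {
    private long term;
    private long index;
    private String data = "";

    Builder setTerm(long term) { this.term = term; return this; }
    Builder setIndex(long index) { this.index = index; return this; }
    Builder setData(String data) { this.data = data; return this; }
    LogEntrySketch build() { return new LogEntrySketch(this); }
  }
}

/** Hypothetical sketch contrasting the two ways of building the returned entry. */
final class ReadStateMachineDataSketch {
  /** Problematic shape: the input entry's term and index are lost. */
  static LogEntrySketch readFromEmptyBuilder(LogEntrySketch entry, String stateMachineData) {
    return LogEntrySketch.newBuilder().setData(stateMachineData).build();
  }

  /** Intended shape: start from the input entry, then attach the data. */
  static LogEntrySketch readOverInputEntry(LogEntrySketch entry, String stateMachineData) {
    return LogEntrySketch.newBuilder(entry).setData(stateMachineData).build();
  }

  public static void main(String[] args) {
    LogEntrySketch input = LogEntrySketch.newBuilder().setTerm(14).setIndex(20375).build();
    System.out.println(readFromEmptyBuilder(input, "chunk").index); // prints 0
    System.out.println(readOverInputEntry(input, "chunk").index);   // prints 20375
  }
}
```

With generated protobuf classes, the equivalent of the second method would typically go through `newBuilder(prototype)` or `toBuilder()`, so the term and index set by the log survive when the state machine data is attached.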
[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto
[ https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lokesh Jain updated HDDS-433:
----------------------------
    Fix Version/s: 0.2.1

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> ------------------------------------------------------------------------------
>
>                 Key: HDDS-433
>                 URL: https://issues.apache.org/jira/browse/HDDS-433
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>            Priority: Blocker
>             Fix For: 0.2.1
>
>         Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with its index
> set to 0. This leads to an exception in Ratis. The returned LogEntryProto
> should be built from the input LogEntryProto.
[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto
[ https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lokesh Jain updated HDDS-433:
----------------------------
    Status: Patch Available  (was: Open)

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> ------------------------------------------------------------------------------
>
>                 Key: HDDS-433
>                 URL: https://issues.apache.org/jira/browse/HDDS-433
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.2.1
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>            Priority: Blocker
>         Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with its index
> set to 0. This leads to an exception in Ratis. The returned LogEntryProto
> should be built from the input LogEntryProto.
[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto
[ https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lokesh Jain updated HDDS-433:
----------------------------
    Attachment: HDDS-433.001.patch

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> ------------------------------------------------------------------------------
>
>                 Key: HDDS-433
>                 URL: https://issues.apache.org/jira/browse/HDDS-433
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Datanode
>    Affects Versions: 0.2.1
>            Reporter: Lokesh Jain
>            Assignee: Lokesh Jain
>            Priority: Blocker
>         Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with its index
> set to 0. This leads to an exception in Ratis. The returned LogEntryProto
> should be built from the input LogEntryProto.