[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-12 Thread Mukul Kumar Singh (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDDS-433:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the contribution [~ljain] and thanks for the review 
[~hanishakoneru]. I have committed this to trunk.

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> --
>
> Key: HDDS-433
> URL: https://issues.apache.org/jira/browse/HDDS-433
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with its 
> index set to 0, which leads to an exception in Ratis. The returned 
> LogEntryProto should be built over the input LogEntryProto.
> The following exception was seen while using Ozone, where the leader sent 
> incorrect append entries to the follower.
> {code}
> 2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: 
> Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index 
> to:20312
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:1182, electionTimeout:990ms
> 2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 
> for changeToFollower
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
> lastRpcTime:2167, electionTimeout:976ms
> 2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
> for changeToCandidate
> 2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for initElection
> 2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15
> 2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
> org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 
> for changeToFollower
> 2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for appendEntries
>  
> 2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8:
>  Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with 
> term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, 
> lastRpcElapsed=0ms
>  
> 2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 
> response(s) 
> [2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] 
> and 0 exception(s); 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, 
> leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, 
> voted=2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858, raftlog=[(t:14, i:20374)], 
> conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 
> 2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:172.26.32.228:9858], old=null
> 2018-08-20 07:54:31,227 WARN 
> org.apache.ratis.grpc.server.RaftServerProtocolService: 
> 2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries 
> b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1
> java.lang.IllegalStateException: Unexpected Index: previous is (t:14, 
> i:20374) but entries[0].getIndex()=0
> at 
> org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
> at 
> org.apache.ratis.server.impl.RaftServerImpl.validateEntries(RaftServerImpl.java:786)
> at 
> 

[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Lokesh Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-433:
-
Description: 
ContainerStateMachine#readStateMachineData returns a LogEntryProto with its 
index set to 0, which leads to an exception in Ratis. The returned 
LogEntryProto should be built over the input LogEntryProto.
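
As a rough illustration of the difference between building a fresh entry and 
building over the input entry, here is a minimal, self-contained Java sketch. 
It does not use the Ratis or Ozone classes: Entry, Builder, readFresh, 
readOverInput and followsPrevious are hypothetical stand-ins, and the index 
20375 is an assumed value chosen only because the reported log shows a previous 
index of 20374. The check in followsPrevious is an assumption inferred from the 
"Unexpected Index" message, not a copy of the Ratis validation code.

```java
// Hypothetical stand-ins only; these are NOT the Ratis or Ozone classes.
public class LogEntrySketch {

    /** Minimal immutable log entry with a protobuf-style builder. */
    static final class Entry {
        final long term;
        final long index;
        final String data;

        private Entry(long term, long index, String data) {
            this.term = term;
            this.index = index;
            this.data = data;
        }

        /** Fresh builder: every field starts at its default value. */
        static Builder newBuilder() {
            return new Builder();
        }

        /** Builder seeded with all fields of this entry. */
        Builder toBuilder() {
            return new Builder().setTerm(term).setIndex(index).setData(data);
        }
    }

    static final class Builder {
        private long term;        // stays 0 if never set
        private long index;       // stays 0 if never set
        private String data = "";

        Builder setTerm(long value) { term = value; return this; }
        Builder setIndex(long value) { index = value; return this; }
        Builder setData(String value) { data = value; return this; }
        Entry build() { return new Entry(term, index, data); }
    }

    /** Problematic shape: a fresh builder drops the input's term and index. */
    static Entry readFresh(Entry input, String payload) {
        return Entry.newBuilder().setData(payload).build();
    }

    /** Intended shape: build over the input so term and index carry through. */
    static Entry readOverInput(Entry input, String payload) {
        return input.toBuilder().setData(payload).build();
    }

    /** Assumed follower-side check: the first entry must directly follow the previous index. */
    static boolean followsPrevious(long previousIndex, Entry first) {
        return first.index == previousIndex + 1;
    }

    public static void main(String[] args) {
        Entry input = Entry.newBuilder().setTerm(14).setIndex(20375).build();
        System.out.println(followsPrevious(20374, readFresh(input, "chunk")));     // prints false
        System.out.println(followsPrevious(20374, readOverInput(input, "chunk"))); // prints true
    }
}
```

In this sketch the fresh builder yields index 0, so the entry cannot follow 
index 20374, while the builder seeded from the input keeps index 20375.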

The following exception was seen while using Ozone, where the leader sent 
incorrect append entries to the follower.

{code}
2018-08-20 07:54:06,200 INFO org.apache.ratis.server.storage.RaftLogWorker: 
Rolling segment:2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-RaftLogWorker index 
to:20312
2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.FollowerState: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
lastRpcTime:1182, electionTimeout:990ms
2018-08-20 07:54:07,800 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
for changeToCandidate
2018-08-20 07:54:07,801 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 14 
for changeToFollower
2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.FollowerState: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes to CANDIDATE, 
lastRpcTime:2167, electionTimeout:976ms
2018-08-20 07:54:21,712 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to CANDIDATE at term 14
for changeToCandidate
2018-08-20 07:54:21,715 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from 
2bf278ca-2dad-4029-a387-2faeb10adef5_9858 to null at term 14 for initElection
2018-08-20 07:54:29,151 INFO org.apache.ratis.server.impl.LeaderElection: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: begin an election in Term 15
2018-08-20 07:54:30,735 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858 changes role from 
org.apache.ratis.server.impl.RoleInfo@6b1e0fb8 to FOLLOWER at term 15 
for changeToFollower
2018-08-20 07:54:30,740 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: change Leader from null to 
b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 at term 15 for appendEntries
 
2018-08-20 07:54:30,741 INFO org.apache.ratis.server.impl.RaftServerImpl: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858-org.apache.ratis.server.impl.RoleInfo@6b1e0fb8:
 Withhold vote from candidate b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858 with 
term 15. State: leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, term=15, 
lastRpcElapsed=0ms
 
2018-08-20 07:54:30,745 INFO org.apache.ratis.server.impl.LeaderElection: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Election REJECTED; received 1 
response(s) 
[2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858<-2bf278ca-2dad-4029-a387-2faeb10adef5_9858#0:FAIL-t15] 
and 0 exception(s); 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:t15, 
leader=b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858, 
voted=2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858, raftlog=[(t:14, i:20374)], 
conf=[b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858:172.26.32.231:9858, 
2bf278ca-2dad-4029-a387-2faeb10adef5_9858:172.26.32.230:9858, 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858:172.26.32.228:9858], old=null
2018-08-20 07:54:31,227 WARN 
org.apache.ratis.grpc.server.RaftServerProtocolService: 
2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858: Failed appendEntries 
b6aaaf2c-2cbf-498f-995c-09cb2bb97cf4_9858->2e240240-0fac-4f93-8aa8-fa8f74bf1810_9858#1
java.lang.IllegalStateException: Unexpected Index: previous is (t:14, i:20374) 
but entries[0].getIndex()=0
at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60)
at 
org.apache.ratis.server.impl.RaftServerImpl.validateEntries(RaftServerImpl.java:786)
at 
org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:859)
at 
org.apache.ratis.server.impl.RaftServerImpl.appendEntriesAsync(RaftServerImpl.java:824)
at 
org.apache.ratis.server.impl.RaftServerProxy.appendEntriesAsync(RaftServerProxy.java:247)
at 
org.apache.ratis.grpc.server.RaftServerProtocolService$1.onNext(RaftServerProtocolService.java:76)
at 
org.apache.ratis.grpc.server.RaftServerProtocolService$1.onNext(RaftServerProtocolService.java:66)
at 
org.apache.ratis.shaded.io.grpc.stub.ServerCalls$StreamingServerCallHandler$StreamingServerCallListener.onMessage(ServerCalls.java:248)
at 
org.apache.ratis.shaded.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.messagesAvailable(ServerCallImpl.java:252)
at 

[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Lokesh Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-433:
-
Affects Version/s: (was: 0.2.1)

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> --
>
> Key: HDDS-433
> URL: https://issues.apache.org/jira/browse/HDDS-433
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with its 
> index set to 0, which leads to an exception in Ratis. The returned 
> LogEntryProto should be built over the input LogEntryProto.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Lokesh Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-433:
-
Fix Version/s: 0.2.1

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> --
>
> Key: HDDS-433
> URL: https://issues.apache.org/jira/browse/HDDS-433
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with its 
> index set to 0, which leads to an exception in Ratis. The returned 
> LogEntryProto should be built over the input LogEntryProto.






[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Lokesh Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-433:
-
Status: Patch Available  (was: Open)

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> --
>
> Key: HDDS-433
> URL: https://issues.apache.org/jira/browse/HDDS-433
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Affects Versions: 0.2.1
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Blocker
> Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with its 
> index set to 0, which leads to an exception in Ratis. The returned 
> LogEntryProto should be built over the input LogEntryProto.






[jira] [Updated] (HDDS-433) ContainerStateMachine#readStateMachineData should properly build LogEntryProto

2018-09-11 Thread Lokesh Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-433:
-
Attachment: HDDS-433.001.patch

> ContainerStateMachine#readStateMachineData should properly build LogEntryProto
> --
>
> Key: HDDS-433
> URL: https://issues.apache.org/jira/browse/HDDS-433
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Affects Versions: 0.2.1
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Blocker
> Attachments: HDDS-433.001.patch
>
>
> ContainerStateMachine#readStateMachineData returns a LogEntryProto with its 
> index set to 0, which leads to an exception in Ratis. The returned 
> LogEntryProto should be built over the input LogEntryProto.


