[jira] [Created] (RATIS-802) Add a metric for Pending request count in Grpc Log Appender

2020-01-23 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created RATIS-802:


 Summary: Add a metric for Pending request count in Grpc Log Appender
 Key: RATIS-802
 URL: https://issues.apache.org/jira/browse/RATIS-802
 Project: Ratis
  Issue Type: Bug
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham


This Jira adds a metric for pending append requests in GrpcLogAppender.

The metric will help track the number of outstanding append requests 
pending per follower in LogAppender.
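
As a rough illustration of the kind of metric described above, here is a 
minimal sketch that keeps one counter per follower. The class and method names 
(PendingAppendCounter, onRequestSent, onReplyReceived, getPendingCount) are 
hypothetical and are not taken from the Ratis code base; an actual patch would 
presumably expose the value as a gauge through the Ratis metrics registry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-follower counter of outstanding append requests. */
final class PendingAppendCounter {
  private final Map<String, AtomicInteger> pending = new ConcurrentHashMap<>();

  /** Call when an append request is sent to the given follower. */
  void onRequestSent(String followerId) {
    pending.computeIfAbsent(followerId, id -> new AtomicInteger()).incrementAndGet();
  }

  /** Call when a reply (success or failure) arrives from the given follower. */
  void onReplyReceived(String followerId) {
    AtomicInteger count = pending.get(followerId);
    if (count != null) {
      // Never drop below zero, even if a reply is reported twice.
      count.updateAndGet(c -> Math.max(0, c - 1));
    }
  }

  /** The value a gauge would report for this follower. */
  int getPendingCount(String followerId) {
    AtomicInteger count = pending.get(followerId);
    return count == null ? 0 : count.get();
  }
}
```

A gauge registered per follower could then simply return 
getPendingCount(followerId) each time it is sampled.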



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-789) ConcurrentModification in MetricsRegistriesImpl

2020-01-23 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022299#comment-17022299
 ] 

Aravindan Vijayan commented on RATIS-789:
-

Thank you for the fix. LGTM +1.

> ConcurrentModification in MetricsRegistriesImpl
> ---
>
> Key: RATIS-789
> URL: https://issues.apache.org/jira/browse/RATIS-789
> Project: Ratis
>  Issue Type: Bug
>  Components: metrics
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
> Attachments: RATIS-789.001.patch
>
>
> {code}
> java.util.ConcurrentModificationException
>   at java.util.ArrayList.forEach(ArrayList.java:1260)
>   at org.apache.ratis.metrics.impl.MetricRegistriesImpl.lambda$create$1(MetricRegistriesImpl.java:66)
>   at org.apache.ratis.metrics.impl.RefCountingMap.lambda$put$0(RefCountingMap.java:51)
>   at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
>   at org.apache.ratis.metrics.impl.RefCountingMap.put(RefCountingMap.java:46)
>   at org.apache.ratis.metrics.impl.MetricRegistriesImpl.create(MetricRegistriesImpl.java:59)
>   at org.apache.ratis.server.metrics.RatisMetrics.create(RatisMetrics.java:45)
>   at org.apache.ratis.server.metrics.RatisMetrics.getMetricRegistryForLogAppender(RatisMetrics.java:82)
>   at org.apache.ratis.server.metrics.LogAppenderMetrics.<init>(LogAppenderMetrics.java:32)
>   at org.apache.ratis.server.impl.LeaderState.<init>(LeaderState.java:221)
>   at org.apache.ratis.server.impl.RoleInfo.startLeaderState(RoleInfo.java:94)
>   at org.apache.ratis.server.impl.RaftServerImpl.changeToLeader(RaftServerImpl.java:348)
>   at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:238)
>   at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:161)
> {code}
> (RATIS-788 has more details, if needed)
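
The trace above shows ArrayList.forEach failing while the list is being 
modified. One common way to avoid this class of failure is to keep the 
listeners in a CopyOnWriteArrayList, which iterates over a snapshot. The 
sketch below only illustrates that general technique; ListenerRegistry and its 
methods are hypothetical names, and this is not necessarily what 
RATIS-789.001.patch does.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical registry that notifies listeners; not the Ratis implementation. */
final class ListenerRegistry {
  // CopyOnWriteArrayList iterates over a snapshot of the backing array, so
  // forEach does not throw ConcurrentModificationException when another
  // caller (or a listener itself) adds a listener during iteration.
  private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

  void addListener(Consumer<String> listener) {
    listeners.add(listener);
  }

  /** Notify every currently registered listener that a registry was created. */
  void notifyCreated(String registryName) {
    listeners.forEach(listener -> listener.accept(registryName));
  }

  int listenerCount() {
    return listeners.size();
  }
}
```

The trade-off is that every add copies the backing array, which is usually 
acceptable when listeners are registered rarely and iterated often.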





[jira] [Commented] (RATIS-794) Ratis leader should retry append requests based on follower commit info in case of intermittent append failures

2020-01-23 Thread Shashikant Banerjee (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022039#comment-17022039
 ] 

Shashikant Banerjee commented on RATIS-794:
---

Thanks [~szetszwo] for the patch. The patch needs to be rebased. Can you please 
check?

> Ratis leader should retry append requests based on follower commit info in 
> case of intermittent append failures
> 
>
> Key: RATIS-794
> URL: https://issues.apache.org/jira/browse/RATIS-794
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Shashikant Banerjee
>Assignee: Tsz-wo Sze
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: r794_20200122.patch
>
>
> During Ozone testing, it was observed that a leader election happened in the 
> middle of the test, after a follower had caught up to index 313. The new 
> leader sent an append request to the follower, which failed with a gRPC 
> exception. This led the leader to reset the connection and start again from 
> the beginning (index 1). 
>  
>  
> {code:java}
> 2020-01-13 14:56:32,995 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 0.0.0.0:9858@group-4F125BF42C14: changes role from CANDIDATE to LEADER at 
> term 7 for changeToLeader
> 2020-01-13 14:56:32,995 INFO 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis:
>  Leader change notification received for group: group-4F125BF42C14 with new 
> leaderId: ed90869c-317e-4303-8922-9fa83a3983cb
> 2020-01-13 14:56:33,042 WARN org.apache.ratis.grpc.server.GrpcLogAppender: 
> 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858-AppendLogResponseHandler:
>  Failed appendEntries: 
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io 
> exception
> 2020-01-13 14:56:33,043 DEBUG org.apache.ratis.util.PeerProxyMap: 
> ed90869c-317e-4303-8922-9fa83a3983cb: reset proxy for 
> b65b0b6c-b0bb-429f-a23d-467c72d4b85c
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: 
> b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: RUNNING -> CLOSING
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: 
> b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: CLOSING -> CLOSED
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: 
> b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: NEW
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.TimeoutScheduler: new 
> ScheduledThreadPoolExecutor
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.PeerProxyMap: 
> ed90869c-317e-4303-8922-9fa83a3983cb: Closing proxy for peer 
> b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858
> 2020-01-13 14:56:33,045 DEBUG org.apache.ratis.util.TimeoutScheduler: 
> schedule a task: timeout 6000ms, sid 1 
> 2020-01-13 14:56:33,047 INFO org.apache.ratis.server.impl.FollowerInfo: 
> 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: 
> updateUnconditionally 314 -> 1 
> ---> (The leader sets the next index for the follower back to 1 and starts 
> from 1)
> 2020-01-13 14:56:35,840 DEBUG org.apache.ratis.grpc.server.GrpcLogAppender: 
> 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858-AppendLogResponseHandler:
>  received the first reply 
> ed90869c-317e-4303-8922-9fa83a3983cb<-b65b0b6c-b0bb-429f-a23d-467c72d4b85c#2:OK,SUCCESS,nextIndex:314,term:5,followerCommit:313,
>  request=AppendEntriesRequest:cid=2,entriesCount=0,lastEntry=null .  
> ---> (Receives the response from the follower indicating that the follower 
> is at 313)
> Although the follower is at 313, the leader keeps on sending the 
> appendRequests from index 1. 
> 2020-01-13 14:56:35,841 DEBUG org.apache.ratis.server.impl.FollowerInfo: 
> 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: 
> updateIncreasingly 1 -> 2
> 2020-01-13 14:56:35,841 DEBUG org.apache.ratis.util.TimeoutScheduler: 
> schedule a task: timeout 6000ms, sid 7
> 2020-01-13 14:56:35,843 DEBUG org.apache.ratis.server.impl.FollowerInfo: 
> 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: 
> updateIncreasingly 2 -> 3
> 2020-01-13 14:56:35,843 DEBUG org.apache.ratis.util.TimeoutScheduler: 
> schedule a task: timeout 6000ms, sid 8
> {code}
>  
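
To make the requested behaviour concrete, here is a minimal sketch of 
leader-side bookkeeping that jumps ahead using the follower's reported commit 
index instead of advancing one entry at a time after a reset. FollowerProgress 
and its methods are hypothetical names for illustration only; this is not the 
code in r794_20200122.patch. It relies on the Raft property that entries up to 
a follower's commit index are also present in the leader's log.

```java
/** Hypothetical leader-side view of one follower; names are illustrative only. */
final class FollowerProgress {
  private long nextIndex;

  FollowerProgress(long nextIndex) {
    this.nextIndex = nextIndex;
  }

  /** After a failed append and connection reset, restart conservatively from index 1. */
  void resetAfterFailure() {
    nextIndex = 1;
  }

  /**
   * On a successful reply, skip everything the follower has already committed
   * rather than resending from the current nextIndex.
   */
  void onSuccessReply(long followerCommit) {
    nextIndex = Math.max(nextIndex, followerCommit + 1);
  }

  long getNextIndex() {
    return nextIndex;
  }
}
```

With the values from the log above, a reset sets nextIndex to 1, and a reply 
carrying followerCommit 313 moves it straight to 314.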





[jira] [Commented] (RATIS-631) Propagate state machine's last applied index from follower to leader

2020-01-23 Thread Tsz-wo Sze (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021932#comment-17021932
 ] 

Tsz-wo Sze commented on RATIS-631:
--

Let's make the changes together since the patch here is not big.  Sound good?

> Propagate state machine's last applied index from follower to leader
> 
>
> Key: RATIS-631
> URL: https://issues.apache.org/jira/browse/RATIS-631
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: ozone
> Attachments: RATIS-631.001.patch
>
>
> State Machine's last applied index denotes the index till which the 
> transactions have been successfully applied by the state machine. This index 
> needs to be propagated from the follower to leader in order for leader to 
> determine by how many transactions follower's state machine is lagging behind.





[jira] [Commented] (RATIS-632) Leader should throw ResourceUnavailableException when follower lags in commit index

2020-01-23 Thread Tsz-wo Sze (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021929#comment-17021929
 ] 

Tsz-wo Sze commented on RATIS-632:
--

[~ljain], thanks for clarifying it. I can see the logic from the patch, but 
what do limitations #1 and #2 mean?  What are the use cases?

We already have a mechanism that rejects new client requests by limiting the 
number of pending requests.  Why do we need these new limitations?

> Leader should throw ResourceUnavailableException when follower lags in commit 
> index
> ---
>
> Key: RATIS-632
> URL: https://issues.apache.org/jira/browse/RATIS-632
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: ozone
> Attachments: RATIS-632.001.patch
>
>
> This Jira aims to determine pipeline slowness in leader using follower 
> indexes (commit index and state machine last applied index). As part of Jira, 
> configurations and algorithm would be defined for determining slowness.





[jira] [Commented] (RATIS-759) Support stream APIs to send large messages

2020-01-23 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021918#comment-17021918
 ] 

Hadoop QA commented on RATIS-759:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  0s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  5s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 52s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 10s{color} | {color:orange} root: The patch generated 1 new + 14 unchanged - 0 fixed = 15 total (was 14) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 36m 42s{color} | {color:red} root in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 45m 42s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | ratis.logservice.TestLogServiceWithNetty |
|   | ratis.logservice.server.TestMetaServer |
|   | ratis.grpc.TestServerRestartWithGrpc |
|   | ratis.netty.TestLeaderElectionWithNetty |
|   | ratis.server.simulation.TestRaftReconfigurationWithSimulatedRpc |
|   | ratis.grpc.TestRaftStateMachineExceptionWithGrpc |
|   | ratis.netty.TestRaftStateMachineExceptionWithNetty |
|   | ratis.server.simulation.TestRaftStateMachineExceptionWithSimulatedRpc |
|   | ratis.server.raftlog.TestRaftLogMetrics |
|   | ratis.server.simulation.TestServerRestartWithSimulatedRpc |
|   | ratis.server.simulation.TestLeaderElectionWithSimulatedRpc |
|   | ratis.netty.TestRaftSnapshotWithNetty |
|   | ratis.grpc.TestRaftSnapshotWithGrpc |
|   | ratis.server.simulation.TestRaftSnapshotWithSimulatedRpc |
|   | ratis.server.simulation.TestRaftWithSimulatedRpc |
|   | ratis.examples.filestore.TestFileStoreWithNetty |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/ratis:date2020-01-23 |
| JIRA Issue | RATIS-759 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12991621/r759_20200123.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  checkstyle  compile  cc  |
| uname | Linux ffb47d955655 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh |
| git revision | master / 90cd474 |
| maven | version: Apache Mav

[jira] [Created] (RATIS-801) Ratis snapshot should consider stateMachine#appliedIndex for triggering snapshot

2020-01-23 Thread Shashikant Banerjee (Jira)
Shashikant Banerjee created RATIS-801:
-

 Summary: Ratis snapshot should consider stateMachine#appliedIndex for triggering snapshot
 Key: RATIS-801
 URL: https://issues.apache.org/jira/browse/RATIS-801
 Project: Ratis
  Issue Type: Improvement
Affects Versions: 0.5.0
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee
 Fix For: 0.5.0


Currently, while triggering a snapshot, stateMachineUpdater#appliedIndex is 
taken into account to decide whether it has exceeded the snapshot threshold 
from the last snapshotIndex. This may lead to creating more snapshots than 
usual, as stateMachineUpdater#appliedIndex is updated as soon as the 
applyTransaction call happens. Ideally, a Ratis snapshot should be triggered 
taking the stateMachine's applied index into account.
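
The difference between the two indexes can be sketched as follows. This is 
only an illustration under stated assumptions: SnapshotTrigger and 
shouldTakeSnapshot are hypothetical names, and the threshold semantics 
(trigger once the gap reaches the threshold) are assumed rather than taken 
from the Ratis source.

```java
/** Hypothetical snapshot trigger check; names and semantics are illustrative. */
final class SnapshotTrigger {
  private final long threshold;

  SnapshotTrigger(long threshold) {
    this.threshold = threshold;
  }

  /**
   * Decide whether to snapshot, given the index the state machine reports as
   * fully applied (not the index at which applyTransaction was merely called).
   */
  boolean shouldTakeSnapshot(long stateMachineAppliedIndex, long lastSnapshotIndex) {
    return stateMachineAppliedIndex - lastSnapshotIndex >= threshold;
  }
}
```

For example, with a threshold of 100 and a last snapshot at index 100, an 
updater index of 250 would trigger a snapshot, while a state machine that has 
actually applied only up to index 180 would not yet qualify.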





[jira] [Updated] (RATIS-759) Support stream APIs to send large messages

2020-01-23 Thread Tsz-wo Sze (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz-wo Sze updated RATIS-759:
-
Attachment: r759_20200123.patch

> Support stream APIs to send large messages
> --
>
> Key: RATIS-759
> URL: https://issues.apache.org/jira/browse/RATIS-759
> Project: Ratis
>  Issue Type: New Feature
>  Components: client, server
>Reporter: Tsz-wo Sze
>Assignee: Tsz-wo Sze
>Priority: Major
> Attachments: r759_20200115.patch, r759_20200123.patch
>
>
> It is inefficient to send a large message using 
> send(Message)/sendAsync(Message) in RaftClient.  We already have 
> RaftOutputStream implemented with sendAsync(..).  We propose adding the 
> following new APIs
> {code}
>   /** Create a stream to send a large message. */
>   MessageOutputStream stream();
>   /** Send the given message using a stream. */
>   CompletableFuture streamAsync(Message message);
> {code} 





[jira] [Commented] (RATIS-631) Propagate state machine's last applied index from follower to leader

2020-01-23 Thread Lokesh Jain (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021870#comment-17021870
 ] 

Lokesh Jain commented on RATIS-631:
---

This will be stored in FollowerInfo of log appender. The plan is to compare 
stateMachineAppliedIndex of peers and fail client requests if peers lag in 
stateMachineAppliedIndex. This would be similar to RATIS-632.
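
A minimal sketch of the comparison described above, assuming the leader holds 
the last reported stateMachineAppliedIndex of every peer. AppliedIndexLag, its 
methods, and the lagLimit parameter are hypothetical; the actual patch may 
compute and configure this differently.

```java
import java.util.Collection;
import java.util.Collections;

/** Hypothetical helper comparing peers' state machine applied indexes. */
final class AppliedIndexLag {
  private AppliedIndexLag() { }

  /** Number of transactions by which the slowest peer trails the fastest one. */
  static long maxLag(Collection<Long> appliedIndexes) {
    if (appliedIndexes.isEmpty()) {
      return 0;
    }
    return Collections.max(appliedIndexes) - Collections.min(appliedIndexes);
  }

  /** Fail client requests when the slowest peer lags by more than lagLimit. */
  static boolean shouldFailClientRequest(Collection<Long> appliedIndexes, long lagLimit) {
    return maxLag(appliedIndexes) > lagLimit;
  }
}
```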

> Propagate state machine's last applied index from follower to leader
> 
>
> Key: RATIS-631
> URL: https://issues.apache.org/jira/browse/RATIS-631
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: ozone
> Attachments: RATIS-631.001.patch
>
>
> State Machine's last applied index denotes the index till which the 
> transactions have been successfully applied by the state machine. This index 
> needs to be propagated from the follower to leader in order for leader to 
> determine by how many transactions follower's state machine is lagging behind.





[jira] [Updated] (RATIS-794) Ratis leader should retry append requests based on follower commit info in case of intermittent append failures

2020-01-23 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/RATIS-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee updated RATIS-794:
--
Attachment: (was: RATIS-794.000.patch)

> Ratis leader should retry append requests based on follower commit info in 
> case of intermittent append failures
> 
>
> Key: RATIS-794
> URL: https://issues.apache.org/jira/browse/RATIS-794
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Reporter: Shashikant Banerjee
>Assignee: Tsz-wo Sze
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: r794_20200122.patch
>
>
> During Ozone testing, it was observed that a leader election happened in the 
> middle of the test, after a follower had caught up to index 313. The new 
> leader sent an append request to the follower, which failed with a gRPC 
> exception. This led the leader to reset the connection and start again from 
> the beginning (index 1). 
>  
>  
> {code:java}
> 2020-01-13 14:56:32,995 INFO org.apache.ratis.server.impl.RaftServerImpl: 
> 0.0.0.0:9858@group-4F125BF42C14: changes role from CANDIDATE to LEADER at 
> term 7 for changeToLeader
> 2020-01-13 14:56:32,995 INFO 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis:
>  Leader change notification received for group: group-4F125BF42C14 with new 
> leaderId: ed90869c-317e-4303-8922-9fa83a3983cb
> 2020-01-13 14:56:33,042 WARN org.apache.ratis.grpc.server.GrpcLogAppender: 
> 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858-AppendLogResponseHandler:
>  Failed appendEntries: 
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io 
> exception
> 2020-01-13 14:56:33,043 DEBUG org.apache.ratis.util.PeerProxyMap: 
> ed90869c-317e-4303-8922-9fa83a3983cb: reset proxy for 
> b65b0b6c-b0bb-429f-a23d-467c72d4b85c
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: 
> b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: RUNNING -> CLOSING
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: 
> b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: CLOSING -> CLOSED
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: 
> b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: NEW
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.TimeoutScheduler: new 
> ScheduledThreadPoolExecutor
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.PeerProxyMap: 
> ed90869c-317e-4303-8922-9fa83a3983cb: Closing proxy for peer 
> b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858
> 2020-01-13 14:56:33,045 DEBUG org.apache.ratis.util.TimeoutScheduler: 
> schedule a task: timeout 6000ms, sid 1 
> 2020-01-13 14:56:33,047 INFO org.apache.ratis.server.impl.FollowerInfo: 
> 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: 
> updateUnconditionally 314 -> 1 
> ---> (The leader sets the next index for the follower back to 1 and starts 
> from 1)
> 2020-01-13 14:56:35,840 DEBUG org.apache.ratis.grpc.server.GrpcLogAppender: 
> 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858-AppendLogResponseHandler:
>  received the first reply 
> ed90869c-317e-4303-8922-9fa83a3983cb<-b65b0b6c-b0bb-429f-a23d-467c72d4b85c#2:OK,SUCCESS,nextIndex:314,term:5,followerCommit:313,
>  request=AppendEntriesRequest:cid=2,entriesCount=0,lastEntry=null .  
> ---> (Receives the response from the follower indicating that the follower 
> is at 313)
> Although the follower is at 313, the leader keeps on sending the 
> appendRequests from index 1. 
> 2020-01-13 14:56:35,841 DEBUG org.apache.ratis.server.impl.FollowerInfo: 
> 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: 
> updateIncreasingly 1 -> 2
> 2020-01-13 14:56:35,841 DEBUG org.apache.ratis.util.TimeoutScheduler: 
> schedule a task: timeout 6000ms, sid 7
> 2020-01-13 14:56:35,843 DEBUG org.apache.ratis.server.impl.FollowerInfo: 
> 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: 
> updateIncreasingly 2 -> 3
> 2020-01-13 14:56:35,843 DEBUG org.apache.ratis.util.TimeoutScheduler: 
> schedule a task: timeout 6000ms, sid 8
> {code}
>  





[jira] [Commented] (RATIS-795) Read StateMachine data failure should result in log fail notification

2020-01-23 Thread Shashikant Banerjee (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021868#comment-17021868
 ] 

Shashikant Banerjee commented on RATIS-795:
---

Thanks [~swagle] for the clarification. I am +1 on the change.

> Read StateMachine data failure should result in log fail notification
> 
>
> Key: RATIS-795
> URL: https://issues.apache.org/jira/browse/RATIS-795
> Project: Ratis
>  Issue Type: Bug
>  Components: server
>Affects Versions: 0.4.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: RATIS-795.01.patch
>
>
> Presently, an exception thrown while reading state machine data causes a 
> RaftLogException to be thrown, but we do not notify the state machine.
> Downstream in Ozone, the state machine is marked unhealthy and there is no 
> way for the follower to recover.





[jira] [Commented] (RATIS-632) Leader should throw ResourceUnavailableException when follower lags in commit index

2020-01-23 Thread Lokesh Jain (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021867#comment-17021867
 ] 

Lokesh Jain commented on RATIS-632:
---

[~szetszwo] Thanks for reviewing the patch. We will reject the request in the 
following cases:
 # Maximum commit index - Majority commit index > configured limit1
 # Majority commit index - Minimum commit index > configured limit2

Maximum = max(commit index) for all peers

Minimum = min(commit index) for all peers

Majority = majority(commit index) for all peers, i.e. for the 3-node case it is 
the 2nd largest commit index.

Together, the above limits also ensure that, for any accepted request, Maximum 
commit index - Minimum commit index <= configured limit1 + configured limit2.
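
The two rejection rules above can be sketched as follows. This is a minimal 
illustration, not the code in RATIS-632.001.patch; CommitIndexLimits and 
shouldReject are hypothetical names, and the majority index is taken to be the 
largest commit index that a majority of peers have reached (the 2nd largest 
for 3 nodes).

```java
import java.util.Arrays;

/** Hypothetical check of the two commit index limits; names are illustrative. */
final class CommitIndexLimits {
  private CommitIndexLimits() { }

  /** Largest commit index that a majority of peers have reached. */
  static long majorityIndex(long[] commitIndexes) {
    long[] sorted = commitIndexes.clone();
    Arrays.sort(sorted);
    // With n peers sorted ascending, a majority (n/2 + 1 peers) have reached
    // the element at position (n - 1) / 2; for n = 3 that is the 2nd largest.
    return sorted[(sorted.length - 1) / 2];
  }

  /** Returns true when either configured limit is exceeded. */
  static boolean shouldReject(long[] commitIndexes, long limit1, long limit2) {
    if (commitIndexes.length == 0) {
      throw new IllegalArgumentException("at least one peer is required");
    }
    long max = Arrays.stream(commitIndexes).max().getAsLong();
    long min = Arrays.stream(commitIndexes).min().getAsLong();
    long majority = majorityIndex(commitIndexes);
    return (max - majority > limit1) || (majority - min > limit2);
  }
}
```

For commit indexes 100, 90 and 10 with limit1 = 20 and limit2 = 50, the 
majority index is 90; the first rule passes (10 <= 20) but the second fails 
(80 > 50), so the request would be rejected.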

> Leader should throw ResourceUnavailableException when follower lags in commit 
> index
> ---
>
> Key: RATIS-632
> URL: https://issues.apache.org/jira/browse/RATIS-632
> Project: Ratis
>  Issue Type: Sub-task
>  Components: server
>Reporter: Lokesh Jain
>Assignee: Lokesh Jain
>Priority: Major
>  Labels: ozone
> Attachments: RATIS-632.001.patch
>
>
> This Jira aims to determine pipeline slowness in leader using follower 
> indexes (commit index and state machine last applied index). As part of Jira, 
> configurations and algorithm would be defined for determining slowness.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (RATIS-796) Add watch time out for Ratis Client

2020-01-23 Thread Shashikant Banerjee (Jira)


[ 
https://issues.apache.org/jira/browse/RATIS-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021865#comment-17021865
 ] 

Shashikant Banerjee commented on RATIS-796:
---

I am +1 on the change.

> Add watch time out for Ratis Client
> ---
>
> Key: RATIS-796
> URL: https://issues.apache.org/jira/browse/RATIS-796
> Project: Ratis
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: RATIS-796.00.patch, RATIS-796.01.patch, 
> RATIS-796.02.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently in Ratis, ratis.client.request.timeout is used for all kinds of 
> requests. This Jira adds a separate watch timeout parameter for watch 
> requests.
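
As a rough sketch of the idea, a client could pick its timeout by request 
type. ClientTimeouts, RequestType and timeoutFor are hypothetical names used 
only for illustration; they are not Ratis APIs, and the sketch makes no claim 
about the configuration key the patch introduces.

```java
import java.time.Duration;

/** Hypothetical holder of client timeouts, with a separate value for watch requests. */
final class ClientTimeouts {
  enum RequestType { READ, WRITE, WATCH }

  private final Duration requestTimeout;
  private final Duration watchTimeout;

  ClientTimeouts(Duration requestTimeout, Duration watchTimeout) {
    this.requestTimeout = requestTimeout;
    this.watchTimeout = watchTimeout;
  }

  /** Watch requests use their own timeout; all other requests share the general one. */
  Duration timeoutFor(RequestType type) {
    return type == RequestType.WATCH ? watchTimeout : requestTimeout;
  }
}
```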


