[jira] [Created] (RATIS-802) Add a metric for Pending request count in Grpc Log Appender
Bharat Viswanadham created RATIS-802:
-------------------------------------

Summary: Add a metric for Pending request count in Grpc Log Appender
Key: RATIS-802
URL: https://issues.apache.org/jira/browse/RATIS-802
Project: Ratis
Issue Type: Bug
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham

This Jira adds a metric for pending append requests in GrpcLogAppender. The metric will show how many append requests are outstanding for each follower in LogAppender.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
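As a rough illustration of what a per-follower pending request count could look like, here is a minimal sketch. The class `PendingAppendCounter` and its method names are hypothetical and are not taken from Ratis or from any patch on this issue; the sketch only assumes that a counter is incremented when an append request is sent and decremented when its reply arrives.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not a Ratis class): counts outstanding append requests per follower. */
final class PendingAppendCounter {
  private final Map<String, AtomicInteger> pending = new ConcurrentHashMap<>();

  /** Call when an append request is sent to the given follower. */
  void onRequestSent(String followerId) {
    pending.computeIfAbsent(followerId, k -> new AtomicInteger()).incrementAndGet();
  }

  /** Call when a reply (success or failure) arrives; the count never drops below zero. */
  void onReplyReceived(String followerId) {
    AtomicInteger count = pending.get(followerId);
    if (count != null) {
      count.updateAndGet(v -> Math.max(0, v - 1));
    }
  }

  /** Value a metrics gauge could report for this follower. */
  int getPendingCount(String followerId) {
    AtomicInteger count = pending.get(followerId);
    return count == null ? 0 : count.get();
  }
}
```

A metrics library gauge could then simply read `getPendingCount(followerId)` each time it is polled.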
[jira] [Commented] (RATIS-789) ConcurrentModification in MetricsRegistriesImpl
[ https://issues.apache.org/jira/browse/RATIS-789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022299#comment-17022299 ]

Aravindan Vijayan commented on RATIS-789:
-----------------------------------------

Thank you for the fix. LGTM +1.

> ConcurrentModification in MetricsRegistriesImpl
> ---
>
> Key: RATIS-789
> URL: https://issues.apache.org/jira/browse/RATIS-789
> Project: Ratis
> Issue Type: Bug
> Components: metrics
> Reporter: Attila Doroszlai
> Assignee: Attila Doroszlai
> Priority: Major
> Attachments: RATIS-789.001.patch
>
>
> {code}
> java.util.ConcurrentModificationException
>   at java.util.ArrayList.forEach(ArrayList.java:1260)
>   at org.apache.ratis.metrics.impl.MetricRegistriesImpl.lambda$create$1(MetricRegistriesImpl.java:66)
>   at org.apache.ratis.metrics.impl.RefCountingMap.lambda$put$0(RefCountingMap.java:51)
>   at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
>   at org.apache.ratis.metrics.impl.RefCountingMap.put(RefCountingMap.java:46)
>   at org.apache.ratis.metrics.impl.MetricRegistriesImpl.create(MetricRegistriesImpl.java:59)
>   at org.apache.ratis.server.metrics.RatisMetrics.create(RatisMetrics.java:45)
>   at org.apache.ratis.server.metrics.RatisMetrics.getMetricRegistryForLogAppender(RatisMetrics.java:82)
>   at org.apache.ratis.server.metrics.LogAppenderMetrics.<init>(LogAppenderMetrics.java:32)
>   at org.apache.ratis.server.impl.LeaderState.<init>(LeaderState.java:221)
>   at org.apache.ratis.server.impl.RoleInfo.startLeaderState(RoleInfo.java:94)
>   at org.apache.ratis.server.impl.RaftServerImpl.changeToLeader(RaftServerImpl.java:348)
>   at org.apache.ratis.server.impl.LeaderElection.askForVotes(LeaderElection.java:238)
>   at org.apache.ratis.server.impl.LeaderElection.run(LeaderElection.java:161)
> {code}
> (RATIS-788 has more details, if needed)
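The trace above fails inside `ArrayList.forEach`, which throws when the list is structurally modified while it is being traversed. The sketch below reproduces that general failure mode deterministically on a single thread and shows that a `CopyOnWriteArrayList` tolerates the same pattern. It is only an illustration: `ListenerListDemo` is a hypothetical class, the Ratis failure most likely involved a second thread, and this is not a description of what RATIS-789.001.patch actually changes.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical demo class, unrelated to the Ratis code base. */
final class ListenerListDemo {
  /**
   * Registers one listener that adds another listener to the same list,
   * then runs all listeners via forEach.
   * Returns true if the traversal failed with ConcurrentModificationException.
   */
  static boolean throwsCme(List<Runnable> listeners) {
    listeners.add(() -> listeners.add(() -> { }));
    try {
      listeners.forEach(Runnable::run);
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // ArrayList detects the modification made during forEach.
    System.out.println("ArrayList throws: " + throwsCme(new ArrayList<>()));
    // CopyOnWriteArrayList traverses a snapshot, so the add is not observed by the traversal.
    System.out.println("CopyOnWriteArrayList throws: " + throwsCme(new CopyOnWriteArrayList<>()));
  }
}
```

Copy-on-write lists trade more expensive writes for safe traversal, which tends to suit listener or reporter lists that are read far more often than they are changed. Guarding both the traversal and the mutation with the same lock is another common option.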
[jira] [Commented] (RATIS-794) Ratis leader should retry append requests based on follower commit info in case of intermittent append failures
[ https://issues.apache.org/jira/browse/RATIS-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17022039#comment-17022039 ]

Shashikant Banerjee commented on RATIS-794:
-------------------------------------------

Thanks [~szetszwo] for the patch. The patch needs to be rebased. Can you please check?

> Ratis leader should retry append requests based on follower commit info in case of intermittent append failures
> ---
>
> Key: RATIS-794
> URL: https://issues.apache.org/jira/browse/RATIS-794
> Project: Ratis
> Issue Type: Bug
> Components: server
> Reporter: Shashikant Banerjee
> Assignee: Tsz-wo Sze
> Priority: Major
> Fix For: 0.5.0
>
> Attachments: r794_20200122.patch
>
>
> During Ozone testing, a leader election happened in the middle of a test, at a point where a follower had caught up to index 313. The new leader then sent an append request to the follower, which failed with a gRPC exception. As a result, the leader reset the connection and restarted from the beginning (index 1).
>
> {code:java}
> 2020-01-13 14:56:32,995 INFO org.apache.ratis.server.impl.RaftServerImpl: 0.0.0.0:9858@group-4F125BF42C14: changes role from CANDIDATE to LEADER at term 7 for changeToLeader
> 2020-01-13 14:56:32,995 INFO org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis: Leader change notification received for group: group-4F125BF42C14 with new leaderId: ed90869c-317e-4303-8922-9fa83a3983cb
> 2020-01-13 14:56:33,042 WARN org.apache.ratis.grpc.server.GrpcLogAppender: 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858-AppendLogResponseHandler: Failed appendEntries: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
> 2020-01-13 14:56:33,043 DEBUG org.apache.ratis.util.PeerProxyMap: ed90869c-317e-4303-8922-9fa83a3983cb: reset proxy for b65b0b6c-b0bb-429f-a23d-467c72d4b85c
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: RUNNING -> CLOSING
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: CLOSING -> CLOSED
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.LifeCycle: b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858: NEW
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.TimeoutScheduler: new ScheduledThreadPoolExecutor
> 2020-01-13 14:56:33,044 DEBUG org.apache.ratis.util.PeerProxyMap: ed90869c-317e-4303-8922-9fa83a3983cb: Closing proxy for peer b65b0b6c-b0bb-429f-a23d-467c72d4b85c:10.120.139.111:9858
> 2020-01-13 14:56:33,045 DEBUG org.apache.ratis.util.TimeoutScheduler: schedule a task: timeout 6000ms, sid 1
> 2020-01-13 14:56:33,047 INFO org.apache.ratis.server.impl.FollowerInfo: 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: updateUnconditionally 314 -> 1
> ---> (The leader sets the next index for the follower back to 1 and starts from 1.)
> 2020-01-13 14:56:35,840 DEBUG org.apache.ratis.grpc.server.GrpcLogAppender: 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858-AppendLogResponseHandler: received the first reply ed90869c-317e-4303-8922-9fa83a3983cb<-b65b0b6c-b0bb-429f-a23d-467c72d4b85c#2:OK,SUCCESS,nextIndex:314,term:5,followerCommit:313, request=AppendEntriesRequest:cid=2,entriesCount=0,lastEntry=null
> ---> (The leader receives the response from the follower indicating that the follower is at 313.)
> Although the follower is at 313, the leader keeps sending appendRequests from index 1.
> 2020-01-13 14:56:35,841 DEBUG org.apache.ratis.server.impl.FollowerInfo: 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: updateIncreasingly 1 -> 2
> 2020-01-13 14:56:35,841 DEBUG org.apache.ratis.util.TimeoutScheduler: schedule a task: timeout 6000ms, sid 7
> 2020-01-13 14:56:35,843 DEBUG org.apache.ratis.server.impl.FollowerInfo: 0.0.0.0:9858@group-4F125BF42C14->10.120.139.111:9858: nextIndex: updateIncreasingly 2 -> 3
> 2020-01-13 14:56:35,843 DEBUG org.apache.ratis.util.TimeoutScheduler: schedule a task: timeout 6000ms, sid 8
> {code}
>
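The behaviour requested in the issue title, resuming from the follower's commit info instead of from index 1, could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the logic of r794_20200122.patch: the class `NextIndexAfterReset`, its method, and its parameters are hypothetical, and a real implementation would also have to handle term changes and log inconsistency.

```java
/** Hypothetical sketch (not Ratis code) of choosing where to resume after a connection reset. */
final class NextIndexAfterReset {
  /**
   * @param leaderNextIndex the leader's current nextIndex for the follower (1 after the reset in the log above)
   * @param followerCommit  the commit index reported in the follower's reply (313 in the log above)
   * @param leaderLastIndex the index of the last entry in the leader's own log
   * @return the index of the next entry the leader should send
   */
  static long onFirstReply(long leaderNextIndex, long followerCommit, long leaderLastIndex) {
    // Entries up to followerCommit are already committed on the follower, so resending them is wasted work.
    long resumeFrom = Math.max(leaderNextIndex, followerCommit + 1);
    // Never point past the end of the leader's own log.
    return Math.min(resumeFrom, leaderLastIndex + 1);
  }

  public static void main(String[] args) {
    // Values from the log above, with an assumed leader log ending at index 400.
    System.out.println(onFirstReply(1, 313, 400));
  }
}
```

With the values from the log, the leader would continue at 314 rather than stepping through 1, 2, 3, and so on.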
[jira] [Commented] (RATIS-631) Propagate state machine's last applied index from follower to leader
[ https://issues.apache.org/jira/browse/RATIS-631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021932#comment-17021932 ]

Tsz-wo Sze commented on RATIS-631:
----------------------------------

Let's make the changes together since the patch here is not big. Sounds good?

> Propagate state machine's last applied index from follower to leader
> ---
>
> Key: RATIS-631
> URL: https://issues.apache.org/jira/browse/RATIS-631
> Project: Ratis
> Issue Type: Sub-task
> Components: server
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Major
> Labels: ozone
> Attachments: RATIS-631.001.patch
>
>
> The state machine's last applied index denotes the index up to which transactions have been successfully applied by the state machine. This index needs to be propagated from the follower to the leader so that the leader can determine how many transactions the follower's state machine is lagging behind by.
[jira] [Commented] (RATIS-632) Leader should throw ResourceUnavailableException when follower lags in commit index
[ https://issues.apache.org/jira/browse/RATIS-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021929#comment-17021929 ]

Tsz-wo Sze commented on RATIS-632:
----------------------------------

[~ljain], thanks for clarifying it. I can see the logic from the patch, but what do limitations #1 and #2 mean? What are the use cases? We already have a mechanism that rejects new client requests by limiting the number of pending requests. Why do we need these new limitations?

> Leader should throw ResourceUnavailableException when follower lags in commit index
> ---
>
> Key: RATIS-632
> URL: https://issues.apache.org/jira/browse/RATIS-632
> Project: Ratis
> Issue Type: Sub-task
> Components: server
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Major
> Labels: ozone
> Attachments: RATIS-632.001.patch
>
>
> This Jira aims to determine pipeline slowness in the leader using follower indexes (commit index and state machine last applied index). As part of this Jira, the configurations and the algorithm for determining slowness will be defined.
[jira] [Commented] (RATIS-759) Support stream APIs to send large messages
[ https://issues.apache.org/jira/browse/RATIS-759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021918#comment-17021918 ]

Hadoop QA commented on RATIS-759:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 36s | Docker mode activated. |
|| || || || Prechecks ||
| 0 | findbugs | 0m 0s | Findbugs executables are not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 1m 10s | Maven dependency ordering for branch |
| +1 | mvninstall | 2m 6s | master passed |
| +1 | compile | 0m 52s | master passed |
| +1 | checkstyle | 0m 17s | master passed |
| +1 | javadoc | 0m 41s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 5s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 58s | the patch passed |
| +1 | compile | 0m 52s | the patch passed |
| +1 | cc | 0m 52s | the patch passed |
| +1 | javac | 0m 52s | the patch passed |
| -0 | checkstyle | 0m 10s | root: The patch generated 1 new + 14 unchanged - 0 fixed = 15 total (was 14) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | javadoc | 0m 36s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 36m 42s | root in the patch failed. |
| +1 | asflicense | 0m 18s | The patch does not generate ASF License warnings. |
| | | 45m 42s | |

|| Reason || Tests ||
| Failed junit tests | ratis.logservice.TestLogServiceWithNetty |
| | ratis.logservice.server.TestMetaServer |
| | ratis.grpc.TestServerRestartWithGrpc |
| | ratis.netty.TestLeaderElectionWithNetty |
| | ratis.server.simulation.TestRaftReconfigurationWithSimulatedRpc |
| | ratis.grpc.TestRaftStateMachineExceptionWithGrpc |
| | ratis.netty.TestRaftStateMachineExceptionWithNetty |
| | ratis.server.simulation.TestRaftStateMachineExceptionWithSimulatedRpc |
| | ratis.server.raftlog.TestRaftLogMetrics |
| | ratis.server.simulation.TestServerRestartWithSimulatedRpc |
| | ratis.server.simulation.TestLeaderElectionWithSimulatedRpc |
| | ratis.netty.TestRaftSnapshotWithNetty |
| | ratis.grpc.TestRaftSnapshotWithGrpc |
| | ratis.server.simulation.TestRaftSnapshotWithSimulatedRpc |
| | ratis.server.simulation.TestRaftWithSimulatedRpc |
| | ratis.examples.filestore.TestFileStoreWithNetty |

|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/ratis:date2020-01-23 |
| JIRA Issue | RATIS-759 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12991621/r759_20200123.patch |
| Optional Tests | dupname asflicense javac javadoc unit findbugs checkstyle compile cc |
| uname | Linux ffb47d955655 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-RATIS-Build/yetus-personality.sh |
| git revision | master / 90cd474 |
| maven | version: Apache Mav
[jira] [Created] (RATIS-801) Ratis snapshot should consider stateMachine#appliedIndex for triggering snapshot
Shashikant Banerjee created RATIS-801:
--------------------------------------

Summary: Ratis snapshot should consider stateMachine#appliedIndex for triggering snapshot
Key: RATIS-801
URL: https://issues.apache.org/jira/browse/RATIS-801
Project: Ratis
Issue Type: Improvement
Affects Versions: 0.5.0
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee
Fix For: 0.5.0

Currently, when triggering a snapshot, stateMachineUpdater#appliedIndex is used to decide whether the snapshot threshold has been exceeded since the last snapshotIndex. This may create more snapshots than necessary, because stateMachineUpdater#appliedIndex is updated as soon as the applyTransaction call happens. Ideally, a Ratis snapshot should be triggered based on the state machine's own applied index.
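The difference between the two indexes can be made concrete with a small sketch. `SnapshotTrigger` and its parameters are hypothetical names chosen for illustration, and the comparison operator is an assumption; the issue text does not specify the exact condition Ratis uses.

```java
/** Hypothetical sketch (not Ratis code) of a threshold-based snapshot trigger. */
final class SnapshotTrigger {
  /**
   * Returns true when at least `threshold` entries have been applied since the last snapshot.
   * The caller decides which applied index to pass in.
   */
  static boolean shouldTakeSnapshot(long appliedIndex, long lastSnapshotIndex, long threshold) {
    return appliedIndex - lastSnapshotIndex >= threshold;
  }

  public static void main(String[] args) {
    long lastSnapshotIndex = 0;
    long threshold = 500;
    long updaterAppliedIndex = 1000;     // advanced as soon as applyTransaction is called
    long stateMachineAppliedIndex = 400; // transactions the state machine has actually finished

    // Using the updater's index triggers a snapshot although only 400 entries are really applied.
    System.out.println(shouldTakeSnapshot(updaterAppliedIndex, lastSnapshotIndex, threshold));
    // Using the state machine's index waits until the work is actually done.
    System.out.println(shouldTakeSnapshot(stateMachineAppliedIndex, lastSnapshotIndex, threshold));
  }
}
```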
[jira] [Updated] (RATIS-759) Support stream APIs to send large messages
[ https://issues.apache.org/jira/browse/RATIS-759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz-wo Sze updated RATIS-759:
-----------------------------
Attachment: r759_20200123.patch

> Support stream APIs to send large messages
> ---
>
> Key: RATIS-759
> URL: https://issues.apache.org/jira/browse/RATIS-759
> Project: Ratis
> Issue Type: New Feature
> Components: client, server
> Reporter: Tsz-wo Sze
> Assignee: Tsz-wo Sze
> Priority: Major
> Attachments: r759_20200115.patch, r759_20200123.patch
>
>
> It is inefficient to send a large message using send(Message)/sendAsync(Message) in RaftClient. We already have RaftOutputStream implemented with sendAsync(..). We propose adding the following new APIs:
> {code}
> /** Create a stream to send a large message. */
> MessageOutputStream stream();
> /** Send the given message using a stream. */
> CompletableFuture streamAsync(Message message);
> {code}
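To illustrate the general idea of streaming a large message in pieces, here is a minimal, self-contained sketch. It does not implement the proposed MessageOutputStream API: `ChunkedMessageStream`, its constructor arguments, and the `Consumer<byte[]>` standing in for an asynchronous send are all assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch (not the Ratis API): buffers written bytes and hands them off in fixed-size chunks. */
final class ChunkedMessageStream extends OutputStream {
  private final int chunkSize;
  private final Consumer<byte[]> sender; // stand-in for an asynchronous send call
  private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

  ChunkedMessageStream(int chunkSize, Consumer<byte[]> sender) {
    if (chunkSize <= 0) {
      throw new IllegalArgumentException("chunkSize must be positive");
    }
    this.chunkSize = chunkSize;
    this.sender = sender;
  }

  @Override
  public void write(int b) {
    buffer.write(b);
    if (buffer.size() >= chunkSize) {
      flushChunk();
    }
  }

  /** Sends whatever is left in the buffer as a final, possibly shorter, chunk. */
  @Override
  public void close() {
    if (buffer.size() > 0) {
      flushChunk();
    }
  }

  private void flushChunk() {
    sender.accept(buffer.toByteArray());
    buffer.reset();
  }

  /** Demo helper: streams `total` zero bytes and returns the size of each chunk that was sent. */
  static List<Integer> chunkSizes(int total, int chunkSize) {
    List<Integer> sizes = new ArrayList<>();
    ChunkedMessageStream out = new ChunkedMessageStream(chunkSize, chunk -> sizes.add(chunk.length));
    for (int i = 0; i < total; i++) {
      out.write(0);
    }
    out.close();
    return sizes;
  }
}
```

The point of the design is that the caller never has to hold the whole message in a single request; only one chunk is buffered at a time.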
[jira] [Commented] (RATIS-631) Propagate state machine's last applied index from follower to leader
[ https://issues.apache.org/jira/browse/RATIS-631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021870#comment-17021870 ]

Lokesh Jain commented on RATIS-631:
-----------------------------------

This will be stored in the FollowerInfo of the log appender. The plan is to compare the stateMachineAppliedIndex of the peers and fail client requests if peers lag in stateMachineAppliedIndex. This would be similar to RATIS-632.

> Propagate state machine's last applied index from follower to leader
> ---
>
> Key: RATIS-631
> URL: https://issues.apache.org/jira/browse/RATIS-631
> Project: Ratis
> Issue Type: Sub-task
> Components: server
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Major
> Labels: ozone
> Attachments: RATIS-631.001.patch
>
>
> The state machine's last applied index denotes the index up to which transactions have been successfully applied by the state machine. This index needs to be propagated from the follower to the leader so that the leader can determine how many transactions the follower's state machine is lagging behind by.
[jira] [Updated] (RATIS-794) Ratis leader should retry append requests based on follower commit info in case of intermittent append failures
[ https://issues.apache.org/jira/browse/RATIS-794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shashikant Banerjee updated RATIS-794:
--------------------------------------
Attachment: (was: RATIS-794.000.patch)

> Ratis leader should retry append requests based on follower commit info in case of intermittent append failures
> ---
>
> Key: RATIS-794
> URL: https://issues.apache.org/jira/browse/RATIS-794
> Project: Ratis
> Issue Type: Bug
> Components: server
> Reporter: Shashikant Banerjee
> Assignee: Tsz-wo Sze
> Priority: Major
> Fix For: 0.5.0
>
> Attachments: r794_20200122.patch
[jira] [Commented] (RATIS-795) Read StateMachine data failure should result in log fail notification
[ https://issues.apache.org/jira/browse/RATIS-795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021868#comment-17021868 ]

Shashikant Banerjee commented on RATIS-795:
-------------------------------------------

Thanks [~swagle] for the clarification. I am +1 on the change.

> Read StateMachine data failure should result in log fail notification
> ---
>
> Key: RATIS-795
> URL: https://issues.apache.org/jira/browse/RATIS-795
> Project: Ratis
> Issue Type: Bug
> Components: server
> Affects Versions: 0.4.0
> Reporter: Siddharth Wagle
> Assignee: Siddharth Wagle
> Priority: Major
> Fix For: 0.5.0
>
> Attachments: RATIS-795.01.patch
>
>
> Presently, an exception thrown while reading state machine data causes a RaftLogException to be thrown, but the state machine is not notified. Downstream in Ozone, the state machine is marked unhealthy and there is no way for the follower to recover.
[jira] [Commented] (RATIS-632) Leader should throw ResourceUnavailableException when follower lags in commit index
[ https://issues.apache.org/jira/browse/RATIS-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021867#comment-17021867 ]

Lokesh Jain commented on RATIS-632:
-----------------------------------

[~szetszwo] Thanks for reviewing the patch. The cases where we will reject the request are:
# Maximum commit index - Majority commit index > configured limit1
# Majority commit index - Minimum commit index > configured limit2

Maximum = max(commit index) for all peers
Minimum = min(commit index) for all peers
Majority = majority(commit index) for all peers, i.e. for the 3-node case it is the 2nd largest commit index.

While requests are being accepted, the above limits also ensure that Maximum commit index - Minimum commit index <= configured limit1 + configured limit2.

> Leader should throw ResourceUnavailableException when follower lags in commit index
> ---
>
> Key: RATIS-632
> URL: https://issues.apache.org/jira/browse/RATIS-632
> Project: Ratis
> Issue Type: Sub-task
> Components: server
> Reporter: Lokesh Jain
> Assignee: Lokesh Jain
> Priority: Major
> Labels: ozone
> Attachments: RATIS-632.001.patch
>
>
> This Jira aims to determine pipeline slowness in the leader using follower indexes (commit index and state machine last applied index). As part of this Jira, the configurations and the algorithm for determining slowness will be defined.
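The two rejection rules described in the comment above can be sketched as follows. `CommitLagCheck`, its method names, and the use of a plain `long[]` of peer commit indexes are assumptions made for illustration; this is not the code from RATIS-632.001.patch.

```java
import java.util.Arrays;

/** Hypothetical sketch (not Ratis code) of the two commit-index lag checks. */
final class CommitLagCheck {
  /** Highest commit index that a majority of peers has reached; the 2nd largest for 3 peers. */
  static long majority(long[] commitIndexes) {
    long[] sorted = commitIndexes.clone();
    Arrays.sort(sorted);
    int k = sorted.length / 2 + 1;   // size of a majority
    return sorted[sorted.length - k]; // k-th largest value
  }

  /** Returns true if a new client request should be rejected under either rule. */
  static boolean shouldReject(long[] commitIndexes, long limit1, long limit2) {
    long[] sorted = commitIndexes.clone();
    Arrays.sort(sorted);
    long min = sorted[0];
    long max = sorted[sorted.length - 1];
    long maj = majority(commitIndexes);
    return (max - maj > limit1)   // rule 1: the majority lags too far behind the fastest peer
        || (maj - min > limit2);  // rule 2: the slowest peer lags too far behind the majority
  }

  public static void main(String[] args) {
    long[] peers = {100, 90, 50};
    System.out.println(majority(peers));
    System.out.println(shouldReject(peers, 20, 30));
  }
}
```

For the three peers in `main`, the majority commit index is 90; rule 1 passes (100 - 90 = 10) but rule 2 fails (90 - 50 = 40 > 30), so the request would be rejected.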
[jira] [Commented] (RATIS-796) Add watch time out for Ratis Client
[ https://issues.apache.org/jira/browse/RATIS-796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17021865#comment-17021865 ]

Shashikant Banerjee commented on RATIS-796:
-------------------------------------------

I am +1 on the change.

> Add watch time out for Ratis Client
> ---
>
> Key: RATIS-796
> URL: https://issues.apache.org/jira/browse/RATIS-796
> Project: Ratis
> Issue Type: Bug
> Reporter: Bharat Viswanadham
> Assignee: Bharat Viswanadham
> Priority: Major
> Attachments: RATIS-796.00.patch, RATIS-796.01.patch, RATIS-796.02.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently in Ratis, ratis.client.request.timeout is used for all kinds of requests. This Jira adds a separate timeout parameter for watch requests.