[
https://issues.apache.org/jira/browse/RATIS-1074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tsz-wo Sze updated RATIS-1074:
------------------------------
Component/s: gRPC
> GrpcLogAppender improperly decrease nextIndex to 1, which may trigger a
> installSnapshot request to follower.
> -------------------------------------------------------------------------------------------------------------
>
> Key: RATIS-1074
> URL: https://issues.apache.org/jira/browse/RATIS-1074
> Project: Ratis
> Issue Type: Bug
> Components: gRPC
> Reporter: Glen Geng
> Assignee: Glen Geng
> Priority: Major
> Labels: MultiRaft, dense-storage
> Fix For: 1.1.0
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> GrpcLogAppender improperly decrease nextIndex to 1, which will trigger
> installSnapshot request to followers, and make pipeline be vulnerable.
>
> *Below is the interaction log between leader and the follower.*
> Follower side, received an install snapshot request, and closed the pipeline.
> {code:java}
> ozone-root-datanode-172.16.90.69.log.9:2020-09-21 19:28:49,719
> [grpc-default-executor-628] INFO org.apache.ratis.server.impl.RaftServerImpl:
> 591f38ce-0b11-4003-8528-7a5802f89ea8@group-622E07D885D8: receive
> installSnapshot:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb->591f38ce-0b11-4003-8528-7a5802f89ea8#0-t514,notify:(t:392,
> i:992920)
> ozone-root-datanode-172.16.90.69.log.9:2020-09-21 19:28:49,719
> [grpc-default-executor-628] INFO
> org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis:
> Leader change notification received for group: group-622E07D885D8 with new
> leaderId: 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb
> ozone-root-datanode-172.16.90.69.log.9:2020-09-21 19:28:49,719
> [grpc-default-executor-628] INFO org.apache.ratis.server.impl.RaftServerImpl:
> 591f38ce-0b11-4003-8528-7a5802f89ea8@group-622E07D885D8: change Leader from
> null to 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb at term 514 for installSnapshot,
> leader elected after 50593ms
> ozone-root-datanode-172.16.90.69.log.9:2020-09-21 19:28:49,719
> [grpc-default-executor-628] INFO org.apache.ratis.server.impl.RaftServerImpl:
> 591f38ce-0b11-4003-8528-7a5802f89ea8@group-622E07D885D8:
> notifyInstallSnapshot: nextIndex is 2248004 but the leader's first available
> index is 992920.
> ozone-root-datanode-172.16.90.69.log.9:2020-09-21 19:28:49,719
> [grpc-default-executor-628] WARN
> org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis:
> Install snapshot notification received from Leader with termIndex: (t:392,
> i:992920), terminating pipeline: group-622E07D885D8
> ozone-root-datanode-172.16.90.69.log.9:2020-09-21 19:28:49,720
> [grpc-default-executor-628] ERROR
> org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis:
> pipeline Action CLOSE on pipeline
> PipelineID=433c67af-f834-4959-bcf7-622e07d885d8.Reason :
> 591f38ce-0b11-4003-8528-7a5802f89ea8 closes pipeline when installSnapshot
> from leader because leader snapshot doesn't contain any data to replay, all
> the log entries prior to the snapshot might have been purged.So follower
> should not try to install snapshot from leader butcan close the pipeline
> here. It's in follower state for 1761ms
> ozone-root-datanode-172.16.90.69.log.9:2020-09-21 19:28:49,720
> [grpc-default-executor-628] INFO org.apache.ratis.server.impl.RaftServerImpl:
> 591f38ce-0b11-4003-8528-7a5802f89ea8@group-622E07D885D8: StateMachine
> successfully installed snapshot index 992920. Reloading the StateMachine.
> ozone-root-datanode-172.16.90.69.log.9:2020-09-21 19:28:49,720
> [grpc-default-executor-628] INFO
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:
> 591f38ce-0b11-4003-8528-7a5802f89ea8@group-622E07D885D8-SegmentedRaftLogWorker:
> flushIndex: setUnconditionally 2248003 -> 992920
> ozone-root-datanode-172.16.90.69.log.9:2020-09-21 19:28:49,720
> [grpc-default-executor-628] INFO
> org.apache.ratis.server.raftlog.segmented.SegmentedRaftLogWorker:
> 591f38ce-0b11-4003-8528-7a5802f89ea8@group-622E07D885D8-SegmentedRaftLogWorker:
> safeCacheEvictIndex: setUnconditionally 2248001 -> 992920
> ozone-root-datanode-172.16.90.69.log.9:2020-09-21 19:28:49,720
> [grpc-default-executor-628] INFO org.apache.ratis.server.impl.RaftServerImpl:
> 591f38ce-0b11-4003-8528-7a5802f89ea8@group-622E07D885D8: reply
> installSnapshot:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb<-591f38ce-0b11-4003-8528-7a5802f89ea8#0:FAIL-t514,IN_PROGRESS{code}
>
> Leader side, just take charge, got vote from another follower, but miss vote
> from this follower
> {code:java}
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:21,499
> [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-LeaderElection143]
> INFO org.apache.ratis.server.impl.LeaderElection:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-LeaderElection143:
> begin an election at term 510 for 2248004:
> [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb:172.16.90.54:9858:0,
> 591f38ce-0b11-4003-8528-7a5802f89ea8:172.16.90.69:9858:1,
> e5596d61-c021-4a4b-89bc-6d8e6d4f1baf:172.16.90.56:9858:0], old=null
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:26,580
> [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-LeaderElection143]
> INFO org.apache.ratis.server.impl.LeaderElection:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-LeaderElection143:
> Election PASSED; received 1 response(s)
> [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb<-e5596d61-c021-4a4b-89bc-6d8e6d4f1baf#0:OK-t510]
> and 0 exception(s);
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8:t510, leader=null,
> voted=6769b0ef-fb4a-4235-bcf0-f1623fcd74cb,
> raftlog=6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-SegmentedRaftLog:OPENED:c2248003,f2248004,i2248004,
> conf=2248004: [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb:172.16.90.54:9858:0,
> 591f38ce-0b11-4003-8528-7a5802f89ea8:172.16.90.69:9858:1,
> e5596d61-c021-4a4b-89bc-6d8e6d4f1baf:172.16.90.56:9858:0], old=null
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:26,580
> [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-LeaderElection143]
> INFO org.apache.ratis.server.impl.RoleInfo:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb: shutdown LeaderElection
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:26,580
> [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-LeaderElection143]
> INFO org.apache.ratis.server.impl.RaftServerImpl:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8: changes role from
> CANDIDATE to LEADER at term 510 for changeToLeader
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:26,580
> [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-LeaderElection143]
> INFO
> org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis:
> Leader change notification received for group: group-622E07D885D8 with new
> leaderId: 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:26,588
> [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-LeaderElection143]
> INFO org.apache.ratis.server.impl.RaftServerImpl:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8: change Leader from
> null to 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb at term 510 for becomeLeader,
> leader elected after 20419ms
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:26,588
> [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-LeaderElection143]
> INFO org.apache.ratis.server.RaftServerConfigKeys:
> raft.server.staging.catchup.gap = 1000 (default)
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:26,588
> [6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8-LeaderElection143]
> INFO org.apache.ratis.server.RaftServerConfigKeys: raft.server.rpc.sleep.time
> = 25ms (default){code}
>
> Due to Grpc exception, leader decrease next index of follower to 1. Since
> leader has purged log, it has no choice but send out an installSnapshot
> request to follower, which will trigger follower to close the pipeline.
> {code:java}
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:51,531
> [grpc-default-executor-345] WARN
> org.apache.ratis.grpc.server.GrpcLogAppender:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8->591f38ce-0b11-4003-8528-7a5802f89ea8-AppendLogResponseHandler:
> Failed appendEntries:
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io
> exception
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:51,531
> [grpc-default-executor-345] INFO org.apache.ratis.server.impl.FollowerInfo:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8->591f38ce-0b11-4003-8528-7a5802f89ea8:
> nextIndex: updateUnconditionally 2248006 -> 1
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:51,672
> [org.apache.ratis.server.impl.LogAppender$AppenderDaemon$$Lambda$601/1151550579@71fbee17]
> INFO org.apache.ratis.grpc.server.GrpcLogAppender:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8->591f38ce-0b11-4003-8528-7a5802f89ea8-GrpcLogAppender:
> followerNextIndex = 1 but logStartIndex = 992920, notify follower to install
> snapshot-(t:392, i:992920){code}
>
> *Here is the RCA*
> According to raft algo, for the given follower, the initial value of
> matchIndex is 0, the initial value of nextIndex is lastLogIndex + 1.
>
> In GrpcLogAppender::AppendLogResponseHandler
> {code:java}
> public void onError(Throwable t) {
> if (!isAppenderRunning()) {
> LOG.info("{} is stopped", GrpcLogAppender.this);
> return;
> }
> GrpcUtil.warn(LOG, () -> this + ": Failed appendEntries", t);
> grpcServerMetrics.onRequestRetry(); // Update try counter
> AppendEntriesRequest request =
> pendingRequests.remove(GrpcUtil.getCallId(t), GrpcUtil.isHeartbeat(t));
> resetClient(request);
> }
> {code}
> since this is not exception thrown back from GrpcServerProtocolService,
> callId will not be set, request will be null.
>
> {code:java}
> private synchronized void resetClient(AppendEntriesRequest request) {
> try {
> rpcService.getProxies().getProxy(getFollowerId()).resetConnectBackoff();
> } catch (IOException ie) {
> LOG.warn(this + ": Failed to reset channel by " + ie);
> }
> appendLogRequestObserver = null;
> firstResponseReceived = false;
> // clear the pending requests queue and reset the next index of follower
> final long nextIndex = 1 + Optional.ofNullable(request)
> .map(AppendEntriesRequest::getPreviousLog)
> .map(TermIndex::getIndex)
> .orElseGet(getFollower()::getMatchIndex);
> pendingRequests.clear();
> getFollower().decreaseNextIndex(nextIndex);
> }
> {code}
> request is null, nextIndex will be calculated from matchIndex.
>
> Since leader has never talked with this follower, its matchIndex will be 0,
> thus nextIndex will be matchIndex + 1, 0 + 1 = 1. Just shown as this trace:
> {code:java}
> ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:51,531
> [grpc-default-executor-345] WARN
> org.apache.ratis.grpc.server.GrpcLogAppender:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8->591f38ce-0b11-4003-8528-7a5802f89ea8-AppendLogResponseHandler:
> Failed appendEntries:
> org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io
> exception ozone-root-datanode-172.16.90.54.log.3:2020-09-21 19:28:51,531
> [grpc-default-executor-345] INFO org.apache.ratis.server.impl.FollowerInfo:
> 6769b0ef-fb4a-4235-bcf0-f1623fcd74cb@group-622E07D885D8->591f38ce-0b11-4003-8528-7a5802f89ea8:
> nextIndex: updateUnconditionally 2248006 -> 1{code}
>
> *The fix is simple*
> If leader has talked with follower, matchIndex will catchup, thus won't be 0,
> the calculation logic for nextIndex is fine.
> otherwise, matchIndex is 0, we just keep nextIndex unchanged, and retry.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)