jsancio commented on a change in pull request #9553:
URL: https://github.com/apache/kafka/pull/9553#discussion_r547556738



##########
File path: raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java
##########
@@ -1037,6 +1047,35 @@ private boolean handleFetchResponse(
                     logger.info("Truncated to offset {} from Fetch response from leader {}",
                         truncationOffset, quorum.leaderIdOrNil());
                 });
+            } else if (partitionResponse.snapshotId().epoch() >= 0 ||
+                       partitionResponse.snapshotId().endOffset() >= 0) {
+                // The leader is asking us to fetch a snapshot
+
+                if (partitionResponse.snapshotId().epoch() < 0) {
+                    throw new KafkaException(

Review comment:
       > I would suggest that we log an error saying that the remote replica 
seemed to return an invalid response and just keep fetching. Then a user can 
see the log message and restart the remote replica.
   
   Yeah, this is what I implemented, and I added a test for it. In other words:
   1. Log an error message
   2. Tell the raft client that the response was handled successfully, but do 
not reset the fetch timer
   
   In practice this results in the follower continuing to send `Fetch` 
requests. Once `fetchTimeoutMs` elapses without a valid response, the follower 
transitions to candidate, as the existing client code already does. See 
https://github.com/apache/kafka/pull/9553/files#diff-86474ad1438150630c21b29a3da2f6dd79d1357e33ac034f00e5fcef0f2e889cR350
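
To make the intended behavior concrete, here is a minimal, self-contained sketch of the idea. It is not the code in `KafkaRaftClient`: the class `FetchTimeoutSketch`, the nested `Follower` type, and its methods are hypothetical names used only for illustration, and time is passed in explicitly rather than read from a clock. It assumes that a valid response resets the fetch timer, while an invalid one is logged and reported as handled without touching the timer.

```java
public class FetchTimeoutSketch {
    enum State { FOLLOWER, CANDIDATE }

    // Hypothetical stand-in for a follower; not the real raft client state.
    static final class Follower {
        private final long fetchTimeoutMs;
        private long lastValidFetchMs;
        private State state = State.FOLLOWER;

        Follower(long fetchTimeoutMs, long nowMs) {
            this.fetchTimeoutMs = fetchTimeoutMs;
            this.lastValidFetchMs = nowMs;
        }

        // Returns true when the response counts as "handled".
        // Assumption: only a valid response resets the fetch timer.
        boolean handleFetchResponse(boolean valid, long nowMs) {
            if (valid) {
                lastValidFetchMs = nowMs;
            } else {
                // Step 1: log the problem so an operator can act on it.
                System.err.println("Invalid Fetch response from leader; will keep fetching");
                // Step 2: fall through without resetting the fetch timer.
            }
            return true;
        }

        // Called periodically; moves to CANDIDATE once the fetch timer expires.
        void poll(long nowMs) {
            if (state == State.FOLLOWER && nowMs - lastValidFetchMs >= fetchTimeoutMs) {
                state = State.CANDIDATE;
            }
        }

        State state() {
            return state;
        }
    }

    public static void main(String[] args) {
        Follower follower = new Follower(1_000L, 0L);
        follower.handleFetchResponse(true, 200L);   // valid: timer restarts at t=200
        follower.handleFetchResponse(false, 900L);  // invalid: logged, timer untouched
        follower.poll(1_100L);                      // 900 ms since the last valid response
        System.out.println(follower.state());
        follower.poll(1_200L);                      // 1000 ms since the last valid response
        System.out.println(follower.state());
    }
}
```

The point of the sketch is that the invalid response at t=900 does not extend the follower's deadline, so the timeout is still measured from the last valid response at t=200.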
   
   Let me know if this is what you were thinking.



