vvcephei commented on a change in pull request #8834:
URL: https://github.com/apache/kafka/pull/8834#discussion_r439525305



##########
File path: 
clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java
##########
@@ -1069,6 +1069,13 @@ private HeartbeatResponseHandler(final Generation 
generation) {
         public void handle(HeartbeatResponse heartbeatResponse, 
RequestFuture<Void> future) {
             sensors.heartbeatSensor.record(response.requestLatencyMs());
             Errors error = heartbeatResponse.error();
+
+            if (state != MemberState.STABLE) {

Review comment:
       I can see it either way. It seems like this PR is about sending the 
heartbeats _optimistically_ during rebalance, so there doesn't seem to really 
be any harm in ignoring the response for now. If we ignore the errors, then 
everything should still work, as the JoinGroup or SyncGroup response will tell 
us that we've been fenced next time we poll.
   
   It seems like the advantage of handling the error here is that we can 
potentially rejoin just a tiny bit sooner by not having to wait for the 
JoinGroup or SyncGroup response. But it's not clear to me that it's actually ok 
not to handle those responses, so then we would also need to make sure the 
response handling logic can detect that the response has already been 
invalidated if we've sent a new JoinGroup request in the mean time.
   
   This definitely has the potential to decrease the MTTR, but I'm wondering if 
we should take on the complexity right now, or consider it as a follow-on 
optimization.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to