vvcephei commented on a change in pull request #8834: URL: https://github.com/apache/kafka/pull/8834#discussion_r439525305
########## File path: clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java ########## @@ -1069,6 +1069,13 @@ private HeartbeatResponseHandler(final Generation generation) { public void handle(HeartbeatResponse heartbeatResponse, RequestFuture<Void> future) { sensors.heartbeatSensor.record(response.requestLatencyMs()); Errors error = heartbeatResponse.error(); + + if (state != MemberState.STABLE) { Review comment: I can see it either way. It seems like this PR is about sending the heartbeats _optimistically_ during rebalance, so there doesn't seem to really be any harm in ignoring the response for now. If we ignore the errors, then everything should still work, as the JoinGroup or SyncGroup response will tell us that we've been fenced next time we poll. It seems like the advantage of handling the error here is that we can potentially rejoin just a tiny bit sooner by not having to wait for the JoinGroup or SyncGroup response. But it's not clear to me that it's actually ok not to handle those responses, so then we would also need to make sure the response handling logic can detect that the response has already been invalidated if we've sent a new JoinGroup request in the mean time. This definitely has the potential to decrease the MTTR, but I'm wondering if we should take on the complexity right now, or consider it as a follow-on optimization. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org