dajac commented on a change in pull request #11566: URL: https://github.com/apache/kafka/pull/11566#discussion_r771371068
########## File path: core/src/main/scala/kafka/coordinator/group/GroupCoordinator.scala ########## @@ -1249,7 +1260,7 @@ class GroupCoordinator(val brokerId: Int, // for new members. If the new member is still there, we expect it to retry. completeAndScheduleNextExpiration(group, member, NewMemberJoinTimeoutMs) - maybePrepareRebalance(group, s"Adding new member $memberId with group instance id $groupInstanceId") + maybePrepareRebalance(group, s"Adding new member $memberId with group instance id $groupInstanceId. Member joined due to $reason") Review comment: The output log is quite hard to follow at the moment. Example: ``` [2021-12-17 11:29:16,061] INFO [GroupCoordinator 0]: Preparing to rebalance group test in state PreparingRebalance with old generation 1 (__consumer_offsets-48) (reason: Adding new member console-consumer-1d5a9905-c271-4700-a817-62fc9b9f28fc with group instance id None. Member joined due to rebalance failed due to class org.apache.kafka.common.errors.MemberIdRequiredException error: The group member needs to have a valid member id before actually entering a consumer group.) (kafka.coordinator.group.GroupCoordinator) ``` How about doing the following? For each reason, we could add `; client reason: $reason`. With this, we will always have (reason: ....; client reason: ...) in each rebalance logs. It might be clearer. What do you think? ########## File path: clients/src/main/resources/common/message/JoinGroupResponse.json ########## @@ -31,7 +31,9 @@ // Version 6 is the first flexible version. // // Starting from version 7, the broker sends back the Protocol Type to the client (KIP-559). - "validVersions": "0-7", + // + // Version 8 adds the Reason field (KIP-800). Review comment: nit: Should we rather say Version 8 is the same as version 7. here? ########## File path: clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java ########## @@ -467,6 +468,7 @@ boolean joinGroupIfNeeded(final Timer timer) { final RuntimeException exception = future.exception(); resetJoinGroupFuture(); + rejoinReason = "rebalance failed due to " + exception.getClass() + " error: " + exception.getMessage(); Review comment: Example on the broker side: ``` [2021-12-17 11:29:16,061] INFO [GroupCoordinator 0]: Preparing to rebalance group test in state PreparingRebalance with old generation 1 (__consumer_offsets-48) (reason: Adding new member console-consumer-1d5a9905-c271-4700-a817-62fc9b9f28fc with group instance id None. Member joined due to rebalance failed due to class org.apache.kafka.common.errors.MemberIdRequiredException error: The group member needs to have a valid member id before actually entering a consumer group.) (kafka.coordinator.group.GroupCoordinator) ``` * Should we only get the `getSimpleName` of the class? * There are many `:` in the log. I wonder if we could remove the one we've put here. Perhaps, we could use the following pattern: `rebalance failed due to '$message' ($class)`. What do you think? ########## File path: core/src/main/scala/kafka/coordinator/group/GroupCoordinator.scala ########## @@ -181,6 +182,7 @@ class GroupCoordinator(val brokerId: Int, responseCallback(JoinGroupResult(memberId, Errors.UNKNOWN_MEMBER_ID)) case Some(group) => group.inLock { + val joinReason = reason.getOrElse("unknown reason") Review comment: If we do this, it might be better to not use an `Option` after all. We could simply provided the default reason to `handleJoinGroup` if none is provided. Also, how about using `not provided` instead of `unknown reason`? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org