[ https://issues.apache.org/jira/browse/KAFKA-8896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16941822#comment-16941822 ]
ASF GitHub Bot commented on KAFKA-8896:
---------------------------------------

mumrah commented on pull request #7377: KAFKA-8896: Check group state before completing delayed heartbeat
URL: https://github.com/apache/kafka/pull/7377

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> NoSuchElementException after coordinator move
> ---------------------------------------------
>
>                 Key: KAFKA-8896
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8896
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.2.0, 2.3.0, 2.2.1
>            Reporter: Jason Gustafson
>            Assignee: Boyang Chen
>            Priority: Major
>             Fix For: 2.3.1
>
>
> Caught this exception in the wild:
> {code:java}
> java.util.NoSuchElementException: key not found: consumer-group-38981ebe-4361-44e7-b710-7d11f5d35639
>     at scala.collection.MapLike.default(MapLike.scala:235)
>     at scala.collection.MapLike.default$(MapLike.scala:234)
>     at scala.collection.AbstractMap.default(Map.scala:63)
>     at scala.collection.mutable.HashMap.apply(HashMap.scala:69)
>     at kafka.coordinator.group.GroupMetadata.get(GroupMetadata.scala:214)
>     at kafka.coordinator.group.GroupCoordinator.$anonfun$tryCompleteHeartbeat$1(GroupCoordinator.scala:1008)
>     at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
>     at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
>     at kafka.coordinator.group.GroupMetadata.inLock(GroupMetadata.scala:209)
>     at kafka.coordinator.group.GroupCoordinator.tryCompleteHeartbeat(GroupCoordinator.scala:1001)
>     at kafka.coordinator.group.DelayedHeartbeat.tryComplete(DelayedHeartbeat.scala:34)
>     at kafka.server.DelayedOperation.maybeTryComplete(DelayedOperation.scala:122)
>     at kafka.server.DelayedOperationPurgatory$Watchers.tryCompleteWatched(DelayedOperation.scala:391)
>     at kafka.server.DelayedOperationPurgatory.checkAndComplete(DelayedOperation.scala:295)
>     at kafka.coordinator.group.GroupCoordinator.completeAndScheduleNextExpiration(GroupCoordinator.scala:802)
>     at kafka.coordinator.group.GroupCoordinator.completeAndScheduleNextHeartbeatExpiration(GroupCoordinator.scala:795)
>     at kafka.coordinator.group.GroupCoordinator.$anonfun$handleHeartbeat$2(GroupCoordinator.scala:543)
>     at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>     at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
>     at kafka.coordinator.group.GroupMetadata.inLock(GroupMetadata.scala:209)
>     at kafka.coordinator.group.GroupCoordinator.handleHeartbeat(GroupCoordinator.scala:516)
>     at kafka.server.KafkaApis.handleHeartbeatRequest(KafkaApis.scala:1617)
>     at kafka.server.KafkaApis.handle(KafkaApis.scala:155)
> {code}
>
> Looking at the logs, I see a coordinator change just prior to this exception.
> The group was first unloaded as the coordinator moved to another broker and
> then was loaded again as the coordinator was moved back. I am guessing that
> somehow the delayed heartbeat is retaining the reference to the old
> GroupMetadata instance. Not sure exactly how this can happen though.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
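
The failure mode guessed at in the report, and the kind of guard the pull request title suggests, can be sketched roughly as follows. This is a minimal Java illustration, not the actual patch and not Kafka's code. The names `Group`, `GroupState`, `unload`, `tryCompleteUnguarded`, and `tryCompleteGuarded` are hypothetical stand-ins. It also assumes that an unloaded group has an empty member map and a dead state, which the report does not confirm. The `get` method throws on a missing key to mirror the `scala.collection.mutable.HashMap.apply` frame in the stack trace.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

class DelayedHeartbeatSketch {

    // Hypothetical, simplified group states (not Kafka's full state set).
    enum GroupState { STABLE, DEAD }

    // Hypothetical stand-in for a group metadata object.
    static final class Group {
        private final Map<String, Long> lastHeartbeatMs = new HashMap<>();
        private GroupState state = GroupState.STABLE;

        void add(String memberId, long timestampMs) {
            lastHeartbeatMs.put(memberId, timestampMs);
        }

        boolean has(String memberId) {
            return lastHeartbeatMs.containsKey(memberId);
        }

        // Throws on a missing key, like Scala's HashMap.apply in the trace.
        long get(String memberId) {
            Long ts = lastHeartbeatMs.get(memberId);
            if (ts == null) {
                throw new NoSuchElementException("key not found: " + memberId);
            }
            return ts;
        }

        // Assumption: unloading empties the member map and marks the group dead.
        void unload() {
            lastHeartbeatMs.clear();
            state = GroupState.DEAD;
        }

        boolean is(GroupState s) {
            return state == s;
        }
    }

    // Unguarded completion check: looks the member up unconditionally.
    static boolean tryCompleteUnguarded(Group group, String memberId, long deadlineMs) {
        return group.get(memberId) > deadlineMs;
    }

    // Guarded completion check: inspects group state and membership first.
    static boolean tryCompleteGuarded(Group group, String memberId, long deadlineMs) {
        if (group.is(GroupState.DEAD) || !group.has(memberId)) {
            return true; // nothing left to wait for on a stale group
        }
        return group.get(memberId) > deadlineMs;
    }

    public static void main(String[] args) {
        Group stale = new Group();
        stale.add("consumer-1", 100L);
        stale.unload(); // a delayed operation may still hold this reference

        System.out.println(tryCompleteGuarded(stale, "consumer-1", 50L));
        try {
            tryCompleteUnguarded(stale, "consumer-1", 50L);
        } catch (NoSuchElementException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether a stale delayed heartbeat should complete, expire, or be cancelled is a design choice for the coordinator. The sketch completes it only to show that checking state before the lookup avoids the exception.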