squah-confluent commented on code in PR #18499:
URL: https://github.com/apache/kafka/pull/18499#discussion_r1918456215

##########
coordinator-common/src/main/java/org/apache/kafka/coordinator/common/runtime/CoordinatorRuntime.java:
##########
@@ -1837,7 +1849,14 @@ public void onHighWatermarkUpdated(
                     // exists and is in the active state.
                     log.debug("Updating high watermark of {} to {}.", tp, newHighWatermark);
                     context.coordinator.updateLastCommittedOffset(newHighWatermark);
-                    context.deferredEventQueue.completeUpTo(newHighWatermark);
+                    try {
+                        context.deferredEventQueue.completeUpTo(newHighWatermark);
+                    } catch (Throwable e) {
+                        log.error("Failed to complete deferred events for {} up to {}, flushing deferred event queue.",
+                            tp, newHighWatermark, e);
+                        context.deferredEventQueue.failAll(Errors.NOT_COORDINATOR.exception());
+                        context.failCurrentBatch(Errors.NOT_COORDINATOR.exception());

Review Comment:
   I think I've gotten myself confused. For some reason I thought that later DeferredEvents depended on earlier DeferredEvents being run.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org