m1a2st commented on code in PR #17440:
URL: https://github.com/apache/kafka/pull/17440#discussion_r1827819017
##########
clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java:
##########

```diff
@@ -1574,7 +1583,11 @@ private boolean updateFetchPositions(final Timer timer) {
     try {
         CheckAndUpdatePositionsEvent checkAndUpdatePositionsEvent = new CheckAndUpdatePositionsEvent(calculateDeadlineMs(timer));
         wakeupTrigger.setActiveTask(checkAndUpdatePositionsEvent.future());
-        cachedSubscriptionHasAllFetchPositions = applicationEventHandler.addAndGet(checkAndUpdatePositionsEvent);
+        applicationEventHandler.add(checkAndUpdatePositionsEvent);
+        cachedSubscriptionHasAllFetchPositions = processBackgroundEvents(
```

Review Comment:

> I'm afraid that processing background events here could bring undesired effects, mainly because it may include callbacks which are totally unrelated to a call to consumer.position(), for instance. If there's a call to consumer.position() and a reconciliation starts in the background around the same time (i.e. a new partition is assigned by the broker), the background thread will enqueue a CallbackNeededEvent, so we could end up actually running the callback here, as part of the call to position(), which is not right.

This is a preliminary thought: to prevent a CallbackNeededEvent from being executed out of order, we could use a PriorityQueue that orders events by their timestamps. That way, if we pull an event that should not run in the current context, we can re-queue it without disrupting the processing sequence (see the first sketch at the end of this comment).

> I believe that we should not use processBackgroundEvents as a means of knowing that a specific event that we sent to the background failed. We should ensure that the event is completed exceptionally in the background instead. Wouldn't that work, and avoid mixing errors and callbacks? (related info in the jira too)

I've been exploring the use of CompletableFuture to propagate metadata exceptions in the new Kafka consumer (see the second sketch below). However, I've run into some test failures, which may indicate differences in behavior compared to the classic consumer. This makes me question whether this approach is the best way to refactor error handling in the new consumer.
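To make the PriorityQueue idea concrete, here is a minimal sketch. The `TimestampedEvent` record, `OrderedEventQueue` class, and `mayRunHere` predicate are hypothetical stand-ins for the consumer's internal BackgroundEvent hierarchy, not the actual API:

```java
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.function.Predicate;

// Hypothetical, simplified stand-in for the consumer's background events;
// the real BackgroundEvent hierarchy is richer than a (timestamp, type) pair.
record TimestampedEvent(long timestampMs, String type) { }

class OrderedEventQueue {
    // Order events by enqueue timestamp so a later CallbackNeededEvent
    // cannot be consumed ahead of earlier events.
    private final PriorityQueue<TimestampedEvent> queue =
            new PriorityQueue<>(Comparator.comparingLong(TimestampedEvent::timestampMs));

    void add(TimestampedEvent event) {
        queue.add(event);
    }

    // Pull the next event; if the caller must not run it in the current
    // context (e.g. a callback event inside consumer.position()), put it
    // back so the overall processing sequence is not disrupted.
    TimestampedEvent pollRunnable(Predicate<TimestampedEvent> mayRunHere) {
        TimestampedEvent next = queue.poll();
        if (next != null && !mayRunHere.test(next)) {
            queue.add(next); // re-queue; its timestamp preserves its position
            return null;
        }
        return next;
    }
}
```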
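And a rough sketch of the "complete the event exceptionally in the background" approach, assuming an event that carries its own CompletableFuture, loosely modeled on the consumer's CompletableApplicationEvent. The `CheckPositionsEvent` and `ErrorPropagationSketch` names and methods here are simplified illustrations, not the real internals:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

// Hypothetical, simplified event that carries its own future.
class CheckPositionsEvent {
    final CompletableFuture<Boolean> future = new CompletableFuture<>();
}

class ErrorPropagationSketch {

    // Background side: on failure, complete the event's own future
    // exceptionally instead of enqueuing an error event that would mix
    // with callbacks in the background-event queue.
    void processInBackground(CheckPositionsEvent event) {
        try {
            event.future.complete(checkAndUpdatePositions());
        } catch (RuntimeException e) { // e.g. a metadata or timeout error
            event.future.completeExceptionally(e);
        }
    }

    // Application side: the failure for this specific event surfaces
    // directly from its future; no draining of unrelated background
    // events is needed.
    boolean awaitResult(CheckPositionsEvent event) throws InterruptedException {
        try {
            return event.future.get();
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();
            if (cause instanceof RuntimeException runtime)
                throw runtime;
            throw new RuntimeException(cause);
        }
    }

    private boolean checkAndUpdatePositions() {
        return true; // placeholder for the real position-validation logic
    }
}
```

With this shape, a metadata error raised while checking positions reaches only the caller that submitted the event, which is the behavior the quoted suggestion describes.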