lianetm commented on PR #17440: URL: https://github.com/apache/kafka/pull/17440#issuecomment-2489427648
Hey @m1a2st, sharing a thought in case it helps.

First, the problem we have is that API calls like position/endOffsets trigger events that should fail with topic metadata errors but don't, and are left hanging until they time out. With that in mind, it occurred to me that we do have all the events that are awaiting responses in hand when `ConsumerNetworkThread.runOnce` runs, because the reaper holds them: it keeps all the completable events so they can be expired eventually. Couldn't we take those events and let them know about the error when it happens? Then each event decides whether it should fail on a topic metadata error or not.

I'm picturing something along these lines, in `ConsumerNetworkThread.runOnce`:

```
// 1. get metadata error that happens here
networkClientDelegate.poll(pollWaitTimeMs, currentTimeMs);
...
// 2. get all awaiting events after expiration applies (the reaper has them all, not just the ones generated on the current runOnce)
List<CompletableApplicationEvent> awaitingEvents = reapExpiredApplicationEvents(currentTimeMs);
// 3. notify awaiting events about the metadata error
if (metadataError != null) {
    awaitingEvents.forEach(e -> e.onMetadataError(metadataError));
}
```

Would that work? The main advantages I see are that it avoids the complexity of passing metadata error futures around to specific manager calls, and that it is a solution applied consistently to all events (each event type then decides whether it should fail on topic metadata errors).

For `onMetadataError`, events could no-op by default, and some would override it to simply call `future.completeExceptionally`, e.g. `CheckAndUpdatePositionsEvent` and `CommitEvent` (these two seem to be the ones leading to the failed tests in the Authorizer file; we can get into details later about which other events should consider the error).

I could be missing something, but sharing in case it helps! Let me know.
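To make the no-op-by-default idea concrete, here is a minimal, self-contained sketch. All the class names in it (`SketchEvent`, `FailOnMetadataErrorEvent`, `SketchReaper`, `MetadataErrorSketch`) are hypothetical stand-ins, not the actual consumer classes, and expiration handling is left out on purpose; it only illustrates how each event type could decide for itself whether a metadata error fails its future.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for a completable application event (not the real Kafka class).
class SketchEvent {
    final String name;
    final CompletableFuture<Void> future = new CompletableFuture<>();

    SketchEvent(String name) {
        this.name = name;
    }

    // Default behaviour: ignore topic metadata errors.
    void onMetadataError(Exception metadataError) {
    }
}

// Hypothetical event type that opts in to failing on topic metadata errors.
class FailOnMetadataErrorEvent extends SketchEvent {
    FailOnMetadataErrorEvent(String name) {
        super(name);
    }

    @Override
    void onMetadataError(Exception metadataError) {
        future.completeExceptionally(metadataError);
    }
}

// Hypothetical, simplified reaper: tracks events and returns the ones still awaiting.
class SketchReaper {
    private final List<SketchEvent> tracked = new ArrayList<>();

    void add(SketchEvent event) {
        tracked.add(event);
    }

    // Drops events whose future is already done and returns the rest.
    // Timeout-based expiration is omitted in this sketch.
    List<SketchEvent> reap() {
        tracked.removeIf(e -> e.future.isDone());
        return new ArrayList<>(tracked);
    }
}

class MetadataErrorSketch {
    // Mirrors steps 2 and 3 of the proposal: collect awaiting events, then notify them.
    static void notifyAwaiting(SketchReaper reaper, Exception metadataError) {
        List<SketchEvent> awaitingEvents = reaper.reap();
        if (metadataError != null) {
            awaitingEvents.forEach(e -> e.onMetadataError(metadataError));
        }
    }
}
```

With this shape, an event that overrides `onMetadataError` fails fast instead of waiting for its timeout, while every other event is untouched and stays tracked by the reaper.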