[
https://issues.apache.org/jira/browse/KAFKA-17696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lianet Magrans updated KAFKA-17696:
-----------------------------------
Description:
When a metadata error happens (e.g. an unauthorized topic), the network layer
is the one that detects it, and it only propagates the error to the app thread
via an ErrorEvent.
[https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/NetworkClientDelegate.java#L153]
That allows the API calls that run processBackgroundEvents to throw the error
in the app thread (e.g. poll, unsubscribe and close, which are currently the
only API calls that run processBackgroundEvents).
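
To make the mechanism above concrete, here is a minimal sketch of an app thread that only sees queued errors when it drains the background event queue. All names (BackgroundEventDrainSketch, errorEvents, otherApiCall) are hypothetical stand-ins, not the real AsyncKafkaConsumer code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical, simplified model; these are not the real Kafka classes.
class BackgroundEventDrainSketch {
    // Stand-in for the queue of ErrorEvents sent from the background thread.
    final BlockingQueue<RuntimeException> errorEvents = new LinkedBlockingQueue<>();

    // Drains everything queued so far and rethrows the first error, if any.
    void processBackgroundEvents() {
        List<RuntimeException> drained = new ArrayList<>();
        errorEvents.drainTo(drained);
        if (!drained.isEmpty()) {
            throw drained.get(0);
        }
    }

    // An API call that drains the queue, so it can surface the error.
    void poll() {
        processBackgroundEvents();
    }

    // An API call that never looks at the queue, so it cannot surface the error.
    void otherApiCall() {
        // intentionally does not call processBackgroundEvents()
    }

    public static void main(String[] args) {
        BackgroundEventDrainSketch consumer = new BackgroundEventDrainSketch();
        consumer.errorEvents.add(new IllegalStateException("Unauthorized topic"));
        consumer.otherApiCall(); // returns normally: the queued error stays unseen
        try {
            consumer.poll();
        } catch (IllegalStateException e) {
            System.out.println("poll surfaced: " + e.getMessage());
        }
    }
}
```

In this sketch the error is visible only to the call that drains the queue, which mirrors the split between poll/unsubscribe/close and every other API call.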
This means that all other API calls, which do not run processBackgroundEvents,
will never know about errors like unauthorized topics. Moreover, it means that
the background operations are not notified or aborted when a metadata error
(e.g. an auth error) happens. For example, a call to position that blocks
waiting for updateFetchPositions
([here|https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1586])
will leave a pendingOffsetFetchEvent waiting to complete, even when the
background thread has already received an Unauthorized exception (which it only
passes to the app thread via an ErrorEvent).
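
The stuck call described above can be modelled roughly as follows. This is a sketch under assumptions: StuckPositionSketch, its fields and its methods are hypothetical, and a timeout is added only so the example terminates.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical, simplified model; these are not the real Kafka classes.
class StuckPositionSketch {
    // Stand-in for the queue of ErrorEvents sent to the app thread.
    final BlockingQueue<RuntimeException> backgroundEvents = new LinkedBlockingQueue<>();

    // Stand-in for the pending offset fetch event the app thread waits on.
    final CompletableFuture<Long> pendingOffsetFetch = new CompletableFuture<>();

    // Background side: the metadata error is only enqueued for the app thread;
    // the pending future is left untouched.
    void onMetadataErrorInBackground(RuntimeException error) {
        backgroundEvents.add(error);
    }

    // App side: blocks on the future first, so the queued error is not consulted.
    String position(long timeoutMs) {
        try {
            return "offset=" + pendingOffsetFetch.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return "timed out; unprocessed error events: " + backgroundEvents.size();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StuckPositionSketch consumer = new StuckPositionSketch();
        consumer.onMetadataErrorInBackground(new RuntimeException("Unauthorized topic"));
        System.out.println(consumer.position(100));
    }
}
```

The error is already known in the background when position is called, yet the caller just waits until its timeout because nothing completes the future it is blocked on.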
I wonder if we should ensure that metadata errors are not only propagated to
the app thread via ErrorEvents, but also that all request managers in the
background are notified (so that they can decide whether to
completeExceptionally their outstanding events). E.g.
OffsetsRequestManager.onMetadataError should completeExceptionally the
pendingOffsetFetchEvent. (This is just a first thought, there could be other
approaches. Note, though, that calling processBackgroundEvents in API calls
like position will not do, because we would block first on
CheckAndUpdatePositions, and processBackgroundEvents would only run after
CheckAndUpdatePositions completes.)
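
One possible shape of this idea, as a rough sketch only: the RequestManager interface, the onMetadataError callback and the notifyManagers helper below are assumptions for illustration and do not reflect existing Kafka signatures.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

// Hypothetical, simplified model of the proposal; not the real Kafka classes.
class MetadataErrorFanOutSketch {
    // Assumed callback that each background request manager would implement.
    interface RequestManager {
        void onMetadataError(Exception error);
    }

    // Stand-in for OffsetsRequestManager holding one outstanding event.
    static class OffsetsManager implements RequestManager {
        final CompletableFuture<Long> pendingOffsetFetch = new CompletableFuture<>();

        @Override
        public void onMetadataError(Exception error) {
            // Fail the outstanding event so a blocked caller wakes up with the error.
            pendingOffsetFetch.completeExceptionally(error);
        }
    }

    // Stand-in for the network layer: besides sending the ErrorEvent (omitted
    // here), it fans the error out to every request manager.
    static void notifyManagers(List<RequestManager> managers, Exception error) {
        for (RequestManager manager : managers) {
            manager.onMetadataError(error);
        }
    }

    // What a blocked caller would observe: the error message, or null on success.
    static String awaitError(CompletableFuture<Long> future) {
        try {
            future.get();
            return null;
        } catch (ExecutionException e) {
            return e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        OffsetsManager offsets = new OffsetsManager();
        List<RequestManager> managers = List.of(offsets);
        notifyManagers(managers, new SecurityException("Unauthorized topic"));
        System.out.println(awaitError(offsets.pendingOffsetFetch));
    }
}
```

Each manager decides for itself what to do with the error; here the offsets manager fails its single pending future, so the waiting caller returns with the cause instead of blocking.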
This behaviour can be reproduced with the integration test
AuthorizerIntegrationTest.testOffsetFetchWithNoAccess with the new consumer
enabled (discovered with [https://github.com/apache/kafka/pull/17107]).
was:
When a metadata error happens (ie. Unauthorized topic), the network layer is
the one to detect it and it just propagates it to the app thread via en
ErrorEvent.
[https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/NetworkClientDelegate.java#L153]
That allows api calls that processBackgroundEvents to throw the error in the
app thread (ex. poll, unsubscribe and close, which are the only api calls that
currently processBackgroundEvents).
This means that all other api calls that do not processBackgroundEvent will
never know about errors like Unauthorized topics. Moreover, it really means
that the background operations are not notified/aborted when a metadata error
happens (auth error). Ex. call to position block waiting for the
updateFetchPositions
([here|https://github.com/apache/kafka/blob/0edf5dbd204df9eb62bfea1b56993e95737df5a3/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L1586]),
will leave a
pendingOffsetFetchEvent waiting to complete, even when the background already
got an Unauthorized exception (but it only passed it to the app thread via
ErrorEvent)
I wonder if we should ensure that metadata errors are not only propagated to
the app thread via ErrorEvents, but also ensure that we notify all request
managers in the background (so that they can decide if completeExceptionally
their outstanding events). Ex. OffsetsRequestManager.onMetadataError should
completeExceptionally the pendingOffsetFetchEvent (just first thought, there
could be other approaches, but note that calling processBackgroundEvent in api
calls like positions will not do because we would block first on the
CheckAndUpdatePositions, then processBackgroundEvents that would only happen
after the CheckAndUpdate)
This behaviour can be repro with the integration test
AuthorizerIntegrationTest.testOffsetFetchWithNoAccess with the new consumer
enabled (discovered with [https://github.com/apache/kafka/pull/17107] )
> New consumer background operations unaware of metadata errors
> -------------------------------------------------------------
>
> Key: KAFKA-17696
> URL: https://issues.apache.org/jira/browse/KAFKA-17696
> Project: Kafka
> Issue Type: Bug
> Components: clients, consumer
> Reporter: Lianet Magrans
> Assignee: 黃竣陽
> Priority: Blocker
> Labels: consumer-threading-refactor
> Fix For: 4.0.0
>