[
https://issues.apache.org/jira/browse/KAFKA-17900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lianet Magrans updated KAFKA-17900:
-----------------------------------
Description:
[This concerns the regex used on consumer.subscribe(Pattern), which is computed
on the client side, not the new SubscribedPattern introduced by KIP-848, which
is computed on the broker side]
With the new AsyncKafkaConsumer, we are already moving the computation of the
subscription regex to the background thread as part of the
UpdatePatternSubscriptionEvent. This evaluates the regex against all topics
from metadata whenever the metadata changes.
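As a rough illustration of why this evaluation scales with the number of topics, here is a minimal sketch of matching one regex against every topic name. The class and method names (PatternMatchSketch, matchTopics) are hypothetical and are not part of the Kafka client; the sketch assumes a full-name match per topic.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

public class PatternMatchSketch {

    // Hypothetical helper: returns the topics whose full name matches the pattern.
    // The work is one regex match per topic, so cost grows with the topic count
    // and with the complexity of the regex itself.
    public static Set<String> matchTopics(Pattern pattern, List<String> allTopics) {
        Set<String> matched = new TreeSet<>();
        for (String topic : allTopics) {
            if (pattern.matcher(topic).matches()) {
                matched.add(topic);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        // Illustrative topic names only.
        List<String> topics = List.of("orders-eu", "payments", "orders-us");
        System.out.println(matchTopics(Pattern.compile("orders-.*"), topics));
    }
}
```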
Even though the evaluation already runs in a separate thread, the app thread
still blocks on it from within the poll loop
(updateAssignmentMetadataIfNeeded does addAndGet on the
UpdatePatternSubscriptionEvent).
We could consider making this regex refresh truly async by triggering the
UpdatePatternSubscriptionEvent with a call to applicationEventHandler.add
instead. The expectation is that the consumer poll loop will continue without
initially reflecting any change in the subscribed topics (regex not
re-evaluated yet). The moment the background thread completes the regex
evaluation, it will update the subscription state with the new topics, making
the new subscription effective. This aligns with the approach taken for
broker-side regex resolution, which does not block member heartbeats; it just
updates the assignment to include the new topics when the regex resolution is
ready.
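The difference between the two submission modes can be sketched as follows. This is a simplified stand-in, not the Kafka client code: AsyncRegexSketch, its subscription field and the single-thread executor are assumptions made for illustration, and only the method names add and addAndGet mirror the ones mentioned above.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;
import java.util.regex.Pattern;

public class AsyncRegexSketch {

    // Hypothetical stand-in for the consumer's subscription state.
    public static final AtomicReference<Set<String>> subscription =
        new AtomicReference<>(Set.of());

    // Hypothetical stand-in for the background thread (daemon, so the JVM can exit).
    private static final ExecutorService background =
        Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "background-sketch");
            t.setDaemon(true);
            return t;
        });

    private static Set<String> evalRegex(Pattern pattern, List<String> topics) {
        Set<String> matched = new TreeSet<>();
        for (String topic : topics) {
            if (pattern.matcher(topic).matches()) {
                matched.add(topic);
            }
        }
        return matched;
    }

    // Fire-and-forget: the caller returns immediately; the subscription is
    // updated whenever the background thread finishes the evaluation.
    public static CompletableFuture<Void> add(Pattern pattern, List<String> topics) {
        return CompletableFuture.runAsync(
            () -> subscription.set(evalRegex(pattern, topics)), background);
    }

    // Blocking: the caller waits until the background evaluation has completed.
    public static void addAndGet(Pattern pattern, List<String> topics) {
        add(pattern, topics).join();
    }

    public static void main(String[] args) {
        List<String> topics = List.of("orders-eu", "orders-us", "payments");
        CompletableFuture<Void> pending = add(Pattern.compile("orders-.*"), topics);
        // A poll loop would keep running here with whatever subscription is
        // currently visible; it may or may not include the new topics yet.
        System.out.println("right after add: " + subscription.get());
        pending.join();
        System.out.println("after completion: " + subscription.get());
    }
}
```

With add, the caller sees the previous subscription until the background update lands, which is the trade-off described above.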
The main gain is not blocking the poll loop on a regex computation that may be
expensive, depending largely on the number of topics in metadata and the regex
in use. We should still consider whether there would be any undesired
side effects on the client poll loop (I'm not seeing any).
Initial PR comment here:
https://github.com/apache/kafka/pull/17569#discussion_r1822808489
> Consider async resolution for client-side regex in new consumer
> ---------------------------------------------------------------
>
> Key: KAFKA-17900
> URL: https://issues.apache.org/jira/browse/KAFKA-17900
> Project: Kafka
> Issue Type: Task
> Components: clients, consumer
> Reporter: Lianet Magrans
> Priority: Minor
> Labels: consumer-threading-refactor