[
https://issues.apache.org/jira/browse/KAFKA-13543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459425#comment-17459425
]
Guozhang Wang commented on KAFKA-13543:
---------------------------------------
Thanks [~ableegoldman] for reporting this. I think it was by-design to do
non-blocking call to update the metadata, and allow temporary staleness.
I think this could be mitigated when we have
https://cwiki.apache.org/confluence/display/KAFKA/KIP-698%3A+Add+Explicit+User+Initialization+of+Broker-side+State+to+Kafka+Streams
in place, i.e. when new named topology is added we would wait until the
internal topics are created before re-subscribe the consumer. This does not
guarantee we are 100% safe though, since we still do not block on metadata
being succesfully propagated.
Maybe we can distinguish by using the task timeout at the initialization when
we do KIP-698, to wait until the metadata of both internal topics / source
topics be received, and throw timeout when we failed to get the updated
metadata in time.
> Consumer may pass stale cluster metadata to the assignor following a
> subscription update
> ----------------------------------------------------------------------------------------
>
> Key: KAFKA-13543
> URL: https://issues.apache.org/jira/browse/KAFKA-13543
> Project: Kafka
> Issue Type: Bug
> Components: consumer
> Reporter: A. Sophie Blee-Goldman
> Priority: Major
>
> A consumer only ever tracks metadata corresponding to its subscribed topics,
> which can cause a race condition during a rebalance immediately after a
> change to the consumer's subscription. Particularly, when new topics are
> added to the subscription but a rebalance in kicked off before the consumer's
> metadata is updated with the new topics, it will pass a stale copy of the
> cluster metadata in to the ConsumerPartitionAssignor#assign method, which may
> not include the newly subscribed topics regardless of whether they do or do
> not exist.
> Most apps are likely unaffected by this, including any consumer client apps
> using OOTB assignors, since a new rebalance will be kicked off when the
> metadata is updated and any partitions from the new topics will be assigned
> at that time. But in Kafka Streams, we do a check during each rebalance to
> ensure that any user input topics are created ahead of time. This race
> condition can result in Streams incorrectly identifying user topics as
> missing and throwing a MissingSourceTopicException when a new topology
> subscribed to new topics is added to the applicationĀ
> We can work around this for now, but it's unfortunate that we can't
> distinguish between true missing source topics and a transient lack of these
> topics in the metadata. There might also be some plain consumer client apps
> with custom assignors that run into this as well, for more advanced users.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)