[
https://issues.apache.org/jira/browse/FLINK-22147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated FLINK-22147:
-----------------------------------
Labels: pull-request-available (was: )
> Refactor Partition Discovery Logic in KafkaSourceEnumerator
> -----------------------------------------------------------
>
> Key: FLINK-22147
> URL: https://issues.apache.org/jira/browse/FLINK-22147
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Kafka
> Affects Versions: 1.13.0
> Reporter: Qingsheng Ren
> Priority: Major
> Labels: pull-request-available
>
> Currently the logic of partition discovery is: the worker thread checks if
> there's new partitions and initialize new splits if so, then coordinator
> thread marks these splits as pending and try to make assignments.
> Under current design, the worker thread needs to keep an internal data
> structure tracking already discovered partitions, which is duplicated with
> pending splits + assigned partitions tracked by coordinator thread. Usually
> this kind of double-bookkeeping is fragile.
> Another issue is that the worker thread always fetches descriptions of ALL
> topics at partition discovery, which will comes to a problem working with a
> giant Kafka clusters with millions of topics/partitions.
> In order to fix issues above, a refactor is needed for the partition
> discovery logic in Kafka enumerator. Basically the logic can be changed to:
> # The worker thread fetches descriptions of subscribed topics/partitions,
> then hands over to coordinator thread
> # The coordinator thread filters out already discovered partitions (pending
> + assigned partitions), then invokes worker thread with {{callAsync}} to
> fetch offsets for new partitions
> # The worker thread fetches offsets and creates splits for new partitions,
> then hands over new splits to coordinator thread
> # The coordinator thread marks these splits as pending and try to make
> assignment.
> Discussion of this issue can be found in
> [https://github.com/apache/flink/pull/15461] .
--
This message was sent by Atlassian Jira
(v8.3.4#803005)