Thanks Robert. Do you mind pointing me to the code that shuffled and
assigned to the workers?
I trace through the code where it loops through the topic partitions and
for each, create a KafkaSourceDescriptor
and calls ReadFromKafkaDoFn.initialRestriction(). But I couldn't find out
where it got shuffled and assigned to the workers.

Thank you very much.

Antonio.

On Mon, Nov 22, 2021 at 5:05 PM Robert Bradshaw <rober...@google.com> wrote:

> The source code can be found at
>
> https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java
>
> In a nutshell, a full set of (topic, partionIndex) pairs gets shuffled
> and assigned to the workers randomly, each of which gets processed by
> a SplittableDoFn that emits the actual data on that topic/partition.
>
> On Mon, Nov 22, 2021 at 12:07 PM Antonio Si <antonio...@gmail.com> wrote:
> >
> > Hello Beam community,
> >
> > I am experimenting with the new SDF KafkaIO in a bit more detail. I have
> a quick question. How are the topics and partitions got assigned to each
> Task Manager?
> >
> > Can someone point me to the code?
> >
> > Thanks in advance.
> >
> > Antonio.
>

Reply via email to