Thanks Robert. Do you mind pointing me to the code that shuffled and assigned to the workers? I trace through the code where it loops through the topic partitions and for each, create a KafkaSourceDescriptor and calls ReadFromKafkaDoFn.initialRestriction(). But I couldn't find out where it got shuffled and assigned to the workers.
Thank you very much. Antonio. On Mon, Nov 22, 2021 at 5:05 PM Robert Bradshaw <rober...@google.com> wrote: > The source code can be found at > > https://github.com/apache/beam/blob/master/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaIO.java > > In a nutshell, a full set of (topic, partionIndex) pairs gets shuffled > and assigned to the workers randomly, each of which gets processed by > a SplittableDoFn that emits the actual data on that topic/partition. > > On Mon, Nov 22, 2021 at 12:07 PM Antonio Si <antonio...@gmail.com> wrote: > > > > Hello Beam community, > > > > I am experimenting with the new SDF KafkaIO in a bit more detail. I have > a quick question. How are the topics and partitions got assigned to each > Task Manager? > > > > Can someone point me to the code? > > > > Thanks in advance. > > > > Antonio. >