Hi,
I have a very simple DAG on my dataflow job. (KafkaIO->Filter->WriteGCS).
When I submit this Dataflow job per topic it has 4kps per instance
processing speed. However I want to consume two different topics in one DF
job. I used TupleTag. I created TupleTags per message type. Each topic has
different message types and also needs different filters. So my pipeline
turned to below DAG. Message Extractor is a very simple step checking
header of kafka messages and writing the correct TupleTag. However after
starting to use this new DAG, dataflow canprocess 2kps per instance.
|--->Filter1-->WriteGCS
KafkaIO->MessageExtractor-> |
|--->Filter2-->WriteGCS
Do you have any idea why my data process speed decreased ?
Thanks