Hello Kafka community, I have a design in mind that I could implement as 3 Kafka Streams pipelines, and I am wondering whether it can be done with fewer pipelines while keeping the same guarantees:
I have 3 compacted topics, t1, t2 and t3, where t2 is the many-to-many link between t1 and t3. The same applies to t4-t6. I need to replicate t1-t3 into t4-t6 with a slightly different domain, moving a column (an Avro attribute) from t1 to t6 to adapt the domains.

With multiple Kafka Streams pipelines I could do the following. I would create 2 or 3 intermediary topics:

- a topic keyed by the t1 key, which receives copies of events from both t1 and t2, with both message types keyed by the t1 key (kafka-streams-1, which merges the 2 streams from t1 and t2 with slight re-keying)
- a topic keyed by the t3 key, which receives copies of events from both t3 and t2, with both message types keyed by the t3 key (kafka-streams-2, which merges the 2 streams from t3 and t2 with slight re-keying)

So far this gives me 2 joins. I know joins exist in Kafka Streams, but with these merge-based joins I can see why they would be linearizable: the messages from t1 and t2 are sent to a common pipe with a total order between all messages keyed by any particular t1 key, so the state processing is linearizable at the level of each t1 entity (covering both t1 messages and t2 messages).

The outputs of these two pipelines now have the same key: the t2 key (the t1 and t3 keys combined). I can join them in turn by sending both to a 3rd topic, again using the join-via-merge strategy. As I understand it, this will not lose combinations because it allows no concurrency: all the messages keyed by a specific t2 key (a specific t1+t3 key combination) are read in order and applied in a single-threaded, linearizable fashion (kafka-streams-3).

So from these 2+1 pipelines I can produce output back to t6, rewriting the t6 records with records that contain a new value taken from t1.

Does this make sense? Do you think it is doable with fewer pipelines? Would using joins instead of these merges give the same guarantee of single-threaded processing across topics? I suspect not.
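The kafka-streams-1 step described above can be modelled in plain Java, as a rough sketch only. It uses no Kafka dependency. The class `MergeSketch`, the `Event` shape, the `|` separator for the combined t2 key, and the single string attribute are all hypothetical names chosen for illustration. It assumes the merged topic delivers every event for a given t1 key in one total order, and it ignores tombstones and link deletions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of kafka-streams-1: the merged log is consumed one event
// at a time, so all state changes for a t1 key are applied sequentially.
final class MergeSketch {

    // One event on the merged intermediate topic, keyed by the t1 key.
    // Exactly one of `attribute` (from t1) or `t3Key` (from a t2 link) is set.
    static final class Event {
        final String t1Key;
        final String attribute;
        final String t3Key;

        private Event(String t1Key, String attribute, String t3Key) {
            this.t1Key = t1Key;
            this.attribute = attribute;
            this.t3Key = t3Key;
        }

        static Event fromT1(String t1Key, String attribute) {
            return new Event(t1Key, attribute, null);
        }

        static Event fromT2(String t1Key, String t3Key) {
            return new Event(t1Key, null, t3Key);
        }
    }

    // Per-t1-key state: latest attribute value and the known links to t3 keys.
    private final Map<String, String> attributeByT1 = new HashMap<>();
    private final Map<String, Set<String>> linksByT1 = new HashMap<>();

    // Applies one event and returns outputs keyed by the combined t2 key,
    // each carrying the attribute taken from t1.
    List<Map.Entry<String, String>> process(Event e) {
        List<Map.Entry<String, String>> out = new ArrayList<>();
        if (e.attribute != null) {
            // t1 update: remember the value and re-emit it for every known link.
            attributeByT1.put(e.t1Key, e.attribute);
            for (String t3Key : linksByT1.getOrDefault(e.t1Key, Set.of())) {
                out.add(Map.entry(e.t1Key + "|" + t3Key, e.attribute));
            }
        } else {
            // t2 link: remember it and emit only if the t1 value is already known.
            linksByT1.computeIfAbsent(e.t1Key, k -> new LinkedHashSet<>()).add(e.t3Key);
            String attribute = attributeByT1.get(e.t1Key);
            if (attribute != null) {
                out.add(Map.entry(e.t1Key + "|" + e.t3Key, attribute));
            }
        }
        return out;
    }

    // Replays a merged log in order, single-threaded, and collects all outputs.
    static List<Map.Entry<String, String>> run(List<Event> mergedLog) {
        MergeSketch state = new MergeSketch();
        List<Map.Entry<String, String>> all = new ArrayList<>();
        for (Event e : mergedLog) {
            all.addAll(state.process(e));
        }
        return all;
    }

    public static void main(String[] args) {
        List<Event> mergedLog = List.of(
                Event.fromT2("a", "x"),    // link arrives before the t1 value
                Event.fromT1("a", "red"),  // t1 value arrives, link a|x is emitted
                Event.fromT2("a", "y"));   // later link sees the stored value
        run(mergedLog).forEach(System.out::println);
    }
}
```

In this model a link that arrives before its t1 value is not lost: it is stored, and the combination is emitted once the t1 event shows up. That is the property the merge-based approach relies on, since both event types pass through the same ordered log and the same state.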
Thank you,
--
Dumitru-Nicolae Marasoui
Software Engineer, kaluza.com <https://www.kaluza.com/>
