Writing the streams into some sink (preferably a fault-tolerant, exactly-once sink; see the docs) and then joining the results is definitely a possible way. But you will likely incur higher latency. If you want lower latency, then stream-stream joins are the best approach, which we are working on right now; Spark 2.3 is expected to include them.
I have streams of data coming in from various applications through Kafka. These streams are converted into DataFrames in Spark. I would like to join these DataFrames on a common ID they all contain.
Since joining streaming DataFrames is currently not supported, what is the currently recommended approach?