Hi, I have recently been experimenting with stream processing systems: Kafka Streams, Spark Structured Streaming, and Flink.
Spark Structured Streaming has a micro-batch architecture, while Kafka Streams is described as having a full (record-at-a-time) streaming architecture. However, a few days ago I noticed that by default Kafka Streams does not emit results for events immediately; results are forwarded to the output topic roughly every 30 seconds.

So what is the actual difference between micro-batch processing and the full stream processing that Kafka Streams performs? And if I set the Spark Structured Streaming trigger to 30 seconds and the outputMode to update, would I notice any difference in behavior? Does anyone have an example of a data stream that would show the differences?

Kind regards,
Krzysztof
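For context, here is roughly the kind of configuration I am talking about. If I understand the Kafka Streams documentation correctly, the 30-second flushing comes from the record cache (`cache.max.bytes.buffering`) together with `commit.interval.ms`, whose default is 30000 ms. This is only a sketch using plain `java.util.Properties` (the `application.id` and `bootstrap.servers` values are placeholders), showing how one could disable the cache to get per-record output:

```java
import java.util.Properties;

public class StreamsConfigSketch {

    public static Properties lowLatencyConfig() {
        Properties props = new Properties();

        // Required basics (placeholder values, not a real deployment).
        props.put("application.id", "latency-test-app");
        props.put("bootstrap.servers", "localhost:9092");

        // Disable the record cache so every updated aggregate is forwarded
        // downstream immediately instead of being buffered until a flush.
        props.put("cache.max.bytes.buffering", "0");

        // Commit (and therefore flush) far more often than the 30 s
        // default of 30000 ms.
        props.put("commit.interval.ms", "100");

        return props;
    }

    public static void main(String[] args) {
        Properties p = lowLatencyConfig();
        System.out.println("cache.max.bytes.buffering=" + p.getProperty("cache.max.bytes.buffering"));
        System.out.println("commit.interval.ms=" + p.getProperty("commit.interval.ms"));
    }
}
```

These properties would be passed to `new KafkaStreams(topology, props)`. With the cache disabled, each input record should produce an output record right away, which I would expect to make Kafka Streams' per-record behavior easier to compare against Spark with a 30-second `Trigger.ProcessingTime`.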
