Hi,

> I'd assume this is because Kafka Streams is positioned for building streaming applications, rather than doing analytics, whereas Spark is more often used for analytics purposes.
Well, that is not necessarily the full picture. Spark can do both analytics and streaming, especially with Spark Structured Streaming. Structured Streaming is the Apache Spark API that lets you express a computation on streaming data *in the same way you express a batch computation on static data.* That is the strength of Spark. Spark supports Java, Scala and Python, among others; Python, or more specifically PySpark, is particularly popular with data science as well as conventional analytics.

Structured Streaming Programming Guide - Spark 3.1.1 Documentation (apache.org)
<https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html>

There are two relevant operations in Spark Structured Streaming: the *foreach* and *foreachBatch* operations let you apply arbitrary operations and custom write logic to the output of a streaming query. They have slightly different use cases: while *foreach* allows custom write logic on every row, *foreachBatch* allows arbitrary operations and custom logic on the output of each micro-batch.

HTH

view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Wed, 28 Apr 2021 at 20:12, Andrew Otto <o...@wikimedia.org> wrote:

> I'd assume this is because Kafka Streams is positioned for building
> streaming applications, rather than doing analytics, whereas Spark is more
> often used for analytics purposes.
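To make the foreach/foreachBatch distinction concrete, here is a minimal PySpark sketch. The sink format, output path, the built-in "rate" test source, and the handler names (`write_batch`, `process_row`) are illustrative assumptions, not from the mail above; the pyspark import is deferred into the `__main__` guard so the batch handler can be exercised without a Spark installation.

```python
# Sketch (illustrative, not a drop-in solution): contrasting foreach and
# foreachBatch in PySpark Structured Streaming.

def write_batch(batch_df, epoch_id):
    """foreachBatch handler: receives each micro-batch as an ordinary
    DataFrame, so any existing batch write path (Parquet, JDBC, ...) can
    be reused. Sink format and path here are hypothetical."""
    (batch_df.write
        .format("parquet")
        .mode("append")
        .save("/tmp/stream_out"))


def process_row(row):
    """foreach handler: invoked once per output row, for fully custom
    per-row write logic (e.g. posting each row to an external service)."""
    print(row)


if __name__ == "__main__":
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("foreach-demo").getOrCreate()
    stream = spark.readStream.format("rate").load()  # built-in test source

    # Micro-batch style: arbitrary operations on each batch DataFrame.
    q1 = stream.writeStream.foreachBatch(write_batch).start()

    # Row-at-a-time style: custom logic applied to every row.
    q2 = stream.writeStream.foreach(process_row).start()

    q1.awaitTermination()
```

In practice foreachBatch is the more common choice when the target system already has a batch writer, because the micro-batch DataFrame supports the full batch write API; foreach is reserved for truly row-level side effects.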