Thanks for the feedback, Giridhar! We'll add a comparison with KStreams there as well.
Roughly, the two are similar: the design of Samza certainly influenced what went into Kafka Streams. However, here are some key differences:

- Native support for non-Kafka sources and sinks: Samza has native open-source connectors for various systems like ElasticSearch, AWS Kinesis, Azure EventHubs, and HDFS. This saves cost if you don't want to maintain a second copy of the data just to import it into Kafka.

- Async mode: At LinkedIn, we have observed that jobs are bottlenecked by remote I/O. For this reason, we built native async processing into Samza. As far as I can remember, Samza is the only stream processor that supports this feature (as of early 2017).

- Stability at LinkedIn: We run Samza in production at LinkedIn, and it's battle-tested at scale, powering all of our near-realtime processing use cases. On YARN, Samza supports durable local state and host affinity for instant state recovery. We have improved on this by adding incremental checkpointing.

- Single API and SQL for streaming and batch processing: Samza can run the same code on both batch and streaming sources. We just added SQL support in open source.

PS: Some of this discussion is based on Kartik's and Yi's earlier responses in 2016.

Yi's earlier response:
http://mail-archives.apache.org/mod_mbox/samza-dev/201608.mbox/%3CCAFvExu1KghxR1dN7Awwr70k3b4aMmfBVLhKFjFd2smsUAt3rDg%40mail.gmail.com%3E

Kartik's earlier response:
http://mail-archives.apache.org/mod_mbox/samza-dev/201605.mbox/%3CCACsAj_XZZBohSz7Cf9%3DLO5MDOn2vEzfMrDF6Te%3DwrpeMEab1dQ%40mail.gmail.com%3E

On Thu, Nov 23, 2017 at 10:15 PM, Giridhar Addepalli <giridhar1...@gmail.com> wrote:

> Hi,
>
> Thank you for providing comparison between Samza and Spark Streaming,
> Mupd8, Storm.
> Looks like there is new player in the field : Kafka Streams (
> https://docs.confluent.io/current/streams/index.html).
>
> It will good to have comparison between Samza and Kafka Streams as well.
>
> From high-level it looks like "Samza when used as a library" is similar to
> "Kafka Streams".
>
> Thanks,
> Giridhar.
>

--
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University