I'm getting started with Samza and have a question about concurrency. I'd like 
to confirm my understanding of how concurrency works with the Samza event loop 
and StreamTasks.

http://samza.incubator.apache.org/learn/documentation/0.7.0/container/event-loop.html

Assuming a (Kafka) inbound stream for a task has multiple partitions, the 
TaskRunner will set up a consumer for each partition in a separate thread. 
However, the messages from these consumers are funneled into a single message 
queue managed by the event loop. As a result, only one message is processed at 
a time across all StreamTask instances in the container. In other words, 
StreamTasks will never process separate messages concurrently. 
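
To make the model I have in mind concrete, here is a rough sketch in plain Java. 
It is not Samza code and uses no Samza APIs: the class name, the run and process 
methods, and the message strings are all hypothetical, and it only illustrates 
my assumption that several consumer threads feed one queue that a single loop 
thread drains.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical illustration only; none of these names come from Samza.
public class SingleQueueEventLoopSketch {

    // Starts one producer thread per partition, all writing to one shared queue,
    // then drains the queue on the calling thread. Returns the names of the
    // threads that actually ran process().
    static Set<String> run(int partitions, int perPartition) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        List<Thread> consumers = new ArrayList<>();

        for (int p = 0; p < partitions; p++) {
            final int partition = p;
            Thread t = new Thread(() -> {
                // Stand-in for a per-partition consumer fetching messages.
                for (int i = 0; i < perPartition; i++) {
                    queue.add("partition-" + partition + "/msg-" + i);
                }
            }, "consumer-" + p);
            consumers.add(t);
            t.start();
        }

        Set<String> processingThreads = ConcurrentHashMap.newKeySet();
        int total = partitions * perPartition;
        try {
            // The "event loop": one thread takes one message at a time.
            for (int handled = 0; handled < total; handled++) {
                String message = queue.take();
                process(message, processingThreads);
            }
            for (Thread t : consumers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining queue", e);
        }
        return processingThreads;
    }

    // Stand-in for a StreamTask's per-message work; records the calling thread.
    static void process(String message, Set<String> seen) {
        seen.add(Thread.currentThread().getName());
    }

    public static void main(String[] args) {
        Set<String> threads = run(4, 100);
        System.out.println("threads that processed messages: " + threads.size());
    }
}
```

In this sketch, fetching happens on four threads but process() only ever runs on 
the one thread that drains the queue, which is the behavior I believe I am 
seeing described in the event loop documentation.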

If my understanding is correct, is there a way to have Samza process messages 
concurrently across StreamTasks for the job?

Thanks!
James
