Hello,

As explained in the following docs:
http://samza.incubator.apache.org/learn/documentation/0.7.0/introduction/architecture.html

"The input topic is partitioned using Kafka. Each Samza process reads messages from one or more of the input topic's partitions, and emits them back out to a different Kafka topic keyed by the message's member ID attribute."

In the example above, the task will create many topics keyed by the message's member ID attribute. If there are millions of intermediate keys, how does Samza handle Kafka's limitations on the number of topics?
(Ref: http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic)

Best regards,
Stone
