Great, I'm running the quick start and can see that in operation. Ok, last question on this thread:
> So if you have two consumer groups consuming a topic, and each consumer group
> has 4 machines in it, then a message published to this topic would be
> delivered to one machine in each of the two groups.

How is topic load-balancing for consumers handled? For example, if a consumer group has 4 machines in it (consumer per machine), in reality only one machine in the group is actually working. If I want multiple machines handling items in a topic, how is that approach handled? I could see producers generating more topics, and consumers subscribing to those (making a high-volume topic more granular).

What's best practice when consumer tasks on topic messages need to be handled by multiple consumers?

-Jeff

On Jun 12, 2012, at 11:46 AM, Jay Kreps wrote:

> Basically the rule is this "every message sent to the topic is delivered to
> one machine/process in each consumer group". So if you have two consumer
> groups consuming a topic, and each consumer group has 4 machines in it,
> then a message published to this topic would be delivered to one machine in
> each of the two groups.
>
> -Jay
>
> On Tue, Jun 12, 2012 at 11:34 AM, Rodenburg, Jeff <
> jeff.rodenb...@teamaol.com> wrote:
>
>> Thanks for the info, Jun.
>>
>>> if you just want each message to be consumed by a consumer, not a
>>> particular one
>>
>> What is intended to be a particular consumer? Something on the order of
>> Consumer #3 within a group needs message #123?
>>
>> Ok, next question:
>>
>> What is the relationship between topics and consumer groups? More to the
>> point, can I have multiple consumer groups that all consume the same topic?
>> For example, assume a set of producers are publishing to the topic "ABC".
>> Suppose I have multiple processes that take action on a given ABC message
>> -- process 1 handles billing, process 2 handles file management, process 3
>> handles history/archiving, etc. Can I structure multiple groups that
>> consume the same topic? How does partitioning work at that point?
>>
>> On Jun 12, 2012, at 10:11 AM, Jun Rao wrote:
>>
>>> Jeff,
>>>
>>> Your understanding is correct. Operational wise, we have some jmx that
>>> gives consumer stats per topic. There is also a tool CheckOffsetLag that
>>> tells you how far behind a consumer is. For coordination btw producers and
>>> consumers, if you just want each message to be consumed by a consumer, not
>>> a particular one, there is no coordination needed.
>>>
>>> Thanks,
>>>
>>> Jun
>>>
>>> On Tue, Jun 12, 2012 at 9:58 AM, Rodenburg, Jeff <
>>> jeff.rodenb...@teamaol.com> wrote:
>>>
>>>> Hi all -
>>>>
>>>> Just getting familiar with Kafka, and learning about consumer groups.
>>>> Hoping someone can provide some context here.
>>>>
>>>> As I understand it, consumers register with the broker and consume a
>>>> topic. Multiple consumers can consume a single topic, as a consumer group.
>>>> Each consumer actually gets a partition of messages, so there is no overlap
>>>> -- a single consumer within a group will receive a message on its
>>>> topic/partition. Consumer rebalancing is the process whereby members of a
>>>> consumer group are added and/or dropped from the group, and partitions are
>>>> sorted/reassigned to the current consumer group members.
>>>>
>>>> Some questions:
>>>>
>>>> * Is this accurate? What am I missing?
>>>> * Operationally, is consumer "failover" basically service monitoring at
>>>> the consumer process level?
>>>> * How much coordination is required between producers and consumers
>>>> around partitioning? (Automated, configuration, etc.)
>>>> * How are topics monitored for SLA on throughput/load, i.e. spinning up
>>>> consumers as needed for topic message spikes?
>>>>
>>>> Appreciate any further information and/or context anyone can share.
>>>>
>>>> cheers,
>>>> Jeff
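
For reference, the rule quoted in this thread ("every message sent to the topic is delivered to one machine/process in each consumer group") can be sketched as a toy model in Python. This is only an illustration, not the Kafka client API: the function names, the group and consumer names, the partition count of 8, and the round-robin assignment are all assumptions made for the example, and the real rebalancing algorithm may differ.

```python
def assign_partitions(num_partitions, consumers):
    """Spread a topic's partitions over the consumers of ONE group.

    Round-robin is an assumed strategy for illustration only.
    Every consumer gets an entry, so idle consumers show up as [].
    """
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


def deliver(partition, num_partitions, groups):
    """For a message in `partition`, return the one receiving consumer per group."""
    receivers = {}
    for group_name, consumers in groups.items():
        assignment = assign_partitions(num_partitions, consumers)
        for consumer, owned in assignment.items():
            if partition in owned:
                receivers[group_name] = consumer
    return receivers


# Hypothetical setup: one topic with 8 partitions, two consumer groups.
groups = {
    "billing": ["b1", "b2", "b3", "b4"],
    "archive": ["a1", "a2"],
}

print(assign_partitions(8, groups["billing"]))
print(deliver(5, 8, groups))
```

In this model each of the 4 machines in the "billing" group owns 2 of the 8 partitions, so all of them do work, while any single message still reaches exactly one machine per group. If a group had more consumers than the topic has partitions, the extra consumers would get an empty list in this sketch.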