Re: are offsets per consumer or per consumer group?
Thanks very much Luke!
Re: are offsets per consumer or per consumer group?
1) In more recent versions of Kafka, the consumer group coordinator runs on the broker. Previously, there was a "high level consumer" that spoke directly to ZooKeeper and did group management within the client libraries, but this is no longer used.

2) That depends on when your consumer commits offsets. The normal case is to commit the offset for a message after that message has been processed. In that case, the next consumer to be assigned that partition would reprocess message 77. The other option is to commit offsets as messages are received but before they are processed. This causes messages to be processed at most once, instead of at least once.

3) The best way to get at-least-once processing is to make sure your client is not configured to automatically commit offsets, and to commit explicitly instead. This way you can be sure commits only happen once the result of processing a message has been durably stored (or whatever needs to happen for your use case). That doesn't necessarily mean you need to commit immediately after each individual message, only that when you commit, it is only for messages that have been completely processed.

On Thu, Feb 8, 2018 at 9:37 AM, Xavier Noria wrote:

> On Thu, Feb 8, 2018 at 4:27 PM, Luke Steensen <luke.steen...@braintreepayments.com> wrote:
>
> > Offsets are maintained per consumer group. When an individual consumer
> > crashes, the consumer group coordinator will detect that failure and
> > trigger a rebalance. This redistributes the partitions being consumed
> > across the available consumer processes, using the most recently committed
> > offset for each as the starting point.
>
> Excellent, the getting started guide uses "consumer" sometimes meaning an
> individual consumer, and sometimes meaning a consumer group. That makes it
> a bit difficult to understand exactly how it works. Thanks for clarifying.
>
> Let me follow up with these questions then:
>
> 1) The group coordinator runs in Kafka? Or is the client library
> responsible for that?
>
> 2) Say that a consumer group has consumers A, B and C, assigned to the 3
> partitions respectively. Consumer A polls and gets messages 75-80, but when
> it is processing message 77 crashes. The coordinator rebalances and assigns
> that partition to one of the other two, but at which offset is that
> partition left?
>
> 3) If the answer is 81, a critical consumer group that cannot miss messages
> is expected to write custom coordination code to avoid missing 77-80? If
> yes, are there best practices out there for doing this?
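The difference between the two commit strategies in answer 2) can be sketched with a small simulation of the 75-80 scenario. This is only an in-memory model, not the Kafka client API: `run_consumer` and its parameters are hypothetical names made up for illustration, and the committed offset is modeled as "the next offset to read".

```python
def run_consumer(messages, committed, crash_at, commit_before_processing):
    """Simulate one poll by a consumer that crashes while processing `crash_at`.

    `messages` is the list of offsets returned by the poll, `committed` is the
    group's committed offset for the partition (the next offset to read).
    Returns (offsets fully processed, committed offset after the run).
    Hypothetical helper for illustration, not part of any Kafka library.
    """
    processed = []
    if commit_before_processing and messages:
        # At-most-once style: commit the whole batch as soon as it is received.
        committed = messages[-1] + 1
    for offset in messages:
        if offset == crash_at:
            # Crash in the middle of this message: it is not fully processed.
            return processed, committed
        processed.append(offset)
        if not commit_before_processing:
            # At-least-once style: commit only after processing has finished.
            committed = offset + 1
    return processed, committed


batch = list(range(75, 81))  # consumer A polls messages 75-80

# Commit after processing: the crash at 77 leaves the committed offset at 77.
after_done, after_commit = run_consumer(batch, 75, 77, False)
# Commit on receipt: the committed offset is already 81 when A crashes.
before_done, before_commit = run_consumer(batch, 75, 77, True)

# The consumer that takes over the partition resumes from the committed offset.
recovered_after, _ = run_consumer(list(range(after_commit, 81)), after_commit, None, False)
recovered_before, _ = run_consumer(list(range(before_commit, 81)), before_commit, None, True)

missed_after = set(batch) - set(after_done) - set(recovered_after)
missed_before = set(batch) - set(before_done) - set(recovered_before)
print("commit after processing: resume at", after_commit, "missed", sorted(missed_after))
print("commit on receipt: resume at", before_commit, "missed", sorted(missed_before))
```

The sketch commits after every message for simplicity. As noted in answer 3), a real consumer could commit less often, as long as each commit covers only messages that have been completely processed.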
Re: are offsets per consumer or per consumer group?
On Thu, Feb 8, 2018 at 4:27 PM, Luke Steensen <luke.steen...@braintreepayments.com> wrote:

> Offsets are maintained per consumer group. When an individual consumer
> crashes, the consumer group coordinator will detect that failure and
> trigger a rebalance. This redistributes the partitions being consumed
> across the available consumer processes, using the most recently committed
> offset for each as the starting point.

Excellent, the getting started guide uses "consumer" sometimes meaning an individual consumer, and sometimes meaning a consumer group. That makes it a bit difficult to understand exactly how it works. Thanks for clarifying.

Let me follow up with these questions then:

1) The group coordinator runs in Kafka? Or is the client library responsible for that?

2) Say that a consumer group has consumers A, B and C, assigned to the 3 partitions respectively. Consumer A polls and gets messages 75-80, but when it is processing message 77 crashes. The coordinator rebalances and assigns that partition to one of the other two, but at which offset is that partition left?

3) If the answer is 81, a critical consumer group that cannot miss messages is expected to write custom coordination code to avoid missing 77-80? If yes, are there best practices out there for doing this?
Re: are offsets per consumer or per consumer group?
Offsets are maintained per consumer group. When an individual consumer crashes, the consumer group coordinator will detect that failure and trigger a rebalance. This redistributes the partitions being consumed across the available consumer processes, using the most recently committed offset for each as the starting point.

On Thu, Feb 8, 2018 at 6:58 AM, Xavier Noria wrote:

> Let's suppose a topic has three partitions and two consumer groups
> listening.
>
> The offset maintained by Kafka in each partition is associated with the
> consumer group? Or with the individual consumer polling from that partition
> in each consumer group respectively?
>
> I am trying to understand the system behavior when listeners crash, but in
> order to formulate more questions I need to double-check that before.
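As a rough mental model of "offsets are maintained per consumer group", the committed offset can be pictured as a value keyed by group, topic and partition, with no individual consumer in the key. The sketch below is a toy in-memory stand-in, not Kafka's actual storage or API; the `OffsetStore` class and the group and topic names are assumptions made up for illustration.

```python
class OffsetStore:
    """Toy stand-in for broker-side committed offsets (not Kafka's API)."""

    def __init__(self):
        # Keyed by (group, topic, partition); no individual consumer in the key.
        self._committed = {}

    def commit(self, group, topic, partition, offset):
        self._committed[(group, topic, partition)] = offset

    def committed(self, group, topic, partition, default=0):
        return self._committed.get((group, topic, partition), default)


store = OffsetStore()
# Two hypothetical groups read the same partition and progress independently.
store.commit("billing", "orders", 0, 81)
store.commit("analytics", "orders", 0, 12)

# After a rebalance, whichever consumer in a group is assigned partition 0
# starts from that group's committed offset, regardless of who committed it.
resume_billing = store.committed("billing", "orders", 0)
resume_analytics = store.committed("analytics", "orders", 0)
print(resume_billing, resume_analytics)
```

Because the consumer identity is not part of the key, a crashed consumer's progress is picked up by whichever group member inherits the partition.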
are offsets per consumer or per consumer group?
Let's suppose a topic has three partitions and two consumer groups listening.

Is the offset maintained by Kafka in each partition associated with the consumer group? Or with the individual consumer polling from that partition in each consumer group?

I am trying to understand the system behavior when listeners crash, but in order to formulate more questions I need to double-check that first.