1) In more recent versions of Kafka, the consumer group coordinator runs on the broker. Previously, there was a "high level consumer" that spoke directly to ZooKeeper and did group management within the client libraries, but this is no longer used.

2) That depends on when your consumer commits offsets. The normal case is to commit the offset for a message after that message has been processed. In that case, the next consumer to be assigned that partition would start from the last committed offset and reprocess message 77 (along with any earlier messages from that batch whose offsets had not been committed yet). The other option is to commit offsets as messages are received but before they are processed. This causes messages to be processed at most once, instead of at least once.

3) The best way to get at-least-once processing is to make sure your client is not configured to automatically commit offsets, and to commit explicitly instead. This way you can be sure commits only happen once the result of processing a message has been durably stored (or whatever needs to happen for your use case). That doesn't necessarily mean you need to commit immediately after each individual message, only that when you do commit, it covers only messages that have been completely processed.

On Thu, Feb 8, 2018 at 9:37 AM, Xavier Noria <f...@hashref.com> wrote:

> On Thu, Feb 8, 2018 at 4:27 PM, Luke Steensen <
> luke.steen...@braintreepayments.com> wrote:
>
> > Offsets are maintained per consumer group. When an individual consumer
> > crashes, the consumer group coordinator will detect that failure and
> > trigger a rebalance. This redistributes the partitions being consumed
> > across the available consumer processes, using the most recently
> > committed offset for each as the starting point.
> >
>
> Excellent, the getting started guide uses "consumer" sometimes meaning an
> individual consumer, and sometimes meaning a consumer group. That
> difficults a bit understanding how it works with exactitude. Thanks for
> clarifying.
>
> Let me followup with these questions then:
>
> 1) The group coordinator runs in Kafka? Or is the client library
> responsible for that?
>
> 2) Say that a consumer group has consumers A, B and C, assigned to the 3
> partitions respectively. Consumer A polls and gets messages 75-80, but when
> it is processing message 77 crashes. The coordinator rebalances and assigns
> that partition to some of the other two, but at which offset is that
> partition left?
>
> 3) If the answer is 81, a critical consumer group that cannot miss messages
> is expected to write custom coordination code to avoid missing 77-80? If
> yes, are there best practices out there for doing this?
>
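
To make the difference between the two commit strategies concrete, here is a toy Python simulation of the scenario from the question (a batch of offsets 75-80, with a crash while processing 77). It is only an in-memory sketch, not the Kafka client API: run_consumer, simulate, and commit_on_receive are names made up for illustration, and it commits after every single message purely to keep the example short.

```python
def run_consumer(start, end, crash_at=None, commit_on_receive=False):
    """Consume offsets start..end-1 and return (processed, committed).

    `committed` follows Kafka's convention: it is the next offset to read.
    Hypothetical helper for illustration only; not part of any Kafka client.
    """
    batch = list(range(start, end))
    processed = []
    committed = start
    if commit_on_receive and batch:
        # At-most-once style: commit the whole batch before processing it.
        committed = end
    for offset in batch:
        if offset == crash_at:
            # The process dies here: this message is neither finished
            # nor committed, and nothing after it is touched.
            break
        processed.append(offset)
        if not commit_on_receive:
            # At-least-once style: commit only after processing completes.
            committed = offset + 1
    return processed, committed


def simulate(commit_on_receive):
    """Consumer A crashes at 77; consumer B resumes from the committed offset."""
    done_a, committed = run_consumer(
        75, 81, crash_at=77, commit_on_receive=commit_on_receive
    )
    done_b, _ = run_consumer(committed, 81, commit_on_receive=commit_on_receive)
    return done_a + done_b


at_least_once = simulate(commit_on_receive=False)
at_most_once = simulate(commit_on_receive=True)

print(at_least_once)  # [75, 76, 77, 78, 79, 80]
print(at_most_once)   # [75, 76]
```

When offsets are committed after processing, the second consumer resumes at 77 and nothing is lost. When they are committed on receipt, it resumes at 81 and 77-80 are never processed. With a real client, the corresponding setup is to turn off enable.auto.commit and call the client's commit method (for example commitSync() in the Java consumer) only once processing has finished.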