Hi! By default the Kafka producer picks the partition from the message key: messages with the same key are hashed to the same partition, and messages without a key are spread over the partitions in a round robin fashion. Each of your consumers receives messages only from the partitions allocated to it for the topics it subscribed to, so within one consumer group the consumers read from disjoint sets of partitions. After a rebalance, if there are more consumers than partitions, some of the consumers will not get any data. If one of the consumers dies, its partitions are redistributed among the remaining consumers. With the default range assignment, Kafka sorts the consumers by their ids (lexicographic order) and hands out the partitions in that order.
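To make the allocation concrete, here is a rough sketch of a range-style assignment for a single topic. It is a simplified simulation and not Kafka's actual code; the function name `range_assign` and the consumer ids such as `box1-0` are made up for illustration (real consumer ids look different).

```python
def range_assign(consumer_ids, num_partitions):
    """Simplified range-style assignment for one topic (illustration only)."""
    consumers = sorted(consumer_ids)  # lexicographic order of the ids
    if not consumers:
        return {}
    # Every consumer gets per_consumer partitions; the first `extra`
    # consumers in sorted order get one more.
    per_consumer, extra = divmod(num_partitions, len(consumers))
    assignment = {}
    for i, consumer in enumerate(consumers):
        start = per_consumer * i + min(i, extra)
        count = per_consumer + (1 if i < extra else 0)
        assignment[consumer] = list(range(start, start + count))
    return assignment


# Setup from the question: 2 boxes x 2 threads, topic with 4 partitions.
# Hypothetical ids; each thread ends up with exactly one partition.
print(range_assign(["box1-0", "box1-1", "box2-0", "box2-1"], 4))

# A fifth consumer thread joins: the last id in sorted order stays idle.
print(range_assign(["box1-0", "box1-1", "box2-0", "box2-1", "box3-0"], 4))

# The second box dies: its partitions move to the surviving threads.
print(range_assign(["box1-0", "box1-1"], 4))
```

In all three cases the sets are disjoint and together cover every partition, which is the property asked about.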
I hope this gives you an overview of what happens and answers your questions.

Regards,
florin

On Thu, Jun 30, 2016 at 12:36 AM, Milind Vaidya <kava...@gmail.com> wrote:

> Hi
>
> Background :
>
> I am using a java based multithreaded kafka consumer.
>
> Two instances of this consumer are running on 2 different machines i.e.
> one consumer process per box, and belong to same consumer group.
>
> Internally each process has 2 threads each.
>
> Both the consumer processes consume from same topic "rawlogs" which has 4
> partitions.
>
> Problem :
>
> As per the documentation of consumer group "each message published to a
> topic is delivered to one consumer instance within each subscribing
> consumer group" . But is there any mechanism by which a each consumer
> consumes from disjoint set of partitions too ? or each message from
> whichever partition it is, will be given randomly to one of the consumers ?
>
> In case of rebalance, the partitions may get shuffled among consumers but
> then again they should get divided into 2 disjoint sets one for each
> consumer.