Is it possible to receive duplicate messages from Kafka 0.9.0.1 or 0.10.1.0 
in the following scenario? You have a topic with three partitions and one 
consumer group with three consumer clients. One client stops consuming and 
is taken offline. The clients do not commit offsets immediately; instead, 
offsets are auto-committed after the default auto-commit interval elapses. 
The partition assigned to the client that went down is automatically 
reassigned to another client in the same group.
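
For context, this is the auto-commit behavior I mean, controlled by these 
consumer settings (the values shown are, as I understand it, the defaults 
in these versions):

```properties
# Offsets are committed in the background, not per message
enable.auto.commit=true
# ...and at most once per this interval (milliseconds)
auto.commit.interval.ms=5000
```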

Meanwhile, the client that went down gets some TLC; it still holds some 
messages that were fetched but never fully processed. When it comes back 
up, it happily finishes processing those messages and writes them to HDFS.

Will the second client be handed the uncommitted messages that the first 
client had already received but never committed? That would produce 
duplicate messages in HDFS, which is exactly what we witnessed this week 
when just such a failure occurred.
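
To make the sequence concrete, here is a toy model of the bookkeeping (no 
Kafka client involved; the offsets and function name are invented for 
illustration) showing why at-least-once delivery with lagging auto-commit 
produces exactly this overlap:

```python
# Toy model of the failure sequence: just the offset bookkeeping that
# yields duplicates under at-least-once delivery with auto-commit lag.

def replay_after_rebalance(fetched, committed_through):
    """Offsets the new assignee re-reads after a rebalance: everything
    past the last committed offset, even if the old consumer already
    fetched (but never committed) them."""
    return [m for m in fetched if m > committed_through]

# Consumer A fetches offsets 0..9, but the periodic auto-commit has only
# recorded offset 4 by the time A goes down.
fetched_by_a = list(range(10))
committed = 4

# After the rebalance, consumer B resumes from the committed position.
reread_by_b = replay_after_rebalance(fetched_by_a, committed)

# A still holds 5..9 in memory; when it comes back and finishes writing
# them out, those offsets land in HDFS twice.
held_by_a = [m for m in fetched_by_a if m > committed]
duplicates = sorted(set(reread_by_b) & set(held_by_a))
print(duplicates)  # offsets written by both consumers: [5, 6, 7, 8, 9]
```

If this model is right, the only ways to avoid the duplicates are to 
commit synchronously after processing, or to make the HDFS write 
idempotent.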

Regards,
Ben


