Is it possible to receive duplicate messages from Kafka 0.9.0.1 or 0.10.1.0 in the following scenario? A topic has three partitions, and one consumer group has three consumer clients. One client stops consuming and is taken offline. The clients do not commit offsets immediately; instead, offsets are committed automatically after the default auto-commit interval. The partition assigned to the client that goes down is automatically reassigned to another client in the same group.
Meanwhile, the client that went down gets some TLC; it still holds some messages that were fetched but never fully processed. When it comes back up, it happily finishes processing that data and writes it to HDFS. Will the second client be handed the uncommitted messages that the first client had already received but never committed? That would result in duplicate messages on HDFS, which is exactly what we witnessed this week when just such a thing happened.

Regards,
Ben
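The scenario above can be sketched with a small simulation (plain Python, not the Kafka client; all names are illustrative). The key assumption, which matches Kafka's at-least-once semantics with auto-commit enabled, is that the broker only remembers the last *committed* offset per partition, so after a rebalance the new assignee re-fetches everything after that offset, including records the old client had already fetched:

```python
# Toy model of one partition: the broker tracks only the committed offset,
# so a rebalance replays anything fetched after the last commit.

def consume(log, committed_offset, fetch_count):
    """Fetch up to `fetch_count` records starting at the committed offset."""
    return log[committed_offset:committed_offset + fetch_count]

log = ["m0", "m1", "m2", "m3", "m4"]  # records in the partition
committed = 0

# Client A polls five records but goes down after processing two,
# before the auto-commit interval fires, so `committed` stays 0.
client_a = consume(log, committed, 5)
processed_by_a = client_a[:2]          # later written to HDFS on recovery

# The partition is reassigned; client B resumes from the committed offset.
client_b = consume(log, committed, 5)

duplicates = sorted(set(processed_by_a) & set(client_b))
print(duplicates)  # -> ['m0', 'm1'] : both clients deliver m0 and m1
```

Under these assumptions the answer to the question is yes: with auto-commit, anything processed between the last commit and the rebalance is delivered to both clients, and deduplication has to happen downstream (e.g. before or during the HDFS write).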