Hi, I read https://cwiki.apache.org/KAFKA/consumer-group-example.html and have a few questions (I hope not too many).
1. When you say the iterator may block, do you mean that hasNext() may block?

2. "Remember, you can only use a single process per Consumer Group." Do you mean we can only use a single process on one node of the cluster for a consumer group, or that there can be only one process in the whole cluster for a consumer group? Please clarify.

3. Why save the offset to ZooKeeper? Would it be easier to save it to a local file?

4. When a client exits or crashes, or the leader for a partition changes, duplicate messages may be replayed. "To help avoid this (replayed duplicate messages), make sure you provide a clean way for your client to exit instead of assuming it can be 'kill -9'd."
   a. For a client exit: if the client is receiving data at the time, how does it exit cleanly? How can the client tell the consumer to write the offset to ZooKeeper before exiting?
   b. For a client crash: what can the client do to avoid duplicate messages when it is restarted? What I can think of is to read the last message from the log file and ignore the first few duplicate messages received, until the last-read message arrives. But is it possible for the client to read the log file directly?
   c. For a change of partition leader: is there anything clients can do to avoid duplicates?

Thanks.
Libo