Re: Message lost after consumer crash in kafka 0.9

2016-02-02 Thread Guozhang Wang
It is indeed weird that if you kill -9 before the first commit, then there is no data loss. But with what I suspect, you could get data loss in the middle, not only of the last messages. Since once consumer1 is killed, consumer2 will take over the partitions assigned to consumer1 and resume from the committed offsets…
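A hedged sketch of the hand-off Guozhang describes, using the 0.9 new consumer API in Scala to match the gist. The listener class name is illustrative, not from the thread; a kill -9'd consumer never gets the revocation callback, so the surviving consumer resumes from whatever was last committed.

    import java.util.{Collection => JCollection}
    import org.apache.kafka.clients.consumer.{ConsumerRebalanceListener, KafkaConsumer}
    import org.apache.kafka.common.TopicPartition
    import scala.collection.JavaConverters._

    // Wired in via consumer.subscribe(topics, listener).
    class HandoffListener(consumer: KafkaConsumer[String, String])
        extends ConsumerRebalanceListener {

      // Runs on a live consumer before its partitions are reassigned.
      // A crashed consumer skips this, so records it consumed after its
      // last commit are re-delivered to the new owner.
      override def onPartitionsRevoked(partitions: JCollection[TopicPartition]): Unit =
        consumer.commitSync() // commit what this consumer has processed so far

      // Runs on the surviving consumer after it takes over partitions.
      override def onPartitionsAssigned(partitions: JCollection[TopicPartition]): Unit =
        partitions.asScala.foreach { tp =>
          // position() here reflects the group's last committed offset
          // (or the auto.offset.reset position if nothing was committed).
          println(s"resuming $tp at offset ${consumer.position(tp)}")
        }
    }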

Re: Message lost after consumer crash in kafka 0.9

2016-02-02 Thread Han JU
Sorry, in fact the test code in the gist does not exactly reproduce the problem we're facing. I'm working on that. 2016-02-02 10:46 GMT+01:00 Han JU : > Thanks Guozhang for the reply! > > So in fact, if it's the case you described, then if I understand correctly the > messages lost should be the last messages…

Re: Message lost after consumer crash in kafka 0.9

2016-02-02 Thread Han JU
Thanks Guozhang for the reply! So in fact, if it's the case you described, then if I understand correctly the messages lost should be the last messages. But in our use case it is not the last messages that get lost. And this does not explain the different behavior depending on the `kill -9` moment (before or after the first commit)…

Re: Message lost after consumer crash in kafka 0.9

2016-02-01 Thread Guozhang Wang
One thing to add is that by doing this you could possibly get duplicates, but not data loss, which obeys Kafka's at-least-once semantics. Guozhang On Mon, Feb 1, 2016 at 3:17 PM, Guozhang Wang wrote: > Hi Han, > > I looked at your test code and actually the error is in this line: > https://gist.github.com/darkjh/437ac72cdd4b1c4ca2e7#file-kafkabug2-scala-L61 …
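For concreteness, a sketch of the "commit only what you processed" variant this at-least-once guarantee implies (the helper name is mine, not from the gist): committing offset + 1 of the last fully processed record means a crash between processing and commit causes a replay, never a loss.

    import java.util.Collections
    import org.apache.kafka.clients.consumer.{ConsumerRecord, KafkaConsumer, OffsetAndMetadata}
    import org.apache.kafka.common.TopicPartition

    object ProcessedOffsets {
      // Commit the position *after* the last fully processed record.
      // If the process dies before this commit, that record is simply
      // re-delivered on restart: a duplicate, not a loss.
      def commitProcessed(consumer: KafkaConsumer[String, String],
                          last: ConsumerRecord[String, String]): Unit = {
        val tp = new TopicPartition(last.topic, last.partition)
        consumer.commitSync(
          Collections.singletonMap(tp, new OffsetAndMetadata(last.offset + 1)))
      }
    }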

Re: Message lost after consumer crash in kafka 0.9

2016-02-01 Thread Guozhang Wang
Hi Han, I looked at your test code and actually the error is in this line: https://gist.github.com/darkjh/437ac72cdd4b1c4ca2e7#file-kafkabug2-scala-L61 where you call "commitSync" in the finally block, which will commit offsets for the messages that were returned to you from the poll() call. More specifically…
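To make the failure mode concrete, here is an illustrative reconstruction of the pattern being criticized (the topic name and the process function are placeholders; the gist's actual code is not reproduced in this archive):

    import java.util.Arrays
    import org.apache.kafka.clients.consumer.{ConsumerRecord, KafkaConsumer}
    import scala.collection.JavaConverters._

    object FinallyCommitBug {
      def run(consumer: KafkaConsumer[String, String],
              process: ConsumerRecord[String, String] => Unit): Unit = {
        consumer.subscribe(Arrays.asList("test-topic")) // hypothetical topic
        try {
          while (true) {
            val records = consumer.poll(1000L)
            records.asScala.foreach(process) // may throw partway through the batch
          }
        } finally {
          // BUG: runs even when process() threw. A no-argument commitSync()
          // commits the position after the *whole* last poll() batch,
          // including records that were never successfully processed --
          // on restart the group resumes past them, so they are lost.
          consumer.commitSync()
          consumer.close()
        }
      }
    }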

Message lost after consumer crash in kafka 0.9

2016-02-01 Thread Han JU
Hi, One of our uses of Kafka is to tolerate arbitrary consumer crashes without losing or duplicating messages. So in our code we manually commit the offset after the consumer state has been successfully persisted. While prototyping with kafka-0.9's new consumer API, I found that in some cases, kafka failed to send…
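A minimal sketch of the setup Han describes, assuming the 0.9 new consumer API (the broker address, group id, topic, and persistState are placeholders): auto-commit is disabled, and the offset is committed only after the consumer state is durably persisted.

    import java.util.{Arrays, Properties}
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import scala.collection.JavaConverters._

    object StatefulConsumer {
      def main(args: Array[String]): Unit = {
        val props = new Properties()
        props.put("bootstrap.servers", "localhost:9092") // placeholder address
        props.put("group.id", "state-persisting-group")  // placeholder group
        props.put("enable.auto.commit", "false")         // offsets committed by hand
        props.put("key.deserializer",
          "org.apache.kafka.common.serialization.StringDeserializer")
        props.put("value.deserializer",
          "org.apache.kafka.common.serialization.StringDeserializer")

        val consumer = new KafkaConsumer[String, String](props)
        consumer.subscribe(Arrays.asList("events"))      // placeholder topic
        while (true) {
          val records = consumer.poll(1000L)
          records.asScala.foreach(r => persistState(r.value)) // persist first...
          consumer.commitSync()                               // ...then commit
        }
      }

      def persistState(v: String): Unit = () // stand-in for real durable storage
    }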