Yes, the code on trunk is slightly ahead of that release; svn log will give the change list. The change we made improved the rebalancing protocol so that in non-failure cases there is no duplication. The duplication isn't really a problem per se--essentially all messaging systems give either "at most once" or "at least once" semantics, and we are the latter. In the event of a hard kill of a consumer process you will still see some duplicate messages, because the process that takes over the partitions from the now-killed consumer will start from the last commit point.
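To make the commit-point behaviour concrete, here is a rough sketch in Python. It is only a toy simulation, not the Kafka consumer code: the function name simulate_handover, the commit interval, and the single-partition setup are all made up for illustration. It counts how many messages get delivered twice when one consumer stops and another resumes from the last committed offset.

```python
def simulate_handover(total, stop_at, commit_every, clean_shutdown):
    """Toy model of one partition being handed from one consumer to another.

    Hypothetical helper for illustration only (not a Kafka API).
    The first consumer processes offsets [0, stop_at) and commits its
    position after every `commit_every` messages. On a clean shutdown it
    also commits its final position; on a hard kill it does not.
    The second consumer resumes from the last committed offset.
    Returns the list of offsets delivered across both consumers.
    """
    committed = 0
    delivered = []

    # First consumer: process messages, committing periodically.
    for offset in range(stop_at):
        delivered.append(offset)
        if (offset + 1) % commit_every == 0:
            committed = offset + 1

    # A clean rebalance lets the consumer commit before giving up the partition.
    if clean_shutdown:
        committed = stop_at

    # Second consumer: start from the last commit point.
    for offset in range(committed, total):
        delivered.append(offset)

    return delivered


def count_duplicates(delivered):
    """Number of deliveries beyond the first for any offset."""
    return len(delivered) - len(set(delivered))


# Hard kill after 57 messages with commits every 10: offsets 50..56 are redelivered.
hard_kill = simulate_handover(total=100, stop_at=57, commit_every=10, clean_shutdown=False)
# Clean rebalance at the same point: nothing is redelivered.
clean = simulate_handover(total=100, stop_at=57, commit_every=10, clean_shutdown=True)

print(count_duplicates(hard_kill), count_duplicates(clean))
```

In both cases every message is delivered at least once; only the hard-kill case delivers some of them more than once, and the number of repeats is bounded by the commit interval.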
-Jay

On Wed, May 16, 2012 at 11:45 PM, navneet sharma <navneetsharma0...@gmail.com> wrote:
> I downloaded the tar from the download link provided on the quickstart page,
> more than a month back.
>
> Is trunk maintaining different code than the tar?
>
> Can the number of partitions cause this problem? I am using 2
> partitions on each of the two brokers.
>
> Thanks,
> Navneet Sharma
>
> On Wed, May 16, 2012 at 10:00 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
>
>> Technically this is the guarantee we provide--at least once delivery.
>> It is very expensive to completely eliminate this possibility in the
>> general case as you need to co-ordinate any state changes the consumer
>> makes with committing the offset that marks the position. But we have
>> improved the common cases for normal rebalancing so if you are using
>> trunk the only time this would happen is when there is a hard crash of
>> a process.
>>
>> -Jay
>>
>> On Wed, May 16, 2012 at 2:41 AM, navneet sharma
>> <navneetsharma0...@gmail.com> wrote:
>> > Hi,
>> >
>> > I tried a scenario wherein:
>> > 1) I had 1 producer and 3 consumers subscribed to a topic, "cartTopic",
>> > all in the same group.
>> > 2) Now, while everything was executing, I introduced another consumer for
>> > the same topic and in the same group. So, overall there are 4 consumers.
>> > 3) Of course, it triggered re-balancing.
>> >
>> > But the final result is that a few messages are duplicated.
>> > In my example run, the producer sent 800,000 records, but the consumers
>> > received 801,448 records.
>> > I am using log4j to generate the output file.
>> >
>> > Is there any reason for the duplication?
>> >
>> > Thanks,
>> > Navneet Sharma