With compression enabled (as you have) it is possible for a consumer to see duplicates during a rebalance. This is because iteration may be in the middle of a compressed message set just before a rebalance, but the checkpointed offsets are at MessageSet boundaries. However, this would only happen during a rebalance - i.e., in steady state, when the number of consumers and partitions does not change, you shouldn't see duplicates.
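Since delivery is effectively at-least-once across rebalances, a common workaround is to deduplicate on the consumer side using a unique per-message ID. Below is a minimal sketch of such a dedup cache; it is not tied to the Kafka 0.7 consumer API, and the existence of a message ID field and the cache size are assumptions for illustration only:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal consumer-side deduplication sketch. Assumes each message carries a
// unique ID (e.g., one assigned by the producer); the cache bound is arbitrary.
public class Deduplicator {
    private static final int MAX_ENTRIES = 100_000;

    // LRU set of recently seen message IDs, bounded so memory stays constant.
    private final Map<String, Boolean> seen =
        new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > MAX_ENTRIES;
            }
        };

    /** Returns true the first time an ID is observed, false for duplicates. */
    public synchronized boolean firstTime(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }
}
```

If duplicates only appear around rebalances, the cache only needs to be large enough to cover the window of messages between two offset checkpoints.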
Joel

On Fri, Nov 16, 2012 at 10:02 AM, Mark Grabois <mark.grab...@trendrr.com> wrote:
> https://gist.github.com/4089354.git
> https://gist.github.com/4089369.git
>
> On Fri, Nov 16, 2012 at 12:59 PM, Mark Grabois <mark.grab...@trendrr.com> wrote:
> > Hi all,
> >
> > I'm encountering a problem where I'm getting far too many duplicate
> > messages being sent to my kafka setup (statically, using broker.list),
> > being picked up by zk-based consumers.
> >
> > I've provided my test classes here and the kafka/zk versions I'm using to
> > run them and my servers:
> >
> > *client side*:
> > producer: git://gist.github.com/4089354.git
> > consumer: git://gist.github.com/4089369.git
> > *jars*:
> > kafka-0.7.2
> >
> > *server side*:
> > 5 kafka servers, zk servers on 3 of those
> > 1 partition per test topic per server
> > *server versions*:
> > kafka-0.7.1
> > zookeeper-3.4.3
> >
> > Any advice would be greatly appreciated.
> >
> > Thank you,
> > Mark