Hi Guozhang,

If I use the high-level consumer, how do I ensure all data goes to the master even while the slave is up and running? Is it just by forcing the master to have enough consumer threads to cover the maximum number of partitions of a topic, since the high-level consumer has no notion of which consumers are masters and which are slaves?
For example, master A starts enough threads to cover all the partitions. Slave B is on standby with the same consumer group and the same number of threads, but since master A's threads already cover all the partitions, slave B won't get any data. Suddenly master A goes down, slave B becomes the new master, and it starts to get data based on the high-level consumer's rebalance design. After that, old master A comes back up and becomes the slave. Will A get data? Or will A not get data, because B already has enough threads to cover all partitions under the rebalancing logic?

Thanks,

Weide

On Fri, Aug 1, 2014 at 4:45 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> Hello Weide,
>
> That should be doable via the high-level consumer; you can take a look at this
> page:
>
> https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
>
> Guozhang
>
>
> On Fri, Aug 1, 2014 at 3:20 PM, Weide Zhang <weo...@gmail.com> wrote:
>
> > Hi,
> >
> > I have a use case for a master-slave cluster where the logic inside the
> > master needs to consume data from Kafka and publish some aggregated data
> > to Kafka again. When the master dies, the slave needs to take the latest
> > committed offset from the master and continue consuming the data from
> > Kafka and doing the push.
> >
> > My question is: what would be the easiest Kafka consumer design for this
> > scenario to work? I was thinking about using SimpleConsumer and doing
> > manual consumer offset syncing between master and slave. That seems to
> > solve the problem, but I was wondering if it can be achieved by using the
> > high-level consumer client?
> >
> > Thanks,
> >
> > Weide
>
>
> --
> -- Guozhang
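[For reference, a minimal sketch of the high-level consumer setup the thread refers to, modeled on the Consumer Group Example linked above. The topic name, group.id, ZooKeeper address, and thread count below are hypothetical placeholders; the idea is that master and slave would each run the same code with the same group.id, and the consumer-group rebalance decides which instance's threads actually own partitions at any moment.]

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;

    public class FailoverConsumer {
        public static void main(String[] args) {
            // Both master and slave run this same code with the same group.id;
            // whichever instance's threads get partitions assigned receives the data.
            Properties props = new Properties();
            props.put("zookeeper.connect", "localhost:2181"); // placeholder ZooKeeper address
            props.put("group.id", "aggregation-group");       // same group on master and slave
            props.put("auto.commit.enable", "true");

            ConsumerConnector consumer =
                    Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            String topic = "events";  // placeholder topic name
            int numThreads = 4;       // should be >= number of partitions to cover them all

            Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
            topicCountMap.put(topic, numThreads);

            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                    consumer.createMessageStreams(topicCountMap);

            // One worker thread per stream; idle threads simply receive no messages.
            ExecutorService executor = Executors.newFixedThreadPool(numThreads);
            for (final KafkaStream<byte[], byte[]> stream : streams.get(topic)) {
                executor.submit(new Runnable() {
                    public void run() {
                        ConsumerIterator<byte[], byte[]> it = stream.iterator();
                        while (it.hasNext()) {
                            byte[] message = it.next().message();
                            // aggregate / process the message here
                        }
                    }
                });
            }
        }
    }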