Just did some testing.It seems that the rebalance will occur upon *zookeeper.session.timeout.ms <http://zookeeper.session.timeout.ms>. * *So yes, if one thread died, the left over messages will be picked up by other threads.*
On Thu, Aug 7, 2014 at 5:36 PM, Chen Wang <chen.apache.s...@gmail.com> wrote: > Guozhang, > Just to make it clear: > If I have 10 threads with the same consumer group id, read the topic T. > The auto commit is turned off, and commitOffset is called only when the > message is processed successfully. > If thread 1 dies when processing message from partition P1, and the last > offset is Offset1. Then kafka will ensure that one of the other running 9 > threads will automatically pick up the message on partition P1 from Offset1 > ? will the thread have the risk of reading the same message more than once? > > Also I would assume commit offset for each message is a bit heavy. What > you guys usually do for error handling during reading kafka? > Thanks much! > Chen > > > > On Thu, Aug 7, 2014 at 5:18 PM, Guozhang Wang <wangg...@gmail.com> wrote: > >> Yes, in that case you can turn of auto commit and call commitOffsets >> manually after processing is finished. commitOffsets() will only write the >> offset of the partitions that the consumer is currently fetching, so there >> is no need to coordinate this operation. >> >> >> On Thu, Aug 7, 2014 at 5:03 PM, Chen Wang <chen.apache.s...@gmail.com> >> wrote: >> >> > But with the auto commit turned on, I am risking off losing the failed >> > message, right? should I turn off the auto commit, and only commit the >> > offset when the message is processed successfully..But that would >> require >> > the coordination between threads in order to know what is the right >> timing >> > to commit offset.. >> > >> > >> > >> > On Thu, Aug 7, 2014 at 4:54 PM, Guozhang Wang <wangg...@gmail.com> >> wrote: >> > >> > > Hello Chen, >> > > >> > > With high-level consumer, the partition re-assignment is automatic >> upon >> > > consumer failures. >> > > >> > > Guozhang >> > > >> > > >> > > On Thu, Aug 7, 2014 at 4:41 PM, Chen Wang <chen.apache.s...@gmail.com >> > >> > > wrote: >> > > >> > > > Folks, >> > > > I have a process started at specific time and read from a specific >> > > topic. >> > > > I am currently using the High Level API(consumer group) to read from >> > > > kafka(and will stop once there is nothing in the topic by >> specifying a >> > > > timeout). i am most concerned about error recovery in multiple >> thread >> > > > context. If one thread dies, will other running bolt threads picks >> up >> > the >> > > > failed message? Or I have to start another thread in order to pick >> up >> > the >> > > > failed message? What would be a good practice to ensure the message >> > can >> > > be >> > > > processed at least once? >> > > > >> > > > Note that all threads are using the same group id. >> > > > >> > > > Thanks, >> > > > Chen >> > > > >> > > >> > > >> > > >> > > -- >> > > -- Guozhang >> > > >> > >> >> >> >> -- >> -- Guozhang >> > >