Re: error recovery in multiple thread reading from Kafka with HighLevel api

Chen Wang Thu, 07 Aug 2014 19:51:08 -0700

Just did some testing.It seems that the rebalance will occur upon
*zookeeper.session.timeout.ms
<http://zookeeper.session.timeout.ms>. *
*So yes, if one thread died, the left over messages will be picked up by
other threads.*



On Thu, Aug 7, 2014 at 5:36 PM, Chen Wang <chen.apache.s...@gmail.com>
wrote:

> Guozhang,
> Just to make it clear:
> If I have 10 threads with the same consumer group id, read the topic T.
> The auto commit is turned off, and commitOffset is called only when the
> message is processed successfully.
> If thread 1 dies when processing message from partition P1, and the last
> offset is Offset1.   Then kafka will ensure that one of the other running 9
> threads will automatically pick up the message on partition P1 from Offset1
> ? will the thread have the risk of reading the same message more than once?
>
> Also I would assume commit offset for each message is a bit heavy. What
> you guys usually do for error handling during reading kafka?
> Thanks much!
> Chen
>
>
>
> On Thu, Aug 7, 2014 at 5:18 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
>> Yes, in that case you can turn of auto commit and call commitOffsets
>> manually after processing is finished. commitOffsets() will only write the
>> offset of the partitions that the consumer is currently fetching, so there
>> is no need to coordinate this operation.
>>
>>
>> On Thu, Aug 7, 2014 at 5:03 PM, Chen Wang <chen.apache.s...@gmail.com>
>> wrote:
>>
>> > But with the auto commit turned on, I am risking off losing the failed
>> > message, right? should I turn off the auto commit, and only commit the
>> > offset when the message is processed successfully..But that would
>> require
>> > the coordination between threads in order to know what is the right
>> timing
>> > to commit offset..
>> >
>> >
>> >
>> > On Thu, Aug 7, 2014 at 4:54 PM, Guozhang Wang <wangg...@gmail.com>
>> wrote:
>> >
>> > > Hello Chen,
>> > >
>> > > With high-level consumer, the partition re-assignment is automatic
>> upon
>> > > consumer failures.
>> > >
>> > > Guozhang
>> > >
>> > >
>> > > On Thu, Aug 7, 2014 at 4:41 PM, Chen Wang <chen.apache.s...@gmail.com
>> >
>> > > wrote:
>> > >
>> > > > Folks,
>> > > >  I have a process started at specific time and read from a specific
>> > > topic.
>> > > > I am currently using the High Level API(consumer group) to read from
>> > > > kafka(and will stop once there is nothing in the topic by
>> specifying a
>> > > > timeout). i am most concerned about error recovery in multiple
>> thread
>> > > > context. If one thread dies, will other running bolt threads picks
>> up
>> > the
>> > > > failed message? Or I have to start another thread in order to pick
>> up
>> > the
>> > > > failed message? What would be  a good practice to ensure the message
>> > can
>> > > be
>> > > > processed at least once?
>> > > >
>> > > > Note that all threads are using the same group id.
>> > > >
>> > > > Thanks,
>> > > > Chen
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > -- Guozhang
>> > >
>> >
>>
>>
>>
>> --
>> -- Guozhang
>>
>
>

Re: error recovery in multiple thread reading from Kafka with HighLevel api

Reply via email to