Kafka-users,

I'm looking to use Kafka in a pub-sub model where a consumer reads from Kafka
and does some processing on each message. How would you recommend committing to
Zookeeper / setting the last-consumed offset if processing some of the
messages in the pipe is less reliable than the others?
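For context, here's roughly the consumer loop I have in mind (a simplified
sketch assuming the 0.8 high-level consumer with auto-commit disabled; the
host, topic, and group names are just placeholders):

    import java.util.Collections;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    import kafka.consumer.Consumer;
    import kafka.consumer.ConsumerConfig;
    import kafka.consumer.ConsumerIterator;
    import kafka.consumer.KafkaStream;
    import kafka.javaapi.consumer.ConsumerConnector;
    import kafka.message.MessageAndMetadata;

    public class PipelineConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("zookeeper.connect", "zk-host:2181");  // placeholder
            props.put("group.id", "pipeline-group");         // placeholder
            props.put("auto.commit.enable", "false");        // commit manually
            ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

            Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("my-topic", 1));
            ConsumerIterator<byte[], byte[]> it =
                streams.get("my-topic").get(0).iterator();

            while (it.hasNext()) {
                MessageAndMetadata<byte[], byte[]> msg = it.next();
                process(msg.message());     // the slow/flaky step in question
                connector.commitOffsets();  // commits ALL consumed offsets so far
            }
        }

        static void process(byte[] payload) { /* e.g. a web-service call */ }
    }

Since commitOffsets() commits everything consumed so far, a single slow
message is exactly where this gets tricky.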
Let's say I read a batch of 10 messages (1-10) and I successfully process
messages 1-8 and 10 quickly, but message #9 is taking an inordinately long time
to process. I don't want to record that message as consumed in Zookeeper, but
I also don't want to block forward progress of the pipeline for that
topic/partition.
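One idea I've been toying with is tracking out-of-order completions myself,
per partition, and only advancing the committed offset over a contiguous run
of completed messages. Something along these lines (a standalone hypothetical
helper; the actual write to Zookeeper is left out):

    import java.util.TreeSet;

    // Tracks completions for ONE partition and reports the highest offset
    // that is safe to record as consumed (everything at or below it is done).
    class OffsetTracker {
        private long committed;                                  // last safe offset
        private final TreeSet<Long> done = new TreeSet<Long>();  // completions ahead of it

        OffsetTracker(long lastCommitted) { this.committed = lastCommitted; }

        synchronized void complete(long offset) { done.add(offset); }

        // Advance over any contiguous run of completed offsets.
        synchronized long safeToCommit() {
            while (done.remove(committed + 1)) {
                committed++;
            }
            return committed;
        }
    }

So with messages 1-8 and 10 done, safeToCommit() would return 8, and it would
only jump to 10 once #9 finally completes. The downside is that #9 still
gates the commit point, which brings me to the edge case below.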
In the edge case, let's say every message processed successfully except
message #9, which failed or timed out outright, but a reattempt at processing
it would likely succeed (e.g., a web-service call hitting a transient network
condition). In the timeout/failure case, would the suggestion be to re-queue
the message?
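If re-queueing is the way to go, I assume it'd look something like publishing
the failed message back onto a retry topic with a producer, so the main
partition can move forward (sketch below; the "-retry" topic and broker
address are just placeholders):

    import java.util.Properties;

    import kafka.javaapi.producer.Producer;
    import kafka.producer.KeyedMessage;
    import kafka.producer.ProducerConfig;

    public class Requeue {
        public static void requeue(byte[] payload) {
            Properties props = new Properties();
            props.put("metadata.broker.list", "broker-host:9092");            // placeholder
            props.put("serializer.class", "kafka.serializer.DefaultEncoder"); // raw bytes
            Producer<byte[], byte[]> producer =
                new Producer<byte[], byte[]>(new ProducerConfig(props));

            // Put the failed message at the end of a retry topic for a
            // later attempt, then let the main consumer commit past it.
            producer.send(new KeyedMessage<byte[], byte[]>("my-topic-retry", payload));
            producer.close();
        }
    }
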
Does anyone have any recommendations for the above scenarios?
Thanks,
Vito
