I figured it out myself: I was throwing FailedException() after calling 
_collector.fail(input). Apparently the HolmesNL spout doesn't handle 
FailedException, so the whole worker process halted. As soon as I removed that 
exception, retries worked well.
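
A minimal sketch of the pattern described above: fail the tuple and return, 
without rethrowing from execute(). It deliberately does not use the Storm API. 
Collector, RecordingCollector, process() and the String "tuple" are 
hypothetical stand-ins for OutputCollector and Tuple, so it only illustrates 
the control flow, not the real bolt.

```java
import java.util.ArrayList;
import java.util.List;

public class FailWithoutThrowing {
    /** Hypothetical stand-in for Storm's OutputCollector; only ack/fail are modelled. */
    interface Collector {
        void ack(String tuple);
        void fail(String tuple);
    }

    /** Records ack/fail calls so the behaviour can be inspected. */
    static class RecordingCollector implements Collector {
        final List<String> acked = new ArrayList<>();
        final List<String> failed = new ArrayList<>();
        public void ack(String tuple) { acked.add(tuple); }
        public void fail(String tuple) { failed.add(tuple); }
    }

    /** Hypothetical processing step (an assumption): rejects empty payloads. */
    static void process(String tuple) {
        if (tuple.isEmpty()) {
            throw new IllegalArgumentException("empty payload");
        }
    }

    /** Mirrors the shape of a bolt's execute(): ack on success, fail and return on error. */
    static void execute(String tuple, Collector collector) {
        try {
            process(tuple);
            collector.ack(tuple);
        } catch (RuntimeException e) {
            // Report the failure so the spout can replay the tuple.
            collector.fail(tuple);
            // Nothing is rethrown here; in the thread above, the exception
            // thrown after fail() is what brought the worker down.
        }
    }

    public static void main(String[] args) {
        RecordingCollector collector = new RecordingCollector();
        execute("ok", collector);   // processed and acked
        execute("", collector);     // rejected and failed, but no exception escapes
        System.out.println("acked=" + collector.acked.size()
                + " failed=" + collector.failed.size());
    }
}
```

In a real bolt the same shape would apply inside execute(Tuple input), with 
_collector.fail(input) followed by a plain return.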

> On Nov 26, 2014, at 3:00 PM, Hefeng Yuan <[email protected]> wrote:
> 
> Thanks for the reply. That explains the automatic restart, but it's still 
> not clear why it retries 4 times and then stops.
> 
> I did start with the official Kafka spout, but it didn't work for me at all: 
> it lost messages and constantly restarted the worker with timeouts.
> 
> Is anyone else using the HolmesNL spout? I'm wondering how you deal 
> with retrying failed tuples.
> 
> 
> 
> On Nov 26, 2014, at 13:59, Harsha <[email protected]> wrote:
> 
>>  
>> If your bolt hangs, the workers stop sending heartbeats, and once 
>> supervisor.worker.timeout.secs elapses the workers are killed and 
>> restarted. Did you try using 
>> https://github.com/apache/storm/tree/master/external/storm-kafka ?
>> -Harsha
>>  
>> On Wed, Nov 26, 2014, at 01:40 PM, Hefeng Yuan wrote:
>>> Hello, 
>>>  
>>> I'm trying to use HolmesNL/kafka-spout. It works pretty well on the happy 
>>> path. However, when a tuple fails (e.g. _collector.fail(input) gets called 
>>> in a bolt), it seems to retry only 3 or 4 times and then hang until 
>>> supervisor.worker.timeout.secs is reached and the topology gets restarted.
>>> Where is this number of retries controlled? Also, since the tuple has 
>>> already failed, why would it still trigger 
>>> supervisor.worker.timeout.secs?
>>>  
>>> Thanks,
>>> Hefeng
>>  
