[ 
https://issues.apache.org/jira/browse/STORM-1351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15027632#comment-15027632
 ] 

Roshan Naik commented on STORM-1351:
------------------------------------

I don't think we need different backoff logic for errors vs. no data. Only 
the metric updates need to differ.

Taylor's suggestion seems promising.

Exponential backoff is typically preferred in such systems over a fixed 1 ms 
backoff.
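
To make the comparison concrete, here is a minimal sketch of a capped exponential backoff. The class name ExponentialBackoff and its methods are hypothetical and are not part of the Storm API; the 1 ms starting delay and the cap are only illustrative values.

```java
// Hypothetical helper for illustration only; not an existing Storm class.
final class ExponentialBackoff {
    private final long initialMillis;
    private final long maxMillis;
    private long nextMillis;

    ExponentialBackoff(long initialMillis, long maxMillis) {
        if (initialMillis <= 0 || maxMillis < initialMillis) {
            throw new IllegalArgumentException("require 0 < initialMillis <= maxMillis");
        }
        this.initialMillis = initialMillis;
        this.maxMillis = maxMillis;
        this.nextMillis = initialMillis;
    }

    /** Returns the delay to sleep now, then doubles the next delay up to the cap. */
    long nextDelayMillis() {
        long delay = nextMillis;
        // Compare against maxMillis / 2 so the doubling can never overflow.
        nextMillis = (nextMillis > maxMillis / 2) ? maxMillis : nextMillis * 2;
        return delay;
    }

    /** Call once the spout emits again, so the next idle period starts small. */
    void reset() {
        nextMillis = initialMillis;
    }
}
```

With an initial delay of 1 ms and a cap of 8 ms, successive idle calls would sleep 1, 2, 4, 8, 8, ... ms, and the sequence restarts at 1 ms after reset().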

> Storm spouts and bolts need a way to communicate problems back to topology 
> runner
> --------------------------------------------------------------------------------
>
>                 Key: STORM-1351
>                 URL: https://issues.apache.org/jira/browse/STORM-1351
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>            Reporter: Roshan Naik
>            Assignee: Roshan Naik
>
> A spout can have a problem generating a tuple in nextTuple() because 
>  - a) there is currently no data to generate 
>  - b) it is experiencing I/O issues 
> If the spout returns immediately from the nextTuple() call, then 
> nextTuple() will be invoked again immediately, leading to a CPU spike. The 
> CPU spike lasts until the situation is remedied by new incoming data or the 
> I/O issues getting resolved.
> Currently, to work around this, spouts have to implement an exponential 
> backoff mechanism internally. There are two problems with this:
>  - each spout needs to implement this backoff logic
>  - since nextTuple() has an internal sleep and takes longer to return, the 
> latency metrics computation gets thrown off
> *Thoughts for Solution:*
> The spout should be able to indicate a 'no data', 'experiencing error' or 
> 'all good' status back to the caller of nextTuple() so that the right backoff 
> logic can kick in.
> - The most natural way to do this is using the return type of the nextTuple 
> method. Currently nextTuple() returns void. However, this will break source 
> and binary compatibility, since the new Storm will not be able to invoke the 
> methods on unmodified spouts. This breaking change can be considered as an 
> option only prior to v1.0. 
> - Alternatively, this can be done by providing an additional method on the 
> collector to indicate the condition to the topology runner. The spout can 
> invoke this explicitly. The metrics can then also account for 'no data' and 
> 'error' attempts.
> - Alternatively, the topology runner may just examine the collector to see 
> whether the nextTuple() call generated new data. In this case it cannot 
> distinguish between errors and no incoming data. 
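
As a rough illustration of the collector-based alternative in the description above, the sketch below lets a spout report why it did not emit, and lets the runner count the two causes separately while applying the same backoff decision to both. Every name here (SpoutStatus, StatusCollector, reportStatus, finishCall) is hypothetical; no such method exists on Storm's collector.

```java
// All names in this sketch are hypothetical; none of them exist in Storm.
enum SpoutStatus { ALL_GOOD, NO_DATA, ERROR }

final class StatusCollector {
    private SpoutStatus status = SpoutStatus.ALL_GOOD;
    private long noDataCalls;
    private long errorCalls;

    /** Called by the spout from inside nextTuple() when it could not emit. */
    void reportStatus(SpoutStatus reported) {
        if (reported == null) {
            throw new NullPointerException("reported");
        }
        status = reported;
    }

    /**
     * Called by the runner after each nextTuple() call. Updates the per-cause
     * counter, clears the status for the next call, and returns true when the
     * runner should back off before invoking nextTuple() again.
     */
    boolean finishCall() {
        SpoutStatus reported = status;
        status = SpoutStatus.ALL_GOOD;
        switch (reported) {
            case NO_DATA:
                noDataCalls++;
                return true;
            case ERROR:
                errorCalls++;
                return true;
            default:
                return false;
        }
    }

    long noDataCalls() { return noDataCalls; }

    long errorCalls() { return errorCalls; }
}
```

Because the sleep would happen in the runner rather than inside nextTuple(), the time spent backing off could be kept out of the spout's latency measurements.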



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
