[ 
https://issues.apache.org/jira/browse/STORM-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241377#comment-14241377
 ] 

Derek Dagit commented on STORM-589:
-----------------------------------

There may be a reason I have not thought of why the current defaults are good 
and not actually suboptimal.  Please do comment if that is the case.

> Suboptimal default worker hb timeouts for nimbus & supervisor
> -------------------------------------------------------------
>
>                 Key: STORM-589
>                 URL: https://issues.apache.org/jira/browse/STORM-589
>             Project: Apache Storm
>          Issue Type: Bug
>    Affects Versions: 0.9.2-incubating
>            Reporter: Derek Dagit
>            Priority: Minor
>
> Both worker heartbeat timeouts for nimbus and supervisor are set to 30 
> seconds by default:
> https://github.com/apache/storm/blob/3bbdc166bda7fb1a39b6906eda40da9bc83d5d4c/conf/defaults.yaml#L58
> https://github.com/apache/storm/blob/3bbdc166bda7fb1a39b6906eda40da9bc83d5d4c/conf/defaults.yaml#L118
> This means that the timing of a worker's death relative to its heartbeats 
> determines whether the supervisor relaunches it or nimbus reassigns it.
> If the supervisor finds the heartbeat timed out first, the worker is 
> relaunched.  If nimbus finds the heartbeat timed out first, the worker is 
> rescheduled.
> We may want the nimbus time-out to be larger than the supervisor time-out, to 
> give the supervisor a chance to relaunch the worker before nimbus re-assigns 
> it.
> As always, users administering clusters are encouraged to set these as 
> needed.
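
To illustrate the race described above, here is a rough sketch in Python. It is a toy model, not Storm code: it assumes each daemon polls the worker heartbeat on a fixed period and declares a timeout at the first poll where the heartbeat is at least `timeout` seconds old. The function names, the polling periods (3 and 10 seconds), and the 60-second nimbus value are assumptions chosen only for illustration.

```python
def detection_time(last_hb, timeout, period, phase=0):
    """Time of the first periodic check that sees the heartbeat as timed out.

    Checks are assumed to run at phase, phase + period, phase + 2*period, ...
    A heartbeat counts as timed out once it is at least `timeout` seconds old.
    """
    deadline = last_hb + timeout
    # Ceiling division: number of periods until the first check at or after the deadline.
    k = max(0, -(-(deadline - phase) // period))
    return phase + k * period


def first_to_detect(last_hb, sup_timeout, nim_timeout, sup_period=3, nim_period=10):
    """Return which daemon notices the dead worker first in this toy model."""
    sup_t = detection_time(last_hb, sup_timeout, sup_period)
    nim_t = detection_time(last_hb, nim_timeout, nim_period)
    if sup_t < nim_t:
        return "supervisor"  # worker would be relaunched in place
    if nim_t < sup_t:
        return "nimbus"      # worker would be rescheduled
    return "tie"


if __name__ == "__main__":
    # Equal 30-second timeouts: the winner depends on when the last heartbeat landed.
    for hb in (1, 10):
        print(hb, first_to_detect(hb, sup_timeout=30, nim_timeout=30))
    # A larger (hypothetical) nimbus timeout lets the supervisor act first.
    print(first_to_detect(10, sup_timeout=30, nim_timeout=60))
```

In this model the supervisor always wins as long as the nimbus timeout exceeds the supervisor timeout by at least one supervisor polling period, which is the reasoning behind making the nimbus time-out the larger of the two.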



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
