[ https://issues.apache.org/jira/browse/STORM-510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Noll updated STORM-510:
-------------------------------
Affects Version/s: 0.9.3
> Netty messaging client blocks transfer thread on reconnect
> ----------------------------------------------------------
>
> Key: STORM-510
> URL: https://issues.apache.org/jira/browse/STORM-510
> Project: Apache Storm
> Issue Type: Sub-task
> Affects Versions: 0.9.2-incubating, 0.9.3
> Reporter: Robert Joseph Evans
> Priority: Critical
>
> The latest Netty client code attempts to reestablish the connection on
> failure as part of the send method call. It blocks until the connection
> is established or a timeout occurs. By default this timeout is about 30
> seconds, which is also the default tuple timeout.
> This is exacerbated by the read lock that is held during the send, which
> prevents the node->socket mapping from changing while we are sending. The
> lock exists mostly so that we don't close connections while we are trying
> to write to them, which would cause an exception. But it also means that if
> multiple workers on a node all get rescheduled, we wait the full 30 seconds
> for each worker's connection to time out.
> send must be non-blocking in the current design of the worker, or it will
> prevent other messages from being delivered and is likely to cause a large
> number of messages to time out on a reschedule.
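
To illustrate the non-blocking requirement described above, here is a minimal
sketch of a send path that never waits for a reconnect. It is not Storm's
actual Netty client and not the fix adopted for this issue. The class
NonBlockingSender, the Connection interface, and the reconnect callback are
all hypothetical names introduced for illustration, and the sketch assumes
that writing to an established connection is itself asynchronous.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch of a non-blocking send path; not Storm's real client. */
class NonBlockingSender {
    /** Stand-in for an established channel; write() is assumed to be asynchronous. */
    interface Connection {
        void write(String message);
    }

    private final Queue<String> pending = new ArrayDeque<>();
    private final Runnable requestReconnect; // assumed to hand the connect work to another thread
    private Connection connection;           // null while disconnected
    private boolean reconnectRequested;

    NonBlockingSender(Runnable requestReconnect) {
        this.requestReconnect = requestReconnect;
    }

    /**
     * Returns immediately. While disconnected, the message is buffered and a
     * reconnect is requested at most once. The lock guards only in-memory
     * state; the slow connect attempt happens outside of it.
     */
    synchronized void send(String message) {
        if (connection != null) {
            connection.write(message);
            return;
        }
        pending.add(message);
        if (!reconnectRequested) {
            reconnectRequested = true;
            requestReconnect.run();
        }
    }

    /** Called by the reconnect thread once a connection is up; flushes the buffer in order. */
    synchronized void onConnected(Connection established) {
        connection = established;
        reconnectRequested = false;
        String message;
        while ((message = pending.poll()) != null) {
            established.write(message);
        }
    }

    /** Called when the connection is lost; later sends buffer again. */
    synchronized void onDisconnected() {
        connection = null;
    }
}
```

In this sketch the transfer thread only ever pays for a queue insertion, so a
slow or failing reconnect to one worker does not delay messages destined for
other workers. The buffer is unbounded here for brevity; a real client would
need to bound it or drop messages, which this sketch does not attempt.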