[ 
https://issues.apache.org/jira/browse/STORM-510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363333#comment-14363333
 ] 

ASF GitHub Bot commented on STORM-510:
--------------------------------------

Github user d2r commented on the pull request:

    https://github.com/apache/storm/pull/430#issuecomment-81730400
  
    -0 unless we have evidence of a real performance issue that needs fixing
    here.
    
    @darionyaphet, if this pull request is related to an existing JIRA issue,
    it would help us out if you would update the title of this pull request
    with [STORM-X].
    
    Here is a
    [link](https://issues.apache.org/jira/browse/STORM-510?jql=project%20%3D%20STORM%20AND%20resolution%20%3D%20Unresolved%20ORDER%20BY%20priority%20DESC)
    to some currently unresolved issues in the JIRA system, for reference.



> Netty messaging client blocks transfer thread on reconnect
> ----------------------------------------------------------
>
>                 Key: STORM-510
>                 URL: https://issues.apache.org/jira/browse/STORM-510
>             Project: Apache Storm
>          Issue Type: Sub-task
>    Affects Versions: 0.9.2-incubating, 0.9.3
>            Reporter: Robert Joseph Evans
>            Priority: Critical
>
> The latest Netty client code attempts to reestablish the connection on 
> failure as part of the send method call.  It blocks until the connection is 
> established or a timeout occurs.  By default the timeout is about 30 seconds, 
> which is also the default tuple timeout.  
> This is exacerbated by the read lock that is held during the send, which 
> prevents the node->socket mapping from changing while we are sending.  The 
> lock exists mostly so that we don't close connections while we are trying to 
> write to them, which would cause an exception.  As a result, if multiple 
> workers on a node all get rescheduled, we wait the full 30-second timeout 
> for each worker.
> send must be non-blocking in the current design of the worker, or it will 
> prevent other messages from being delivered, and it is likely to cause many 
> messages to time out on a reschedule.
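
For illustration only, here is a minimal Java sketch of the non-blocking behaviour the description asks for. It is not Storm's Netty client and not the change proposed in the pull request. The names `NonBlockingSender`, `Connector`, `maxPending`, `pendingCount` and `delivered` are hypothetical, and the "network" is just an in-memory list. The idea shown is that `send` only buffers the message and hands the slow reconnect to a background thread, so the calling (transfer) thread never waits on the connection timeout.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch; this is NOT Storm's actual messaging client.
class NonBlockingSender {
    /** Stand-in for a connection attempt that may block for a long time. */
    interface Connector {
        boolean connect() throws InterruptedException;
    }

    private final Connector connector;
    private final int maxPending;                 // bound on buffered messages
    private final Queue<String> pending = new ArrayDeque<>();
    private final List<String> delivered = new ArrayList<>(); // fake "wire"
    private boolean connected = false;
    private boolean reconnecting = false;

    // Single daemon thread that performs the slow reconnect attempts.
    private final ExecutorService reconnector =
        Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "reconnector");
            t.setDaemon(true);
            return t;
        });

    NonBlockingSender(Connector connector, int maxPending) {
        this.connector = connector;
        this.maxPending = maxPending;
    }

    /**
     * Never waits on the network. Returns false if the message was dropped
     * because the buffer is full (the caller would rely on replay).
     */
    synchronized boolean send(String msg) {
        if (connected) {
            delivered.add(msg);
            return true;
        }
        if (pending.size() >= maxPending) {
            return false;
        }
        pending.add(msg);
        if (!reconnecting) {
            reconnecting = true;
            reconnector.execute(this::reconnect);
        }
        return true;
    }

    private void reconnect() {
        boolean ok;
        try {
            ok = connector.connect();   // slow part, done WITHOUT the lock
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            ok = false;
        }
        synchronized (this) {           // short critical section: flush only
            reconnecting = false;
            if (ok) {
                connected = true;
                delivered.addAll(pending);  // buffered messages keep order
                pending.clear();
            }
        }
    }

    synchronized int pendingCount() { return pending.size(); }

    synchronized List<String> delivered() { return new ArrayList<>(delivered); }

    void shutdown() { reconnector.shutdownNow(); }
}
```

In this sketch the lock is held only while the buffer is touched, never during `connect()`, so a 30-second connection timeout delays only the background thread. Whether dropping on a full buffer is acceptable depends on the topology's replay guarantees, which is an assumption here rather than something stated in the issue.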



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
