[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated ZOOKEEPER-3384:
--------------------------------------
    Labels: pull-request-available  (was: )

> Avoid long quorum unavailable time due to TLS connection close stalled with 
> full send buffer
> --------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-3384
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3384
>             Project: ZooKeeper
>          Issue Type: Sub-task
>          Components: server
>            Reporter: Fangmin Lv
>            Assignee: Fangmin Lv
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.6.0
>
>
>  
> *Problem*
> For an SSL socket, close() is required to send a close_notify alert before 
> closing the write side of the connection. If the leader is partitioned away 
> and the send buffer is full, learner shutdown may take a long time because 
> it blocks on sending the close_notify packet.
> The SSLSocketImpl implementation still honors the SO_LINGER socket option. 
> The difference is that even if we set the SO_LINGER time to 0, it will still 
> try to issue the close_notify packet, but it will give up and close the 
> socket if it fails to acquire the write lock immediately.
> Setting SO_LINGER to a small number avoids stalling for a long time during 
> shutdown, and this is what we're going to do here.
> *Any Cons of doing this?*
> According to the TLS RFC, the close handshake was added to avoid a 
> truncation attack, where an attacker inserts into a message a TCP code 
> indicating that the message has finished, thus preventing the recipient 
> from picking up the rest of the message. But it's fine if the peer doesn't 
> send close_notify in some cases, for example when the client crashed or was 
> killed. For us, close_notify usually isn't sent, and has no chance to be 
> sent, during a rolling restart.
> Another thing mentioned in the RFC is that failing to send close_notify 
> makes the SSL session non-resumable. Given that reusable session IDs do not 
> benefit the ZooKeeper quorum anyway, this is not a problem for us.
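
The approach described above (set a small SO_LINGER before close() so shutdown cannot stall for long) might look roughly like the sketch below. It is only an illustration, not the actual ZooKeeper patch: the class name LingerCloseSketch, the helper closeWithShortLinger, and the 2-second value are all hypothetical. It uses only java.net.Socket#setSoLinger and Socket#close, which SSLSocket inherits, so the same call should apply to a TLS quorum socket.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper class; names and values are illustrative assumptions.
final class LingerCloseSketch {

    // Enable SO_LINGER with a short timeout, then close the socket.
    // The intent is to bound how long close() may block when the
    // send buffer is full and the peer is unreachable.
    static void closeWithShortLinger(Socket sock, int lingerSeconds) throws IOException {
        if (!sock.isClosed()) {
            // setSoLinger throws SocketException on a closed socket,
            // so only set it while the socket is still open.
            sock.setSoLinger(true, lingerSeconds);
        }
        // close() on an already closed socket is a no-op.
        sock.close();
    }

    public static void main(String[] args) throws IOException {
        // Loopback demo with a plain socket; an SSLSocket would be passed the same way.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Socket sock = new Socket(server.getInetAddress(), server.getLocalPort());
            closeWithShortLinger(sock, 2); // 2 seconds is an assumed "small number"
            System.out.println("closed: " + sock.isClosed());
        }
    }
}
```

The trade-off is the one discussed in the description: with a short linger, close_notify may not reach the peer, which is considered acceptable for the quorum connections here.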



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
