Todd Lipcon created HADOOP-9898:
-----------------------------------

             Summary: Set SO_KEEPALIVE on all our sockets
                 Key: HADOOP-9898
                 URL: https://issues.apache.org/jira/browse/HADOOP-9898
             Project: Hadoop Common
          Issue Type: Bug
          Components: ipc, net
    Affects Versions: 3.0.0
            Reporter: Todd Lipcon
            Priority: Minor

We recently saw an issue where network problems between slaves and the NN (NameNode) caused ESTABLISHED TCP connections to pile up and leak on the NN side. It looks like the RST packets were getting dropped, so the clients considered the connections closed while they stayed open indefinitely on the server. Setting the SO_KEEPALIVE option on our sockets would let the server side notice that the peer is gone and close the connection, rather than letting this kind of leak go unchecked.
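
As a rough illustration of the proposal, here is a minimal sketch of enabling SO_KEEPALIVE on both a client socket and an accepted server-side socket with the standard `java.net.Socket#setKeepAlive` call. The class name `KeepAliveSketch` and the helper `withKeepAlive` are hypothetical names used only for this example. They are not existing Hadoop code, and the sketch does not show where the change would go in the ipc or net components.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class KeepAliveSketch {

    /**
     * Hypothetical helper: turns on SO_KEEPALIVE for the given socket
     * and returns the same socket so calls can be chained.
     */
    static Socket withKeepAlive(Socket socket) throws IOException {
        socket.setKeepAlive(true);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // Bind to an ephemeral port on loopback so the demo needs no external host.
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             // Client side of the connection.
             Socket client = withKeepAlive(
                 new Socket(server.getInetAddress(), server.getLocalPort()));
             // Server side: the option is set on each accepted socket.
             Socket accepted = withKeepAlive(server.accept())) {
            System.out.println("client keepalive: " + client.getKeepAlive());
            System.out.println("server keepalive: " + accepted.getKeepAlive());
        }
    }
}
```

Note that `setKeepAlive(true)` only switches the option on. How long a connection must be idle before probes are sent, and how many unanswered probes are tolerated, is decided by the operating system (on Linux the default idle time is two hours), so the leaked connections would be reaped eventually rather than immediately unless those settings are tuned.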