[ https://issues.apache.org/jira/browse/HADOOP-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811936#comment-13811936 ]
Hudson commented on HADOOP-9898:
--------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #380 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/380/])
HADOOP-9898. Set SO_KEEPALIVE on all our sockets. Contributed by Todd Lipcon. (wang: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1537637)
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/CHANGES.txt
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Client.java
* /hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java

> Set SO_KEEPALIVE on all our sockets
> -----------------------------------
>
>                 Key: HADOOP-9898
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9898
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc, net
>    Affects Versions: 3.0.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Minor
>             Fix For: 2.2.1
>
>         Attachments: hadoop-9898.txt
>
>
> We recently saw an issue where network issues between slaves and the NN
> caused ESTABLISHED TCP connections to pile up and leak on the NN side. It
> looks like the RST packets were getting dropped, which meant that the client
> thought the connections were closed, while they hung open forever on the
> server.
> Setting the SO_KEEPALIVE option on our sockets would prevent this kind of
> leak from going unchecked.
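
For readers unfamiliar with the option, the following is a minimal sketch of enabling SO_KEEPALIVE on both ends of a TCP connection using the standard java.net API. It is not the attached patch: the class name KeepAliveSketch and the helpers connectWithKeepAlive and acceptWithKeepAlive are hypothetical names chosen for illustration, and the actual changes to Client.java and Server.java may look quite different.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Illustrative only: not the code from hadoop-9898.txt.
public class KeepAliveSketch {

    // Hypothetical client-side helper: connect, then enable SO_KEEPALIVE.
    static Socket connectWithKeepAlive(InetSocketAddress addr, int timeoutMs)
            throws IOException {
        Socket s = new Socket();
        s.connect(addr, timeoutMs);
        s.setKeepAlive(true);
        return s;
    }

    // Hypothetical server-side helper: enable SO_KEEPALIVE on each accepted
    // socket so the OS probes idle peers and eventually reports dead ones.
    static Socket acceptWithKeepAlive(ServerSocket server) throws IOException {
        Socket s = server.accept();
        s.setKeepAlive(true);
        return s;
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port.
        try (ServerSocket server = new ServerSocket(0, 50, loopback);
             Socket client = connectWithKeepAlive(
                     new InetSocketAddress(loopback, server.getLocalPort()), 1000);
             Socket accepted = acceptWithKeepAlive(server)) {
            System.out.println("client keepalive: " + client.getKeepAlive());
            System.out.println("server keepalive: " + accepted.getKeepAlive());
        }
    }
}
```

Note that SO_KEEPALIVE only asks the operating system to probe idle connections. How long a connection must be idle before probing starts, and how many probes are sent, are OS-level settings (on Linux the default idle time is two hours), so a dead peer is detected eventually rather than immediately.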