Nathan Roberts created YARN-4052:
------------------------------------

             Summary: Set SO_KEEPALIVE on NM servers
                 Key: YARN-4052
                 URL: https://issues.apache.org/jira/browse/YARN-4052
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.7.1
            Reporter: Nathan Roberts

The shuffle handler does not set SO_KEEPALIVE on its sockets, so we have seen cases where FDs/sockets remain stuck in the ESTABLISHED state indefinitely because the server never saw the client leave (network cut or otherwise).
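
For illustration only, here is a minimal sketch of what enabling SO_KEEPALIVE on an accepted server-side connection looks like. It uses plain java.net sockets as a stand-in and is not the shuffle handler code; the class name KeepAliveSketch and the helper acceptWithKeepAlive are hypothetical. The real fix would presumably set the equivalent option in whatever networking layer the shuffle handler uses.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical stand-in for a server that accepts client connections.
public class KeepAliveSketch {

    // Accepts one connection and enables SO_KEEPALIVE on it, so the OS
    // probes an idle peer and eventually closes the socket if it is gone.
    static Socket acceptWithKeepAlive(ServerSocket server) throws IOException {
        Socket accepted = server.accept();
        accepted.setKeepAlive(true);
        return accepted;
    }

    public static void main(String[] args) throws IOException {
        // Bind to an ephemeral loopback port; the client connect completes
        // via the listen backlog before accept() is called.
        try (ServerSocket server =
                 new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
             Socket client =
                 new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = acceptWithKeepAlive(server)) {
            System.out.println("SO_KEEPALIVE enabled: " + accepted.getKeepAlive());
        }
    }
}
```

Note that SO_KEEPALIVE only turns the probes on. How long an idle connection waits before the first probe, and how many probes are sent, is governed by OS settings, so a dead peer may still take a long time to be detected with default values.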