Jason Lowe created YARN-4414:
--------------------------------

             Summary: Nodemanager connection errors are retried at multiple levels
                 Key: YARN-4414
                 URL: https://issues.apache.org/jira/browse/YARN-4414
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.6.2, 2.7.1
            Reporter: Jason Lowe

This is related to YARN-3238. I ran into more scenarios where connection errors, such as NoRouteToHostException, are retried at multiple levels. The fix for YARN-3238 was too specific. I think we need a more general solution that catches a wider range of connection errors, so that they are not retried both at the RPC layer and at the NM proxy layer.
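
To illustrate why retrying at two levels is costly, here is a toy simulation in Python. It is not Hadoop code: `retry`, `count_attempts`, `NoRouteToHostError`, and the attempt counts are all hypothetical names and values chosen for illustration. It assumes an inner "RPC" retry loop wrapped by an outer "proxy" retry loop, and shows how letting the outer layer fail fast on a broad base class of connection errors (rather than on one specific exception type) avoids multiplying the attempts.

```python
class NoRouteToHostError(OSError):
    """Hypothetical stand-in for a connection error such as NoRouteToHostException."""


def retry(call, max_attempts, fail_fast=()):
    """Invoke `call` up to `max_attempts` times.

    Exceptions whose type is listed in `fail_fast` are re-raised immediately
    instead of being retried. An empty tuple means nothing fails fast.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for _ in range(max_attempts):
        try:
            return call()
        except fail_fast:
            raise  # already retried elsewhere, do not retry again here
        except OSError as err:
            last_error = err
    raise last_error


def count_attempts(rpc_attempts, proxy_attempts, proxy_fail_fast=()):
    """Count connection attempts when an unreachable host is retried at two levels."""
    attempts = 0

    def connect():
        nonlocal attempts
        attempts += 1
        raise NoRouteToHostError("no route to host")

    def rpc_call():
        # Inner level: the (simulated) RPC layer retries the connection.
        return retry(connect, rpc_attempts)

    try:
        # Outer level: the (simulated) proxy layer retries the whole RPC call.
        retry(rpc_call, proxy_attempts, fail_fast=proxy_fail_fast)
    except OSError:
        pass
    return attempts


# Both layers retry: the attempts multiply.
print(count_attempts(3, 4))
# The outer layer fails fast on the broad base class: only the inner retries run.
print(count_attempts(3, 4, proxy_fail_fast=(OSError,)))
```

In this sketch the outer layer matches on `OSError`, the common base class, so any connection error derived from it is covered without listing each type separately. Whether an equivalent base-class check is the right approach for the NM proxy is a design question for the actual fix.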