Wangda Tan created YARN-4113:
--------------------------------

             Summary: RM should respect retry-interval when uses RetryPolicies.RETRY_FOREVER
                 Key: YARN-4113
                 URL: https://issues.apache.org/jira/browse/YARN-4113
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Wangda Tan
            Priority: Critical

I found an issue in how RMProxy initializes its RetryPolicy, in RMProxy#createRetryPolicy. When rmConnectWaitMS is set to -1 (wait forever), it uses RetryPolicies.RETRY_FOREVER, which does not respect the {{yarn.resourcemanager.connect.retry-interval.ms}} setting.

RetryPolicies.RETRY_FOREVER uses 0 as the retry interval. When I ran {{TestYarnClient#testShouldNotRetryForeverForNonNetworkExceptions}} on a machine whose localhost name was not set up properly, the test wrote 14G of DEBUG exception messages to the system before it died. The same behavior would be very bad in a production cluster.

We should fix two places:
- Make RETRY_FOREVER accept the retry interval as a constructor parameter.
- Respect the retry interval when we use the RETRY_FOREVER policy.
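
To illustrate the proposed behavior, here is a minimal, self-contained sketch of a retry-forever policy that sleeps for a fixed interval between attempts instead of retrying immediately. It does not use Hadoop's classes: RetryForeverSketch, FixedSleepForever, Sleeper, and callWithRetry are hypothetical names invented for this example, and the actual fix in RMProxy and RetryPolicies may look different.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only: these types are hypothetical, not Hadoop's RetryPolicy API.
final class RetryForeverSketch {

    /** Abstraction over Thread.sleep so callers can control how waiting happens. */
    interface Sleeper {
        void sleep(long millis) throws InterruptedException;
    }

    /** A retry-forever policy that carries a fixed, non-negative sleep interval. */
    static final class FixedSleepForever {
        private final long sleepMillis;

        FixedSleepForever(long sleepTime, TimeUnit unit) {
            if (sleepTime < 0) {
                throw new IllegalArgumentException("sleepTime must be >= 0");
            }
            this.sleepMillis = unit.toMillis(sleepTime);
        }

        long delayMillis() {
            return sleepMillis;
        }
    }

    /** Retries op until it succeeds, sleeping for the policy's interval after each failure. */
    static <T> T callWithRetry(Callable<T> op, FixedSleepForever policy, Sleeper sleeper)
            throws InterruptedException {
        while (true) {
            try {
                return op.call();
            } catch (InterruptedException e) {
                throw e; // let interruption end the loop instead of retrying
            } catch (Exception e) {
                // With an interval of 0 this would spin and flood the logs;
                // a positive interval bounds the retry rate.
                sleeper.sleep(policy.delayMillis());
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger attempts = new AtomicInteger();
        FixedSleepForever policy = new FixedSleepForever(10, TimeUnit.MILLISECONDS);
        // Simulated operation: fails twice, then succeeds on the third attempt.
        String result = callWithRetry(() -> {
            if (attempts.incrementAndGet() < 3) {
                throw new java.io.IOException("simulated connection failure");
            }
            return "connected";
        }, policy, Thread::sleep);
        System.out.println(result + " after " + attempts.get() + " attempts");
    }
}
```

In this sketch the interval would come from {{yarn.resourcemanager.connect.retry-interval.ms}} when rmConnectWaitMS is -1. Passing the sleeper in as a parameter is a design choice made here so the loop can be exercised without real waiting.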