>>> I believe it is the same issue for the node manager connection

This is probably related to the issues below:
https://issues.apache.org/jira/i#browse/YARN-3944
https://issues.apache.org/jira/i#browse/YARN-3238
Thanks & Regards
Rohith Sharma K S

From: Jeff Zhang [mailto:[email protected]]
Sent: 18 August 2015 09:11
To: [email protected]
Subject: Confusing Yarn RPC Configuration

I use yarn.resourcemanager.connect.max-wait.ms to control how long to wait when setting up an RM connection. The weird thing I found is that this configuration is not the real maximum wait time. YARN actually converts it to a retry count using yarn.resourcemanager.connect.retry-interval.ms.

Let's say yarn.resourcemanager.connect.max-wait.ms=10000 and yarn.resourcemanager.connect.retry-interval.ms=2000. Then YARN will create a RetryUpToMaximumCountWithFixedSleep policy with max count = 5 (10000/2000).

On top of that, each RM connection attempt has its own retry policy inside Hadoop RPC. Let's say ipc.client.connect.retry.interval=1000 and ipc.client.connect.max.retries=10. Then each RM connection attempt will try 10 times and cost 10 seconds in total (1000 ms * 10).

So overall the RM connection would cost 50 seconds (10 s * 5), which is not consistent with yarn.resourcemanager.connect.max-wait.ms and confuses users.

I am not sure of the purpose of having two rounds of retry policy (the YARN side and the RPC-internal side). Should there be only one round of retry policy, with the YARN-related configuration just overriding the RPC configuration?

BTW, I believe it is the same issue for the node manager connection.

--
Best Regards

Jeff Zhang
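For readers following the numbers, the arithmetic described in the message can be sketched in Java. This is only an illustration and not Hadoop code: the class and method names (RetryBudget, yarnRetryCount, ipcCostMs, worstCaseMs) are hypothetical. Like the estimate in the message, it ignores the fixed sleep between the outer YARN-level retries and any per-attempt connect timeout, so a real worst case may be longer.

```java
// Hypothetical helper that reproduces the estimate from the message.
// It does not call any Hadoop API; it only multiplies the configured values.
class RetryBudget {

    // Outer (YARN-side) retry count, derived as max-wait / retry-interval.
    static long yarnRetryCount(long maxWaitMs, long retryIntervalMs) {
        return maxWaitMs / retryIntervalMs;
    }

    // Time one outer attempt may spend in the inner (IPC-side) retry loop.
    static long ipcCostMs(long ipcRetryIntervalMs, int ipcMaxRetries) {
        return ipcRetryIntervalMs * ipcMaxRetries;
    }

    // Combined estimate: outer attempts times the inner cost per attempt.
    // Assumption: the sleep between outer retries is not counted.
    static long worstCaseMs(long maxWaitMs, long retryIntervalMs,
                            long ipcRetryIntervalMs, int ipcMaxRetries) {
        return yarnRetryCount(maxWaitMs, retryIntervalMs)
                * ipcCostMs(ipcRetryIntervalMs, ipcMaxRetries);
    }

    public static void main(String[] args) {
        // Values taken from the example in the message.
        long totalMs = worstCaseMs(10000, 2000, 1000, 10);
        System.out.println(totalMs + " ms"); // prints "50000 ms"
    }
}
```

With the values from the message, the estimate is 50000 ms, five times the configured yarn.resourcemanager.connect.max-wait.ms of 10000 ms.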
