[
https://issues.apache.org/jira/browse/MAPREDUCE-4074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
xieguiming updated MAPREDUCE-4074:
----------------------------------
Status: Patch Available (was: Open)
Retry scenarios:
1. The AM has finished, so the client will contact the HS and needs to retry.
2. If the RM goes down, the IPC layer retries automatically, so no retry is needed here.
3. If the HS goes down, the IPC layer retries automatically, so no retry is needed here.
So we only need to consider scenario 1.
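For scenario 1, the retry has to be bounded so the client eventually gives up. The following is a minimal sketch of that idea only, not the attached patch: `BoundedRetry`, `IoCall`, `callWithRetries`, `maxAttempts`, and `sleepMillis` are hypothetical names chosen for illustration, and no Hadoop classes are used.

```java
import java.io.IOException;

final class BoundedRetry {

    /** Hypothetical stand-in for a remote call that may fail with an IOException. */
    interface IoCall<T> {
        T call() throws IOException;
    }

    /**
     * Invokes the call up to maxAttempts times, sleeping sleepMillis between
     * failed attempts. Throws an IOException wrapping the last failure once
     * the attempts are exhausted, instead of looping forever.
     */
    static <T> T callWithRetries(IoCall<T> call, int maxAttempts, long sleepMillis)
            throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                lastFailure = e;
                // Only sleep if another attempt will follow.
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw new IOException("Giving up after " + maxAttempts + " attempts", lastFailure);
    }
}
```

With this shape, the caller sees an exception after a finite number of attempts, which is the behavior the reporter asks for in the description below.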
> Client continuously retries to RM When RM goes down before launching
> Application Master
> ---------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-4074
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4074
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 0.23.1
> Reporter: Devaraj K
> Attachments: MAPREDUCE-4074.patch
>
>
> The client continuously retries connecting to the RM and logs the messages
> below when the RM goes down before launching the App Master.
> I feel an exception should be thrown, or the loop should break, after a
> finite number of retries.
> {code:xml}
> 28/03/12 07:15:03 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 0 time(s).
> 28/03/12 07:15:04 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 1 time(s).
> 28/03/12 07:15:05 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 2 time(s).
> 28/03/12 07:15:06 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 3 time(s).
> 28/03/12 07:15:07 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 4 time(s).
> 28/03/12 07:15:08 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 5 time(s).
> 28/03/12 07:15:09 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 6 time(s).
> 28/03/12 07:15:10 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 7 time(s).
> 28/03/12 07:15:11 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 8 time(s).
> 28/03/12 07:15:12 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 9 time(s).
> 28/03/12 07:15:13 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 0 time(s).
> 28/03/12 07:15:14 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 1 time(s).
> 28/03/12 07:15:15 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 2 time(s).
> 28/03/12 07:15:16 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 3 time(s).
> 28/03/12 07:15:17 INFO ipc.Client: Retrying connect to server:
> linux-f330.site/10.18.40.182:8032. Already tried 4 time(s).
> {code}