[
https://issues.apache.org/jira/browse/HADOOP-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607622#action_12607622
]
Steve Loughran commented on HADOOP-3618:
----------------------------------------
I think it makes sense not to worry too much about making the sleep
configurable. The jitter is sometimes useful for dealing with large numbers of
callers, though the only time we had a big problem it was with embedded
hardware whose RNGs all initialised the same way. Randomness is sometimes hard
to find.
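For illustration only, here is a rough sketch of a retry loop with a jittered
sleep. Everything in it is an assumption: the class name JitteredRetry, the
methods jitteredDelay and waitUntilReady, and the "ready" check are all
hypothetical, and none of it is taken from the attached patch or from the
JobClient API. The Random is passed in so that the caller decides how it is
seeded, which is the part that went wrong on the embedded hardware mentioned
above.

```java
import java.util.Random;
import java.util.function.BooleanSupplier;

/** Hypothetical helper; not part of Hadoop or of the HADOOP-3618 patch. */
final class JitteredRetry {

    /**
     * Returns baseMillis plus a random extra in [0, maxJitterMillis].
     * maxJitterMillis must be >= 0.
     */
    static long jitteredDelay(long baseMillis, int maxJitterMillis, Random rng) {
        // nextInt(bound) excludes the bound, so add 1 to make the max reachable.
        return baseMillis + rng.nextInt(maxJitterMillis + 1);
    }

    /**
     * Polls the (assumed) readiness check up to maxAttempts times, sleeping a
     * jittered interval between attempts. Returns true as soon as the check
     * passes, false if every attempt fails.
     */
    static boolean waitUntilReady(BooleanSupplier ready, int maxAttempts,
                                  long baseMillis, int maxJitterMillis, Random rng)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (ready.getAsBoolean()) {
                return true;
            }
            // No point sleeping after the final failed attempt.
            if (attempt < maxAttempts) {
                Thread.sleep(jitteredDelay(baseMillis, maxJitterMillis, rng));
            }
        }
        return false;
    }
}
```

A caller might use something like
waitUntilReady(check, 10, 1000, 500, new Random()), where check stands in for
whatever the client uses to find out whether the jobtracker has finished
initializing. If many hosts could end up with identically seeded generators,
mixing something host-specific into the seed would be one way to keep their
delays from lining up.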
When a full cluster reboots, it's the data nodes that come up first; their boot
time depends on the state of their disks. The name node ought to come up faster
if it's a RAID5 FS, but as it has to do playback it will take a while to go
live. What happens to the job and task trackers in this situation? Will they
just sit around? Because if we aren't saving a job list over a cluster crash,
there won't be a big set of jobs trying to get restarted, unless there are
external clients hitting the site hard.
> JobClient should keep on retrying if the jobtracker is still initializing
> -------------------------------------------------------------------------
>
> Key: HADOOP-3618
> URL: https://issues.apache.org/jira/browse/HADOOP-3618
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Amar Kamat
> Assignee: Amar Kamat
> Attachments: HADOOP-3618.patch
>
>
> When the user submits a job while the jobtracker is still initializing, the
> jobclient fails with an exception. Ideally the jobclient should keep on
> retrying until the jobtracker is up and ready. This will also take care of
> HADOOP-3289.
--
This message is automatically generated by JIRA.