[ 
https://issues.apache.org/jira/browse/HADOOP-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607622#action_12607622
 ] 

Steve Loughran commented on HADOOP-3618:
----------------------------------------

I think it makes sense not to worry too much about making the sleep 
configurable. Jitter is sometimes useful when large numbers of callers retry 
at once, though the only time we had a big problem with it was on embedded 
hardware whose RNGs all initialised the same way. Randomness is sometimes hard 
to find. 
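
For illustration only, here is a minimal sketch of a retry loop with a fixed 
base sleep plus random jitter. It is not the code in the attached patch; the 
class JitteredRetry and its methods sleepMillis and retry are hypothetical 
names, and the caller supplies the Random so that its seeding is explicit. 

```java
import java.util.Random;
import java.util.concurrent.Callable;

// Hypothetical helper for illustration; not part of Hadoop or HADOOP-3618.patch.
final class JitteredRetry {
    private JitteredRetry() {}

    /** Returns a sleep time in the range [baseMillis, baseMillis + jitterMillis]. */
    static long sleepMillis(long baseMillis, long jitterMillis, Random rng) {
        if (baseMillis < 0 || jitterMillis < 0) {
            throw new IllegalArgumentException("sleep times must be non-negative");
        }
        // nextDouble() is in [0, 1), so the added offset is in 0..jitterMillis.
        return baseMillis + (long) (rng.nextDouble() * (jitterMillis + 1));
    }

    /**
     * Invokes the call until it succeeds or maxAttempts is reached, sleeping
     * for a jittered interval between failed attempts. Rethrows the last
     * failure once the attempts are exhausted.
     */
    static <T> T retry(Callable<T> call, int maxAttempts,
                       long baseMillis, long jitterMillis, Random rng)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (InterruptedException e) {
                throw e; // do not swallow interruption
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMillis(baseMillis, jitterMillis, rng));
                }
            }
        }
        throw last;
    }
}
```

A caller would wrap the job submission in a Callable and pass something like 
new Random() for the generator. If every client constructs its generator with 
the same fixed seed, all of them compute the same sleep sequence and the 
jitter achieves nothing, which is the failure described above. 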

When a full cluster reboots, it's the data nodes that come up first; their boot 
time depends on the state of their disks. The name node ought to come up faster 
if it's on a RAID5 FS, but as it has to do playback it will take a while to go 
live. What happens to the job and task trackers in this situation? Will they 
just sit around? Because if we aren't saving a job list over a cluster crash, 
there won't be a big set of jobs trying to get restarted, not unless there are 
external clients hitting the site hard. 

> JobClient should keep on retrying if the jobtracker is still initializing
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-3618
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3618
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Amar Kamat
>            Assignee: Amar Kamat
>         Attachments: HADOOP-3618.patch
>
>
> When the user submits the job while the jobtracker is still initializing, the 
> jobclient comes out with an exception. Ideally the jobclient should keep on 
> retrying until the jobtracker is up and ready. This will also take care of 
> HADOOP-3289. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
