[
https://issues.apache.org/jira/browse/HADOOP-3618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607533#action_12607533
]
Steve Loughran commented on HADOOP-3618:
----------------------------------------
the wait loop spins and sleeps:
+    // check if the jobtracker is ready
+    while (true) {
+      if (jobSubmitClient.isReady()) {
+        break;
+      }
+      try {
+        Thread.sleep(JOBTRACKER_POLL_INTERVAL);
+      } catch (InterruptedException ie){}
+    }
1. If the thread is interrupted, it implies somebody wanted to stop it. Why not
honour that request by ending the thread, rather than spinning indefinitely?
As written, the empty catch block swallows the interrupt, so this loop makes a
job client thread impossible to kill in-process until the tracker is live.
2. In other projects, we've found problems when a few hundred machines come up
fully synchronised, as they can when a site's power gets toggled. They all poll
simultaneously, flood the network, and then wait. Even with exponential
back-off they stay in sync. So a bit of random jitter on the sleep is good;
likewise, the poll interval could be a configuration point.
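To make both points concrete, here is a rough sketch of a wait loop that lets
the interrupt propagate and adds random jitter to each sleep. It is not the
patch's code: the class JitteredWait, the method waitUntilReady and its
parameters are hypothetical names, and the readiness check is passed in as a
BooleanSupplier (for example jobSubmitClient::isReady) rather than tied to any
Hadoop class.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of the HADOOP-3618 patch.
final class JitteredWait {
    private JitteredWait() {}

    /**
     * Polls the readiness check until it returns true.
     * An interrupt ends the wait: Thread.sleep throws InterruptedException,
     * and it is propagated to the caller instead of being swallowed.
     *
     * @param ready          readiness check (assumed cheap and side-effect free)
     * @param baseIntervalMs fixed part of the sleep; could be read from configuration
     * @param maxJitterMs    upper bound (inclusive) of the random extra sleep, >= 0
     */
    static void waitUntilReady(BooleanSupplier ready, long baseIntervalMs, long maxJitterMs)
            throws InterruptedException {
        while (!ready.getAsBoolean()) {
            // Random extra delay in [0, maxJitterMs] so that clients which
            // started in lockstep drift apart instead of polling together.
            long jitter = ThreadLocalRandom.current().nextLong(maxJitterMs + 1);
            Thread.sleep(baseIntervalMs + jitter);
        }
    }
}
```

A caller that cannot rethrow InterruptedException could catch it, call
Thread.currentThread().interrupt() to restore the flag, and return.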
If this sleep-until-ready pattern is common, it should be factored out into a
method of its own and shared. I've been stubbing out (for my deployment use) a
simple lifecycle interface (start/stop/getStatus/ping). If that were adopted,
this patch could poll the getStatus() method.
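As a rough illustration of that idea, the sketch below shows one possible shape
for such an interface and a shared wait method built on it. Everything here is
an assumption made for the example: the Lifecycle interface, its Status values
and the LifecycleWaiter.awaitLive helper are hypothetical names, not an
existing Hadoop API and not the interface I have actually stubbed out.

```java
// Hypothetical lifecycle interface; method names follow the
// start/stop/getStatus/ping list above, the Status values are invented.
interface Lifecycle {
    enum Status { INITIALIZING, LIVE, STOPPED }

    void start();
    void stop();
    Status getStatus();
    boolean ping();
}

// Hypothetical shared helper so every client uses the same wait logic.
final class LifecycleWaiter {
    private LifecycleWaiter() {}

    /**
     * Polls getStatus() until the service reports LIVE.
     * InterruptedException is propagated so the waiting thread can be stopped.
     */
    static void awaitLive(Lifecycle service, long pollIntervalMs)
            throws InterruptedException {
        while (service.getStatus() != Lifecycle.Status.LIVE) {
            Thread.sleep(pollIntervalMs);
        }
    }
}
```

With something like this in place, the job client's loop would reduce to a
single awaitLive call against whatever object exposes the tracker's status.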
> JobClient should keep on retrying if the jobtracker is still initializing
> -------------------------------------------------------------------------
>
> Key: HADOOP-3618
> URL: https://issues.apache.org/jira/browse/HADOOP-3618
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Amar Kamat
> Assignee: Amar Kamat
> Attachments: HADOOP-3618.patch
>
>
> When the user submits the job while the jobtracker is still initializing, the
> jobclient comes out with an exception. Ideally the jobclient should keep on
> retrying until the jobtracker is up and ready. This will also take care of
> HADOOP-3289.