[
https://issues.apache.org/jira/browse/HADOOP-3842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12617809#action_12617809
]
Steve Loughran commented on HADOOP-3842:
----------------------------------------
Looking at this code, there's always been a bit of an implicit race condition,
with the offerService() method being the point where the service is meant to go
live.
1. We could have the JT reject submitJob() operations until offerService() is
called to take the tracker live, and not consider the service live until that
call is made.
2. In the HADOOP-3628 lifecycle changes, we could make offerService() a
deprecated no-op (and delete its current uses), and instead
 - have the service go live in the start() lifecycle event
 - reject all attempts to submit work until that service is live
Clearly I'm biased towards option (2); I could even write a test to verify that
jobs are rejected unless the service has been started. For a quick, inelegant
fix, though, we could start the scheduler in the JobTracker's constructor.
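
As a rough illustration of option (2), here is a minimal sketch of a service
that rejects submissions until its start() lifecycle event has run. It is not
the real JobTracker: the class SketchTracker, its State enum, and
scheduledCount() are hypothetical names invented for this example, and the
choice of IllegalStateException for the rejection is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the JobTracker; none of these names are Hadoop APIs.
class SketchTracker {
    private enum State { CREATED, LIVE }

    private State state = State.CREATED;
    private final List<String> scheduled = new ArrayList<>();

    // Lifecycle event that takes the service live. In a real tracker this is
    // where the scheduler would be started, so no job can reach it earlier.
    synchronized void start() {
        state = State.LIVE;
    }

    // Reject work until the service is live, rather than accepting a job
    // that would sit in memory and never be initialized.
    synchronized void submitJob(String jobId) {
        if (state != State.LIVE) {
            throw new IllegalStateException(
                "tracker is not live; rejecting job " + jobId);
        }
        scheduled.add(jobId);
    }

    // Number of jobs handed to the (imaginary) scheduler so far.
    synchronized int scheduledCount() {
        return scheduled.size();
    }
}
```

A test along the lines suggested above would call submitJob() on a fresh
instance and expect the exception, then call start() and check that a second
submission is accepted.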
> There is a window where the JobTracker is in the RUNNING state (i.e ready to
> accept jobs) and never executes them.
> ------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-3842
> URL: https://issues.apache.org/jira/browse/HADOOP-3842
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Reporter: Amar Kamat
> Priority: Blocker
>
> Prior to HADOOP-3412, the job tracker could accept jobs without offering
> service (i.e. without {{JobTracker.offerService()}} being called). In that
> case the job stayed in the JT's memory and its execution was guaranteed. With
> HADOOP-3412, {{JobTracker.submitJob()}} adds the job to the JT's local
> structures and passes it to the scheduler. The scheduler is initialized in
> {{JobTracker.offerService()}}, so calling {{JobTracker.submitJob()}}
> before {{JobTracker.offerService()}} is effectively a no-op: the job
> stays in the JT's memory but never gets initialized. This is
> - backward incompatible
> - erroneous as there is a window where the jobtracker is ready to accept
> jobs, accepts them and never executes them.