[
https://issues.apache.org/jira/browse/HADOOP-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613562#action_12613562
]
Vivek Ratan commented on HADOOP-3412:
-------------------------------------
bq. we want to support lazy loading of jobs so that we can scale up the job
tracker, which means we shouldn't use JobInProgress in the API since it is very
heavy
Isn't there a far easier way to do this? A JobInProgress object grows in size
when you call JobInProgress.initTasks(). The current JT code calls initTasks()
when a job is submitted to the JT (the newly created JobInProgress object is
actually put in a queue and some other thread calls initTasks(), but it's
really called as early as possible). A simple fix to this is to call
initTasks() only when a job is considered for running by the scheduler. That
way, you initialize a JobInProgress object only when needed. Otherwise, its
memory footprint is low. It's even worth arguing that a newly created
JobInProgress object should look a lot like what JobDescription looks like.
It's only when you 'initialize' it, at the point when the (first task in the)
job can be considered for running, that you need to expand all the other data
structures. This seems, IMO, to be a better way to handle scale than having
another class. To be fair, JobDescription is really what the scheduler should
be looking at, but it makes more sense if the Scheduler is a separate
component/process. Otherwise, you're duplicating state in JobDescription and
JobInProgress. You could also refactor JobInProgress so that it has a
JobDescription member variable which it exposes, rather than expose separate
methods for getting/setting priority or queue names, but there doesn't seem to
be an advantage to it, other than conceptually encapsulating information that a
Scheduler might need in one class.
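A minimal sketch of the lazy-initialization idea above, in Java. LazyJob and
LazyFifoScheduler are hypothetical stand-ins invented for illustration; they
are not the real JobInProgress or any existing Hadoop scheduler. The only
point shown is that addJob() leaves the job small and initTasks() runs when
the scheduler first considers the job for running.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for JobInProgress; not the real Hadoop class.
class LazyJob {
    private final String jobId;
    private final int numTasks;
    private List<String> tasks; // stays null (cheap) until initTasks() runs

    LazyJob(String jobId, int numTasks) {
        this.jobId = jobId;
        this.numTasks = numTasks;
    }

    synchronized boolean isInitialized() {
        return tasks != null;
    }

    // Expands the heavy per-task structures; calling it again is a no-op.
    synchronized void initTasks() {
        if (tasks != null) {
            return;
        }
        List<String> expanded = new ArrayList<>(numTasks);
        for (int i = 0; i < numTasks; i++) {
            expanded.add(jobId + "_task_" + i);
        }
        tasks = expanded;
    }

    synchronized int taskCount() {
        return tasks == null ? 0 : tasks.size();
    }
}

// Hypothetical FIFO scheduler that initializes a job only when it is
// actually considered for running.
class LazyFifoScheduler {
    private final List<LazyJob> submitted = new ArrayList<>();

    // Listener-style notification: the job is recorded but not expanded.
    void addJob(LazyJob job) {
        submitted.add(job);
    }

    // Returns the oldest submitted job, initializing it on first use.
    LazyJob nextJobToRun() {
        if (submitted.isEmpty()) {
            return null;
        }
        LazyJob job = submitted.get(0);
        job.initTasks();
        return job;
    }
}
```

With two jobs submitted, only the one at the head of the list is expanded when
nextJobToRun() is called; the other keeps its small footprint.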
As to whether queue names need to be part of TaskScheduler: we have two options
here.
* Queues are explicit in the system, and jobs are always submitted to a queue.
If so, you want this notion everywhere. JobTracker.submitJob() should be
changed to take in a jobID and a queue name, as you're explicitly submitting a
job to a queue. Then, TaskScheduler requires both a job and a queue name, in
order to tell it that a job was submitted to the system (as per Matei and Tom's
comment earlier, addJob() is a listener method and just needs to know when a
job is submitted to the system).
* Or, you could treat queues (and other things we may add later, such as Orgs)
as part of the job configuration. So, a user submits a job, and everything the
system needs to know is encapsulated in the jobID, when JobTracker.submitJob is
called.
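The two options can be sketched side by side in Java. Everything here is an
assumption made for illustration: JobConfStore, QueueAwareListener, the
property name "example.job.queue.name" and the "default" queue are all
hypothetical, and neither addJob() signature is taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical store of per-job configuration, keyed by job ID.
class JobConfStore {
    // Hypothetical property name; not a real Hadoop configuration key.
    static final String QUEUE_KEY = "example.job.queue.name";
    static final String DEFAULT_QUEUE = "default";

    private final Map<String, Map<String, String>> confs = new HashMap<>();

    void put(String jobId, Map<String, String> conf) {
        confs.put(jobId, new HashMap<>(conf));
    }

    // Falls back to the default queue when the job or the key is missing.
    String queueOf(String jobId) {
        Map<String, String> conf = confs.get(jobId);
        if (conf == null) {
            return DEFAULT_QUEUE;
        }
        return conf.getOrDefault(QUEUE_KEY, DEFAULT_QUEUE);
    }
}

// Records which queue each job landed in, under either option.
class QueueAwareListener {
    private final Map<String, String> queueByJob = new HashMap<>();
    private final JobConfStore confs;

    QueueAwareListener(JobConfStore confs) {
        this.confs = confs;
    }

    // Option 1: the queue is explicit in the listener call.
    void addJob(String jobId, String queueName) {
        queueByJob.put(jobId, queueName);
    }

    // Option 2: the queue is derived from the job's configuration.
    void addJob(String jobId) {
        queueByJob.put(jobId, confs.queueOf(jobId));
    }

    String queueOf(String jobId) {
        return queueByJob.get(jobId);
    }
}
```

Under the second option the interface does not change when further attributes
(such as Orgs) are added later; they become additional configuration keys.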
bq. Then the only data structure holding jobs is in the scheduler and doing
queries can be done through this api.
Do we want the Scheduler to serve queries? In the future, you may well want to
think of the Scheduler as just an algorithm that, given the state of the
system, only decides what task to give to a TT. Web serving may be done through
a completely different component.
I really think some other component besides the Scheduler needs to be
responsible for storing jobs and maintaining data structures that associate the
job with queues and deal with job persistence - everything to do with keeping
track of jobs & queues in memory. Different schedulers impose different
filters/sorting on these structures - they're really just algorithms that
access these data structures. Schedulers may keep other data structures for
their use. For example, in HADOOP-3445, the scheduler needs to know how many
unique users have submitted jobs to a queue, or how many tasks for a given user
are running. This information is kept in a different data structure that the
scheduling code controls. It doesn't need to be persisted and doesn't need the
same scaling/persistence functionality as you need for JobInProgress objects.
So in that sense, the TaskScheduler interface should not also expose jobs and
queues. getQueueNames() and getJobs() belong elsewhere (probably in a
JobQueueManager class).
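As a rough sketch of the kind of scheduler-private bookkeeping described
above (unique users per queue, running tasks per user), assuming plain
in-memory maps that are never persisted. SchedulerBookkeeping and its method
names are hypothetical and are not taken from HADOOP-3445.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical scheduler-owned state; rebuilt from scratch on restart.
class SchedulerBookkeeping {
    private final Map<String, Set<String>> usersByQueue = new HashMap<>();
    private final Map<String, Integer> runningTasksByUser = new HashMap<>();

    // Called when a user submits a job to a queue.
    void jobSubmitted(String queue, String user) {
        usersByQueue.computeIfAbsent(queue, q -> new HashSet<>()).add(user);
    }

    void taskStarted(String user) {
        runningTasksByUser.merge(user, 1, Integer::sum);
    }

    // Decrements the count and drops the entry when it reaches zero.
    void taskFinished(String user) {
        runningTasksByUser.computeIfPresent(user, (u, n) -> n > 1 ? n - 1 : null);
    }

    int uniqueUsers(String queue) {
        Set<String> users = usersByQueue.get(queue);
        return users == null ? 0 : users.size();
    }

    int runningTasks(String user) {
        return runningTasksByUser.getOrDefault(user, 0);
    }
}
```

None of this needs the scaling or persistence machinery of the job store,
which is the argument for keeping it inside the scheduling code.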
You may actually want two separate interfaces - one for Scheduling (which will
be similar to what TaskScheduler exposes) and one for iterating through jobs
and queues. For performance's sake, you may have the same class implement both,
but they are two separate interfaces.
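A possible shape for the two interfaces, with one class implementing both, is
sketched below in Java. The names SchedulingPolicy, JobQueueView,
InMemoryJobQueueManager and pickJobFor() are hypothetical, and the signatures
given to getQueueNames() and getJobs() are assumptions rather than the ones in
the patch. The picking logic is a trivial FIFO placeholder.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Scheduling side: decides which job should feed a given TaskTracker.
interface SchedulingPolicy {
    // Returns the ID of the job to serve next, or null if nothing is queued.
    String pickJobFor(String taskTracker);
}

// Query side: read-only iteration over queues and their jobs.
interface JobQueueView {
    Collection<String> getQueueNames();

    Collection<String> getJobs(String queueName);
}

// One class backing both interfaces with the same in-memory structure.
class InMemoryJobQueueManager implements SchedulingPolicy, JobQueueView {
    // LinkedHashMap keeps queues in the order they were first used.
    private final Map<String, Deque<String>> jobsByQueue = new LinkedHashMap<>();

    void addJob(String queueName, String jobId) {
        jobsByQueue.computeIfAbsent(queueName, q -> new ArrayDeque<>()).addLast(jobId);
    }

    @Override
    public String pickJobFor(String taskTracker) {
        // Placeholder policy: oldest job of the first non-empty queue.
        for (Deque<String> jobs : jobsByQueue.values()) {
            if (!jobs.isEmpty()) {
                return jobs.peekFirst();
            }
        }
        return null;
    }

    @Override
    public Collection<String> getQueueNames() {
        return new ArrayList<String>(jobsByQueue.keySet());
    }

    @Override
    public Collection<String> getJobs(String queueName) {
        Deque<String> jobs = jobsByQueue.get(queueName);
        if (jobs == null) {
            return new ArrayList<String>();
        }
        return new ArrayList<String>(jobs);
    }
}
```

A web UI would then depend only on JobQueueView and the JobTracker only on
SchedulingPolicy, so the two could later be split into separate components
without changing either caller.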
> Refactor the scheduler out of the JobTracker
> --------------------------------------------
>
> Key: HADOOP-3412
> URL: https://issues.apache.org/jira/browse/HADOOP-3412
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Brice Arnould
> Assignee: Brice Arnould
> Priority: Minor
> Fix For: 0.19.0
>
> Attachments: JobScheduler-v10.patch, JobScheduler-v9.1.patch,
> JobScheduler-v9.2.patch, JobScheduler-v9.patch, JobScheduler.patch,
> JobScheduler_v2.patch, JobScheduler_v3.patch, JobScheduler_v3b.patch,
> JobScheduler_v4.patch, JobScheduler_v5.patch, JobScheduler_v6.1.patch,
> JobScheduler_v6.2.patch, JobScheduler_v6.3.patch, JobScheduler_v6.4.patch,
> JobScheduler_v6.patch, JobScheduler_v7.1.patch, JobScheduler_v7.patch,
> JobScheduler_v8.patch, RackAwareJobScheduler.java,
> SimpleResourceAwareJobScheduler.java
>
>
> First, I would like to warn you that my proposal is admittedly very naive.
> I just hope that reading it won't make you lose time.
> h4. The aim
> It seems to me that improving Hadoop scheduling could be very profitable.
> But, it is hard to implement and compare schedulers, because the scheduling
> logic is mixed within the rest of the JobTracker.
> This bug is the first step of an attempt to improve the Hadoop scheduler. It
> re-implements the current scheduling algorithm in a separate class called
> JobScheduler. This new class is instantiated in the JobTracker.
> h4. Bugs fixed as a side effect
> This patch probably cannot be submitted as it is.
> A first difficulty is that it does not have exactly the same behaviour as
> the current JobTracker. More precisely, it doesn't re-implement things like
> code that seems to be never called or concurrency problems.
> I wrote TOCONFIRM where my proposal differs from the current
> implementation, so you can find them easily.
> I know that fixing bugs silently is bad. So, independently of what you decide
> about this patch, I will open issues for bugs that you confirm.
> h4. Other side effects
> Another side effect of this patch is to add documentation about each step of
> the scheduling. I hope that it will help future improvement by lowering the
> level required to contribute to the scheduler.
> It also reduces the complexity and the granularity of the JobTracker (making
> it more parallel).
> h4. The future
> If you feel that this is a step in the right direction, I will try to propose a
> JobSchedulerInterface that many JobSchedulers could implement and to propose
> alternatives to the current « FifoJobScheduler ». If some of you have ideas
> about that please tell ^^ I will also open issues for things marked as FIXME
> in the patch.