[ https://issues.apache.org/jira/browse/HADOOP-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom White updated HADOOP-3412:
------------------------------
Attachment: JobScheduler-v9.1.patch
A new patch with a few small changes.
> I think the addJob and removeJob methods in the TaskScheduler are only meant
> to be "listener" methods to notify it that a job should be considered for
> scheduling.
To emphasise this I have renamed these methods to be consistent with the
"listener style" often found in Java: jobAdded and jobRemoved. I've also added
a jobUpdated method since the job is actually being updated rather than added
then removed.
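
To illustrate the listener-style naming described above, here is a minimal,
self-contained sketch. It is not the code from the patch: the Job class is a
hypothetical stand-in for JobInProgress, and the JobListener interface,
the FifoScheduler class and all method signatures are assumptions made only
for this example.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ListenerSchedulerSketch {

    /** Hypothetical stand-in for JobInProgress; not the real Hadoop class. */
    static final class Job {
        final String id;
        int priority;

        Job(String id, int priority) {
            this.id = id;
            this.priority = priority;
        }
    }

    /** Assumed listener-style callbacks, named as in the comment above. */
    interface JobListener {
        void jobAdded(Job job);
        void jobRemoved(Job job);
        void jobUpdated(Job job);
    }

    /** Toy scheduler that only tracks the jobs it has been notified about. */
    static final class FifoScheduler implements JobListener {
        // LinkedHashSet keeps insertion order, giving FIFO iteration.
        private final Set<Job> jobs = new LinkedHashSet<>();
        private int updatesSeen = 0;

        @Override
        public void jobAdded(Job job) {
            jobs.add(job);
        }

        @Override
        public void jobRemoved(Job job) {
            jobs.remove(job);
        }

        @Override
        public void jobUpdated(Job job) {
            // An update leaves the job in place; it is not removed and re-added.
            if (jobs.contains(job)) {
                updatesSeen++;
            }
        }

        int size() {
            return jobs.size();
        }

        int updatesSeen() {
            return updatesSeen;
        }

        /** Returns the id of the oldest tracked job, or null if there is none. */
        String firstJobId() {
            return jobs.isEmpty() ? null : jobs.iterator().next().id;
        }
    }

    public static void main(String[] args) {
        FifoScheduler scheduler = new FifoScheduler();
        Job job = new Job("job_1", 1);
        scheduler.jobAdded(job);
        job.priority = 5;
        scheduler.jobUpdated(job);
        System.out.println(scheduler.firstJobId() + " updates=" + scheduler.updatesSeen());
    }
}
```

The point of the sketch is only that the owner of the jobs notifies the
scheduler about events, and that a change to a job arrives as a single
jobUpdated call while the job keeps its place in the queue.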
> Maybe the TaskTrackerManager interface could be package-private, since it is
> not meant to be used by end-users (if I understood well).
I didn't make it package-private as I wanted implementors of TaskScheduler
(which shouldn't have to be in the same package) to be able to use it. However,
I've just noticed that Task, JobInProgress and TaskTrackerStatus are
package-private, so I have made TaskTrackerManager the same. I suggest we
figure out how to allow TaskScheduler implementations to be in other packages
as a separate issue, where we consider how much of Task, JobInProgress and
TaskTrackerStatus we need to make public. This can be done in parallel with
HADOOP-3445 and HADOOP-3746.
> I suggest that subclasses could specify a Set<JobInProgress>
I agree - I've done this.
I've also fixed up some badly named TaskTrackerManager instances which still
had the word "Container" in them.
> Refactor the scheduler out of the JobTracker
> --------------------------------------------
>
> Key: HADOOP-3412
> URL: https://issues.apache.org/jira/browse/HADOOP-3412
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Brice Arnould
> Assignee: Brice Arnould
> Priority: Minor
> Fix For: 0.19.0
>
> Attachments: JobScheduler-v9.1.patch, JobScheduler-v9.patch,
> JobScheduler.patch, JobScheduler_v2.patch, JobScheduler_v3.patch,
> JobScheduler_v3b.patch, JobScheduler_v4.patch, JobScheduler_v5.patch,
> JobScheduler_v6.1.patch, JobScheduler_v6.2.patch, JobScheduler_v6.3.patch,
> JobScheduler_v6.4.patch, JobScheduler_v6.patch, JobScheduler_v7.1.patch,
> JobScheduler_v7.patch, JobScheduler_v8.patch, RackAwareJobScheduler.java,
> SimpleResourceAwareJobScheduler.java
>
>
> First, I would like to warn you that my proposal is admittedly very naive.
> I just hope that reading it won't waste your time.
> h4. The aim
> It seems to me that improving Hadoop scheduling could be very profitable.
> But, it is hard to implement and compare schedulers, because the scheduling
> logic is mixed within the rest of the JobTracker.
> This bug is the first step of an attempt to improve the Hadoop scheduler. It
> re-implements the current scheduling algorithm in a separate class called
> JobScheduler. This new class is instantiated in the JobTracker.
> h4. Bugs fixed as a side effect
> This patch probably cannot be submitted as it is.
> A first difficulty is that it does not have exactly the same behaviour as
> the current JobTracker. More precisely, it doesn't re-implement things like
> code that seems never to be called or concurrency problems.
> I wrote TOCONFIRM where my proposal differs from the current
> implementation, so you can find them easily.
> I know that fixing bugs silently is bad. So, independently of what you decide
> about this patch, I will open issues for bugs that you confirm.
> h4. Other side effects
> Another side effect of this patch is to add documentation about each step of
> the scheduling. I hope that it will help future improvement by lowering the
> level required to contribute to the scheduler.
> It also reduces the complexity and the granularity of the JobTracker (making
> it more parallel).
> h4. The future
> If you feel that this is a step in the right direction, I will try to propose a
> JobSchedulerInterface that many JobSchedulers could implement and to propose
> alternatives to the current « FifoJobScheduler ». If some of you have ideas
> about that please tell ^^ I will also open issues for things marked as FIXME
> in the patch.