[jira] Commented: (HADOOP-3412) Refactor the scheduler out of the JobTracker

Brice Arnould (JIRA) Fri, 23 May 2008 10:00:20 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599429#action_12599429
 ]


Brice Arnould commented on HADOOP-3412:
---------------------------------------

bq. Why is the scheduler responsible for managing the task tracker statuses? 
Shouldn't that stay in JobTracker? Though you'd still need 
updateTaskTrackerStatus() in the JobScheduler so that the scheduler can do 
something smart if there's been a change in status.
You're right. My aim was to move inside the JobScheduler the two structures we 
might want to reorder in order to select tasks in a more efficient way. I told 
myself that it would be a waste to keep two copies of the same collections (the 
unordered in the JobTracker, the ordered in the JobScheduler). But on the other 
hand, it can make the code more verbose on simple algorithms.
So I propose to write an AbstractJobScheduler with default implementation for 
{remove,update}TaskTracker (and maybe also jobs, if it seems reasonable). This 
way, the schedulers who just need to change addJob, removeJob and assignTask 
will be able to reuse a common basis.

bq. It might be wise to add an initialize() to the JobScheduler interface so 
that JobSchedulers can be written using only the default constructor. This 
would make it easier to push the choice of scheduler into a config file; you'd 
just list a class name and the system could use reflection to load the 
scheduler and start it.
The latest version of the patch (v4) already use reflection to load a scheduler 
according to the configuration file. I'm really unsure about the name I choose 
for the option however ("mapred.jobtracker.scheduler").
I didn't used Utils.Reflection because it seems to be restricted by design to 
objects with a single argument of type Configuration. Moreover, I like better 
constructor that return objects "ready for use" :-P. But if you feel that it 
doesn't respect the Hadoop code style, I can change that, it's not a problem.

bq. For which structures do you intend getLockOnJobs() to return a lock? Saying 
that the scheduler "won't move anything" is a little open-ended. I don't see 
why the scheduler would ever need to write to externally-visible structures.
This description really needs some clarification. getLocksOnJobs() give you the 
insurance that every operation on internally stored jobs, namely addJob, 
removeJob and assignTask, will wait for you to release the lock before to alter 
the list. It's intended use is that a user of this class intending to alter a 
Job do it in that way :
synchronize (scheduler.getLocksOnJobs()) {
  scheduler.remove (job);
  [Modify job]
  scheduler.add(job);
}
I told myself that in this way (add() then remove()), the asymptotic complexity 
with most containers would be lesser than it would be if I provided a 
notifyJobModification() which ressorted everything (my original intention).

Thanks for your comments, I'm going to post soon a new version of this patch 
that will take them in account. Please tell me if you see other errors in that 
one.

PS: Please excuse me for my English. If I said something misplaced or impolite, 
it is not by intention. 

> Refactor the scheduler out of the JobTracker
> --------------------------------------------
>
>                 Key: HADOOP-3412
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3412
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Brice Arnould
>            Assignee: Brice Arnould
>            Priority: Minor
>         Attachments: JobScheduler.patch, JobScheduler_v2.patch, 
> JobScheduler_v3.patch, JobScheduler_v3b.patch, JobScheduler_v4.patch, 
> RackAwareJobScheduler.java
>
>
> First I would like warn you that my proposition is assumed to be very naive. 
> I just hope that reading it won't make you lose time.
> h4. The aim
> It seems to me that improving Hadoop scheduling could be very profitable. 
> But, it is hard to implement and compare schedulers, because the scheduling 
> logic is mixed within the rest of the JobTracker.
> This bug is the first step of an attempt to improve the Hadoop scheduler. It 
> re-implements the current scheduling algorithm in a separate class called 
> JobScheduler. This new class is instantiated in the JobTracker.
> h4. Bug fixed as a side effects
> This patch probably cannot be submited as it is.
> A first difficulty is that it does not have exactly the same behaviour than 
> the current JobTracker. More precisely, it doesn't re-implement things like 
> code that seems to be never called or concurency problems.
> I wrote TOCONFIRM where my proposition differ from the current 
> implementation, so you can find them easily.
> I know that fixing bugs silently is bad. So, independently of what you decide 
> about this patch, I will open issues for bugs that you confirm.
> h4. Other side effects
> Another side effect of this patch is to add documentation about each step of 
> the scheduling. I hope that it will help future improvement by lowering the 
> level required to contribute to the scheduler.
> It also reduces the complexity and the granularity of the JobTracker (making 
> it more parallel).
> h4. The future
> If you feel that this is a step the right direction, I will try to propose a 
> JobSchedulerInterface that many JobSchedulers could implement and to propose 
> alternatives to the current « FifoJobScheduler ».  If some of you have ideas 
> about that please tell ^^ I will also open issues for things marked as FIXME 
> in the patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-3412) Refactor the scheduler out of the JobTracker

Reply via email to