[ 
https://issues.apache.org/jira/browse/HADOOP-3412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614239#action_12614239
 ] 

Vivek Ratan commented on HADOOP-3412:
-------------------------------------

I realize we're trying to get this patch committed quickly, but having looked 
at the latest patch, I do have a concern. As per patch v11, we have the 
following class hierarchy: TaskScheduler --> 
EagerTaskInitializationTaskScheduler --> JobQueueTaskScheduler --> 
LimitTasksPerJobTaskScheduler. EagerTaskInitializationTaskScheduler provides a 
mechanism for initializing multiple JobInProgress objects asynchronously 
(through a separate thread). JobQueueTaskScheduler is your basic scheduler, and 
LimitTasksPerJobTaskScheduler is a scheduler that limits concurrent tasks per 
job. My concern is that this kind of class hierarchy is the wrong way to allow 
people to build their own schedulers. What you really want is a library, or a 
set of separate classes, each providing one piece of functionality: one for 
limiting concurrent tasks per job, one for initializing jobs in a separate 
thread, and so on. Then somebody can build a scheduler by picking the pieces 
they want and composing the classes that provide them. Inheritance does not 
let you do that, because a subclass gets everything its ancestors do. 

Here's an example (based on HADOOP-3445). I want to build a scheduler that 
limits concurrent tasks per job, but that does not initialize jobs in a 
separate thread (it initializes individual jobs directly, only when required, 
in order to scale). What do I do? I don't want to extend 
LimitTasksPerJobTaskScheduler, because then I also inherit the functionality of 
EagerTaskInitializationTaskScheduler, which I don't want. What if my scheduler 
also supports some sort of fair share (giving equal time slots to each user's 
jobs) and user limits (limiting the total number of tasks associated with a 
user)? Do I still extend LimitTasksPerJobTaskScheduler? In one class? Or do I 
define a further hierarchy: LimitTasksPerJobTaskScheduler --> 
FairShareScheduler --> LimitTasksPerUserScheduler? Which class extends which 
other class? 

You really want people to build schedulers by composing lots of individual 
functionality, because scheduling incorporates lots of individual algorithms, 
each possibly very different from the others. I think you really want any 
class that extends TaskScheduler to be a complete scheduler in itself, made up 
of lots of different functionalities (the HADOOP-3445 scheduler, for example, 
provides task limits per user AND capacities AND priorities AND some 
preemption, each of which may be reused by some other scheduler). Because you 
want to allow different schedulers to share functionality (for example, two 
different schedulers may both want to limit tasks per job while otherwise 
supporting different features, so they should ideally share the code that 
limits tasks per job), you want this functionality to be available as 
composable objects, or in a separate library. You don't want hierarchies based 
on inheritance. 
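
To make the example concrete, here is a minimal sketch of a scheduler that 
limits tasks per job and initializes jobs lazily, built by composition rather 
than by extending LimitTasksPerJobTaskScheduler. Every name in it (Job, 
JobInitializer, LazyJobInitializer, PerJobTaskLimiter, ComposedScheduler) is 
hypothetical and invented for illustration; none of it is the API from the 
patch or from Hadoop. 

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a job; this is NOT Hadoop's JobInProgress.
class Job {
    final String name;
    boolean initialized = false;
    int runningTasks = 0;

    Job(String name) {
        this.name = name;
    }
}

// One piece of functionality: how and when a job gets initialized.
interface JobInitializer {
    void ensureInitialized(Job job);
}

// Initializes a job directly, on first use, with no background thread.
class LazyJobInitializer implements JobInitializer {
    @Override
    public void ensureInitialized(Job job) {
        if (!job.initialized) {
            job.initialized = true; // real (expensive) setup would go here
        }
    }
}

// Another piece of functionality: a cap on concurrent tasks per job.
class PerJobTaskLimiter {
    private final int maxTasksPerJob;

    PerJobTaskLimiter(int maxTasksPerJob) {
        this.maxTasksPerJob = maxTasksPerJob;
    }

    boolean canRunAnotherTask(Job job) {
        return job.runningTasks < maxTasksPerJob;
    }
}

// A complete scheduler assembled from the pieces it wants. It gets the
// per-job limit without inheriting any eager-initialization behaviour.
class ComposedScheduler {
    private final List<Job> queue = new ArrayList<>();
    private final JobInitializer initializer;
    private final PerJobTaskLimiter limiter;

    ComposedScheduler(JobInitializer initializer, PerJobTaskLimiter limiter) {
        this.initializer = initializer;
        this.limiter = limiter;
    }

    void addJob(Job job) {
        queue.add(job);
    }

    // Gives the next free slot to the first queued job (FIFO) that is under
    // its limit; returns null when no job qualifies.
    Job assignTask() {
        for (Job job : queue) {
            if (limiter.canRunAnotherTask(job)) {
                initializer.ensureInitialized(job);
                job.runningTasks++;
                return job;
            }
        }
        return null;
    }
}
```

A scheduler that wants eager initialization would pass a different 
JobInitializer and could reuse PerJobTaskLimiter unchanged, which is the kind 
of sharing an inheritance chain makes awkward. 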

I don't quite know what these composable objects look like. Perhaps you define 
an interface that takes in tasks and decides whether those tasks pass or fail 
the policy that the object implements. That should be our discussion. Having 
class hierarchies, as we do today, will severely limit extensibility, IMO.  
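
As one possible starting point for that discussion, here is a rough sketch of 
such a pass/fail policy interface, with two policies and a composite that 
requires all of them to pass. The names (TaskRequest, ClusterState, 
SchedulingPolicy, MaxTasksPerJobPolicy, MaxTasksPerUserPolicy, AllOfPolicy) 
and the shape of the state object are assumptions made for illustration only; 
nothing here comes from the patch or from Hadoop. 

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical description of a task that is a candidate for a free slot.
class TaskRequest {
    final String jobId;
    final String user;

    TaskRequest(String jobId, String user) {
        this.jobId = jobId;
        this.user = user;
    }
}

// Hypothetical snapshot of how many tasks are currently running.
class ClusterState {
    final Map<String, Integer> runningPerJob = new HashMap<>();
    final Map<String, Integer> runningPerUser = new HashMap<>();
}

// The interface floated above: take a task, say whether it passes the policy.
interface SchedulingPolicy {
    boolean allows(TaskRequest task, ClusterState state);
}

// Passes while the task's job has fewer than `limit` running tasks.
class MaxTasksPerJobPolicy implements SchedulingPolicy {
    private final int limit;

    MaxTasksPerJobPolicy(int limit) {
        this.limit = limit;
    }

    @Override
    public boolean allows(TaskRequest task, ClusterState state) {
        return state.runningPerJob.getOrDefault(task.jobId, 0) < limit;
    }
}

// Passes while the task's user has fewer than `limit` running tasks.
class MaxTasksPerUserPolicy implements SchedulingPolicy {
    private final int limit;

    MaxTasksPerUserPolicy(int limit) {
        this.limit = limit;
    }

    @Override
    public boolean allows(TaskRequest task, ClusterState state) {
        return state.runningPerUser.getOrDefault(task.user, 0) < limit;
    }
}

// Composite policy: a task passes only if every member policy passes.
class AllOfPolicy implements SchedulingPolicy {
    private final List<SchedulingPolicy> policies;

    AllOfPolicy(SchedulingPolicy... policies) {
        this.policies = Arrays.asList(policies);
    }

    @Override
    public boolean allows(TaskRequest task, ClusterState state) {
        for (SchedulingPolicy policy : policies) {
            if (!policy.allows(task, state)) {
                return false;
            }
        }
        return true;
    }
}
```

With something like this, two schedulers that both limit tasks per job could 
share MaxTasksPerJobPolicy and differ only in which other policies they 
combine it with. A pure pass/fail interface would not by itself cover 
ordering concerns such as priorities or fair share, so it is at best one 
piece of the design. 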



> Refactor the scheduler out of the JobTracker
> --------------------------------------------
>
>                 Key: HADOOP-3412
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3412
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Brice Arnould
>            Assignee: Brice Arnould
>            Priority: Minor
>             Fix For: 0.19.0
>
>         Attachments: JobScheduler-v10.patch, JobScheduler-v11.patch, 
> JobScheduler-v9.1.patch, JobScheduler-v9.2.patch, JobScheduler-v9.patch, 
> JobScheduler.patch, JobScheduler_v2.patch, JobScheduler_v3.patch, 
> JobScheduler_v3b.patch, JobScheduler_v4.patch, JobScheduler_v5.patch, 
> JobScheduler_v6.1.patch, JobScheduler_v6.2.patch, JobScheduler_v6.3.patch, 
> JobScheduler_v6.4.patch, JobScheduler_v6.patch, JobScheduler_v7.1.patch, 
> JobScheduler_v7.patch, JobScheduler_v8.patch, RackAwareJobScheduler.java, 
> SimpleResourceAwareJobScheduler.java
>
>
> First, I would like to warn you that my proposal may be very naive. 
> I just hope that reading it won't waste your time.
> h4. The aim
> It seems to me that improving Hadoop scheduling could be very profitable. 
> But it is hard to implement and compare schedulers, because the scheduling 
> logic is mixed in with the rest of the JobTracker.
> This bug is the first step of an attempt to improve the Hadoop scheduler. It 
> re-implements the current scheduling algorithm in a separate class called 
> JobScheduler. This new class is instantiated in the JobTracker.
> h4. Bugs fixed as a side effect
> This patch probably cannot be submitted as it is.
> A first difficulty is that it does not have exactly the same behaviour as 
> the current JobTracker. More precisely, it doesn't re-implement things like 
> code that seems never to be called, or concurrency problems.
> I wrote TOCONFIRM wherever my proposal differs from the current 
> implementation, so you can find those places easily.
> I know that fixing bugs silently is bad. So, independently of what you decide 
> about this patch, I will open issues for bugs that you confirm.
> h4. Other side effects
> Another side effect of this patch is to add documentation about each step of 
> the scheduling. I hope that it will help future improvement by lowering the 
> level required to contribute to the scheduler.
> It also reduces the complexity and the granularity of the JobTracker (making 
> it more parallel).
> h4. The future
> If you feel that this is a step in the right direction, I will try to 
> propose a JobSchedulerInterface that many JobSchedulers could implement, and 
> to propose alternatives to the current « FifoJobScheduler ».  If some of you 
> have ideas about that, please tell me ^^ I will also open issues for things 
> marked as FIXME in the patch.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
