[ 
https://issues.apache.org/jira/browse/HADOOP-4513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12646873#action_12646873
 ] 

Sreekanth Ramakrishnan commented on HADOOP-4513:
------------------------------------------------

After an offline discussion with Hemanth and Vivek, the following is the proposal 
for implementing asynchronous initialization of jobs by the capacity scheduler:

- Modify _CapacityTaskScheduler_ to look only at the Run-queue maintained by 
_JobQueueManager_. This queue contains all initialized jobs.
- Modify _JobQueueManager_ to change the semantics of the waiting job queue to a 
list of jobs which are waiting to be scheduled. Please note that, while a job is 
waiting to be scheduled, it is possible for a job J1 to be in both the running 
queue and the job queue at the same time. When the first map or reduce task of 
the job is scheduled, the job is removed from the job queue which 
_JobQueueManager_ maintains.
- Introduce a new poller class, which looks at 
_JobQueueManager.getJobs(queue)_ and picks up jobs to initialize for that 
queue.
- The following parameters would be used for selecting jobs for eager 
initialization:
-- Maximum number of jobs which can be initialized per user. This would be a 
new configuration parameter introduced in _capacity_scheduler.xml_.
-- Number of concurrent users supported by the queue, so that the initialization 
poller would initialize the jobs of ((userlimits/100) + 2 ) users.
- The selected jobs would be passed on to worker threads, which can be assigned 
duty of initializing jobs from one or more queues.
- The worker thread maintains separate lists for jobs from different queues, 
sorted by priority in the same way as _JobQueueManager_.
- The worker thread then initializes the jobs in a round robin fashion amongst 
the job queues assigned to it, i.e. it initializes the first job from q1 and 
then the first job from q2.
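
The round robin behaviour of a single worker could be sketched roughly as below. This is only an illustration of the ordering, not the proposed implementation: the class name _RoundRobinInitSketch_, the method _initializationOrder_ and the job names are hypothetical, and each per-queue list is assumed to be already sorted by priority.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the order in which one worker thread would initialize
// jobs when it serves several queues in round robin fashion.
public class RoundRobinInitSketch {

    // jobsByQueue maps a queue name to its jobs, assumed already sorted by priority.
    static List<String> initializationOrder(Map<String, List<String>> jobsByQueue) {
        // Copy each queue's jobs so the caller's lists are left untouched.
        List<Deque<String>> pending = new ArrayList<>();
        for (List<String> jobs : jobsByQueue.values()) {
            pending.add(new ArrayDeque<>(jobs));
        }

        List<String> order = new ArrayList<>();
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            // One pass: take at most one job from the head of each queue.
            for (Deque<String> queue : pending) {
                String job = queue.poll();
                if (job != null) {
                    order.add(job); // a real worker would initialize the job here
                    progressed = true;
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> jobsByQueue = new LinkedHashMap<>();
        jobsByQueue.put("q1", Arrays.asList("q1-J1", "q1-J2", "q1-J3"));
        jobsByQueue.put("q2", Arrays.asList("q2-J1"));
        // prints [q1-J1, q2-J1, q1-J2, q1-J3]
        System.out.println(initializationOrder(jobsByQueue));
    }
}
```

Once q2 runs out of jobs, the worker simply keeps draining q1, so a short queue never blocks a longer one.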

Illustration:

Consider a job queue q which can support one concurrent user (i.e. 
userlimits = 100%). Three users U1, U2 and U3 are submitting jobs in the 
following distribution:

Maximum number of jobs to be initialized per user : 2


J1U1,J2U1,J3U1,J4U1,J1U2,J2U2,J3U2,J4U2,J1U3,J2U3,J3U3,J4U3.

The jobs initialized by the initialization threads would be:

J1U1,J2U1,J1U2,J2U2,J1U3,J2U3.
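
One way to read this first selection pass is sketched below. It is an illustration under stated assumptions, not the proposed implementation: the class _InitSelectionSketch_, the method _selectForInit_ and the helper _userOf_ are hypothetical names; each of the three users is assumed to have submitted four jobs J1 to J4; and the user cap of 3 is taken from reading ((userlimits/100) + 2 ) with userlimits = 100. The sketch covers only this first pass and does not model jobs that were already initialized in an earlier iteration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of one poller pass over a priority-ordered job queue.
public class InitSelectionSketch {

    // Job ids in the illustration look like "J1U2"; the user is the part from 'U' on.
    static String userOf(String jobId) {
        return jobId.substring(jobId.indexOf('U'));
    }

    // Walks the queue in order and picks jobs until a user reaches
    // maxJobsPerUser, considering only the first maxUsers distinct users.
    static List<String> selectForInit(List<String> queue, int maxJobsPerUser, int maxUsers) {
        Map<String, Integer> initializedPerUser = new LinkedHashMap<>();
        List<String> selected = new ArrayList<>();
        for (String job : queue) {
            String user = userOf(job);
            Integer count = initializedPerUser.get(user);
            if (count == null) {
                if (initializedPerUser.size() >= maxUsers) {
                    continue; // user cap reached, skip jobs of further users
                }
                count = 0;
            }
            if (count >= maxJobsPerUser) {
                continue; // this user already has the maximum number of jobs selected
            }
            initializedPerUser.put(user, count + 1);
            selected.add(job);
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> queue = Arrays.asList(
            "J1U1", "J2U1", "J3U1", "J4U1",
            "J1U2", "J2U2", "J3U2", "J4U2",
            "J1U3", "J2U3", "J3U3", "J4U3");
        // Assumed values: 2 jobs per user, (100/100) + 2 = 3 users.
        // prints [J1U1, J2U1, J1U2, J2U2, J1U3, J2U3]
        System.out.println(selectForInit(queue, 2, 3));
    }
}
```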

Suppose all of these are initialized but not yet scheduled, and a user U4 then 
submits a very high priority job and a normal priority job. Our job queue at 
instant t+1 would look like:

J1U4,J1U1,J2U1,J3U1,J4U1,J1U2,J2U2,J3U2,J4U2,J1U3,J2U3,J3U3,J4U3,J2U4.

So in the next iteration the poller would have initialized the following:

J1U4,J1U1,J2U1,J1U2,J2U2,J1U3,J2U3. 

Please note that U4's second job would not be initialized.

If U1 had submitted the very high priority job instead, that user would be 
exceeding the maximum number of jobs which are allowed to be initialized per user.


In the above example, if J1U1 is a job which takes a long time to initialize, 
the next job to be initialized would be the next highest priority job, or the 
highest priority job (if that job is submitted late, as in the above example).


Any thoughts on the above approach?




> Capacity scheduler should initialize tasks asynchronously
> ---------------------------------------------------------
>
>                 Key: HADOOP-4513
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4513
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Sreekanth Ramakrishnan
>
> Currently, the capacity scheduler initializes tasks on demand, as opposed to 
> the eager initialization technique used by the default scheduler. This is 
> done in order to save JT memory footprint. However, the initialization is 
> done in the {{assignTasks}} API which is not a good idea as task 
> initialization could be a time consuming operation. This JIRA is to move out 
> the initialization outside the {{assignTasks}} API and do it asynchronously.

