[ 
https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12634777#action_12634777
 ] 

Hemanth Yamijala commented on HADOOP-4035:
------------------------------------------

Following an offline discussion with Owen, his proposal was the following:

- The scheduler assigns a task to a TT only if the amount of free memory 
reported is greater than the task's requirements.
- If the free memory is not sufficient, we don't move to the next job. That is, 
we block, thus removing any possible starvation of this job.
- We don't try to make this job account for more usage at this point; we handle 
that problem later, most likely after 0.19.

Thinking about this, the only disadvantage I see with this approach is that a 
user who submits a job with high memory requirements could essentially block 
other users, at least until his limit is hit.

So, I would suggest we change the above proposal to not block, but instead move 
over to the next job. This way, a user with high RAM requirements cannot block 
other users, and cannot game the system in that way.
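
To make the difference between the two behaviours concrete, here is a minimal sketch of both. It is not the capacity scheduler's actual code: the class, the Job type, pickJob and the blockOnMismatch flag are all hypothetical names made up for illustration, and memory is assumed to be a single number in MB.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the capacity scheduler.
final class MemoryAwareAssignSketch {

    /** A queued job, reduced to the one field that matters here. */
    static final class Job {
        final String name;
        final long taskMemoryMb; // memory that one task of this job needs

        Job(String name, long taskMemoryMb) {
            this.name = name;
            this.taskMemoryMb = taskMemoryMb;
        }
    }

    /**
     * Picks the job whose task would be assigned to a TT reporting
     * freeMemoryMb. Jobs are given in scheduling order.
     * Returns null when no task is assigned.
     */
    static Job pickJob(List<Job> jobsInOrder, long freeMemoryMb,
                       boolean blockOnMismatch) {
        for (Job job : jobsInOrder) {
            // Assign only if free memory is greater than the requirement.
            if (freeMemoryMb > job.taskMemoryMb) {
                return job;
            }
            if (blockOnMismatch) {
                // Original proposal: do not look at later jobs.
                return null;
            }
            // Suggested change: fall through and try the next job.
        }
        return null;
    }

    public static void main(String[] args) {
        List<Job> jobs = new ArrayList<>();
        jobs.add(new Job("highmem", 4096));
        jobs.add(new Job("small", 512));

        Job blocking = pickJob(jobs, 1024, true);
        Job skipping = pickJob(jobs, 1024, false);
        System.out.println("blocking: " + (blocking == null ? "none" : blocking.name));
        System.out.println("skipping: " + (skipping == null ? "none" : skipping.name));
    }
}
```

With blockOnMismatch set, the high-memory job at the head of the list keeps later jobs from getting the TT, which protects it from starvation. Without it, later jobs can still run, at the cost of the high-memory job possibly waiting longer.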

Note that:
- This is exactly what we do in HADOOP-657 for disk space usage.
- When we introduce accounting, we can also revisit the blocking behavior.

Can we agree on this?


> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory 
> requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify 
> memory requirements for jobs, and also modified the tasktrackers to report 
> their free memory. The capacity scheduler in HADOOP-3445 should schedule 
> tasks based on these parameters. A task that is scheduled on a TT that uses 
> more than the default amount of memory per slot can be viewed as effectively 
> using more than one slot, as it would decrease the amount of free memory on 
> the TT by more than the default amount while it runs. The scheduler should 
> make the used capacity account for this additional usage while enforcing 
> limits, etc.

