[
https://issues.apache.org/jira/browse/HADOOP-4803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671723#action_12671723
]
Matei Zaharia commented on HADOOP-4803:
---------------------------------------
Yes, exactly. The last time each pool was at its min / fair share is already
being maintained by the preemption patch (HADOOP-4665), so this won't be much
work. One other benefit of this change is that jobs will tend to reuse the
same slot more often, leading to more JVM reuse. That could be a bad thing if
it hurt locality, but HADOOP-4667 will ensure that a job keeps using a node
until it runs out of local blocks to read on that node, then waits and
switches to a node where it hopefully has more local data to process (see the
sketch below). This should give us the best of both JVM reuse and data
locality. (When I talked to Arun and Owen about the use of deficits in the
fair scheduler before, they were concerned that it might lead to less JVM
reuse because jobs would jump between slots more often.)
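
To make the locality side concrete, here is a rough sketch of the
delay-scheduling idea in the spirit of HADOOP-4667. All names here
(JobInfo, hasLocalBlockOn, LOCALITY_DELAY_MS) are invented for illustration
and are not the actual API of the patch:

    // Minimal sketch of delay scheduling: keep a job on nodes where it has
    // local blocks, and make it skip non-local slots for a bounded time.
    class DelaySchedulingSketch {
      static final long LOCALITY_DELAY_MS = 5000; // assumed wait bound

      static class JobInfo {
        // Nodes that still hold unread input blocks for this job.
        java.util.Set<String> nodesWithLocalBlocks =
            new java.util.HashSet<String>();
        long lastLocalLaunchMs; // last time we launched a node-local task

        boolean hasLocalBlockOn(String node) {
          return nodesWithLocalBlocks.contains(node);
        }
      }

      /** Decide whether the job may take a slot on this node right now. */
      boolean canLaunchOn(JobInfo job, String node, long nowMs) {
        if (job.hasLocalBlockOn(node)) {
          // Keep reusing this node (and its JVMs) while it has local data.
          job.lastLocalLaunchMs = nowMs;
          return true;
        }
        // No local blocks left here: skip the slot and wait for a better
        // node, but only up to a bound, so the job is never starved.
        return nowMs - job.lastLocalLaunchMs > LOCALITY_DELAY_MS;
      }
    }

The bounded wait is what reconciles the two goals: a job sticks to a node
(more JVM reuse) while local data remains, and falls back to non-local slots
only after the delay expires.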
> large pending jobs hog resources
> --------------------------------
>
> Key: HADOOP-4803
> URL: https://issues.apache.org/jira/browse/HADOOP-4803
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/fair-share
> Reporter: Joydeep Sen Sarma
> Assignee: Matei Zaharia
>
> observing the cluster over the last day, one thing I noticed is that small
> jobs (single-digit task counts) are not doing a good job competing against
> large jobs. what seems to happen is that:
> - a large job comes along and has to wait for a while behind other large jobs.
> - slots are slowly transferred from one large job to another.
> - small jobs' tasks keep waiting forever.
> is this an artifact of deficit-based scheduling? it seems that long-pending
> large jobs are out-scheduling small jobs.
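
For reference, the starvation pattern described above falls out of how
deficits accumulate over time. A rough sketch, assuming a periodic update
loop; the Job fields and method names are invented for exposition and are not
the fair scheduler's real internals:

    // Illustrative deficit-based scheduling: each job accrues "slots owed"
    // proportional to its unmet fair share, and free slots go to the job
    // with the largest deficit.
    class DeficitSketch {
      static class Job {
        double fairShare;   // slots the job deserves under fair sharing
        int runningTasks;   // slots it currently holds
        double deficit;     // accumulated shortfall, in slot-milliseconds
      }

      /** Periodic update: every job accrues deficit for its unmet share. */
      void updateDeficits(java.util.List<Job> jobs, long deltaMs) {
        for (Job j : jobs) {
          j.deficit += (j.fairShare - j.runningTasks) * deltaMs;
        }
      }

      /**
       * A large job pending for a long time accrues roughly
       * fairShare * waitTime of deficit, so it keeps outbidding a small job
       * whose fair share, and hence deficit, stays tiny. That is the
       * starvation pattern reported above.
       */
      Job pickNextJob(java.util.List<Job> jobs) {
        Job best = null;
        for (Job j : jobs) {
          if (best == null || j.deficit > best.deficit) {
            best = j;
          }
        }
        return best;
      }
    }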