[
https://issues.apache.org/jira/browse/MAPREDUCE-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407403#comment-13407403
]
Andrew Ferguson commented on MAPREDUCE-4327:
--------------------------------------------
hi Robert,
Thanks for your feedback! since I posted the earlier update, I've been pushing
it to completion: adding CPU core information to the queue metrics, resource
manager web interface, etc. I've also been adding test cases and ensuring that
the new patch passes existing test cases as well. currently, the patch is
failing just a few unit tests, but I expect it will be done in a day or two.
as the patch has grown quite large (the diff is pushing 7000 lines), it's
clear we want to minimize the cost of adding a third resource. as it is, most
of the diff is new testing. I will strive to keep function calls as general as
possible (e.g., "Resource r" instead of "int memory, float cores"), but there are
quite a few places where we want to consider each resource separately since the
math can be different, and it should be clear to anyone adding additional
resources that they need to consider something in that function's logic.
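
As a rough illustration of the "Resource r" point above, here is a minimal Java
sketch. The class name, fields, and methods are hypothetical and are not taken
from the patch; it only shows a general call shape next to per-resource logic.

```java
// Hypothetical sketch only; this is not the Resource class from the patch.
final class ResourceSketch {
    final int memoryMb;   // requested memory, in MB
    final float cores;    // requested CPU cores

    ResourceSketch(int memoryMb, float cores) {
        this.memoryMb = memoryMb;
        this.cores = cores;
    }

    // General call shape: takes whole resources, so a third dimension
    // would change this class rather than every caller's signature.
    static ResourceSketch add(ResourceSketch a, ResourceSketch b) {
        return new ResourceSketch(a.memoryMb + b.memoryMb, a.cores + b.cores);
    }

    // Per-resource logic stays explicit: each dimension is checked on its
    // own, so anyone adding a resource has to revisit this method.
    boolean fitsIn(ResourceSketch available) {
        return memoryMb <= available.memoryMb && cores <= available.cores;
    }
}
```
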
Regarding applications which haven't been updated for CPU cores, and might
submit a request with 0 or NULL, my current patch does round the request to the
minimum resource request, so those applications will be fine. (not sure if the
currently attached patch does this)
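
The rounding described above might look roughly like the following Java sketch.
It assumes "round to the minimum" means raising a missing or too-small value up
to the configured minimum; the class and method names are hypothetical, not the
patch's.

```java
// Hypothetical sketch only; names and behaviour are assumptions.
final class RequestNormalizerSketch {
    // A null or below-minimum core request is raised to the minimum, so an
    // application that never sets cores still gets a valid request.
    static float normalizeCores(Float requestedCores, float minimumCores) {
        if (requestedCores == null || requestedCores < minimumCores) {
            return minimumCores;
        }
        return requestedCores;
    }
}
```
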
Regarding "spare capacity" -- I think this is one of the differences between
the capacity scheduler and the fair scheduler. should the capacity not in use
(or leftover capacity from queues which can't fill it because of the new
multi-dimensional nature of resources) be simply split over the queues based on
their capacity percentages? or should that capacity be treated as a single
pool, and allocations be made treating the capacity percentages as weights?
(this is more of a Fair Scheduler approach). anyway, I agree, that should probably
be left as a separate JIRA, or perhaps simply left to the Fair Scheduler.
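
To make the two "spare capacity" options above concrete, here is a simplified
Java sketch for a single resource dimension. All names are hypothetical, and
the weighted variant is a one-pass simplification, not how either scheduler
actually implements it.

```java
// Hypothetical sketch only; a single resource dimension, no real scheduler API.
final class SpareCapacitySketch {
    // Option 1: split the spare capacity over all queues by their
    // configured capacity fractions, whether or not a queue can use it.
    static double[] splitByCapacity(double spare, double[] capacityFraction) {
        double[] share = new double[capacityFraction.length];
        for (int i = 0; i < share.length; i++) {
            share[i] = spare * capacityFraction[i];
        }
        return share;
    }

    // Option 2: treat the spare capacity as one pool and use the fractions
    // as weights, counting only queues that still want more resources.
    static double[] splitByWeight(double spare, double[] weight, boolean[] wantsMore) {
        double totalWeight = 0;
        for (int i = 0; i < weight.length; i++) {
            if (wantsMore[i]) totalWeight += weight[i];
        }
        double[] share = new double[weight.length];
        if (totalWeight == 0) return share; // nobody wants more
        for (int i = 0; i < weight.length; i++) {
            if (wantsMore[i]) share[i] = spare * weight[i] / totalWeight;
        }
        return share;
    }
}
```

With two queues at 50% each and only the first one able to use more, option 1
gives each queue half of the spare capacity, while option 2 gives all of it to
the first queue.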
I'll incorporate your other points (e.g., comparator name, ASF license) in my
updated patch.
thanks!
Andrew
> Enhance CS to schedule accounting for both memory and cpu cores
> ---------------------------------------------------------------
>
> Key: MAPREDUCE-4327
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4327
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Components: mrv2, resourcemanager, scheduler
> Affects Versions: 2.0.0-alpha
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Attachments: MAPREDUCE-4327-v2.patch, MAPREDUCE-4327-v3.patch,
> MAPREDUCE-4327-v4.patch, MAPREDUCE-4327.patch
>
>
> With YARN being a general purpose system, it would be useful for several
> applications (MPI et al) to specify not just memory but also CPU (cores) for
> their resource requirements. Thus, it would be useful to the
> CapacityScheduler to account for both.