[ https://issues.apache.org/jira/browse/HADOOP-5884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715131#action_12715131 ]
Hemanth Yamijala commented on HADOOP-5884:
------------------------------------------

Some comments:

- TaskSchedulingInfo.toString() - displaying the actual value had problems with exactness and mismatches between the cluster info and the state we kept; that's why we shifted to percentages. It may be a good idea to retain that model. The same argument can be made for running tasks and numSlotsOccupiedByThisUser.
- "Occupied slots" seems too techie. Call it 'Used capacity'? Likewise, instead of '% of total slots occupied by all users', call it '% of used capacity'?
- TaskSchedulingMgr.isUserOverLimit() - we add 1 if we're using more than the queue capacity. It could be more than 1, depending on the task we are assigning (if it's part of a high-RAM job).
- MapSchedulingMgr constructor: typo: 'schedulr' should be 'scheduler'. Similarly for Reduce...
- Minor nit: use String.format() instead of the complicated StringBuffer.append() style of code, which makes it really hard to see what's happening.
- updateQSIObjects: the log statement prints numMapSlotsForThisJob instead of numMapsRunningForThisJob.

> Capacity scheduler should account high memory jobs as using more capacity of the queue
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5884
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>         Attachments: HADOOP-5884-20090529.1.txt
>
> Currently, when a high memory job is scheduled by the capacity scheduler, each task scheduled counts only once in the capacity of the queue, though it may actually be preventing other jobs from using spare slots on that node because of its higher memory requirements. To be fair, the capacity scheduler should proportionally (with respect to default memory) account high memory jobs as using a larger capacity of the queue.
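The proportional accounting the issue asks for can be sketched as follows. This is a hypothetical illustration, not the actual HADOOP-5884 patch: the class name, the default-slot memory constant, and the helper method are all made up for the example. The point matches the isUserOverLimit() comment above: a high-RAM task should be charged ceil(taskMemory / defaultSlotMemory) slots against queue capacity, not a flat 1.

```java
// Hypothetical sketch of proportional slot accounting for high-memory tasks.
// Assumes a default slot provides a fixed amount of memory; a task needing
// more memory blocks that many slots on the node, so it should be charged
// that many slots of queue capacity.
public class SlotAccounting {

    // Memory provided by one default slot, in MB (illustrative value).
    static final long DEFAULT_SLOT_MEMORY_MB = 1024;

    // Number of slots a task should be charged against queue capacity:
    // ceiling division of the task's memory requirement by the slot size.
    static long slotsPerTask(long taskMemoryMb) {
        return (taskMemoryMb + DEFAULT_SLOT_MEMORY_MB - 1) / DEFAULT_SLOT_MEMORY_MB;
    }

    public static void main(String[] args) {
        System.out.println(slotsPerTask(1024)); // normal task: charged 1 slot
        System.out.println(slotsPerTask(3000)); // high-RAM task: charged 3 slots
    }
}
```

Under this accounting, the "add 1" in an isUserOverLimit()-style check would become "add slotsPerTask(...)" for the task being assigned.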