[ 
https://issues.apache.org/jira/browse/HADOOP-4980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Ratan updated HADOOP-4980:
--------------------------------

    Attachment: 4980.3.patch

Attaching a new patch (4980.3.patch).

bq. 1. In the toString of TaskSchedulingInfo, we can use float instead of int, 
like we do for running tasks per queue, so that users who are running 
fractional values will also be represented correctly. And just to avoid 
confusion, should we exclude users who have no running tasks ?

Good points. I now print out a float value, and I only print users who have at 
least one running task. 
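
For readers without the patch at hand, here is a minimal sketch of that
printing behavior. It is not the code in 4980.3.patch: the class name
TaskSchedulingInfoSketch, the map field, and the output layout are all
assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for TaskSchedulingInfo; names and layout are assumptions.
public class TaskSchedulingInfoSketch {
    // Running tasks per user, kept as float so fractional values are not truncated.
    private final Map<String, Float> numRunningTasksByUser = new LinkedHashMap<>();

    public void setRunningTasks(String user, float count) {
        numRunningTasksByUser.put(user, count);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("Active users:\n");
        for (Map.Entry<String, Float> e : numRunningTasksByUser.entrySet()) {
            // Skip users who have no running tasks, to avoid confusion.
            if (e.getValue() <= 0f) {
                continue;
            }
            sb.append(String.format(Locale.ROOT, "  %s: %.1f running tasks\n",
                    e.getKey(), e.getValue()));
        }
        return sb.toString();
    }
}
```

With entries for "alice" (2.5) and "bob" (0), only the line for alice would
be printed.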

bq. 2. The strings "Map Tasks" can be printed in TaskSchedulingInfo.toString, 
as that's the class that has knowledge about what is being printed.

The QueueSchedulingInfo (QSI) knows which of its two TaskSchedulingInfo (TSI)
objects is for maps and which is for reduces, so it can display the right
string. Having the TSI display the "Map" or "Reduce" string itself is harder.
It can be done, but I think it's fine for the QSI to print what it does.

bq. 3. There was code in assignTasks that did not return a task for queues with 
zero GC. This seems to have been inadvertently removed.

Good catch. I've added that code back in; it was removed by mistake.
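
As a rough illustration of the check being restored, assuming "GC" here
stands for a queue's guaranteed capacity, the loop below skips any queue
whose guaranteed capacity is zero. The Queue class, its fields, and the
assignTask signature are hypothetical and far simpler than the real
assignTasks code.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified model; not the actual CapacityTaskScheduler code.
public class ZeroCapacitySketch {
    static final class Queue {
        final String name;
        final int guaranteedCapacity;    // assumed meaning of "GC"
        final List<String> pendingTasks; // must be a mutable list

        Queue(String name, int guaranteedCapacity, List<String> pendingTasks) {
            this.name = name;
            this.guaranteedCapacity = guaranteedCapacity;
            this.pendingTasks = pendingTasks;
        }
    }

    // Returns the first pending task from a queue with non-zero capacity, if any.
    static Optional<String> assignTask(List<Queue> queues) {
        for (Queue q : queues) {
            // Queues with zero guaranteed capacity never get a task.
            if (q.guaranteedCapacity == 0) {
                continue;
            }
            if (!q.pendingTasks.isEmpty()) {
                return Optional.of(q.pendingTasks.remove(0));
            }
        }
        return Optional.empty();
    }
}
```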

bq. 4. Rather than passing in the JobQueuesManager object to the QSI, which is 
a rather loaded object, can we just pass the boolean supportsPriority, which is 
all the information that the QSI needs ? That way, it will be one less place to 
check if we make any change to JobQueuesManager. Note that we are already 
reading the value of the configuration in the start() method.

The QSI needs a JobQueuesManager object for other things: to print out 
job-related information, for example. And this list will likely grow in the 
future, so it should have access to the JobQueuesManager object. 




> Cleanup the Capacity Scheduler code
> -----------------------------------
>
>                 Key: HADOOP-4980
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4980
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>            Reporter: Vivek Ratan
>         Attachments: 4980.1.patch, 4980.2.patch, 4980.3.patch
>
>
> Given the number of changes that have been made by different folks to the 
> Capacity Scheduler code, the code needs to be cleaned up. Some comments and 
> variable names are misleading, and the core logic is not in a central place, 
> making it harder to understand. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
