[jira] Commented: (MAPREDUCE-1118) Capacity Scheduler scheduling information is hard to read / should be tabular format

Hemanth Yamijala (JIRA) Fri, 13 Aug 2010 03:21:46 -0700

    [ https://issues.apache.org/jira/browse/MAPREDUCE-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12898174#action_12898174 ]


Hemanth Yamijala commented on MAPREDUCE-1118:
---------------------------------------------

I took this patch for a spin. Without going into too much detail on the code, I 
see some high-level points to discuss:

- The /scheduler page that is being added does not have all the fields for the 
queues; it only has fields related to capacity parameters. The queueinfo page, 
on the other hand, has more fields, such as tasks, limits, and job counts. I 
would imagine we'll need more information in the servlet, no? Allen?

- The patch doesn't play well with the hierarchical queues introduced in Hadoop 
0.21 (MAPREDUCE-853). When I configured a simple hierarchy with two parent-level 
queues, p1 and p2, each having one child queue (q1 and q2 respectively), I got 
the following exception when I accessed /scheduler:

{code}
java.lang.NullPointerException
        at org.apache.hadoop.mapred.CapacitySchedulerServlet.showQueues(CapacitySchedulerServlet.java:127)
        at org.apache.hadoop.mapred.CapacitySchedulerServlet.doGet(CapacitySchedulerServlet.java:90)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
{code}

More importantly, I am not even sure *what* should be displayed for 
hierarchical queues in a way that meets Allen's original requirement, i.e. to 
provide an easy-to-read interface for comparing queue information. The 
difficulty is that it does not make sense to compare queues across different 
hierarchies (unless the information is normalized at some global level). So 
what probably makes sense is to have this kind of tabular structure for the 
queues at every level. This issue does not arise for Hadoop 0.20, which has no 
hierarchical queues.

- From an interface point of view, I think we can do better in how scheduling 
information is accessed from the main page. This information has been available 
via the 'queues' link, and this patch adds another entry point - the /scheduler 
page. Perhaps discussion around the hierarchical queue interface will give us 
ideas around this as well.
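
To make the "one table per level" idea above a bit more concrete, here is a 
rough sketch of how sibling queues could be grouped into separate tables, with 
a guard for queues that have no children. QueueNode and QueueTableSketch are 
hypothetical names used only for illustration; they are not classes from the 
patch or from the capacity scheduler, the "/" path separator is arbitrary, and 
the capacity figures are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a queue in the hierarchy (not a class from the patch).
final class QueueNode {
    final String name;
    final int capacityPercent; // share of the parent's capacity, illustrative only
    final List<QueueNode> children = new ArrayList<QueueNode>();

    QueueNode(String name, int capacityPercent) {
        this.name = name;
        this.capacityPercent = capacityPercent;
    }

    QueueNode addChild(QueueNode child) {
        children.add(child);
        return this;
    }
}

final class QueueTableSketch {
    /**
     * Renders one plain-text table per group of sibling queues, then recurses
     * into each queue's children. Only siblings share a table, so queues from
     * different hierarchies are never compared directly.
     */
    static String render(String parentPath, List<QueueNode> siblings) {
        // Guard against leaf queues or a missing child list instead of dereferencing null.
        if (siblings == null || siblings.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        String label = parentPath.isEmpty() ? "(top level)" : parentPath;
        out.append("Queues under ").append(label).append('\n');
        out.append(String.format("%-10s %9s\n", "Queue", "Capacity"));
        for (QueueNode q : siblings) {
            out.append(String.format("%-10s %8d%%\n", q.name, q.capacityPercent));
        }
        out.append('\n');
        for (QueueNode q : siblings) {
            String childPath = parentPath.isEmpty() ? q.name : parentPath + "/" + q.name;
            out.append(render(childPath, q.children));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The simple hierarchy from the comment: p1 -> q1, p2 -> q2.
        List<QueueNode> top = new ArrayList<QueueNode>();
        top.add(new QueueNode("p1", 60).addChild(new QueueNode("q1", 100)));
        top.add(new QueueNode("p2", 40).addChild(new QueueNode("q2", 100)));
        System.out.print(render("", top));
    }
}
```

With the p1/p2 hierarchy this produces three small tables: one for the top 
level, one for the children of p1, and one for the children of p2. Whether the 
real servlet should lay things out this way is exactly the open question.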

> Capacity Scheduler scheduling information is hard to read / should be tabular 
> format
> ------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1118
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1118
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>            Reporter: Allen Wittenauer
>            Assignee: Krishna Ramachandran
>         Attachments: mapred-1118-1.patch, mapred-1118-2.patch, 
> mapred-1118-3.patch, mapred-1118.20S.patch, mapred-1118.patch
>
>
> The scheduling information provided by the capacity scheduler is extremely 
> hard to read on the job tracker web page.  Instead of just flat text, it 
> should be presenting the information in a tabular format, similar to what the 
> fair share scheduler provides.  This makes it much easier to compare what 
> different queues are doing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
