[ 
https://issues.apache.org/jira/browse/HADOOP-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665684#action_12665684
 ] 

Hemanth Yamijala commented on HADOOP-4413:
------------------------------------------

Vivek, I reviewed the patch (from the patch file) and have a few comments:

- Can you explain why you need the scheduler object in the instrumentation 
classes? It seems like the dependency should be the other way round, and I 
also couldn't see where it is being used.
- For many of the APIs defined, it makes sense to include more information, 
such as which job and which task are affected. This will allow us to 
consolidate events by task or job and report in more detail.
- I did not see events related to the job lifecycle, such as when a job was 
submitted, initialized, scheduled, completed, etc. I think these are required, no?
- At the same time, I don't see how the lookingFor*Task events are useful. Can 
you explain a use case?
- To create an instance of CapacitySchedulerInstrumentation, you can use the 
ReflectionUtils API. I'm not entirely sure, but this may mean you need a 
default constructor with a setter for the Scheduler config object and, if 
necessary, for the scheduler object as well.
- The method toFullPropertyName has been made private, but it is used by 
TestQueueCapacities. Either the code must be duplicated in the test method, or 
it should be left package private.
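
To illustrate the default-constructor-plus-setter pattern from the 
ReflectionUtils point above, here is a minimal sketch using plain 
java.lang.reflect. All class names in it (SchedulerConf, Instrumentation, 
LoggingInstrumentation, InstrumentationFactory) are hypothetical stand-ins, 
not the classes in the patch, and the factory only approximates what Hadoop's 
ReflectionUtils.newInstance does for configurable classes.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for the scheduler configuration object.
class SchedulerConf {
    final String queueName;
    SchedulerConf(String queueName) { this.queueName = queueName; }
}

// Hypothetical instrumentation base class: a no-arg constructor plus a
// setter, so that it can be created reflectively and configured afterwards.
class Instrumentation {
    private SchedulerConf conf;
    public Instrumentation() {}
    public void setConf(SchedulerConf conf) { this.conf = conf; }
    public SchedulerConf getConf() { return conf; }
}

// Hypothetical concrete implementation that would be named in configuration.
class LoggingInstrumentation extends Instrumentation {
    public LoggingInstrumentation() {}
}

class InstrumentationFactory {
    // Instantiate via the no-arg constructor, then inject the config object.
    static <T extends Instrumentation> T create(Class<T> cls, SchedulerConf conf) {
        try {
            Constructor<T> ctor = cls.getDeclaredConstructor();
            ctor.setAccessible(true);
            T instance = ctor.newInstance();
            instance.setConf(conf);
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + cls.getName(), e);
        }
    }
}
```

With this shape the instrumentation never needs a constructor argument, and 
the scheduler object could be injected through a second setter only if it 
turns out to be required.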

Apart from these, I also had a discussion with Mac and the Chukwa team, and we 
thought of two things that would really help integration with Chukwa:
- As Mac indicated above, it would be good if the event log were more 
structured than it is now.
- For the time series data, which is currently captured via setQueueStats, it 
was felt that this could very easily be done via a MetricsContext. The 
advantage is automatic integration with Chukwa, which already does this for 
the other parts of Hadoop (DFS, Mapred, etc.). Please look at 
o.a.h.ipc.metrics.RpcMetrics for a simple example of how to use it.
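
As one possible reading of "more structured", combined with the earlier point 
about recording the affected job and task, an event line could be written as 
ordered key=value pairs. This is only a sketch: the SchedulerEvent class, the 
field names (ts, event, queue, job, task) and the event name used in the test 
are assumptions for illustration, not a format agreed with the Chukwa team. 
It also assumes that values contain neither spaces nor '='.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that writes and reads one scheduler event per line.
class SchedulerEvent {

    // Build a single "key=value key=value ..." line. A null taskId is
    // omitted, so job-level events can use the same format.
    static String format(long timestampMs, String event, String queue,
                         String jobId, String taskId) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("ts", Long.toString(timestampMs));
        fields.put("event", event);
        fields.put("queue", queue);
        fields.put("job", jobId);
        if (taskId != null) {
            fields.put("task", taskId);
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    // Parse a line produced by format() back into its fields, in order.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String token : line.split(" ")) {
            int idx = token.indexOf('=');
            if (idx > 0) {
                fields.put(token.substring(0, idx), token.substring(idx + 1));
            }
        }
        return fields;
    }
}
```

Because every line carries the job (and, where relevant, the task), a log 
consumer can group events per job or per task without guessing from context.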

> Capacity Scheduler to provide a scheduler history log to record actions taken 
> and why
> -------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4413
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4413
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>            Reporter: Mac Yang
>         Attachments: 4413.1.patch
>
>
> It would be very useful if the capacity scheduler could provide a log that 
> records the decisions made and actions taken by the scheduler.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
