[
https://issues.apache.org/jira/browse/HADOOP-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665684#action_12665684
]
Hemanth Yamijala commented on HADOOP-4413:
------------------------------------------
Vivek, I reviewed the patch (from the patch file) and have a few comments:
- Can you explain why you need the scheduler object in the instrumentation
classes? It seems the dependency should be the other way round, and I also
couldn't see where the object is being used.
- For many of the APIs defined, it seems to make sense to include some more
information like which job and which task are affected. This will allow us to
consolidate information by task or job and give better information.
- I did not see events related to the job lifecycle, such as when a job was
submitted, initialized, scheduled, completed etc. I think these are required.
- At the same time, I don't see how the lookingFor*Task events are useful. Can
you explain a use case?
- For creating an instance of the CapacitySchedulerInstrumentation, you can use
the ReflectionUtils API. I am not entirely sure, but this may mean you need a
default constructor with a setter for the Scheduler config object and, if
necessary, for the scheduler object as well.
- The method toFullPropertyName has been made private, but it is used by
TestQueueCapacities. Either the code must be duplicated in the test, or the
method should be left package private.
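
To illustrate the default-constructor-plus-setter idea from the ReflectionUtils
comment, here is a minimal sketch. It uses plain java.lang.reflect so that it
stands alone; it does not call Hadoop's ReflectionUtils. All class names
below (SchedulerConf, SchedulerInstrumentation, LoggingInstrumentation,
InstrumentationFactory) are hypothetical and are not taken from the patch.

```java
import java.lang.reflect.Constructor;

// Hypothetical stand-in for the scheduler configuration object.
class SchedulerConf {
    final String queueName;
    SchedulerConf(String queueName) { this.queueName = queueName; }
}

// Hypothetical instrumentation base class: a default constructor plus a
// setter, so an instance can be created from a configured class name.
class SchedulerInstrumentation {
    protected SchedulerConf conf;
    public SchedulerInstrumentation() {}
    public void setConf(SchedulerConf conf) { this.conf = conf; }
    public String describe() { return "no-op instrumentation for " + conf.queueName; }
}

// Hypothetical subclass that would write the scheduler event log.
class LoggingInstrumentation extends SchedulerInstrumentation {
    public LoggingInstrumentation() {}
    @Override
    public String describe() { return "logging instrumentation for " + conf.queueName; }
}

class InstrumentationFactory {
    // Instantiate through the no-arg constructor, then inject the config.
    static SchedulerInstrumentation create(String className, SchedulerConf conf)
            throws ReflectiveOperationException {
        Class<? extends SchedulerInstrumentation> clazz =
            Class.forName(className).asSubclass(SchedulerInstrumentation.class);
        Constructor<? extends SchedulerInstrumentation> ctor =
            clazz.getDeclaredConstructor();
        ctor.setAccessible(true);
        SchedulerInstrumentation instr = ctor.newInstance();
        instr.setConf(conf);
        return instr;
    }
}

class InstrumentationDemo {
    public static void main(String[] args) throws Exception {
        SchedulerInstrumentation instr = InstrumentationFactory.create(
            LoggingInstrumentation.class.getName(), new SchedulerConf("default"));
        System.out.println(instr.describe());
    }
}
```

With this shape the instrumentation does not need the scheduler passed to its
constructor; anything it needs can be injected through setters after creation.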
Apart from these, I also had a discussion with Mac and the Chukwa team, and we
thought of two things that would really help integration with Chukwa:
- As Mac indicated above, it would be good if the event log had a more
structured format than it has now.
- For the time series data, which is being captured via setQueueStats, it was
felt that this could be done very easily via a MetricsContext. The advantage
is automatic integration with Chukwa, which already does this for all other
parts of Hadoop (DFS, Mapred etc). Please look at
o.a.h.ipc.metrics.RpcMetrics for a simple example of how to use it.
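
As a rough illustration of the metrics suggestion, the sketch below shows the
general register-an-updater pattern: the scheduler only sets gauge values, and
a context decides when they are sampled and published. The types here
(SimpleMetricsContext, QueueStatsUpdater, QueueMetrics) and the setQueueStats
signature are hypothetical stand-ins written for this example; they are not
the Hadoop metrics classes and not the method from the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback: the context calls this when it wants a sample.
interface QueueStatsUpdater {
    void doUpdates(SimpleMetricsContext context);
}

// Hypothetical stand-in for a metrics context. It polls registered updaters
// and collects the records they emit.
class SimpleMetricsContext {
    private final List<QueueStatsUpdater> updaters = new ArrayList<>();
    private final List<Map<String, Object>> emitted = new ArrayList<>();

    void registerUpdater(QueueStatsUpdater updater) { updaters.add(updater); }

    // Called by updaters to publish one record (one time-series sample).
    void emit(Map<String, Object> record) { emitted.add(record); }

    // A real context would call this from a timer at a fixed period.
    void tick() {
        for (QueueStatsUpdater updater : updaters) {
            updater.doUpdates(this);
        }
    }

    List<Map<String, Object>> emittedRecords() { return emitted; }
}

// Hypothetical per-queue metrics holder.
class QueueMetrics implements QueueStatsUpdater {
    private final String queue;
    private int runningMaps;
    private int runningReduces;

    QueueMetrics(String queue, SimpleMetricsContext context) {
        this.queue = queue;
        context.registerUpdater(this);
    }

    // The scheduler records the latest values; nothing is written here.
    synchronized void setQueueStats(int runningMaps, int runningReduces) {
        this.runningMaps = runningMaps;
        this.runningReduces = runningReduces;
    }

    // Publish the current values as one sample when the context asks.
    @Override
    public synchronized void doUpdates(SimpleMetricsContext context) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("queue", queue);
        record.put("runningMaps", runningMaps);
        record.put("runningReduces", runningReduces);
        context.emit(record);
    }
}

class QueueMetricsDemo {
    public static void main(String[] args) {
        SimpleMetricsContext context = new SimpleMetricsContext();
        QueueMetrics metrics = new QueueMetrics("default", context);
        metrics.setQueueStats(4, 1);
        context.tick();
        System.out.println(context.emittedRecords());
    }
}
```

The point of the split is that the scheduler code never formats or writes the
time series itself, so whatever consumes the context gets the data without
scheduler-specific parsing.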
> Capacity Scheduler to provide a scheduler history log to record actions taken
> and why
> -------------------------------------------------------------------------------------
>
> Key: HADOOP-4413
> URL: https://issues.apache.org/jira/browse/HADOOP-4413
> Project: Hadoop Core
> Issue Type: Improvement
> Components: contrib/capacity-sched
> Reporter: Mac Yang
> Attachments: 4413.1.patch
>
>
> It would be very useful if the capacity scheduler could provide a log that
> records the decisions made and actions taken by the scheduler.