[ 
https://issues.apache.org/jira/browse/HADOOP-4413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12666061#action_12666061
 ] 

Vivek Ratan commented on HADOOP-4413:
-------------------------------------

@Mac:
bq. I think either way we will want to be able to correlate the job life cycle 
events with the scheduler events.
Absolutely. That's why I kept the Job IDs out of the methods of 
CapacitySchedulerInstrumentation. If we can't synchronize the scheduler's 
events with the jobs' events, we can look at modifying these methods. We're 
logging, or collecting, a lot of information. The key is to see how to parse 
this information to present a unified life cycle view - for a job, for a queue, 
etc. 
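
As a rough illustration of the unified view mentioned above, here is a minimal sketch that merges job events and scheduler events into one timeline ordered by timestamp. LifeCycleEvent, TimelineSketch and all field names are hypothetical and are not part of the patch; the sketch assumes each event carries a timestamp, since the scheduler methods take no Job ID.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical event record; the names are assumptions made for this sketch.
class LifeCycleEvent {
    final long timestampMillis;
    final String source;       // e.g. "job" or "scheduler"
    final String description;

    LifeCycleEvent(long timestampMillis, String source, String description) {
        this.timestampMillis = timestampMillis;
        this.source = source;
        this.description = description;
    }
}

class TimelineSketch {
    // Combine both event streams and order them by timestamp.
    // List.sort is stable, so events with equal timestamps keep their
    // relative input order (job events before scheduler events).
    static List<LifeCycleEvent> merge(List<LifeCycleEvent> jobEvents,
                                      List<LifeCycleEvent> schedulerEvents) {
        List<LifeCycleEvent> all = new ArrayList<>(jobEvents);
        all.addAll(schedulerEvents);
        all.sort(Comparator.comparingLong((LifeCycleEvent e) -> e.timestampMillis));
        return all;
    }
}
```

Correlating purely on time is only one option. If it proves too coarse, that would be a reason to revisit the method signatures.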

@hemanth:
bq. The other two classes are using it, and so they need it. We could add it 
when required, no ?
ChukwaTTInstru doesn't use the TaskTracker member variable, though 
TaskTrackerMetricsInst does. The Scheduler member variable seems useful (for 
future classes) and logical to be in CapacitySchedulerInstrumentation. Plus, we 
don't want too many changes to CapacitySchedulerInstrumentation - it acts like 
an interface. 

bq. I think some of the information is not captured by the jobtracker 
instrumentation at a job level - memory based blocking for instance, also our 
initialization logic is different.
We capture memory based blocking through 
CapacitySchedulerInstrumentation.blockOnHighMemJob. Does that need a job 
parameter? Maybe not. Maybe we only care about how many times we 
blocked. If we also want to know on which job we blocked, we can add a job 
parameter. 
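
To make the two options concrete, here is a minimal sketch of both variants side by side. BlockingInstrumentationSketch, its counters and its getters are hypothetical names for illustration only; just the method name blockOnHighMemJob comes from the discussion, and the String job ID parameter is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; this is not the class from the attached patch.
class BlockingInstrumentationSketch {
    private long highMemBlocks = 0;
    private final Map<String, Long> blocksByJob = new HashMap<>();

    // Variant without a job parameter: only records that we blocked.
    synchronized void blockOnHighMemJob() {
        highMemBlocks++;
    }

    // Variant with a job parameter (an assumed String job ID):
    // also records which job we blocked on.
    synchronized void blockOnHighMemJob(String jobId) {
        highMemBlocks++;
        blocksByJob.merge(jobId, 1L, Long::sum);
    }

    synchronized long getHighMemBlocks() {
        return highMemBlocks;
    }

    synchronized long getBlocksFor(String jobId) {
        return blocksByJob.getOrDefault(jobId, 0L);
    }
}
```

The first variant answers "how many times did we block"; the second adds "on which job" at the cost of a wider method signature.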
Do we want to capture events in job initialization? I'm not sure. On one hand, 
job initialization is an internal thing - it's not an external facing event. I 
see CapacitySchedulerInstrumentation as capturing the external events of the 
scheduler, events that are familiar to a user or to Ops. If a job's running, I 
know it's initialized. If I want to detect how well my initialization routine 
is running, I'd use log files for that. However, if we feel the need to capture 
and track job initialization events, we can add them. I just didn't see a need. 
But if you do, it would be great if you can suggest what methods to add to 
capture initialization of jobs. 

bq. Essentially, if we could work a little bit on what kind of information we 
want captured, it might help us better
I think we have, at least to get started. There's a listing of what we want to 
capture at the beginning of this Jira. I think we're covering all of that. Do 
you feel we're missing something? Again, I sense that what we want to 
capture will become clearer once we run this thing and start analyzing life 
cycle events. I've tried to capture whatever I thought would be important. But 
feel free to suggest other events. 



> Capacity Scheduler to provide a scheduler history log to record actions taken 
> and why
> -------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4413
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4413
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: contrib/capacity-sched
>            Reporter: Mac Yang
>         Attachments: 4413.1.patch
>
>
> It would be very useful if the capacity scheduler can provide a log that 
> records the decisions made and actions taken by the scheduler.
