[ 
https://issues.apache.org/jira/browse/OOZIE-103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13099801#comment-13099801
 ] 

Hadoop QA commented on OOZIE-103:
---------------------------------

mislam77 remarked:
Objective :
---------------
1. The long-term objective: when Hadoop is slow, Oozie should be able to
throttle the JT/NN load, e.g. by submitting fewer jobs.

2. In the short term, we want to instrument Oozie so that it can report the
response time of the JT/NN at any time. How the value will be used or presented
is outside the scope of this short-term goal.

3. The design that achieves the short-term objective is expected to be
extensible and reusable for the long-term objective.

Solution:
------------
The following ideas were discussed internally at Y!.
Approach 1:
Use a separate monitoring thread that periodically pings the Hadoop server
with a representative command. For example, for the namenode, the thread would
invoke a command like "ls /tmp".

Pros & Cons :
* This thread adds extra overhead to Hadoop as well as to Oozie.
* Finding a command that represents the actual health of Hadoop might not be
trivial.
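
A minimal sketch of Approach 1, for discussion only. It assumes the
representative command is wrapped in a plain Runnable (a stand-in for something
like a directory listing on the namenode). The class name ProbeMonitor, its
methods, and the sentinel values are hypothetical and are not part of Oozie or
Hadoop.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: a monitoring thread that times a periodic probe. */
final class ProbeMonitor {
    static final long UNKNOWN = -1; // no probe has completed yet
    static final long FAILED = -2;  // the most recent probe threw an exception

    // Assumption: the probe stands in for a representative command, e.g. "ls /tmp".
    private final Runnable probe;
    private final AtomicLong lastMillis = new AtomicLong(UNKNOWN);
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "probe-monitor");
                t.setDaemon(true); // do not keep the JVM alive for monitoring
                return t;
            });

    ProbeMonitor(Runnable probe) {
        this.probe = probe;
    }

    /** Runs the probe once and records its turn-around time in milliseconds. */
    void probeOnce() {
        long start = System.nanoTime();
        try {
            probe.run();
            lastMillis.set(TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start));
        } catch (RuntimeException e) {
            // Caught here so that the scheduler keeps running later probes.
            lastMillis.set(FAILED);
        }
    }

    /** Starts probing; each probe begins periodSeconds after the previous one ends. */
    void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(this::probeOnce, 0, periodSeconds, TimeUnit.SECONDS);
    }

    void stop() {
        scheduler.shutdownNow();
    }

    /** Latest response time in ms, or UNKNOWN / FAILED. */
    long lastResponseMillis() {
        return lastMillis.get();
    }
}
```

The fixed delay between probes bounds the extra load on Hadoop noted in the
first bullet, at the cost of a staler reading.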

Approach 2:
When Oozie calls the NN or JT, it could measure the turn-around time of that
call. The benefit is that no extra command is sent.

Pros  & Cons :
* There are different types of commands, and their normal response times also
vary. Oozie could therefore restrict the instrumentation to a subset of
commonly used commands, with a separate instrumented value for each command
type.
 
* When Oozie is idle, it collects no data for that period.
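
A minimal sketch of Approach 2, for discussion only. It assumes each NN/JT
call can be wrapped in a Callable and tagged with a command-type string. The
class name CallTimer and the command-type labels in the usage comment are
hypothetical. The sketch keeps its own counters and does not use any existing
Oozie or Hadoop API.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch: records turn-around time per command type. */
final class CallTimer {
    private static final class Stat {
        final LongAdder count = new LongAdder();
        final LongAdder totalNanos = new LongAdder();
    }

    // One entry per command type, since normal response times differ by type.
    private final Map<String, Stat> stats = new ConcurrentHashMap<>();

    /** Runs the call and records its duration, whether it succeeds or throws. */
    <T> T time(String commandType, Callable<T> call) throws Exception {
        long start = System.nanoTime();
        try {
            return call.call();
        } finally {
            Stat s = stats.computeIfAbsent(commandType, k -> new Stat());
            s.count.increment();
            s.totalNanos.add(System.nanoTime() - start);
        }
    }

    /** Number of recorded calls for this command type. */
    long count(String commandType) {
        Stat s = stats.get(commandType);
        return s == null ? 0 : s.count.sum();
    }

    /** Mean turn-around time in ms; NaN if no call of this type was recorded. */
    double averageMillis(String commandType) {
        Stat s = stats.get(commandType);
        long n = s == null ? 0 : s.count.sum();
        return n == 0 ? Double.NaN : s.totalNanos.sum() / 1e6 / n;
    }
}

// Usage (labels are illustrative):
//   FileStatus[] listing = timer.time("nn.list", () -> fs.listStatus(path));
```

Returning NaN when nothing was recorded makes the idle case from the second
bullet visible to the caller instead of reporting a misleading zero.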

Comments please.

> GH-68: Better reporting/handling of problems in Hadoop
> ------------------------------------------------------
>
>                 Key: OOZIE-103
>                 URL: https://issues.apache.org/jira/browse/OOZIE-103
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Hadoop QA
>
> Add instrumentation to track performance stats of NN and JT (how long to get 
> directory listing on hdfs; how long to submit a job or query JT queue)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
