[ 
https://issues.apache.org/jira/browse/OOZIE-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13166408#comment-13166408
 ] 

[email protected] commented on OOZIE-588:
-----------------------------------------------------



bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > 
trunk/core/src/main/java/org/apache/oozie/action/hadoop/LauncherMapper.java, 
line 470
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62926#file62926line470>
bq.  >
bq.  >     use a defined constant for "oozie.action.stats.properties"
bq.  >     
bq.  >     it seems it has been used for some properties 
[CONF_OOZIE_EXTERNAL_STATS_MAX_SIZE] but not for others.

ok


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > 
trunk/core/src/main/java/org/apache/oozie/action/hadoop/LauncherMapper.java, 
line 471
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62926#file62926line471>
bq.  >
bq.  >     what happen if it doesn't exist

If it doesnt exist, then the action has no associated stats stored with it. 
Will add a comment


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > 
trunk/core/src/main/java/org/apache/oozie/action/hadoop/LauncherMapper.java, 
line 486
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62926#file62926line486>
bq.  >
bq.  >     remove extra tabs

removed in 2nd diff


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > 
trunk/core/src/main/java/org/apache/oozie/action/hadoop/LauncherMapper.java, 
line 562
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62926#file62926line562>
bq.  >
bq.  >     use constant .. specially this has been used more than once in this 
file

ok


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > 
trunk/core/src/main/java/org/apache/oozie/action/hadoop/OoziePigStats.java, 
line 45
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62927#file62927line45>
bq.  >
bq.  >     do not suppress "unchecked" unless you really need it

I need it because JSONObject doesn't support generics


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > 
trunk/core/src/main/java/org/apache/oozie/action/hadoop/OoziePigStats.java, 
line 79
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62927#file62927line79>
bq.  >
bq.  >     same as before putting the separator in the loop doesn't seem right 
specially the first element will not be separated

done in 2nd diff


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > 
trunk/core/src/main/java/org/apache/oozie/action/hadoop/OoziePigStats.java, 
line 88
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62927#file62927line88>
bq.  >
bq.  >     i do not see any type casting here, why you suppress "unchecked"

it is there


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigMain.java, 
line 47
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62929#file62929line47>
bq.  >
bq.  >     Another practice that is used in different pieces of code is to 
define the prefix [in this case it will be "oozie.action" as a constant and 
reuse it in the three constant. In this case if the prefex has been changed, 
there is not need to change all of them.

Good point..will do


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigMain.java, 
line 222
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62929#file62929line222>
bq.  >
bq.  >     not clear to me if the comment reflect the reality. did you check 
for the pig version here or for the embedded python

will put a better comment


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigMain.java, 
line 263
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62929#file62929line263>
bq.  >
bq.  >     extra line

ok


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigMain.java, 
line 341
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62929#file62929line341>
bq.  >
bq.  >     it looks odd to have try, finally block without catching any 
exception. Do you need to handle exception here
bq.  >     
bq.  >     Especially if you were not able to write to a file.. at least you 
have to log the error here

As it is an IOException, I am throwing it and will make the laucher handle it.


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigMain.java, 
line 364
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62929#file62929line364>
bq.  >
bq.  >     make the 100 constant or configurable .. hard coding is not always a 
good programming practice

ok


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > trunk/core/src/main/resources/oozie-default.xml, line 1107
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62930#file62930line1107>
bq.  >
bq.  >     you may add what -1 means here  [unlimited]

ok


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigMain.java, 
line 294
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62929#file62929line294>
bq.  >
bq.  >     check for the existence of the file and for exception
bq.  >

The file shouldn't not be there in the first place as the temp files from the 
compute node are deleted once the action ends. 
If the file, due to some reason, is still there, the Java's FileWriter will 
overwrite it


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > 
trunk/core/src/main/java/org/apache/oozie/action/hadoop/LauncherMapper.java, 
line 533
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62926#file62926line533>
bq.  >
bq.  >     The map method seems very long which make it hard to read and 
increase its error prune, specially with a catch all at the end. it may be 
better to add you new code in a sperate methods instead of increasing the size 
of the method

good observation, will put in separate methods


bq.  On 2011-12-08 19:33:53, Mohamed Battisha wrote:
bq.  > 
trunk/core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java,
 line 893
bq.  > <https://reviews.apache.org/r/3059/diff/1/?file=62925#file62925line893>
bq.  >
bq.  >     I believe you need to handle the HadoopAccessorException here

This exception is converted to ActionExecutorException after preserving the the 
exception cause and message. So, I dont feel the need to handle it separately.


- Virag


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3059/#review3708
-----------------------------------------------------------


On 2011-12-08 08:54:20, Virag Kothari wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3059/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-12-08 08:54:20)
bq.  
bq.  
bq.  Review request for oozie, Mohammad Islam and Angelo K. Huang.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  This patch and https://reviews.apache.org/r/3047/diff addresses Oozie-588
bq.  
bq.  
bq.  Docs for how to access pig stats through el functions needs to be added 
bq.  
bq.  
bq.  This addresses bug Oozie-588.
bq.      https://issues.apache.org/jira/browse/Oozie-588
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/core/src/test/java/org/apache/oozie/action/hadoop/TestPigMain.java 
1211573 
bq.    trunk/core/src/test/java/org/apache/oozie/action/hadoop/PigTestCase.java 
1211573 
bq.    
trunk/core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java
 1211573 
bq.    
trunk/core/src/test/java/org/apache/oozie/action/hadoop/TestPigActionExecutor.java
 1211573 
bq.    trunk/core/src/main/resources/oozie-default.xml 1211573 
bq.    trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigMain.java 
1211573 
bq.    
trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigActionExecutor.java 
1211573 
bq.    
trunk/core/src/main/java/org/apache/oozie/action/hadoop/LauncherMapper.java 
1211573 
bq.    
trunk/core/src/main/java/org/apache/oozie/action/hadoop/OoziePigStats.java 
PRE-CREATION 
bq.    
trunk/core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java 
1211573 
bq.  
bq.  Diff: https://reviews.apache.org/r/3059/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Test cases for PigMain and PigAE added
bq.  Tested pig action after running examples
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Virag
bq.  
bq.


                
> Oozie to allow drill down to hadoop job's details  
> ---------------------------------------------------
>
>                 Key: OOZIE-588
>                 URL: https://issues.apache.org/jira/browse/OOZIE-588
>             Project: Oozie
>          Issue Type: New Feature
>            Reporter: Mohammad Kamrul Islam
>
> High-level Requirements:
> -----------------------------------
> Since Oozie is designed as the gateway to grid, we need to support WS API for 
> most common hadoop commands through Oozie.  User doesn't want to go to 
> multiple system to get the required data. Based on these, we propose to 
> implement the following requirements into Oozie.
>  
> R1: Oozie will provide WS endpoints to get hadoop job details (including job 
> counters).
> R2: It will support both types of hadoop jobs : MR job created for MR action, 
>  MR jobs created as part of pig script.
> R3:  In addition, for pig action, oozie will provide a way to query the pig 
> stats.
> Proposed design:
> ----------------------
>      D1: Oozie will store the *summary* jobcounter /pigstats into oozie DB. 
> The items in the summary stats will be determined by oozie to limit the size. 
> However,the commonly used stats will be include into the summary. It is 
> important to note that summary information will be collected *after* the job 
> finished.
>      
>     D2: If the user asks for *details* hadoop job stats ,  the user needs to 
> query using different WS API. In this query, a user will specify *a* hadoop 
> job id.  Oozie will directly query the hadoop JT/RM/HS. Since it is an 
> external call with undetermined response time, Oozie will provide only one 
> hadoop job id per-request to avoid the timeout in WS call. Caveats: If hadoop 
> is down or the job is not in JT/RM/History Server, Oozie will fail to collect 
> the details. 
>     
>      D3: For pig, Oozie will store the pig-generated hadoop ids in it DB and 
> will expose that to user throw the "verbose" query.
>      D4: Oozie will need to collect those summary pig stats and corresponding 
> job counters and store it in Oozie DB. PigStats has a way of getting job 
> counter for each hadoop job that it submits. We could use that API to collect 
> summary counters for pig-created jobs.
>      D5: The complete/detail pigstats will be stored into Pig Launcher Mapper 
> as job counter. So that if a user wants to get the detail pig stats, we could 
> get it from the LM directly.
>      
> Open questions:
> ----------------------
> * What should be in the summary counters/stats? 
> * What is the max size of stats?
> Advanced planning: <Not in the scope of this task, but might required for 
> design to support later>
> --------------------------
> * Some users are asking to query the job stats when the job is RUNNING. They 
> need it to decide for subsequent job submissions.
> * By the above design , user could use D2, to get the counter when MR action 
> is running.
> * However, for pig, it is not that straight forward. Because Pig submits the 
> jobs during execution. But the new PigRunner provide a listener concept where 
> user can get the notifications such as when a new MR job submitted and its ID.
> * By using this, Oozie could get the running hadoop job id instantly. In 
> future, user might want this to query using D2.
>  

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to