[
https://issues.apache.org/jira/browse/OOZIE-588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13165102#comment-13165102
]
[email protected] commented on OOZIE-588:
-----------------------------------------------------
bq. On 2011-12-08 01:31:02, Prakhar Sharma wrote:
bq. >
trunk/core/src/main/java/org/apache/oozie/action/hadoop/LauncherMapper.java,
line 473
bq. > <https://reviews.apache.org/r/3059/diff/1/?file=62926#file62926line473>
bq. >
bq. > Using the same counter for stats (as for output) can result in cases
where we can not infer just by looking at counter value - who appended it.
bq. > It is better to have a separate counter for stats (in the same
counter group).
good point, separate counter added
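
To illustrate the point, here is a minimal sketch in plain Java of why a separate counter in the same group keeps the two writers distinguishable. It uses an in-memory map as a stand-in for Hadoop's counters; the class CounterGroupSketch and the group and counter names are hypothetical and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a counter store keyed by (group, counter name).
class CounterGroupSketch {
    // group name -> (counter name -> value)
    private final Map<String, Map<String, Long>> groups = new HashMap<>();

    // Adds amount to the named counter, creating the group and counter on first use.
    void increment(String groupName, String counterName, long amount) {
        groups.computeIfAbsent(groupName, g -> new HashMap<>())
              .merge(counterName, amount, Long::sum);
    }

    // Returns the counter value, or 0 if the group or counter was never written.
    long get(String groupName, String counterName) {
        Map<String, Long> group = groups.get(groupName);
        if (group == null) {
            return 0L;
        }
        Long value = group.get(counterName);
        return value == null ? 0L : value;
    }

    public static void main(String[] args) {
        CounterGroupSketch counters = new CounterGroupSketch();
        // Assumed names: one counter for action output, one for stats.
        counters.increment("launcher-group", "output", 1);
        counters.increment("launcher-group", "stats", 1);
        // Each value can now be attributed to exactly one writer.
        System.out.println("output=" + counters.get("launcher-group", "output"));
        System.out.println("stats=" + counters.get("launcher-group", "stats"));
    }
}
```

With a single shared counter, a value of 2 could mean two output writes, two stats writes, or one of each; with two counters the reader does not have to guess.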
bq. On 2011-12-08 01:31:02, Prakhar Sharma wrote:
bq. >
trunk/core/src/main/java/org/apache/oozie/action/hadoop/OoziePigStats.java,
line 40
bq. > <https://reviews.apache.org/r/3059/diff/1/?file=62927#file62927line40>
bq. >
bq. > ActionType in ctor redundant
changed
bq. On 2011-12-08 01:31:02, Prakhar Sharma wrote:
bq. > trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigMain.java,
line 368
bq. > <https://reviews.apache.org/r/3059/diff/1/?file=62929#file62929line368>
bq. >
bq. > Just return sb.toString() inside if block and you do not need the
following else block and related indentation.
done
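
For readers without the diff at hand, the early-return shape being suggested can be sketched as follows. This is not the PigMain code; the class EarlyReturnSketch, the method render, and the fallback parameter are hypothetical and only show the control flow.

```java
// Hypothetical example of returning inside the if block instead of using else.
class EarlyReturnSketch {
    // Returns the accumulated text when there is any, otherwise the fallback.
    static String render(StringBuilder sb, String fallback) {
        if (sb.length() > 0) {
            return sb.toString();
        }
        // No else block is needed: this line is reached only when the
        // condition above was false, so the code stays at one indentation level.
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(render(new StringBuilder("stats"), "none"));
        System.out.println(render(new StringBuilder(), "none"));
    }
}
```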
bq. On 2011-12-08 01:31:02, Prakhar Sharma wrote:
bq. >
trunk/core/src/main/java/org/apache/oozie/action/hadoop/OoziePigStats.java,
line 73
bq. > <https://reviews.apache.org/r/3059/diff/1/?file=62927#file62927line73>
bq. >
bq. > [Styling comment]
bq. > Setting separator = "" at start is confusing. An alternative can be
something like: -
bq. >
bq. > String separator = ",";
bq. > for (JobStats jobStats : jobGraph) {
bq. > if (sb.length() > 0) {
bq. > sb.append(separator);
bq. > }
bq. > // do whatever you are doing
bq. > }
bq. >
bq. > // .length() calls on containers are generally very efficient...
done
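
A self-contained version of the suggested pattern might look like the sketch below. It assumes plain strings in place of the JobStats entries; the class SeparatorJoinSketch and the method join are illustrative names, not code from the patch. Note that Java needs an explicit comparison, sb.length() > 0, because an int is not accepted as a condition.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example of the "append separator if the builder is non-empty" pattern.
class SeparatorJoinSketch {
    // Joins the items with "," by checking whether the builder already holds text.
    static String join(List<String> items) {
        String separator = ",";
        StringBuilder sb = new StringBuilder();
        for (String item : items) {
            if (sb.length() > 0) {
                sb.append(separator);
            }
            sb.append(item);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed sample ids, used only for illustration.
        System.out.println(join(Arrays.asList("job_1", "job_2", "job_3")));
    }
}
```

One caveat with this form: if the leading items are empty strings, the builder is still empty when the next item arrives, so no separator is written for them. Where empty entries are possible, a first-element flag or String.join may be the safer choice.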
- Virag
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3059/#review3722
-----------------------------------------------------------
On 2011-12-08 08:54:20, Virag Kothari wrote:
bq.
bq. -----------------------------------------------------------
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/3059/
bq. -----------------------------------------------------------
bq.
bq. (Updated 2011-12-08 08:54:20)
bq.
bq.
bq. Review request for oozie, Mohammad Islam and Angelo K. Huang.
bq.
bq.
bq. Summary
bq. -------
bq.
bq. This patch and https://reviews.apache.org/r/3047/diff address Oozie-588
bq.
bq.
bq. Docs on how to access pig stats through EL functions need to be added
bq.
bq.
bq. This addresses bug Oozie-588.
bq. https://issues.apache.org/jira/browse/Oozie-588
bq.
bq.
bq. Diffs
bq. -----
bq.
bq. trunk/core/src/test/java/org/apache/oozie/action/hadoop/TestPigMain.java
1211573
bq. trunk/core/src/test/java/org/apache/oozie/action/hadoop/PigTestCase.java
1211573
bq.
trunk/core/src/test/java/org/apache/oozie/action/hadoop/TestJavaActionExecutor.java
1211573
bq.
trunk/core/src/test/java/org/apache/oozie/action/hadoop/TestPigActionExecutor.java
1211573
bq. trunk/core/src/main/resources/oozie-default.xml 1211573
bq. trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigMain.java
1211573
bq.
trunk/core/src/main/java/org/apache/oozie/action/hadoop/PigActionExecutor.java
1211573
bq.
trunk/core/src/main/java/org/apache/oozie/action/hadoop/LauncherMapper.java
1211573
bq.
trunk/core/src/main/java/org/apache/oozie/action/hadoop/OoziePigStats.java
PRE-CREATION
bq.
trunk/core/src/main/java/org/apache/oozie/action/hadoop/JavaActionExecutor.java
1211573
bq.
bq. Diff: https://reviews.apache.org/r/3059/diff
bq.
bq.
bq. Testing
bq. -------
bq.
bq. Test cases for PigMain and PigAE added
bq. Tested pig action after running examples
bq.
bq.
bq. Thanks,
bq.
bq. Virag
bq.
bq.
> Oozie to allow drill down to hadoop job's details
> ---------------------------------------------------
>
> Key: OOZIE-588
> URL: https://issues.apache.org/jira/browse/OOZIE-588
> Project: Oozie
> Issue Type: New Feature
> Reporter: Mohammad Kamrul Islam
>
> High-level Requirements:
> -----------------------------------
> Since Oozie is designed as the gateway to the grid, we need to support a WS API
> for the most common hadoop commands through Oozie. Users don't want to go to
> multiple systems to get the required data. Based on this, we propose to
> implement the following requirements in Oozie.
>
> R1: Oozie will provide WS endpoints to get hadoop job details (including job
> counters).
> R2: It will support both types of hadoop jobs: MR jobs created for an MR
> action and MR jobs created as part of a pig script.
> R3: In addition, for the pig action, Oozie will provide a way to query the pig
> stats.
> Proposed design:
> ----------------------
> D1: Oozie will store the *summary* job counters/pig stats in the Oozie DB.
> The items in the summary stats will be determined by Oozie to limit the size.
> However, the commonly used stats will be included in the summary. It is
> important to note that summary information will be collected *after* the job
> has finished.
>
> D2: If the user asks for *detailed* hadoop job stats, the user needs to
> query using a different WS API. In this query, a user will specify *a* hadoop
> job id. Oozie will directly query the hadoop JT/RM/HS. Since it is an
> external call with undetermined response time, Oozie will handle only one
> hadoop job id per request to avoid a timeout in the WS call. Caveats: If hadoop
> is down or the job is not in the JT/RM/History Server, Oozie will fail to
> collect the details.
>
> D3: For pig, Oozie will store the pig-generated hadoop job ids in its DB and
> will expose them to the user through the "verbose" query.
> D4: Oozie will need to collect the summary pig stats and corresponding
> job counters and store them in the Oozie DB. PigStats has a way of getting the
> job counters for each hadoop job that it submits. We could use that API to
> collect summary counters for pig-created jobs.
> D5: The complete/detailed pig stats will be stored in the Pig Launcher Mapper
> as a job counter, so that if a user wants the detailed pig stats, we could
> get them from the LM directly.
>
> Open questions:
> ----------------------
> * What should be in the summary counters/stats?
> * What is the max size of stats?
> Advanced planning: <Not in the scope of this task, but might be required for
> the design to support later>
> --------------------------
> * Some users are asking to query the job stats while the job is RUNNING. They
> need them to decide on subsequent job submissions.
> * With the above design, a user could use D2 to get the counters while an MR
> action is running.
> * However, for pig, it is not that straightforward, because Pig submits the
> jobs during execution. But the new PigRunner provides a listener concept where
> the user can get notifications, such as when a new MR job is submitted and its ID.
> * By using this, Oozie could get the running hadoop job id instantly. In
> future, users might want to use it to query via D2.
>
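
As a rough illustration of design point D1 above (a size-limited summary chosen by Oozie), the selection step could be sketched in Java as a filter over the full counter set. Everything here is an assumption made for illustration: the class SummaryStatsSketch, the method summarize, and the sample counter names are not from the patch, and the open question of which counters belong in the summary is left as a caller-supplied set.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: keep only a pre-agreed subset of counters as the summary.
class SummaryStatsSketch {
    // Returns the entries of allCounters whose names appear in summaryNames,
    // preserving the iteration order of allCounters.
    static Map<String, Long> summarize(Map<String, Long> allCounters,
                                       Set<String> summaryNames) {
        Map<String, Long> summary = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : allCounters.entrySet()) {
            if (summaryNames.contains(entry.getKey())) {
                summary.put(entry.getKey(), entry.getValue());
            }
        }
        return summary;
    }

    public static void main(String[] args) {
        // Sample values are made up for the example.
        Map<String, Long> all = new LinkedHashMap<>();
        all.put("MAP_INPUT_RECORDS", 1000L);
        all.put("SPILLED_RECORDS", 40L);
        all.put("REDUCE_OUTPUT_RECORDS", 250L);

        Set<String> wanted = new LinkedHashSet<>(
                Arrays.asList("MAP_INPUT_RECORDS", "REDUCE_OUTPUT_RECORDS"));
        System.out.println(summarize(all, wanted));
    }
}
```

Keeping the allowed names in one set means the size of what is stored in the Oozie DB is bounded by that set, whatever a given job reports.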