[jira] Commented: (MAPREDUCE-1309) I want to change the rumen job trace generator to use a more modular internal structure, to allow for more input log formats

Hong Tang (JIRA) Fri, 12 Feb 2010 17:43:51 -0800

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12833278#action_12833278
 ]


Hong Tang commented on MAPREDUCE-1309:
--------------------------------------

The latest patch (2010-02-12) looks good. Only a few minor comments below:

- incorrect changes in javadoc comments for 
mapred.FileInputFormat.setInputPaths and 
mapreduce.lib.input.FileInputFormat.setInputPaths.
- in TestRumenJobTraces.java, change "path.makeQualified(fs)" to 
"path.makeQualified(fs.getUri(), fs.getWorkingDirectory())" to avoid explicit 
suppression of warnings.
- DefaultInputDemuxer does not guard against misuse such as 
demuxer.bindTo(...); demuxer.close(); demuxer.getNext(). It would be better to 
add a finally block in close() that resets name and input to null.
- unused import junit.Ignore in HadoopLogAnalyzer.java
- HEE.nameNames() should be removed.
- processReduceAttemptFinishedEvent and processMapAttemptFinishedEvent each 
contain a commented-out line as follows: "// attempt.setLocation(???);". Is it 
a placeholder for some TODO work? If so, please expand the comment.
- should the following be removed from JobBuilder.processJobFinishedEvent()? 
"// ???? result.setOutcome(event.)"
- MapAttempt20LineHistoryEventEmitter.makePrototype does not seem to be used.
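
To make the DefaultInputDemuxer comment concrete, here is a minimal sketch of the kind of close() I have in mind. It is not the code from the patch: SketchDemuxer is a hypothetical stand-in, and its bindTo takes a name and a stream directly (an assumption made so the example compiles without Hadoop on the classpath). The point is only that the finally block clears both fields even when the underlying stream's close() throws, so a later getNext() returns null instead of handing out a closed stream.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical stand-in for DefaultInputDemuxer; not the class from the patch.
class SketchDemuxer implements Closeable {
  private String name;
  private InputStream input;

  // Simplified signature (assumption): the real method takes a Path and a
  // Configuration and opens the stream itself.
  public void bindTo(String newName, InputStream newInput) throws IOException {
    if (name != null) {
      close(); // release a previous binding that was never consumed
    }
    name = newName;
    input = newInput;
  }

  // Hands out the bound (name, stream) pair once, then returns null.
  public Map.Entry<String, InputStream> getNext() {
    if (name == null) {
      return null;
    }
    Map.Entry<String, InputStream> ret =
        new AbstractMap.SimpleImmutableEntry<>(name, input);
    name = null;
    input = null;
    return ret;
  }

  @Override
  public void close() throws IOException {
    try {
      if (input != null) {
        input.close();
      }
    } finally {
      // Reset state even if input.close() threw, so getNext() after close()
      // cannot return a stale name or a closed stream.
      name = null;
      input = null;
    }
  }
}

class SketchDemuxerDemo {
  public static void main(String[] args) throws IOException {
    SketchDemuxer demuxer = new SketchDemuxer();
    demuxer.bindTo("job.log", new ByteArrayInputStream(new byte[] {1, 2, 3}));
    demuxer.close();
    // The misuse sequence from the comment: bindTo, close, getNext.
    System.out.println(demuxer.getNext() == null);
  }
}
```

With this shape, the bindTo/close/getNext sequence yields null from getNext() rather than a pair wrapping a closed stream.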

> I want to change the rumen job trace generator to use a more modular internal 
> structure, to allow for more input log formats 
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1309
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1309
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Dick King
>            Assignee: Dick King
>         Attachments: demuxer-plus-concatenated-files--2009-12-21.patch, 
> demuxer-plus-concatenated-files--2010-01-06.patch, 
> demuxer-plus-concatenated-files--2010-01-08-b.patch, 
> demuxer-plus-concatenated-files--2010-01-08-c.patch, 
> demuxer-plus-concatenated-files--2010-01-08-d.patch, 
> demuxer-plus-concatenated-files--2010-01-08.patch, 
> demuxer-plus-concatenated-files--2010-01-11.patch, 
> mapreduce-1309--2009-01-14-a.patch, mapreduce-1309--2009-01-14.patch, 
> mapreduce-1309--2010-01-20.patch, mapreduce-1309--2010-02-03.patch, 
> mapreduce-1309--2010-02-04.patch, mapreduce-1309--2010-02-10.patch, 
> mapreduce-1309--2010-02-12.patch
>
>
> There are two orthogonal questions to answer when processing a job tracker 
> log: how will the logs and the xml configuration files be packaged, and in 
> which release of hadoop map/reduce were the logs generated?  The existing 
> rumen only has a couple of answers to this question.  The new engine will 
> handle three answers to the version question: 0.18, 0.20 and current, and two 
> answers to the packaging question: separate files with names derived from the 
> job ID, and concatenated files with a header between sections [used for 
> easier file interchange].

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
