[ https://issues.apache.org/jira/browse/MAPREDUCE-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829823#action_12829823 ]

Dick King commented on MAPREDUCE-1309:
--------------------------------------

I am submitting a new patch.

Some of Hudson's points are well taken: I have cleaned up the javac, findbugs, 
and release audit warnings.

The javadoc warnings are in a module this patch does not touch; I'll fix them 
under MAPREDUCE-1459.

The three test failures are all in 
org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter.test* . They 
appear to stem from an error reading a zip file while the test harness loads 
its configuration, which I understand is a recurring build-infrastructure 
problem, not from tests that actually run and fail. Here is one example:

{noformat}
Error Message

java.util.zip.ZipException: error reading zip file

Stacktrace

java.lang.RuntimeException: java.util.zip.ZipException: error reading zip file
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1715)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1529)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1475)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:564)
        at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:1892)
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:347)
        at org.apache.hadoop.mapred.HadoopTestCase.setUp(HadoopTestCase.java:145)
        at org.apache.hadoop.mapreduce.lib.output.TestJobOutputCommitter.setUp(TestJobOutputCommitter.java:59)
Caused by: java.util.zip.ZipException: error reading zip file
        at java.util.zip.ZipFile.read(Native Method)
        at java.util.zip.ZipFile.access$1200(ZipFile.java:29)
        at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:447)
        at java.util.zip.ZipFile$1.fill(ZipFile.java:230)
        at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:141)
        at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:105)
        at java.io.FilterInputStream.read(FilterInputStream.java:66)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2910)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:704)
        at com.sun.org.apache.xerces.internal.impl.XMLVersionDetector.determineDocVersion(XMLVersionDetector.java:186)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:771)
        at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
        at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:107)
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:225)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:283)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:180)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1628)
{noformat}

See 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-h6.grid.sp2.yahoo.net/433/testReport/org.apache.hadoop.mapreduce.lib.output/TestJobOutputCommitter/testCustomAbort/ .

> I want to change the rumen job trace generator to use a more modular internal 
> structure, to allow for more input log formats 
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1309
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1309
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Dick King
>            Assignee: Dick King
>         Attachments: demuxer-plus-concatenated-files--2009-12-21.patch, 
> demuxer-plus-concatenated-files--2010-01-06.patch, 
> demuxer-plus-concatenated-files--2010-01-08-b.patch, 
> demuxer-plus-concatenated-files--2010-01-08-c.patch, 
> demuxer-plus-concatenated-files--2010-01-08-d.patch, 
> demuxer-plus-concatenated-files--2010-01-08.patch, 
> demuxer-plus-concatenated-files--2010-01-11.patch, 
> mapreduce-1309--2009-01-14-a.patch, mapreduce-1309--2009-01-14.patch, 
> mapreduce-1309--2010-01-20.patch, mapreduce-1309--2010-02-03.patch
>
>
> There are two orthogonal questions to answer when processing a job tracker 
> log: how are the logs and the XML configuration files packaged, and in 
> which release of Hadoop Map/Reduce were the logs generated?  The existing 
> rumen only has a couple of answers to these questions.  The new engine will 
> handle three answers to the version question: 0.18, 0.20 and current, and two 
> answers to the packaging question: separate files with names derived from the 
> job ID, and concatenated files with a header between sections [used for 
> easier file interchange].
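
For concreteness, here is a rough Java sketch of the two-axis split the quoted 
description calls for: one interface per packaging scheme and one per log 
version, chosen independently. The names and signatures below (InputDemuxer, 
JobHistoryParser, NamedStream, TraceEventHandler) are illustrative assumptions 
for this comment, not the actual types in the attached patches.

{noformat}
// Illustrative sketch only -- the names and signatures here are assumptions
// made for this comment, not the contents of the MAPREDUCE-1309 patches.
import java.io.IOException;
import java.io.InputStream;

/** Carrier for one demultiplexed log or configuration stream. */
class NamedStream {
    final String name;
    final InputStream stream;
    NamedStream(String name, InputStream stream) {
        this.name = name;
        this.stream = stream;
    }
}

/** Answers the packaging question: are the job history logs and XML
 *  configuration files separate per-job files, or one concatenated file
 *  with a header between sections? */
interface InputDemuxer {
    /** Bind this demuxer to an input source (a directory or a single file). */
    void bindTo(String path) throws IOException;
    /** Return the next (name, stream) pair, or null when input is exhausted. */
    NamedStream getNext() throws IOException;
}

/** Placeholder for whatever consumes version-independent trace events. */
interface TraceEventHandler {
    void handle(Object event);
}

/** Answers the version question: which Hadoop Map/Reduce release
 *  (0.18, 0.20, or current) produced a given log stream? */
interface JobHistoryParser {
    /** Parse one job history stream into version-independent events. */
    void parse(InputStream in, TraceEventHandler handler) throws IOException;
}
{noformat}

With that split, supporting a new packaging scheme means adding one more 
InputDemuxer implementation, and supporting a new log format version means 
adding one more JobHistoryParser, without touching the other axis.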

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
