[ https://issues.apache.org/jira/browse/HADOOP-5539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712400#action_12712400 ]
Hadoop QA commented on HADOOP-5539:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12408652/hadoop-5539-v1.patch
  against trunk revision 777761.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no tests are needed for this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 Eclipse classpath.  The patch retains Eclipse classpath integrity.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/386/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/386/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/386/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch-vesta.apache.org/386/console

This message is automatically generated.

> o.a.h.mapred.Merger not maintaining map out compression on intermediate files
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-5539
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5539
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.1
>         Environment: 0.19.2-dev, r753365
>            Reporter: Billy Pearson
>            Priority: Blocker
>             Fix For: 0.19.2
>
>         Attachments: 5539.patch, hadoop-5539-v1.patch, hadoop-5539.patch
>
>
> hadoop-site.xml:
> mapred.compress.map.output = true
> Map output files are compressed, but when the in-memory merger closes on the
> reduce, the on-disk merger runs to reduce the input files to <= io.sort.factor
> if needed.
> When this happens it writes files called intermediate.x, and these do not
> maintain the compression setting. The writer (o.a.h.mapred.Merger.class
> line 432) is passed the codec, but I added some logging and it is always null,
> whether map output compression is set to true or false.
> This causes tasks to fail if they cannot hold the uncompressed size of the
> data the reduce is holding.
> I think this is just an oversight: the codec is not getting set correctly for
> the on-disk merges.
> {code}
> 2009-03-20 01:30:30,005 INFO org.apache.hadoop.mapred.Merger: Merging 30 intermediate segments out of a total of 3000
> 2009-03-20 01:30:30,005 INFO org.apache.hadoop.mapred.Merger: intermediate.1 used codec: null
> {code}
> I added
> {code}
>   // added by me
>   if (codec != null) {
>     LOG.info("intermediate." + passNo + " used codec: " + codec.toString());
>   } else {
>     LOG.info("intermediate." + passNo + " used codec: Null");
>   }
>   // end added by me
> {code}
> just before the creation of the writer (o.a.h.mapred.Merger.class line 432),
> and it outputs the second line above.
> I have confirmed this with the logging, and I have also looked at the files on
> the tasktracker's disk. I can read the data in the intermediate files in the
> clear, which tells me they are not compressed, but I cannot read the map.out
> files taken directly from the map output. So compression is working on the map
> end but not in the on-disk merge that produces the intermediate files.
> I can see no benefit in these files not maintaining the compression setting,
> and it looks as if they were intended to maintain it.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
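
The failure mode described in the report above (a writer that is handed a null codec and silently writes uncompressed data) can be illustrated with a small JDK-only sketch. This is not Hadoop code: the `Codec` interface, `writeSegment`, and `sampleRecords` are hypothetical stand-ins invented for the illustration, and `java.util.zip.GZIPOutputStream` takes the place of a real map-output codec.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class IntermediateCodecSketch {

    /** Hypothetical stand-in for a compression codec; not Hadoop's CompressionCodec. */
    interface Codec {
        OutputStream wrap(OutputStream raw) throws IOException;
    }

    /** Gzip-backed codec built from the JDK only. */
    static final Codec GZIP = raw -> new GZIPOutputStream(raw);

    /**
     * Writes one "intermediate" segment into memory and returns its bytes.
     * A null codec is accepted without complaint, so the bytes go out
     * uncompressed, which mirrors the behaviour the report describes.
     */
    static byte[] writeSegment(byte[] records, Codec codec) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream out = (codec != null) ? codec.wrap(sink) : sink;
        out.write(records);
        out.close(); // finishes the gzip stream when a codec is in use
        return sink.toByteArray();
    }

    /** Builds a highly repetitive payload so compression has a visible effect. */
    static byte[] sampleRecords() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10_000; i++) {
            sb.append("key\tvalue\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] records = sampleRecords();
        byte[] withCodec = writeSegment(records, GZIP);
        byte[] withoutCodec = writeSegment(records, null);
        System.out.println("input bytes:          " + records.length);
        System.out.println("segment, codec set:   " + withCodec.length);
        System.out.println("segment, codec null:  " + withoutCodec.length);
    }
}
```

With a null codec the segment is exactly as large as its input, which matches the observation that the intermediate files are readable in the clear. Logging the codec just before the writer is created, as the reporter did, is the kind of check that makes such a silent fallback visible.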