[ https://issues.apache.org/jira/browse/HADOOP-4640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12650119#action_12650119 ]
Hadoop QA commented on HADOOP-4640:
-----------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12394241/HADOOP-4640.patch
  against trunk revision 719787.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    -1 findbugs.  The patch appears to introduce 1 new Findbugs warnings.

    +1 Eclipse classpath.  The patch retains Eclipse classpath integrity.

    -1 core tests.  The patch failed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3642/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3642/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3642/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3642/console

This message is automatically generated.

> Add ability to split text files compressed with lzo
> ---------------------------------------------------
>
>                 Key: HADOOP-4640
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4640
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: io, mapred
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>            Priority: Trivial
>             Fix For: 0.20.0
>
>         Attachments: HADOOP-4640.patch, HADOOP-4640.patch, HADOOP-4640.patch,
> HADOOP-4640.patch
>
>
> Right now any file compressed with lzop will be processed by one mapper. This
> is a shame since the lzo algorithm would be very suitable for large log files
> and similar common hadoop data sets. The compression rate is not the best out
> there but the decompression speed is amazing. Since lzo writes compressed
> data in blocks it would be possible to make an input format that can split
> the files.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.