Compression for intermediate map output is broken
-------------------------------------------------
Key: HADOOP-2943
URL: https://issues.apache.org/jira/browse/HADOOP-2943
Project: Hadoop Core
Issue Type: Bug
Components: mapred
Reporter: Chris Douglas
It looks like SequenceFile::RecordCompressWriter and
SequenceFile::BlockCompressWriter weren't updated to use the new serialization
added in HADOOP-1986. This causes failures in the merge when
mapred.compress.map.output is true and mapred.map.output.compression.type=BLOCK:
{noformat}
java.io.IOException: File is corrupt!
	at org.apache.hadoop.io.SequenceFile$Reader.readBlock(SequenceFile.java:1656)
	at org.apache.hadoop.io.SequenceFile$Reader.nextRawKey(SequenceFile.java:1969)
	at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:2985)
	at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2785)
	at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2494)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:654)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:740)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:212)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2077)
{noformat}
mapred.map.output.compression.type=RECORD still works for Writables, but
SequenceFile::RecordCompressWriter should be updated to the new serialization as well.
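For reference, the failure can be reproduced with job settings like the following
(the property names are taken from this report; placing them in a site/job
configuration file rather than setting them programmatically is just one way to
apply them):
{noformat}
<!-- Enable compression of intermediate map output -->
<property>
  <name>mapred.compress.map.output</name>
  <value>true</value>
</property>
<!-- BLOCK compression triggers the corrupt-file error in the merge;
     RECORD still works for Writables -->
<property>
  <name>mapred.map.output.compression.type</name>
  <value>BLOCK</value>
</property>
{noformat}
With these settings, any job whose map output spills and merges should hit the
IOException above.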