Hi, I am using LZO to compress my intermediate map outputs.
These are the settings:

mapred.map.output.compression.codec = com.hadoop.compression.lzo.LzoCodec
pig.tmpfilecompression.codec = lzo

But I am consistently getting the following exception (I don't get this exception when I use "gz" as pig.tmpfilecompression.codec). Perhaps a bug? I am using Hadoop 0.20.2 and Pig 0.8.1.

java.io.EOFException.
    at org.apache.hadoop.io.compress.BlockDecompressorStream.getCompressedData(BlockDecompressorStream.java:112)
    at org.apache.hadoop.io.compress.BlockDecompressorStream.decompress(BlockDecompressorStream.java:88)
    at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
    at org.apache.hadoop.mapred.IFile$Reader.readData(IFile.java:328)
    at org.apache.hadoop.mapred.IFile$Reader.rejigData(IFile.java:358)
    at org.apache.hadoop.mapred.IFile$Reader.readNextBlock(IFile.java:342)
    at org.apache.hadoop.mapred.IFile$Reader.next(IFile.java:404)
    at org.apache.hadoop.mapred.Merger$Segment.next(Merger.java:220)
    at org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:330)
    at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:350)
    at org.apache.hadoop.mapreduce.ReduceContext.nextKeyValue(ReduceContext.java:117)
    at org.apache.hadoop.mapreduce.ReduceContext.nextKey(ReduceContext.java:92)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:175)
    at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1217)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1500)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1116)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:512)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:585)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

Thanks,
-Rakesh