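Sounds like LZO is not set up correctly on the slave nodes. Having the hadoop-lzo jar on the CLASSPATH is not enough: the "native-lzo library not available" error means the codec could not load its native library on the node running the map task. Can you check what's suggested in this conversation? https://groups.google.com/forum/#!msg/elephantbird-dev/1be_Tjyd2gw/aY1e8w_egTEJ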
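
In case it helps while checking the nodes, here is a rough sketch of a script you could run on each slave to see whether the native pieces are present in the directories the task JVMs search. The two file name prefixes are my assumption (hadoop-lzo's JNI library is normally built as libgplcompression and depends on liblzo2), and the function names are made up for this example. Pass it whatever java.library.path your TaskTracker actually uses.

```python
import os
import sys

# Assumed file name prefixes: hadoop-lzo's JNI glue and the LZO library itself.
REQUIRED_PREFIXES = ("libgplcompression", "liblzo2")


def find_native_lzo(library_path):
    """Map each required prefix to the matching files found on library_path."""
    found = {prefix: [] for prefix in REQUIRED_PREFIXES}
    for directory in library_path.split(os.pathsep):
        # Skip empty entries and directories that do not exist on this node.
        if not directory or not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            for prefix in REQUIRED_PREFIXES:
                if name.startswith(prefix):
                    found[prefix].append(os.path.join(directory, name))
    return found


def missing_libraries(library_path):
    """Return the required prefixes with no matching file on library_path."""
    return [p for p, files in find_native_lzo(library_path).items() if not files]


if __name__ == "__main__":
    # Use the path given on the command line, else fall back to LD_LIBRARY_PATH.
    path = sys.argv[1] if len(sys.argv) > 1 else os.environ.get("LD_LIBRARY_PATH", "")
    for prefix, files in find_native_lzo(path).items():
        print(prefix, "->", ", ".join(files) if files else "NOT FOUND")
    sys.exit(1 if missing_libraries(path) else 0)
```

If either entry comes back as not found on a slave, that node is a likely source of the failed spill. Finding the files is only a necessary condition, though: a library built for the wrong architecture would still fail to load.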
On Thu, Jul 18, 2013 at 9:27 PM, Bhavesh Shah <[email protected]> wrote:

> Hello,
>
> I have written one PIG Script and tried to execute it, but after executing
> some part it gives me error java.io.IOException: Spill failed. I have
> included below statements in my script. And also I have set the classpath
> for hadoop-LZO jar.
> 1) set mapred.compress.map.output true;
> 2) set mapred.map.output.compression.codec com.hadoop.compression.lzo.LzopCodec;
>
> This error is caused by java.lang.RuntimeException: native-lzo library not
> available. But I have set the CLASSPATH for Hadoop-LZO jar.
>
> After searching for this error I came to know some like:
> "Since you have a very large number of records, if the individual records
> are small it's likely the
> map task is spilling not because the data buffer is full, but because the
> accounting area is full."
>
> So what should I do to avoid the Spill Failed?
> Below is the exception I got.
>
> Backend error message
> ---------------------
> java.io.IOException: Spill failed
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1213)
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1194)
>   at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
>   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:555)
>   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:435)
>   at org.apache.pig.data.utils.SedesHelper.writeGenericTuple(SedesHelper.java:135)
>   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:613)
>   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:443)
>   at org.apache.pig.data.BinSedesTuple.write(BinSedesTuple.java:41)
>   at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:123)
>   at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
>   at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1061)
>   at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
>   at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:123)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:285)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
>   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
>   at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>   at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:416)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>   at org.apache.hadoop.mapred.Child.main(Child.java:249)
> Caused by: java.lang.RuntimeException: native-lzo library not available
>   at com.hadoop.compression.lzo.LzoCodec.getCompressorType(LzoCodec.java:135)
>   at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:100)
>   at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
>   at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:101)
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1407)
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
>   at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)
