Hello,
 
I have written a Pig script and tried to execute it, but part-way through execution 
it fails with the error java.io.IOException: Spill failed. I have included the 
statements below in my script, and I have also set the classpath for the hadoop-lzo jar.
1) set mapred.compress.map.output true;
2) set mapred.map.output.compression.codec com.hadoop.compression.lzo.LzopCodec;
 
The error is caused by java.lang.RuntimeException: native-lzo library not 
available, even though I have set the CLASSPATH for the hadoop-lzo jar.
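From what I understand, the CLASSPATH only covers the hadoop-lzo *jar*; the native 
liblzo2/libgplcompression shared libraries also have to be visible on the JVM's native 
library path for every task JVM. A sketch of what I mean (the paths here are placeholders, 
not my actual installation):

```shell
# Sketch only -- paths are assumptions, adjust to your installation.
# The jar goes on the Java classpath:
export HADOOP_CLASSPATH=/path/to/hadoop-lzo.jar:$HADOOP_CLASSPATH
# The native .so files must be on the native library path (java.library.path),
# otherwise LzoCodec throws "native-lzo library not available":
export JAVA_LIBRARY_PATH=/path/to/hadoop-lzo/native/lib:$JAVA_LIBRARY_PATH
```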

While searching for this error, I found the following explanation:
"Since you have a very large number of records, if the individual records are 
small it's likely the map task is spilling not because the data buffer is full, 
but because the accounting area is full."
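Based on that explanation, would it help to enlarge the map-side sort buffer and the 
fraction reserved for record accounting? Something like the sketch below in my script 
(values are illustrative guesses, not tested settings, for Hadoop 1.x property names):

```pig
-- Sketch only: values are illustrative, tune for the cluster.
set io.sort.mb 200;               -- total map-side sort buffer, in MB (default 100)
set io.sort.record.percent 0.15;  -- fraction reserved for record accounting (default 0.05)
```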
 
So what should I do to avoid the spill failure?
The full exception is below.
 
Backend error message
---------------------
java.io.IOException: Spill failed
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1213)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1194)
 at java.io.DataOutputStream.writeByte(DataOutputStream.java:153)
 at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:555)
 at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:435)
 at org.apache.pig.data.utils.SedesHelper.writeGenericTuple(SedesHelper.java:135)
 at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:613)
 at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:443)
 at org.apache.pig.data.BinSedesTuple.write(BinSedesTuple.java:41)
 at org.apache.pig.impl.io.PigNullableWritable.write(PigNullableWritable.java:123)
 at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
 at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1061)
 at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
 at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map.collect(PigGenericMapReduce.java:123)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.runPipeline(PigGenericMapBase.java:285)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:278)
 at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapBase.map(PigGenericMapBase.java:64)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:416)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: native-lzo library not available
 at com.hadoop.compression.lzo.LzoCodec.getCompressorType(LzoCodec.java:135)
 at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:100)
 at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:112)
 at org.apache.hadoop.mapred.IFile$Writer.<init>(IFile.java:101)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1407)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.access$1800(MapTask.java:853)
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1344)
