Here is an update on the problem.
I tried a very simple example (word count) and tried to compress the
reducer output using DefaultCodec or GzipCodec. I didn't try LZO, to
avoid further trouble. I didn't use a combiner class, and set the
number of reducers to 1. I am running on 64-bit Debian. My Java
version is:
java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) 64-Bit Server VM (build 1.6.0-b105, mixed mode)
I use SequenceFileOutputFormat, and the output value class is a Vector.
At first, I didn't enable compressOutput and there was no problem; the
SequenceFile was generated correctly. However, when I compressed the
output by adding the following three lines:
SequenceFileOutputFormat.setCompressOutput(conf,true);
SequenceFileOutputFormat.setOutputCompressorClass(conf,
DefaultCodec.class);
SequenceFileOutputFormat.setOutputCompressionType(conf,
SequenceFile.CompressionType.BLOCK);
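For context, this is roughly where those three calls sit in my job setup (a minimal sketch using the old 0.19-era JobConf API; the class names, value type, and paths here are illustrative, not my exact code):

```java
// Sketch of the word-count driver (Hadoop 0.19 "old" API).
// WordCount, VectorWritable, and the paths are placeholders.
JobConf conf = new JobConf(WordCount.class);
conf.setJobName("wordcount-compressed");

conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(VectorWritable.class); // my Vector value class
conf.setNumReduceTasks(1);                      // single reducer, no combiner

conf.setOutputFormat(SequenceFileOutputFormat.class);

// The three lines that trigger the crash:
SequenceFileOutputFormat.setCompressOutput(conf, true);
SequenceFileOutputFormat.setOutputCompressorClass(conf, DefaultCodec.class);
SequenceFileOutputFormat.setOutputCompressionType(conf,
    SequenceFile.CompressionType.BLOCK);

FileInputFormat.setInputPaths(conf, new Path("input"));
FileOutputFormat.setOutputPath(conf, new Path("output"));
JobClient.runJob(conf);
```

Without the three compression calls this job runs cleanly; with them, the reducer attempt dies as shown below.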
the reducer kept generating errors and the task finally failed:
11/03/09 12:28:05 INFO mapred.JobClient: map 100% reduce 33%
11/03/09 12:28:09 INFO mapred.JobClient: map 100% reduce 0%
11/03/09 12:28:09 INFO mapred.JobClient: Task Id :
attempt_201103081457_0024_r_000000_0, Status : FAILED
java.io.IOException: Task process exit with nonzero status of 134.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)
11/03/09 12:28:19 INFO mapred.JobClient: map 100% reduce 29%
11/03/09 12:28:21 INFO mapred.JobClient: map 100% reduce 0%
11/03/09 12:28:22 INFO mapred.JobClient: Task Id :
attempt_201103081457_0024_r_000000_1, Status : FAILED
java.io.IOException: Task process exit with nonzero status of 134.
at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418)
I checked the logs of the only reducer: there is no error in syslog,
but the stdout file contains this error report:
# An unexpected error has been detected by Java Runtime Environment:
#
# SIGFPE (0x8) at pc=0x00002b22eecc7b83, pid=13306, tid=1076017504
#
# Java VM: Java HotSpot(TM) 64-Bit Server VM (1.6.0-b105 mixed mode)
# Problematic frame:
# C [ld-linux-x86-64.so.2+0x7b83]
#
# An error report file with more information is saved as hs_err_pid13306.log
#
# If you would like to submit a bug report, please visit:
# http://java.sun.com/webapps/bugreport/crash.jsp
So I would like to know how I should solve this problem. Should I
upgrade anything? I guess this problem is not new. Thanks for any
information.
Shi
On 3/8/2011 4:04 PM, Shi Yu wrote:
What is the real cause of this? I found many reports on the web, but
couldn't find an exact solution. I hit this problem when using
compressed SequenceFile output:
SequenceFileOutputFormat.setCompressOutput(conf, true);
SequenceFileOutputFormat.setOutputCompressorClass(conf, GzipCodec.class);
SequenceFileOutputFormat.setOutputCompressionType(conf,
CompressionType.BLOCK);
If I remove those three lines, everything is fine. I am using Hadoop
0.19.2; is there any way to avoid the problem without upgrading Hadoop?
Thanks!
Shi