[jira] [Commented] (HADOOP-11334) Mapreduce Job Failed due to failure fetching mapper output on the reduce side

Yuanbo Liu (JIRA) Tue, 05 Apr 2016 23:57:08 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15227852#comment-15227852
 ]


Yuanbo Liu commented on HADOOP-11334:
-------------------------------------

It seems that the property "io.native.lib.available" has already been 
discussed. 
This property will continue to be supported in 2.x for backwards 
compatibility, and will be removed in 3.x.
Akira AJISAKA has provided a documentation patch for 2.x to clarify that the 
property only controls bz2 and zlib. See the details here: 
https://issues.apache.org/jira/browse/HADOOP-8642
So I think we can just close this jira, since the community has already 
reached a decision.



> Mapreduce Job Failed due to failure fetching mapper output on the reduce side
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-11334
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11334
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 2.4.1
>            Reporter: Jinghui Wang
>            Assignee: Yuanbo Liu
>
> Running terasort with the following options:
>   hadoop jar hadoop-mapreduce-examples.jar terasort \
>     -Dio.native.lib.available=false \
>     -Dmapreduce.map.output.compress=true \
>     -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.GzipCodec \
>     /tmp/tera-in /tmp/tera-out
> The job failed because the reducers could not fetch the output from the 
> mappers (see the stack trace below). The cause is that MAPREDUCE-1784 added 
> support for handling null compressors by defaulting to non-compressed output. 
> In this case, when io.native.lib.available is set to false, the compressor 
> is null, so the map output is written uncompressed. However, the 
> decompressor has a Java implementation, so when the reducer reads the mapper 
> output it uses that decompressor, but the output does not have the Gzip 
> header.
> 2014-11-25 10:39:48,108 WARN [fetcher#9] 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher: Failed to shuffle output of 
> attempt_1416875111322_0005_m_000002_0 from bdvs130:13562
> java.io.IOException: not a gzip file
>       at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.processBasicHeader(BuiltInGzipDecompressor.java:495)
>       at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.executeHeaderState(BuiltInGzipDecompressor.java:256)
>       at 
> org.apache.hadoop.io.compress.zlib.BuiltInGzipDecompressor.decompress(BuiltInGzipDecompressor.java:185)
>       at 
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:91)
>       at 
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
>       at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
>       at 
> org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97)
>       at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyMapOutput(Fetcher.java:434)
>       at 
> org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:341)
>       at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:165)
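
The mismatch described in the quoted report can be illustrated in miniature with the JDK alone. The sketch below is not Hadoop code: the class GzipMismatchDemo and the methods writeMapOutput and readMapOutput are hypothetical stand-ins, and java.util.zip is used in place of Hadoop's codec classes. It assumes only what the report states, namely that the writer falls back to uncompressed output when no compressor is available while the reader always decompresses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustration only: hypothetical names, JDK gzip classes instead of Hadoop codecs.
class GzipMismatchDemo {

    // Stand-in for the map side. With no compressor available, the bytes are
    // written as-is, mirroring the null-compressor fallback described in the report.
    static byte[] writeMapOutput(byte[] data, boolean compressorAvailable) throws IOException {
        if (!compressorAvailable) {
            return data.clone();
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(data);
        }
        return buf.toByteArray();
    }

    // Stand-in for the reduce side. It always expects gzip data, so it fails
    // with an IOException when the input lacks a gzip header.
    static byte[] readMapOutput(byte[] bytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] record = "terasort record".getBytes(StandardCharsets.UTF_8);

        // Writer and reader agree: the round trip succeeds.
        byte[] compressed = writeMapOutput(record, true);
        System.out.println(new String(readMapOutput(compressed), StandardCharsets.UTF_8));

        // Writer fell back to plain bytes, reader still decompresses: the read fails.
        byte[] fallback = writeMapOutput(record, false);
        try {
            readMapOutput(fallback);
        } catch (IOException e) {
            System.out.println("read failed: " + e.getMessage());
        }
    }
}
```

The second read fails for the same kind of reason as the "not a gzip file" error in the stack trace above: the two sides decide independently whether compression is in effect, and they reach different answers.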



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
