[ 
https://issues.apache.org/jira/browse/TEZ-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-3104:
---------------------------------
    Attachment: TEZ-3104.1.patch

Attaching sample patch.

Tez reuses CodecPool inside fetcher threads but does not initialize the native 
bits before starting those threads. This creates memory tension and causes the 
native bits not to be loaded in some threads.

In addition, the job also succeeded when I set 
mapreduce.reduce.shuffle.parallelcopies=1, since a single fetcher thread is 
always in sync with itself.
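
A minimal sketch of the warm-up idea described above: do the one-time 
initialization on the launching thread, then start the fetchers. This is not 
the contents of TEZ-3104.1.patch and it does not use Hadoop's API. FakeCodec, 
initNative, decompressorKind and runFetchers are hypothetical stand-ins, and 
the "native"/"dummy" labels only mimic the fallback seen in the stack trace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WarmUpBeforeFetchers {

    /** Hypothetical stand-in for a codec whose native support needs a one-time setup. */
    static final class FakeCodec {
        private volatile boolean nativeLoaded = false;

        /** Simulates the one-time native setup; intended to run once, on one thread. */
        void initNative() {
            nativeLoaded = true;
        }

        /** Reports "native" once setup has completed, otherwise the "dummy" fallback. */
        String decompressorKind() {
            return nativeLoaded ? "native" : "dummy";
        }
    }

    /** Runs parallelCopies fetcher tasks and returns the decompressor kind each one saw. */
    static List<String> runFetchers(FakeCodec codec, int parallelCopies) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelCopies);
        try {
            Callable<String> task = codec::decompressorKind;
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < parallelCopies; i++) {
                futures.add(pool.submit(task));
            }
            List<String> kinds = new ArrayList<>();
            for (Future<String> f : futures) {
                kinds.add(f.get());
            }
            return kinds;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("fetcher task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        FakeCodec codec = new FakeCodec();
        // Warm up on the launching thread, before any fetcher thread exists.
        codec.initNative();
        System.out.println(runFetchers(codec, 30));
    }
}
```

Because the setup completes before any task is submitted, every fetcher 
observes the same initialized state regardless of how many run in parallel.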

> Tez fails on Bzip2 intermediate output format
> ---------------------------------------------
>
>                 Key: TEZ-3104
>                 URL: https://issues.apache.org/jira/browse/TEZ-3104
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3104.1.patch
>
>
> HADOOP_CLASSPATH="$TEZ_CONF_DIR:$TEZ_HOME/*:$TEZ_HOME/lib/*" yarn jar \
>   /home/gs/tez/current/tez-tests-*.jar mrrsleep \
>   -Dmapreduce.reduce.log.level=TRACE -Dtez.task.log.level=TRACE \
>   -Dtez.runtime.compress=true \
>   -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.BZip2Codec \
>   -Dmapred.map.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec \
>   -Dmapred.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec \
>   -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec \
>   -Dmapreduce.reduce.shuffle.parallelcopies=30 -m 100 -ir 10 -r 100
> {noformat}
> 2016-02-09 02:31:36,605 [ERROR] [ShuffleAndMergeRunner {map}] |orderedgrouped.Shuffle|: map: ShuffleRunner failed with error
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher {map} #16
>       at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:360)
>       at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:337)
>       at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>       at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.UnsupportedOperationException
>       at org.apache.hadoop.io.compress.bzip2.BZip2DummyDecompressor.decompress(BZip2DummyDecompressor.java:32)
>       at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:91)
>       at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
>       at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
>       at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readToMemory(IFile.java:626)
>       at org.apache.tez.runtime.library.common.shuffle.ShuffleUtils.shuffleToMemory(ShuffleUtils.java:113)
>       at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyMapOutput(FetcherOrderedGrouped.java:502)
>       at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:279)
>       at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:169)
>       at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:184)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
