[ 
https://issues.apache.org/jira/browse/TEZ-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296880#comment-14296880
 ] 

Jeff Zhang edited comment on TEZ-2000 at 1/29/15 2:07 PM:
----------------------------------------------------------

[~rohini] This patch resolves the "source vertex exists error", but when I run 
the Pig e2e tests, a new exception arises. I'm not sure whether it is an issue 
with my configuration. The exception arises on both tez-0.6 and tez-0.5.3. Could 
you help run it on either tez-0.5.3 or tez-0.6 to verify it? I ran it on 
Union_4.

{code}
vertexId=vertex_1422521498936_0005_1_02, initRequestedTime=1422537088981, 
initedTime=1422537089011, startRequestedTime=1422537089175, 
startedTime=1422537089175, finishTime=1422537125491, timeTaken=36316, 
status=FAILED, diagnostics=Task failed, 
taskId=task_1422521498936_0005_1_02_000000, diagnostics=[TaskAttempt 0 failed, 
info=[Error: 
exceptionThrown=org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher [scope_6] #2
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:346)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:327)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:391)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.copyFailed(ShuffleScheduler.java:306)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:350)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:245)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:167)
, errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher [scope_6] #2
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:346)
        at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:327)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
{code}


> Source vertex exists error during DAG submission
> ------------------------------------------------
>
>                 Key: TEZ-2000
>                 URL: https://issues.apache.org/jira/browse/TEZ-2000
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Jeff Zhang
>         Attachments: TEZ-2000-1.patch
>
>
>  Pig e2e tests - Cross_5, Union_4 and Union_9 fail. This is due to some JIRA 
> that went in after TEZ-1931, but from the titles I cannot easily identify 
> which one could cause this failure. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
