[ https://issues.apache.org/jira/browse/TEZ-2000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296880#comment-14296880 ]
Jeff Zhang edited comment on TEZ-2000 at 1/29/15 2:07 PM:
----------------------------------------------------------
[~rohini] This patch resolves the "source vertex exists error", but when I run
the Pig e2e tests, a new exception arises. I'm not sure whether it is a
configuration issue on my side. The exception arises on both tez-0.6 and
tez-0.5.3. Could you help run it on either tez-0.5.3 or tez-0.6 to verify it?
I ran it on Union_4.
{code}
vertexId=vertex_1422521498936_0005_1_02, initRequestedTime=1422537088981,
initedTime=1422537089011, startRequestedTime=1422537089175,
startedTime=1422537089175, finishTime=1422537125491, timeTaken=36316,
status=FAILED, diagnostics=Task failed,
taskId=task_1422521498936_0005_1_02_000000, diagnostics=[TaskAttempt 0 failed,
info=[Error: exceptionThrown=org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher [scope_6] #2
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:346)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:327)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:391)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.copyFailed(ShuffleScheduler.java:306)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:350)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:245)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:167)
, errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher [scope_6] #2
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:346)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:327)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
{code}
> Source vertex exists error during DAG submission
> ------------------------------------------------
>
> Key: TEZ-2000
> URL: https://issues.apache.org/jira/browse/TEZ-2000
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Rohini Palaniswamy
> Assignee: Jeff Zhang
> Attachments: TEZ-2000-1.patch
>
>
> Pig e2e tests - Cross_5, Union_4 and Union_9 fail. This is due to some jira
> that went in after TEZ-1931, but from the titles I cannot easily associate
> one that could cause this failure.