[
https://issues.apache.org/jira/browse/TEZ-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jeff Zhang updated TEZ-1587:
----------------------------
Description:
*JoinExample runs indefinitely and never finishes*
{code}
19:13:58,703 - Thread(Fetcher [hashSide] #1) - (HttpConnection.java:273) - Closing connection on fetcher [hashSide] 114
19:13:58,703 - Thread(ShuffleRunner [hashSide]) - (ShuffleManager.java:270) - Scheduling fetch for inputHost: jzhangMBPr.local:0
19:13:58,704 - Thread(ShuffleRunner [hashSide]) - (ShuffleManager.java:333) - Created Fetcher for host: jzhangMBPr.local, with inputs: []
19:14:03,599 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG: State: RUNNING Progress: 0% TotalTasks: 6 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
19:14:03,601 - Thread( main) - (DAGClientRPCImpl.java:444) - VertexStatus: VertexName: hashSide Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
19:14:03,602 - Thread( main) - (DAGClientRPCImpl.java:444) - VertexStatus: VertexName: streamingSide Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
19:14:03,604 - Thread( main) - (DAGClientRPCImpl.java:444) - VertexStatus: VertexName: joiner Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
19:14:08,629 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG: State: RUNNING Progress: 0% TotalTasks: 6 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
19:14:08,631 - Thread( main) - (DAGClientRPCImpl.java:444) - VertexStatus: VertexName: hashSide Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
19:14:08,632 - Thread( main) - (DAGClientRPCImpl.java:444) - VertexStatus: VertexName: streamingSide Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 0 Failed: 0 Killed: 0
19:14:08,633 - Thread( main) - (DAGClientRPCImpl.java:444) - VertexStatus: VertexName: joiner Progress: 0% TotalTasks: 2 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
19:14:13,658 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG: State: RUNNING Progress: 0% TotalTasks: 6 Succeeded: 0 Running: 1 Failed: 0 Killed: 0
{code}
*WordCount and OrderedWordCount fail due to the following exception*
{code}
19:16:47,499 - Thread( main) - (DAGClientRPCImpl.java:444) - DAG completed. FinalState=FAILED
WordCount failed with diagnostics: [Vertex re-running, vertexName=Tokenizer, vertexId=vertex_1410779802886_0001_1_00, Vertex failed, vertexName=Summation, vertexId=vertex_1410779802886_0001_1_01, diagnostics=[Task failed, taskId=task_1410779802886_0001_1_01_000000, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$ShuffleError: error in shuffle in fetcher [Tokenizer] #1
at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:335)
at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:375)
at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleScheduler.copyFailed(ShuffleScheduler.java:292)
at org.apache.tez.runtime.library.common.shuffle.impl.Fetcher.copyFromHost(Fetcher.java:274)
at org.apache.tez.runtime.library.common.shuffle.impl.Fetcher.run(Fetcher.java:160)
, Container container_1410779802886_0001_00_000002 finished with diagnostics set to [TaskExecutionFailure: error in shuffle in fetcher [Tokenizer] #1]], TaskAttempt 1 failed, info=[Error: Failure while running task:org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$ShuffleError: error in shuffle in fetcher [Tokenizer] #2
at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:335)
at org.apache.tez.runtime.library.common.shuffle.impl.Shuffle$RunShuffleCallable.call(Shuffle.java:1)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
{code}
> Some tez-examples fail in local mode
> ------------------------------------
>
> Key: TEZ-1587
> URL: https://issues.apache.org/jira/browse/TEZ-1587
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jeff Zhang
>