Jeff Zhang created TEZ-1238:
-------------------------------

             Summary: Display more clear diagnostics info on client side if 
missing jar in LocalResource or Exception happen in Processor
                 Key: TEZ-1238
                 URL: https://issues.apache.org/jira/browse/TEZ-1238
             Project: Apache Tez
          Issue Type: Improvement
    Affects Versions: 0.4.0
            Reporter: Jeff Zhang
            Assignee: Jeff Zhang


I have a Tez job that failed because I didn't add my jar to the local 
resources. On the client side, however, the exception does not make clear 
what went wrong. The real cause is that the Processor class could not be 
loaded, and I had to run the "yarn logs" command to find the real exception 
in the container logs.  
I also have another case where an exception is thrown in my Processor, and the 
message on the client side is still unclear. I think we should pass the real 
exception to the diagnostics and display it on the client side; this would help 
users find out what is wrong with their program.
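
To illustrate the idea, here is a minimal sketch of capturing the real exception as a diagnostics string that could be reported back to the client. It is not Tez code: the class DiagnosticsSketch, the Processor interface, FailingProcessor, and the method names are all hypothetical stand-ins, and how the string would actually reach the DAG diagnostics is left open.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical sketch; none of these names come from the Tez code base.
class DiagnosticsSketch {

    /** Stand-in for a user-supplied processor (not the real Tez interface). */
    public interface Processor {
        void run() throws Exception;
    }

    /** Example processor that always fails, standing in for a bug in user code. */
    public static class FailingProcessor implements Processor {
        @Override
        public void run() {
            throw new IllegalStateException("bad input record");
        }
    }

    /** Renders a throwable, including its cause chain, as text. */
    public static String stackTraceOf(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    /**
     * Loads, instantiates, and runs the named processor.
     * Returns an empty string on success, otherwise a diagnostics message
     * that contains the real exception instead of only an exit code.
     */
    public static String runAndCollectDiagnostics(String processorClassName) {
        try {
            Class<?> clazz = Class.forName(processorClassName);
            Processor p = (Processor) clazz.getDeclaredConstructor().newInstance();
            p.run();
            return "";
        } catch (ClassNotFoundException e) {
            // Case 1: the jar holding the processor is not on the classpath.
            return "Unable to load class: " + processorClassName + "\n" + stackTraceOf(e);
        } catch (Throwable t) {
            // Case 2: the processor was loaded but failed while being created or run.
            return "Processor failed: " + processorClassName + "\n" + stackTraceOf(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndCollectDiagnostics("com.example.MissingProcessor"));
        System.out.println(runAndCollectDiagnostics(FailingProcessor.class.getName()));
    }
}
```

With something along these lines, the client would see the "Unable to load class" message directly, without needing "yarn logs".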


*Exception on client side*
{code}
14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: VertexStatus: VertexName: summer Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 0 Killed: 1
14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: VertexStatus: VertexName: tokenizer Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 1 Killed: 0
14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: DAG completed. FinalState=FAILED
DAG diagnostics:[Vertex failed, vertexName=tokenizer, vertexId=vertex_1403765612557_0004_1_00, diagnostics=[Task failed, taskId=task_1403765612557_0004_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Container container_1403765612557_0004_01_000002 COMPLETED with diagnostics set to [Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 1
{code}

*The real exception in container log:*

{code}
2014-06-26 14:57:02,146 ERROR [main] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[main,5,main] threw an Exception.
org.apache.tez.dag.api.TezUncheckedException: Unable to load class: com.zjffdu.tutorial.tez.WordCount$TokenProcessor
    at org.apache.tez.common.RuntimeUtils.getClazz(RuntimeUtils.java:44)
    at org.apache.tez.common.RuntimeUtils.createClazzInstance(RuntimeUtils.java:66)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.createProcessor(LogicalIOProcessorRuntimeTask.java:533)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.<init>(LogicalIOProcessorRuntimeTask.java:146)
    at org.apache.tez.runtime.task.TezTaskRunner.<init>(TezTaskRunner.java:78)
    at org.apache.tez.runtime.task.TezChild.run(TezChild.java:208)
    at org.apache.tez.runtime.task.TezChild.main(TezChild.java:363)
{code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)