Besides, IMHO, if a user forgets to specify the LocalResource for their Processor, it should cause a TaskAttempt failure rather than a Container failure.
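To make the idea concrete, here is a rough, standalone sketch of the behaviour I have in mind. It is not Tez code: `ProcessorLoadSketch`, `AttemptResult` and `runAttempt` are hypothetical names made up for illustration, and the real change would have to go wherever the task runtime instantiates the Processor. The point is only that a class-loading failure is caught and turned into an attempt-level failure whose diagnostics carry the real exception, instead of escaping from main and killing the container.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class ProcessorLoadSketch {

    /** Hypothetical result of one task attempt; not a Tez type. */
    public static final class AttemptResult {
        public final boolean failed;
        public final String diagnostics;

        AttemptResult(boolean failed, String diagnostics) {
            this.failed = failed;
            this.diagnostics = diagnostics;
        }
    }

    /**
     * Tries to load and instantiate the processor class by name.
     * Any loading problem is reported as a failed attempt with the
     * full stack trace in the diagnostics, rather than being thrown.
     */
    public static AttemptResult runAttempt(String processorClassName) {
        try {
            Class<?> clazz = Class.forName(processorClassName);
            clazz.getDeclaredConstructor().newInstance();
            return new AttemptResult(false, "");
        } catch (ReflectiveOperationException | LinkageError e) {
            String diagnostics = "Unable to load class: " + processorClassName
                    + System.lineSeparator() + stackTraceOf(e);
            return new AttemptResult(true, diagnostics);
        }
    }

    /** Renders a throwable's stack trace as a string for the diagnostics. */
    public static String stackTraceOf(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    public static void main(String[] args) {
        // The class name from my failing job; it is not on the classpath here,
        // which stands in for the jar missing from the local resources.
        AttemptResult r = runAttempt("com.zjffdu.tutorial.tez.WordCount$TokenProcessor");
        System.out.println("attempt failed: " + r.failed);
        System.out.println(r.diagnostics);
    }
}
```

With something along these lines, the client-side diagnostics would name the missing class directly, and the user would not need "yarn logs" to find the cause.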

On Thu, Jun 26, 2014 at 4:43 PM, Jeff Zhang <zjf...@gmail.com> wrote:

> I ran an experiment, changing some code to send the real exception to
> the client side. Let me know whether you think this is worth fixing.
>
>
> On Thu, Jun 26, 2014 at 3:37 PM, Jeff Zhang <zjf...@gmail.com> wrote:
>
>> Hi all,
>>
>> I have a Tez job which failed because I didn't put my jar in the
>> local resources. But on the client side, the exception is not clear
>> enough for me to figure out what went wrong. The real reason is that
>> it couldn't load the Processor class. I had to run the command
>> "yarn logs" to find the real exception in the container logs. I also
>> have another case where an exception is thrown in my Processor, and
>> the message on the client side is still not clear to me. I think we
>> should pass the real exception to the diagnostics and display it on
>> the client side; this should help users find out what's wrong with
>> their program. Let me know your comments, thanks (the logs from the
>> client side and the container follow).
>>
>>
>> *Exception on client side*
>>
>> 14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: VertexStatus: VertexName: summer Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 0 Killed: 1
>> 14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: VertexStatus: VertexName: tokenizer Progress: 0% TotalTasks: 1 Succeeded: 0 Running: 0 Failed: 1 Killed: 0
>> 14/06/26 14:57:15 INFO rpc.DAGClientRPCImpl: DAG completed. FinalState=FAILED
>> DAG diagnostics:[Vertex failed, vertexName=tokenizer, vertexId=vertex_1403765612557_0004_1_00, diagnostics=[Task failed, taskId=task_1403765612557_0004_1_00_000000, diagnostics=[TaskAttempt 0 failed, info=[Container container_1403765612557_0004_01_000002 COMPLETED with diagnostics set to [Exception from container-launch:
>> org.apache.hadoop.util.Shell$ExitCodeException:
>> org.apache.hadoop.util.Shell$ExitCodeException:
>>     at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
>>     at org.apache.hadoop.util.Shell.run(Shell.java:418)
>>     at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
>>     at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
>>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:745)
>>
>> Container exited with a non-zero exit code 1
>>
>> *The real exception in the container log:*
>>
>> 2014-06-26 14:57:02,146 ERROR [main] org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[main,5,main] threw an Exception.
>> org.apache.tez.dag.api.TezUncheckedException: Unable to load class: com.zjffdu.tutorial.tez.WordCount$TokenProcessor
>>     at org.apache.tez.common.RuntimeUtils.getClazz(RuntimeUtils.java:44)
>>     at org.apache.tez.common.RuntimeUtils.createClazzInstance(RuntimeUtils.java:66)
>>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.createProcessor(LogicalIOProcessorRuntimeTask.java:533)
>>     at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.<init>(LogicalIOProcessorRuntimeTask.java:146)
>>     at org.apache.tez.runtime.task.TezTaskRunner.<init>(TezTaskRunner.java:78)
>>     at org.apache.tez.runtime.task.TezChild.run(TezChild.java:208)
>>     at org.apache.tez.runtime.task.TezChild.main(TezChild.java:363)
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>

--
Best Regards

Jeff Zhang