I have a user running Tez 0.6.1 (git hash
6e588d15184dc691df1c0227f40db91d9bc6d7d6), and a few times a month a
submitted job fails with this error:

2015-09-16 01:04:05,069 INFO [IPC Server handler 0 on 50500]
app.DAGAppMaster: Running DAG: PigLatin:user_1.pig-0_scope-0
2015-09-16 01:04:05,617 INFO [IPC Server handler 0 on 50500]
history.HistoryEventHandler:
[HISTORY][DAG:dag_1440165794704_806241_1][Event:DAG_SUBMITTED]:
dagID=dag_1440165794704_806241_1, submitTime=1442365444946
2015-09-16 01:04:05,676 WARN [IPC Server handler 0 on 50500]
ipc.Server: IPC Server handler 0 on 50500, call
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPB.submitDAG
from hostname:55741 Call#585 Retry#0
2015-09-16 01:04:05,679 FATAL [IPC Server handler 0 on 50500]
yarn.YarnUncaughtExceptionHandler: Thread Thread[IPC Server handler 0
on 50500,5,main] threw an Error.  Shutting down now...
2015-09-16 01:04:05,764 INFO [IPC Server handler 0 on 50500]
util.ExitUtil: Exiting with status -1


Normally it looks like this:

2015-09-17 10:03:40,234 INFO [IPC Server handler 0 on 50503]
app.DAGAppMaster: Running DAG: PigLatin:user_1.pig-0_scope-0
2015-09-17 10:03:40,659 INFO [IPC Server handler 0 on 50503]
history.HistoryEventHandler:
[HISTORY][DAG:dag_1440165794704_873289_1][Event:DAG_SUBMITTED]:
dagID=dag_1440165794704_873289_1, submitTime=1442484220062
2015-09-17 10:03:40,694 INFO [IPC Server handler 0 on 50503]
impl.VertexImpl: setting additional outputs for vertex scope-2573
2015-09-17 10:03:40,696 INFO [IPC Server handler 0 on 50503]
impl.DAGImpl: Using DAG Scheduler:
org.apache.tez.dag.app.dag.impl.DAGSchedulerNaturalOrder
2015-09-17 10:03:40,698 INFO [IPC Server handler 0 on 50503]
history.HistoryEventHandler:
[HISTORY][DAG:dag_1440165794704_873289_1][Event:DAG_INITIALIZED]:
dagID=dag_1440165794704_873289_1, initTime=1442484220660
2015-09-17 10:03:40,698 INFO [IPC Server handler 0 on 50503]
impl.DAGImpl: dag_1440165794704_873289_1 transitioned from NEW to
INITED
...

and then it succeeds.

Are there any known issues with exceptions escaping the RPC handler that
have since been fixed and would resolve this?
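
For reference, here is a rough Java sketch of what I mean by an exception
"escaping" the submitDAG handler rather than being wrapped and returned to
the client over RPC. This is not the actual Tez source; the class, method,
and message types are placeholders for illustration only.

import com.google.protobuf.ServiceException;

public class SubmitHandlerSketch {

  // Placeholders standing in for the real SubmitDAGRequestProto /
  // SubmitDAGResponseProto messages.
  static class SubmitRequest {}
  static class SubmitResponse {}

  // If doSubmit() throws an unchecked Throwable (e.g. an Error) and the
  // handler does not catch it, the IPC handler thread dies and
  // YarnUncaughtExceptionHandler shuts the AM down -- the behaviour in
  // the failing log above.
  SubmitResponse submitDagUnprotected(SubmitRequest request) throws ServiceException {
    return doSubmit(request);
  }

  // Wrapping every Throwable in a ServiceException instead marshals the
  // failure back to the caller over RPC and keeps the AM alive.
  SubmitResponse submitDagWrapped(SubmitRequest request) throws ServiceException {
    try {
      return doSubmit(request);
    } catch (Throwable t) {
      throw new ServiceException(t);
    }
  }

  private SubmitResponse doSubmit(SubmitRequest request) {
    // Placeholder for the real DAG submission logic.
    return new SubmitResponse();
  }
}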

Jon
