[ 
https://issues.apache.org/jira/browse/TEZ-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541421#comment-14541421
 ] 

Richard Huang commented on TEZ-494:
-----------------------------------

[~zjffdu]  And full log
2015-05-13 05:59:08,576 INFO [main] app.DAGAppMaster: Created DAGAppMaster for 
application appattempt_1430152270521_27653_000001, versionInfo=[ 
component=tez-dag, version=0.5.3, 
revision=043c2c1511f1134add20e98e69a0fab49702667f, 
SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, 
buildTime=20150430-1655 ]
2015-05-13 05:59:09,061 INFO [main] Configuration.deprecation: fs.default.name 
is deprecated. Instead, use fs.defaultFS
2015-05-13 05:59:09,989 INFO [main] app.DAGAppMaster: Comparing client version 
with AM version, clientVersion=0.5.3, AMVersion=0.5.3
2015-05-13 05:59:10,739 INFO [main] app.DAGAppMaster: Adding session token to 
jobTokenSecretManager for application
2015-05-13 05:59:10,761 INFO [main] event.AsyncDispatcher: Registering class 
org.apache.tez.dag.app.rm.container.AMContainerEventType for class 
org.apache.tez.dag.app.rm.container.AMContainerMap
2015-05-13 05:59:10,764 INFO [main] event.AsyncDispatcher: Registering class 
org.apache.tez.dag.app.rm.node.AMNodeEventType for class 
org.apache.tez.dag.app.rm.node.AMNodeTracker
2015-05-13 05:59:10,767 INFO [main] event.AsyncDispatcher: Registering class 
org.apache.tez.dag.app.dag.event.DAGAppMasterEventType for class 
org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterEventHandler
2015-05-13 05:59:10,768 INFO [main] event.AsyncDispatcher: Registering class 
org.apache.tez.dag.app.dag.event.DAGEventType for class 
org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher
2015-05-13 05:59:10,779 INFO [main] event.AsyncDispatcher: Registering class 
org.apache.tez.dag.app.dag.event.VertexEventType for class 
org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher
2015-05-13 05:59:10,781 INFO [main] event.AsyncDispatcher: Registering class 
org.apache.tez.dag.app.dag.event.TaskEventType for class 
org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher
2015-05-13 05:59:10,783 INFO [main] event.AsyncDispatcher: Registering class 
org.apache.tez.dag.app.dag.event.TaskAttemptEventType for class 
org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher
2015-05-13 05:59:10,793 INFO [main] event.AsyncDispatcher: Registering class 
org.apache.tez.dag.app.rm.AMSchedulerEventType for class 
org.apache.tez.dag.app.rm.TaskSchedulerEventHandler
2015-05-13 05:59:10,797 INFO [main] event.AsyncDispatcher: Registering class 
org.apache.tez.dag.app.rm.NMCommunicatorEventType for class 
org.apache.tez.dag.app.launcher.ContainerLauncherImpl
2015-05-13 05:59:11,116 INFO [main] node.AMNodeTracker: blacklistDisablePercent 
is 33, blacklistingEnabled: true, maxTaskFailuresPerNode: 3
2015-05-13 05:59:11,118 INFO [main] launcher.ContainerLauncherImpl: Upper limit 
on the thread pool size is 500
2015-05-13 05:59:11,118 INFO [main] history.HistoryEventHandler: Initializing 
HistoryEventHandler
2015-05-13 05:59:11,125 INFO [main] impl.SimpleHistoryLoggingService: Log file 
location for SimpleHistoryLoggingService not specified, defaulting to 
containerLogDir=/opt/mapr/hadoop/hadoop-2.4.1/logs/userlogs/application_1430152270521_27653/container_1430152270521_27653_01_000005
2015-05-13 05:59:11,126 INFO [main] impl.SimpleHistoryLoggingService: 
Initializing SimpleHistoryLoggingService, 
logFileLocation=/opt/mapr/hadoop/hadoop-2.4.1/logs/userlogs/application_1430152270521_27653/container_1430152270521_27653_01_000005/history.txt.appattempt_1430152270521_27653_000001,
 maxErrors=10
2015-05-13 05:59:11,127 INFO [main] recovery.RecoveryService: Initializing 
RecoveryService
2015-05-13 05:59:11,129 INFO [main] history.HistoryEventHandler: 
[HISTORY][DAG:N/A][Event:APP_LAUNCHED]: 
applicationId=application_1430152270521_27653, appSubmitTime=1431496719307, 
launchTime=1431496748534
2015-05-13 05:59:11,131 INFO [main] history.HistoryEventHandler: 
[HISTORY][DAG:N/A][Event:AM_LAUNCHED]: 
appAttemptId=appattempt_1430152270521_27653_000001, 
appSubmitTime=1431496719307, launchTime=1431496748534
2015-05-13 05:59:11,251 INFO 
[ServiceThread:org.apache.tez.dag.history.HistoryEventHandler] 
history.HistoryEventHandler: Starting HistoryEventHandler
2015-05-13 05:59:11,253 INFO 
[ServiceThread:org.apache.tez.dag.history.HistoryEventHandler] 
impl.SimpleHistoryLoggingService: Starting SimpleHistoryLoggingService
2015-05-13 05:59:11,279 INFO 
[ServiceThread:org.apache.tez.dag.app.launcher.ContainerLauncherImpl] 
impl.ContainerManagementProtocolProxy: yarn.client.max-nodemanagers-proxies : 
500
2015-05-13 05:59:11,344 INFO [ServiceThread:DAGClientRPCServer] 
ipc.CallQueueManager: Using callQueue class 
java.util.concurrent.LinkedBlockingQueue
2015-05-13 05:59:11,346 INFO 
[ServiceThread:org.apache.tez.dag.app.TaskAttemptListenerImpTezDag] 
ipc.CallQueueManager: Using callQueue class 
java.util.concurrent.LinkedBlockingQueue
2015-05-13 05:59:11,449 INFO [Socket Reader #1 for port 32794] ipc.Server: 
Starting Socket Reader #1 for port 32794
2015-05-13 05:59:11,468 INFO [Socket Reader #1 for port 49701] ipc.Server: 
Starting Socket Reader #1 for port 49701
2015-05-13 05:59:11,573 INFO [IPC Server listener on 32794] ipc.Server: IPC 
Server listener on 32794: starting
2015-05-13 05:59:11,574 INFO [IPC Server Responder] ipc.Server: IPC Server 
Responder: starting
2015-05-13 05:59:11,620 INFO [ServiceThread:DAGClientRPCServer] 
client.DAGClientServer: Instantiated DAGClientRPCServer at 
IRVHadoop01b.uswest.bluecava.net/10.52.2.101:32794
2015-05-13 05:59:11,650 INFO [IPC Server Responder] ipc.Server: IPC Server 
Responder: starting
2015-05-13 05:59:11,687 INFO 
[ServiceThread:org.apache.tez.dag.history.HistoryEventHandler] 
recovery.RecoveryService: Starting RecoveryService
2015-05-13 05:59:11,688 INFO [IPC Server listener on 49701] ipc.Server: IPC 
Server listener on 49701: starting
2015-05-13 05:59:11,724 INFO [HistoryEventHandlingThread] 
impl.SimpleHistoryLoggingService: Writing event APP_LAUNCHED to history file
2015-05-13 05:59:11,912 INFO 
[ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler] 
rm.YarnTaskSchedulerService: TaskScheduler initialized with configuration: 
maxRMHeartbeatInterval: 1000, containerReuseEnabled: true, reuseRackLocal: 
true, reuseNonLocal: false, localitySchedulingDelay: 250, preemptionPercentage: 
10, numHeartbeatsBetweenPreemptions3, idleContainerMinTimeout=5000, 
idleContainerMaxTimeout=10000, sessionMinHeldContainers=0
2015-05-13 05:59:12,059 INFO [HistoryEventHandlingThread] 
impl.SimpleHistoryLoggingService: Writing event AM_LAUNCHED to history file
2015-05-13 05:59:12,060 INFO 
[ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler] 
client.RMProxy: Connecting to ResourceManager at 
IRVHadoopSvc03.uswest.bluecava.net/10.52.0.13:8030
2015-05-13 05:59:12,548 INFO [AMRM Callback Handler Thread] 
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0 
taskAllocations: 0
2015-05-13 05:59:12,564 INFO [AsyncDispatcher event handler] 
node.AMNodeTracker: Num cluster nodes = 20
2015-05-13 05:59:12,591 INFO [main] impl.MetricsConfig: loaded properties from 
hadoop-metrics2.properties
2015-05-13 05:59:12,775 INFO [main] impl.MetricsSystemImpl: Scheduled snapshot 
period at 10 second(s).
2015-05-13 05:59:12,776 INFO [main] impl.MetricsSystemImpl: DAGAppMaster 
metrics system started
2015-05-13 05:59:12,786 INFO [HistoryEventHandlingThread] 
impl.SimpleHistoryLoggingService: Writing event AM_STARTED to history file
2015-05-13 05:59:12,786 INFO [main] history.HistoryEventHandler: 
[HISTORY][DAG:N/A][Event:AM_STARTED]: 
appAttemptId=appattempt_1430152270521_27653_000001, startTime=1431496752784
2015-05-13 05:59:12,787 INFO [main] app.DAGAppMaster: In Session mode. Waiting 
for DAG over RPC
2015-05-13 05:59:13,545 INFO [AMRM Callback Handler Thread] 
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0 
taskAllocations: 0
2015-05-13 05:59:13,899 WARN [Socket Reader #1 for port 32794] ipc.Server: 
Requested data length 77710987 is longer than maximum configured RPC length 
67108864.  RPC came from 10.51.5.41
2015-05-13 05:59:13,901 INFO [Socket Reader #1 for port 32794] ipc.Server: 
Socket Reader #1 for port 32794: readAndProcess from client 10.51.5.41 threw 
exception [java.io.IOException: Requested data length 77710987 is longer than 
maximum configured RPC length 67108864.  RPC came from 10.51.5.41]
java.io.IOException: Requested data length 77710987 is longer than maximum 
configured RPC length 67108864.  RPC came from 10.51.5.41
        at 
org.apache.hadoop.ipc.Server$Connection.checkDataLength(Server.java:1433)
        at 
org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1495)
        at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:752)
        at 
org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:626)
        at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:597)
2015-05-13 05:59:14,547 INFO [AMRM Callback Handler Thread] 
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0 
taskAllocations: 0
2015-05-13 05:59:14,809 INFO [IPC Server handler 0 on 32794] 
client.DAGClientHandler: Received message to shutdown AM
2015-05-13 05:59:14,809 INFO [IPC Server handler 0 on 32794] 
rm.TaskSchedulerEventHandler: TaskScheduler notified that it should unregister 
from RM
2015-05-13 05:59:14,809 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster: 
No current running DAG, shutting down the AM
2015-05-13 05:59:14,809 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster: 
DAGAppMasterShutdownHandler invoked
2015-05-13 05:59:14,810 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster: 
Handling DAGAppMaster shutdown
2015-05-13 05:59:14,817 INFO [AMShutdownThread] app.DAGAppMaster: Sleeping for 
5 seconds before shutting down
2015-05-13 05:59:14,827 INFO [IPC Server handler 0 on 32794] 
client.DAGClientHandler: Received message to shutdown AM
2015-05-13 05:59:14,827 INFO [IPC Server handler 0 on 32794] 
rm.TaskSchedulerEventHandler: TaskScheduler notified that it should unregister 
from RM
2015-05-13 05:59:14,827 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster: 
No current running DAG, shutting down the AM
2015-05-13 05:59:14,838 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster: 
DAGAppMasterShutdownHandler invoked
2015-05-13 05:59:14,838 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster: 
Ignoring multiple shutdown events
2015-05-13 05:59:15,549 INFO [AMRM Callback Handler Thread] 
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0 
taskAllocations: 0
2015-05-13 05:59:16,550 INFO [AMRM Callback Handler Thread] 
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0 
taskAllocations: 0
2015-05-13 05:59:17,552 INFO [AMRM Callback Handler Thread] 
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0 
taskAllocations: 0
2015-05-13 05:59:18,554 INFO [AMRM Callback Handler Thread] 
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0 
taskAllocations: 0
2015-05-13 05:59:19,556 INFO [AMRM Callback Handler Thread] 
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0 
taskAllocations: 0
2015-05-13 05:59:19,818 INFO [AMShutdownThread] app.DAGAppMaster: Calling stop 
for all the services
2015-05-13 05:59:19,819 INFO [AMShutdownThread] history.HistoryEventHandler: 
Stopping HistoryEventHandler
2015-05-13 05:59:19,820 INFO [AMShutdownThread] recovery.RecoveryService: 
Stopping RecoveryService
2015-05-13 05:59:19,820 INFO [AMShutdownThread] 
impl.SimpleHistoryLoggingService: Stopping SimpleHistoryLoggingService, 
eventQueueBacklog=0
2015-05-13 05:59:19,821 INFO [HistoryEventHandlingThread] 
impl.SimpleHistoryLoggingService: EventQueue take interrupted. Returning
2015-05-13 05:59:19,821 INFO [RecoveryEventHandlingThread] 
recovery.RecoveryService: EventQueue take interrupted. Returning
2015-05-13 05:59:19,823 INFO [DelayedContainerManager] 
rm.YarnTaskSchedulerService: AllocatedContainerManager Thread interrupted
2015-05-13 05:59:19,828 INFO [AMShutdownThread] rm.YarnTaskSchedulerService: 
Unregistering application from RM, exitStatus=SUCCEEDED, exitMessage=Session 
stats:submittedDAGs=0, successfulDAGs=0, failedDAGs=0, killedDAGs=0
, trackingURL=
2015-05-13 05:59:19,839 INFO [AMShutdownThread] impl.AMRMClientImpl: Waiting 
for application to be successfully unregistered.
2015-05-13 05:59:19,940 INFO [AMShutdownThread] rm.YarnTaskSchedulerService: 
Successfully unregistered application from RM
2015-05-13 05:59:20,044 INFO [AMRM Callback Handler Thread] 
impl.AMRMClientAsyncImpl: Interrupted while waiting for queue
java.lang.InterruptedException
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
        at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at 
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:275)
2015-05-13 05:59:20,044 INFO [AMShutdownThread] ipc.Server: Stopping server on 
49701
2015-05-13 05:59:20,049 INFO [AMShutdownThread] ipc.Server: Stopping server on 
32794
2015-05-13 05:59:20,049 INFO [IPC Server listener on 49701] ipc.Server: 
Stopping IPC Server listener on 49701
2015-05-13 05:59:20,051 INFO [IPC Server Responder] ipc.Server: Stopping IPC 
Server Responder
2015-05-13 05:59:20,052 INFO [IPC Server listener on 32794] ipc.Server: 
Stopping IPC Server listener on 32794
2015-05-13 05:59:20,052 INFO [IPC Server Responder] ipc.Server: Stopping IPC 
Server Responder
2015-05-13 05:59:20,071 INFO [Thread-1] app.DAGAppMaster: 
DAGAppMasterShutdownHook invoked
2015-05-13 05:59:20,071 INFO [Thread-1] app.DAGAppMaster: The shutdown handler 
is still running, waiting for it to complete
2015-05-13 05:59:20,798 INFO [AMShutdownThread] app.DAGAppMaster: Completed 
deletion of tez scratch data dir, 
path=maprfs:/tmp/temp-427461361/.tez/application_1430152270521_27653
2015-05-13 05:59:20,798 INFO [AMShutdownThread] app.DAGAppMaster: Exiting 
DAGAppMaster..GoodBye!
2015-05-13 05:59:20,798 INFO [Thread-1] app.DAGAppMaster: The shutdown handler 
has completed

> Investigate: Exception while submitting a dag to a TezSession
> -------------------------------------------------------------
>
>                 Key: TEZ-494
>                 URL: https://issues.apache.org/jira/browse/TEZ-494
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Rekha Joshi
>
> {code}
>  destination host is: "local":54446;
>       at org.apache.tez.client.TezSession.submitDAG(TezSession.java:139)
>       at 
> org.apache.tez.mapreduce.examples.TestBroadcast.main(TestBroadcast.java:155)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
>       at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>       at 
> org.apache.tez.mapreduce.examples.ExampleDriver.main(ExampleDriver.java:78)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed 
> on local exception: java.io.EOFException; Host Details : local host is: 
> "local/10.11.2.219"; destination host is: "local":54446;
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
>       at $Proxy12.submitDAG(Unknown Source)
>       at org.apache.tez.client.TezSession.submitDAG(TezSession.java:137)
>       ... 13 more
> Caused by: java.io.IOException: Failed on local exception: 
> java.io.EOFException; Host Details : local host is: "local/10.11.2.219"; 
> destination host is: "local":54446;
>       at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1408)
>       at org.apache.hadoop.ipc.Client.call(Client.java:1357)
>       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>       ... 15 more
> Caused by: java.io.EOFException
>       at java.io.DataInputStream.readInt(DataInputStream.java:375)
>       at 
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1049)
>       at org.apache.hadoop.ipc.Client$Connection.run(Client.java:944)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to