[
https://issues.apache.org/jira/browse/TEZ-494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541421#comment-14541421
]
Richard Huang commented on TEZ-494:
-----------------------------------
[~zjffdu] And full log
2015-05-13 05:59:08,576 INFO [main] app.DAGAppMaster: Created DAGAppMaster for
application appattempt_1430152270521_27653_000001, versionInfo=[
component=tez-dag, version=0.5.3,
revision=043c2c1511f1134add20e98e69a0fab49702667f,
SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git,
buildTime=20150430-1655 ]
2015-05-13 05:59:09,061 INFO [main] Configuration.deprecation: fs.default.name
is deprecated. Instead, use fs.defaultFS
2015-05-13 05:59:09,989 INFO [main] app.DAGAppMaster: Comparing client version
with AM version, clientVersion=0.5.3, AMVersion=0.5.3
2015-05-13 05:59:10,739 INFO [main] app.DAGAppMaster: Adding session token to
jobTokenSecretManager for application
2015-05-13 05:59:10,761 INFO [main] event.AsyncDispatcher: Registering class
org.apache.tez.dag.app.rm.container.AMContainerEventType for class
org.apache.tez.dag.app.rm.container.AMContainerMap
2015-05-13 05:59:10,764 INFO [main] event.AsyncDispatcher: Registering class
org.apache.tez.dag.app.rm.node.AMNodeEventType for class
org.apache.tez.dag.app.rm.node.AMNodeTracker
2015-05-13 05:59:10,767 INFO [main] event.AsyncDispatcher: Registering class
org.apache.tez.dag.app.dag.event.DAGAppMasterEventType for class
org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterEventHandler
2015-05-13 05:59:10,768 INFO [main] event.AsyncDispatcher: Registering class
org.apache.tez.dag.app.dag.event.DAGEventType for class
org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher
2015-05-13 05:59:10,779 INFO [main] event.AsyncDispatcher: Registering class
org.apache.tez.dag.app.dag.event.VertexEventType for class
org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher
2015-05-13 05:59:10,781 INFO [main] event.AsyncDispatcher: Registering class
org.apache.tez.dag.app.dag.event.TaskEventType for class
org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher
2015-05-13 05:59:10,783 INFO [main] event.AsyncDispatcher: Registering class
org.apache.tez.dag.app.dag.event.TaskAttemptEventType for class
org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher
2015-05-13 05:59:10,793 INFO [main] event.AsyncDispatcher: Registering class
org.apache.tez.dag.app.rm.AMSchedulerEventType for class
org.apache.tez.dag.app.rm.TaskSchedulerEventHandler
2015-05-13 05:59:10,797 INFO [main] event.AsyncDispatcher: Registering class
org.apache.tez.dag.app.rm.NMCommunicatorEventType for class
org.apache.tez.dag.app.launcher.ContainerLauncherImpl
2015-05-13 05:59:11,116 INFO [main] node.AMNodeTracker: blacklistDisablePercent
is 33, blacklistingEnabled: true, maxTaskFailuresPerNode: 3
2015-05-13 05:59:11,118 INFO [main] launcher.ContainerLauncherImpl: Upper limit
on the thread pool size is 500
2015-05-13 05:59:11,118 INFO [main] history.HistoryEventHandler: Initializing
HistoryEventHandler
2015-05-13 05:59:11,125 INFO [main] impl.SimpleHistoryLoggingService: Log file
location for SimpleHistoryLoggingService not specified, defaulting to
containerLogDir=/opt/mapr/hadoop/hadoop-2.4.1/logs/userlogs/application_1430152270521_27653/container_1430152270521_27653_01_000005
2015-05-13 05:59:11,126 INFO [main] impl.SimpleHistoryLoggingService:
Initializing SimpleHistoryLoggingService,
logFileLocation=/opt/mapr/hadoop/hadoop-2.4.1/logs/userlogs/application_1430152270521_27653/container_1430152270521_27653_01_000005/history.txt.appattempt_1430152270521_27653_000001,
maxErrors=10
2015-05-13 05:59:11,127 INFO [main] recovery.RecoveryService: Initializing
RecoveryService
2015-05-13 05:59:11,129 INFO [main] history.HistoryEventHandler:
[HISTORY][DAG:N/A][Event:APP_LAUNCHED]:
applicationId=application_1430152270521_27653, appSubmitTime=1431496719307,
launchTime=1431496748534
2015-05-13 05:59:11,131 INFO [main] history.HistoryEventHandler:
[HISTORY][DAG:N/A][Event:AM_LAUNCHED]:
appAttemptId=appattempt_1430152270521_27653_000001,
appSubmitTime=1431496719307, launchTime=1431496748534
2015-05-13 05:59:11,251 INFO
[ServiceThread:org.apache.tez.dag.history.HistoryEventHandler]
history.HistoryEventHandler: Starting HistoryEventHandler
2015-05-13 05:59:11,253 INFO
[ServiceThread:org.apache.tez.dag.history.HistoryEventHandler]
impl.SimpleHistoryLoggingService: Starting SimpleHistoryLoggingService
2015-05-13 05:59:11,279 INFO
[ServiceThread:org.apache.tez.dag.app.launcher.ContainerLauncherImpl]
impl.ContainerManagementProtocolProxy: yarn.client.max-nodemanagers-proxies :
500
2015-05-13 05:59:11,344 INFO [ServiceThread:DAGClientRPCServer]
ipc.CallQueueManager: Using callQueue class
java.util.concurrent.LinkedBlockingQueue
2015-05-13 05:59:11,346 INFO
[ServiceThread:org.apache.tez.dag.app.TaskAttemptListenerImpTezDag]
ipc.CallQueueManager: Using callQueue class
java.util.concurrent.LinkedBlockingQueue
2015-05-13 05:59:11,449 INFO [Socket Reader #1 for port 32794] ipc.Server:
Starting Socket Reader #1 for port 32794
2015-05-13 05:59:11,468 INFO [Socket Reader #1 for port 49701] ipc.Server:
Starting Socket Reader #1 for port 49701
2015-05-13 05:59:11,573 INFO [IPC Server listener on 32794] ipc.Server: IPC
Server listener on 32794: starting
2015-05-13 05:59:11,574 INFO [IPC Server Responder] ipc.Server: IPC Server
Responder: starting
2015-05-13 05:59:11,620 INFO [ServiceThread:DAGClientRPCServer]
client.DAGClientServer: Instantiated DAGClientRPCServer at
IRVHadoop01b.uswest.bluecava.net/10.52.2.101:32794
2015-05-13 05:59:11,650 INFO [IPC Server Responder] ipc.Server: IPC Server
Responder: starting
2015-05-13 05:59:11,687 INFO
[ServiceThread:org.apache.tez.dag.history.HistoryEventHandler]
recovery.RecoveryService: Starting RecoveryService
2015-05-13 05:59:11,688 INFO [IPC Server listener on 49701] ipc.Server: IPC
Server listener on 49701: starting
2015-05-13 05:59:11,724 INFO [HistoryEventHandlingThread]
impl.SimpleHistoryLoggingService: Writing event APP_LAUNCHED to history file
2015-05-13 05:59:11,912 INFO
[ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler]
rm.YarnTaskSchedulerService: TaskScheduler initialized with configuration:
maxRMHeartbeatInterval: 1000, containerReuseEnabled: true, reuseRackLocal:
true, reuseNonLocal: false, localitySchedulingDelay: 250, preemptionPercentage:
10, numHeartbeatsBetweenPreemptions3, idleContainerMinTimeout=5000,
idleContainerMaxTimeout=10000, sessionMinHeldContainers=0
2015-05-13 05:59:12,059 INFO [HistoryEventHandlingThread]
impl.SimpleHistoryLoggingService: Writing event AM_LAUNCHED to history file
2015-05-13 05:59:12,060 INFO
[ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler]
client.RMProxy: Connecting to ResourceManager at
IRVHadoopSvc03.uswest.bluecava.net/10.52.0.13:8030
2015-05-13 05:59:12,548 INFO [AMRM Callback Handler Thread]
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0
taskAllocations: 0
2015-05-13 05:59:12,564 INFO [AsyncDispatcher event handler]
node.AMNodeTracker: Num cluster nodes = 20
2015-05-13 05:59:12,591 INFO [main] impl.MetricsConfig: loaded properties from
hadoop-metrics2.properties
2015-05-13 05:59:12,775 INFO [main] impl.MetricsSystemImpl: Scheduled snapshot
period at 10 second(s).
2015-05-13 05:59:12,776 INFO [main] impl.MetricsSystemImpl: DAGAppMaster
metrics system started
2015-05-13 05:59:12,786 INFO [HistoryEventHandlingThread]
impl.SimpleHistoryLoggingService: Writing event AM_STARTED to history file
2015-05-13 05:59:12,786 INFO [main] history.HistoryEventHandler:
[HISTORY][DAG:N/A][Event:AM_STARTED]:
appAttemptId=appattempt_1430152270521_27653_000001, startTime=1431496752784
2015-05-13 05:59:12,787 INFO [main] app.DAGAppMaster: In Session mode. Waiting
for DAG over RPC
2015-05-13 05:59:13,545 INFO [AMRM Callback Handler Thread]
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0
taskAllocations: 0
2015-05-13 05:59:13,899 WARN [Socket Reader #1 for port 32794] ipc.Server:
Requested data length 77710987 is longer than maximum configured RPC length
67108864. RPC came from 10.51.5.41
2015-05-13 05:59:13,901 INFO [Socket Reader #1 for port 32794] ipc.Server:
Socket Reader #1 for port 32794: readAndProcess from client 10.51.5.41 threw
exception [java.io.IOException: Requested data length 77710987 is longer than
maximum configured RPC length 67108864. RPC came from 10.51.5.41]
java.io.IOException: Requested data length 77710987 is longer than maximum
configured RPC length 67108864. RPC came from 10.51.5.41
at
org.apache.hadoop.ipc.Server$Connection.checkDataLength(Server.java:1433)
at
org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1495)
at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:752)
at
org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:626)
at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:597)
2015-05-13 05:59:14,547 INFO [AMRM Callback Handler Thread]
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0
taskAllocations: 0
2015-05-13 05:59:14,809 INFO [IPC Server handler 0 on 32794]
client.DAGClientHandler: Received message to shutdown AM
2015-05-13 05:59:14,809 INFO [IPC Server handler 0 on 32794]
rm.TaskSchedulerEventHandler: TaskScheduler notified that it should unregister
from RM
2015-05-13 05:59:14,809 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster:
No current running DAG, shutting down the AM
2015-05-13 05:59:14,809 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster:
DAGAppMasterShutdownHandler invoked
2015-05-13 05:59:14,810 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster:
Handling DAGAppMaster shutdown
2015-05-13 05:59:14,817 INFO [AMShutdownThread] app.DAGAppMaster: Sleeping for
5 seconds before shutting down
2015-05-13 05:59:14,827 INFO [IPC Server handler 0 on 32794]
client.DAGClientHandler: Received message to shutdown AM
2015-05-13 05:59:14,827 INFO [IPC Server handler 0 on 32794]
rm.TaskSchedulerEventHandler: TaskScheduler notified that it should unregister
from RM
2015-05-13 05:59:14,827 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster:
No current running DAG, shutting down the AM
2015-05-13 05:59:14,838 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster:
DAGAppMasterShutdownHandler invoked
2015-05-13 05:59:14,838 INFO [IPC Server handler 0 on 32794] app.DAGAppMaster:
Ignoring multiple shutdown events
2015-05-13 05:59:15,549 INFO [AMRM Callback Handler Thread]
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0
taskAllocations: 0
2015-05-13 05:59:16,550 INFO [AMRM Callback Handler Thread]
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0
taskAllocations: 0
2015-05-13 05:59:17,552 INFO [AMRM Callback Handler Thread]
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0
taskAllocations: 0
2015-05-13 05:59:18,554 INFO [AMRM Callback Handler Thread]
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0
taskAllocations: 0
2015-05-13 05:59:19,556 INFO [AMRM Callback Handler Thread]
rm.YarnTaskSchedulerService: App total resource memory: 0 cpu: 0
taskAllocations: 0
2015-05-13 05:59:19,818 INFO [AMShutdownThread] app.DAGAppMaster: Calling stop
for all the services
2015-05-13 05:59:19,819 INFO [AMShutdownThread] history.HistoryEventHandler:
Stopping HistoryEventHandler
2015-05-13 05:59:19,820 INFO [AMShutdownThread] recovery.RecoveryService:
Stopping RecoveryService
2015-05-13 05:59:19,820 INFO [AMShutdownThread]
impl.SimpleHistoryLoggingService: Stopping SimpleHistoryLoggingService,
eventQueueBacklog=0
2015-05-13 05:59:19,821 INFO [HistoryEventHandlingThread]
impl.SimpleHistoryLoggingService: EventQueue take interrupted. Returning
2015-05-13 05:59:19,821 INFO [RecoveryEventHandlingThread]
recovery.RecoveryService: EventQueue take interrupted. Returning
2015-05-13 05:59:19,823 INFO [DelayedContainerManager]
rm.YarnTaskSchedulerService: AllocatedContainerManager Thread interrupted
2015-05-13 05:59:19,828 INFO [AMShutdownThread] rm.YarnTaskSchedulerService:
Unregistering application from RM, exitStatus=SUCCEEDED, exitMessage=Session
stats:submittedDAGs=0, successfulDAGs=0, failedDAGs=0, killedDAGs=0
, trackingURL=
2015-05-13 05:59:19,839 INFO [AMShutdownThread] impl.AMRMClientImpl: Waiting
for application to be successfully unregistered.
2015-05-13 05:59:19,940 INFO [AMShutdownThread] rm.YarnTaskSchedulerService:
Successfully unregistered application from RM
2015-05-13 05:59:20,044 INFO [AMRM Callback Handler Thread]
impl.AMRMClientAsyncImpl: Interrupted while waiting for queue
java.lang.InterruptedException
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:275)
2015-05-13 05:59:20,044 INFO [AMShutdownThread] ipc.Server: Stopping server on
49701
2015-05-13 05:59:20,049 INFO [AMShutdownThread] ipc.Server: Stopping server on
32794
2015-05-13 05:59:20,049 INFO [IPC Server listener on 49701] ipc.Server:
Stopping IPC Server listener on 49701
2015-05-13 05:59:20,051 INFO [IPC Server Responder] ipc.Server: Stopping IPC
Server Responder
2015-05-13 05:59:20,052 INFO [IPC Server listener on 32794] ipc.Server:
Stopping IPC Server listener on 32794
2015-05-13 05:59:20,052 INFO [IPC Server Responder] ipc.Server: Stopping IPC
Server Responder
2015-05-13 05:59:20,071 INFO [Thread-1] app.DAGAppMaster:
DAGAppMasterShutdownHook invoked
2015-05-13 05:59:20,071 INFO [Thread-1] app.DAGAppMaster: The shutdown handler
is still running, waiting for it to complete
2015-05-13 05:59:20,798 INFO [AMShutdownThread] app.DAGAppMaster: Completed
deletion of tez scratch data dir,
path=maprfs:/tmp/temp-427461361/.tez/application_1430152270521_27653
2015-05-13 05:59:20,798 INFO [AMShutdownThread] app.DAGAppMaster: Exiting
DAGAppMaster..GoodBye!
2015-05-13 05:59:20,798 INFO [Thread-1] app.DAGAppMaster: The shutdown handler
has completed
> Investigate: Exception while submitting a dag to a TezSession
> -------------------------------------------------------------
>
> Key: TEZ-494
> URL: https://issues.apache.org/jira/browse/TEZ-494
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Siddharth Seth
> Assignee: Rekha Joshi
>
> {code}
> destination host is: "local":54446;
> at org.apache.tez.client.TezSession.submitDAG(TezSession.java:139)
> at
> org.apache.tez.mapreduce.examples.TestBroadcast.main(TestBroadcast.java:155)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
> at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
> at
> org.apache.tez.mapreduce.examples.ExampleDriver.main(ExampleDriver.java:78)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed
> on local exception: java.io.EOFException; Host Details : local host is:
> "local/10.11.2.219"; destination host is: "local":54446;
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
> at $Proxy12.submitDAG(Unknown Source)
> at org.apache.tez.client.TezSession.submitDAG(TezSession.java:137)
> ... 13 more
> Caused by: java.io.IOException: Failed on local exception:
> java.io.EOFException; Host Details : local host is: "local/10.11.2.219";
> destination host is: "local":54446;
> at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
> at org.apache.hadoop.ipc.Client.call(Client.java:1408)
> at org.apache.hadoop.ipc.Client.call(Client.java:1357)
> at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
> ... 15 more
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at
> org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1049)
> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:944)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)