I checked the master log before and did not find anything wrong. Unfortunately, I
have lost the master log now.

So you think the master log will tell us why the executor is down?
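In case it helps, this is how I have been pulling the kill events out of the worker-side logs. The paths in the comments are assumptions based on the standard standalone layout under SPARK_HOME; the demo below runs against a sample of the worker log I quoted earlier, so adjust the paths for a real installation:

```shell
# Assumed standalone layout (verify on your install):
#   worker log:      $SPARK_HOME/logs/spark-*-org.apache.spark.deploy.worker.Worker-*.out
#   executor stderr: $SPARK_HOME/work/app-20151009201453-0000/0/stderr

# Demo on a sample of the worker log quoted in this thread:
cat > /tmp/worker-sample.log <<'EOF'
2015-10-13 09:58:06,087 INFO worker.Worker - Asked to kill executor app-20151009201453-0000/0
2015-10-13 09:58:06,509 INFO worker.Worker - Executor app-20151009201453-0000/0 finished with state KILLED exitStatus 1
EOF

# Pull out the kill/exit events so their timestamps can be correlated
# with the master log (once I have it again):
grep -E 'Asked to kill|exitStatus' /tmp/worker-sample.log
```

The executor's own stderr under the `work/` directory is usually the first place a non-zero exitStatus is explained.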

Regards,

Ningjun Wang


-----Original Message-----
From: Jean-Baptiste Onofré [mailto:j...@nanthrax.net] 
Sent: Tuesday, October 13, 2015 10:42 AM
To: user@spark.apache.org
Subject: Re: Why is my spark executor terminated?

Hi Ningjun,

Nothing special in the master log ?

Regards
JB

On 10/13/2015 04:34 PM, Wang, Ningjun (LNG-NPV) wrote:
> We use Spark on Windows 2008 R2 servers. We use one Spark context, 
> which creates one Spark executor. We run the Spark master, slave, driver, 
> and executor on a single machine.
>
>  From time to time, we found that the executor Java process was 
> terminated. I cannot figure out why it was terminated. Can anybody help 
> me find out why the executor was terminated?
>
> Here is the Spark slave log. It shows that the worker killed the executor process:
>
> 2015-10-13 09:58:06,087 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
> (Logging.scala:logInfo(59)) - Asked to kill executor
> app-20151009201453-0000/0
>
> But why does it do that?
>
> Here are the detailed logs from the Spark slave:
>
> 2015-10-13 09:58:04,915 WARN
> [sparkWorker-akka.actor.default-dispatcher-16]
> remote.ReliableDeliverySupervisor (Slf4jLogger.scala:apply$mcV$sp(71)) 
> - Association with remote system 
> [akka.tcp://sparkexecu...@qa1-cas01.pcc.lexisnexis.com:61234] has 
> failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>
> 2015-10-13 09:58:05,134 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] actor.LocalActorRef
> (Slf4jLogger.scala:apply$mcV$sp(74)) - Message 
> [akka.remote.EndpointWriter$AckIdleCheckTimer$] from 
> Actor[akka://sparkWorker/system/endpointManager/reliableEndpointWriter
> -akka.tcp%3A%2F%2FsparkExecutor%40QA1-CAS01.pcc.lexisnexis.com%3A61234
> -2/endpointWriter#-175670388]
> to
> Actor[akka://sparkWorker/system/endpointManager/reliableEndpointWriter
> -akka.tcp%3A%2F%2FsparkExecutor%40QA1-CAS01.pcc.lexisnexis.com%3A61234
> -2/endpointWriter#-175670388] was not delivered. [2] dead letters 
> encountered. This logging can be turned off or adjusted with 
> configuration settings 'akka.log-dead-letters' and 
> 'akka.log-dead-letters-during-shutdown'.
>
> 2015-10-13 09:58:05,134 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] actor.LocalActorRef
> (Slf4jLogger.scala:apply$mcV$sp(74)) - Message 
> [akka.remote.transport.AssociationHandle$Disassociated] from 
> Actor[akka://sparkWorker/deadLetters] to 
> Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/ak
> kaProtocol-tcp%3A%2F%2FsparkWorker%4010.196.116.184%3A61236-3#-1210125
> 680] was not delivered. [3] dead letters encountered. This logging can 
> be turned off or adjusted with configuration settings 
> 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>
> 2015-10-13 09:58:05,134 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] actor.LocalActorRef
> (Slf4jLogger.scala:apply$mcV$sp(74)) - Message 
> [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying]
> from Actor[akka://sparkWorker/deadLetters] to 
> Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/ak
> kaProtocol-tcp%3A%2F%2FsparkWorker%4010.196.116.184%3A61236-3#-1210125
> 680] was not delivered. [4] dead letters encountered. This logging can 
> be turned off or adjusted with configuration settings 
> 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>
> 2015-10-13 09:58:06,087 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
> (Logging.scala:logInfo(59)) - Asked to kill executor
> app-20151009201453-0000/0
>
> 2015-10-13 09:58:06,103 INFO  [ExecutorRunner for 
> app-20151009201453-0000/0] worker.ExecutorRunner
> (Logging.scala:logInfo(59)) - Runner thread for executor
> app-20151009201453-0000/0 interrupted
>
> 2015-10-13 09:58:06,118 INFO  [ExecutorRunner for 
> app-20151009201453-0000/0] worker.ExecutorRunner
> (Logging.scala:logInfo(59)) - Killing process!
>
> 2015-10-13 09:58:06,509 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
> (Logging.scala:logInfo(59)) - Executor app-20151009201453-0000/0 
> finished with state KILLED exitStatus 1
>
> 2015-10-13 09:58:06,509 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
> (Logging.scala:logInfo(59)) - Cleaning up local directories for 
> application app-20151009201453-0000
>
> Thanks
>
> Ningjun Wang
>

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org

