Gotcha, so it's expected behavior. Thanks, Aaron.

Ameet
On Fri, Oct 18, 2013 at 12:10 PM, Aaron Davidson <[email protected]> wrote:

> Whenever an Executor ends, it enters one of three states: KILLED, FAILED,
> or LOST (see:
> https://github.com/falaki/incubator-spark/blob/79868fe7246d8e6d57e0a376b2593fabea9a9d83/core/src/main/scala/org/apache/spark/deploy/ExecutorState.scala).
> None of these sounds like "exited cleanly," which I agree is weird, but I
> don't believe this is a regression, as it has been this way for quite some
> time. Of the three, KILLED sounds most reasonable for normal termination.
>
> I've gone ahead and created
> https://spark-project.atlassian.net/browse/SPARK-937 to fix this.
>
>
> On Fri, Oct 18, 2013 at 7:56 AM, Ameet Kini <[email protected]> wrote:
>
>> Jey,
>>
>> I don't see a "close()" method on SparkContext:
>> http://spark.incubator.apache.org/docs/latest/api/core/index.html#org.apache.spark.SparkContext
>>
>> I tried the "stop()" method, but the job is still reported as KILLED.
>> Btw, I don't recall this behavior in 0.7.3; my standalone programs used
>> to shut down cleanly without requiring any further operations on the
>> SparkContext. Also, I notice that none of the examples call stop() or
>> any other closing method on the SparkContext, so I'm not sure what I
>> could be doing differently with the SparkContext that jobs get reported
>> as KILLED even though they run through successfully.
>>
>> Ameet
>>
>>
>> On Thu, Oct 17, 2013 at 5:59 PM, Jey Kottalam <[email protected]> wrote:
>>
>>> You can try calling the "close()" method on your SparkContext, which
>>> should allow for a cleaner shutdown.
>>>
>>> On Thu, Oct 17, 2013 at 2:38 PM, Ameet Kini <[email protected]> wrote:
>>>
>>>> I'm using the Scala 2.10 branch of Spark in standalone mode, and I'm
>>>> seeing the job report itself as KILLED in the UI, with the message
>>>> below in each of the executor logs, even though the job processes
>>>> correctly and returns the correct result. The job is triggered by a
>>>> .count on an RDD, and the count seems right. The only thing I can
>>>> think of is that I'm doing a System.exit(0) at the end of the main
>>>> method. If I remove that call, I don't see the message below, but the
>>>> job hangs and the UI reports it as still running.
>>>>
>>>> 13/10/17 15:31:52 INFO actor.LocalActorRef: Message
>>>> [akka.remote.transport.AssociationHandle$Disassociated] from
>>>> Actor[akka://spark/deadLetters] to
>>>> Actor[akka://spark/system/transports/akkaprotocolmanager.tcp1/akkaProtocol-tcp%3A%2F%2Fspark%40ec2-cdh4u2-dev-master.geoeyeanalytics.ec2%3A47366-1#136073268]
>>>> was not delivered. [1] dead letters encountered. This logging can be
>>>> turned off or adjusted with configuration settings
>>>> 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>>>> 13/10/17 15:31:52 ERROR executor.StandaloneExecutorBackend: Driver
>>>> terminated or disconnected! Shutting down.
>>>> 13/10/17 15:31:52 INFO actor.LocalActorRef: Message
>>>> [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying]
>>>> from Actor[akka://spark/deadLetters] to
>>>> Actor[akka://spark/system/transports/akkaprotocolmanager.tcp1/akkaProtocol-tcp%3A%2F%2Fspark%40ec2-cdh4u2-dev-master.geoeyeanalytics.ec2%3A47366-1#136073268]
>>>> was not delivered. [2] dead letters encountered. This logging can be
>>>> turned off or adjusted with configuration settings
>>>> 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>>>> 13/10/17 15:31:52 INFO actor.LocalActorRef: Message
>>>> [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying]
>>>> from Actor[akka://sparkExecutor/deadLetters] to
>>>> Actor[akka://sparkExecutor/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2Fspark%40ec2-cdh4u2-dev-master.geoeyeanalytics.ec2%3A47366-1#593252773]
>>>> was not delivered. [1] dead letters encountered. This logging can be
>>>> turned off or adjusted with configuration settings
>>>> 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>>>> 13/10/17 15:31:52 ERROR remote.EndpointWriter: AssociationError
>>>> [akka.tcp://sparkExecutor@ec2-cdh4u2-dev-slave1:46566] ->
>>>> [akka.tcp://spark@ec2-cdh4u2-dev-master:47366]: Error [Association
>>>> failed with [akka.tcp://[email protected]:47366]] [
>>>> akka.remote.EndpointAssociationException: Association failed with
>>>> [akka.tcp://spark@ec2-cdh4u2-dev-master:47366]
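For reference, the shutdown pattern discussed above can be sketched as follows. This is a minimal sketch, not code from the thread: the app name, master URL, and `CountJob` object are placeholders, and it assumes Spark is on the classpath. The key point is calling `SparkContext.stop()` before any explicit `System.exit(0)`.

```scala
import org.apache.spark.SparkContext

object CountJob {
  def main(args: Array[String]): Unit = {
    // Placeholder master URL and app name; adjust for your cluster.
    val sc = new SparkContext("spark://master:7077", "CountJob")
    try {
      // The job in the thread was triggered by a .count on an RDD.
      val n = sc.parallelize(1 to 100).count()
      println("count = " + n)
    } finally {
      // Stop the context so the executors and the driver's actor system
      // shut down before the JVM exits; skipping this is what led to the
      // hang (without System.exit) or the dead-letter noise (with it).
      sc.stop()
    }
    // If an explicit exit is still needed, do it only after sc.stop().
    System.exit(0)
  }
}
```

Even with this, per Aaron's reply, the standalone master in this Spark version still records the executors as KILLED on normal termination; that cosmetic issue is what SPARK-937 tracks.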
