[ https://issues.apache.org/jira/browse/SPARK-13039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122166#comment-15122166 ]
Shixiong Zhu commented on SPARK-13039:
--------------------------------------

It may have been killed by Mesos for exceeding the memory limit, for example because of a memory leak in your app or in Streaming. Could you check the Mesos log?

> Spark Streaming with Mesos shutdown without any reason on logs
> --------------------------------------------------------------
>
>                 Key: SPARK-13039
>                 URL: https://issues.apache.org/jira/browse/SPARK-13039
>             Project: Spark
>          Issue Type: Question
>          Components: Streaming
>    Affects Versions: 1.5.1
>            Reporter: Luis Alves
>            Priority: Minor
>
> I have a Spark application running with Mesos that is being killed (this happens every two days). When I look at the logs, this is what I have in the Spark driver:
> {quote}
> 16/01/27 05:24:24 INFO JobScheduler: Starting job streaming job 1453872264000 ms.0 from job set of time 1453872264000 ms
> 16/01/27 05:24:24 INFO JobScheduler: Added jobs for time 1453872264000 ms
> 16/01/27 05:24:24 INFO SparkContext: Starting job: foreachRDD at StreamingApplication.scala:59
> 16/01/27 05:24:24 INFO DAGScheduler: Got job 40085 (foreachRDD at StreamingApplication.scala:59) with 1 output partitions
> 16/01/27 05:24:24 INFO DAGScheduler: Final stage: ResultStage 40085 (foreachRDD at StreamingApplication.scala:59)
> 16/01/27 05:24:24 INFO DAGScheduler: Parents of final stage: List()
> 16/01/27 05:24:24 INFO DAGScheduler: Missing parents: List()
> 16/01/27 05:24:24 INFO DAGScheduler: Submitting ResultStage 40085 (MapPartitionsRDD[80171] at map at StreamingApplication.scala:59), which has no missing parents
> 16/01/27 05:24:24 INFO MemoryStore: ensureFreeSpace(4720) called with curMem=147187, maxMem=560497950
> 16/01/27 05:24:24 INFO MemoryStore: Block broadcast_40085 stored as values in memory (estimated size 4.6 KB, free 534.4 MB)
> Killed
> {quote}
> And this is what I see in the Spark slaves:
> {quote}
> 16/01/27 05:24:20 INFO BlockManager: Removing RDD 80167
> 16/01/27 05:24:20 INFO BlockManager: Removing RDD 80166
> 16/01/27 05:24:20 INFO BlockManager: Removing RDD 80166
> I0127 05:24:24.070618 11142 exec.cpp:381] Executor asked to shutdown
> 16/01/27 05:24:24 ERROR CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
> 16/01/27 05:24:24 ERROR CoarseGrainedExecutorBackend: Driver 10.241.10.13:51810 disassociated! Shutting down.
> 16/01/27 05:24:24 INFO DiskBlockManager: Shutdown hook called
> 16/01/27 05:24:24 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkDriver@10.241.10.13:51810] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
> 16/01/27 05:24:24 INFO ShutdownHookManager: Shutdown hook called
> 16/01/27 05:24:24 INFO ShutdownHookManager: Deleting directory /tmp/spark-f80464b5-1de2-461e-b78b-8ddbd077682a
> {quote}
> As you can see, this doesn't give any information about why the driver was killed.
> The Mesos version I'm using is 0.25.0.
> How can I get more information about why it is being killed?
> Curious fact: I also have a Spark Jobserver cluster running without any problem (same versions).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
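As a follow-up to the suggestion above to check the Mesos log: a minimal sketch of the kind of search one might run on a Mesos agent (slave) host. The log path and the exact wording of the kill message are assumptions and vary by installation; the sample log line below is illustrative, not taken from the reporter's cluster.

```shell
#!/bin/sh
# Typical places to look on a Mesos agent (paths are assumptions; adjust for
# your install):
#   - the agent log, e.g. /var/log/mesos/mesos-slave.INFO
#   - the executor sandbox stderr under the agent's work_dir
#
# For this self-contained sketch, write a sample log containing the kind of
# cgroups memory-limit kill message a Mesos containerizer can emit
# (wording is illustrative):
LOG=/tmp/mesos-agent.sample.log
cat > "$LOG" <<'EOF'
I0127 05:24:24.070618 11142 exec.cpp:381] Executor asked to shutdown
Memory limit exceeded: Requested: 1280MB Maximum Used: 1297MB
EOF

# Search for memory-limit kills around the time the driver died:
grep -i 'memory limit' "$LOG"
```

If such a line does show up, a common next step is giving the executors more headroom, e.g. via Spark's `spark.mesos.executor.memoryOverhead` setting; whether that is the right fix here depends on what the Mesos log actually shows.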