[ https://issues.apache.org/jira/browse/SPARK-11049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-11049.
-------------------------------
    Resolution: Not A Problem

Pending more info

> If a single executor fails to allocate memory, entire job fails
> ---------------------------------------------------------------
>
>                 Key: SPARK-11049
>                 URL: https://issues.apache.org/jira/browse/SPARK-11049
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.4.0
>            Reporter: Brian
>
> To reproduce:
> * Create a Spark cluster using start-master.sh and start-slave.sh (I believe 
> this is the "standalone cluster manager").
> * Leave a process running on some nodes that takes up a significant amount 
> of RAM.
> * Leave other nodes with plenty of free RAM to run Spark.
> * Run a job against this cluster with spark.executor.memory set to all or 
> most of the memory available on each node, as in the sketch below.
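> A minimal sketch of such a run (the host name, memory size, and jar path 
> here are hypothetical; adjust for the actual cluster):
>
>   ./sbin/start-master.sh                          # on the master node
>   ./sbin/start-slave.sh spark://master-host:7077  # on each worker node
>
>   ./bin/spark-submit \
>     --master spark://master-host:7077 \
>     --conf spark.executor.memory=14g \
>     --class org.apache.spark.examples.SparkPi \
>     lib/spark-examples-1.4.0-hadoop2.6.0.jar 100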
> On the node that has insufficient memory, there will of course be an error 
> like:
>
>   Error occurred during initialization of VM
>   Could not reserve enough space for object heap
>   Could not create the Java virtual machine.
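> (This message comes from the JVM itself: the executor process is launched 
> with -Xmx taken from spark.executor.memory and dies at startup if that much 
> heap cannot be reserved. Assuming 14g is more than the loaded node can 
> spare, the same three lines can be reproduced on it directly with:
>
>   java -Xmx14g -version
>
> so the failure happens before the executor ever registers with the driver.)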
> On the driver node, and in the Spark master UI, I see that _all_ executors 
> exit or are killed, and the entire job fails.  It would be better if there 
> were an indication of which individual node is actually at fault.  It would 
> also be better if the cluster manager could fail over to nodes that are 
> still operating properly and have sufficient RAM.


