[ https://issues.apache.org/jira/browse/SPARK-25679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-25679.
----------------------------------
    Resolution: Invalid

Questions are better directed to the mailing list, as pointed out.

> OOM kills observed for Spark Thrift executors with dynamic allocation 
> enabled 
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-25679
>                 URL: https://issues.apache.org/jira/browse/SPARK-25679
>             Project: Spark
>          Issue Type: Question
>          Components: Kubernetes
>    Affects Versions: 2.2.0
>         Environment: Physical lab configuration.
> 8 bare-metal servers, 
> each with 56 cores, 384 GB RAM, RHEL 7.4
> Kernel : 3.10.0-862.9.1.el7.x86_64
> redhat-release-server.x86_64 7.4-18.el7
>  
> Spark Thrift Server configuration:
> driver memory: 10 GB
> driver cores: 4
> executor memory: 35 GB
> executor cores: 8
>  
> Kubernetes info:
> Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", 
> GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", 
> BuildDate:"2018-04-27T09:22:21Z", GoVersion:"go1.9.3", Compiler:"gc", 
> Platform:"linux/amd64"}
> Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2", 
> GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean", 
> BuildDate:"2018-04-27T09:10:24Z", GoVersion:"go1.9.3", Compiler:"gc", 
> Platform:"linux/amd64"}
>            Reporter: neenu
>            Priority: Major
>         Attachments: query_0_correct.sql
>
>
> Spark Thrift Server executors are getting OOM-killed when dynamic 
> allocation is enabled.
> Tried running TPC-DS queries on 1 TB of Parquet (Snappy) data, with 
> executor memory set to 35 GB and 8 cores per executor. The maximum number 
> of executors was set to 100, but only around 30 executors were seen 
> running at a time.
> Since dynamic allocation is enabled and Spark decides the number of 
> executors to spawn, should OOM errors occur at all? Couldn't Spark decide 
> to launch more executors to avoid them?
> Note: there were enough cluster resources available to launch more 
> executors if needed.
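
For reference, a minimal sketch of how the setup described above maps onto
standard Spark configuration keys (the values are taken from the report; the
application name is made up for illustration, and driver memory/cores would
normally be supplied at launch time rather than set programmatically):

    import org.apache.spark.sql.SparkSession

    // Sketch only: mirrors the executor sizing and dynamic-allocation
    // settings described in the report.
    val spark = SparkSession.builder()
      .appName("tpcds-thrift-sketch")                        // made-up name
      .config("spark.dynamicAllocation.enabled", "true")     // Spark scales executor count
      .config("spark.dynamicAllocation.maxExecutors", "100")
      .config("spark.executor.memory", "35g")                // per-executor memory
      .config("spark.executor.cores", "8")
      .getOrCreate()

Note that dynamic allocation only scales the number of executors based on the
task backlog; it does not change per-executor memory, so a single executor can
still exceed its 35 GB limit regardless of how many executors are running.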



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
