[ https://issues.apache.org/jira/browse/SPARK-25679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16642393#comment-16642393 ]
Yuming Wang commented on SPARK-25679:
-------------------------------------

Thanks [~nanndomi]. Could you tell us which queries are getting killed with OOM?

> OOM Killed observed for spark thrift executors with dynamic allocation enabled
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-25679
>                 URL: https://issues.apache.org/jira/browse/SPARK-25679
>             Project: Spark
>          Issue Type: Question
>          Components: Kubernetes
>    Affects Versions: 2.2.0
>         Environment: Physical lab configurations.
> 8 baremetal servers,
> Each 56 Cores, 384GB RAM, RHEL 7.4
> Kernel : 3.10.0-862.9.1.el7.x86_64
> redhat-release-server.x86_64 7.4-18.el7
>
> Spark Thrift server configurations
> driver memory :10GB
> driver core :4
> executor memory :35GB
> executor core :8
>
> Kubernetes info:
> Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2",
> GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean",
> BuildDate:"2018-04-27T09:22:21Z", GoVersion:"go1.9.3", Compiler:"gc",
> Platform:"linux/amd64"}
> Server Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.2",
> GitCommit:"81753b10df112992bf51bbc2c2f85208aad78335", GitTreeState:"clean",
> BuildDate:"2018-04-27T09:10:24Z", GoVersion:"go1.9.3", Compiler:"gc",
> Platform:"linux/amd64"}
>            Reporter: neenu
>            Priority: Major
>
> Spark thrift executors are getting killed with an OOM error when dynamic
> allocation is enabled.
> Tried to run TPCDS queries on 1TB of Parquet (Snappy) data, with the
> executor memory set to 35GB and the executor cores to 8. The max executors
> setting was 100. Saw around 30 executors running at a time.
> Since dynamic allocation is enabled, and Spark decides the number of
> executors being spawned, should there be OOM errors? Couldn't Spark decide
> to launch more executors to avoid them?
> Note: There were enough cluster resources available to launch more
> executors if needed.
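
To put the reported figures side by side, here is a rough sketch in Python that compares what a given number of executors would request with what the 8 servers provide. The server and executor numbers come from the report; the per-executor overhead fraction and the helper name `cluster_demand` are assumptions made only for illustration, and the sketch ignores memory used by the OS, the driver, and other pods.

```python
# Figures copied from the report above.
SERVERS = 8
RAM_PER_SERVER_GB = 384
CORES_PER_SERVER = 56
EXECUTOR_MEMORY_GB = 35
EXECUTOR_CORES = 8

# Assumption (not stated in the report): extra non-heap memory per executor
# pod, expressed as a fraction of the executor memory.
ASSUMED_OVERHEAD_FRACTION = 0.10


def cluster_demand(executors, overhead_fraction=ASSUMED_OVERHEAD_FRACTION):
    """Return (memory_gb, cores) requested by `executors` executors."""
    per_executor_gb = EXECUTOR_MEMORY_GB * (1 + overhead_fraction)
    return executors * per_executor_gb, executors * EXECUTOR_CORES


total_ram_gb = SERVERS * RAM_PER_SERVER_GB  # 3072
total_cores = SERVERS * CORES_PER_SERVER    # 448

# 30 is the observed executor count, 100 is the configured maximum.
for n in (30, 100):
    mem_gb, cores = cluster_demand(n)
    print(f"{n} executors: {mem_gb:.0f} GB of {total_ram_gb} GB, "
          f"{cores} of {total_cores} cores")
```

Under these assumptions, the roughly 30 executors observed fit within the cluster, while the configured maximum of 100 would request more memory and cores than the 8 servers have in total.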