resource allocation spark on yarn
Hi All,

I have Spark on YARN and there are multiple Spark jobs on the cluster. Sometimes some jobs do not get enough resources even when there are plenty of free resources available on the cluster, and even when I use the settings below:

  --num-workers 75 \
  --worker-cores 16

Jobs stick with whatever resources they get when the job starts. Do we need to look at any other configs? Can someone give pointers on this issue?

Thanks
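For reference, these flags belong to the Spark 0.9-era YARN client, so a submission along these lines is implied. A minimal sketch based on the 0.9 running-on-YARN docs; the assembly jar, app jar, main class, and worker memory are placeholder/example values not given in the post:

  SPARK_JAR=<path/to/spark-assembly.jar> \
  ./bin/spark-class org.apache.spark.deploy.yarn.Client \
    --jar <path/to/your-app.jar> \
    --class <YourAppMainClass> \
    --num-workers 75 \
    --worker-memory 4g \
    --worker-cores 16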
Re: resource allocation spark on yarn
Hi,

FYI - there are no Worker JVMs when Spark is launched under YARN. Instead, the YARN NodeManager does what the Worker JVM does in Spark Standalone mode.

For YARN you'll want to look into the following settings:

  --num-executors: controls how many executors will be allocated
  --executor-memory: RAM for each executor
  --executor-cores: CPU cores for each executor

Also look into the following settings for dynamic allocation:

  spark.dynamicAllocation.enabled
  spark.dynamicAllocation.minExecutors
  spark.dynamicAllocation.maxExecutors
  spark.dynamicAllocation.schedulerBacklogTimeout (M)
  spark.dynamicAllocation.sustainedSchedulerBacklogTimeout (N)
  spark.dynamicAllocation.executorIdleTimeout (K)

Link to the dynamic allocation code (with comments on how to use this feature):
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala

On Fri, Dec 12, 2014 at 10:52 AM, gpatcham <gpatc...@gmail.com> wrote:
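To make that concrete, a minimal sketch of a spark-submit invocation using those flags; the app jar, main class, and executor memory are example placeholders, not values from the thread (on newer Spark versions this would be --master yarn with --deploy-mode client):

  ./bin/spark-submit \
    --master yarn-client \
    --num-executors 75 \
    --executor-memory 4g \
    --executor-cores 16 \
    --class <YourAppMainClass> \
    <path/to/your-app.jar>

And the dynamic allocation properties typically go in spark-defaults.conf. A sketch with arbitrary example values; timeouts are in seconds here, and newer Spark versions also accept suffixed durations like 60s:

  spark.dynamicAllocation.enabled                           true
  spark.dynamicAllocation.minExecutors                      5
  spark.dynamicAllocation.maxExecutors                      75
  spark.dynamicAllocation.schedulerBacklogTimeout           1
  spark.dynamicAllocation.sustainedSchedulerBacklogTimeout  5
  spark.dynamicAllocation.executorIdleTimeout               60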
Re: resource allocation spark on yarn
But on Spark 0.9 we don't have these options:

  --num-executors: controls how many executors will be allocated
  --executor-memory: RAM for each executor
  --executor-cores: CPU cores for each executor

On Fri, Dec 12, 2014 at 12:27 PM, Sameer Farooqui <same...@databricks.com> wrote:
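For what it's worth, the rough Spark 0.9 equivalents are the --num-workers / --worker-memory / --worker-cores flags used in the original post (yarn-standalone mode), or environment variables in yarn-client mode. A sketch based on the 0.9-era running-on-YARN docs; the values are examples:

  # yarn-client mode on Spark 0.9: executor count and size come from env vars
  export SPARK_WORKER_INSTANCES=75   # number of "workers" (executors)
  export SPARK_WORKER_MEMORY=4g      # memory per worker
  export SPARK_WORKER_CORES=16       # cores per worker
  MASTER=yarn-client ./bin/spark-shell

Note also that dynamic allocation was only added in Spark 1.2, so on 0.9 an application holds the executors it starts with for its whole lifetime, which matches the "jobs stick with the resources they get" behavior in the original post.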
Re: resource allocation spark on yarn
Hi,

In addition to the options Sameer mentioned, we need to enable the external shuffle service, right?

Thanks,
- Tsuyoshi

On Sat, Dec 13, 2014 at 5:27 AM, Sameer Farooqui <same...@databricks.com> wrote:
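Dynamic allocation does rely on the external shuffle service, so executors can be removed without losing their shuffle files. A sketch of the setup on YARN; the property names and the YarnShuffleService class come from the Spark docs, and the mapreduce_shuffle entry is shown on the assumption it is already configured on the cluster:

In spark-defaults.conf (or via --conf):

  spark.shuffle.service.enabled true

In yarn-site.xml on each NodeManager:

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle,spark_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
    <value>org.apache.spark.network.yarn.YarnShuffleService</value>
  </property>

The spark-<version>-yarn-shuffle.jar also needs to be on the NodeManager classpath, and the NodeManagers restarted afterward.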