resource allocation spark on yarn

2014-12-12 Thread gpatcham
Hi All,

I'm running Spark on YARN with multiple Spark jobs on the cluster.
Sometimes jobs don't get enough resources even though there are free
resources available on the cluster, and even when I use the settings below:

--num-workers 75 \
--worker-cores 16

Jobs stick with whatever resources they were allocated when they started.

Do we need to look at any other configs? Can someone give pointers on this
issue?

Thanks


Re: resource allocation spark on yarn

2014-12-12 Thread Sameer Farooqui
Hi,

FYI - there are no Worker JVMs when Spark is launched under YARN.
Instead, the YARN NodeManager plays the role that the Worker JVM plays in
Spark Standalone mode.

For YARN you'll want to look into the following settings:

--num-executors: controls how many executors will be allocated
--executor-memory: RAM for each executor
--executor-cores: CPU cores for each executor
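
For example, a submission that pins the allocation up front might look
roughly like this (the class name, jar, and sizes are placeholders, so
adjust them to your job):

  spark-submit \
    --master yarn-cluster \
    --num-executors 75 \
    --executor-cores 4 \
    --executor-memory 8g \
    --class com.example.YourApp \
    your-app.jar

Each executor request also has to fit under YARN's per-container limits
(yarn.scheduler.maximum-allocation-mb / -vcores), otherwise YARN will
reject the requests or leave them pending.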

Also, look into the following for Dynamic Allocation:
spark.dynamicAllocation.enabled
spark.dynamicAllocation.minExecutors
spark.dynamicAllocation.maxExecutors
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout (N)
spark.dynamicAllocation.schedulerBacklogTimeout (M)
spark.dynamicAllocation.executorIdleTimeout (K)
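
As a rough sketch (the min/max/timeout numbers are just placeholders),
those can be set per job with --conf, e.g.:

  spark-submit \
    --master yarn-cluster \
    --conf spark.dynamicAllocation.enabled=true \
    --conf spark.dynamicAllocation.minExecutors=5 \
    --conf spark.dynamicAllocation.maxExecutors=75 \
    --conf spark.dynamicAllocation.executorIdleTimeout=60 \
    --conf spark.shuffle.service.enabled=true \
    ...

or put the same properties in spark-defaults.conf. With this on, executors
that sit idle past the idle timeout are released, and more are requested
when tasks back up past the scheduler backlog timeouts, so a job is no
longer stuck with whatever it got at startup. Note that
spark.shuffle.service.enabled is required for dynamic allocation on YARN.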


Link to Dynamic Allocation code (with comments on how to use this feature):
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala




Re: resource allocation spark on yarn

2014-12-12 Thread Giri P
But on Spark 0.9 we don't have these options:

--num-executors: controls how many executors will be allocated
--executor-memory: RAM for each executor
--executor-cores: CPU cores for each executor
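
For what it's worth, the rough mapping between the 0.9-era worker-centric
flags and the newer ones (an approximation; check the docs for your exact
build) is:

  --num-workers    ->  --num-executors
  --worker-memory  ->  --executor-memory
  --worker-cores   ->  --executor-cores

Dynamic allocation (the spark.dynamicAllocation.* settings) only landed in
Spark 1.2, though, so on 0.9 whatever a job requests at submit time stays
fixed for the life of the application.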



Re: resource allocation spark on yarn

2014-12-12 Thread Tsuyoshi OZAWA
Hi,

In addition to the options Sameer mentioned, we need to enable the
external shuffle service, right?
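
For reference, the setup I have in mind (per the Spark 1.2 docs for
running on YARN; worth double-checking against your version) is roughly:

  # spark-defaults.conf, or --conf at submit time
  spark.dynamicAllocation.enabled   true
  spark.shuffle.service.enabled     true

  <!-- yarn-site.xml on every NodeManager -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle,spark_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
    <value>org.apache.spark.network.yarn.YarnShuffleService</value>
  </property>

plus the Spark YARN shuffle jar (spark-<version>-yarn-shuffle.jar in the
distribution) on the NodeManager classpath, and a NodeManager restart.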

Thanks,
- Tsuyoshi
