Dear all,

Currently, I am running a Spark standalone cluster with ~100 nodes.
Multiple users can connect to the cluster via spark-shell or the PySpark shell.
However, I can't find an efficient way to control resources among multiple
users.


I can set "spark.deploy.defaultCores" on the server side to limit the CPUs for
each application, but I cannot limit the memory usage of each application.
A user can set "spark.executor.memory 10g" and "spark.python.worker.memory 10g"
in ./conf/spark-defaults.conf on the client side,
which means users can decide for themselves how many resources they get.
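To illustrate the problem, this is the kind of thing any user can put in their own client-side spark-defaults.conf (values here are just examples):

```
spark.executor.memory        10g
spark.python.worker.memory   10g
```

Nothing on the server side stops them from raising these values.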


How can I control resources on the server side?


Meanwhile, the fair scheduler requires the user to set the pool explicitly. What if
the user doesn't set it?
Can I maintain the user-to-pool mapping on the server side?
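For context, with the fair scheduler I can define pools in conf/fairscheduler.xml, something like this (pool name and values are just illustrative):

```xml
<?xml version="1.0"?>
<allocations>
  <pool name="production">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>4</minShare>
  </pool>
</allocations>
```

But as far as I can tell, an application only lands in a pool if the user opts in, e.g. via sc.setLocalProperty("spark.scheduler.pool", "production"), which is exactly what I'd like to enforce server-side instead.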


Thanks!
