There is a JIRA issue to track adding such functionality to spark-ec2: SPARK-2008 <https://issues.apache.org/jira/browse/SPARK-2008> - Enhance spark-ec2 to be able to add and remove slaves to an existing cluster
On Wed, Jul 23, 2014 at 10:12 AM, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> Hi
>
> Currently this is not supported out of the box, but you can of course
> add/remove workers in a running cluster. A better option would be to use a
> Mesos cluster, where adding/removing nodes is quite simple. But again, I
> believe adding a new worker in the middle of a task won't give you better
> performance.
>
> Thanks
> Best Regards
>
>
> On Wed, Jul 23, 2014 at 6:36 PM, Shubhabrata <mail2shu...@gmail.com> wrote:
>
>> Hello,
>>
>> We plan to use Spark on EC2 for our data science pipeline. We successfully
>> managed to set up the cluster as well as launch and run applications on
>> remote clusters. However, to enhance scalability we would like to
>> implement auto-scaling in EC2 for Spark applications, but I did not find
>> any proper reference on this. For example, when we launch training
>> programs that use Matlab scripts on an EC2 cluster, we do auto-scaling
>> via SQS. Can anyone please suggest what the options are for Spark? This
>> is especially important when we downscale by removing a machine (how
>> graceful can it be if it is in the middle of a task?).
>>
>> Thanks in advance.
>>
>> Shubhabrata
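For reference, on a standalone cluster a worker can be attached to (or
removed from) a running master by hand, which is what "add/remove workers
in a running cluster" above amounts to. A minimal sketch, assuming Spark is
already installed on the new EC2 node and the master runs on its default
port (the master IP is a placeholder):

    # On the new EC2 node: start a worker and register it with the running
    # standalone master. The master picks it up and starts scheduling tasks
    # on it; no restart of the cluster is needed.
    ./bin/spark-class org.apache.spark.deploy.worker.Worker spark://<master-ip>:7077

    # To downscale, stop the worker process on that node. The master marks
    # the worker DEAD after it stops sending heartbeats.
    kill <worker-pid>

Note that removal this way is not graceful: tasks that were running on the
killed worker are re-executed on the remaining nodes rather than migrated,
which matches the caveat above about removing a machine mid-task.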