Hello,

We plan to use Spark on EC2 for our data science pipeline. We have successfully set up a cluster and can launch and run applications on remote clusters. To improve scalability, we would now like to add auto-scaling on EC2 for our Spark applications, but I have not found any solid reference on how to do this. For comparison, when we launch training programs that run Matlab scripts on an EC2 cluster, we handle auto-scaling through SQS (rough sketch below). Can anyone suggest what the options are for Spark? Downscaling is the part I am most concerned about: how gracefully can a machine be removed if it is in the middle of a task?
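For reference, our current SQS-driven scaling for the Matlab jobs is roughly along these lines (a simplified sketch only; the queue URL, Auto Scaling group name, and sizing policy are illustrative, not our real values):

import time
import boto3

sqs = boto3.client("sqs")
autoscaling = boto3.client("autoscaling")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/training-jobs"  # illustrative
ASG_NAME = "matlab-workers"   # illustrative
JOBS_PER_WORKER = 4           # tuning knob, not a hard constraint
MIN_WORKERS, MAX_WORKERS = 1, 20

while True:
    # ApproximateNumberOfMessages = jobs still waiting in the queue
    attrs = sqs.get_queue_attributes(
        QueueUrl=QUEUE_URL,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    backlog = int(attrs["Attributes"]["ApproximateNumberOfMessages"])

    # Simple proportional policy: one worker per JOBS_PER_WORKER queued jobs,
    # clamped to the allowed range, then applied to the Auto Scaling group.
    desired = max(MIN_WORKERS, min(MAX_WORKERS, -(-backlog // JOBS_PER_WORKER)))
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=ASG_NAME,
        DesiredCapacity=desired,
        HonorCooldown=True,
    )
    time.sleep(60)

This works for the Matlab case because each job is an independent queue message, so a terminated worker just leaves its message to be re-consumed. With Spark, executors hold partitions and shuffle data, which is why I am unsure how to downscale cleanly.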
Thanks in advance,
Shubhabrata