Hi all,

Has anyone tried autoscaling a Spark-on-YARN cluster on a public cloud (e.g. EC2) based on workload? To be clear, I'm interested in scaling the cluster itself up and down by adding and removing YARN nodes based on cluster resource utilization (e.g. number of applications queued, amount of resources available), as opposed to scaling the resources assigned to Spark applications, which Spark supports natively through dynamic resource allocation. I've found that Cloudbreak <http://sequenceiq.com/cloudbreak-docs/latest/periscope/#how-it-works> has a similar feature, but it's in "technical preview", and I didn't find much else in my search.
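
To make the idea concrete, here is a rough sketch of the kind of policy I mean, not a tested solution. It assumes the input dict carries a few of the `clusterMetrics` fields returned by the YARN ResourceManager REST endpoint `/ws/v1/cluster/metrics` (`appsPending`, `availableMB`, `totalMB`, `activeNodes`). The function name, the thresholds, and the node limits are hypothetical, and the actual provisioning or decommissioning of nodes is left out entirely.

```python
def desired_node_delta(metrics, min_nodes=2, max_nodes=20,
                       scale_up_pending=1, scale_down_free_ratio=0.5):
    """Return +1 to add a node, -1 to remove one, 0 to leave the cluster alone.

    `metrics` is assumed to hold YARN clusterMetrics fields; all thresholds
    are illustrative defaults, not recommendations.
    """
    nodes = metrics["activeNodes"]
    pending = metrics["appsPending"]
    total_mb = metrics["totalMB"]
    # Fraction of cluster memory that is currently unallocated.
    free_ratio = metrics["availableMB"] / total_mb if total_mb else 0.0

    # Applications are waiting in the queue: grow, unless already at the cap.
    if pending >= scale_up_pending and nodes < max_nodes:
        return 1
    # Nothing is queued and much of the memory is idle: shrink, down to the floor.
    if pending == 0 and free_ratio > scale_down_free_ratio and nodes > min_nodes:
        return -1
    return 0
```

A controller would call something like this periodically and translate the result into cloud API calls, ideally decommissioning nodes gracefully so that running containers are not killed.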
This might be a general YARN question, but I wanted to check whether there's a solution that is popular in the Spark community. Any experience with autoscaling that you can share would be helpful!

Thanks,
Mingyu