I assume you've looked into dynamic allocation. What do you need that isn't provided by dynamic allocation?
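For reference, a minimal sketch of what turning on dynamic allocation looks like from the driver side. The app name, executor bounds, and timeout below are illustrative placeholders, not recommendations, and on Mesos the external shuffle service has to be running on each agent for this to work:

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    // Sketch only: executor bounds and timeouts are placeholder values.
    val conf = new SparkConf()
      .setAppName("sql-server-autoscale-sketch")          // hypothetical application name
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.shuffle.service.enabled", "true")       // external shuffle service is required for dynamic allocation
      .set("spark.dynamicAllocation.minExecutors", "2")
      .set("spark.dynamicAllocation.maxExecutors", "50")
      .set("spark.dynamicAllocation.executorIdleTimeout", "60s")

    val spark = SparkSession.builder.config(conf).getOrCreate()
    // Spark now requests executors when tasks back up and releases idle ones,
    // so the application grows and shrinks with query load.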
On Mon, Feb 27, 2017 at 4:11 AM, David J. Palaitis <[email protected]> wrote:

> by using a combination of Spark's dynamic allocation,
> http://spark.apache.org/docs/latest/job-scheduling.html#configuration-and-setup,
> and a framework scheduler like Cook,
> https://github.com/twosigma/Cook/tree/master/spark, you can achieve the
> desired auto-scaling effect without the overhead of managing
> roles/constraints in Mesos. I'd be happy to discuss this in more detail
> if you decide to give it a try.
>
> On Mon, Feb 27, 2017 at 3:14 AM, Ashish Mehta <[email protected]> wrote:
>
>> Hi,
>>
>> We want to move to auto-scaling of the Spark driver, where more
>> resources are added to those available to the "Spark driver" as demand
>> requires. Demand can increase or decrease with the number of SQL queries
>> issued over the REST server, or the number of queries from multiple
>> users over the Thrift server on Spark (HiveServer2).
>>
>> *Existing approach with a static number of resources:*
>> We have a very large pool of resources, but my "Spark driver" is
>> allocated a limited, static amount of resources. We achieve this as
>> follows:
>>
>> 1. While running the application, I tag machines in Mesos with the name
>> of my application, so that offers are made accordingly.
>> 2. My application runs with a constraint on the tagged machines via the
>> "spark.mesos.constraints" configuration, so that it only accepts offers
>> made by these tagged machines and doesn't eat up all the resources in
>> my very large cluster.
>> 3. The application launches executors on these accepted offers, and they
>> are used for computation as defined by the Spark job, or as and when
>> queries are fired over the HTTP/Thrift server.
>>
>> *Approach for auto-scaling:*
>> Auto-scaling the driver helps us in many ways and lets us use resources
>> more efficiently.
>> To enable auto-scaling, where my Spark application receives more and
>> more resource offers once it has consumed all available resources, the
>> workflow would be as follows:
>>
>> 1. Run a daemon to monitor my app on Mesos.
>> 2. Keep adding/removing machines for the application by
>> tagging/untagging them, based on the resource-usage metrics for my
>> application on Mesos.
>> 3. Scale up/down based on step 2 by tagging and untagging, taking "some
>> buffer" into account.
>>
>> I wanted to know your opinion on the "*approach for auto-scaling*".
>> Is this the right approach to auto-scaling the Spark driver?
>> Also, tagging/untagging machines is something we do to limit/manage the
>> resources in our big cluster.
>>
>> Thanks,
>> Ashish
>>

--
Michael Gummelt
Software Engineer
Mesosphere
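A note on the constraint-based setup described in the quoted thread: the tagging approach boils down to the driver accepting offers only from agents that carry a matching Mesos attribute. A minimal sketch, assuming a hypothetical agent attribute app with value my_app used as the tag:

    import org.apache.spark.SparkConf

    // Sketch only: "app:my_app" is a hypothetical Mesos agent attribute used as the tag.
    // With this constraint set, the driver accepts offers only from agents whose
    // attributes match, so the application cannot spread over the whole cluster.
    val conf = new SparkConf()
      .set("spark.mesos.constraints", "app:my_app")

How the attribute is added to or removed from agents (step 2 of the proposed workflow) happens outside Spark and depends on how the cluster is managed.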

