Hi Tobias,

Thanks for your reply; the mesos-jetty project looks interesting. Let me describe my target app, which should give you an idea of the use case and the scale-up factors I am talking about.

The target app is either a simple standalone Netty-based Java web server or a Jetty web app. It listens on one or more ports and is always exposed to the outside world via a load balancer (currently we use ELB). Metrics are currently collected using the yammer-metrics library and published to Graphite, which lets us monitor very application-specific metrics: memory load, average latency of writes to a database, end-to-end latency, CPU load, request rate, data-structure-specific load factors, etc. These metrics contribute heavily to determining the core cluster size, the scale-up size, and the cool-down period. We currently set up scale-up/down policies based on high/low watermarks of the metrics we collect via Amazon's CloudWatch, and dynamically adjust the size of the cluster.

One very important thing to keep in mind is that some applications require us to maintain a small amount of state. Losing this data is not catastrophic (such as in the case of a node loss), but it is very helpful if we can do a graceful shutdown/restart/scale-down of the instances themselves. Think of this as running through a commit log before actually killing an app. So the life cycle of a node is important.

Finally, deployments: we have a set of scripts that do push-button deployments, i.e. v123 -> v124 -> v125 is generally one button click each. The requirement here is that we can either stand up a new cluster and do a red-black deployment at the load balancer level, or do a rolling deployment. This is probably out of scope for what Mesos wants to get involved with, but having start -> running -> shutdown -> killed life-cycle support would be a killer feature.
I hope this gives you an idea of what I am looking for as far as the requirements of the framework are concerned. As a suggestion, Marathon could also consider supporting something like buildpacks (https://devcenter.heroku.com/articles/buildpacks), along with a simple shell-based interface for determining scale-up requirements. Let me elaborate on that a bit. If we can assume that every app is a tgz file and contains
./install.sh
./startup.sh
./status.sh
./shutdown.sh
./metrics.sh

then we can build a pretty robust interface for deploying apps into containers with Marathon. For example, in the case of a Jetty deployment we would have web-jetty.tgz. The Marathon executor first executes install.sh and checks its exit status; if that is okay, it goes on to execute startup.sh and then monitors the app by periodically executing status.sh. metrics.sh can be used to return key/value pairs of "monitored" metrics that the scheduler can use to determine the number of application instances currently needed. All of these are simple bash scripts, which makes them easy to test locally as well as on a Mesos cluster setup.

All this is pretty fresh out of my head as I write this and look back at our deployment strategies, so I may not have been super clear about some things or oversimplified some critical points. Please let me know what you think. I would love to contribute to Marathon/Mesos, although I am pretty green as far as Scala is concerned. Java is my strong suit.

--
Ankur Chauhan
[email protected]

On Jan 13, 2014, 14:56:42, Tobias Knaup <[email protected]> wrote:

Hey Ankur,

Your question is super timely. I've been working on a demo framework that shows exactly what you're trying to do with Jetty. The code is still a little rough and there are some hardcoded paths etc., but since you asked I just published it: https://github.com/guenter/jetty-mesos

I'm also the main author of Marathon, and auto scaling has been on my mind. The main question is what an interface for reporting load stats would look like. Curious what you think!

On Sun, Jan 12, 2014 at 9:14 PM, Ankur Chauhan <[email protected]> wrote:

> Thanks everyone for all the help.
>
> Marathon does seem like a good framework, but my use case requires the app
> to evaluate its own health and scale up based on internal load stats (SLA
> requirements), and I don't know if Marathon supports that.
> This is the main reason why I am looking at building out my own
> scheduler/executor. I will give it another go with Vinod's comments and
> have a look at the Hadoop scheduler.
>
> Just a recommendation to any Mesos experts out there: it would be super
> helpful if there were a complete mock app with annotated code somewhere.
> Another good addition to the website would be a good FAQ page.
>
> I am still pretty much a n00b as far as Mesos is concerned, so pardon any
> stupid comments/suggestions/questions.
>
> -- Ankur
>
> On Fri, Dec 27, 2013 at 10:16 AM, Abhishek Parolkar <[email protected]> wrote:
>
> > @Ankur,
> >
> > In case Marathon looks like the direction you want to go, I have a small
> > demo here that may help: http://www.youtube.com/watch?v=2YWVGMuMTrg
> >
> > -parolkar
> >
> > On Sat, Dec 28, 2013 at 2:10 AM, Vinod Kone <[email protected]> wrote:
> >
> > > > I can't really find an example that is an end-to-end use case. By
> > > > that I mean, I would like to know how to put the scheduler and the
> > > > executor in the correct places. Right now I have a single jar which
> > > > can be run from the command line: java -jar target/collector.jar,
> > > > and that would take care of everything.
> > >
> > > This collector.jar can act as both scheduler and executor, presumably
> > > based on command line flags? If yes, that's certainly doable. Typically
> > > the scheduler and executor are split into separate jars. This makes it
> > > easy to decouple the upgrade of the scheduler and the executor.
> > >
> > > > My current train of thought is that the webapp jar would stay
> > > > somewhere on an S3 URL and the "CollectorScheduler" would "somehow"
> > > > tell a Mesos slave to run the "CollectorExecutor", which would in
> > > > turn fetch the jar from S3 and run it.
> > >
> > > Yes, you are on the right track.
> > > The Mesos slave can download the jar for you as long as it can be
> > > accessed via http://, https://, ftp://, hdfs://, etc. This is how you
> > > do it:
> > >
> > > When you launch a task from the scheduler via 'launchTasks()' you give
> > > it a 'vector<TaskInfo>' as one of the arguments. Since you are using a
> > > custom executor, you should set 'TaskInfo.ExecutorInfo' (see
> > > mesos.proto) to point to your executor. To specify the S3 URL you
> > > would set 'TaskInfo.ExecutorInfo.CommandInfo.URI.value'. To tell the
> > > slave the command to launch the executor after it downloads it, you
> > > would set 'TaskInfo.ExecutorInfo.CommandInfo.value'.
> > >
> > > You can find some examples here:
> > >
> > > Hadoop scheduler
> > > (https://github.com/mesos/hadoop/blob/master/src/main/java/org/apache/hadoop/mapred/ResourcePolicy.java)
> > >
> > > Example Java scheduler
> > > (https://github.com/apache/mesos/blob/master/src/examples/java/TestFramework.java)
> > >
> > > Hope that helps. Let us know if you have additional questions.
> > >
> > > Vinod
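
[Editor's note: the shell-script app interface Ankur proposes (install.sh, startup.sh, status.sh, metrics.sh, shutdown.sh) can be sketched as a single lifecycle pass. The stub scripts and the key=value metrics format below are assumptions for illustration only; nothing here is an existing Marathon or Mesos API.]

```shell
#!/bin/sh
# One pass through the proposed app lifecycle. In a real deployment the
# scripts would come from the app bundle (e.g. web-jetty.tgz); here we
# create trivial stubs so the flow can be run anywhere.
set -e

APP_DIR=$(mktemp -d)
cd "$APP_DIR"

# Stub scripts standing in for the bundle's contents (assumed names).
printf '#!/bin/sh\nexit 0\n'                 > install.sh
printf '#!/bin/sh\necho started\n'           > startup.sh
printf '#!/bin/sh\nexit 0\n'                 > status.sh
printf '#!/bin/sh\necho request_rate=42\n'   > metrics.sh
printf '#!/bin/sh\necho stopped\n'           > shutdown.sh
chmod +x install.sh startup.sh status.sh metrics.sh shutdown.sh

# Executor flow: install, start, check status, report metrics, shut down.
./install.sh || { echo 'install failed' >&2; exit 1; }
./startup.sh
if ./status.sh; then
    ./metrics.sh    # key=value pairs the scheduler could use for scaling
fi
./shutdown.sh
```

A real executor would run the status/metrics step in a periodic loop rather than once, and forward the key=value pairs to the scheduler instead of printing them.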
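
[Editor's note: Vinod's walkthrough translates to roughly the following Java fragment. Field names follow mesos.proto; the S3 URL, executor id, task id, and resource size are made-up placeholders, and this is an untested sketch against the Mesos Java bindings, not drop-in code.]

```java
import java.util.Collections;

import org.apache.mesos.Protos.CommandInfo;
import org.apache.mesos.Protos.ExecutorID;
import org.apache.mesos.Protos.ExecutorInfo;
import org.apache.mesos.Protos.Offer;
import org.apache.mesos.Protos.Resource;
import org.apache.mesos.Protos.TaskID;
import org.apache.mesos.Protos.TaskInfo;
import org.apache.mesos.Protos.Value;
import org.apache.mesos.SchedulerDriver;

// Build a TaskInfo that points at a custom executor jar on S3, then launch it.
void launchCollector(SchedulerDriver driver, Offer offer) {
    CommandInfo command = CommandInfo.newBuilder()
        // CommandInfo.URI.value: where the slave fetches the executor jar from
        .addUris(CommandInfo.URI.newBuilder()
            .setValue("https://s3.amazonaws.com/my-bucket/collector-executor.jar"))
        // CommandInfo.value: the command the slave runs after the download
        .setValue("java -jar collector-executor.jar")
        .build();

    ExecutorInfo executor = ExecutorInfo.newBuilder()
        .setExecutorId(ExecutorID.newBuilder().setValue("collector-executor"))
        .setCommand(command)
        .build();

    TaskInfo task = TaskInfo.newBuilder()
        .setName("collector")
        .setTaskId(TaskID.newBuilder().setValue("collector-task-1"))
        .setSlaveId(offer.getSlaveId())          // task must run on the offering slave
        .addResources(Resource.newBuilder()      // claim resources from the offer
            .setName("cpus")
            .setType(Value.Type.SCALAR)
            .setScalar(Value.Scalar.newBuilder().setValue(1)))
        .setExecutor(executor)                   // TaskInfo.ExecutorInfo, per mesos.proto
        .build();

    driver.launchTasks(offer.getId(), Collections.singletonList(task));
}
```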

