Hi David,

First, glad to hear that you would like to contribute EC2-related patches. Looking forward to it!

Regarding your question about frameworks: you can absolutely do what you want with a single framework. Below are some notes/suggestions.

--> Mesos always re-offers unused resources to frameworks. So, in your case, you can definitely schedule a 2 CPU job/task at a later point in time.

--> If you want to adjust old jobs/tasks (by this I assume you mean kill some of them?), the framework typically needs to maintain a map of running tasks. You can then issue 'killTask()' calls, so that the tasks get killed and the corresponding resources are re-offered.

--> Note that, for Mesos to re-offer resources after killing a task, the executor running that task needs to send a terminal (TASK_KILLED/TASK_FINISHED/TASK_LOST) status update.

--> Mesos also re-offers resources if an executor is terminated. These are the resources used by the executor and all its constituent tasks.

--> Finally, if you don't want to write an executor (to begin with), Mesos has a built-in Command Executor. This executor just wraps your shell command, runs the command, and exits when the command finishes.

Hope that helps,

On Sat, Apr 20, 2013 at 12:00 PM, David Challoner <[email protected]> wrote:

> Hi, new to Mesos. I set up a test cluster in EC2 (which required some
> tweaks to the provided scripts - I'll try to send those back in) but I'm
> not sure how I should write the framework for what I'd like to achieve.
>
> I'd like to use Mesos to run a dynamically changing list of applications
> like so:
> [
> { job: app,
>   cpu: 3,
>   memory: 20g,
>   instances: 4},
> { job: app2,
>   cpu: 5,
>   memory: 30g,
>   instances: 2}
> ]
>
> So I want the framework to pull the list out of a database/redis/zk and run
> these apps across the cluster in a round-robin fashion until either the
> cluster resources are exhausted or we've satisfied the number of running
> instances. What I'm having trouble grokking is how this would look on a
> framework/executor level.
>
> With a single framework it seems you're likely to fail to use all available
> resources given long-running daemon-like tasks:
> * framework registered
> * node offers 10 CPUs
> * framework accepts the offer and at the time decides to give it 8 CPUs' worth
>   of tasks to run
> * node has 2 CPUs left over
> * at some point the list of apps changes and a new app allocation is
>   needed. Maybe we have a new app that could use those 2 CPUs, or maybe we
>   just need to adjust how many of the old apps are running. If I understand
>   the docs correctly, the nodes won't re-offer because they've already been
>   assigned tasks by the framework that will run forever.
>
> Do I maybe submit a new framework for each app type then? Would that scale
> to large numbers of apps?
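
To make the single-framework approach from the notes above a bit more concrete, here is a rough sketch of the bookkeeping side in plain Python. It deliberately does not import or call the Mesos bindings: `AppSpec`, `Bookkeeper`, `on_offer`, `surplus_tasks` and `on_status_update` are hypothetical names made up for illustration. In a real framework, `on_offer` would roughly correspond to whatever your scheduler does when it receives a resource offer, the IDs returned by `surplus_tasks` are the ones you would pass to 'killTask()', and `on_status_update` is where the terminal status update removes the task from the map.

```python
from dataclasses import dataclass, field

# Terminal states named in the notes above; a task in one of these
# states no longer holds resources.
TERMINAL_STATES = {"TASK_KILLED", "TASK_FINISHED", "TASK_LOST"}


@dataclass
class AppSpec:
    """One entry of the desired-state list (hypothetical shape)."""
    name: str
    cpus: float
    mem_gb: float
    instances: int


@dataclass
class Bookkeeper:
    """Tracks desired apps and the map of running tasks (hypothetical helper)."""
    specs: dict                                   # app name -> AppSpec
    running: dict = field(default_factory=dict)   # task id -> app name
    next_id: int = 0

    def missing(self, name):
        """How many more instances of `name` are wanted."""
        have = sum(1 for app in self.running.values() if app == name)
        return self.specs[name].instances - have

    def on_offer(self, cpus, mem_gb):
        """Pick tasks to launch from one offer, round-robin across apps."""
        launched = []
        progress = True
        while progress:
            progress = False
            for spec in self.specs.values():
                fits = spec.cpus <= cpus and spec.mem_gb <= mem_gb
                if self.missing(spec.name) > 0 and fits:
                    task_id = f"{spec.name}-{self.next_id}"
                    self.next_id += 1
                    self.running[task_id] = spec.name
                    cpus -= spec.cpus
                    mem_gb -= spec.mem_gb
                    launched.append((task_id, spec.name))
                    progress = True
        return launched

    def surplus_tasks(self):
        """Task ids that exceed the desired instance count (candidates to kill)."""
        seen = {}
        to_kill = []
        for task_id, app in sorted(self.running.items()):
            seen[app] = seen.get(app, 0) + 1
            wanted = self.specs[app].instances if app in self.specs else 0
            if seen[app] > wanted:
                to_kill.append(task_id)
        return to_kill

    def on_status_update(self, task_id, state):
        """Forget a task once a terminal status update arrives."""
        if state in TERMINAL_STATES:
            self.running.pop(task_id, None)
```

With the two apps from the quoted list and a 10 CPU offer, this launches one instance of each and leaves 2 CPUs unused; when those 2 CPUs are offered again later, a newly added 2 CPU app can take them. The memory size of the offer is something you would read from the offer itself; the sketch just takes it as a number.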
