Hi, I'm new to Mesos. I set up a test cluster in EC2 (which required some
tweaks to the provided scripts; I'll try to send those back in), but I'm
not sure how I should write the framework for what I'd like to achieve.
I'd like to use Mesos to run a dynamically changing list of applications
like so:
[
{ job: app,
cpu: 3,
memory: 20g,
instances: 4},
{ job: app2,
cpu: 5,
memory: 30g,
instances: 2}
]
So I want the framework to pull the list out of a database/redis/zk and run
these apps across the cluster in a round-robin fashion until either the
cluster resources are exhausted or we've reached the requested number of
running instances. What I'm having trouble grokking is how this would look
at the framework/executor level.
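
To make the idea concrete, here is a minimal sketch of the placement loop I have in mind. It is plain Python with no Mesos API calls, so it only shows the decision a scheduler might make when it receives one offer. The names `App` and `place_round_robin` are made up for illustration, and the 64 GB of memory on the offer is an assumed figure.

```python
from dataclasses import dataclass


@dataclass
class App:
    job: str
    cpu: int
    memory_gb: int
    instances: int  # desired number of running instances


def place_round_robin(apps, running, offer_cpu, offer_mem_gb):
    """Pick tasks for a single offer, cycling through the app list.

    `running` maps job name -> instances already running.
    Returns (jobs to launch, leftover cpus, leftover memory in GB).
    """
    counts = dict(running)  # copy so the caller's dict is not mutated
    launched = []
    progress = True
    while progress:
        progress = False
        for app in apps:
            have = counts.get(app.job, 0)
            if have >= app.instances:
                continue  # this app already has enough instances
            if app.cpu <= offer_cpu and app.memory_gb <= offer_mem_gb:
                offer_cpu -= app.cpu
                offer_mem_gb -= app.memory_gb
                counts[app.job] = have + 1
                launched.append(app.job)
                progress = True
    return launched, offer_cpu, offer_mem_gb


apps = [App("app", 3, 20, 4), App("app2", 5, 30, 2)]
launched, cpu_left, mem_left = place_round_robin(apps, {}, 10, 64)
print(launched, cpu_left, mem_left)
```

With a 10-CPU offer this launches one instance of each app and leaves 2 CPUs unused, which is the situation described next.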
With a single framework it seems you're likely to fail to use all available
resources when the tasks are long-running and daemon-like:
* framework registers
* node offers 10 cpus
* framework accepts the offer and, at that time, decides to give it 8 cpus'
  worth of tasks to run
* node has 2 cpus left over
* at some point the list of apps changes and a new app allocation is
  needed. Maybe we have a new app that could use those 2 cpus, or maybe we
  just need to adjust how many of the old apps are running. If I understand
  the docs correctly, the nodes won't re-offer because they've already been
  assigned tasks by the framework that will run forever.
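
The bookkeeping for the last step could look roughly like the sketch below: compare the desired instance counts against what is running and work out what to start or stop. Again this is plain Python with a hypothetical helper name (`pending_instances`), and `app3` is an invented example of a newly added app. It deliberately says nothing about whether or when the leftover cpus get offered again, which is the part I'm unsure about.

```python
def pending_instances(desired, running):
    """Compare desired vs. running instance counts per job.

    Both arguments map job name -> instance count.
    Returns (to_start, to_stop), each mapping job name -> count.
    """
    to_start = {}
    to_stop = {}
    for job in set(desired) | set(running):
        delta = desired.get(job, 0) - running.get(job, 0)
        if delta > 0:
            to_start[job] = delta   # need more instances of this job
        elif delta < 0:
            to_stop[job] = -delta   # too many instances are running
    return to_start, to_stop


# The app list changed: app2 shrinks to 1 instance and app3 is new.
desired = {"app": 4, "app2": 1, "app3": 2}
running = {"app": 1, "app2": 2}
to_start, to_stop = pending_instances(desired, running)
print(to_start, to_stop)
```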
Should I instead register a separate framework for each app type? Would that
scale to large numbers of apps?