On 13 November 2013 14:51, Khanh-Toan Tran <khanh-toan.t...@cloudwatt.com> wrote:
> Well, I don't know what John means by "modify the over-commit calculation in
> the scheduler", so I cannot comment.

I was talking about this code:
https://github.com/openstack/nova/blob/master/nova/scheduler/filters/core_filter.py#L64

But I am not sure that's what you want.

> The idea of choosing a free host for Hadoop on the fly is rather complicated
> and contains several operations, namely: (1) assuring the host never gets
> past 100% CPU load; (2) identifying a host that already has a Hadoop VM
> running on it, or is already at 100% CPU commitment; (3) releasing the host
> from 100% CPU commitment once the Hadoop VM stops; (4) possibly preventing
> other applications from using the host (to economize the host's resources).
>
> - You'll need (1) because otherwise your Hadoop VM would come short of
>   resources after the host gets overloaded.
> - You'll need (2) because you don't want to restrict a new host while one of
>   your 100% CPU committed hosts still has free resources.
> - You'll need (3) because otherwise your host would be forever restricted,
>   and that is no longer "on the fly".
> - You may need (4) because otherwise it'd be a waste of resources.
>
> The problem with changing CPU overcommit on the fly is that while your Hadoop
> VM is still running, someone else can add another VM to the same host with a
> higher CPU overcommit (e.g. 200%), violating (1) and thus affecting your
> Hadoop VM as well.
> The idea of putting the host in the aggregate can give you (1) and (2). (4)
> is done by AggregateInstanceExtraSpecsFilter. However, it does not give you
> (3), which can be done with pCloud.

Step 1: use flavors so nova can tell the two workloads apart, and configure
them differently.
Step 2: find capacity for your workload given your current cloud usage.

At the moment, most of our solutions involve reserving bits of your cloud
capacity for different workloads, generally using host aggregates.

The issue of claiming back capacity from other workloads is a bit trickier.
The problem is that I don't think you have defined where you get that
capacity back from.

Maybe you want to look at giving some workloads a higher priority over the
constrained CPU resources? But you will probably starve the little people out
at random, which seems bad. Maybe you want to have a concept of "spot
instances" where they can use your "spare capacity" until you need it, and
you can just kill them?

But maybe I am misunderstanding your use case; it's not totally clear to me.

John
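
To make the two steps discussed in this thread a bit more concrete, here is a
minimal, standalone sketch of the idea: a flavor's extra specs are matched
against host aggregate metadata, and the CPU check compares the flavor's vCPUs
against the host's physical cores multiplied by an over-commit ratio. This is
not nova's code. The Host and Flavor classes, the function names, the
"workload" key, and the per-aggregate "cpu_allocation_ratio" metadata key are
all assumptions made for illustration, and 16.0 is only an assumed cloud-wide
ratio.

```python
from dataclasses import dataclass, field

# Assumed cloud-wide CPU over-commit ratio (illustrative value).
DEFAULT_CPU_ALLOCATION_RATIO = 16.0


@dataclass
class Host:
    """Hypothetical stand-in for the scheduler's view of a compute host."""
    name: str
    vcpus_total: int
    vcpus_used: int = 0
    # Metadata the host inherits from its host aggregate (assumed layout).
    aggregate_metadata: dict = field(default_factory=dict)


@dataclass
class Flavor:
    """Hypothetical stand-in for a flavor and its extra specs."""
    name: str
    vcpus: int
    extra_specs: dict = field(default_factory=dict)


def cpu_ratio(host):
    # A reserved aggregate may override the ratio, e.g. "1.0" for Hadoop hosts.
    return float(host.aggregate_metadata.get(
        "cpu_allocation_ratio", DEFAULT_CPU_ALLOCATION_RATIO))


def core_check(host, flavor):
    # Over-commit calculation: physical cores * ratio is the vCPU limit.
    limit = host.vcpus_total * cpu_ratio(host)
    return limit - host.vcpus_used >= flavor.vcpus


def extra_specs_check(host, flavor):
    # Every extra spec on the flavor must match the aggregate metadata.
    # A flavor with no extra specs matches every host in this sketch.
    return all(host.aggregate_metadata.get(key) == value
               for key, value in flavor.extra_specs.items())


def candidate_hosts(hosts, flavor):
    # Step 1 (flavor vs. aggregate) and step 2 (capacity) combined.
    return [h.name for h in hosts
            if extra_specs_check(host=h, flavor=flavor)
            and core_check(host=h, flavor=flavor)]
```

With one host in a "hadoop" aggregate at ratio 1.0 and a flavor carrying the
matching extra spec, candidate_hosts() offers that host only while it still
has uncommitted physical cores. Note that the sketch does nothing to release
the reservation when the Hadoop VM stops, which is point (3) in the quoted
message and the open question in this thread.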