Re: [openstack-dev] [scheduler] [heat] Policy specifics

Zane Bitter Mon, 30 Sep 2013 03:37:09 -0700

On 27/09/13 17:58, Clint Byrum wrote:

Excerpts from Zane Bitter's message of 2013-09-27 06:58:40 -0700:

On 27/09/13 08:58, Mike Spreitzer wrote:

I have begun to draft some specifics about the sorts of policies that
might be added to infrastructure to inform a smart unified placement
engine.  These are cast as an extension to Heat templates.  See
https://wiki.openstack.org/wiki/Heat/PolicyExtension.  Comments solicited.


Mike,
These are not the kinds of specifics that are of any help at all in
figuring out how (or, indeed, whether) to incorporate holistic
scheduling into OpenStack.


I agree that the things in that page are a wet dream of logical deployment
fun. However, I think one can target just a few of the basic ones,
and see a real achievable case forming. I think I grasp Mike's ideas,
so I'll respond to your concerns with what I think. Note that it is
highly likely I've gotten some of this wrong.

Thanks for having a crack at this Clint. However, I think your example
is not apposite, because it doesn't actually require any holistic
scheduling. You can easily do anti-colocation of a bunch of servers just
using scheduler hints to the Nova API (stick one in each zone until you
run out of zones). This just requires Heat to expose the scheduler hints
portion of the Nova API. To my mind this stuff is so basic that it falls
squarely in the category of what you said in a previous thread:

There is
definitely a need for Heat to be able to communicate to the API's any
placement details that can be communicated. However, Heat should not
actually be "scheduling" anything.

But in any event, most of your answers appear to be predicated on this
very simple case, not on a holistic scheduler. I think you are vastly
underestimating the complexity of the problem.

What Mike is proposing is something more sophisticated, whereby you can
solve for the optimal scheduling of resources of different types across
different APIs. There may be a case for including this in Heat, but it
needs to be made, and IMO it needs to be made by answering these kinds
of questions at a similar level of detail to the symmetric dyadic
primitives wiki page.


BTW there is one more question I should add:

- Who will implement and maintain this service/feature, and the
associated changes to existing services?

- What would a holistic scheduling service look like? A standalone
service? Part of heat-engine?


I see it as a preprocessor of sorts for the current infrastructure engine.
It would take the logical expression of the cluster and either turn
it into actual deployment instructions or respond to the user that it
cannot succeed. Ideally it would just extend the same Heat API.

- How will the scheduling service reserve slots for resources in advance
of them being created? How will those reservations be accounted for and
billed?
- In the event that slots are reserved but those reservations are not
taken up, what will happen?


I don't see the word "reserve" in Mike's proposal, and I don't think this
is necessary for the more basic models like Collocation and Anti-Collocation.

Right, but we're not talking about only the basic models. Reservations
are very much needed according to my understanding of the proposal,
because the whole point is to co-ordinate across multiple services in a
way that is impossible to do atomically.

Reservations would of course make the scheduling decisions more likely to
succeed, but it isn't necessary if we do things optimistically. If the
stack create or update fails, we can retry with better parameters.
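
A minimal sketch of the optimistic approach described above, assuming
the caller supplies a hypothetical create_stack callable that raises a
hypothetical PlacementFailed error when a placement cannot be honoured.
Neither name is a Heat or Nova API; they stand in for whatever the real
create/update call and failure signal would be.

```python
class PlacementFailed(Exception):
    """Hypothetical error: the requested placement could not be honoured."""


def create_with_retry(create_stack, candidate_placements):
    """Try each candidate placement in order and return the first success.

    create_stack is a caller-supplied (hypothetical) callable that takes a
    placement dict and either returns a result or raises PlacementFailed.
    """
    failures = []
    for placement in candidate_placements:
        try:
            return create_stack(placement)
        except PlacementFailed as exc:
            # Nothing was reserved in advance, so just move to the next guess.
            failures.append((placement, str(exc)))
    raise PlacementFailed(f"all {len(failures)} candidate placements failed")
```

Because nothing is reserved, each retry races against other tenants, so
this only bounds the number of attempts; it does not guarantee success.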

- Once scheduled, how will resources be created in their proper slots as
part of a Heat template?


In goes a Heat template (sorry for not using HOT.. still learning it. ;)

Resources:
   ServerTemplate:
     Type: Some::Defined::ProviderType
   HAThing1:
     Type: OS::Heat::HACluster
     Properties:
       ClusterSize: 3
       MaxPerAZ: 1
       PlacementStrategy: anti-collocation
       Resources: [ ServerTemplate ]

And if we have at least 2 AZ's available, it feeds to the heat engine:

Resources:
   HAThing1-0:
     Type: Some::Defined::ProviderType
     Parameters:
       availability-zone: zone-A
   HAThing1-1:
     Type: Some::Defined::ProviderType
     Parameters:
       availability-zone: zone-B
   HAThing1-2:
     Type: Some::Defined::ProviderType
     Parameters:
       availability-zone: zone-A

If not, holistic scheduler says back "I don't have enough AZ's to
satisfy MaxPerAZ".

Now, if Nova grows anti-affinity under the covers that it can manage
directly, a later version can just spit out:

Resources:
   HAThing1-0:
     Type: Some::Defined::ProviderType
     Parameters:
       instance-group: 0
       affinity-type: anti
   HAThing1-1:
     Type: Some::Defined::ProviderType
     Parameters:
       instance-group: 1
       affinity-type: anti
   HAThing1-2:
     Type: Some::Defined::ProviderType
     Parameters:
       instance-group: 0
       affinity-type: anti

The point is that the user cares about their servers not being in the
same failure domain, not how that happens.
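
The zone-based expansion step above can be sketched roughly as follows.
This is only an illustration under stated assumptions: expand_cluster is
a hypothetical helper, not part of Heat, and the list of availability
zones is assumed to be known to the caller already. Zones are assigned
round-robin, and the request is rejected up front if the zones cannot
hold the cluster within the MaxPerAZ limit.

```python
from itertools import cycle, islice


def expand_cluster(name, provider_type, cluster_size, max_per_az, zones):
    """Expand one logical HA cluster into per-server resource definitions.

    Hypothetical helper: raises ValueError when the available zones
    cannot hold cluster_size servers without exceeding max_per_az.
    """
    if cluster_size > max_per_az * len(zones):
        raise ValueError("I don't have enough AZ's to satisfy MaxPerAZ")
    resources = {}
    # Round-robin: no zone gets another server until every zone has had one.
    for index, zone in enumerate(islice(cycle(zones), cluster_size)):
        resources[f"{name}-{index}"] = {
            "Type": provider_type,
            "Parameters": {"availability-zone": zone},
        }
    return resources
```

Called with a cluster size of 3, zones zone-A and zone-B, and a limit of
2 per zone, this yields the zone-A, zone-B, zone-A layout; with a limit
of 1 per zone the same request is rejected, since three servers would
then need three zones.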

- What about when the user calls the APIs directly? (i.e. does their own
orchestration - either hand-rolled or using their own standalone Heat.)


This has come up with autoscaling too. "Undefined" - that's not your stack.

Well, when we have the new autoscaling service you'll still be able to
create an autoscaling group using your own standalone Heat engine. If
the provider has a scheduling service, why shouldn't you be able to use
that with your own standalone Heat engine too?

- How and from where will the scheduling service obtain the utilisation
data needed to perform the scheduling? What mechanism will segregate
this information from the end user?


I do think this is a big missing piece. Right now it is spread out
all over the place. Keystone at least has regions, so that could be
incorporated now. I briefly dug through the other API's and don't see
a way to enumerate AZ's or cells. Perhaps it is hiding in extensions?

I don't think this must be segregated from end users. An API for "show
me the placement decisions I can make" seems useful for anybody trying
to automate deployments. Anyway, probably best to keep it decentralized
and just make it so that each service can respond with lists of arguments
to their API that are likely to succeed.

I think you're thinking about the very simplest case still (e.g. list of
AZs - we have that already). To implement a completely general
scheduling service you're going to need data down to the level of e.g.
which machines are overcommitted and by how much. Good luck convincing
public cloud providers to make this available through a user-facing API.
The unintended consequences only _begin_ with pathological user
behaviour, and end somewhere in the realm of lawsuits, financial
reporting and competitive analysis.

As Mike pointed out downthread, the scheduler primarily serves the cloud
provider's interest. That means the raw input data is at best (when
compared to the actual scheduler output) a record of exactly how much
the provider does or does not care about users, and at worst a basis for
users building their own scheduler that serves only their own interest.

So the scheduler service needs some privileged access to the internals
of each service. Heat is unprivileged (it just calls public APIs - you
can run your own locally). How to resolve that mismatch is a key
question if scheduling is to become part of Heat.


cheers,
Zane.

_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
