On Fri, 2014-03-28 at 19:38 +0000, CARVER, PAUL wrote:
> Jay Pipes wrote:
> >I'm proposing getting rid of the host aggregate hack (or maybe evolving
> >it?) as well as the availability zone concept and replacing them with a
> >more flexible generic container object that may be hierarchical in
> >nature.
>
> Is the thing you're proposing to replace them with something that already
> exists or a brand new thing you're proposing should be created?

Either an evolution of the host aggregate concept (possibly renamed) or
a brand new concept.

> We need some sort of construct that allows the tenant to be confident that
> they aren't going to lose multiple VMs simultaneously due to a failure of
> underlying hardware.

Tenants currently assume this is the case if they are using multiple
availability zones, but there is nothing in Nova that actually prevents
multiple availability zones from sharing hardware.

Frankly, this is an SLA thing, and should not be part of the API, IMO.
If a deployer wishes to advertise an SLA that says "this container of
compute resources is a failure domain", then they should be free to make
that SLA and even include it in a description of said generic container
of compute resource, but there should be no *implicit* SLAs.

> The semantics of it need to be easily comprehensible
> to the tenant, otherwise you'll get people thinking they're protected because
> they built a redundant pair of VMs but sheer bad luck results in them losing
> them both at the same time.

Umm, that's possible today. There is an implicit trust right now in the
API that availability zones are independent failure domains. And what I
am telling you is that no such constraint exists in the implementation
of Nova availability zones (exposed via host aggregate).

> We're using availability zone for that currently and it seems to serve the
> purpose in a way that's easy to explain to a tenant.

It may be easy to explain to a tenant -- simply because of its use in
AWS. But that doesn't mean it's something that is real in practice.
You're furthering a false trust if you explain to tenants that an
availability zone is an independent failure domain when it can easily
NOT be an independent failure domain because of the exposure of
availability zones through the host aggregate concept (which themselves
may overlap hardware and therefore spoil the promise of independent
failure domains).
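
To make the overlap problem concrete, here is a minimal sketch of how two zones that share no hosts can still share a piece of hardware. Everything in it is an assumption made for illustration: the zone names, host names, hardware labels, and the helpers `zone_hardware` and `shared_hardware` are hypothetical, and none of this is Nova's data model or API.

```python
from itertools import combinations

# Hypothetical toy data (NOT Nova's data model): zones map to hosts,
# and each host maps to the hardware it depends on.
zone_hosts = {
    "az1": {"compute1", "compute2"},
    "az2": {"compute3", "compute4"},
}
host_hardware = {
    "compute1": {"rack-a", "pdu-1"},
    "compute2": {"rack-a", "pdu-1"},
    "compute3": {"rack-b", "pdu-1"},  # different rack, same power unit as az1
    "compute4": {"rack-b", "pdu-2"},
}


def zone_hardware(zone, zones, hardware):
    """Return every piece of hardware that any host in the zone depends on."""
    result = set()
    for host in zones[zone]:
        result |= hardware[host]
    return result


def shared_hardware(zones, hardware):
    """Map each pair of zones to the hardware they have in common, if any."""
    overlaps = {}
    for a, b in combinations(sorted(zones), 2):
        common = zone_hardware(a, zones, hardware) & zone_hardware(b, zones, hardware)
        if common:
            overlaps[(a, b)] = common
    return overlaps


print(shared_hardware(zone_hosts, host_hardware))
```

In this toy setup the two zones have disjoint hosts and disjoint racks, yet both depend on `pdu-1`, so a single power failure could take out VMs in both. Nothing in the grouping itself rules that out, which is why the independence would have to be an explicit promise from the deployer.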

Thus, we need a different concept than availability zone to expose to
users. Thus, my proposal.

Best,
-jay