[Cc'ing product-wg@ - when replying, first please consider whether cross-posting is appropriate]

Hi all,

Currently the OpenStack HA community is putting a lot of effort into
converging on a single upstream solution for high availability of VMs
and hypervisors[0], and we had a lot of very productive discussions in
Austin on this topic[1].

One of the first areas of focus is the high level user story:

    http://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/ha_vm.html

In particular, there is an open review on which we could use some
advice from the wider community. The review proposes adding four
extra usage scenarios to the existing user story. All of these
scenarios are to some degree related to HA of VMs and hypervisors,
however none of them exclusively - they all have scope extending to
other areas beyond HA. Here's a very brief summary of all four, as
they relate to HA:

1. "Sticky" shared storage zones

   Scenario: all compute hosts have access to exactly one shared
   storage "availability zone" (potentially independent of the normal
   availability zones). For example, there could be multiple NFS
   servers, and every compute host has /var/lib/nova/instances mounted
   to one of them. On first boot, each VM is *implicitly* assigned to
   a zone, depending on which compute host nova-scheduler picks for it
   (so this could be more or less random). Subsequent operations such
   as "nova evacuate" would need to ensure the VM only ever moves to
   other hosts in the same zone.

2. Hypervisor reservation

   The operator wants a mechanism for reserving some compute hosts
   exclusively for use as failover hosts on which to automatically
   resurrect VMs from other failed compute nodes.

3. Host maintenance

   The operator wants a mechanism for flagging hosts as undergoing
   maintenance, so that the HA mechanisms for automatic recovery are
   temporarily disabled during the maintenance window.

4. Event history

   The operator wants a way to retrieve the history of what, when,
   where and how the HA automatic recovery mechanism is performed.
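
To make the constraint in scenario 1 a bit more concrete, here is a
minimal sketch in Python. All of the names in it (Host, storage_zone,
evacuation_candidates) are hypothetical and purely illustrative; none
of them are Nova APIs. It only shows the filtering rule which a
recovery mechanism would need to respect, under the assumption that
each host's storage zone is known.

```python
# Hypothetical sketch only: these names are illustrative, not Nova APIs.
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    storage_zone: str  # the shared storage "zone" this host mounts
    up: bool = True


def evacuation_candidates(failed_host, hosts):
    """Return live hosts which share the failed host's storage zone."""
    return [
        h for h in hosts
        if h.up
        and h.name != failed_host.name
        and h.storage_zone == failed_host.storage_zone
    ]


hosts = [
    Host("compute1", "nfs-a", up=False),  # the failed host
    Host("compute2", "nfs-a"),
    Host("compute3", "nfs-b"),            # different zone: never a target
    Host("compute4", "nfs-a"),
]
print([h.name for h in evacuation_candidates(hosts[0], hosts)])
# prints ['compute2', 'compute4']
```

How the zone of each host would actually be recorded and exposed is
exactly the kind of question the new user story would need to answer.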

And here's the review in question:

    https://review.openstack.org/#/c/318431/

My first instinct was that all of these scenarios are sufficiently
independent, complex, and extend far enough outside HA scope, that
they deserve to live in four separate user stories, rather than
adding them to our existing "HA for VMs" user story. This could also
maximise the chances of converging on a single upstream solution for
each which works both inside and outside HA contexts. (Please read
the review's comments for much more detail on these arguments.)

However, others made the very valid point that since there are
elements of all these stories which are indisputably related to HA
for VMs, we still need the existing user story for HA VMs to cover
them, so that it can provide "the big picture" which will tie
together all the different strands of work it requires.

So we are currently proposing to take the following steps:

  - Propose four new user stories, one for each of the above
    scenarios.

  - Link to the new stories from the "Related User Stories" section
    of the existing HA VMs story.

  - Extend the existing story so that it covers the HA-specific
    aspects of the four cases, leaving any non-HA aspects to be
    covered by the newly linked stories.

Then each story would go through the standard workflow defined by
the PWG:

    https://wiki.openstack.org/wiki/ProductTeam/User_Stories

Does this sound reasonable, or is there a better way?

BTW, whilst this email is primarily asking for advice on the
process, feedback on each story is also welcome, whether it's "good
idea", "you can already do that", or "terrible idea!" ;-) However,
please first read the comments on the above review, as the obvious
points have probably already been covered :-)

Thanks a lot!
Adam

[0] A complete description of the problem area and existing
    solutions was given in this talk:

    https://www.openstack.org/videos/video/high-availability-for-pets-and-hypervisors-state-of-the-nation

[1] https://etherpad.openstack.org/p/newton-instance-ha

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev