Re: [Openstack] instance evacuation from a failed node (rebuild for HA)
From: Ryan Lane rl...@wikimedia.org

>> We have submitted a patch https://review.openstack.org/#/c/11086/ to address https://blueprints.launchpad.net/nova/+spec/rebuild-for-ha that simplifies recovery from a node failure by introducing an API that recreates an instance on *another* host (similar to the existing instance 'rebuild' operation).
>> [...]
>
> If shared storage is available, the only thing that likely needs to happen is for the instance's host to be updated in the database, and a reboot issued for the instance. That would keep everything identical, and would likely be much faster.

That's pretty much what we do in 'manager' -- but what needs to happen in 'driver' is to (re)create the domain in libvirt on the destination host, re-attach volumes, floating IPs, etc. Essentially, everything 'spawn' is doing today, just without creating the new instance file. Of course, we don't re-provision the instance from image in this case.

> - Ryan

Regards,
Alex

___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp
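The split between 'manager' and 'driver' work described in this reply can be sketched roughly as below. This is only an illustration under stated assumptions: `plan_evacuation` and the step strings are hypothetical names invented here, not functions or messages from nova or from the patch under review.

```python
def plan_evacuation(instance_id, target_host, shared_storage):
    """Return ordered (layer, action) steps for recreating an instance.

    Hypothetical illustration only: the layer names follow the
    'manager' / 'driver' split described in the message, but none of
    these strings correspond to real nova function names.
    """
    steps = [
        ("manager", f"point instance {instance_id} at host {target_host} in the DB"),
    ]
    if shared_storage:
        # The instance file is reachable from the target host: reuse it.
        steps.append(("driver", "reuse the existing instance file"))
    else:
        # Ephemeral instance: the disk is rebuilt from the original image.
        steps.append(("driver", "re-provision the disk from the original image"))
    steps.extend([
        ("driver", "(re)create the domain on the target host"),
        ("driver", "re-attach volumes"),
        ("driver", "re-attach floating IPs"),
    ])
    return steps


for layer, action in plan_evacuation("vm-1", "node2", shared_storage=True):
    print(f"{layer}: {action}")
```

The only difference between the two storage cases in this sketch is the disk step; the database update and the domain, volume, and floating IP steps are the same, which matches the point that the instance keeps its identity either way.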
[Openstack] instance evacuation from a failed node (rebuild for HA)
Dear all,

We have submitted a patch https://review.openstack.org/#/c/11086/ to address https://blueprints.launchpad.net/nova/+spec/rebuild-for-ha that simplifies recovery from a node failure by introducing an API that recreates an instance on *another* host (similar to the existing instance 'rebuild' operation).

The exact semantics of this operation vary depending on the configuration of the instance and the underlying storage topology. For example, if it is a regular 'ephemeral' instance, invoking the API will respawn it from the same image on another node while retaining the same identity and configuration (e.g. same ID, flavor, IP, attached volumes, etc.). For instances running off shared storage (i.e. the same instance file is accessible on the target host), the VM will be re-created and pointed at the same instance file while retaining its identity and configuration. More details are available at http://wiki.openstack.org/Evacuate. Note that the API must be manually invoked today.

In addition, this patch modifies nova-compute such that on startup (e.g., after it failed and recovered) it verifies with the DB that it is still the owner of an instance before starting the VM.

It would be great to hear whether people think that such a capability is important to push into Folsom, despite the short runway till F3. Any other thoughts or recommendations regarding such a capability would also be highly appreciated.

Thanks,
Alex

Alex Glikson
Manager, Cloud Operating System Technologies, IBM Haifa Research Lab
http://w3.haifa.ibm.com/dept/stt/cloud_sys.html | https://www.research.ibm.com/haifa/dept/stt/cloud_sys.shtml
Email: glik...@il.ibm.com | Phone: +972-4-8281085 | Mobile: +972-54-647 | Fax: +972-4-8296112
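The startup ownership check mentioned above could look roughly like the following. It is a minimal sketch under assumptions: `InstanceRecord` and `instances_to_start` are hypothetical names standing in for the database row and the check, and the actual patch may structure this quite differently.

```python
from dataclasses import dataclass


@dataclass
class InstanceRecord:
    """Hypothetical stand-in for a row in the instances table."""
    uuid: str
    host: str  # compute host the DB currently considers the owner


def instances_to_start(local_uuids, db_records, my_host):
    """Split locally known instances into (start, skip) lists.

    An instance is started only if the DB still names my_host as its
    owner. Instances that were evacuated to another host while this
    node was down, or that no longer exist in the DB, are skipped.
    """
    by_uuid = {rec.uuid: rec for rec in db_records}
    start, skip = [], []
    for uuid in local_uuids:
        rec = by_uuid.get(uuid)
        if rec is not None and rec.host == my_host:
            start.append(uuid)
        else:
            skip.append(uuid)
    return start, skip


records = [InstanceRecord("a", "node1"), InstanceRecord("b", "node2")]
start, skip = instances_to_start(["a", "b", "c"], records, "node1")
print("start:", start)
print("skip:", skip)
```

Here instance "b" was moved to node2 while node1 was down, so node1 does not start it again on recovery; this is what prevents the same VM from running on two hosts after an evacuation.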
Re: [Openstack] instance evacuation from a failed node (rebuild for HA)
> We have submitted a patch https://review.openstack.org/#/c/11086/ to address https://blueprints.launchpad.net/nova/+spec/rebuild-for-ha that simplifies recovery from a node failure by introducing an API that recreates an instance on *another* host (similar to the existing instance 'rebuild' operation).
>
> The exact semantics of this operation vary depending on the configuration of the instance and the underlying storage topology. For example, if it is a regular 'ephemeral' instance, invoking the API will respawn it from the same image on another node while retaining the same identity and configuration (e.g. same ID, flavor, IP, attached volumes, etc.). For instances running off shared storage (i.e. the same instance file is accessible on the target host), the VM will be re-created and pointed at the same instance file while retaining its identity and configuration. More details are available at http://wiki.openstack.org/Evacuate.

If the instance is on shared storage, what does recreate mean? Delete the old instance and create a new instance, using the same disk image? Does that mean that the new instance will have a new nova/ec2 id? In the case where DNS is being used, this would delete the old DNS entry and create a new DNS entry. This is lossy.

If shared storage is available, the only thing that likely needs to happen is for the instance's host to be updated in the database, and a reboot issued for the instance. That would keep everything identical, and would likely be much faster.

- Ryan