Re: [Openstack] instance evacuation from a failed node (rebuild for HA)

2012-08-11 Thread Alex Glikson
 From: Ryan Lane rl...@wikimedia.org
  We have submitted a patch https://review.openstack.org/#/c/11086/ to address
  https://blueprints.launchpad.net/nova/+spec/rebuild-for-ha that simplifies
  recovery from a node failure by introducing an API that recreates an
  instance on *another* host (similar to the existing instance 'rebuild'
  operation).
[...]
 If shared storage is available, the only thing that likely needs to
 happen is for the instance's host to be updated in the database, and
 a reboot issued for the instance. That would keep everything identical,
 and would likely be much faster.

That's pretty much what we do in 'manager' -- but what needs to happen in 
'driver' is to (re)create the domain in libvirt on the destination host, 
re-attach volumes, floating IPs, etc. Essentially, everything 'spawn' is 
doing today, just without creating the new instance file. Of course, we 
don't re-provision the instance from image in this case.
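
To make the manager/driver split concrete, here is a minimal sketch of the order of operations described above. It is not code from the patch: the function name rebuild_steps and the step labels are hypothetical and only mirror the wording of this message.

```python
from typing import List


def rebuild_steps(on_shared_storage: bool) -> List[str]:
    """Return the ordered steps for recreating an instance on another host.

    Hypothetical illustration only; the labels paraphrase the thread and
    are not Nova method names.
    """
    # 'manager' side: record the new owner of the instance in the database.
    steps = ["manager: update instance host in DB"]

    # Without shared storage the disk has to be re-provisioned from the image.
    # With shared storage the existing instance file is reused, so this
    # step is skipped.
    if not on_shared_storage:
        steps.append("driver: create instance file from image")

    # 'driver' side: everything else that 'spawn' would normally do.
    steps.extend([
        "driver: (re)create libvirt domain on destination host",
        "driver: re-attach volumes",
        "driver: re-attach floating IPs",
        "driver: start instance",
    ])
    return steps


if __name__ == "__main__":
    for step in rebuild_steps(on_shared_storage=True):
        print(step)
```

In the shared-storage case the list simply omits the image step, which matches the point that the instance is not re-provisioned from the image.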

 - Ryan

Regards,
Alex
___
Mailing list: https://launchpad.net/~openstack
Post to : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp


[Openstack] instance evacuation from a failed node (rebuild for HA)

2012-08-10 Thread Alex Glikson
Dear all,

We have submitted a patch https://review.openstack.org/#/c/11086/ to 
address https://blueprints.launchpad.net/nova/+spec/rebuild-for-ha that 
simplifies recovery from a node failure by introducing an API that 
recreates an instance on *another* host (similar to the existing instance 
'rebuild' operation). The exact semantics of this operation vary 
depending on the configuration of the instances and the underlying storage 
topology. For example, if it is a regular 'ephemeral' instance, invoking 
the API will respawn it from the same image on another node while retaining the same 
identity and configuration (e.g. same ID, flavor, IP, attached volumes, 
etc). For instances running off shared storage (i.e. same instance file 
accessible on the target host), the VM will be re-created and point to the 
same instance file while retaining the identity and configuration. More 
details are available at http://wiki.openstack.org/Evacuate. 
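
As an illustration of the two cases above, here is a minimal sketch under stated assumptions. The Instance class, the plan_evacuation function, and the two action labels are hypothetical names chosen for this example. They are not the API introduced by the patch.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Instance:
    """Hypothetical stand-in for an instance record."""
    uuid: str
    host: str
    image_ref: str
    flavor: str
    fixed_ip: str
    volumes: Tuple[str, ...] = ()


def plan_evacuation(instance: Instance, target_host: str,
                    on_shared_storage: bool) -> Tuple[Instance, str]:
    """Return the relocated instance record and how its disk is obtained."""
    if target_host == instance.host:
        raise ValueError("target host must differ from the failed host")

    # Shared storage: point the re-created VM at the same instance file.
    # Ephemeral storage: respawn from the same image on the new node.
    if on_shared_storage:
        disk_action = "reuse_instance_file"
    else:
        disk_action = "respawn_from_image"

    # Only the host changes; ID, flavor, IP and volumes are retained.
    return replace(instance, host=target_host), disk_action


if __name__ == "__main__":
    vm = Instance("u1", "node1", "img-1", "m1.small", "10.0.0.5", ("vol1",))
    print(plan_evacuation(vm, "node2", on_shared_storage=False))
```

In both branches the identity and configuration are unchanged. Only the host and the source of the disk differ.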

Note that the API must be manually invoked today. 

In addition, this patch modifies nova-compute such that on startup (e.g., 
after it failed and recovered) it verifies with the DB that it is still 
the owner of an instance before starting the VM.
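
A rough sketch of that startup check, assuming the DB can be viewed as a mapping from instance ID to owning host. The function name instances_to_start and its parameters are hypothetical and do not come from the patch.

```python
from typing import Dict, Iterable, List, Tuple


def instances_to_start(my_host: str,
                       local_instance_ids: Iterable[str],
                       db_host_by_instance: Dict[str, str]
                       ) -> Tuple[List[str], List[str]]:
    """Split locally known instances into (still owned, no longer owned).

    An instance is started only if the DB still lists this host as its
    owner. Instances that were evacuated elsewhere, or that are missing
    from the DB, are left alone.
    """
    owned: List[str] = []
    not_owned: List[str] = []
    for instance_id in local_instance_ids:
        if db_host_by_instance.get(instance_id) == my_host:
            owned.append(instance_id)
        else:
            not_owned.append(instance_id)
    return owned, not_owned


if __name__ == "__main__":
    db = {"a": "node1", "b": "node2"}
    print(instances_to_start("node1", ["a", "b", "c"], db))
```

The point of the check is to avoid running the same VM twice, once on the recovered node and once on the node it was evacuated to.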

Would be great to hear whether people think that such a capability is 
important to push into Folsom, despite the short runway till F3. Any other 
thoughts/recommendations regarding such a capability would also be highly 
appreciated.

Thanks,
Alex


Alex Glikson
Manager, Cloud Operating System Technologies, IBM Haifa Research Lab
http://w3.haifa.ibm.com/dept/stt/cloud_sys.html | 
https://www.research.ibm.com/haifa/dept/stt/cloud_sys.shtml 
Email: glik...@il.ibm.com | Phone: +972-4-8281085 | Mobile: 
+972-54-647 | Fax: +972-4-8296112


Re: [Openstack] instance evacuation from a failed node (rebuild for HA)

2012-08-10 Thread Ryan Lane
 We have submitted a patch https://review.openstack.org/#/c/11086/ to address
 https://blueprints.launchpad.net/nova/+spec/rebuild-for-ha that simplifies
 recovery from a node failure by introducing an API that recreates an
 instance on *another* host (similar to the existing instance 'rebuild'
 operation). The exact semantics of this operation vary depending on the
 configuration of the instances and the underlying storage topology. For
 example, if it is a regular 'ephemeral' instance, invoking the API will respawn it from
 the same image on another node while retaining the same identity and
 configuration (e.g. same ID, flavor, IP, attached volumes, etc). For
 instances running off shared storage (i.e. same instance file accessible on
 the target host), the VM will be re-created and point to the same instance
 file while retaining the identity and configuration. More details are
 available at http://wiki.openstack.org/Evacuate.


If the instance is on shared storage, what does recreate mean? Delete
the old instance and create a new instance, using the same disk image?
Does that mean that the new instance will have a new nova/ec2 id? In
the case where DNS is being used, this would delete the old DNS entry
and create a new DNS entry. This is lossy. If shared storage is
available, the only thing that likely needs to happen is for the
instance's host to be updated in the database, and a reboot issued for
the instance. That would keep everything identical, and would likely
be much faster.

- Ryan
