I am trying to implement a simple way to automate the backup mechanism (e.g. every day): https://blueprints.launchpad.net/nova/+spec/backup-schedule
And I thought of a solution that addresses your needs: when a node fails (for any reason), I disable it, delete all the servers that were running on it, and restart them from the last available backup.

Édouard.

On Fri, Nov 9, 2012 at 8:45 PM, Vishvananda Ishaya <[email protected]> wrote:
>
> The libvirt driver has actually gotten quite good at rebuilding all of
> the data for instances. The only thing it can't do right now is
> redownload base images from glance. With the current state, if you
> simply back up the instances directory (usually
> /var/lib/nova/instances), you can recover by bringing back the whole
> directory and doing a nova reboot <uuid> for each instance.
>
> You could just stick the whole thing on an LVM volume and snapshot it
> regularly for DR. The _base directory can be regenerated with images
> from glance, so you could also write a script to regenerate it and not
> have to worry about backing it up. The code to add to nova to make it
> automatically re-download the image from glance if it isn't there
> shouldn't be too bad either, which would mean you could safely ignore
> the _base directory for backups. Additionally, using qcow images in
> glance and the config option `force_raw_images=False` will keep this
> directory much smaller.
>
> Vish
>
>
> On Nov 9, 2012, at 2:51 AM, Jānis Ģeņģeris <[email protected]> wrote:
>
> Hello all,
>
> I would like to know what solutions are in use for backing up and/or
> snapshotting running instances on compute nodes. The documentation does
> not mention anything related to this. By snapshots I don't mean the
> current snapshot mechanism, which imports an image of the running VM
> into glance. I'm using KVM, but this is relevant for any hypervisor.
>
> Why is this important?
> Consider a simple scenario in which the hardware of a compute node
> fails, the node goes down immediately, and it is not recoverable in a
> reasonable time. The images of the running instances are also lost.
> A shared file system is not considered here, as it may cause IO
> bottlenecks and adds another layer of complexity.
>
> There have been a few discussions on the list about this problem, but
> none have really answered the question.
>
> The documentation covers disaster recovery after a power loss and
> recovery of a failed compute node from a shared file system, but it
> does not cover the case without a shared file system.
>
> I can think of a few solutions currently (for KVM):
> a) using LVM images for VMs and making LVM logical volume snapshots,
> but then the current nova snapshot mechanism will not work (from the
> docs - 'current snapshot mechanism in OpenStack Compute works only
> with instances backed with Qcow2 images');
> b) snapshotting machines with the OpenStack snapshot mechanism, but
> this doesn't really fit, because its goal is something other than
> creating backups, it will be slow, and it will pollute the glance
> image space;
>
> Regards
> --janis
> _______________________________________________
> Mailing list: https://launchpad.net/~openstack
> Post to : [email protected]
> Unsubscribe : https://launchpad.net/~openstack
> More help : https://help.launchpad.net/ListHelp
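
For reference, here is a rough sketch of the procedure Vish describes in the quoted message: archive the instances directory, optionally leaving out _base, then issue a nova reboot <uuid> per instance after restoring it. It is only an illustration under several assumptions. The function names (skip_base, backup_instances, reboot_commands, reboot_all) are made up for this example and are not part of nova, the source directory is assumed to be a quiesced copy such as a mounted LVM snapshot rather than the live directory, and the list of instance UUIDs is assumed to come from elsewhere (for example the nova database or a saved listing).

```python
import subprocess
import tarfile
from pathlib import Path

# "Usually" this path, according to the thread; adjust for your deployment.
INSTANCES_DIR = "/var/lib/nova/instances"


def skip_base(member):
    """tarfile filter (hypothetical helper): drop the _base image cache."""
    return None if "_base" in Path(member.name).parts else member


def backup_instances(src, archive, include_base=True):
    """Write a gzip'd tar of src.

    Assumption: src is a consistent copy (e.g. a mounted LVM snapshot).
    Only pass include_base=False if you have some way to regenerate _base
    from glance, since nova cannot re-download base images by itself.
    """
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname="instances",
                filter=None if include_base else skip_base)


def reboot_commands(uuids):
    """Build one 'nova reboot <uuid>' command line per instance."""
    return [["nova", "reboot", uuid] for uuid in uuids]


def reboot_all(uuids):
    """Run the reboots; needs the nova client and credentials in the env."""
    for cmd in reboot_commands(uuids):
        subprocess.run(cmd, check=True)
```

After restoring the archive to the instances directory on a replacement node, something like reboot_all(list_of_uuids) would then perform the per-instance reboot step. Keeping the command construction separate from the execution makes it easy to print the commands first and check them by hand.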

