I did look at this; for the VMs in question there are no entries in the run_on_vds and migrating_to_vds fields. I'm thinking of giving this a try.
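
For reference, a query along these lines should show those two fields for the same set of VMs. It is only a sketch: it assumes that run_on_vds and migrating_to_vds are columns of vm_dynamic, like the fields in the query quoted further down, and it reuses that query's subselect to pick the HA VMs without a storage lease. It only reads data and changes nothing.

```sql
-- Read-only check. Assumption: run_on_vds and migrating_to_vds are columns
-- of vm_dynamic, alongside status, exit_status and exit_reason.
SELECT vm_guid, status, run_on_vds, migrating_to_vds
FROM vm_dynamic
WHERE vm_guid IN (
    -- Same filter as the query quoted in this thread:
    -- HA VMs that have no lease on the storage.
    SELECT vm_guid
    FROM vm_static
    WHERE auto_startup = 't'
      AND lease_sd_id IS NULL
);
```

If both columns come back empty (NULL) for every row, that corresponds to the "no entries" result described above.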

-----Original Message-----
From: Yedidyah Bar David [mailto:d...@redhat.com]
Sent: Sunday, March 25, 2018 07:46
To: Sven Achtelik
Cc: users@ovirt.org
Subject: Re: [ovirt-users] Workflow after restoring engine from backup

On Fri, Mar 23, 2018 at 10:35 AM, Sven Achtelik <sven.achte...@eps.aero> wrote:
> It looks like I can't get a chance to shut down the HA VMs. I checked the
> restore log and it did mention that it changed the HA VM entries. Just to
> make sure, I looked at the DB, and for the VMs in question it looks like this:
>
> engine=# select vm_guid,status,vm_host,exit_status,exit_reason from
> vm_dynamic Where vm_guid IN (SELECT vm_guid FROM vm_static WHERE
> auto_startup='t' AND lease_sd_id is NULL);
>                 vm_guid                 | status | vm_host  | exit_status | exit_reason
> ---------------------------------------+--------+----------+-------------+-------------
>  8733d4a6-0844-xxxx-804f-6b919e93e076  |      0 | DXXXX    |           2 |          -1
>  4eeaa622-17f9-xxxx-b99a-cddb3ad942de  |      0 | xxxxAPP  |           2 |          -1
>  fbbdc0a0-23a4-4d32-xxxx-a35c59eb790d  |      0 | xxxxDB0  |           2 |          -1
>  45a4e7ce-19a9-4db9-xxxxx-66bd1b9d83af |      0 | xxxxxWOR |           2 |          -1
> (4 rows)
>
> Should that be enough to have a safe start of the engine without any HA
> action kicking in?

Looks ok, but check also run_on_vds and migrating_to_vds. See also bz 1446055.

Best regards,

>
> -----Original Message-----
> From: Yedidyah Bar David [mailto:d...@redhat.com]
> Sent: Monday, March 19, 2018 10:18
> To: Sven Achtelik
> Cc: users@ovirt.org
> Subject: Re: [ovirt-users] Workflow after restoring engine from backup
>
> On Mon, Mar 19, 2018 at 11:03 AM, Sven Achtelik <sven.achte...@eps.aero> wrote:
>> Hi Didi,
>>
>> my backups were taken with the engine-backup utility. I have 3 data
>> centers, two of them with just one host and the third one with 3
>> hosts running the engine. The backup is three days old and was taken on
>> engine version 4.1 (4.1.7); the restored engine is running on 4.1.9.
>
> Since the bug I mentioned was fixed in 4.1.3, you should be covered.
>
>> I have three HA VMs that would
>> be affected. All others are just normal VMs. Sounds like it would be
>> the safest to shut down the HA VMs to make sure that nothing happens?
>
> If you can have downtime, I agree it sounds safer to shut down the VMs.
>
>> Or can I
>> disable the HA action in the DB for now?
>
> No need to. If you restored with 4.1.9 engine-backup, it should have done
> this for you. If you still have the restore log, you can verify this by
> checking it. It should contain 'Resetting HA VM status', and then the result
> of the SQL that it ran.
>
> Best regards,
>
>>
>> Thank you,
>>
>> Sven
>>
>> Sent from my Samsung Galaxy smartphone.
>>
>> -------- Original Message --------
>> From: Yedidyah Bar David <d...@redhat.com>
>> Date: 19.03.18 07:33 (GMT+01:00)
>> To: Sven Achtelik <sven.achte...@eps.aero>
>> Cc: users@ovirt.org
>> Subject: Re: [ovirt-users] Workflow after restoring engine from backup
>>
>> On Sun, Mar 18, 2018 at 11:45 PM, Sven Achtelik <sven.achte...@eps.aero> wrote:
>>> Hi All,
>>>
>>> I had an issue with the storage that hosted my engine VM. The disk got
>>> corrupted and I needed to restore the engine from a backup.
>>
>> How did you back up, and how did you restore?
>>
>> Which version was used for each?
>>
>>> That worked as
>>> expected, I just didn’t start the engine yet.
>>
>> OK.
>>
>>> I know that after the backup
>>> was taken some machines were migrated around before the engine
>>> disks failed.
>>
>> Are these machines HA?
>>
>>> My question is what will happen once I start the engine service
>>> which has the restored backup on it? Will it query the hosts for
>>> the running VMs
>>
>> It will, but HA machines are handled differently.
>>
>> See also:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1441322
>> https://bugzilla.redhat.com/show_bug.cgi?id=1446055
>>
>>> or will it assume that the VMs are still on the hosts as they
>>> resided at the point of backup?
>>
>> It does, initially, but then updates status according to what it gets
>> from hosts.
>>
>> But polling the hosts takes time, especially if you have many, and HA
>> policy might require faster handling. So if it first polls a host
>> that had a machine on it during backup, sees that the machine is gone,
>> and has not yet polled the new host, HA handling starts immediately, which
>> eventually might lead to starting the VM on another host.
>>
>> To prevent that, the fixes to the above bugs make the restore process
>> mark HA VMs that do not have leases on the storage as "dead".
>>
>>> Would I need to change the DB manually to let the engine know where
>>> VMs are up at this point?
>>
>> You might need to, if you have HA VMs and a too-old version of restore.
>>
>>> What will happen to HA VMs?
>>> I feel that it might try to start them a second time. My biggest
>>> issue is that I can’t get a service window to shut down all VMs and
>>> then let them be restarted by the engine.
>>>
>>> Is there a known workflow for that?
>>
>> I am not aware of a tested procedure for handling the above if you have a
>> too-old version, but you can check the patches linked from the above bugs
>> and manually run the SQL command(s) they include. They are
>> essentially comment 4 of the first bug.
>>
>> Good luck and best regards,
>> --
>> Didi
>
> --
> Didi

--
Didi
_______________________________________________
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users