Re: [ovirt-users] Workflow after restoring engine from backup

Sven Achtelik Tue, 27 Mar 2018 00:57:07 -0700

I did look at this: for the VMs in question, the run_on_vds and 
migrating_to_vds fields are empty. I'm thinking of giving this a try.


-----Original Message-----
From: Yedidyah Bar David [mailto:[email protected]] 
Sent: Sunday, March 25, 2018 07:46
To: Sven Achtelik
Cc: [email protected]
Subject: Re: [ovirt-users] Workflow after restoring engine from backup

On Fri, Mar 23, 2018 at 10:35 AM, Sven Achtelik <[email protected]> wrote:
> It looks like I won't get a chance to shut down the HA VMs. I checked the 
> restore log, and it did mention that it changed the HA VM entries. Just to 
> make sure, I looked at the DB, and for the VMs in question it looks like this:
>
> engine=# select vm_guid,status,vm_host,exit_status,exit_reason from 
> vm_dynamic Where vm_guid IN (SELECT vm_guid FROM vm_static WHERE 
> auto_startup='t' AND lease_sd_id is NULL);
>                 vm_guid                | status | vm_host  | exit_status | exit_reason
> ---------------------------------------+--------+----------+-------------+-------------
>  8733d4a6-0844-xxxx-804f-6b919e93e076  |      0 | DXXXX    |           2 |          -1
>  4eeaa622-17f9-xxxx-b99a-cddb3ad942de  |      0 | xxxxAPP  |           2 |          -1
>  fbbdc0a0-23a4-4d32-xxxx-a35c59eb790d  |      0 | xxxxDB0  |           2 |          -1
>  45a4e7ce-19a9-4db9-xxxxx-66bd1b9d83af |      0 | xxxxxWOR |           2 |          -1
> (4 rows)
>
> Should that be enough for a safe start of the engine without any HA 
> action kicking in?

Looks ok, but check also run_on_vds and migrating_to_vds. See also bz 1446055.
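
For example, a variant of your query that also shows those two fields. This 
is only a sketch: it assumes both are columns of vm_dynamic, next to the ones 
you already selected, and I would expect both to be empty for every row.

```sql
-- Same filter as the query above: HA VMs without a storage lease.
-- Assumption: run_on_vds and migrating_to_vds are columns of vm_dynamic.
SELECT vm_guid, status, run_on_vds, migrating_to_vds
FROM vm_dynamic
WHERE vm_guid IN (
    SELECT vm_guid
    FROM vm_static
    WHERE auto_startup = 't' AND lease_sd_id IS NULL
);
```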

Best regards,

>
>
> -----Original Message-----
> From: Yedidyah Bar David [mailto:[email protected]]
> Sent: Monday, March 19, 2018 10:18
> To: Sven Achtelik
> Cc: [email protected]
> Subject: Re: [ovirt-users] Workflow after restoring engine from backup
>
> On Mon, Mar 19, 2018 at 11:03 AM, Sven Achtelik <[email protected]> 
> wrote:
>> Hi Didi,
>>
>> my backups were taken with the engine-backup utility. I have 3 data 
>> centers, two of them with just one host and the third one with 3 
>> hosts running the engine.  The backup is three days old and was taken on 
>> engine version 4.1 (4.1.7); the restored engine is running on 4.1.9.
>
> Since the bug I mentioned was fixed in 4.1.3, you should be covered.
>
>> I have three HA VMs that would
>> be affected. All others are just normal VMs. Sounds like it would be 
>> safest to shut down the HA VMs to make sure that nothing happens?
>
> If you can have downtime, I agree it sounds safer to shut down the VMs.
>
>> Or can I
>> disable the HA action in the DB for now ?
>
> No need to. If you restored with 4.1.9 engine-backup, it should have done 
> this for you. If you still have the restore log, you can verify this by 
> checking it. It should contain 'Resetting HA VM status', and then the result 
> of the sql that it ran.
>
> Best regards,
>
>>
>> Thank you,
>>
>> Sven
>>
>>
>>
>> Sent from my Samsung Galaxy smartphone.
>>
>>
>> -------- Original Message --------
>> From: Yedidyah Bar David <[email protected]>
>> Date: 19.03.18 07:33 (GMT+01:00)
>> To: Sven Achtelik <[email protected]>
>> Cc: [email protected]
>> Subject: Re: [ovirt-users] Workflow after restoring engine from 
>> backup
>>
>> On Sun, Mar 18, 2018 at 11:45 PM, Sven Achtelik 
>> <[email protected]>
>> wrote:
>>> Hi All,
>>>
>>>
>>>
>>> I had an issue with the storage that hosted my engine VM. The disk got 
>>> corrupted and I needed to restore the engine from a backup.
>>
>> How did you back up, and how did you restore?
>>
>> Which version was used for each?
>>
>>> That worked as
>>> expected; I just haven’t started the engine yet.
>>
>> OK.
>>
>>> I know that after the backup
>>> was taken some machines were migrated around before the engine 
>>> disks failed.
>>
>> Are these machines HA?
>>
>>> My question is what will happen once I start the engine service 
>>> which has the restored backup on it ? Will it query the hosts for 
>>> the running VMs
>>
>> It will, but HA machines are handled differently.
>>
>> See also:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1441322
>> https://bugzilla.redhat.com/show_bug.cgi?id=1446055
>>
>>> or will it assume that the VMs are still on the hosts as they 
>>> resided at the point of backup ?
>>
>> It does, initially, but then updates status according to what it gets 
>> from hosts.
>>
>> But polling the hosts takes time, especially if you have many, and HA 
>> policy might require faster handling. So if it polls first a host 
>> that had a machine on it during backup, and sees that it's gone, and 
>> didn't yet poll the new host, HA handling starts immediately, which 
>> eventually might lead to starting the VM on another host.
>>
>> To prevent that, the fixes to above bugs make the restore process 
>> mark HA VMs that do not have leases on the storage as "dead".
>>
>>> Would I need to change the DB manually to let the engine know where 
>>> VMs are up at this point ?
>>
>> You might need to, if you have HA VMs and a too-old version of restore.
>>
>>> What will happen to HA VMs?
>>> I feel that it might try to start them a second time.  My biggest 
>>> issue is that I can’t get a service window to shut down all VMs and 
>>> then let the engine restart them.
>>>
>>>
>>>
>>> Is there a known workflow for that ?
>>
>> I am not aware of a tested procedure for handling above if you have a 
>> too-old version, but you can check the patches linked from above bugs 
>> and manually run the SQL command(s) they include. They are 
>> essentially comment 4 of the first bug.
>>
>> Good luck and best regards,
>> --
>> Didi
>
>
>
> --
> Didi



--
Didi
_______________________________________________
Users mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/users
