[ovirt-users] Re: problems testing 4.3.10 to 4.4.8 upgrade SHE

2021-09-12 Thread Yedidyah Bar David
On Mon, Sep 13, 2021 at 1:08 AM Gianluca Cecchi
 wrote:
>
> On Sun, Sep 12, 2021 at 10:35 AM Yedidyah Bar David  wrote:
>>
>>
>> >>
>> >> It was the step I suspect there was a regression for in 4.4.8 (comparing 
>> >> with 4.4.7) when updating the first hosted-engine host during the upgrade 
>> >> flow and retaining its hostname details.
>>
>> What's the regression?
>
>
> I thought that in 4.4.7 this problem did not occur if you used the same 
> hostname but different (real or virtual) hw as the first host during your 
> SHE upgrade from 4.3.10 to 4.4.7.
> But probably that was not the case and I didn't remember correctly
>
>>
>> >> I'm going to test with latest async 2 4.4.8 and see if it solves the 
>> >> problem. Otherwise I'm going to open a bugzilla sending the logs.
>>
>> Can you clarify what the bug is?
>
>
> The automatic management of adding the host during the "hosted-engine --deploy 
> --restore-from-file=backup.bck" step when you have different hw and want to 
> recycle your previous hostname.
> In the past I have often combined upgrades of systems with hw refreshes (with 
> standalone hosts, rhcs clusters, also ovirt/rhv from 4.2 to 4.3 if I remember 
> correctly, etc.), where you re-use an existing hostname on new hardware.
> More than a bug it would perhaps be an RFE

OK, now filed it: https://bugzilla.redhat.com/show_bug.cgi?id=2003515

>
>
>>
>> > As novirt2 and novirt1 (in 4.3) are VMs running on the same hypervisor, I 
>> > see that in their hw details they have the same serial number and the usual 
>> > random uuid
>>
>> Same serial number? Doesn't sound right. Any idea why it's the same?
>
>
> My env is nested oVirt and my hypervisors are VMs.
> I noticed that in oVirt, if you clone a VM, the clone gets a new uuid but 
> retains the serial number...

OK, understood. Unrelated to the current issue, but it might be worth
optionally changing this as well during a clone.
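To see whether two (nested) hypervisor guests clash this way, you can compare the SMBIOS values on each host; `dmidecode` needs root and real hardware/firmware, so below it is only shown commented, and the comparison itself is a plain function you can feed the values into (a sketch, not part of the oVirt flow):

```shell
# On each host, as root, you would collect:
#   uuid=$(dmidecode -s system-uuid)
#   serial=$(dmidecode -s system-serial-number)

same_serial() {
    # $1 / $2: serial numbers of two hosts; warn if they match
    if [ "$1" = "$2" ]; then
        echo "WARNING: duplicate serial number: $1"
    else
        echo "serial numbers differ"
    fi
}

# Example with the value reported in this thread (novirt1 vs the novirt2 clone):
same_serial "00fa984c-d5a1-e811-906e-00163566263e" \
            "00fa984c-d5a1-e811-906e-00163566263e"
```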

>
>> > Unfortunately I cannot try at the moment the scenario where I deploy the 
>> > new novirt2 on the same virtual hw, because in the first 4.3 install I 
>> > configured the OS disk as 50Gb and with this size 4.4.8 complains about 
>> > insufficient space. And having the snapshot active in preview I cannot 
>> > resize the disk
>> > Eventually I can reinstall 4.3 on an 80Gb disk and try the same, 
>> > maintaining the same hw... but this would imply that, in general, I cannot 
>> > upgrade using different hw while reusing the same hostnames, correct?
>>
>> Yes. Either reuse a host and keep its name (what we recommend in the
>> upgrade guide) or use a new host and a new name (backup/restore
>> guide).
>>
>> The condition to remove the host prior to adding it is based on
>> unique_id_out, which is set in (see also bz 1642440, 1654697):
>>
>>   - name: Get host unique id
>>     shell: |
>>       if [ -e /etc/vdsm/vdsm.id ];
>>       then cat /etc/vdsm/vdsm.id;
>>       elif [ -e /proc/device-tree/system-id ];
>>       then cat /proc/device-tree/system-id; #ppc64le
>>       else dmidecode -s system-uuid;
>>       fi;
>>     environment: "{{ he_cmd_lang }}"
>>     changed_when: true
>>     register: unique_id_out
>>
>> So if you want to "make this work", you can set the uuid (either in
>> your (virtual) BIOS, to affect the /proc value, or in
>> /etc/vdsm/vdsm.id) to match the one of the old host (the one whose
>> name you want to reuse). I didn't test this myself, though.
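A sketch of the `/etc/vdsm/vdsm.id` variant of this workaround, untested as Didi notes: seed the new host with the OLD host's uuid before running `hosted-engine --deploy`, so `unique_id_out` matches and the "Remove host used to redeploy" task finds the old entry. The uuid below is the one reported for the 4.3 novirt2 in this thread; the scratch path is just for a dry run.

```shell
OLD_UUID="D584E962-5461-4FA5-AFFA-DB413E17590C"   # uuid of the 4.3 novirt2 (from this thread)

seed_vdsm_id() {
    # $1: target file (normally /etc/vdsm/vdsm.id), $2: uuid to write
    mkdir -p "$(dirname "$1")"
    printf '%s' "$2" > "$1"
}

# Real run (as root on the freshly installed node, BEFORE hosted-engine --deploy):
#   seed_vdsm_id /etc/vdsm/vdsm.id "$OLD_UUID"
# Dry run against a scratch path instead:
seed_vdsm_id /tmp/vdsm-id-demo/vdsm.id "$OLD_UUID"
cat /tmp/vdsm-id-demo/vdsm.id
```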
>>
>
> I confirm it: I reverted the snapshots of the 2 VMs used as hypervisors, 
> taking them back to the initial 4.3 status, and redid all the steps, but right 
> after installing the OS of the 4.4.8 oVirt node I created /etc/vdsm/vdsm.id 
> inside novirt2 with the old 4.3 value (the file was not there at that moment), 
> and then the whole flow went as expected. I was able to reach the final 
> 4.4.8 async 2 env with both hosts at 4.4.8, cluster and DC updated to 4.6 
> compatibility level, and no downtime for the VMs inside the env, because I 
> was able to live-migrate them after upgrading the first host.

Thanks for the report!

Best regards,
-- 
Didi
___
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/3PZLOASBPEUSGULEW2TPYG6ET4U7ICYT/


[ovirt-users] Re: Create template from snapshot of vm using MBS disk

2021-09-12 Thread ssarang520
How can I file the bug? Do you have a guide?


[ovirt-users] Re: problems testing 4.3.10 to 4.4.8 upgrade SHE

2021-09-12 Thread Gianluca Cecchi
On Sun, Sep 12, 2021 at 10:35 AM Yedidyah Bar David  wrote:

>
> >>
> >> It was the step I suspect there was a regression for in 4.4.8
> (comparing with 4.4.7) when updating the first hosted-engine host during
> the upgrade flow and retaining its hostname details.
>
> What's the regression?
>

I thought that in 4.4.7 this problem did not occur if you used the same
hostname but different (real or virtual) hw as the first host during your SHE
upgrade from 4.3.10 to 4.4.7.
But probably that was not the case and I didn't remember correctly


> >> I'm going to test with latest async 2 4.4.8 and see if it solves the
> problem. Otherwise I'm going to open a bugzilla sending the logs.
>
> Can you clarify what the bug is?
>

The automatic management of adding the host during the "hosted-engine --deploy
--restore-from-file=backup.bck" step when you have different hw and want to
recycle your previous hostname.
In the past I have often combined upgrades of systems with hw refreshes (with
standalone hosts, rhcs clusters, also ovirt/rhv from 4.2 to 4.3 if I remember
correctly, etc.), where you re-use an existing hostname on new hardware.
More than a bug it would perhaps be an RFE



> > As novirt2 and novirt1 (in 4.3) are VMs running on the same hypervisor, I
> see that in their hw details they have the same serial number and the usual
> random uuid
>
> Same serial number? Doesn't sound right. Any idea why it's the same?
>

My env is nested oVirt and my hypervisors are VMs.
I noticed that in oVirt, if you clone a VM, the clone gets a new uuid but
retains the serial number...

> Unfortunately I cannot try at the moment the scenario where I deploy the
> new novirt2 on the same virtual hw, because in the first 4.3 install I
> configured the OS disk as 50Gb and with this size 4.4.8 complains about
> insufficient space. And having the snapshot active in preview I cannot
> resize the disk.
> Eventually I can reinstall 4.3 on an 80Gb disk and try the same,
> maintaining the same hw... but this would imply that, in general, I cannot
> upgrade using different hw while reusing the same hostnames, correct?
>
> Yes. Either reuse a host and keep its name (what we recommend in the
> upgrade guide) or use a new host and a new name (backup/restore
> guide).
>
> The condition to remove the host prior to adding it is based on
> unique_id_out, which is set in (see also bz 1642440, 1654697):
>
>   - name: Get host unique id
>     shell: |
>       if [ -e /etc/vdsm/vdsm.id ];
>       then cat /etc/vdsm/vdsm.id;
>       elif [ -e /proc/device-tree/system-id ];
>       then cat /proc/device-tree/system-id; #ppc64le
>       else dmidecode -s system-uuid;
>       fi;
>     environment: "{{ he_cmd_lang }}"
>     changed_when: true
>     register: unique_id_out
>
> So if you want to "make this work", you can set the uuid (either in
> your (virtual) BIOS, to affect the /proc value, or in
> /etc/vdsm/vdsm.id) to match the one of the old host (the one whose
> name you want to reuse). I didn't test this myself, though.
>
>
I confirm it: I reverted the snapshots of the 2 VMs used as hypervisors,
taking them back to the initial 4.3 status, and redid all the steps, but right
after installing the OS of the 4.4.8 oVirt node I created /etc/vdsm/vdsm.id
inside novirt2 with the old 4.3 value (the file was not there at that moment),
and then the whole flow went as expected. I was able to reach the final 4.4.8
async 2 env with both hosts at 4.4.8, cluster and DC updated to 4.6
compatibility level, and no downtime for the VMs inside the env, because I was
able to live-migrate them after upgrading the first host.


> Perhaps, if you do want to open a bug, it should say something like:
> "HE deploy should remove the old host based on its name, and not its
> UUID". However, it's not completely clear to me that this won't
> introduce new regressions.
>
> I admit I didn't completely understand your flow, and especially your
> considerations there. If you think the current behavior prevents an
> important flow, please clarify.
>
> Best regards,
> --
> Didi
>
>
My considerations, as explained at the beginning, were about giving the chance
to reuse the hostname (often the oVirt admin is not responsible for hostname
creation/mgmt) when you want to leverage new hw in combination with the
upgrade process.

Thanks for all the other considerations you put into your answer.

Gianluca


[ovirt-users] Re: problems testing 4.3.10 to 4.4.8 upgrade SHE

2021-09-12 Thread Yedidyah Bar David
Hi Gianluca,

On Fri, Sep 10, 2021 at 10:04 AM Gianluca Cecchi
 wrote:
>
>
> On Wed, Sep 1, 2021 at 4:26 PM Gianluca Cecchi  
> wrote:
>>
>> On Wed, Sep 1, 2021 at 4:00 PM Yedidyah Bar David  wrote:
>>>
>>>
>>> >
>>> > So I think there was something wrong with my system or probably a 
>>> > regression on this in 4.4.8.
>>> >
>>> > I see these lines in ansible steps of deploy of RHV 4.3 -> 4.4
>>> >
>>> > [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Remove host used to 
>>> > redeploy]
>>> > [ INFO  ] changed: [localhost -> 192.168.222.170]
>>> >
>>> > possibly this step should remove the host that I'm reinstalling...?
>>>
>>> It should. From the DB, before adding it again. Matches on the uuid
>>> (search the code for unique_id_out if you want the details). Why?
>>>
>>> (I didn't follow all this thread, ignoring the rest for now...)
>>>
>>> Best regards,
>>>
>>>
>>
>> It was the step I suspect there was a regression for in 4.4.8 (comparing 
>> with 4.4.7) when updating the first hosted-engine host during the upgrade 
>> flow and retaining its hostname details.

What's the regression?

>> I'm going to test with latest async 2 4.4.8 and see if it solves the 
>> problem. Otherwise I'm going to open a bugzilla sending the logs.

Can you clarify what the bug is?

>>
>> Gianluca
>
>
> So I tried with 4.4.8 async 2, but I hit the same problem:
>
> [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Check actual cluster 
> location]
> [ INFO  ] skipping: [localhost]
> [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Enable GlusterFS at cluster 
> level]
> [ INFO  ] skipping: [localhost]
> [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Set VLAN ID at datacenter 
> level]
> [ INFO  ] skipping: [localhost]
> [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Get active list of active 
> firewalld zones]
> [ INFO  ] changed: [localhost]
> [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Configure libvirt firewalld 
> zone]
> [ INFO  ] changed: [localhost]
> [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Add host]
> [ INFO  ] changed: [localhost]
> [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Include after_add_host 
> tasks files]
> [ INFO  ] You can now connect to 
> https://novirt2.localdomain.local:6900/ovirt-engine/ and check the status of 
> this host and eventually remediate it, please continue only when the host is 
> listed as 'up'
> [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : include_tasks]
> [ INFO  ] ok: [localhost]
> [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Create temporary lock file]
> [ INFO  ] changed: [localhost -> localhost]
> [ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Pause execution until 
> /tmp/ansible.wy3ichvk_he_setup_lock is removed, delete it once ready to 
> proceed]
>
> the host keeps remaining as NonResponsive in the local engine, and in 
> engine.log I see the same:
>
> 2021-09-10 08:44:51,481+02 ERROR 
> [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesAsyncVDSCommand] 
> (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [] 
> Command 'GetCapabilitiesAsyncVDSCommand(HostName = novirt2.localdomain.local, 
> VdsIdAndVdsVDSCommandParametersBase:{hostId='ca9ff6f7-5a7c-4168-9632-998c52f76cfa',
>  
> vds='Host[novirt2.localdomain.local,ca9ff6f7-5a7c-4168-9632-998c52f76cfa]'})' 
> execution failed: java.net.ConnectException: Connection refused
>
> so the initial install/config of novirt2 doesn't start
>
> So the scenario is
>
> initial 4.3.10 with 2 hosts (novirt1 and novirt2) and 1 she engine (novmgr)
> iSCSI based storage: hosted_engine storage domain and one data storage domain
>
> This is a nested env, so through snapshots I can retry and repeat steps.
> novirt1 and novirt2 are two VMs under one oVirt 4.4 env composed of one 
> single host and an external engine
>
> the steps:
> 1 vm running under novirt1 and hosted engine running under novirt2 at the 
> beginning
> . global maintenance
> . stop engine
> . backup
> . shutdown engine vm and scratch novirt2
> actually I simulate the scenario where I deploy novirt2 on new hw, that is, a 
> clone of the novirt2 VM
> Already tested (in a previous version of 4.4.8) that if I go through a 
> different hostname it works

Correct
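For reference, the steps listed above map roughly onto these commands (a sketch based on the standard SHE backup/restore upgrade flow; the backup file and log names are placeholders, and nothing here is executed against a real env, so they are only printed):

```shell
# Each command runs on the host noted in its comment; none are run here.
upgrade_steps() {
    cat <<'EOF'
hosted-engine --set-maintenance --mode=global        # on a hosted-engine host
systemctl stop ovirt-engine                          # on the engine VM
engine-backup --mode=backup --scope=all --file=backup.bck --log=backup.log
shutdown -h now                                      # engine VM; then scratch novirt2
hosted-engine --deploy --restore-from-file=backup.bck  # on the new 4.4 host
EOF
}

upgrade_steps
```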

> As novirt2 and novirt1 (in 4.3) are VMS running on the same hypervisor I see 
> that in their hw details I have the same serial number and the usual random 
> uuid

Same serial number? Doesn't sound right. Any idea why it's the same?

>
> novirt1
> uuid B1EF9AFF-D4BD-41A1-B26E-7DD0CC440963
> serial number 00fa984c-d5a1-e811-906e-00163566263e
>
> novirt2
> uuid D584E962-5461-4FA5-AFFA-DB413E17590C
> serial number 00fa984c-d5a1-e811-906e-00163566263e
>
> and the new novirt2, which as a clone has a different uuid, shows (from 
> dmidecode)
> uuid: 10b9031d-a475-4b41-a134-bad2ede3cf11
> serial Number: 00fa984c-d5a1-e811-906e-00163566263e
>
> Unfortunately I cannot try at the moment the scenario where I deploy the new 
> novirt2 on the same virtual hw, because in the first 4.3 install I