[ovirt-users] Re: Stuck completing last step of 4.3 upgrade

Simone Tiraboschi Wed, 13 Feb 2019 11:22:46 -0800

On Wed, Feb 13, 2019 at 8:06 PM Jayme <[email protected]> wrote:

>
> I might be hitting this bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1666795
>


Yes, you definitely are.
Fixing the file ownership on the file system side is a valid workaround.
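
As a rough sketch of that workaround, the snippet below lists files under an image directory whose ownership differs from what is expected, and can optionally reset it. It assumes the expected owner is vdsm:kvm (uid 36, gid 36), which is the usual ownership on oVirt storage; verify that on your hosts before changing anything. The function names are illustrative only and are not part of VDSM.

```python
import os

VDSM_UID = 36  # assumption: uid of the vdsm user on oVirt hosts
KVM_GID = 36   # assumption: gid of the kvm group on oVirt hosts


def find_wrong_ownership(root, uid=VDSM_UID, gid=KVM_GID):
    """Return the sorted paths under root not owned by uid:gid."""
    wrong = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # do not follow symlinks
            if st.st_uid != uid or st.st_gid != gid:
                wrong.append(path)
    return sorted(wrong)


def fix_ownership(paths, uid=VDSM_UID, gid=KVM_GID, dry_run=True):
    """Reset ownership of each path; only report when dry_run is True."""
    for path in paths:
        if dry_run:
            print("would chown %d:%d %s" % (uid, gid, path))
        else:
            os.lchown(path, uid, gid)  # needs root privileges
```

A cautious way to use it is to call find_wrong_ownership() on the image directory named in the VDSM error, review the list, run fix_ownership() with dry_run=True, and only then rerun it with dry_run=False as root.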


>
> On Wed, Feb 13, 2019 at 1:35 PM Jayme <[email protected]> wrote:
>
>> This may be happening because I changed cluster compatibility to 4.3 and
>> then immediately changed data center compatibility to 4.3 (before
>> restarting VMs after the cluster compatibility change).  If this is the
>> case, I can't fix it by downgrading the data center compatibility to 4.2,
>> as it won't allow me to do so.  What can I do to fix this?  Any VM I
>> restart will break (I am leaving the others running for now, but there
>> are some down that I can't start).
>>
>> Full error from VDSM:
>>
>> 2019-02-13 13:30:55,465-0400 ERROR (vm/d070ce80)
>> [storage.TaskManager.Task] (Task='d5c8e50a-0a6f-4fe7-be79-fd322b273a1e')
>> Unexpected error (task:875)
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882,
>> in _run
>>     return fn(*args, **kargs)
>>   File "<string>", line 2, in prepareImage
>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in
>> method
>>     ret = func(*args, **kwargs)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 3198,
>> in prepareImage
>>     legality = dom.produceVolume(imgUUID, volUUID).getLegality()
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 818,
>> in produceVolume
>>     volUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/glusterVolume.py",
>> line 45, in __init__
>>     volUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>> 800, in __init__
>>     self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID,
>> volUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
>> line 71, in __init__
>>     volUUID)
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>> 86, in __init__
>>     self.validate()
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>> 112, in validate
>>     self.validateVolumePath()
>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
>> line 131, in validateVolumePath
>>     raise se.VolumeDoesNotExist(self.volUUID)
>> VolumeDoesNotExist: Volume does not exist:
>> (u'2d6d5f87-ccb0-48ce-b3ac-84495bd12d32',)
>> 2019-02-13 13:30:55,468-0400 ERROR (vm/d070ce80) [storage.Dispatcher]
>> FINISH prepareImage error=Volume does not exist:
>> (u'2d6d5f87-ccb0-48ce-b3ac-84495bd12d32',) (dispatcher:81)
>> 2019-02-13 13:30:55,469-0400 ERROR (vm/d070ce80) [virt.vm]
>> (vmId='d070ce80-e0bc-489d-8ee0-47d5926d5ae2') The vm start process failed
>> (vm:937)
>> Traceback (most recent call last):
>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 866, in
>> _startUnderlyingVm
>>     self._run()
>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2749, in
>> _run
>>     self._devices = self._make_devices()
>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2589, in
>> _make_devices
>>     disk_objs = self._perform_host_local_adjustment()
>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2662, in
>> _perform_host_local_adjustment
>>     self._preparePathsForDrives(disk_params)
>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1011, in
>> _preparePathsForDrives
>>     drive['path'] = self.cif.prepareVolumePath(drive, self.id)
>>   File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 415, in
>> prepareVolumePath
>>     raise vm.VolumeError(drive)
>> VolumeError: Bad volume specification {'address': {'function': '0x0',
>> 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'slot': '0x06'},
>> 'serial': 'd81a6826-dc46-44db-8de7-405d30e44d57', 'index': 0, 'iface':
>> 'virtio', 'apparentsize': '64293699584', 'specParams': {}, 'cache': 'none',
>> 'imageID': 'd81a6826-dc46-44db-8de7-405d30e44d57', 'truesize':
>> '64293814272', 'type': 'disk', 'domainID':
>> '1f2e9989-9ab3-43d5-971d-568b8feca918', 'reqsize': '0', 'format': 'cow',
>> 'poolID': 'a45e442e-9989-11e8-b0e4-00163e4bf18a', 'device': 'disk', 'path':
>> '/rhev/data-center/a45e442e-9989-11e8-b0e4-00163e4bf18a/1f2e9989-9ab3-43d5-971d-568b8feca918/images/d81a6826-dc46-44db-8de7-405d30e44d57/2d6d5f87-ccb0-48ce-b3ac-84495bd12d32',
>> 'propagateErrors': 'off', 'name': 'vda', 'bootOrder': '1', 'volumeID':
>> '2d6d5f87-ccb0-48ce-b3ac-84495bd12d32', 'diskType': 'file', 'alias':
>> 'ua-d81a6826-dc46-44db-8de7-405d30e44d57', 'discard': False}
>>
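
For triage, the disk path and volume ID can be pulled out of the "Bad volume specification" line quoted above. This is a minimal sketch, assuming the wrapped log line has first been rejoined into a single string with the quote prefixes removed; parse_bad_volume_spec is a hypothetical helper name, not something VDSM provides.

```python
import ast

MARKER = "Bad volume specification "


def parse_bad_volume_spec(log_line):
    """Return the drive dict from a 'Bad volume specification' line, or None."""
    _, found, tail = log_line.partition(MARKER)
    if not found:
        return None
    start = tail.find("{")
    end = tail.rfind("}")
    if start == -1 or end < start:
        return None
    # The dict in the log is a Python literal, so literal_eval can read it
    # without executing arbitrary code.
    return ast.literal_eval(tail[start:end + 1])
```

The 'path' key of the returned dict is the file whose existence and ownership are worth checking on the host.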
>> On Wed, Feb 13, 2019 at 1:19 PM Jayme <[email protected]> wrote:
>>
>>> I may have made matters worse.  I changed the cluster to 4.3
>>> compatibility, then the data center to 4.3.  All VMs were marked as
>>> requiring a reboot.  I restarted a couple of them and none of them will
>>> start up; they report "bad volume specification".  The ones that I have
>>> not yet restarted are still running OK.  I need to figure out why the
>>> VMs aren't restarting.
>>>
>>> Here is an example from vdsm.log
>>>
>>> VolumeError: Bad volume specification {'address': {'function': '0x0',
>>> 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'slot': '0x06'},
>>> 'serial': 'd81a6826-dc46-44db-8de7-405d30e44d57', 'index': 0, 'iface':
>>> 'virtio', 'apparentsize': '64293699584', 'specParams': {}, 'cache': 'none',
>>> 'imageID': 'd81a6826-dc46-44db-8de7-405d30e44d57', 'truesize':
>>> '64293814272', 'type': 'disk', 'domainID':
>>> '1f2e9989-9ab3-43d5-971d-568b8feca918', 'reqsize': '0', 'format': 'cow',
>>> 'poolID': 'a45e442e-9989-11e8-b0e4-00163e4bf18a', 'device': 'disk', 'path':
>>> '/rhev/data-center/a45e442e-9989-11e8-b0e4-00163e4bf18a/1f2e9989-9ab3-43d5-971d-568b8feca918/images/d81a6826-dc46-44db-8de7-405d30e44d57/2d6d5f87-ccb0-48ce-b3ac-84495bd12d32',
>>> 'propagateErrors': 'off', 'name': 'vda', 'bootOrder': '1', 'volumeID':
>>> '2d6d5f87-ccb0-48ce-b3ac-84495bd12d32', 'diskType': 'file', 'alias':
>>> 'ua-d81a6826-dc46-44db-8de7-405d30e44d57', 'discard': False}
>>>
>>> On Wed, Feb 13, 2019 at 1:01 PM Jayme <[email protected]> wrote:
>>>
>>>> I think I just figured out what I was doing wrong.  On the edit cluster
>>>> screen I was changing both the CPU type and the cluster level to 4.3 at
>>>> once.  I tried again by switching to the new CPU type first (leaving the
>>>> cluster on 4.2) and saving, then going back in and switching the compat
>>>> level to 4.3.  It appears that you need to do this in two steps for it
>>>> to work.
>>>>
>>>>
>>>>
>>>> On Wed, Feb 13, 2019 at 12:57 PM Jayme <[email protected]> wrote:
>>>>
>>>>> Hmm interesting, I wonder how you were able to switch from SandyBridge
>>>>> IBRS to SandyBridge IBRS SSBD.  I just attempted the same in both regular
>>>>> mode and in global maintenance mode and it won't allow me to, it says that
>>>>> all hosts have to be in maintenance mode (screenshots attached).   Are you
>>>>> also running HCI/Gluster setup?
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 13, 2019 at 12:44 PM Ron Jerome <[email protected]> wrote:
>>>>>
>>>>>> > Environment setup:
>>>>>> >
>>>>>> > 3 Host HCI GlusterFS setup.  Identical hosts, Dell R720s w/ Intel
>>>>>> > E5-2690 CPUs
>>>>>> >
>>>>>> > 1 default data center (4.2 compat)
>>>>>> > 1 default cluster (4.2 compat)
>>>>>> >
>>>>>> > Situation: I recently upgraded my three node HCI cluster from oVirt
>>>>>> > 4.2 to 4.3.  I did so by first updating the engine to 4.3 then
>>>>>> > upgrading each ovirt-node host to 4.3 and rebooting.
>>>>>> >
>>>>>> > Currently engine and all hosts are running 4.3 and all is working
>>>>>> > fine.
>>>>>> >
>>>>>> > To complete the upgrade I need to update cluster compatibility to
>>>>>> > 4.3 and then data centre to 4.3.  This is where I am stuck.
>>>>>> >
>>>>>> > The CPU type on cluster is "Intel SandyBridge IBRS Family".  This
>>>>>> > option is no longer available if I select 4.3 compatibility.  Any
>>>>>> > other option chosen such as SandyBridge IBRS SSBD will not allow me
>>>>>> > to switch to 4.3 as all hosts must be in maintenance mode (which is
>>>>>> > not possible w/ self hosted engine).
>>>>>> >
>>>>>> > I saw another post about this where someone else followed steps to
>>>>>> > create a second cluster on 4.3 with new CPU type then move one host
>>>>>> > to it, start engine on it then perform other steps to eventually get
>>>>>> > to 4.3 compatibility.
>>>>>> >
>>>>>>
>>>>>> I have the exact same hardware configuration and was able to change
>>>>>> to "SandyBridge IBRS SSBD" without creating a new cluster.  I'm not
>>>>>> sure how I made that happen, but the cluster may have been in "Global
>>>>>> Maintenance" mode when I changed it.
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Users mailing list -- [email protected]
>>>>>> To unsubscribe send an email to [email protected]
>>>>>> Privacy Statement: https://www.ovirt.org/site/privacy-policy/
>>>>>> oVirt Code of Conduct:
>>>>>> https://www.ovirt.org/community/about/community-guidelines/
>>>>>> List Archives:
>>>>>> https://lists.ovirt.org/archives/list/[email protected]/message/5B3TAXKO7IBTWRVNF2K4II472TDISO6P/
>>>>>>

