I can confirm that this worked.  I had to shut down every single VM, change
ownership of the image files to vdsm:kvm, and then start the VMs back up.
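
For anyone hitting the same thing, this is roughly the fix in script form, a
minimal sketch assuming a single affected storage domain (the UUIDs below are
from my logs; adjust them to yours).  It is just the Python equivalent of
running chown -R vdsm:kvm on the domain's images directory:

import grp
import os
import pwd

# uid/gid that vdsm expects on disk image files
uid = pwd.getpwnam('vdsm').pw_uid
gid = grp.getgrnam('kvm').gr_gid

# images directory of the affected storage domain (example path from my logs)
images = ('/rhev/data-center/a45e442e-9989-11e8-b0e4-00163e4bf18a/'
          '1f2e9989-9ab3-43d5-971d-568b8feca918/images')

# fix ownership of every directory and image file under it
for root, dirs, names in os.walk(images):
    for entry in dirs + names:
        os.chown(os.path.join(root, entry), uid, gid)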

On Wed, Feb 13, 2019 at 3:08 PM Simone Tiraboschi <[email protected]>
wrote:

>
>
> On Wed, Feb 13, 2019 at 8:06 PM Jayme <[email protected]> wrote:
>
>>
>> I might be hitting this bug:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1666795
>>
>
> Yes, you definitely are.
> Fixing file ownership on the file system side is a valid workaround.
>
>
>>
>> On Wed, Feb 13, 2019 at 1:35 PM Jayme <[email protected]> wrote:
>>
>>> This may be happening because I changed the cluster compatibility to 4.3
>>> and then immediately changed the data center compatibility to 4.3 (before
>>> restarting the VMs after the cluster compatibility change).  If that is
>>> the case, I can't fix it by downgrading the data center compatibility to
>>> 4.2, as it won't allow me to do so.  What can I do to fix this?  Any VM I
>>> restart will break (I am leaving the others running for now, but there
>>> are some down that I can't start).
>>>
>>> Full error from VDSM:
>>>
>>> 2019-02-13 13:30:55,465-0400 ERROR (vm/d070ce80)
>>> [storage.TaskManager.Task] (Task='d5c8e50a-0a6f-4fe7-be79-fd322b273a1e')
>>> Unexpected error (task:875)
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line
>>> 882, in _run
>>>     return fn(*args, **kargs)
>>>   File "<string>", line 2, in prepareImage
>>>   File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50,
>>> in method
>>>     ret = func(*args, **kwargs)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line
>>> 3198, in prepareImage
>>>     legality = dom.produceVolume(imgUUID, volUUID).getLegality()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 818,
>>> in produceVolume
>>>     volUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/glusterVolume.py",
>>> line 45, in __init__
>>>     volUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>>> 800, in __init__
>>>     self._manifest = self.manifestClass(repoPath, sdUUID, imgUUID,
>>> volUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
>>> line 71, in __init__
>>>     volUUID)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>>> 86, in __init__
>>>     self.validate()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/volume.py", line
>>> 112, in validate
>>>     self.validateVolumePath()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/storage/fileVolume.py",
>>> line 131, in validateVolumePath
>>>     raise se.VolumeDoesNotExist(self.volUUID)
>>> VolumeDoesNotExist: Volume does not exist:
>>> (u'2d6d5f87-ccb0-48ce-b3ac-84495bd12d32',)
>>> 2019-02-13 13:30:55,468-0400 ERROR (vm/d070ce80) [storage.Dispatcher]
>>> FINISH prepareImage error=Volume does not exist:
>>> (u'2d6d5f87-ccb0-48ce-b3ac-84495bd12d32',) (dispatcher:81)
>>> 2019-02-13 13:30:55,469-0400 ERROR (vm/d070ce80) [virt.vm]
>>> (vmId='d070ce80-e0bc-489d-8ee0-47d5926d5ae2') The vm start process failed
>>> (vm:937)
>>> Traceback (most recent call last):
>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 866, in
>>> _startUnderlyingVm
>>>     self._run()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2749, in
>>> _run
>>>     self._devices = self._make_devices()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2589, in
>>> _make_devices
>>>     disk_objs = self._perform_host_local_adjustment()
>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2662, in
>>> _perform_host_local_adjustment
>>>     self._preparePathsForDrives(disk_params)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1011, in
>>> _preparePathsForDrives
>>>     drive['path'] = self.cif.prepareVolumePath(drive, self.id)
>>>   File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 415, in
>>> prepareVolumePath
>>>     raise vm.VolumeError(drive)
>>> VolumeError: Bad volume specification {'address': {'function': '0x0',
>>> 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'slot': '0x06'},
>>> 'serial': 'd81a6826-dc46-44db-8de7-405d30e44d57', 'index': 0, 'iface':
>>> 'virtio', 'apparentsize': '64293699584', 'specParams': {}, 'cache': 'none',
>>> 'imageID': 'd81a6826-dc46-44db-8de7-405d30e44d57', 'truesize':
>>> '64293814272', 'type': 'disk', 'domainID':
>>> '1f2e9989-9ab3-43d5-971d-568b8feca918', 'reqsize': '0', 'format': 'cow',
>>> 'poolID': 'a45e442e-9989-11e8-b0e4-00163e4bf18a', 'device': 'disk', 'path':
>>> '/rhev/data-center/a45e442e-9989-11e8-b0e4-00163e4bf18a/1f2e9989-9ab3-43d5-971d-568b8feca918/images/d81a6826-dc46-44db-8de7-405d30e44d57/2d6d5f87-ccb0-48ce-b3ac-84495bd12d32',
>>> 'propagateErrors': 'off', 'name': 'vda', 'bootOrder': '1', 'volumeID':
>>> '2d6d5f87-ccb0-48ce-b3ac-84495bd12d32', 'diskType': 'file', 'alias':
>>> 'ua-d81a6826-dc46-44db-8de7-405d30e44d57', 'discard': False}
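>>>
>>> A quick way to tell whether the volume file is really missing or merely
>>> not readable by vdsm (prepareImage reports VolumeDoesNotExist either
>>> way): stat the path from the error above on the host, as root.  A small
>>> sketch, untested as written:
>>>
>>> import grp
>>> import os
>>> import pwd
>>>
>>> # volume path copied from the VolumeError above
>>> path = ('/rhev/data-center/a45e442e-9989-11e8-b0e4-00163e4bf18a/'
>>>         '1f2e9989-9ab3-43d5-971d-568b8feca918/images/'
>>>         'd81a6826-dc46-44db-8de7-405d30e44d57/'
>>>         '2d6d5f87-ccb0-48ce-b3ac-84495bd12d32')
>>>
>>> st = os.stat(path)  # raises OSError if the file really is missing
>>> print('owned by %s:%s (want vdsm:kvm)' % (
>>>     pwd.getpwuid(st.st_uid).pw_name, grp.getgrgid(st.st_gid).gr_name))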
>>>
>>> On Wed, Feb 13, 2019 at 1:19 PM Jayme <[email protected]> wrote:
>>>
>>>> I may have made matters worse.  I changed the cluster to 4.3
>>>> compatibility and then the data center to 4.3 compatibility.  All VMs
>>>> were marked as requiring a reboot.  I restarted a couple of them and
>>>> none of them will start up; they fail with "bad volume specification".
>>>> The ones I have not yet restarted are still running fine.  I need to
>>>> figure out why the VMs aren't restarting.
>>>>
>>>> Here is an example from vdsm.log
>>>>
>>>> VolumeError: Bad volume specification {'address': {'function': '0x0',
>>>> 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'slot': '0x06'},
>>>> 'serial': 'd81a6826-dc46-44db-8de7-405d30e44d57', 'index': 0, 'iface':
>>>> 'virtio', 'apparentsize': '64293699584', 'specParams': {}, 'cache': 'none',
>>>> 'imageID': 'd81a6826-dc46-44db-8de7-405d30e44d57', 'truesize':
>>>> '64293814272', 'type': 'disk', 'domainID':
>>>> '1f2e9989-9ab3-43d5-971d-568b8feca918', 'reqsize': '0', 'format': 'cow',
>>>> 'poolID': 'a45e442e-9989-11e8-b0e4-00163e4bf18a', 'device': 'disk', 'path':
>>>> '/rhev/data-center/a45e442e-9989-11e8-b0e4-00163e4bf18a/1f2e9989-9ab3-43d5-971d-568b8feca918/images/d81a6826-dc46-44db-8de7-405d30e44d57/2d6d5f87-ccb0-48ce-b3ac-84495bd12d32',
>>>> 'propagateErrors': 'off', 'name': 'vda', 'bootOrder': '1', 'volumeID':
>>>> '2d6d5f87-ccb0-48ce-b3ac-84495bd12d32', 'diskType': 'file', 'alias':
>>>> 'ua-d81a6826-dc46-44db-8de7-405d30e44d57', 'discard': False}
>>>>
>>>> On Wed, Feb 13, 2019 at 1:01 PM Jayme <[email protected]> wrote:
>>>>
>>>>> I think I just figured out what I was doing wrong.  On the edit
>>>>> cluster screen I was changing both the CPU type and the cluster level
>>>>> to 4.3 at once.  I tried again by switching to the new CPU type first
>>>>> (leaving the cluster on 4.2) and saving, then going back in and
>>>>> switching the compatibility level to 4.3.  It appears that you need to
>>>>> do this in two steps for it to work.
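>>>>>
>>>>> The same two-step change can also be scripted with the Python SDK.  A
>>>>> rough sketch, assuming ovirtsdk4 is installed and using placeholder
>>>>> engine URL, credentials, and cluster ID (I have not run this exact
>>>>> code):
>>>>>
>>>>> import ovirtsdk4 as sdk
>>>>> import ovirtsdk4.types as types
>>>>>
>>>>> connection = sdk.Connection(
>>>>>     url='https://engine.example.com/ovirt-engine/api',
>>>>>     username='admin@internal',
>>>>>     password='...',
>>>>>     ca_file='ca.pem',
>>>>> )
>>>>> cluster = (connection.system_service().clusters_service()
>>>>>            .cluster_service('CLUSTER-UUID'))
>>>>>
>>>>> # Step 1: change only the CPU type, leaving the cluster level at 4.2
>>>>> cluster.update(types.Cluster(
>>>>>     cpu=types.Cpu(type='Intel SandyBridge IBRS SSBD Family')))
>>>>>
>>>>> # Step 2: raise the compatibility level in a separate update
>>>>> cluster.update(types.Cluster(version=types.Version(major=4, minor=3)))
>>>>>
>>>>> connection.close()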
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Feb 13, 2019 at 12:57 PM Jayme <[email protected]> wrote:
>>>>>
>>>>>> Hmm, interesting.  I wonder how you were able to switch from
>>>>>> SandyBridge IBRS to SandyBridge IBRS SSBD.  I just attempted the same
>>>>>> in both regular mode and in global maintenance mode and it won't allow
>>>>>> me to; it says that all hosts have to be in maintenance mode
>>>>>> (screenshots attached).  Are you also running an HCI/Gluster setup?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 13, 2019 at 12:44 PM Ron Jerome <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> > Environment setup:
>>>>>>> >
>>>>>>> > 3-host HCI GlusterFS setup.  Identical hosts, Dell R720s w/ Intel
>>>>>>> > E5-2690 CPUs
>>>>>>> >
>>>>>>> > 1 default data center (4.2 compat)
>>>>>>> > 1 default cluster (4.2 compat)
>>>>>>> >
>>>>>>> > Situation: I recently upgraded my three-node HCI cluster from oVirt
>>>>>>> > 4.2 to 4.3.  I did so by first updating the engine to 4.3, then
>>>>>>> > upgrading each ovirt-node host to 4.3 and rebooting.
>>>>>>> >
>>>>>>> > Currently the engine and all hosts are running 4.3 and everything
>>>>>>> > is working fine.
>>>>>>> >
>>>>>>> > To complete the upgrade I need to update the cluster compatibility
>>>>>>> > to 4.3 and then the data center to 4.3.  This is where I am stuck.
>>>>>>> >
>>>>>>> > The CPU type on the cluster is "Intel SandyBridge IBRS Family".
>>>>>>> > This option is no longer available if I select 4.3 compatibility.
>>>>>>> > Any other option chosen, such as SandyBridge IBRS SSBD, will not
>>>>>>> > allow me to switch to 4.3, as all hosts must be in maintenance mode
>>>>>>> > (which is not possible with a self-hosted engine).
>>>>>>> >
>>>>>>> > I saw another post about this where someone else followed steps to
>>>>>>> > create a second cluster on 4.3 with the new CPU type, move one host
>>>>>>> > to it, start the engine on it, and then perform other steps to
>>>>>>> > eventually reach 4.3 compatibility.
>>>>>>> >
>>>>>>>
>>>>>>> I have the exact same hardware configuration and was able to change
>>>>>>> to "SandyBridge IBRS SSBD" without creating a new cluster.  How I
>>>>>>> made that happen, I'm not so sure, but the cluster may have been in
>>>>>>> "Global Maintenance" mode when I changed it.
>>>>>>>
>>>>>>>
>
_______________________________________________
Users mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/site/privacy-policy/
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/BPWTQIUEFH2IU3R6NQOGJDEWHVJWT3BP/
