Debugging
https://jenkins.ovirt.org/job/ovirt-system-tests_he-basic-suite-4.3/427

In engine.log, at 2020-04-28 23:48:18,378-04, I see:

SetVdsStatusVDSCommandParameters:{
  hostId='b34db269-5351-4653-9a0c-90a9154cd687',
  status='NonOperational',
  nonOperationalReason='STORAGE_DOMAIN_UNREACHABLE',
  stopSpmFailureLogged='false',
  maintenanceReason='null'}


So, when the test tries to put host1 into local maintenance at 2020-04-28
23:59:51, it fails with:

Validation of action 'MaintenanceNumberOfVdss' failed for user
admin@internal-authz. Reasons:
VAR__TYPE__HOST,VAR__ACTION__MAINTENANCE,VDS_CANNOT_MAINTENANCE_NO_ALTERNATE_HOST_FOR_HOSTED_ENGINE
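
One way to line up these two engine.log events is to pull the fields out of the SetVdsStatusVDSCommandParameters block and measure the gap to the failed maintenance attempt. The following is a minimal Python sketch, not part of OST or the engine. The names parse_set_vds_status and seconds_between, and the SAMPLE string (the excerpt above without its timestamp prefix), are illustrative assumptions.

```python
import re
from datetime import datetime

# Excerpt quoted above; a real engine.log line carries more context around it.
SAMPLE = """SetVdsStatusVDSCommandParameters:{
  hostId='b34db269-5351-4653-9a0c-90a9154cd687',
  status='NonOperational',
  nonOperationalReason='STORAGE_DOMAIN_UNREACHABLE',
  stopSpmFailureLogged='false',
  maintenanceReason='null'}"""

# Matches the parameters block whether it is on one line or wrapped.
BLOCK_RE = re.compile(r"SetVdsStatusVDSCommandParameters:\{(.*?)\}", re.DOTALL)
# Matches key='value' pairs inside the block.
FIELD_RE = re.compile(r"(\w+)='([^']*)'")


def parse_set_vds_status(text):
    """Return one dict of key/value pairs per parameters block found in text."""
    return [dict(FIELD_RE.findall(body)) for body in BLOCK_RE.findall(text)]


def seconds_between(start, end, fmt="%Y-%m-%d %H:%M:%S"):
    """Seconds elapsed between two timestamps that share the same UTC offset."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


params = parse_set_vds_status(SAMPLE)[0]
# Timestamps taken from this mail: NonOperational event, then maintenance attempt.
gap = seconds_between("2020-04-28 23:48:18", "2020-04-28 23:59:51")
print(params["status"], params["nonOperationalReason"], gap)
```

With the timestamps from this mail the gap is 693 seconds, so the host with that hostId had been NonOperational for about eleven and a half minutes when the test requested maintenance. If that hostId belongs to host0, this is consistent with the NO_ALTERNATE_HOST_FOR_HOSTED_ENGINE validation error.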

vdsm.log on host0 shows a traceback:

2020-04-28 23:43:04,944-0400 ERROR (jsonrpc/0) [vds] setKsmTune API call failed. (API:1660)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1657, in setKsmTune
    supervdsm.getProxy().ksmTune(tuningParams)
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 56, in __call__
    return callMethod()
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 54, in <lambda>
    **kwargs)
  File "<string>", line 2, in ksmTune
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
IOError: [Errno 22] Invalid argument

This seems unrelated, but it may be worth investigating by the storage
team. +Tal Nisan <[email protected]> can you look into this?



Closer to the time of the failure, on host0 I see:

2020-04-28 23:49:58,775-0400 ERROR (vm/b6ca2e94) [virt.vm] (vmId='b6ca2e94-df8b-48e9-b0ee-2bc0f939786a') The vm start process failed (vm:934)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 868, in _startUnderlyingVm
    self._run()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2895, in _run
    dom.createWithFlags(flags)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1110, in createWithFlags
    if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirtError: internal error: qemu unexpectedly closed the monitor: 2020-04-29T03:49:55.484660Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future
2020-04-29T03:49:55.582536Z qemu-kvm: -device virtio-blk-pci,iothread=iothread1,scsi=off,bus=pci.0,addr=0x7,drive=drive-ua-cfb2266f-5d47-4418-b30f-9c1d3fbf512c,id=ua-cfb2266f-5d47-4418-b30f-9c1d3fbf512c,bootindex=1,write-cache=on: Failed to get shared "write" lock
Is another process using the image [/var/run/vdsm/storage/fc1a55d5-deb4-4423-be56-e7313645798b/cfb2266f-5d47-4418-b30f-9c1d3fbf512c/68d04a61-9f34-4a1b-8d6e-bca43a7b9339]?
2020-04-28 23:49:58,775-0400 INFO  (vm/b6ca2e94) [virt.vm] (vmId='b6ca2e94-df8b-48e9-b0ee-2bc0f939786a') Changed state to Down: internal error: qemu unexpectedly closed the monitor: 2020-04-29T03:49:55.484660Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config, ability to start up with partial NUMA mappings is obsoleted and will be removed in future
2020-04-29T03:49:55.582536Z qemu-kvm: -device virtio-blk-pci,iothread=iothread1,scsi=off,bus=pci.0,addr=0x7,drive=drive-ua-cfb2266f-5d47-4418-b30f-9c1d3fbf512c,id=ua-cfb2266f-5d47-4418-b30f-9c1d3fbf512c,bootindex=1,write-cache=on: Failed to get shared "write" lock
Is another process using the image [/var/run/vdsm/storage/fc1a55d5-deb4-4423-be56-e7313645798b/cfb2266f-5d47-4418-b30f-9c1d3fbf512c/68d04a61-9f34-4a1b-8d6e-bca43a7b9339]? (code=1) (vm:1702)
2020-04-28 23:49:58,799-0400 INFO  (vm/b6ca2e94) [virt.vm] (vmId='b6ca2e94-df8b-48e9-b0ee-2bc0f939786a') Stopping connection (guestagent:455)
2020-04-28 23:49:58,849-0400 INFO  (jsonrpc/1) [api.virt] START destroy(gracefulAttempts=1) from=::ffff:192.168.200.99,49938, vmId=b6ca2e94-df8b-48e9-b0ee-2bc0f939786a (api:48)
2020-04-28 23:49:58,851-0400 INFO  (jsonrpc/1) [virt.vm] (vmId='b6ca2e94-df8b-48e9-b0ee-2bc0f939786a') Release VM resources (vm:5186)
2020-04-28 23:49:58,851-0400 WARN  (jsonrpc/1) [virt.vm] (vmId='b6ca2e94-df8b-48e9-b0ee-2bc0f939786a') trying to set state to Powering down when already Down (vm:626)
2020-04-28 23:49:58,851-0400 INFO  (jsonrpc/1) [virt.vm] (vmId='b6ca2e94-df8b-48e9-b0ee-2bc0f939786a') Stopping connection (guestagent:455)
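
When correlating these lines, note that the qemu-kvm messages are stamped in UTC ("Z") while the vdsm lines use a -0400 offset. The following minimal Python sketch converts the qemu timestamp to the vdsm offset and splits the locked image path into its components. The helper names are hypothetical, and the labels given to the three path components (storage domain, image, volume) are an assumption about the vdsm run-directory layout, not something stated in the log.

```python
from datetime import datetime, timedelta, timezone

# Offset used by the vdsm log lines above ("-0400").
VDSM_TZ = timezone(timedelta(hours=-4))

# Values copied from the log excerpt above.
QEMU_STAMP = "2020-04-29T03:49:55.582536Z"
DRIVE_ALIAS = "ua-cfb2266f-5d47-4418-b30f-9c1d3fbf512c"
IMAGE_PATH = (
    "/var/run/vdsm/storage/fc1a55d5-deb4-4423-be56-e7313645798b/"
    "cfb2266f-5d47-4418-b30f-9c1d3fbf512c/"
    "68d04a61-9f34-4a1b-8d6e-bca43a7b9339"
)


def qemu_to_vdsm_time(stamp):
    """Convert a qemu UTC timestamp to the offset used in the vdsm log."""
    utc = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return utc.replace(tzinfo=timezone.utc).astimezone(VDSM_TZ)


def split_image_path(path):
    """Split the last three path components into labelled UUIDs.

    Assumption: the layout is .../<storage domain>/<image>/<volume>.
    """
    sd, image, volume = path.rstrip("/").split("/")[-3:]
    return {"storage_domain": sd, "image": image, "volume": volume}


local = qemu_to_vdsm_time(QEMU_STAMP)
ids = split_image_path(IMAGE_PATH)
print(local.isoformat())            # qemu error time in the vdsm offset
print(ids["image"] in DRIVE_ALIAS)  # does the drive alias name the same image?
```

The converted time is 2020-04-28T23:49:55.582536-04:00, roughly three seconds before vdsm logged the failed start at 23:49:58,775. The middle path component also matches the UUID in the drive alias, so the lock failure concerns the VM's boot disk (bootindex=1).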

+Ryan Barry <[email protected]> can you check the qemu-kvm warning?


Help understanding why the storage domain became unreachable is welcome.

-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV

Red Hat EMEA <https://www.redhat.com/>

[email protected]