Hello everyone, 

I have a replica 2 + arbiter installation. This morning the Hosted Engine 
gave the following error in the UI and resumed on a different node (node3) than 
the one it was originally running on (node1). (The original node has more memory 
than the one it ended up on, but it had a better memory usage percentage at the 
time.) Also, the only way I discovered that the migration had happened and that 
there was an error in Events was because I logged in to the oVirt web interface 
for a routine inspection. Besides that, everything was working properly and still is.

The error that popped is the following:

VM HostedEngine is down with error. Exit message: internal error: qemu 
unexpectedly closed the monitor: 
2020-09-01T06:49:20.749126Z qemu-kvm: warning: All CPU(s) up to maxcpus should 
be described in NUMA config, ability to start up with partial NUMA mappings is 
obsoleted and will be removed in future
2020-09-01T06:49:20.927274Z qemu-kvm: -device 
virtio-blk-pci,iothread=iothread1,scsi=off,bus=pci.0,addr=0x7,drive=drive-ua-d5de54b6-9f8e-4fba-819b-ebf6780757d2,id=ua-d5de54b6-9f8e-4fba-819b-ebf6780757d2,bootindex=1,write-cache=on:
 Failed to get "write" lock
Is another process using the image?.

From what I could gather, this concerns the following snippet from 
HostedEngine.xml, which is the virtio disk of the Hosted Engine:

    <disk type='file' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads' iothread='1'/>
      <source file='/var/run/vdsm/storage/80f6e393-9718-4738-a14a-64cf43c3d8c2/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <target dev='vda' bus='virtio'/>
      <serial>d5de54b6-9f8e-4fba-819b-ebf6780757d2</serial>
      <alias name='ua-d5de54b6-9f8e-4fba-819b-ebf6780757d2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </disk>
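
For reference, this is roughly how I matched the device in the error to the disk above. It is only a minimal Python sketch: the helper names (device_id_from_error, disk_for_alias) are my own and not part of oVirt, vdsm or libvirt, and it assumes the id= value in the qemu message is the same string as the alias name in the domain XML, which is what the two excerpts above show.

```python
import re
import xml.etree.ElementTree as ET

# Disk element copied from the HostedEngine.xml snippet above (PCI address omitted).
DISK_XML = """
<disk type='file' device='disk' snapshot='no'>
  <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads' iothread='1'/>
  <source file='/var/run/vdsm/storage/80f6e393-9718-4738-a14a-64cf43c3d8c2/d5de54b6-9f8e-4fba-819b-ebf6780757d2/a48555f4-be23-4467-8a54-400ae7baf9d7'>
    <seclabel model='dac' relabel='no'/>
  </source>
  <target dev='vda' bus='virtio'/>
  <serial>d5de54b6-9f8e-4fba-819b-ebf6780757d2</serial>
  <alias name='ua-d5de54b6-9f8e-4fba-819b-ebf6780757d2'/>
</disk>
"""

# The -device part of the qemu error quoted above, joined onto one line.
QEMU_ERROR = (
    "qemu-kvm: -device virtio-blk-pci,iothread=iothread1,scsi=off,bus=pci.0,"
    "addr=0x7,drive=drive-ua-d5de54b6-9f8e-4fba-819b-ebf6780757d2,"
    "id=ua-d5de54b6-9f8e-4fba-819b-ebf6780757d2,bootindex=1,write-cache=on: "
    'Failed to get "write" lock'
)


def device_id_from_error(message):
    """Return the value of the id= option in a qemu -device error line, or None."""
    match = re.search(r"[,\s]id=([^,:\s]+)", message)
    return match.group(1) if match else None


def disk_for_alias(xml_text, alias):
    """Return (target dev, source file) of the disk whose alias matches, or None.

    Assumption: the qemu device id equals the libvirt <alias name='...'/>.
    """
    root = ET.fromstring(xml_text)
    # iter() also yields the root element, so a bare <disk> fragment works too.
    for disk in root.iter("disk"):
        alias_el = disk.find("alias")
        if alias_el is None or alias_el.get("name") != alias:
            continue
        target = disk.find("target")
        source = disk.find("source")
        dev = target.get("dev") if target is not None else None
        path = source.get("file") if source is not None else None
        return dev, path
    return None


if __name__ == "__main__":
    device_id = device_id_from_error(QEMU_ERROR)
    print(device_id)
    print(disk_for_alias(DISK_XML, device_id))
```

This only tells me which image file the failed "write" lock refers to. It does not show which process was holding the lock.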

I've tried looking into the logs and the output of the sar command, but I 
couldn't find anything related to the above errors or determine why this 
happened. Is this a Gluster or a QEMU problem?

The Hosted Engine had been manually migrated to node1 five days earlier.

Is there a standard practice I could follow to determine what happened and 
secure my system?

Thank you very much for your time, 
Maria Souvalioti
_______________________________________________
Users mailing list -- users@ovirt.org
To unsubscribe send an email to users-le...@ovirt.org
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/users@ovirt.org/message/HBU4P4E5ECOA6BNNFVLK2Y44ZX5UHYYE/
