[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-22 Thread Vinícius Ferrão via Users
Hi again Strahil, It’s oVirt 4.3.10. Same CPU on the entire cluster: three machines with Xeon E5-2620v2 (Ivy Bridge), all identical in model and specs. I’ve changed the VM CPU Model to: Nehalem,+spec-ctrl,+ssbd. Let’s see how it behaves. If it crashes again I’ll definitely

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-22 Thread Vinícius Ferrão via Users
Hi Gianluca. On 22 Sep 2020, at 04:24, Gianluca Cecchi <gianluca.cec...@gmail.com> wrote: On Tue, Sep 22, 2020 at 9:12 AM Vinícius Ferrão via Users <users@ovirt.org> wrote: Hi Strahil, yes, I can’t find anything recent either. You dug way further than me. I found some

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-22 Thread Strahil Nikolov via Users
This looks much like my OpenBSD 6.6 under the latest AMD CPUs. KVM did not accept a perfectly valid instruction, and it was a bug in KVM. Maybe you can try to:
- power off the VM
- pick an older CPU type for that VM only
- power on and monitor over the next few days
Do you have a cluster with different cpu

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-22 Thread Gianluca Cecchi
On Tue, Sep 22, 2020 at 9:12 AM Vinícius Ferrão via Users wrote:
> Hi Strahil, yes, I can’t find anything recent either. You dug way
> further than me. I found some regressions in the kernel but I don’t know if
> they’re related or not:
>
> https://patchwork.kernel.org/patch/5526561/
>

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-22 Thread Vinícius Ferrão via Users
Hi Strahil, yes, I can’t find anything recent either. You dug way further than me. I found some regressions in the kernel but I don’t know if they’re related or not: https://patchwork.kernel.org/patch/5526561/ https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1045027 Regarding the OS,

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-21 Thread Strahil Nikolov via Users
Interestingly, I can't find anything recent except this one: https://devblogs.microsoft.com/oldnewthing/20120511-00/?p=7653 Can you check if anything in the OS was updated or changed recently? Also check whether the VM has nested virtualization enabled. Best Regards, Strahil Nikolov On

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-21 Thread Vinícius Ferrão via Users
Strahil, thank you man. We finally got some output: 2020-09-15T12:34:49.362238Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: CPU 10 [socket-id: 10, core-id: 0, thread-id: 0], CPU 11 [socket-id: 11, core-id: 0, thread-id: 0], CPU 12 [socket-id: 12, core-id: 0, thread-id: 0], CPU 13

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-21 Thread Strahil Nikolov via Users
Usually libvirt's log provides hints (though no clear answers) about any issues. For example: /var/log/libvirt/qemu/.log Has anything changed recently (maybe the oVirt version was upgraded)? Best Regards, Strahil Nikolov On Monday, 21 September 2020, 23:28:13 GMT+3, Vinícius Ferrão
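For anyone following the suggestion above, here is a minimal Python sketch of how one might pull warning and error lines out of a per-VM libvirt QEMU log. It assumes the lines follow the timestamped `qemu-kvm: warning: ...` shape quoted elsewhere in this thread. The function names `find_qemu_messages` and `scan_log` are hypothetical, and the log path is left as a parameter because the VM name is not given here. Real logs may carry failures in other formats, so treat this as a first filter only.

```python
import re
from typing import Iterable, List, Tuple

# Assumed line shape, modelled on the qemu-kvm warning quoted in this thread:
#   <ISO timestamp>Z <program>: <severity>: <message>
QEMU_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z)\s+"
    r"(?P<prog>[^\s:]+):\s+"
    r"(?P<severity>warning|error):\s+"
    r"(?P<msg>.*)$"
)


def find_qemu_messages(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (timestamp, severity, message) for each matching log line."""
    hits = []
    for line in lines:
        match = QEMU_LINE.match(line.strip())
        if match:
            hits.append((match["ts"], match["severity"], match["msg"]))
    return hits


def scan_log(path: str) -> List[Tuple[str, str, str]]:
    """Scan one log file; the caller supplies the path (hypothetical helper)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_qemu_messages(handle)
```

Calling `scan_log` with the path of the affected VM's log on the host would list the timestamps to compare against the moments the VM got suspended.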

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-21 Thread Strahil Nikolov via Users
What type of disks are you using? Any chance you use thin disks? Best Regards, Strahil Nikolov On Monday, 21 September 2020, 07:20:23 GMT+3, Vinícius Ferrão via Users wrote: Hi, sorry to bump the thread. But I still have this issue on the VM. These crashes are still

[ovirt-users] Re: How to discover why a VM is getting suspended without recovery possibility?

2020-09-20 Thread Vinícius Ferrão via Users
Hi, sorry to bump the thread. But I still have this issue on the VM. These crashes are still happening, and I really don’t know what to do. Since there’s nothing in the logs, except for that message in `dmesg` on the host machine, I started changing settings to see if anything changes or if I at