Hi Nir,

And the second one is down now too. See my comments below.

> On 13 Mar 2016, at 12:51, Nir Soffer <nsof...@redhat.com> wrote:
> On Sun, Mar 13, 2016 at 9:46 AM, Christophe TREFOIS
> <christophe.tref...@uni.lu> wrote:
>> Dear all,
>> I have a problem since couple of weeks, where randomly 1 VM (not always the 
>> same) becomes completely unresponsive.
>> We find this out because our Icinga server complains that host is down.
>> Upon inspection, we find we can’t open a console to the VM, nor can we login.
>> In oVirt engine, the VM looks like “up”. The only weird thing is that RAM 
>> usage shows 0% and CPU usage shows 100% or 75% depending on number of cores.
>> The only way to recover is to force shutdown the VM via 2-times shutdown 
>> from the engine.
>> Could you please help me to start debugging this?
>> I can provide any logs, but I’m not sure which ones, because I couldn’t see 
>> anything with ERROR in the vdsm logs on the host.
> I would inspect this vm on the host when it happens.
> What is vdsm cpu usage? what is the qemu process (for this vm) cpu usage?

vdsm CPU usage fluctuates, peaking around 15%.

qemu process CPU usage for the VM was 0%, except for one thread “stuck” at 
100%; the rest were idle.
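For reference, the per-thread usage can be checked along these lines (a sketch; 15241 stands in for the qemu PID shown below):

```shell
# List all threads of the qemu process (PID 15241 here), sorted by CPU
# usage. TID = thread ID, %CPU is per thread, WCHAN = kernel wait channel
# (a thread spinning in userspace typically shows "-" there).
ps -L -p 15241 -o tid,pcpu,state,wchan:20,comm --sort=-pcpu

# Or watch it live, one row per thread:
top -H -p 15241
```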

> strace output of this qemu process (all threads) or a core dump can help qemu
> developers to understand this issue.

I attached an strace of the process (all threads) for:

qemu     15241 10.6  0.4 4742904 1934988 ?     Sl   Mar23 131:41 
/usr/libexec/qemu-kvm -name test-ubuntu-uni-lu -S -machine 
pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu SandyBridge -m 
size=4194304k,slots=16,maxmem=4294967296k -realtime mlock=off -smp 
4,maxcpus=64,sockets=16,cores=4,threads=1 -numa node,nodeid=0,cpus=0-3,mem=4096 
-uuid 754871ec-0339-4a65-b490-6a766aaea537 -smbios 
 -no-user-config -nodefaults -chardev 
 -mon chardev=charmonitor,id=monitor,mode=control -rtc 
base=2016-03-23T22:06:01,driftfix=slew -global kvm-pit.lost_tick_policy=discard 
-no-hpet -no-shutdown -boot strict=on -device 
piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device 
virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device 
virtio-serial-pci,id=virtio-serial0,max_ports=16,bus=pci.0,addr=0x5 -drive 
if=none,id=drive-ide0-1-0,readonly=on,format=raw,serial= -device 
ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -drive 
 -netdev tap,fd=108,id=hostnet0,vhost=on,vhostfd=109 -device 
 -device usb-tablet,id=input0 -vnc,password -device 
cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device 
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on
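The trace was captured roughly like this (a sketch, not the exact invocation; -f follows all threads, and gcore is the optional core dump mentioned above):

```shell
# Attach to all threads of the stuck qemu process for ~30 seconds and
# write timestamped syscall activity to a file (Ctrl-C also stops it).
timeout 30 strace -f -tt -p 15241 -o /tmp/qemu-15241.strace

# Optionally grab a core dump for the qemu developers (needs gdb's gcore;
# the process keeps running afterwards).
gcore -o /tmp/qemu-15241.core 15241
```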


This is CentOS 7.2, fully patched, running the latest oVirt 3.6.4.

Thank you for any help / pointers.

Could it be memory ballooning?
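One way to check the balloon state (a sketch; "test-ubuntu-uni-lu" is the VM name from the qemu command line above, and on an oVirt host virsh may require the read-only flag or vdsm credentials):

```shell
# Balloon statistics for the guest: "actual" is the current balloon
# target; compare it against the configured maximum.
virsh -r dommemstat test-ubuntu-uni-lu

# Configured vs. used memory; a large gap suggests the balloon is
# inflated and squeezing the guest.
virsh -r dominfo test-ubuntu-uni-lu | grep -i memory
```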


>> The host is running
>> OS Version:             RHEL - 7 - 1.1503.el7.centos.2.8
>> Kernel Version: 3.10.0 - 229.14.1.el7.x86_64
>> KVM Version:            2.1.2 - 23.el7_1.8.1
>> LIBVIRT Version:        libvirt-1.2.8-16.el7_1.4
>> VDSM Version:   vdsm-4.16.26-0.el7.centos
>> SPICE Version:  0.12.4 - 9.el7_1.3
>> GlusterFS Version:      glusterfs-3.7.5-1.el7
> You are running old versions, missing a lot of fixes. Nothing specific
> to your problem, but this lowers the chance of a working system.
> It would be nice if you could upgrade to ovirt-3.6 and report whether
> it made any difference.
> Or at least the latest ovirt-3.5.
>> We use a locally exported gluster as storage domain (eg, storage is on the 
>> same machine exposed via gluster). No replica.
>> We run around 50 VMs on that host.
> Why use gluster for this? Do you plan to add more gluster servers in the 
> future?
> Nir
