Hi,

I have some screenshots; they aren't complete, as the console shows about 4 crashes in a few seconds:

ftp://ftp.binovo.es/elacunza/migration-crash/Captura%20de%20pantalla%20de%202018-02-02%2009-33-24.png
ftp://ftp.binovo.es/elacunza/migration-crash/Captura%20de%20pantalla%20de%202018-02-02%2009-56-29.png
ftp://ftp.binovo.es/elacunza/migration-crash/Captura%20de%20pantalla%20de%202018-02-02%2009-57-05.png
ftp://ftp.binovo.es/elacunza/migration-crash/Captura%20de%20pantalla%20de%202018-02-02%2009-57-27.png

Crashes don't get logged to syslog/debug/dmesg.
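
In case it helps capture the full panic text next time: a minimal sketch (the vmid 100 is hypothetical) that attaches a virtual serial port to the VM and points the guest's console at it, so the panic output survives even when syslog doesn't:

# qm set 100 -serial0 socket
(inside the guest, in /etc/default/grub:)
GRUB_CMDLINE_LINUX_DEFAULT="quiet console=tty0 console=ttyS0,115200n8"
# update-grub && reboot
(then watch from the host:)
# qm terminal 100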

Cheers

On 02/02/18 at 10:42, Gilberto Nunes wrote:
Hi
I think it would be nice if you could send us the kernel panic message or
even the dmesg output.
Do you have any modules that were compiled by hand on this system?
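
A quick way to check that, as a sketch for a standard Debian guest:

# cat /proc/sys/kernel/tainted   (non-zero usually means out-of-tree or proprietary modules)
# lsmod
# dmesg | grep -i taint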


On Fri, Feb 2, 2018 at 07:14, Eneko Lacunza <elacu...@binovo.es> wrote:

Hi all,

We have replaced an old node in our office Proxmox 5.1 cluster with a
Ryzen 7 1700 machine with 64GB of non-ECC RAM, just moving the disks from
the old Intel server to the new AMD machine. So far so good: everything
booted OK, the Ceph OSD started OK after adjusting the network, and the
replacement went really smoothly.

But we have found _one_ Debian 9 VM that kernel panics shortly after
migrating between the Intel nodes and the AMD node. Sometimes it is a
matter of seconds, sometimes it takes some minutes or, rarely, even one
or two hours.

The strange thing is that we have done that kind of migration with other
VMs (several Windows VMs with different versions, another CentOS VM, a
Debian 8 VM) and they work perfectly.

If we restart this problematic VM after the migration+crash, it works
flawlessly (no more crashes until it is migrated to the other CPU maker
again). Migration between Intel CPUs (with ECC memory) works OK too. We
don't have a second AMD machine to test migration between AMD nodes.
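
One thing that might be worth comparing, as a sketch: the CPU flags the
guest sees on each side, since kvm64 should mask vendor-specific features
but something may still differ between the Intel and AMD hosts:

(inside the guest, before and after migration:)
# grep -m1 flags /proc/cpuinfo
(and on each host, to see what the nodes themselves offer:)
# grep -m1 flags /proc/cpuinfo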

The VM has 1 socket / 2 cores of type kvm64, 3GB of RAM, Standard VGA, a
CD-ROM at IDE2, a VirtIO SCSI controller, scsi0 8GB on ceph-rbd, scsi1
50GB on ceph-rbd, a VirtIO network card, OS type Linux 4.x, hotplug for
Disk/Network/USB, ACPI support yes, BIOS SeaBIOS, KVM hardware virt. yes,
QEMU agent no. We have tried with virtio-block too.
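
For reference, that should correspond roughly to a
/etc/pve/qemu-server/<vmid>.conf like the following sketch (the vmid,
MAC address, storage name and disk names are assumptions):

acpi: 1
agent: 0
bios: seabios
bootdisk: scsi0
cores: 2
cpu: kvm64
hotplug: disk,network,usb
ide2: none,media=cdrom
memory: 3072
net0: virtio=AA:BB:CC:DD:EE:FF,bridge=vmbr0
ostype: l26
scsi0: ceph-rbd:vm-100-disk-1,size=8G
scsi1: ceph-rbd:vm-100-disk-2,size=50G
scsihw: virtio-scsi-pci
sockets: 1
vga: std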

# pveversion -v
proxmox-ve: 5.1-35 (running kernel: 4.13.13-4-pve)
pve-manager: 5.1-42 (running version: 5.1-42/724a6cb3)
pve-kernel-4.4.83-1-pve: 4.4.83-96
pve-kernel-4.13.4-1-pve: 4.13.4-26
pve-kernel-4.4.76-1-pve: 4.4.76-94
pve-kernel-4.13.13-4-pve: 4.13.13-35
pve-kernel-4.4.67-1-pve: 4.4.67-92
libpve-http-server-perl: 2.0-8
lvm2: 2.02.168-pve6
corosync: 2.4.2-pve3
libqb0: 1.0.1-1
pve-cluster: 5.0-19
qemu-server: 5.0-19
pve-firmware: 2.0-3
libpve-common-perl: 5.0-25
libpve-guest-common-perl: 2.0-14
libpve-access-control: 5.0-7
libpve-storage-perl: 5.0-17
pve-libspice-server1: 0.12.8-3
vncterm: 1.5-3
pve-docs: 5.1-16
pve-qemu-kvm: 2.9.1-5
pve-container: 2.0-18
pve-firewall: 3.0-5
pve-ha-manager: 2.0-4
ksm-control-daemon: 1.2-2
glusterfs-client: 3.8.8-1
lxc-pve: 2.1.1-2
lxcfs: 2.0.8-1
criu: 2.11.1-1~bpo90
novnc-pve: 0.6-4
smartmontools: 6.5+svn4324-1
zfsutils-linux: 0.7.3-pve1~bpo9
ceph: 12.2.2-1~bpo90+1

Any ideas? This is a production VM, but it isn't critical, so we can
play with it. We can also live with the problem, but I think it could be
of interest to try to debug it.
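
Since we can play with it, one option would be to set up kdump inside
the guest so the next panic leaves a crash dump behind (a sketch for
Debian 9; the crashkernel size is an assumption):

# apt-get install kdump-tools
(add the reservation to /etc/default/grub:)
GRUB_CMDLINE_LINUX_DEFAULT="... crashkernel=256M"
# update-grub && reboot
# kdump-config status   (dumps should land in /var/crash)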

Thanks a lot
Eneko

--
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943569206
Astigarraga bidea 2, 2º izq. oficina 11; 20180 Oiartzun (Gipuzkoa)
www.binovo.es

_______________________________________________
pve-user mailing list
pve-user@pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
