Hi all,

I saw some strange behavior yesterday on a new cluster. A Windows 2008 guest was suddenly powered off, and I could not find any clue in the logs as to why. It is a VM that was migrated from a physical machine over the weekend, and it had been working fine the whole time since.
I first thought it had something to do with the file system (the disks are on Gluster), but there is no clue there either: no errors were reported at that time.

The server is also a Gluster server and has a dedicated bond for Gluster, a dedicated bond for the cluster, and a dedicated bond for the LAN, all with balance-alb. The Ethernet interfaces for cluster and Gluster have jumbo frames enabled (also enabled on the switches).

It is a dual E5-2630 server with 128 GB RAM. It is not overallocated, and the server load is very low. The VM has many CPUs assigned but is not using much of them right now.

The VM went down around 16:20 (local time).

Here is some information:

proxmox-ve: 4.4-79 (running kernel: 4.4.35-2-pve)
pve-manager: 4.4-12 (running version: 4.4-12/e71b7a74)
pve-kernel-4.4.35-1-pve: 4.4.35-77
pve-kernel-4.4.35-2-pve: 4.4.35-79
lvm2: 2.02.116-pve3
corosync-pve: 2.4.0-1
libqb0: 1.0-1
pve-cluster: 4.0-48
qemu-server: 4.0-108
pve-firmware: 1.1-10
libpve-common-perl: 4.0-91
libpve-access-control: 4.0-23
libpve-storage-perl: 4.0-73
pve-libspice-server1: 0.12.8-1
vncterm: 1.2-1
pve-docs: 4.4-3
pve-qemu-kvm: 2.7.1-1
pve-container: 1.0-93
pve-firewall: 2.0-33
pve-ha-manager: 1.0-40
ksm-control-daemon: 1.2-1
glusterfs-client: 3.8.8-1
lxc-pve: 2.0.7-1
lxcfs: 2.0.6-pve1
criu: 1.6.0-1
novnc-pve: 0.5-8
smartmontools: 6.5+svn4324-1~pve80
zfsutils: 0.6.5.8-pve14~bpo80

These are the only entries in the syslog around that time:

Feb 13 16:18:35 srvpve1 systemd-timesyncd[3187]: interval/delta/delay/jitter/drift 2048s/+0.003s/0.037s/0.022s/+5ppm
Feb 13 16:20:08 srvpve1 kernel: [492671.687688] vmbr0: port 3(tap101i0) entered disabled state
Feb 13 16:20:08 srvpve1 kernel: [492671.689734] vmbr0: port 3(tap101i0) entered disabled state
Feb 13 16:25:37 srvpve1 pvedaemon[101918]: <root@pam> successful auth for user 'root@pam'

I have checked basically all the logs but found no clue as to what happened.
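The log check described above can be narrowed to a time window with a short script. What follows is only a minimal sketch in Python, assuming the classic syslog timestamp prefix ("Mon DD HH:MM:SS") seen in the entries above. The function name lines_near, the window size, and the year 2017 are assumptions made for illustration, since the log lines themselves carry no year.

```python
from datetime import datetime, timedelta


def lines_near(lines, center, minutes=5):
    """Return the syslog lines stamped within +/- `minutes` of `center`.

    Assumes each line starts with 'Mon DD HH:MM:SS'. The year is missing
    from that prefix, so it is borrowed from `center`.
    """
    window = timedelta(minutes=minutes)
    hits = []
    for line in lines:
        stamp = line[:15]  # e.g. 'Feb 13 16:20:08'
        try:
            when = datetime.strptime(f"{center.year} {stamp}",
                                     "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # no parsable timestamp, skip the line
        if abs(when - center) <= window:
            hits.append(line.rstrip("\n"))
    return hits


# Three of the entries quoted in the post, used as sample input.
sample = [
    "Feb 13 16:18:35 srvpve1 systemd-timesyncd[3187]: "
    "interval/delta/delay/jitter/drift 2048s/+0.003s/0.037s/0.022s/+5ppm",
    "Feb 13 16:20:08 srvpve1 kernel: [492671.687688] vmbr0: "
    "port 3(tap101i0) entered disabled state",
    "Feb 13 16:25:37 srvpve1 pvedaemon[101918]: <root@pam> "
    "successful auth for user 'root@pam'",
]

# The year is an assumption; the post only gives 'Feb 13, around 16:20'.
center = datetime(2017, 2, 13, 16, 20)

for hit in lines_near(sample, center, minutes=1):
    print(hit)
```

On a real host the same function could be fed the lines of each log file in turn (for example /var/log/syslog, assuming the default Debian location), which makes it easier to compare what every log recorded around the moment the tap interface went down.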
Also, the Windows event viewer does not report anything, except that at the next boot (which I had to start manually) it reported that Windows had not been shut down cleanly. The installed virtio disk driver is 0.1.126.

The VM is NOT configured for HA. Right now it is the only VM that is heavily used; the other VMs see little use, but they did not stop, and they all use the same GlusterFS file system.

This is the VM configuration:

boot: cdn
bootdisk: virtio0
cores: 8
ide2: none,media=cdrom
memory: 65536
name: scavb
net0: e1000=1C:C1:DE:E9:2A:06,bridge=vmbr0
numa: 0
onboot: 1
ostype: win7
scsihw: virtio-scsi-pci
smbios1: uuid=90477e47-4357-42f8-9c1d-797d4111a604
sockets: 2
virtio0: datastore1:101/vm-101-disk-1.qcow2,size=500G
virtio1: datastore1:101/vm-101-disk-2.qcow2,size=900G

Any hints on how to proceed? If any other information is needed for debugging, please let me know.

Alessandro