bhyve stuck
Hello everybody.

I had a VM doing some intense work on a volume that, on the host, is mapped on an iSCSI volume. After some hours of correct operation the machine hung, displaying:

ahcich0: Timeout on slot 29 port 0
ahcich0: is cs ss f007 rs f007 tfd 50 serr cmd 1000c217
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 e2 b9 7e 40 38 00 00 00 00 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
ahcich0: Timeout on slot 2 port 0
ahcich0: is cs ss 01fc rs 01fc tfd 50 serr cmd 1000c817
(ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e2 4e 3c 40 10 00 00 01 00 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
ahcich0: Timeout on slot 8 port 0
ahcich0: is cs ss 7f00 rs 7f00 tfd 50 serr cmd 1000ce17
(ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e2 4e 3c 40 10 00 00 01 00 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
ahcich0: Timeout on slot 14 port 0
ahcich0: is cs ss 001fc000 rs 001fc000 tfd 50 serr cmd 1000d417
(ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e2 4e 3c 40 10 00 00 01 00 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
Assertion failed: (aior != NULL), function ahci_handle_dma, file /usr/src/usr.sbin/bhyve/pci_ahci.c, line 494.

Now the VM is totally hung. Trying to kill bhyve doesn't work, not even with kill -9. I tried a bhyvectl --destroy and the VM disappeared from /dev/vmm, but I am very uncomfortable about what to do now. The process is still there. Can I restart the VM? Obviously I cannot restart the physical machine.
This is the state of the process:

root@environment-rm-01:/san_storage/VMfs/cloud31Slave # ps -ax | grep 91715
91715  5  T+  1465:19.24 bhyve: cloud31Slave (bhyve)
18037 14  S+     0:00.00 grep 91715

root@environment-rm-01:/san_storage/VMfs/cloud31Slave # procstat -t 91715
  PID    TID COMM   TDNAME        CPU PRI STATE WCHAN
91715 100129 bhyve  mevent          2 120 stop  -
91715 100246 bhyve  blk-2:0         7 121 stop  getblk
91715 100247 bhyve  vtnet-4:0 tx    6 120 stop  -
91715 100248 bhyve  vcpu 0          8 120 stop  -
91715 100249 bhyve  vcpu 1          4 120 stop  -

root@environment-rm-01:/san_storage/VMfs/cloud31Slave # procstat -kk 91715
  PID    TID COMM   TDNAME        KSTACK
91715 100129 bhyve  mevent        mi_switch+0xe1 thread_suspend_check+0x317 ast+0x4f5 doreti_ast+0x1f
91715 100246 bhyve  blk-2:0       mi_switch+0xe1 sleepq_wait+0x3a sleeplk+0x15d __lockmgr_args+0xc9e getblk+0x131 cluster_read+0xd0 ffs_read+0x1a9 VOP_READ_APV+0xa1 vn_read+0x165 vn_io_fault_doio+0x22 vn_io_fault1+0x7c vn_io_fault+0x18b dofileread+0x95 kern_preadv+0x92 sys_preadv+0x3a amd64_syscall+0x351 Xfast_syscall+0xfb
91715 100247 bhyve  vtnet-4:0 tx  mi_switch+0xe1 thread_suspend_check+0x317 ast+0x4f5 doreti_ast+0x1f
91715 100248 bhyve  vcpu 0        mi_switch+0xe1 thread_suspend_check+0x317 ast+0x4f5 doreti_ast+0x1f
91715 100249 bhyve  vcpu 1        mi_switch+0xe1 thread_suspend_switch+0x170 thread_single+0x357 sigexit+0x4e postsig+0x361 ast+0x427 Xfast_syscall+0x160

root@environment-rm-01:/san_storage/VMfs/cloud31Slave # kill -CONT 91715
root@environment-rm-01:/san_storage/VMfs/cloud31Slave # kill -9 91715
root@environment-rm-01:/san_storage/VMfs/cloud31Slave # ps -ax | grep 91715
91715  5  T+  1465:19.24 bhyve: cloud31Slave (bhyve)
18041 14  S+     0:00.00 grep 91715

---
Andrea Brancatelli

___
freebsd-virtualization@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to freebsd-virtualization-unsubscr...@freebsd.org
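For readers following along, the `procstat -t` listing in this message can be scanned mechanically for the thread that is sleeping on a kernel wait channel. What follows is only a minimal sketch, assuming the eight-column layout shown in the message (PID, TID, COMM, TDNAME, CPU, PRI, STATE, WCHAN) and that an empty wait channel is printed as `-`. The helper name `blocked_threads` is hypothetical and not part of any FreeBSD tool. Because TDNAME may contain spaces (for example `vtnet-4:0 tx`), the row is split from both ends rather than by fixed position.

```python
def blocked_threads(procstat_t_output):
    """Return (tid, tdname, wchan) for every thread with a wait channel set."""
    blocked = []
    for line in procstat_t_output.strip().splitlines():
        fields = line.split()
        # Skip the header row and anything too short to be a thread row.
        if len(fields) < 8 or not fields[0].isdigit():
            continue
        tid = int(fields[1])
        # The last four columns are CPU, PRI, STATE, WCHAN; whatever sits
        # between COMM and CPU is the (possibly multi-word) thread name.
        wchan = fields[-1]
        tdname = " ".join(fields[3:-4])
        if wchan != "-":
            blocked.append((tid, tdname, wchan))
    return blocked


# Thread rows copied from the message above.
SAMPLE = """\
  PID    TID COMM   TDNAME        CPU PRI STATE WCHAN
91715 100129 bhyve mevent 2 120 stop -
91715 100246 bhyve blk-2:0 7 121 stop getblk
91715 100247 bhyve vtnet-4:0 tx 6 120 stop -
91715 100248 bhyve vcpu 0 8 120 stop -
91715 100249 bhyve vcpu 1 4 120 stop -
"""

print(blocked_threads(SAMPLE))
```

On the data from this message the only thread reported is `blk-2:0`, waiting on `getblk`, which matches the long kernel stack in the `procstat -kk` output.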
Re: bhyve stuck
Hello Peter.

The host is FreeBSD 10.1-p3.

I tried to restart the VM, but it hung after bhyveload. I had to reboot the physical host, and what's worse is that the MySQL instance inside the VM was trashed. Luckily I had backups.

On 07 Jan 2015, at 18:45, Peter Grehan gre...@freebsd.org wrote:

> Hi Andrea,
>
>> Assertion failed: (aior != NULL), function ahci_handle_dma, file /usr/src/usr.sbin/bhyve/pci_ahci.c, line 494.
>
> Ok - this should result in the bhyve process exiting.
>
>> Now the VM is totally hung. Trying to kill bhyve doesn't work, not even with kill -9. I tried a bhyvectl --destroy and the VM disappeared from /dev/vmm, but I am very uncomfortable about what to do now. The process is still there. Can I restart the VM?
>
> It should be fine to restart after a bhyvectl --destroy
>
>> This is the state of the process:
>> ...
>> 91715 100246 bhyve blk-2:0 7 121 stop getblk
>
> This seems to be the culprit.
>
> What's the version of FreeBSD running on the host? tychon@ did quite a bit of work recently on making the block layer more robust in the face of guest controller timeouts. This made it into CURRENT as of r274330, and was MFCd to 10-STABLE with r276429. That change may help with your issue.
>
> later,
>
> Peter.
Re: bhyve stuck
Hi Andrea,

> Assertion failed: (aior != NULL), function ahci_handle_dma, file /usr/src/usr.sbin/bhyve/pci_ahci.c, line 494.

Ok - this should result in the bhyve process exiting.

> Now the VM is totally hung. Trying to kill bhyve doesn't work, not even with kill -9. I tried a bhyvectl --destroy and the VM disappeared from /dev/vmm, but I am very uncomfortable about what to do now. The process is still there. Can I restart the VM?

It should be fine to restart after a bhyvectl --destroy

> This is the state of the process:
> ...
> 91715 100246 bhyve blk-2:0 7 121 stop getblk

This seems to be the culprit.

What's the version of FreeBSD running on the host? tychon@ did quite a bit of work recently on making the block layer more robust in the face of guest controller timeouts. This made it into CURRENT as of r274330, and was MFCd to 10-STABLE with r276429. That change may help with your issue.

later,

Peter.
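The two revision numbers in this message can be turned into a rough check of whether a given host kernel is new enough to contain the block-layer change. This is only a sketch under stated assumptions: it assumes the kernel was built from a Subversion checkout so that `uname -v` carries an `rNNNNNN` revision after the build number, and that revision numbers are repository-wide, so a 10-STABLE build at or past r276429 (or a CURRENT build at or past r274330) includes the change. The function name `has_block_layer_fix` and the sample strings are made up for illustration; release and patch-level kernels such as 10.1-p3 are reported as unknown because their branch is not covered by either revision.

```python
import re

CURRENT_REVISION = 274330     # r274330, from the message above
STABLE_10_REVISION = 276429   # r276429, the MFC to 10-STABLE


def has_block_layer_fix(uname_v):
    """Return True/False for CURRENT and 10-STABLE kernels, None if unknown."""
    # Assumed shape: "FreeBSD 10.1-STABLE #0 r276500: <date> <builder>"
    m = re.match(r"FreeBSD (\d+)\.\d+-([A-Z]+)\S*\s.*?\br(\d+)\b", uname_v)
    if m is None:
        return None  # no revision in the string, nothing to compare
    major, branch, rev = int(m.group(1)), m.group(2), int(m.group(3))
    if branch == "CURRENT":
        return rev >= CURRENT_REVISION
    if branch == "STABLE" and major == 10:
        return rev >= STABLE_10_REVISION
    return None  # RELEASE and other branches: cannot tell from the revision


# Hypothetical uname -v strings, for illustration only.
print(has_block_layer_fix("FreeBSD 10.1-STABLE #0 r276500: Thu Jan  1 00:00:00 UTC 2015 root@host:/usr/obj/usr/src/sys/GENERIC"))
print(has_block_layer_fix("FreeBSD 10.1-RELEASE-p3 #0: Tue Dec 30 00:00:00 UTC 2014 root@host:/usr/obj/usr/src/sys/GENERIC"))
```

A `None` result means the string gives no basis for a decision, not that the fix is absent.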