bhyve stuck

Andrea Brancatelli Wed, 07 Jan 2015 01:23:56 -0800

 
Hello everybody.

I had a VM doing some intense work on a volume that, on the host, is 
mapped to an iSCSI volume.


After some hours of correct operation the machine hung, displaying:

ahcich0: Timeout on slot 29 port 0  
ahcich0: is 00000000 cs 00000000 ss f0000007 rs f0000007 tfd 50 serr 00000000 
cmd 1000c217
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 e2 b9 7e 40 38 00 00 00 00 
00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
ahcich0: Timeout on slot 2 port 0
ahcich0: is 00000000 cs 00000000 ss 000001fc rs 000001fc tfd 50 serr 00000000 
cmd 1000c817
(ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e2 4e 3c 40 10 00 00 01 00 
00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
ahcich0: Timeout on slot 8 port 0
ahcich0: is 00000000 cs 00000000 ss 00007f00 rs 00007f00 tfd 50 serr 00000000 
cmd 1000ce17
(ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e2 4e 3c 40 10 00 00 01 00 
00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
ahcich0: Timeout on slot 14 port 0
ahcich0: is 00000000 cs 00000000 ss 001fc000 rs 001fc000 tfd 50 serr 00000000 
cmd 1000d417
(ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e2 4e 3c 40 10 00 00 01 00 
00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
Assertion failed: (aior != NULL), function ahci_handle_dma, file 
/usr/src/usr.sbin/bhyve/pci_ahci.c, line 494.


Now the VM is completely hung. Trying to kill bhyve doesn’t work, not even kill 
-9. I tried a bhyvectl --destroy and the VM disappeared from /dev/vmm, 
but I am very unsure what to do now. The process is still 
there. Can I restart the VM?

Obviously I cannot restart the physical machine.  

This is the state of the process:  

root@environment-rm-01:/san_storage/VMfs/cloud31Slave # ps -ax | grep 91715  
91715 5 T+ 1465:19.24 bhyve: cloud31Slave (bhyve)
18037 14 S+ 0:00.00 grep 91715

root@environment-rm-01:/san_storage/VMfs/cloud31Slave # procstat -t 91715
PID TID COMM TDNAME CPU PRI STATE WCHAN
91715 100129 bhyve mevent 2 120 stop -
91715 100246 bhyve blk-2:0 7 121 stop getblk
91715 100247 bhyve vtnet-4:0 tx 6 120 stop -
91715 100248 bhyve vcpu 0 8 120 stop -
91715 100249 bhyve vcpu 1 4 120 stop -
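
To make the stuck thread easier to spot in the thread listing above, the output can be filtered for threads that are sleeping on a wait channel. This is only a minimal sketch: it assumes the last column of procstat -t is WCHAN and that "-" means no wait channel, as in the listing above. The function name blocked_threads is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `procstat -t` output on stdin and prints
# "TID WCHAN" for every thread that is sleeping on a wait channel.
# The thread name (TDNAME) may contain spaces (e.g. "vcpu 0"), so only
# the second field (TID) and the last field (WCHAN) are used.
blocked_threads() {
    awk 'NR > 1 && $NF != "-" { print $2, $NF }'
}

# Usage on the host, with the PID from the session above:
#   procstat -t 91715 | blocked_threads
```

For the listing above this would print only the blk-2:0 thread (TID 100246, wait channel getblk), the same thread whose kernel stack shows it sleeping in getblk.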

root@environment-rm-01:/san_storage/VMfs/cloud31Slave # procstat -kk 91715
PID TID COMM TDNAME KSTACK
91715 100129 bhyve mevent mi_switch+0xe1 thread_suspend_check+0x317 ast+0x4f5 
doreti_ast+0x1f
91715 100246 bhyve blk-2:0 mi_switch+0xe1 sleepq_wait+0x3a sleeplk+0x15d 
__lockmgr_args+0xc9e getblk+0x131 cluster_read+0xd0 ffs_read+0x1a9 
VOP_READ_APV+0xa1 vn_read+0x165 vn_io_fault_doio+0x22 vn_io_fault1+0x7c 
vn_io_fault+0x18b dofileread+0x95 kern_preadv+0x92 sys_preadv+0x3a 
amd64_syscall+0x351 Xfast_syscall+0xfb
91715 100247 bhyve vtnet-4:0 tx mi_switch+0xe1 thread_suspend_check+0x317 
ast+0x4f5 doreti_ast+0x1f
91715 100248 bhyve vcpu 0 mi_switch+0xe1 thread_suspend_check+0x317 ast+0x4f5 
doreti_ast+0x1f
91715 100249 bhyve vcpu 1 mi_switch+0xe1 thread_suspend_switch+0x170 
thread_single+0x357 sigexit+0x4e postsig+0x361 ast+0x427 Xfast_syscall+0x160

root@environment-rm-01:/san_storage/VMfs/cloud31Slave # kill -CONT 91715
root@environment-rm-01:/san_storage/VMfs/cloud31Slave # kill -9 91715
root@environment-rm-01:/san_storage/VMfs/cloud31Slave # ps -ax | grep 91715
91715 5 T+ 1465:19.24 bhyve: cloud31Slave (bhyve)
18041 14 S+ 0:00.00 grep 91715



-------  
Andrea Brancatelli  


_______________________________________________
freebsd-virtualization@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization
To unsubscribe, send any mail to 
"freebsd-virtualization-unsubscr...@freebsd.org"
