I've been testing a CGD + LVM combination in a NetBSD 10.1 bhyve VM, and
so far every single test reliably hangs the kernel. The same test with
LVM only (i.e. no CGD) passes OK.

The test involves intensive I/O to a single 40 GiB file, using variable
block sizes between 1 KiB and 32 KiB, written at random offsets. The
user-space threads performing the I/O become stuck in the tstile state:

crash> bt/a ffffe9b7ecf61480
trace: pid 4855 lid 4685 at 0xffffc68279585c70
sleepq_block() at sleepq_block+0x13a
turnstile_block() at turnstile_block+0x2bf
rw_vector_enter() at rw_vector_enter+0x14b
genfs_lock() at genfs_lock+0x80
VOP_LOCK() at VOP_LOCK+0xb3
vn_write() at vn_write+0xa0
dofilewrite() at dofilewrite+0x80
sys_pwrite() at sys_pwrite+0x95
syscall() at syscall+0x1fc
--- syscall (number 174) ---
syscall+0x1fc:

The threads seem to be waiting for the ioflush thread, which is itself
sleeping on plpg:

PID     LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
0       385 3   3       240   ffffe9b7f5a0d980            ioflush plpg


One of the cgd threads is also waiting on something:

crash> ps
PID     LID S CPU     FLAGS       STRUCT LWP *               NAME WAIT
0       549 3   1       600   ffffe9b7f4aeab80              cgd/1 cgd

crash> bt/a ffffe9b7f4aeab80
trace: pid 0 lid 549 at 0xffffc682793f6f30
sleepq_block() at sleepq_block+0x13a
cv_wait() at cv_wait+0xb7
workqueue_worker() at workqueue_worker+0x124

The poweroff command also hangs and the VM must be power cycled.
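
For anyone who wants to try something similar, the I/O pattern described
above can be roughly sketched as below. This is not the actual test
program, only a minimal illustration under assumptions: the names
random_pwrite_load() and next_rand() are hypothetical, the PRNG is a plain
xorshift picked for brevity, and the number of threads and writes is left
to the caller. Each call issues pwrite(2) calls of a random length between
1 KiB and 32 KiB at random offsets inside the file.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define MIN_BLOCK 1024u           /* 1 KiB, smallest write in the test */
#define MAX_BLOCK (32u * 1024u)   /* 32 KiB, largest write in the test */

/* Hypothetical helper: xorshift64 step, so each thread can own its state. */
uint64_t next_rand(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/*
 * Hypothetical load generator: issue `count` writes of random length in
 * [MIN_BLOCK, MAX_BLOCK] at random offsets, keeping every write inside the
 * first `file_size` bytes of `path`. Returns 0 on success, -1 on error.
 * A short write is not treated as an error in this sketch.
 */
int random_pwrite_load(const char *path, off_t file_size,
                       unsigned long count, uint64_t seed)
{
    uint64_t state = seed ? seed : 0x9E3779B97F4A7C15ULL; /* must be nonzero */
    int rc = 0;

    if (file_size < (off_t)MAX_BLOCK)
        return -1;                      /* file too small for the largest block */

    char *buf = malloc(MAX_BLOCK);
    if (buf == NULL)
        return -1;
    memset(buf, 0xA5, MAX_BLOCK);       /* arbitrary fill pattern */

    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    for (unsigned long i = 0; i < count; i++) {
        size_t len = MIN_BLOCK +
            (size_t)(next_rand(&state) % (MAX_BLOCK - MIN_BLOCK + 1));
        assert(len >= MIN_BLOCK && len <= MAX_BLOCK);

        /* Choose an offset such that offset + len <= file_size. */
        uint64_t span = (uint64_t)(file_size - (off_t)len) + 1;
        off_t offset = (off_t)(next_rand(&state) % span);

        if (pwrite(fd, buf, len, offset) < 0) {
            rc = -1;
            break;
        }
    }

    close(fd);
    free(buf);
    return rc;
}
```

Something like random_pwrite_load("/mnt/test/bigfile", (off_t)40 << 30,
1000000, seed) run from several threads at once, each with its own seed,
would approximate the load; the path and counts here are placeholders.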
