On Mon, Jan 26, 2026 at 11:30:17AM +0000, Shinichiro Kawasaki wrote:
> Hello all,
>
> I regularly run fstests with the kernel at the xfs/for-next branch tip to
> validate the zoned block device capability of xfs. Recently, I started
> observing hangs of the test runs with the message:
>
> "rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:"
>
> The hangs occurred in different test cases, and simply rerunning the test
> cases does not reproduce the hang. Running the whole set of fstests test
> cases also fails to reproduce the hang. However, when the whole fstests run
> is repeated, the hang is recreated. The hang looks rare, takes a very long
> time to recreate and is tough to chase down.
>
> To tackle this problem, I would like to seek the expertise of the rcu
> developers. I have attached kernel message logs captured at the hangs for
> analysis [1][2][3]. Any insights or guidance on how to debug this problem
> will be appreciated.
>
Nothing XFS related in these. All the XFS traces are waiting on IO
submission - the block layer below XFS is typically sleeping waiting for
tags to be allocated.

> [1] hang observed on Jan/23/2026
>
> dmesg log file attached: generic_005_hang
> hung test case: generic/005
> kernel: xfs/for-next, 51aba4ca399, v6.19-rc5+
> block device: dm-linear on HDD (non-zoned)
> xfs: zoned

The block device has an expired rq so the timeout work is trying to run
synchronize_rcu():

> [272416.203262][ T167] wait_for_completion_state+0x21/0x40
> [272416.203719][ T167] __wait_rcu_gp+0x1cd/0x410
> [272416.204487][ T167] synchronize_rcu_normal+0x4a8/0x510
> [272416.207632][ T167] blk_mq_timeout_work+0x4aa/0x5d0
> [272416.209324][ T167] process_one_work+0x86b/0x1490

So that's possibly why IO is stuck, i.e. the block device is waiting on the
RCU grace period to expire, and RCU processing has stalled for some reason.
Hence the block device appears to be a victim of the issue, not the cause.

> [2] hang observed on Jan/18/2026
>
> dmesg log file attached: xfs_598_hang
> hung test case: xfs/598
> kernel: Christoph's xfs branch, ec6aea2a5 v6.19-rc1+
> block device: TCMU (non-zoned)
> xfs: non-zoned

Looks like some kind of scheduler/static-key livelock or deadlock. There
are a bunch of tasks all doing stuff like:

> [164582.112175][ C10] on_each_cpu_cond_mask+0x24/0x40
> [164582.112179][ C10] smp_text_poke_batch_finish+0x45c/0xd20
> [164582.112218][ C10] arch_jump_label_transform_apply+0x1c/0x30
> [164582.112224][ C10] static_key_enable_cpuslocked+0x16c/0x230
> [164582.112230][ C10] static_key_enable+0x1f/0x30
> [164582.112235][ C10] process_one_work+0x86b/0x1490

Along with the rcu_preempt thread apparently spinning trying to reschedule:

> [164661.054667][ C12] RIP: 0010:__pv_queued_spin_lock_slowpath+0x232/0xdc0
> [164661.054745][ C12] do_raw_spin_lock+0x1d9/0x270
> [164661.054768][ C12] raw_spin_rq_lock_nested+0x24/0x170
> [164661.054774][ C12] _raw_spin_rq_lock_irqsave+0x41/0x50
> [164661.054778][ C12] resched_cpu+0x62/0xf0
> [164661.054783][ C12] force_qs_rnp+0x67d/0xaa0
> [164661.054799][ C12] rcu_gp_fqs_loop+0x948/0x11b0
> [164661.054841][ C12] rcu_gp_kthread+0x4f2/0x660
> [164661.054876][ C12] kthread+0x3a4/0x760

I can't find anything obvious in the block layer waiting on RCU. However,
XFS is waiting in the block layer on mq tag allocation for submission (like
the 005 hang above) or waiting on journal write IO completion, so the block
layer may well be hung on RCU again.

> [3] hang observed on Jan/14/2026
>
> dmesg log file attached: generic_417_hang
> hung test case: generic/417
> kernel: xfs/for-next, ea44380376c, v6.19-rc1+
> block device: null_blk (non-zoned)
> xfs: zoned

Same static key pattern in on_each_cpu_cond_mask(), and there's also a
bunch of TLB flushes stuck in on_each_cpu_cond_mask(). The rcu_preempt
thread is not waking from:

> [74627.121083][ C2] schedule+0xd1/0x250
> [74627.121959][ C2] schedule_timeout+0x103/0x260
> [74627.128027][ C2] rcu_gp_fqs_loop+0x208/0x11b0
> [74627.135240][ C2] rcu_gp_kthread+0x4f2/0x660

There is nothing XFS or block related in the hung task traces at all.

IOWs, this looks like some kind of RCU/static key/scheduler interaction
which may propagate into the block layer if it needs RCU synchronisation.
Hence it does not appear to have anything to do with the filesystem layers,
and it is possible the block layer is collateral damage, too. Probably best
to hand this over to the core kernel people.
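To make the dependency chain in [1] concrete, here's a rough userspace
model of it. This is plain pthreads, not the actual block layer or RCU
code - synchronize_gp(), gp_thread(), timeout_worker() and tag_free are
all invented names for illustration. The point is only that the timeout
worker can't recycle the expired request until a grace period completes,
and the submitter can't get a tag until the timeout worker has done that,
so if the grace-period thread stalls, everything above it hangs:

/*
 * Rough model of the hang in [1]. Invented names, not kernel code.
 *
 * - gp_thread() plays the role of the rcu_preempt GP kthread: it is the
 *   only thing that can complete a "grace period".
 * - timeout_worker() plays the role of blk_mq_timeout_work(): it must
 *   wait for a grace period before recycling the expired request/tag.
 * - main() plays the role of an XFS IO submitter sleeping on tag
 *   allocation.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static unsigned long gp_seq;            /* completed grace periods */
static bool tag_free;                   /* the single "driver tag" */

/* Analogue of synchronize_rcu(): block until one grace period passes. */
static void synchronize_gp(void)
{
        pthread_mutex_lock(&lock);
        unsigned long start = gp_seq;
        while (gp_seq == start)
                pthread_cond_wait(&cond, &lock);
        pthread_mutex_unlock(&lock);
}

static void *gp_thread(void *arg)
{
        (void)arg;
        for (;;) {                      /* stall here => everything hangs */
                usleep(10000);
                pthread_mutex_lock(&lock);
                gp_seq++;
                pthread_cond_broadcast(&cond);
                pthread_mutex_unlock(&lock);
        }
        return NULL;
}

static void *timeout_worker(void *arg)
{
        (void)arg;
        synchronize_gp();               /* timeout work waits for a GP */
        pthread_mutex_lock(&lock);
        tag_free = true;                /* expired request recycled */
        pthread_cond_broadcast(&cond);
        pthread_mutex_unlock(&lock);
        return NULL;
}

int main(void)
{
        pthread_t gp, tw;

        pthread_create(&gp, NULL, gp_thread, NULL);
        pthread_create(&tw, NULL, timeout_worker, NULL);

        /* "XFS" submitter: sleep until a tag is available. */
        pthread_mutex_lock(&lock);
        while (!tag_free)
                pthread_cond_wait(&cond, &lock);
        pthread_mutex_unlock(&lock);
        printf("tag allocated, IO submitted\n");
        return 0;
}

Build with "cc -pthread" and it completes immediately. If gp_thread() is
changed to stop incrementing gp_seq, it hangs the same way the traces
above do: the submitter sleeps forever on a tag the timeout worker will
never free, even though neither of them is the broken piece.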
-Dave.
-- 
Dave Chinner
[email protected]