On Mon, Jan 26, 2026 at 11:30:17AM +0000, Shinichiro Kawasaki wrote:
> Hello all,
> 
> I regularly run fstests with the kernel at xfs/for-next branch tip to validate
> the capability of zoned block device capability of xfs. Recently, I started
> observing hangs of the test runs with the message:
> 
>   "rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:"
> 
> The hangs occurred in different test cases, and simply rerunning the test
> cases does not reproduce the hang. When I ran the whole fstests test cases, it
> also fails to reproduce the hang. However, when the whole fstests is repeated
> the hang is recreated. The hang looks rare, takes very long time to recreate
> and is tough to chase down.
> 
> To tackle this problem, I would like to seek the expertise of rcu developers.
> I have attached kernel message logs captured at the hangs for analysis
> [1][2][3]. Any insights or guidance on how to debug this problem will be
> appreciated.
> 

Nothing XFS related in these. All the XFS traces are waiting on IO
submission - the block layer below XFS is typically sleeping waiting
for tags to be allocated.

> [1] hang observed on Jan/23/2026
> 
>      dmesg log file attached: generic_005_hang
>      hanged test case: generic/005
>      kernel: xfs/for-next, 51aba4ca399, v6.19-rc5+
>      block device: dm-linear on HDD (non-zoned)
>      xfs: zoned

The block device has an expired rq so the timeout work is trying to
run synchronize_rcu():

> [272416.203262][  T167]  wait_for_completion_state+0x21/0x40
> [272416.203719][  T167]  __wait_rcu_gp+0x1cd/0x410
> [272416.204487][  T167]  synchronize_rcu_normal+0x4a8/0x510
> [272416.207632][  T167]  blk_mq_timeout_work+0x4aa/0x5d0
> [272416.209324][  T167]  process_one_work+0x86b/0x1490

So that's possibly why IO is stuck. i.e. the block device is waiting
on the RCU grace period to expire, and RCU processing has stalled
for some reason. Hence the block device appears to be a victim of
the issue, not the cause.

> [2] hang observed on Jan/18/2026
> 
>      dmesg log file attached: xfs_598_hang
>      hanged test case: xfs/598
>      kernel: Christophs' xfs branch, ec6aea2a5 v6.19-rc1+
>      block device: TCMU (non-zoned)
>      xfs: non-zoned

Looks like some kind of scheduler/static-key livelock or deadlock.
There are a bunch of tasks all doing stuff like:

> [164582.112175][   C10]  on_each_cpu_cond_mask+0x24/0x40
> [164582.112179][   C10]  smp_text_poke_batch_finish+0x45c/0xd20
> [164582.112218][   C10]  arch_jump_label_transform_apply+0x1c/0x30
> [164582.112224][   C10]  static_key_enable_cpuslocked+0x16c/0x230
> [164582.112230][   C10]  static_key_enable+0x1f/0x30
> [164582.112235][   C10]  process_one_work+0x86b/0x1490

Along with the rcu_preempt thread apparently spinning trying to
reschedule:

> [164661.054667][   C12] RIP: 0010:__pv_queued_spin_lock_slowpath+0x232/0xdc0
> [164661.054745][   C12]  do_raw_spin_lock+0x1d9/0x270
> [164661.054768][   C12]  raw_spin_rq_lock_nested+0x24/0x170
> [164661.054774][   C12]  _raw_spin_rq_lock_irqsave+0x41/0x50
> [164661.054778][   C12]  resched_cpu+0x62/0xf0
> [164661.054783][   C12]  force_qs_rnp+0x67d/0xaa0
> [164661.054799][   C12]  rcu_gp_fqs_loop+0x948/0x11b0
> [164661.054841][   C12]  rcu_gp_kthread+0x4f2/0x660
> [164661.054876][   C12]  kthread+0x3a4/0x760

I can't find anything obvious in the block layer waiting on RCU.
However, XFS is waiting in the block layer on mq tag allocation for
submission (like the 005 hang above) or waiting on journal write IO
completion, so the block layer may well be hung on RCU again.

> [3] hang observed on Jan/14/2026
> 
>      dmesg log file attached: generic_417_hang
>      hanged test case: generic/417
>      kernel: xfs/for-next, ea44380376c, v6.19-rc1+
>      block device: null_blk (non-zoned)
>      xfs: zoned

Same static key pattern in on_each_cpu_cond_mask(), and there's also a
bunch of TLB flushes stuck in on_each_cpu_cond_mask(). The rcu_preempt
thread is not waking from:

> [74627.121083][    C2]  schedule+0xd1/0x250
> [74627.121959][    C2]  schedule_timeout+0x103/0x260
> [74627.128027][    C2]  rcu_gp_fqs_loop+0x208/0x11b0
> [74627.135240][    C2]  rcu_gp_kthread+0x4f2/0x660

There is nothing XFS or block related in the hung task traces
at all.

IOWs, this looks like some kind of RCU/static key/scheduler
interaction which may propagate into the block layer if it needs RCU
synchronisation. Hence it does not appear to have anything to do
with the filesystem layers, and it is possible the block layer is
collateral damage, too.
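
For anyone wanting to compare the three logs quickly, here is a rough
sketch of how the frames called out above could be tallied per dmesg
file. It is only an illustration: the helper name count_frames() and the
list of symbols are my own choices taken from the traces quoted in this
mail, and the regex assumes the "[timestamp][ cpu/task]  symbol+0xoff/0xlen"
line layout seen in these logs.

```python
import re
from collections import Counter

# Symbols taken from the stack traces quoted in this mail (illustrative list,
# not an exhaustive set of RCU/static-key/scheduler entry points).
FRAMES_OF_INTEREST = (
    "synchronize_rcu_normal",
    "blk_mq_timeout_work",
    "on_each_cpu_cond_mask",
    "smp_text_poke_batch_finish",
    "static_key_enable",
    "resched_cpu",
    "rcu_gp_fqs_loop",
)

# Matches a stack frame line such as
#   "[272416.204487][  T167]  synchronize_rcu_normal+0x4a8/0x510"
# and captures the symbol name that precedes "+0x". An optional "? " marker
# (unreliable frame) in front of the symbol is tolerated.
FRAME_RE = re.compile(
    r"\]\s+(?:\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def count_frames(lines):
    """Count how often each symbol of interest appears as a stack frame."""
    counts = Counter()
    for line in lines:
        match = FRAME_RE.search(line)
        # Only exact symbol matches are counted, so e.g.
        # static_key_enable_cpuslocked is not counted as static_key_enable.
        if match and match.group(1) in FRAMES_OF_INTEREST:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of each attached log (for example via
open(path).readlines()) gives a per-symbol count that makes it easier
to see which of the three hangs share the on_each_cpu_cond_mask()
pattern and which only show the synchronize_rcu() wait.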

Probably best to hand this over to the core kernel ppl.

-Dave.

-- 
Dave Chinner
[email protected]
