We observed instability on the RT kernel during our upgrade and rollback tests
on 6.6, 6.12 and mainline.

The issue is related to dm_exception_table_lock(&lock), which calls
preempt_disable() twice.
If the code between dm_exception_table_lock(&lock) and
dm_exception_table_unlock(&lock) takes an rt_spin_lock, which can sleep on RT,
it triggers a splat such as
"BUG: scheduling while atomic: kworker/u72:11/349/0x00000003",
because the preempt count is 3 at that point.
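
To make the failure mode concrete, here is a minimal user-space sketch of the
pattern as I understand it. It is only a model, not kernel code: every name
prefixed with model_ is hypothetical, and the counter only stands in for the
real preempt count. It models just the two preempt_disable() calls made by the
lock function and does not try to reproduce the exact value 3 seen in the trace.

```c
/* User-space model only. None of these functions exist in the kernel. */
#include <assert.h>
#include <stdio.h>

/* Stands in for the per-task preempt count. */
int preempt_count_model;

void model_preempt_disable(void)
{
	preempt_count_model++;
}

void model_preempt_enable(void)
{
	assert(preempt_count_model > 0);
	preempt_count_model--;
}

/* Models dm_exception_table_lock(): preemption is disabled twice. */
void model_table_lock(void)
{
	model_preempt_disable();
	model_preempt_disable();
}

/* Models dm_exception_table_unlock(): both disables are undone. */
void model_table_unlock(void)
{
	model_preempt_enable();
	model_preempt_enable();
}

/*
 * Models a sleeping lock such as rt_spin_lock() on RT.
 * Returns 0 if blocking would be allowed, -1 if called in atomic context.
 */
int model_sleeping_lock(void)
{
	if (preempt_count_model != 0) {
		printf("model: scheduling while atomic, preempt count %d\n",
		       preempt_count_model);
		return -1;
	}
	return 0;
}

/* Models a critical section that takes a sleeping lock under the table lock. */
int model_critical_section(void)
{
	int ret;

	model_table_lock();
	ret = model_sleeping_lock();	/* count is non-zero here */
	model_table_unlock();
	return ret;
}
```

In this model, model_critical_section() returns -1 because the sleeping lock
is attempted while the count is non-zero, whereas model_sleeping_lock() called
outside the table lock returns 0.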

Several places in dm-snap.c have the same issue, such as
dm_add_exception(), pending_complete() and snapshot_map().

Do we need to reimplement dm_exception_table_lock()?
Any suggestions or assistance would be appreciated.

[  862.410151] Kernel panic - not syncing: scheduling while atomic: panic_on_warn set ...
[  862.580196] CPU: 2 UID: 0 PID: 349 Comm: kworker/u72:11 Kdump: loaded Tainted: G           O       6.12.0-1-rt-amd64 #1  Debian 6.12.40-1.stx.130
[  862.593223] Tainted: [O]=OOT_MODULE
[  862.596714] Hardware name: Dell Inc. PowerEdge R740xd/00WGD1, BIOS 2.24.0 03/27/2025
[  862.604453] Workqueue: writeback wb_workfn (flush-253:21)
[  862.609852] Call Trace:
[  862.612306]  <TASK>
[  862.614411]  panic+0x34a/0x370
[  862.617470]  check_panic_on_warn+0x50/0x50
[  862.621569]  __schedule_bug+0x4d/0x60
[  862.625236]  __schedule+0xa0c/0xbb0
[  862.628729]  schedule_rtlock+0x1a/0x30
[  862.632481]  rtlock_slowlock_locked+0x20b/0xcc0
[  862.637014]  rt_spin_lock+0x40/0x60
[  862.640506]  __insert_pending_exception+0x4e/0xe0 [dm_snapshot]
[  862.646424]  __origin_write+0x2fb/0x360 [dm_snapshot]
[  862.651477]  do_origin+0xd5/0xe0 [dm_snapshot]
[  862.655923]  __map_bio+0x17c/0x1b0 [dm_mod]
[  862.660117]  dm_submit_bio+0x1ad/0x5a0 [dm_mod]
[  862.664649]  __submit_bio+0x144/0x240
[  862.668315]  ? __submit_bio+0xc1/0x240
[  862.672067]  submit_bio_noacct_nocheck+0x19a/0x3c0
[  862.676860]  iomap_submit_ioend+0x42/0x80
[  862.680873]  iomap_writepages+0x5f8/0x8d0
[  862.684886]  xfs_vm_writepages+0x62/0x90 [xfs]
[  862.689473]  do_writepages+0xcc/0x240
[  862.693136]  __writeback_single_inode+0x41/0x330
[  862.697756]  writeback_sb_inodes+0x21c/0x4d0
[  862.702028]  wb_writeback+0x7c/0x2f0
[  862.705607]  wb_workfn+0xc1/0x450
[  862.708926]  process_one_work+0x179/0x390
[  862.712940]  worker_thread+0x237/0x340
[  862.716691]  ? __pfx_worker_thread+0x10/0x10
[  862.720964]  kthread+0xc6/0x100
[  862.724111]  ? __pfx_kthread+0x10/0x10
[  862.727863]  ret_from_fork+0x2d/0x50
[  862.731441]  ? __pfx_kthread+0x10/0x10
[  862.735193]  ret_from_fork_asm+0x1a/0x30
[  862.739122]  </TASK>

and

[   36.563812] BUG: scheduling while atomic: lvm/1380/0x00000003
......
[   36.563841] CPU: 32 PID: 1380 Comm: lvm Tainted: G           O       6.6.0-1-rt-amd64 #1  Debian 6.6.71-1.stx.104
[   36.563844] Hardware name: ZTSYSTEMS Galene EI/Galene, BIOS 1.01 12/07/2023
[   36.563845] Call Trace:
[   36.563848]  <TASK>
[   36.563849]  dump_stack_lvl+0x37/0x50
[   36.563855]  __schedule_bug+0x52/0x60
[   36.563859]  __schedule+0x87d/0xb10
[   36.563861]  ? update_load_avg+0x7e/0x750
[   36.563865]  schedule_rtlock+0x1f/0x40
[   36.563866]  rtlock_slowlock_locked+0x232/0xd40
[   36.563870]  ? __set_cpus_allowed_ptr+0x55/0xa0
[   36.563873]  ? dm_add_exception+0xb4/0xf0 [dm_snapshot]
[   36.563879]  rt_spin_lock+0x45/0x60
[   36.563881]  kmem_cache_free+0x182/0x480
[   36.563884]  dm_add_exception+0xb4/0xf0 [dm_snapshot]
[   36.563889]  persistent_read_metadata+0x29d/0x550 [dm_snapshot]
[   36.563895]  ? __pfx_dm_add_exception+0x10/0x10 [dm_snapshot]
[   36.563900]  snapshot_ctr+0x60b/0x8f0 [dm_snapshot]
[   36.563905]  dm_table_add_target+0x246/0x3b0 [dm_mod]
[   36.563919]  table_load+0x136/0x4b0 [dm_mod]
[   36.563930]  ? __pfx_table_load+0x10/0x10 [dm_mod]
[   36.563940]  ctl_ioctl+0x1b3/0x500 [dm_mod]
[   36.563950]  dm_ctl_ioctl+0xe/0x20 [dm_mod]
[   36.563960]  __x64_sys_ioctl+0x8f/0xd0
[   36.563964]  do_syscall_64+0x58/0xb0
[   36.563967]  ? dm_ctl_ioctl+0xe/0x20 [dm_mod]
[   36.563976]  ? __ct_user_enter+0x2f/0xd0
[   36.563978]  ? syscall_exit_to_user_mode+0x32/0x40
[   36.563980]  ? do_syscall_64+0x65/0xb0
[   36.563983]  ? exit_to_user_mode_prepare+0xa9/0x190
[   36.563985]  ? __ct_user_enter+0x2f/0xd0
[   36.563987]  ? syscall_exit_to_user_mode+0x32/0x40

Thanks,
Jiping
