On 6/3/26 6:22 PM, Naveen Kumar Chaudhary wrote:
> read_file_mod_stats() traverses dup_failed_modules with
> list_for_each_entry_rcu() while holding module_mutex, but does not pass
> the lockdep condition. This triggers a false-positive "RCU-list traversed
> in non-reader section" warning with CONFIG_PROVE_RCU_LIST=y.
>
> The warning can be reproduced by:
> 1. Racing two loads of the same module to populate dup_failed_modules:
> insmod dummy.ko &
> insmod dummy.ko &
> 2. Reading the stats debugfs file:
> cat /sys/kernel/debug/modules/stats
>
> =============================
> WARNING: suspicious RCU usage
> 7.1.0-rc5-gae12a56ba16a #1 Not tainted
> -----------------------------
> kernel/module/stats.c:385 RCU-list traversed in non-reader section!!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by cat/128:
> #0: ffff80008288f7a8 (module_mutex){+.+.}-{4:4}, at: read_file_mod_stats+0x46c/0x5ec
>
> stack backtrace:
> CPU: 1 UID: 0 PID: 128 Comm: cat Kdump: loaded Not tainted 7.1.0-rc5-gae12a56ba16a #1 PREEMPT
> Hardware name: linux,dummy-virt (DT)
> Call trace:
> show_stack+0x18/0x24 (C)
> __dump_stack+0x28/0x38
> dump_stack_lvl+0x64/0x84
> dump_stack+0x18/0x24
> lockdep_rcu_suspicious+0x134/0x1d4
> read_file_mod_stats+0x554/0x5ec
> full_proxy_read+0xe0/0x1ac
> vfs_read+0xd8/0x2b0
> ksys_read+0x70/0xe4
> __arm64_sys_read+0x1c/0x28
> invoke_syscall+0x48/0xf8
> el0_svc_common+0x8c/0xd8
> do_el0_svc+0x1c/0x28
> el0_svc+0x58/0x1d8
> el0t_64_sync_handler+0x84/0x12c
> el0t_64_sync+0x198/0x19c
>
> The traversal is protected by module_mutex, so pass
> lockdep_is_held(&module_mutex) to inform the checker.
>
> Signed-off-by: Naveen Kumar Chaudhary <[email protected]>
> ---
> kernel/module/stats.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/module/stats.c b/kernel/module/stats.c
> index 3a9672f93a8e..a62961acd8e3 100644
> --- a/kernel/module/stats.c
> +++ b/kernel/module/stats.c
> @@ -382,7 +382,8 @@ static ssize_t read_file_mod_stats(struct file *file, char __user *user_buf,
> mutex_lock(&module_mutex);
>
>
> - list_for_each_entry_rcu(mod_fail, &dup_failed_modules, list) {
> + list_for_each_entry_rcu(mod_fail, &dup_failed_modules, list,
> + lockdep_is_held(&module_mutex)) {
> if (WARN_ON_ONCE(++count_failed >= MAX_FAILED_MOD_PRINT))
> goto out_unlock;
> len += scnprintf(buf + len, size - len, "%25s\t%15lu\t%25s\n",
> mod_fail->name,

I mentioned in my reply to your previous patch fixing the use of
synchronize_rcu() in the module dups code [1] that the overall RCU usage
in this code appears to be incorrect. I also noticed that Sashiko
reported the same issue. I don't think it is productive to fix these
RCU-related problems one at a time; the code should instead be properly
reworked. It most likely should not use RCU at all, and kmod_dup_req
should be reference-counted instead.
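
To illustrate the direction, here is a minimal userspace sketch of a
reference-counted request object. It is not the kernel code: struct
dup_req, its fields, and the dup_req_*() helpers are hypothetical
stand-ins for kmod_dup_req, and C11 atomics are used only so the example
compiles outside the kernel. An actual rework would presumably use
refcount_t or struct kref instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for kmod_dup_req; the fields are illustrative. */
struct dup_req {
	atomic_int refs;	/* number of holders, including the list */
	char name[64];
};

/* Count of live objects, kept only so the sketch can be checked. */
static atomic_int live_reqs;

/* Allocate a request holding one initial reference (the list's). */
static struct dup_req *dup_req_new(const char *name)
{
	struct dup_req *req = calloc(1, sizeof(*req));

	if (!req)
		return NULL;
	atomic_init(&req->refs, 1);
	strncpy(req->name, name, sizeof(req->name) - 1);
	atomic_fetch_add(&live_reqs, 1);
	return req;
}

/* Take an extra reference; the caller must already hold one. */
static struct dup_req *dup_req_get(struct dup_req *req)
{
	atomic_fetch_add(&req->refs, 1);
	return req;
}

/* Drop a reference; returns 1 if this was the last one and req was freed. */
static int dup_req_put(struct dup_req *req)
{
	int old = atomic_fetch_sub(&req->refs, 1);

	assert(old > 0);	/* catch a put without a matching get */
	if (old != 1)
		return 0;
	free(req);
	atomic_fetch_sub(&live_reqs, 1);
	return 1;
}
```

With this scheme a waiter takes a reference before sleeping and drops it
when done, so the object stays valid for as long as anyone uses it and
no grace period is needed before freeing it.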

[1] https://lore.kernel.org/linux-modules/ajydyxgaea27rhcopp5eauji24znotu65d2b4uw344yvmwcc6f@7l5re6f2xcuk/

--
Thanks,
Petr