On Thu, 11 Sep 2025 15:33:15 +0200
Vladimir Riabchun <ferr.lambargi...@gmail.com> wrote:

> A soft lockup was observed when loading amdgpu module,

I'd like to see more about that soft lockup.

> this is the same issue that was fixed in
> commit d0b24b4e91fc ("ftrace: Prevent RCU stall on PREEMPT_VOLUNTARY
> kernels") and commit 42ea22e754ba ("ftrace: Add cond_resched() to
> ftrace_graph_set_hash()").

The above cond_resched() is in the loop of all records that actually look
at all records! And that can be pretty big. On my server, it shows on boot:


[    1.934175] ftrace: allocating 45706 entries in 180 pages
[    1.934177] ftrace: allocated 180 pages with 4 groups

That means the loop will go through 45,706 entries. That's quite a lot and
a cond_resched() makes perfect sense.

> 
> Fix it the same way by adding cond_resched() in ftrace_module_enable.
> 
> Signed-off-by: Vladimir Riabchun <ferr.lambargi...@gmail.com>
> ---
>  kernel/trace/ftrace.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index a69067367c29..23c4d37c7bcd 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -7526,6 +7526,9 @@ void ftrace_module_enable(struct module *mod)
>  
>       do_for_each_ftrace_rec(pg, rec) {
>               int cnt;
> +
> +             cond_resched();
> +
>               /*
>                * do_for_each_ftrace_rec() is a double loop.
>                * module text shares the pg. If a record is

This loop is different. Let me show a bit more context:

        do_for_each_ftrace_rec(pg, rec) {
                int cnt;
                /*
                 * do_for_each_ftrace_rec() is a double loop.
                 * module text shares the pg. If a record is
                 * not part of this module, then skip this pg,
                 * which the "break" will do.
                 */
                if (!within_module(rec->ip, mod))
                        break;

See this "if (!within_module(rec->ip, mod))" break?

Look at the dmesg output again, and you'll see "groups" mentioned.

[    1.934177] ftrace: allocated 180 pages with 4 groups

That "4 groups" means there are 4 "page groups". That's the "pg" in the
do_for_each_ftrace_recr() function.

This means in my scenario, it loops 4 times. And then it will loop through
each module.

How big is the amdgpu driver? How many functions does it have?

 # grep amdgpu /sys/kernel/tracing/available_filter_functions | wc -l

And I'm guessing that this is only an issue when ftrace is enabled:

                if (ftrace_start_up && cnt) {
                        int failed = __ftrace_replace_code(rec, 1);
                        if (failed) {
                                ftrace_bug(failed, rec);
                                goto out_loop;
                        }
                }

As that could slow things down.

If this is all the case, then the cond_resched() should be with the
ftrace_start_up code and not in the open like you have it.

                if (ftrace_start_up && cnt) {
                        int failed = __ftrace_replace_code(rec, 1);
                        if (failed) {
                                ftrace_bug(failed, rec);
                                goto out_loop;
                        }
+                       cond_resched();
                }


-- Steve

Reply via email to