On Wed, 20 Aug 2025 09:00:17 +0000
Tengda Wu <wuten...@huaweicloud.com> wrote:

Hi Tengda!

> When calling ftrace_dump_one() concurrently with reading trace_pipe,
> a WARN_ON_ONCE() in trace_printk_seq() can be triggered due to a race
> condition.
> 
> The issue occurs because:
> 
> CPU0 (ftrace_dump)                              CPU1 (reader)
> echo z > /proc/sysrq-trigger
> 
> !trace_empty(&iter)
> trace_iterator_reset(&iter) <- len = size = 0
>                                                 cat 
> /sys/kernel/tracing/trace_pipe
> trace_find_next_entry_inc(&iter)
>   __find_next_entry
>     ring_buffer_empty_cpu <- all empty
>   return NULL
> 
> trace_printk_seq(&iter.seq)
>   WARN_ON_ONCE(s->seq.len >= s->seq.size)

Thanks for debugging this code. I do appreciate it.

> 
> In the context between trace_empty() and trace_find_next_entry_inc()
> during ftrace_dump, the ring buffer data was consumed by other readers.
> This caused trace_find_next_entry_inc to return NULL, failing to populate
> `iter.seq`. At this point, due to the prior trace_iterator_reset, both
> `iter.seq.len` and `iter.seq.size` were set to 0. Since they are equal,
> the WARN_ON_ONCE condition is triggered.
> 
> Add a non-NULL check on the return value of trace_find_next_entry_inc
> prior to invoking trace_printk_seq, ensuring the `iter.seq` is properly
> populated before subsequent operations.
> 


> Furthermore, per the seq_buf specification, the condition len == size
> indicates a full buffer, which constitutes a valid state. Consequently,
> the equality check and size - 1 adjustment in WARN_ON_ONCE() are redundant
> and should be eliminated.
> 
> Fixes: d769041f8653 ("ring_buffer: implement new locking")
> Signed-off-by: Tengda Wu <wuten...@huaweicloud.com>
> ---
>  kernel/trace/trace.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 4283ed4e8f59..61c5d389dbd3 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -10521,8 +10521,8 @@ trace_printk_seq(struct trace_seq *s)
>        * PAGE_SIZE, and TRACE_MAX_PRINT is 1000, this is just
>        * an extra layer of protection.
>        */
> -     if (WARN_ON_ONCE(s->seq.len >= s->seq.size))
> -             s->seq.len = s->seq.size - 1;
> +     if (WARN_ON_ONCE(s->seq.len > s->seq.size))
> +             s->seq.len = s->seq.size;
>  

This should be a separate patch as it has nothing to do with the other
changes. It's also incorrect, which is why you want to make it a separate
patch, as now this delays the changes below from being accepted.

Yes, the seq_buf specification states this, but this is not using the
seq_buf interface. It's adding on to it. The code below that change has:

        /* should be zero ended, but we are paranoid. */
        s->buffer[s->seq.len] = 0;

You see, it is updating the buffer. Which means it needs its own check.
If we don't set len to size - 1, the above write will overflow the buffer.

-- Steve



>       /* should be zero ended, but we are paranoid. */
>       s->buffer[s->seq.len] = 0;
> @@ -10617,6 +10617,7 @@ static void ftrace_dump_one(struct trace_array *tr, 
> enum ftrace_dump_mode dump_m
>        */
>  
>       while (!trace_empty(&iter)) {
> +             void *ent;
>  
>               if (!cnt)
>                       printk(KERN_TRACE 
> "---------------------------------\n");
> @@ -10625,17 +10626,18 @@ static void ftrace_dump_one(struct trace_array *tr, 
> enum ftrace_dump_mode dump_m
>  
>               trace_iterator_reset(&iter);
>               iter.iter_flags |= TRACE_FILE_LAT_FMT;
> +             ent = trace_find_next_entry_inc(&iter);
>  
> -             if (trace_find_next_entry_inc(&iter) != NULL) {
> +             if (ent) {
>                       int ret;
>  
>                       ret = print_trace_line(&iter);
>                       if (ret != TRACE_TYPE_NO_CONSUME)
>                               trace_consume(&iter);
> +
> +                     trace_printk_seq(&iter.seq);
>               }
>               touch_nmi_watchdog();
> -
> -             trace_printk_seq(&iter.seq);
>       }
>  
>       if (!cnt)


Reply via email to