On Fri, 14 Feb 2025 07:07:12 -0500
Steven Rostedt <[email protected]> wrote:

> On Fri, 14 Feb 2025 16:13:32 +0900
> Masami Hiramatsu (Google) <[email protected]> wrote:
> 
> > On Thu, 13 Feb 2025 21:21:47 -0500
> > Steven Rostedt <[email protected]> wrote:
> > 
> > > On Fri, 14 Feb 2025 11:07:22 +0900
> > > Masami Hiramatsu (Google) <[email protected]> wrote:
> > >   
> > > > This may be good for fixing the crash short term, but it may not allow
> > > > us to read the subbufs which were read right before the crash. Can we
> > > > also add an option to reset the read pointer from the previous boot for
> > > > the persistent ring buffer?  
> > > 
> > > I'm not sure what you mean here.
> > > 
> > > Note, this is just for mmapping the buffers. Currently we never tried it,
> > > as if we had, it would have crashed. But this does not affect normal reads.  
> > 
> > But reading buffer via mmap() is already supported. Can we read all pages,
> > which had been read by trace_pipe_raw in previous boot, again on this boot?
> 
> It's not supported. If you try it, it will crash. This prevents reading via
> mmap() on a boot buffer. I don't know what you are asking. Once this patch
> is applied, mmap() will always fail on the boot buffer before or after you
> start it.

Hmm, I meant it is supported for other, non-persistent ring buffers, isn't it?

> 
> > 
> > Anyway, I made another patch to fix it to allow mmap() on the persistent
> > ring buffer here. I thought I could get the phys_addr from struct vm_area,
> > but that does not work for a vmap()'d area. So this stores the phys_addr
> > in trace_buffer.
> 
> Which is way too complex and intrusive. If you want to allow mapping, all
> it needs is this:

Ah, vmalloc_to_page() works, I see.

> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 07b421115692..9339adc88ad5 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -5927,12 +5941,18 @@ static void rb_clear_buffer_page(struct buffer_page *page)
>  static void rb_update_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
>  {
>       struct trace_buffer_meta *meta = cpu_buffer->meta_page;
> +     struct page *page;
>  
>       if (!meta)
>               return;
>  
>       meta->reader.read = cpu_buffer->reader_page->read;
> -     meta->reader.id = cpu_buffer->reader_page->id;
> +     /* For boot buffers, the id is the index */
> +     if (cpu_buffer->ring_meta)
> +             meta->reader.id = rb_meta_subbuf_idx(cpu_buffer->ring_meta,
> +                                                  cpu_buffer->reader_page->page);
> +     else
> +             meta->reader.id = cpu_buffer->reader_page->id;
>       meta->reader.lost_events = cpu_buffer->lost_events;
>  
>       meta->entries = local_read(&cpu_buffer->entries);
> @@ -5940,7 +5960,12 @@ static void rb_update_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
>       meta->read = cpu_buffer->read;
>  
>       /* Some archs do not have data cache coherency between kernel and user-space */
> -     flush_dcache_folio(virt_to_folio(cpu_buffer->meta_page));
> +     if (virt_addr_valid(cpu_buffer->meta_page))
> +             page = virt_to_page(cpu_buffer->meta_page);
> +     else
> +             page = vmalloc_to_page(cpu_buffer->meta_page);
> +
> +     flush_dcache_folio(page_folio(page));
>  }
>  
>  static void
> @@ -7041,7 +7066,10 @@ static int __rb_map_vma(struct ring_buffer_per_cpu *cpu_buffer,
>                       goto out;
>               }
>  
> -             page = virt_to_page((void *)cpu_buffer->subbuf_ids[s]);
> +             if (virt_addr_valid(cpu_buffer->subbuf_ids[s]))
> +                     page = virt_to_page((void *)cpu_buffer->subbuf_ids[s]);
> +             else
> +                     page = vmalloc_to_page((void *)cpu_buffer->subbuf_ids[s]);
>  
>               for (; off < (1 << (subbuf_order)); off++, page++) {
>                       if (p >= nr_pages)
> @@ -7187,6 +7215,7 @@ int ring_buffer_map_get_reader(struct trace_buffer *buffer, int cpu)
>       unsigned long missed_events;
>       unsigned long reader_size;
>       unsigned long flags;
> +     struct page *page;
>  
>       cpu_buffer = rb_get_mapped_buffer(buffer, cpu);
>       if (IS_ERR(cpu_buffer))
> @@ -7255,7 +7291,12 @@ int ring_buffer_map_get_reader(struct trace_buffer *buffer, int cpu)
>  
>  out:
>       /* Some archs do not have data cache coherency between kernel and user-space */
> -     flush_dcache_folio(virt_to_folio(cpu_buffer->reader_page->page));
> +     if (virt_addr_valid(cpu_buffer->reader_page->page))
> +             page = virt_to_page(cpu_buffer->reader_page->page);
> +     else
> +             page = vmalloc_to_page(cpu_buffer->reader_page->page);
> +
> +     flush_dcache_folio(page_folio(page));
>  
>       rb_update_meta_page(cpu_buffer);
>  
> But this still doesn't work, as the index is still off (I'm still fixing it).

Ah, OK. Thanks for fixing it.

> 
> The persistent ring buffer uses the page->id for one thing but the user
> space mmap() uses it for something else.

Thank you,

> 
> -- Steve


-- 
Masami Hiramatsu (Google) <[email protected]>
