On Fri, 14 Feb 2025 11:55:47 -0500
Steven Rostedt <[email protected]> wrote:

> From: Steven Rostedt <[email protected]>
> 
> When trying to mmap a trace instance buffer that is attached to
> reserve_mem, it would crash:
> 
>  BUG: unable to handle page fault for address: ffffe97bd00025c8
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
>  PGD 2862f3067 P4D 2862f3067 PUD 0
>  Oops: Oops: 0000 [#1] PREEMPT_RT SMP PTI
>  CPU: 4 UID: 0 PID: 981 Comm: mmap-rb Not tainted 
> 6.14.0-rc2-test-00003-g7f1a5e3fbf9e-dirty #233
>  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
> 1.16.3-debian-1.16.3-2 04/01/2014
>  RIP: 0010:validate_page_before_insert+0x5/0xb0
>  Code: e2 01 89 d0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 
> 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <48> 8b 46 08 a8 01 
> 75 67 66 90 48 89 f0 8b 50 34 85 d2 74 76 48 89
>  RSP: 0018:ffffb148c2f3f968 EFLAGS: 00010246
>  RAX: ffff9fa5d3322000 RBX: ffff9fa5ccff9c08 RCX: 00000000b879ed29
>  RDX: ffffe97bd00025c0 RSI: ffffe97bd00025c0 RDI: ffff9fa5ccff9c08
>  RBP: ffffb148c2f3f9f0 R08: 0000000000000004 R09: 0000000000000004
>  R10: 0000000000000000 R11: 0000000000000200 R12: 0000000000000000
>  R13: 00007f16a18d5000 R14: ffff9fa5c48db6a8 R15: 0000000000000000
>  FS:  00007f16a1b54740(0000) GS:ffff9fa73df00000(0000) knlGS:0000000000000000
>  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>  CR2: ffffe97bd00025c8 CR3: 00000001048c6006 CR4: 0000000000172ef0
>  Call Trace:
>   <TASK>
>   ? __die_body.cold+0x19/0x1f
>   ? __die+0x2e/0x40
>   ? page_fault_oops+0x157/0x2b0
>   ? search_module_extables+0x53/0x80
>   ? validate_page_before_insert+0x5/0xb0
>   ? kernelmode_fixup_or_oops.isra.0+0x5f/0x70
>   ? __bad_area_nosemaphore+0x16e/0x1b0
>   ? bad_area_nosemaphore+0x16/0x20
>   ? do_kern_addr_fault+0x77/0x90
>   ? exc_page_fault+0x22b/0x230
>   ? asm_exc_page_fault+0x2b/0x30
>   ? validate_page_before_insert+0x5/0xb0
>   ? vm_insert_pages+0x151/0x400
>   __rb_map_vma+0x21f/0x3f0
>   ring_buffer_map+0x21b/0x2f0
>   tracing_buffers_mmap+0x70/0xd0
>   __mmap_region+0x6f0/0xbd0
>   mmap_region+0x7f/0x130
>   do_mmap+0x475/0x610
>   vm_mmap_pgoff+0xf2/0x1d0
>   ksys_mmap_pgoff+0x166/0x200
>   __x64_sys_mmap+0x37/0x50
>   x64_sys_call+0x1670/0x1d70
>   do_syscall_64+0xbb/0x1d0
>   entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> The reason was that the code that maps the ring buffer pages to user space
> has:
> 
>       page = virt_to_page((void *)cpu_buffer->subbuf_ids[s]);
> 
> And uses that in:
> 
>       vm_insert_pages(vma, vma->vm_start, pages, &nr_pages);
> 
> But virt_to_page() does not work with vmap()'d memory which is what the
> persistent ring buffer has. It is rather trivial to allow this, but for
> now just disable mmap() of instances that have their ring buffer from the
> reserve_mem option.
> 
> If an mmap() is performed on a persistent buffer it will return -ENODEV
> just like it would if the .mmap field wasn't defined in the
> file_operations structure.
> 

Looks good to me.

Tested-by: Masami Hiramatsu (Google) <[email protected]>
Acked-by: Masami Hiramatsu (Google) <[email protected]>

Thanks,

> Cc: [email protected]
> Fixes: e645535a954ad ("tracing: Add option to use memmapped memory for trace 
> boot instance")
> Signed-off-by: Steven Rostedt (Google) <[email protected]>
> ---
> Changes since v1: 
> https://lore.kernel.org/[email protected]
> 
> - Return -ENODEV instead of -EINVAL as that is what is returned when the .mmap
>   field is missing from the file_operations structure.
> 
>  kernel/trace/trace.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 25ff37aab00f..0e6d517e74e0 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -8279,6 +8279,10 @@ static int tracing_buffers_mmap(struct file *filp, 
> struct vm_area_struct *vma)
>       struct trace_iterator *iter = &info->iter;
>       int ret = 0;
>  
> +     /* Currently the boot mapped buffer is not supported for mmap */
> +     if (iter->tr->flags & TRACE_ARRAY_FL_BOOT)
> +             return -ENODEV;
> +
>       ret = get_snapshot_map(iter->tr);
>       if (ret)
>               return ret;
> -- 
> 2.47.2
> 


-- 
Masami Hiramatsu (Google) <[email protected]>

Reply via email to