Hi Mark,

On 10/01/2019 14:27, Mark Rutland wrote:
> The perf tool uses /proc/sys/kernel/perf_event_mlock_kb to determine how
> large its ringbuffer mmap should be. This can be configured to arbitrary
> values, which can be larger than the maximum possible allocation from
> kmalloc.
>
> When this is configured to a suitably large value (e.g. thanks to the
> perf fuzzer), attempting to use perf record triggers a WARN_ON_ONCE() in
> __alloc_pages_nodemask():
>
> [ 337.316688] WARNING: CPU: 2 PID: 5666 at mm/page_alloc.c:4511
> __alloc_pages_nodemask+0x3f8/0xbc8
> [ 337.316694] Modules linked in:
> [ 337.316704] CPU: 2 PID: 5666 Comm: perf Not tainted 5.0.0-rc1 #2669
> [ 337.316708] Hardware name: ARM Juno development board (r0) (DT)
> [ 337.316714] pstate: 20000005 (nzCv daif -PAN -UAO)
> [ 337.316720] pc : __alloc_pages_nodemask+0x3f8/0xbc8
> [ 337.316728] lr : alloc_pages_current+0x80/0xe8
> [ 337.316732] sp : ffff000016eeb9e0
> [ 337.316736] x29: ffff000016eeb9e0 x28: 0000000000080001
> [ 337.316744] x27: 0000000000000000 x26: ffff0000111e21f0
> [ 337.316751] x25: 0000000000000001 x24: 0000000000000000
> [ 337.316757] x23: 0000000000080001 x22: 0000000000000000
> [ 337.316762] x21: 0000000000000000 x20: 000000000000000b
> [ 337.316768] x19: 000000000060c0c0 x18: 0000000000000000
> [ 337.316773] x17: 0000000000000000 x16: 0000000000000000
> [ 337.316779] x15: 0000000000000000 x14: 0000000000000000
> [ 337.316784] x13: 0000000000000000 x12: 0000000000000000
> [ 337.316789] x11: 0000000000100000 x10: 0000000000000000
> [ 337.316795] x9 : 0000000010044400 x8 : 0000000080001000
> [ 337.316800] x7 : 0000000000000000 x6 : ffff800975584700
> [ 337.316806] x5 : 0000000000000000 x4 : ffff0000111cd6c8
> [ 337.316811] x3 : 0000000000000000 x2 : 0000000000000000
> [ 337.316816] x1 : 000000000000000b x0 : 000000000060c0c0
> [ 337.316822] Call trace:
> [ 337.316828] __alloc_pages_nodemask+0x3f8/0xbc8
> [ 337.316834] alloc_pages_current+0x80/0xe8
> [ 337.316841] kmalloc_order+0x14/0x30
> [ 337.316848] __kmalloc+0x1dc/0x240
> [ 337.316854] rb_alloc+0x3c/0x170
> [ 337.316860] perf_mmap+0x3bc/0x470
> [ 337.316867] mmap_region+0x374/0x4f8
> [ 337.316873] do_mmap+0x300/0x430
> [ 337.316878] vm_mmap_pgoff+0xe4/0x110
> [ 337.316884] ksys_mmap_pgoff+0xc0/0x230
> [ 337.316892] __arm64_sys_mmap+0x28/0x38
> [ 337.316899] el0_svc_common+0xb4/0x118
> [ 337.316905] el0_svc_handler+0x2c/0x80
> [ 337.316910] el0_svc+0x8/0xc
> [ 337.316915] ---[ end trace fa29167e20ef0c62 ]---
>
> Let's avoid this by checking that the requested allocation is possible
> before calling kzalloc.
>
> Reported-by: Julien Thierry <[email protected]>
> Signed-off-by: Mark Rutland <[email protected]>
> Cc: Alexander Shishkin <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> ---
>  kernel/events/ring_buffer.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
> index 4a9937076331..309ef5a64af5 100644
> --- a/kernel/events/ring_buffer.c
> +++ b/kernel/events/ring_buffer.c
> @@ -734,6 +734,9 @@ struct ring_buffer *rb_alloc(int nr_pages, long
> watermark, int cpu, int flags)
>  	size = sizeof(struct ring_buffer);
>  	size += nr_pages * sizeof(void *);
>  
> +	if (order_base_2(size) >= MAX_ORDER)
> +		goto fail;
> +
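
To make the arithmetic behind that hunk easier to follow, here is a rough userspace sketch of the size computation. It is not kernel code: the `sketch_*` names are hypothetical, and the constants (4 KiB pages, a MAX_ORDER of 11, 8-byte pointers, and a 256-byte placeholder for sizeof(struct ring_buffer)) are assumptions that vary by architecture and config. Since order_base_2(size) is the log2 of a byte count while MAX_ORDER limits the page order of an allocation, the sketch exposes both values.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for kernel constants; real values depend on the config. */
#define SKETCH_PAGE_SHIFT 12u   /* assumes 4 KiB pages */
#define SKETCH_MAX_ORDER  11u   /* assumes a common default */
#define SKETCH_PTR_SIZE   8u    /* assumes 64-bit pointers */
#define SKETCH_RB_HEADER  256u  /* placeholder for sizeof(struct ring_buffer) */

/* Userspace analogue of order_base_2(): ceil(log2(n)); 0 and 1 map to 0. */
static unsigned int sketch_order_base_2(uint64_t n)
{
    unsigned int order = 0;

    assert(n <= ((uint64_t)1 << 63)); /* keep the shift below in range */
    while (((uint64_t)1 << order) < n)
        order++;
    return order;
}

/* Bytes requested from kzalloc(): header plus one pointer per data page. */
static uint64_t sketch_rb_size(uint64_t nr_pages)
{
    return SKETCH_RB_HEADER + nr_pages * SKETCH_PTR_SIZE;
}

/* Page order needed for `size` bytes: byte order minus the page shift. */
static unsigned int sketch_page_order(uint64_t size)
{
    unsigned int bytes_order = sketch_order_base_2(size);

    return bytes_order > SKETCH_PAGE_SHIFT ? bytes_order - SKETCH_PAGE_SHIFT : 0;
}

/* Nonzero when the page allocator could not satisfy the request. */
static int sketch_too_big_for_page_allocator(uint64_t nr_pages)
{
    return sketch_page_order(sketch_rb_size(nr_pages)) >= SKETCH_MAX_ORDER;
}
```

Under these assumptions, a hypothetical nr_pages of 0x80000 needs just over 4 MiB for the page-pointer array, which rounds up to a page order of 11 and so is not satisfiable, while nr_pages of 256 fits within a single page.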

I see that in kernel/events/ring_buffer.c there are two versions of
rb_alloc() (depending on whether CONFIG_PERF_USE_VMALLOC is defined or
not).

Since the warning comes from the kzalloc, I'd think we'd need to add
this check in both implementations of rb_alloc().

With that change (or if for some reason the other rb_alloc() version
doesn't need the check):

Reviewed-by: Julien Thierry <[email protected]>

Thanks,

-- 
Julien Thierry

