Re: kernel BUG in split_huge_page_to_list
On 1/24/21 8:01 PM, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 647060f3 Add linux-next specific files for 20210120
> git tree: linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=16f0353f50
> kernel config: https://syzkaller.appspot.com/x/.config?x=8f8a72b7e5067002
> dashboard link: https://syzkaller.appspot.com/bug?extid=9b83ff893245a25c320e
> compiler: gcc (GCC) 10.1.0-syz 20200507
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=143d7e3b50
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17652feb50

It looks like a new page has been mapped by the old vma, which happens
- after the existing page table entries were moved from the old vma to the new vma;
- before the anon_vma is unlinked from the old vma.

So this new page's ->mapping still points to the anon_vma that has been unlinked from the old vma. Later, the rmap path is not able to find the mapped entries of this page.

Any hints about the above scenario? I suppose it is only possible for THP pages and does not happen through the page fault path.

Thanks.

> The issue was bisected to:
>
> commit fbdbae3da30a149a55a5f1883bbbe17a27660e05
> Author: Li Xinhai
> Date: Tue Jan 19 21:54:00 2021 +
>
> mm: mremap: unlink anon_vmas when mremap with MREMAP_DONTUNMAP success
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=16cad100d0
> final oops: https://syzkaller.appspot.com/x/report.txt?x=15cad100d0
> console output: https://syzkaller.appspot.com/x/log.txt?x=11cad100d0
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+9b83ff893245a25c3...@syzkaller.appspotmail.com
> Fixes: fbdbae3da30a ("mm: mremap: unlink anon_vmas when mremap with MREMAP_DONTUNMAP success")
>
> head:091c6650 order:9 compound_mapcount:0 compound_pincount:0
> memcg:888010d0a000
> anon flags: 0xfff009001d(locked|uptodate|dirty|lru|head|swapbacked)
> raw: 00fff009001d eabc51c8 888010201800 88802575d801
> raw: 00020e00 01fc 888010d0a000
> page dumped because: VM_BUG_ON_PAGE(!unmap_success)
> [ cut here ]
> kernel BUG at mm/huge_memory.c:2351!
> invalid opcode: [#1] PREEMPT SMP KASAN
> CPU: 0 PID: 8483 Comm: syz-executor525 Not tainted 5.11.0-rc4-next-20210120-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:unmap_page mm/huge_memory.c:2351 [inline]
> RIP: 0010:split_huge_page_to_list+0x1f02/0x43b0 mm/huge_memory.c:2720
> Code: ef e8 82 46 ea ff 0f 0b e8 ab 69 b9 ff 4c 8d 73 ff e9 56 ea ff ff e8 9d 69 b9 ff 48 c7 c6 40 69 57 89 48 89 ef e8 5e 46 ea ff <0f> 0b e8 87 69 b9 ff 4c 8d 75 ff e9 28 e9 ff ff e8 79 69 b9 ff 49
> RSP: 0018:c9000168f7a0 EFLAGS: 00010282
> RAX: RBX: RCX:
> RDX: 88801e2d5400 RSI: 88bcc6c7 RDI: f520002d1e8e
> RBP: eaca8000 R08: 0033 R09:
> R10: 815b136e R11: R12: 888010d0ae60
> R13: eaca8000 R14: 018c R15:
> FS: 0154e880() GS:8880b9e0() knlGS:
> CS: 0010 DS: ES: CR0: 80050033
> CR2: 7fcc655666c0 CR3: 12e9a000 CR4: 001506f0
> DR0: DR1: DR2:
> DR3: DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  split_huge_page include/linux/huge_mm.h:187 [inline]
>  madvise_free_pte_range+0x736/0x1ee0 mm/madvise.c:633
>  walk_pmd_range mm/pagewalk.c:89 [inline]
>  walk_pud_range mm/pagewalk.c:160 [inline]
>  walk_p4d_range mm/pagewalk.c:193 [inline]
>  walk_pgd_range mm/pagewalk.c:229 [inline]
>  __walk_page_range+0xe20/0x1ea0 mm/pagewalk.c:331
>  walk_page_range+0x20d/0x400 mm/pagewalk.c:427
>  madvise_free_single_vma+0x383/0x550 mm/madvise.c:731
>  madvise_dontneed_free mm/madvise.c:819 [inline]
>  madvise_vma mm/madvise.c:936 [inline]
>  do_madvise.part.0+0x4e4/0x1ed0 mm/madvise.c:1132
>  do_madvise mm/madvise.c:1158 [inline]
>  __do_sys_madvise mm/madvise.c:1158 [inline]
>  __se_sys_madvise mm/madvise.c:1156 [inline]
>  __x64_sys_madvise+0x113/0x150 mm/madvise.c:1156
>  do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x440219
> Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:7ffc51b58b98 EFLAGS: 0246 ORIG_RAX: 001c
> RAX: ffda RBX: 004002c8 RCX: 00440219
> RDX: 0008 RSI: 00c0 RDI: 2040
> RBP: 006ca018 R08: R09:
> R10: 20ffc000 R11: 0246 R12: 00401a20
> R13: 00401ab0 R14: R15:
> Modules linked in:
> ---[ end trace 7812a1
tools/bpf: build failed with defconfig(x86_64) on v5.6 and v5.7
- Machine information

Linux localhost.localdomain 4.18.0-193.6.3.el8_2.x86_64 #1 SMP Wed Jun 10 11:09:32 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

- Configuration

make defconfig
make kvmconfig

- Failure log on v5.6

```
  LINK /mnt/build/1_build/05_build_v5.6/bpf/bpftool//libbpf/libbpf.a
  LINK /mnt/build/1_build/05_build_v5.6/bpf/bpftool/bpftool
  DESCEND runqslower
  GEN /mnt/build/0_code/0_linux/linux/tools/bpf/runqslower/.output/bpf_helper_defs.h
make[4]: *** No rule to make target '/mnt/build/0_code/0_linux/linux/tools/include/linux/build_bug.h', needed by '/mnt/build/0_code/0_linux/linux/tools/bpf/runqslower/.output/staticobjs/libbpf.o'. Stop.
make[3]: *** [Makefile:183: /mnt/build/0_code/0_linux/linux/tools/bpf/runqslower/.output/staticobjs/libbpf-in.o] Error 2
make[2]: *** [Makefile:79: .output/libbpf.a] Error 2
make[1]: *** [Makefile:119: runqslower] Error 2
make: *** [Makefile:68: bpf] Error 2
```

- Failure logs on v5.7

```
In file included from /mnt/build/0_code/0_linux/linux/tools/include/linux/build_bug.h:5,
 from /mnt/build/0_code/0_linux/linux/tools/include/linux/kernel.h:8,
 from /mnt/build/0_code/0_linux/linux/kernel/bpf/disasm.h:10,
 from /mnt/build/0_code/0_linux/linux/kernel/bpf/disasm.c:8:
/mnt/build/0_code/0_linux/linux/kernel/bpf/disasm.c: In function ‘__func_get_name’:
/mnt/build/0_code/0_linux/linux/tools/include/linux/compiler.h:37:38: warning: nested extern declaration of ‘__compiletime_assert_0’ [-Wnested-externs]
 _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
 ^
/mnt/build/0_code/0_linux/linux/tools/include/linux/compiler.h:16:15: note: in definition of macro ‘__compiletime_assert’
 extern void prefix ## suffix(void) __compiletime_error(msg); \
 ^~
/mnt/build/0_code/0_linux/linux/tools/include/linux/compiler.h:37:2: note: in expansion of macro ‘_compiletime_assert’
 _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
 ^~~
/mnt/build/0_code/0_linux/linux/tools/include/linux/build_bug.h:39:37: note: in expansion of macro ‘compiletime_assert’
 #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
 ^~
/mnt/build/0_code/0_linux/linux/tools/include/linux/build_bug.h:50:2: note: in expansion of macro ‘BUILD_BUG_ON_MSG’
 BUILD_BUG_ON_MSG(condition, "BUILD_BUG_ON failed: " #condition)
 ^~~~
/mnt/build/0_code/0_linux/linux/kernel/bpf/disasm.c:20:2: note: in expansion of macro ‘BUILD_BUG_ON’
 BUILD_BUG_ON(ARRAY_SIZE(func_id_str) != __BPF_FUNC_MAX_ID);
 ^~~~
```

and

```
  LINK /mnt/build/0_code/0_linux/linux/tools/bpf/runqslower/.output/libbpf.a
  GEN vmlinux.h
  BPF runqslower.bpf.o
In file included from runqslower.bpf.c:3:
.output/vmlinux.h:5:15: error: attribute 'preserve_access_index' is not supported by '#pragma clang attribute'
#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)
 ^
.output/vmlinux.h:98607:15: error: '#pragma clang attribute pop' with no matching '#pragma clang attribute push'
#pragma clang attribute pop
 ^
2 errors generated.
make[2]: *** [Makefile:57: .output/runqslower.bpf.o] Error 1
make[1]: *** [Makefile:119: runqslower] Error 2
make: *** [Makefile:68: bpf] Error 2
```

On the same machine with the same configuration, I also tried v5.4 and v5.5; both build without failures.
Re: [PATCH] tracing: Fix events.rst section numbering
On 2020-05-19 at 02:29 Tom Zanussi wrote:
>The in-kernel trace event API should have its own section, and the
>duplicate section numbers need fixing as well.
>
>Signed-off-by: Tom Zanussi
>Reported-by: Li Xinhai
>---
> Documentation/trace/events.rst | 28 ++--
> 1 file changed, 14 insertions(+), 14 deletions(-)
>
>diff --git a/Documentation/trace/events.rst b/Documentation/trace/events.rst
>index ed79b220bd07..1a3b7762cb0f 100644
>--- a/Documentation/trace/events.rst
>+++ b/Documentation/trace/events.rst
>@@ -526,8 +526,8 @@ The following commands are supported:
>
> See Documentation/trace/histogram.rst for details and examples.
>
>-6.3 In-kernel trace event API
>--
>+7. In-kernel trace event API
>+
>
> In most cases, the command-line interface to trace events is more than
> sufficient. Sometimes, however, applications might find the need for
>@@ -559,8 +559,8 @@ following:
> - tracing synthetic events from in-kernel code
> - the low-level "dynevent_cmd" API
>
>-6.3.1 Dyamically creating synthetic event definitions
>--
>+7.1 Dyamically creating synthetic event definitions
>+---
>
> There are a couple ways to create a new synthetic event from a kernel
> module or other kernel code.
>@@ -665,8 +665,8 @@ registered by calling the synth_event_gen_cmd_end() function:
> At this point, the event object is ready to be used for tracing new
> events.
>
>-6.3.3 Tracing synthetic events from in-kernel code
>---
>+7.2 Tracing synthetic events from in-kernel code
>+
>
> To trace a synthetic event, there are several options. The first
> option is to trace the event in one call, using synth_event_trace()
>@@ -677,8 +677,8 @@ synth_event_trace_start() and synth_event_trace_end() along with
> synth_event_add_next_val() or synth_event_add_val() to add the values
> piecewise.
>
>-6.3.3.1 Tracing a synthetic event all at once
>--
>+7.2.1 Tracing a synthetic event all at once
>+---
>
> To trace a synthetic event all at once, the synth_event_trace() or
> synth_event_trace_array() functions can be used.
>@@ -779,8 +779,8 @@ remove the event:
>
> ret = synth_event_delete("schedtest");
>
>-6.3.3.1 Tracing a synthetic event piecewise
>
>+7.2.2 Tracing a synthetic event piecewise
>+-
>
> To trace a synthetic using the piecewise method described above, the
> synth_event_trace_start() function is used to 'open' the synthetic
>@@ -863,8 +863,8 @@ Note that synth_event_trace_end() must be called at the end regardless
> of whether any of the add calls failed (say due to a bad field name
> being passed in).
>
>-6.3.4 Dyamically creating kprobe and kretprobe event definitions
>-
>+7.3 Dyamically creating kprobe and kretprobe event definitions
>+--
>
> To create a kprobe or kretprobe trace event from kernel code, the
> kprobe_event_gen_cmd_start() or kretprobe_event_gen_cmd_start()
>@@ -940,8 +940,8 @@ used to give the kprobe event file back and delete the event:
>
> ret = kprobe_event_delete("gen_kprobe_test");
>
>-6.3.4 The "dynevent_cmd" low-level API
>-------
>+7.4 The "dynevent_cmd" low-level API
>+
>
> Both the in-kernel synthetic event and kprobe interfaces are built on
> top of a lower-level "dynevent_cmd" interface. This interface is
>--
>2.17.1
>

It looks correct to me.

Reviewed-by: Li Xinhai
Re: Documentation/trace/events.rst: wrong numbering of sections
>Hi,
>
>On Fri, 2020-05-15 at 09:11 -0400, Steven Rostedt wrote:
>> It's best to Cc the maintainers of the file. Nobody reads linux-kernel (it
>> produces 800 emails a day!). Luckily, I happen to monitor the
>> linux-trace-devel list (which is mostly for userland tools), otherwise this
>> email would have been lost to the LKML abyss.
>>
>> On Fri, 15 May 2020 15:43:43 +0800
>> "Li Xinhai" wrote:
>>
>> > This document has below numbering of its sections:
>> >
>> > 1. Introduction
>> > 2. Using Event Tracing
>> > 2.1 Via the 'set_event' interface
>> > 2.2 Via the 'enable' toggle
>> > 2.3 Boot option
>> > 3. Defining an event-enabled tracepoint
>> > 4. Event formats
>> > 5. Event filtering
>> > 5.1 Expression syntax
>> > 5.2 Setting filters
>> > 5.3 Clearing filters
>> > 5.3 Subsystem filters
>> > 5.4 PID filtering
>> > 6. Event triggers
>> > 6.1 Expression syntax
>> > 6.2 Supported trigger commands
>> > 6.3 In-kernel trace event API
>> > 6.3.1 Dyamically creating synthetic event definitions
>> > 6.3.3 Tracing synthetic events from in-kernel code
>> > 6.3.3.1 Tracing a synthetic event all at once
>> > 6.3.3.1 Tracing a synthetic event piecewise
>> > 6.3.4 Dyamically creating kprobe and kretprobe event definitions
>> > 6.3.4 The "dynevent_cmd" low-level API
>> >
>> > It seems wrong numbering within 6.3 section.
>> > or, would it be better to have separated chapter #7, for 'In-kernel trace
>> > event API'? it seems not belong to 'Event triggers'.
>>
>> Yeah, 6.3.4 (both of them) probably should have been under a new top level
>> section. (#7).
>>
>
>Yeah, aside from duplicate numbering in a couple of places, it would
>make more sense for everything starting from '6.3 In-kernel trace event
>API' to be in a section 7.
>
>Would you like to submit a patch for that, Li, or should I?
>

I am not sure about the correct organization of this part, so it may be better for you to fix it. Thanks.

>Thanks,
>
>Tom
>
>> -- Steve
>
Documentation/trace/events.rst: wrong numbering of sections
This document numbers its sections as follows:

1. Introduction
2. Using Event Tracing
2.1 Via the 'set_event' interface
2.2 Via the 'enable' toggle
2.3 Boot option
3. Defining an event-enabled tracepoint
4. Event formats
5. Event filtering
5.1 Expression syntax
5.2 Setting filters
5.3 Clearing filters
5.3 Subsystem filters
5.4 PID filtering
6. Event triggers
6.1 Expression syntax
6.2 Supported trigger commands
6.3 In-kernel trace event API
6.3.1 Dyamically creating synthetic event definitions
6.3.3 Tracing synthetic events from in-kernel code
6.3.3.1 Tracing a synthetic event all at once
6.3.3.1 Tracing a synthetic event piecewise
6.3.4 Dyamically creating kprobe and kretprobe event definitions
6.3.4 The "dynevent_cmd" low-level API

The numbering within section 6.3 looks wrong. Alternatively, would it be better to have a separate chapter 7 for 'In-kernel trace event API'? It does not seem to belong under 'Event triggers'.
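
The duplicated and skipped numbers in the list above can also be found mechanically. The following is only a minimal sketch in Python, not part of any kernel tooling: the function name `check_numbering` and the `HEADINGS` constant are hypothetical, and the headings are assumed to be given as plain text lines, as in this report, rather than parsed from the .rst file.

```python
import re

# Section headings exactly as listed in the report above.
HEADINGS = """\
1. Introduction
2. Using Event Tracing
2.1 Via the 'set_event' interface
2.2 Via the 'enable' toggle
2.3 Boot option
3. Defining an event-enabled tracepoint
4. Event formats
5. Event filtering
5.1 Expression syntax
5.2 Setting filters
5.3 Clearing filters
5.3 Subsystem filters
5.4 PID filtering
6. Event triggers
6.1 Expression syntax
6.2 Supported trigger commands
6.3 In-kernel trace event API
6.3.1 Dyamically creating synthetic event definitions
6.3.3 Tracing synthetic events from in-kernel code
6.3.3.1 Tracing a synthetic event all at once
6.3.3.1 Tracing a synthetic event piecewise
6.3.4 Dyamically creating kprobe and kretprobe event definitions
6.3.4 The "dynevent_cmd" low-level API
""".splitlines()

# A dotted section number at the start of a line, with an optional trailing dot.
NUMBER_RE = re.compile(r"^(\d+(?:\.\d+)*)\.?\s")


def check_numbering(lines):
    """Return (duplicates, gaps) for the dotted section numbers in `lines`.

    A duplicate is a number that already appeared earlier.
    A gap is a number whose previous sibling (or parent, for a first
    child) has not appeared yet, e.g. 6.3.3 without a preceding 6.3.2.
    """
    seen = set()
    duplicates = []
    gaps = []
    for line in lines:
        match = NUMBER_RE.match(line.strip())
        if not match:
            continue  # not a numbered heading
        text = match.group(1)
        number = tuple(int(part) for part in text.split("."))
        if number in seen:
            duplicates.append(text)
            continue
        if number[-1] > 1:
            expected = number[:-1] + (number[-1] - 1,)  # previous sibling
        else:
            expected = number[:-1]  # parent section, empty at top level
        if expected and expected not in seen:
            gaps.append(text)
        seen.add(number)
    return duplicates, gaps


if __name__ == "__main__":
    dups, gaps = check_numbering(HEADINGS)
    print("duplicates:", dups)
    print("gaps:", gaps)
```

For the list in this report, the sketch flags 5.3, 6.3.3.1 and 6.3.4 as duplicates and 6.3.3 as following a gap, which matches the problems described above.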