Re: [PATCH v5 04/12] fprobe: Use ftrace_regs in fprobe entry handler

2023-09-25 Thread Google
On Tue, 26 Sep 2023 00:14:33 +0200
Jiri Olsa  wrote:

> On Mon, Sep 25, 2023 at 09:15:15PM +0900, Masami Hiramatsu wrote:
> > Hi Jiri,
> > 
> > On Mon, 25 Sep 2023 12:41:59 +0200
> > Jiri Olsa  wrote:
> > 
> > > On Sun, Sep 24, 2023 at 10:36:36PM +0900, Masami Hiramatsu (Google) wrote:
> > > > From: Masami Hiramatsu (Google) 
> > > > 
> > > > This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> > > > instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe
> > > > on arm64.
> > > > 
> > > > Signed-off-by: Masami Hiramatsu (Google) 
> > > > Acked-by: Florent Revest 
> > > 
> > > I was getting bpf selftests failures with this patchset and when
> > > bisecting I'm getting crash when running on top of this change
> > 
> > Thanks for bisecting!
> > 
> > > 
> > > looks like it's missing some of the regs NULL checks added later?
> > 
> > yeah, if the RIP (arch_rethook_prepare+0x0/0x30) is correct, 
> > 
> > void arch_rethook_prepare(struct rethook_node *rh, struct ftrace_regs 
> > *fregs, bool mcount)
> > 
> > RSI (the 2nd argument) is NULL. This means fregs == NULL and caused the 
> > crash.
> > I think ftrace_get_regs(fregs) for the entry handler may return NULL.
> > 
> > Ah, 
> > 
> > @@ -182,7 +182,7 @@ static void fprobe_init(struct fprobe *fp)
> > fp->ops.func = fprobe_kprobe_handler;
> > else
> > fp->ops.func = fprobe_handler;
> > -   fp->ops.flags |= FTRACE_OPS_FL_SAVE_REGS;
> > +   fp->ops.flags |= FTRACE_OPS_FL_SAVE_ARGS;
> >  }
> >  
> >  static int fprobe_init_rethook(struct fprobe *fp, int num)
> > 
> > This may cause the issue, it should keep REGS at this point (this must be 
> > done in
> > [9/12]). But after applying [9/12], it shouldn't be a problem... 
> > 
> > Let me check it again.
> 
> that helped with the crash, I'll continue bisecting to find out
> where it breaks the tests

Can you share the configuration and the test?
I would like to reproduce it because I couldn't make it reproduced.

Thank you,

-- 
Masami Hiramatsu (Google) 



Re: [PATCH v5 04/12] fprobe: Use ftrace_regs in fprobe entry handler

2023-09-25 Thread Jiri Olsa
On Mon, Sep 25, 2023 at 09:15:15PM +0900, Masami Hiramatsu wrote:
> Hi Jiri,
> 
> On Mon, 25 Sep 2023 12:41:59 +0200
> Jiri Olsa  wrote:
> 
> > On Sun, Sep 24, 2023 at 10:36:36PM +0900, Masami Hiramatsu (Google) wrote:
> > > From: Masami Hiramatsu (Google) 
> > > 
> > > This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> > > instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe
> > > on arm64.
> > > 
> > > Signed-off-by: Masami Hiramatsu (Google) 
> > > Acked-by: Florent Revest 
> > 
> > I was getting bpf selftests failures with this patchset and when
> > bisecting I'm getting crash when running on top of this change
> 
> Thanks for bisecting!
> 
> > 
> > looks like it's missing some of the regs NULL checks added later?
> 
> yeah, if the RIP (arch_rethook_prepare+0x0/0x30) is correct, 
> 
> void arch_rethook_prepare(struct rethook_node *rh, struct ftrace_regs *fregs, 
> bool mcount)
> 
> RSI (the 2nd argument) is NULL. This means fregs == NULL and caused the crash.
> I think ftrace_get_regs(fregs) for the entry handler may return NULL.
> 
> Ah, 
> 
> @@ -182,7 +182,7 @@ static void fprobe_init(struct fprobe *fp)
>   fp->ops.func = fprobe_kprobe_handler;
>   else
>   fp->ops.func = fprobe_handler;
> - fp->ops.flags |= FTRACE_OPS_FL_SAVE_REGS;
> + fp->ops.flags |= FTRACE_OPS_FL_SAVE_ARGS;
>  }
>  
>  static int fprobe_init_rethook(struct fprobe *fp, int num)
> 
> This may cause the issue, it should keep REGS at this point (this must be 
> done in
> [9/12]). But after applying [9/12], it shouldn't be a problem... 
> 
> Let me check it again.

that helped with the crash, I'll continue bisecting to find out
where it breaks the tests

thanks,
jirka



Re: [PATCH v5 04/12] fprobe: Use ftrace_regs in fprobe entry handler

2023-09-25 Thread Google
Hi Jiri,

On Mon, 25 Sep 2023 12:41:59 +0200
Jiri Olsa  wrote:

> On Sun, Sep 24, 2023 at 10:36:36PM +0900, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) 
> > 
> > This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> > instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe
> > on arm64.
> > 
> > Signed-off-by: Masami Hiramatsu (Google) 
> > Acked-by: Florent Revest 
> 
> I was getting bpf selftests failures with this patchset and when
> bisecting I'm getting crash when running on top of this change

Thanks for bisecting!

> 
> looks like it's missing some of the regs NULL checks added later?

yeah, if the RIP (arch_rethook_prepare+0x0/0x30) is correct, 

void arch_rethook_prepare(struct rethook_node *rh, struct ftrace_regs *fregs, 
bool mcount)

RSI (the 2nd argument) is NULL. This means fregs == NULL and caused the crash.
I think ftrace_get_regs(fregs) for the entry handler may return NULL.

Ah, 

@@ -182,7 +182,7 @@ static void fprobe_init(struct fprobe *fp)
fp->ops.func = fprobe_kprobe_handler;
else
fp->ops.func = fprobe_handler;
-   fp->ops.flags |= FTRACE_OPS_FL_SAVE_REGS;
+   fp->ops.flags |= FTRACE_OPS_FL_SAVE_ARGS;
 }
 
 static int fprobe_init_rethook(struct fprobe *fp, int num)

This may cause the issue, it should keep REGS at this point (this must be done 
in
[9/12]). But after applying [9/12], it shouldn't be a problem... 

Let me check it again.

Thank you!

> 
> jirka
> 
> 
> ---
> [  124.089449][  T677] BUG: kernel NULL pointer dereference, address: 
> 0098
> [  124.090102][  T677] #PF: supervisor read access in kernel mode
> [  124.090568][  T677] #PF: error_code(0x) - not-present page
> [  124.091039][  T677] PGD 158fd8067 P4D 158fd8067 PUD 10896a067 PMD 0 
> [  124.091482][  T677] Oops:  [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
> [  124.091986][  T677] CPU: 1 PID: 677 Comm: test_progs Tainted: G   
> OE  6.6.0-rc2+ #768 1c8a8990289f2615e36dadd01915b80e8da29bf5
> [  124.092898][  T677] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), 
> BIOS 1.16.2-1.fc38 04/01/2014
> [  124.093613][  T677] RIP: 0010:arch_rethook_prepare+0x0/0x30
> [  124.094060][  T677] Code: 90 90 90 90 90 90 90 90 90 90 48 89 b7 a8 00 00 
> 00 c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 
> <48> 8b 86 98 00 00 00 48 8b 10 48 89 57 20 48 8b 96 98 00 00 00 48
> [  124.096239][  T677] RSP: 0018:c90003d3bc98 EFLAGS: 00010286
> [  124.096708][  T677] RAX:  RBX: 88815d9fbe50 RCX: 
> 88815d9fbe40
> [  124.097332][  T677] RDX: 0001 RSI:  RDI: 
> 88815d9fbe40
> [  124.097946][  T677] RBP:  R08:  R09: 
> 0004
> [  124.098554][  T677] R10: 0001 R11:  R12: 
> 81fbb7b0
> [  124.099108][  T677] R13: 81fbd47b R14: fff7 R15: 
> 88815d9fbe40
> [  124.099720][  T677] FS:  7f9c2063ed00() GS:88846d20() 
> knlGS:
> [  124.100403][  T677] CS:  0010 DS:  ES:  CR0: 80050033
> [  124.100908][  T677] CR2: 0098 CR3: 000108e02002 CR4: 
> 00770ee0
> [  124.101537][  T677] DR0:  DR1:  DR2: 
> 
> [  124.102096][  T677] DR3:  DR6: 4ff0 DR7: 
> 0400
> [  124.102689][  T677] PKRU: 5554
> [  124.102988][  T677] Call Trace:
> [  124.103303][  T677]  
> [  124.103580][  T677]  ? __die+0x1f/0x70
> [  124.103928][  T677]  ? page_fault_oops+0x176/0x4d0
> [  124.104348][  T677]  ? __pte_offset_map_lock+0xa5/0x190
> [  124.104818][  T677]  ? do_user_addr_fault+0x73/0x870
> [  124.105277][  T677]  ? exc_page_fault+0x81/0x250
> [  124.105709][  T677]  ? asm_exc_page_fault+0x22/0x30
> [  124.106171][  T677]  ? bpf_prog_test_run_tracing+0x14b/0x2c0
> [  124.106675][  T677]  ? __pfx_bpf_fentry_test1+0x10/0x10
> [  124.107135][  T677]  ? __pfx_arch_rethook_prepare+0x10/0x10
> [  124.107598][  T677]  rethook_hook+0x10/0x30
> [  124.107966][  T677]  fprobe_handler+0x129/0x210
> [  124.108351][  T677]  ? __pfx_bpf_fentry_test1+0x10/0x10
> [  124.108796][  T677]  ? bpf_prog_test_run_tracing+0x14b/0x2c0
> [  124.109276][  T677]  arch_ftrace_ops_list_func+0xf2/0x200
> [  124.109708][  T677]  ftrace_call+0x5/0x44
> [  124.110060][  T677]  ? bpf_fentry_test1+0x5/0x10
> [  124.110463][  T677]  bpf_fentry_test1+0x5/0x10
> [  124.110881][  T677]  bpf_prog_test_run_tracing+0x14b/0x2c0
> [  124.111337][  T677]  __sys_bpf+0x305/0x2820
> [  124.111705][  T677]  __x64_sys_bpf+0x1a/0x30
> [  124.112053][  T677]  do_syscall_64+0x38/0x90
> [  124.116245][  T677]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> [  124.116688][  T677] RIP: 0033:0x7f9c20806b5d
> [  124.117078][  T677] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e 
> fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 

Re: [PATCH v5 04/12] fprobe: Use ftrace_regs in fprobe entry handler

2023-09-25 Thread Jiri Olsa
On Sun, Sep 24, 2023 at 10:36:36PM +0900, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) 
> 
> This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe
> on arm64.
> 
> Signed-off-by: Masami Hiramatsu (Google) 
> Acked-by: Florent Revest 

I was getting bpf selftests failures with this patchset and when
bisecting I'm getting crash when running on top of this change

looks like it's missing some of the regs NULL checks added later?

jirka


---
[  124.089449][  T677] BUG: kernel NULL pointer dereference, address: 
0098
[  124.090102][  T677] #PF: supervisor read access in kernel mode
[  124.090568][  T677] #PF: error_code(0x) - not-present page
[  124.091039][  T677] PGD 158fd8067 P4D 158fd8067 PUD 10896a067 PMD 0 
[  124.091482][  T677] Oops:  [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
[  124.091986][  T677] CPU: 1 PID: 677 Comm: test_progs Tainted: G   OE 
 6.6.0-rc2+ #768 1c8a8990289f2615e36dadd01915b80e8da29bf5
[  124.092898][  T677] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 
1.16.2-1.fc38 04/01/2014
[  124.093613][  T677] RIP: 0010:arch_rethook_prepare+0x0/0x30
[  124.094060][  T677] Code: 90 90 90 90 90 90 90 90 90 90 48 89 b7 a8 00 00 00 
c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <48> 
8b 86 98 00 00 00 48 8b 10 48 89 57 20 48 8b 96 98 00 00 00 48
[  124.096239][  T677] RSP: 0018:c90003d3bc98 EFLAGS: 00010286
[  124.096708][  T677] RAX:  RBX: 88815d9fbe50 RCX: 
88815d9fbe40
[  124.097332][  T677] RDX: 0001 RSI:  RDI: 
88815d9fbe40
[  124.097946][  T677] RBP:  R08:  R09: 
0004
[  124.098554][  T677] R10: 0001 R11:  R12: 
81fbb7b0
[  124.099108][  T677] R13: 81fbd47b R14: fff7 R15: 
88815d9fbe40
[  124.099720][  T677] FS:  7f9c2063ed00() GS:88846d20() 
knlGS:
[  124.100403][  T677] CS:  0010 DS:  ES:  CR0: 80050033
[  124.100908][  T677] CR2: 0098 CR3: 000108e02002 CR4: 
00770ee0
[  124.101537][  T677] DR0:  DR1:  DR2: 

[  124.102096][  T677] DR3:  DR6: 4ff0 DR7: 
0400
[  124.102689][  T677] PKRU: 5554
[  124.102988][  T677] Call Trace:
[  124.103303][  T677]  
[  124.103580][  T677]  ? __die+0x1f/0x70
[  124.103928][  T677]  ? page_fault_oops+0x176/0x4d0
[  124.104348][  T677]  ? __pte_offset_map_lock+0xa5/0x190
[  124.104818][  T677]  ? do_user_addr_fault+0x73/0x870
[  124.105277][  T677]  ? exc_page_fault+0x81/0x250
[  124.105709][  T677]  ? asm_exc_page_fault+0x22/0x30
[  124.106171][  T677]  ? bpf_prog_test_run_tracing+0x14b/0x2c0
[  124.106675][  T677]  ? __pfx_bpf_fentry_test1+0x10/0x10
[  124.107135][  T677]  ? __pfx_arch_rethook_prepare+0x10/0x10
[  124.107598][  T677]  rethook_hook+0x10/0x30
[  124.107966][  T677]  fprobe_handler+0x129/0x210
[  124.108351][  T677]  ? __pfx_bpf_fentry_test1+0x10/0x10
[  124.108796][  T677]  ? bpf_prog_test_run_tracing+0x14b/0x2c0
[  124.109276][  T677]  arch_ftrace_ops_list_func+0xf2/0x200
[  124.109708][  T677]  ftrace_call+0x5/0x44
[  124.110060][  T677]  ? bpf_fentry_test1+0x5/0x10
[  124.110463][  T677]  bpf_fentry_test1+0x5/0x10
[  124.110881][  T677]  bpf_prog_test_run_tracing+0x14b/0x2c0
[  124.111337][  T677]  __sys_bpf+0x305/0x2820
[  124.111705][  T677]  __x64_sys_bpf+0x1a/0x30
[  124.112053][  T677]  do_syscall_64+0x38/0x90
[  124.116245][  T677]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[  124.116688][  T677] RIP: 0033:0x7f9c20806b5d
[  124.117078][  T677] Code: c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 
48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 
3d 01 f0 ff ff 73 01 c3 48 8b 0d 7b 92 0c 00 f7 d8 64 89 01 48
[  124.118487][  T677] RSP: 002b:7ffc615e5228 EFLAGS: 0206 ORIG_RAX: 
0141
[  124.119090][  T677] RAX: ffda RBX: 7f9c20918000 RCX: 
7f9c20806b5d
[  124.119698][  T677] RDX: 0050 RSI: 7ffc615e5260 RDI: 
000a
[  124.120298][  T677] RBP: 7ffc615e5240 R08:  R09: 
7ffc615e5260
[  124.120893][  T677] R10: 0064 R11: 0206 R12: 
0001
[  124.121486][  T677] R13:  R14: 7f9c2094d000 R15: 
011a8db0
[  124.122156][  T677]  
[  124.122421][  T677] Modules linked in: bpf_testmod(OE) intel_rapl_msr 
intel_rapl_common crct10dif_pclmul crc32_pclmul crc32c_intel 
ghash_clmulni_intel kvm_intel rapl iTCO_wdt iTCO_vendor_support i2c_i801 
i2c_smbus lpc_ich drm loop drm_panel_orientation_quirks zram
[  124.125064][  T677] CR2: 0098
[  124.125431][  T677] ---[ end trace  ]---
[  124.125861][  T677] RIP: