Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

2024-05-15 Thread Stephen Brennan
Masami Hiramatsu (Google)  writes:
> On Thu, 2 May 2024 01:35:16 +0800
> Guo Ren  wrote:
>
>> On Thu, May 2, 2024 at 12:30 AM Stephen Brennan
>>  wrote:
>> >
>> > If an error happens in ftrace, ftrace_kill() will prevent disarming
>> > kprobes. Eventually, the ftrace_ops associated with the kprobes will be
>> > freed, yet the kprobes will still be active, and when triggered, they
>> > will use the freed memory, likely resulting in a page fault and panic.
>> >
>> > This behavior can be reproduced quite easily, by creating a kprobe and
>> > then triggering a ftrace_kill(). For simplicity, we can simulate an
>> > ftrace error with a kernel module like [1]:
>> >
>> > [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>> >
>> >   sudo perf probe --add commit_creds
>> >   sudo perf trace -e probe:commit_creds
>> >   # In another terminal
>> >   make
>> >   sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
>> >   # Back to perf terminal
>> >   # ctrl-c
>> >   sudo perf probe --del commit_creds
>> >
>> > After a short period, a page fault and panic would occur as the kprobe
>> > continues to execute and uses the freed ftrace_ops. While ftrace_kill()
>> > is supposed to be used only in extreme circumstances, it is invoked in
>> > FTRACE_WARN_ON() and so there are many places where an unexpected bug
>> > could be triggered, yet the system may continue operating, possibly
>> > without the administrator noticing. If ftrace_kill() does not panic the
>> > system, then we should do everything we can to continue operating,
>> > rather than leave a ticking time bomb.
>> >
>> > Signed-off-by: Stephen Brennan 
>> > ---
>> > Changes in v3:
>> >   Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
>> >   variable and check it directly in the kprobe handlers.
>> > Link to v1/v2 discussion:
>> >   
>> > https://lore.kernel.org/all/20240426225834.993353-1-stephen.s.bren...@oracle.com/
>> >
>> >  arch/csky/kernel/probes/ftrace.c | 3 +++
>> >  arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
>> >  arch/parisc/kernel/ftrace.c  | 3 +++
>> >  arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
>> >  arch/riscv/kernel/probes/ftrace.c| 3 +++
>> >  arch/s390/kernel/ftrace.c| 3 +++
>> >  arch/x86/kernel/kprobes/ftrace.c | 3 +++
>> >  include/linux/kprobes.h  | 7 +++
>> >  kernel/kprobes.c | 6 ++
>> >  kernel/trace/ftrace.c| 1 +
>> >  10 files changed, 35 insertions(+)
>> >
>> > diff --git a/arch/csky/kernel/probes/ftrace.c 
>> > b/arch/csky/kernel/probes/ftrace.c
>> > index 834cffcfbce3..7ba4b98076de 100644
>> > --- a/arch/csky/kernel/probes/ftrace.c
>> > +++ b/arch/csky/kernel/probes/ftrace.c
>> > @@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned 
>> > long parent_ip,
>> > struct kprobe_ctlblk *kcb;
>> > struct pt_regs *regs;
>> >
>> > +   if (unlikely(kprobe_ftrace_disabled))
>> > +   return;
>> > +
>> For csky part.
>> Acked-by: Guo Ren 
>
> Thanks Stephen, Guo and Steve!
>
> Let me pick this to probes/for-next!

Thank you Masami!

I did want to check, is this the correct git tree to be watching?

https://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git/log/?h=probes/for-next

( I'm not trying to pressure on timing, as I know the merge window is
  hectic. Just making sure I'm watching the correct place! )

Thanks,
Stephen


Re: [PATCH v3] kprobe/ftrace: bail out if ftrace was killed

2024-05-07 Thread Stephen Brennan
Christophe Leroy  writes:
> Le 01/05/2024 à 18:29, Stephen Brennan a écrit :
>> If an error happens in ftrace, ftrace_kill() will prevent disarming
>> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
>> freed, yet the kprobes will still be active, and when triggered, they
>> will use the freed memory, likely resulting in a page fault and panic.
>> 
>> This behavior can be reproduced quite easily, by creating a kprobe and
>> then triggering a ftrace_kill(). For simplicity, we can simulate an
>> ftrace error with a kernel module like [1]:
>> 
>> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>> 
>>sudo perf probe --add commit_creds
>>sudo perf trace -e probe:commit_creds
>># In another terminal
>>make
>>sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
>># Back to perf terminal
>># ctrl-c
>>sudo perf probe --del commit_creds
>> 
>> After a short period, a page fault and panic would occur as the kprobe
>> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
>> is supposed to be used only in extreme circumstances, it is invoked in
>> FTRACE_WARN_ON() and so there are many places where an unexpected bug
>> could be triggered, yet the system may continue operating, possibly
>> without the administrator noticing. If ftrace_kill() does not panic the
>> system, then we should do everything we can to continue operating,
>> rather than leave a ticking time bomb.
>> 
>> Signed-off-by: Stephen Brennan 
>> ---
>> Changes in v3:
>>Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
>>variable and check it directly in the kprobe handlers.
>
> Isn't it safer to provide a fonction rather than a direct access to a 
> variable ?

Is the concern that other code could modify this variable? If so, then I
suppose the function call is safer. But the variable is not exported and
I think built-in code can be trusted not to muck with it. Maybe I'm
missing your point about safety though?

> By the way, wouldn't it be more performant to use a static branch (jump 
> label) ?

I agree with Steven's concern that text modification would unfortunately
not be a good way to handle an error in text modification. Especially, I
believe there could be deadlock risks, as static key enablement requires
taking the text_mutex and the jump_label_mutex. I'd be concerned that
the text_mutex could already be held in some situations where
ftrace_kill() is called. But I'm not certain about that.

Thanks for taking a look!
Stephen


[PATCH v3] kprobe/ftrace: bail out if ftrace was killed

2024-05-01 Thread Stephen Brennan
If an error happens in ftrace, ftrace_kill() will prevent disarming
kprobes. Eventually, the ftrace_ops associated with the kprobes will be
freed, yet the kprobes will still be active, and when triggered, they
will use the freed memory, likely resulting in a page fault and panic.

This behavior can be reproduced quite easily, by creating a kprobe and
then triggering a ftrace_kill(). For simplicity, we can simulate an
ftrace error with a kernel module like [1]:

[1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer

  sudo perf probe --add commit_creds
  sudo perf trace -e probe:commit_creds
  # In another terminal
  make
  sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
  # Back to perf terminal
  # ctrl-c
  sudo perf probe --del commit_creds

After a short period, a page fault and panic would occur as the kprobe
continues to execute and uses the freed ftrace_ops. While ftrace_kill()
is supposed to be used only in extreme circumstances, it is invoked in
FTRACE_WARN_ON() and so there are many places where an unexpected bug
could be triggered, yet the system may continue operating, possibly
without the administrator noticing. If ftrace_kill() does not panic the
system, then we should do everything we can to continue operating,
rather than leave a ticking time bomb.

Signed-off-by: Stephen Brennan 
---
Changes in v3:
  Don't expose ftrace_is_dead(). Create a "kprobe_ftrace_disabled"
  variable and check it directly in the kprobe handlers.
Link to v1/v2 discussion:
  
https://lore.kernel.org/all/20240426225834.993353-1-stephen.s.bren...@oracle.com/

 arch/csky/kernel/probes/ftrace.c | 3 +++
 arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
 arch/parisc/kernel/ftrace.c  | 3 +++
 arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
 arch/riscv/kernel/probes/ftrace.c| 3 +++
 arch/s390/kernel/ftrace.c| 3 +++
 arch/x86/kernel/kprobes/ftrace.c | 3 +++
 include/linux/kprobes.h  | 7 +++
 kernel/kprobes.c | 6 ++
 kernel/trace/ftrace.c| 1 +
 10 files changed, 35 insertions(+)

diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index 834cffcfbce3..7ba4b98076de 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe_ctlblk *kcb;
struct pt_regs *regs;
 
+   if (unlikely(kprobe_ftrace_disabled))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/loongarch/kernel/ftrace_dyn.c 
b/arch/loongarch/kernel/ftrace_dyn.c
index 73858c9029cc..bff058317062 100644
--- a/arch/loongarch/kernel/ftrace_dyn.c
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p;
struct kprobe_ctlblk *kcb;
 
+   if (unlikely(kprobe_ftrace_disabled))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 621a4b386ae4..c91f9c2e61ed 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p;
int bit;
 
+   if (unlikely(kprobe_ftrace_disabled))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c 
b/arch/powerpc/kernel/kprobes-ftrace.c
index 072ebe7f290b..f8208c027148 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long 
parent_nip,
struct pt_regs *regs;
int bit;
 
+   if (unlikely(kprobe_ftrace_disabled))
+   return;
+
bit = ftrace_test_recursion_trylock(nip, parent_nip);
if (bit < 0)
return;
diff --git a/arch/riscv/kernel/probes/ftrace.c 
b/arch/riscv/kernel/probes/ftrace.c
index 7142ec42e889..a69dfa610aa8 100644
--- a/arch/riscv/kernel/probes/ftrace.c
+++ b/arch/riscv/kernel/probes/ftrace.c
@@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe_ctlblk *kcb;
int bit;
 
+   if (unlikely(kprobe_ftrace_disabled))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index c46381ea04ec..7f6f8c438c26 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip,

Re: [PATCH v2] kprobe/ftrace: bail out if ftrace was killed

2024-04-30 Thread Stephen Brennan
Steven Rostedt  writes:
> On Mon, 29 Apr 2024 10:47:18 -0700
> Stephen Brennan  wrote:
>
>> If an error happens in ftrace, ftrace_kill() will prevent disarming
>> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
>> freed, yet the kprobes will still be active, and when triggered, they
>> will use the freed memory, likely resulting in a page fault and panic.
>> 
>> This behavior can be reproduced quite easily, by creating a kprobe and
>> then triggering a ftrace_kill(). For simplicity, we can simulate an
>> ftrace error with a kernel module like [1]:
>> 
>> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>> 
>>   sudo perf probe --add commit_creds
>>   sudo perf trace -e probe:commit_creds
>>   # In another terminal
>>   make
>>   sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
>>   # Back to perf terminal
>>   # ctrl-c
>>   sudo perf probe --del commit_creds
>> 
>> After a short period, a page fault and panic would occur as the kprobe
>> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
>> is supposed to be used only in extreme circumstances, it is invoked in
>> FTRACE_WARN_ON() and so there are many places where an unexpected bug
>> could be triggered, yet the system may continue operating, possibly
>> without the administrator noticing. If ftrace_kill() does not panic the
>> system, then we should do everything we can to continue operating,
>> rather than leave a ticking time bomb.
>> 
>> Signed-off-by: Stephen Brennan 
>> ---
>> Difference from v1: removed both existing declarations of ftrace_is_dead()
>> from kernel/trace/trace.h.
>> 
>>  arch/csky/kernel/probes/ftrace.c | 3 +++
>>  arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
>>  arch/parisc/kernel/ftrace.c  | 3 +++
>>  arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
>>  arch/riscv/kernel/probes/ftrace.c| 3 +++
>>  arch/s390/kernel/ftrace.c| 3 +++
>>  arch/x86/kernel/kprobes/ftrace.c | 3 +++
>>  include/linux/ftrace.h   | 2 ++
>>  kernel/trace/trace.h | 2 --
>>  9 files changed, 23 insertions(+), 2 deletions(-)
>> 
>> diff --git a/arch/csky/kernel/probes/ftrace.c 
>> b/arch/csky/kernel/probes/ftrace.c
>> index 834cffcfbce3..3931bf9f707b 100644
>> --- a/arch/csky/kernel/probes/ftrace.c
>> +++ b/arch/csky/kernel/probes/ftrace.c
>> @@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
>> parent_ip,
>>  struct kprobe_ctlblk *kcb;
>>  struct pt_regs *regs;
>>  
>> +if (unlikely(ftrace_is_dead()))
>> +return;
>> +
>>  bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  if (bit < 0)
>>  return;
>> diff --git a/arch/loongarch/kernel/ftrace_dyn.c 
>> b/arch/loongarch/kernel/ftrace_dyn.c
>> index 73858c9029cc..82c952cb5be0 100644
>> --- a/arch/loongarch/kernel/ftrace_dyn.c
>> +++ b/arch/loongarch/kernel/ftrace_dyn.c
>> @@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned 
>> long parent_ip,
>>  struct kprobe *p;
>>  struct kprobe_ctlblk *kcb;
>>  
>> +if (unlikely(ftrace_is_dead()))
>> +return;
>> +
>>  bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  if (bit < 0)
>>  return;
>> diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
>> index 621a4b386ae4..3660834f54c3 100644
>> --- a/arch/parisc/kernel/ftrace.c
>> +++ b/arch/parisc/kernel/ftrace.c
>> @@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned 
>> long parent_ip,
>>  struct kprobe *p;
>>  int bit;
>>  
>> +if (unlikely(ftrace_is_dead()))
>> +return;
>> +
>>  bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  if (bit < 0)
>>  return;
>> diff --git a/arch/powerpc/kernel/kprobes-ftrace.c 
>> b/arch/powerpc/kernel/kprobes-ftrace.c
>> index 072ebe7f290b..85eb55aa1457 100644
>> --- a/arch/powerpc/kernel/kprobes-ftrace.c
>> +++ b/arch/powerpc/kernel/kprobes-ftrace.c
>> @@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned 
>> long parent_nip,
>>  struct pt_regs *regs;
>>  int bit;
>>  
>> +if (unlikely(ftrace_is_dead()))
>> +return;
>> +
>>  bit = ftrace_test_recursion_trylock(nip, parent_nip);
>>  if (bit < 0)
>>  

Re: [PATCH] kprobe/ftrace: bail out if ftrace was killed

2024-04-29 Thread Stephen Brennan
Masami Hiramatsu (Google)  writes:
> Hi Stephen,
>
> On Fri, 26 Apr 2024 15:58:34 -0700
> Stephen Brennan  wrote:
>
>> If an error happens in ftrace, ftrace_kill() will prevent disarming
>> kprobes. Eventually, the ftrace_ops associated with the kprobes will be
>> freed, yet the kprobes will still be active, and when triggered, they
>> will use the freed memory, likely resulting in a page fault and panic.
>
> Hmm, indeed.
>
>> 
>> This behavior can be reproduced quite easily, by creating a kprobe and
>> then triggering a ftrace_kill(). For simplicity, we can simulate an
>> ftrace error with a kernel module like [1]:
>> 
>> [1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer
>> 
>>   sudo perf probe --add commit_creds
>>   sudo perf trace -e probe:commit_creds
>>   # In another terminal
>>   make
>>   sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
>>   # Back to perf terminal
>>   # ctrl-c
>>   sudo perf probe --del commit_creds
>> 
>> After a short period, a page fault and panic would occur as the kprobe
>> continues to execute and uses the freed ftrace_ops. While ftrace_kill()
>> is supposed to be used only in extreme circumstances, it is invoked in
>> FTRACE_WARN_ON() and so there are many places where an unexpected bug
>> could be triggered, yet the system may continue operating, possibly
>> without the administrator noticing. If ftrace_kill() does not panic the
>> system, then we should do everything we can to continue operating,
>> rather than leave a ticking time bomb.
>
> OK, the patch looks good to me.
>
> Acked-by: Masami Hiramatsu (Google) 
>
> Thanks!

Hi Masami,

Thank you! Sadly I took a second look at the patch and noticed I forgot
to remove the existing declarations of ftrace_is_dead() from
kernel/trace/trace.h. I've sent v2 in reply to v1 in order to correct
that. I'm sorry for the churn.

Thanks,
Stephen

>> 
>> Signed-off-by: Stephen Brennan 
>> ---
>> 
>> Apologies for the wide net cast here. I recognize that a change like this
>> may need to be split up and go through arch-specific trees. I hoped to get
>> feedback on the patch itself. If it's satisfactory and the architecture
>> maintainers prefer it split out, I'm glad to do it. Thanks!
>> 
>>  arch/csky/kernel/probes/ftrace.c | 3 +++
>>  arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
>>  arch/parisc/kernel/ftrace.c  | 3 +++
>>  arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
>>  arch/riscv/kernel/probes/ftrace.c| 3 +++
>>  arch/s390/kernel/ftrace.c| 3 +++
>>  arch/x86/kernel/kprobes/ftrace.c | 3 +++
>>  include/linux/ftrace.h   | 2 ++
>>  8 files changed, 23 insertions(+)
>> 
>> diff --git a/arch/csky/kernel/probes/ftrace.c 
>> b/arch/csky/kernel/probes/ftrace.c
>> index 834cffcfbce3..3931bf9f707b 100644
>> --- a/arch/csky/kernel/probes/ftrace.c
>> +++ b/arch/csky/kernel/probes/ftrace.c
>> @@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
>> parent_ip,
>>  struct kprobe_ctlblk *kcb;
>>  struct pt_regs *regs;
>>  
>> +if (unlikely(ftrace_is_dead()))
>> +return;
>> +
>>  bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  if (bit < 0)
>>  return;
>> diff --git a/arch/loongarch/kernel/ftrace_dyn.c 
>> b/arch/loongarch/kernel/ftrace_dyn.c
>> index 73858c9029cc..82c952cb5be0 100644
>> --- a/arch/loongarch/kernel/ftrace_dyn.c
>> +++ b/arch/loongarch/kernel/ftrace_dyn.c
>> @@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned 
>> long parent_ip,
>>  struct kprobe *p;
>>  struct kprobe_ctlblk *kcb;
>>  
>> +if (unlikely(ftrace_is_dead()))
>> +return;
>> +
>>  bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  if (bit < 0)
>>  return;
>> diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
>> index 621a4b386ae4..3660834f54c3 100644
>> --- a/arch/parisc/kernel/ftrace.c
>> +++ b/arch/parisc/kernel/ftrace.c
>> @@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned 
>> long parent_ip,
>>  struct kprobe *p;
>>  int bit;
>>  
>> +if (unlikely(ftrace_is_dead()))
>> +return;
>> +
>>  bit = ftrace_test_recursion_trylock(ip, parent_ip);
>>  if (bit < 0)
>>  return;
>> diff --git a/arch/powerpc/kernel/kprobes-ftrace.c 
>> b/ar

[PATCH v2] kprobe/ftrace: bail out if ftrace was killed

2024-04-29 Thread Stephen Brennan
If an error happens in ftrace, ftrace_kill() will prevent disarming
kprobes. Eventually, the ftrace_ops associated with the kprobes will be
freed, yet the kprobes will still be active, and when triggered, they
will use the freed memory, likely resulting in a page fault and panic.

This behavior can be reproduced quite easily, by creating a kprobe and
then triggering a ftrace_kill(). For simplicity, we can simulate an
ftrace error with a kernel module like [1]:

[1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer

  sudo perf probe --add commit_creds
  sudo perf trace -e probe:commit_creds
  # In another terminal
  make
  sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
  # Back to perf terminal
  # ctrl-c
  sudo perf probe --del commit_creds

After a short period, a page fault and panic would occur as the kprobe
continues to execute and uses the freed ftrace_ops. While ftrace_kill()
is supposed to be used only in extreme circumstances, it is invoked in
FTRACE_WARN_ON() and so there are many places where an unexpected bug
could be triggered, yet the system may continue operating, possibly
without the administrator noticing. If ftrace_kill() does not panic the
system, then we should do everything we can to continue operating,
rather than leave a ticking time bomb.

Signed-off-by: Stephen Brennan 
---
Difference from v1: removed both existing declarations of ftrace_is_dead()
from kernel/trace/trace.h.

 arch/csky/kernel/probes/ftrace.c | 3 +++
 arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
 arch/parisc/kernel/ftrace.c  | 3 +++
 arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
 arch/riscv/kernel/probes/ftrace.c| 3 +++
 arch/s390/kernel/ftrace.c| 3 +++
 arch/x86/kernel/kprobes/ftrace.c | 3 +++
 include/linux/ftrace.h   | 2 ++
 kernel/trace/trace.h | 2 --
 9 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index 834cffcfbce3..3931bf9f707b 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe_ctlblk *kcb;
struct pt_regs *regs;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/loongarch/kernel/ftrace_dyn.c 
b/arch/loongarch/kernel/ftrace_dyn.c
index 73858c9029cc..82c952cb5be0 100644
--- a/arch/loongarch/kernel/ftrace_dyn.c
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p;
struct kprobe_ctlblk *kcb;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 621a4b386ae4..3660834f54c3 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p;
int bit;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c 
b/arch/powerpc/kernel/kprobes-ftrace.c
index 072ebe7f290b..85eb55aa1457 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long 
parent_nip,
struct pt_regs *regs;
int bit;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(nip, parent_nip);
if (bit < 0)
return;
diff --git a/arch/riscv/kernel/probes/ftrace.c 
b/arch/riscv/kernel/probes/ftrace.c
index 7142ec42e889..8814fbe4c888 100644
--- a/arch/riscv/kernel/probes/ftrace.c
+++ b/arch/riscv/kernel/probes/ftrace.c
@@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe_ctlblk *kcb;
int bit;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index c46381ea04ec..ccbe8ccf945b 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p;
int bit;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)

[PATCH] kprobe/ftrace: bail out if ftrace was killed

2024-04-27 Thread Stephen Brennan
If an error happens in ftrace, ftrace_kill() will prevent disarming
kprobes. Eventually, the ftrace_ops associated with the kprobes will be
freed, yet the kprobes will still be active, and when triggered, they
will use the freed memory, likely resulting in a page fault and panic.

This behavior can be reproduced quite easily, by creating a kprobe and
then triggering a ftrace_kill(). For simplicity, we can simulate an
ftrace error with a kernel module like [1]:

[1]: https://github.com/brenns10/kernel_stuff/tree/master/ftrace_killer

  sudo perf probe --add commit_creds
  sudo perf trace -e probe:commit_creds
  # In another terminal
  make
  sudo insmod ftrace_killer.ko  # calls ftrace_kill(), simulating bug
  # Back to perf terminal
  # ctrl-c
  sudo perf probe --del commit_creds

After a short period, a page fault and panic would occur as the kprobe
continues to execute and uses the freed ftrace_ops. While ftrace_kill()
is supposed to be used only in extreme circumstances, it is invoked in
FTRACE_WARN_ON() and so there are many places where an unexpected bug
could be triggered, yet the system may continue operating, possibly
without the administrator noticing. If ftrace_kill() does not panic the
system, then we should do everything we can to continue operating,
rather than leave a ticking time bomb.

Signed-off-by: Stephen Brennan 
---

Apologies for the wide net cast here. I recognize that a change like this
may need to be split up and go through arch-specific trees. I hoped to get
feedback on the patch itself. If it's satisfactory and the architecture
maintainers prefer it split out, I'm glad to do it. Thanks!

 arch/csky/kernel/probes/ftrace.c | 3 +++
 arch/loongarch/kernel/ftrace_dyn.c   | 3 +++
 arch/parisc/kernel/ftrace.c  | 3 +++
 arch/powerpc/kernel/kprobes-ftrace.c | 3 +++
 arch/riscv/kernel/probes/ftrace.c| 3 +++
 arch/s390/kernel/ftrace.c| 3 +++
 arch/x86/kernel/kprobes/ftrace.c | 3 +++
 include/linux/ftrace.h   | 2 ++
 8 files changed, 23 insertions(+)

diff --git a/arch/csky/kernel/probes/ftrace.c b/arch/csky/kernel/probes/ftrace.c
index 834cffcfbce3..3931bf9f707b 100644
--- a/arch/csky/kernel/probes/ftrace.c
+++ b/arch/csky/kernel/probes/ftrace.c
@@ -12,6 +12,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe_ctlblk *kcb;
struct pt_regs *regs;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/loongarch/kernel/ftrace_dyn.c 
b/arch/loongarch/kernel/ftrace_dyn.c
index 73858c9029cc..82c952cb5be0 100644
--- a/arch/loongarch/kernel/ftrace_dyn.c
+++ b/arch/loongarch/kernel/ftrace_dyn.c
@@ -287,6 +287,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p;
struct kprobe_ctlblk *kcb;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/parisc/kernel/ftrace.c b/arch/parisc/kernel/ftrace.c
index 621a4b386ae4..3660834f54c3 100644
--- a/arch/parisc/kernel/ftrace.c
+++ b/arch/parisc/kernel/ftrace.c
@@ -206,6 +206,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p;
int bit;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c 
b/arch/powerpc/kernel/kprobes-ftrace.c
index 072ebe7f290b..85eb55aa1457 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -21,6 +21,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long 
parent_nip,
struct pt_regs *regs;
int bit;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(nip, parent_nip);
if (bit < 0)
return;
diff --git a/arch/riscv/kernel/probes/ftrace.c 
b/arch/riscv/kernel/probes/ftrace.c
index 7142ec42e889..8814fbe4c888 100644
--- a/arch/riscv/kernel/probes/ftrace.c
+++ b/arch/riscv/kernel/probes/ftrace.c
@@ -11,6 +11,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe_ctlblk *kcb;
int bit;
 
+   if (unlikely(ftrace_is_dead()))
+   return;
+
bit = ftrace_test_recursion_trylock(ip, parent_ip);
if (bit < 0)
return;
diff --git a/arch/s390/kernel/ftrace.c b/arch/s390/kernel/ftrace.c
index c46381ea04ec..ccbe8ccf945b 100644
--- a/arch/s390/kernel/ftrace.c
+++ b/arch/s390/kernel/ftrace.c
@@ -296,6 +296,9 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long 
parent_ip,
struct kprobe *p;
int bit;
 
+   if (unlikely

Re: [PATCH v3 5/9] mm: remove CONFIG_DISCONTIGMEM

2021-06-11 Thread Stephen Brennan
Mike Rapoport  writes:
> From: Mike Rapoport 
>
> There are no architectures that support DISCONTIGMEM left.
>
> Remove the configuration option and the dead code it was guarding in the
> generic memory management code.
>
> Signed-off-by: Mike Rapoport 
> ---
>  include/asm-generic/memory_model.h | 37 --
>  include/linux/mmzone.h |  8 ---
>  mm/Kconfig | 25 +++-
>  mm/page_alloc.c| 13 ---
>  4 files changed, 12 insertions(+), 71 deletions(-)
>
> diff --git a/include/asm-generic/memory_model.h 
> b/include/asm-generic/memory_model.h
> index 7637fb46ba4f..a2c8ed60233a 100644
> --- a/include/asm-generic/memory_model.h
> +++ b/include/asm-generic/memory_model.h
> @@ -6,47 +6,18 @@
>  
>  #ifndef __ASSEMBLY__
>  
> +/*
> + * supports 3 memory models.
> + */

This comment could either be updated to reflect 2 memory models, or
removed entirely.

Thanks,
Stephen

>  #if defined(CONFIG_FLATMEM)
>  
>  #ifndef ARCH_PFN_OFFSET
>  #define ARCH_PFN_OFFSET  (0UL)
>  #endif
>  
> -#elif defined(CONFIG_DISCONTIGMEM)
> -
> -#ifndef arch_pfn_to_nid
> -#define arch_pfn_to_nid(pfn) pfn_to_nid(pfn)
> -#endif
> -
> -#ifndef arch_local_page_offset
> -#define arch_local_page_offset(pfn, nid) \
> - ((pfn) - NODE_DATA(nid)->node_start_pfn)
> -#endif
> -
> -#endif /* CONFIG_DISCONTIGMEM */
> -
> -/*
> - * supports 3 memory models.
> - */
> -#if defined(CONFIG_FLATMEM)
> -
>  #define __pfn_to_page(pfn)   (mem_map + ((pfn) - ARCH_PFN_OFFSET))
>  #define __page_to_pfn(page)  ((unsigned long)((page) - mem_map) + \
>ARCH_PFN_OFFSET)
> -#elif defined(CONFIG_DISCONTIGMEM)
> -
> -#define __pfn_to_page(pfn)   \
> -({   unsigned long __pfn = (pfn);\
> - unsigned long __nid = arch_pfn_to_nid(__pfn);  \
> - NODE_DATA(__nid)->node_mem_map + arch_local_page_offset(__pfn, __nid);\
> -})
> -
> -#define __page_to_pfn(pg)\
> -({   const struct page *__pg = (pg); \
> - struct pglist_data *__pgdat = NODE_DATA(page_to_nid(__pg)); \
> - (unsigned long)(__pg - __pgdat->node_mem_map) + \
> -  __pgdat->node_start_pfn;   \
> -})
>  
>  #elif defined(CONFIG_SPARSEMEM_VMEMMAP)
>  
> @@ -70,7 +41,7 @@
>   struct mem_section *__sec = __pfn_to_section(__pfn);\
>   __section_mem_map_addr(__sec) + __pfn;  \
>  })
> -#endif /* CONFIG_FLATMEM/DISCONTIGMEM/SPARSEMEM */
> +#endif /* CONFIG_FLATMEM/SPARSEMEM */
>  
>  /*
>   * Convert a physical address to a Page Frame Number and back
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 0d53eba1c383..700032e99419 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -738,10 +738,12 @@ struct zonelist {
>   struct zoneref _zonerefs[MAX_ZONES_PER_ZONELIST + 1];
>  };
>  
> -#ifndef CONFIG_DISCONTIGMEM
> -/* The array of struct pages - for discontigmem use pgdat->lmem_map */
> +/*
> + * The array of struct pages for flatmem.
> + * It must be declared for SPARSEMEM as well because there are configurations
> + * that rely on that.
> + */
>  extern struct page *mem_map;
> -#endif
>  
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  struct deferred_split {
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 02d44e3420f5..218b96ccc84a 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -19,7 +19,7 @@ choice
>  
>  config FLATMEM_MANUAL
>   bool "Flat Memory"
> - depends on !(ARCH_DISCONTIGMEM_ENABLE || ARCH_SPARSEMEM_ENABLE) || 
> ARCH_FLATMEM_ENABLE
> + depends on !ARCH_SPARSEMEM_ENABLE || ARCH_FLATMEM_ENABLE
>   help
> This option is best suited for non-NUMA systems with
> flat address space. The FLATMEM is the most efficient
> @@ -32,21 +32,6 @@ config FLATMEM_MANUAL
>  
> If unsure, choose this option (Flat Memory) over any other.
>  
> -config DISCONTIGMEM_MANUAL
> - bool "Discontiguous Memory"
> - depends on ARCH_DISCONTIGMEM_ENABLE
> - help
> -   This option provides enhanced support for discontiguous
> -   memory systems, over FLATMEM.  These systems have holes
> -   in their physical address spaces, and this option provides
> -   more efficient handling of these holes.
> -
> -   Although "Discontiguous Memory" is still used by several
> -   architectures, it is considered deprecated in favor of
> -   "Sparse Memory".
> -
> -   If unsure, choose "Sparse Memory" over this option.
> -
>  config SPARSEMEM_MANUAL
>   bool "Sparse Memory"
>   depends on ARCH_SPARSEMEM_ENABLE
> @@ -62,17 +47,13 @@ config SPARSEMEM_MANUAL
>  
>  endchoice
>  
> -config DISCONTIGMEM
> - def_bool y
> - depends on (!SELECT_MEMORY_MODEL && ARCH_DISCONTIGMEM_ENABLE) || 
> DISCONTIGMEM_MANUAL
> -
>  config SPARSEMEM
>   def_bool y
>   depends on