Re: [PATCH] entry: Fix missed trap after single-step on system call return
On Wed, Feb 3, 2021 at 3:55 PM Kyle Huey wrote: > > I have verified that a) the test case I sent earlier passes now and b) > all rr tests pass now. Thanks for keeping on top of this. Thomas/Andy - the patch looks straightforward and obvious enough, and I don't see any issues with it, so I assume I'll get it through the normal channels and will archive this whole discussion. No huge hurry, as long as it hits 5.11 final so that we don't end up with a regression. Thanks, Linus
Re: [PATCH] entry: Fix missed trap after single-step on system call return
On Wed, Feb 3, 2021 at 10:11 AM Kyle Huey wrote: > > On Wed, Feb 3, 2021 at 10:00 AM Gabriel Krisman Bertazi > wrote: > > This seems to pass Kyle's test case. Kyle, can you verify it works with > > rr? > > I will test it later today. I have verified that a) the test case I sent earlier passes now and b) all rr tests pass now. Tested-by: Kyle Huey - Kyle
Re: [PATCH] entry: Fix missed trap after single-step on system call return
On Wed, Feb 3, 2021 at 10:18 AM Andy Lutomirski wrote: > > So if we want to reliably single-step a system call and trap after the > system call, we just need to synthesize a trap on the way out. Well, I think Gabriel's patch does exactly that, due to how SYSCALL_EXIT_TRAP is set. It looks like subsequent system calls will work exactly the way the concurrent system call case (that Kyle's test did) does. So it all _looks_ sane to me, but this is one of those "it needs testing". Linus
Re: [PATCH] entry: Fix missed trap after single-step on system call return
On Wed, Feb 3, 2021 at 10:10 AM Linus Torvalds wrote: > > On Wed, Feb 3, 2021 at 10:00 AM Gabriel Krisman Bertazi > wrote: > > > > Does the patch below follows your suggestion? I'm setting the > > SYSCALL_WORK shadowing TIF_SINGLESTEP every time, instead of only when > > the child is inside a system call. Is this acceptable? > > Looks sane to me. > > My main worry would be about "what about the next system call"? It's > not what Kyle's case cares about, but let me just give an example: > > - task A traces task B, and starts single-stepping. Task B was *not* > in a system call at this point. > > - task B happily executes one instruction at a time, takes a TF > fault, everything is good > > - task B now does a system call. That will disable single-stepping > while in the kernel > > - task B returns from the system call. TF will be set in eflags, but > the first instruction *after* the system call will execute unless we > go through the system call exit path > > So I think the tracer basically misses one instruction when single-stepping. I was hoping you wouldn't ask this :) The x86 architecture is fundamentally a bit busted here. If we return from a system call with SYSRET and TF is set in R11, then SYSRET traps, which means that #DB is delivered before executing a user instruction. I have been asking Intel for quite a while to document this, and they said they did, but I still can't find it. IRET is the opposite: if we return from a system call with IRET and TF is set on the stack, we execute one user instruction and then trap. So if we want to reliably single-step a system call and trap after the system call, we just need to synthesize a trap on the way out. Doing this and getting all the nasty corners (e.g. sigreturn setting TF, sigreturn *clearing* TF, signal delivery as part of the syscall, ptrace mucking with TF) etc right might be nontrivial. I suspect the behavior back in the bad old asm-entry-path days was at best inconsistent. --Andy
Re: [PATCH] entry: Fix missed trap after single-step on system call return
On Wed, Feb 3, 2021 at 10:00 AM Gabriel Krisman Bertazi wrote: > > Linus Torvalds writes: > > > On Sun, Jan 31, 2021 at 3:35 PM Linus Torvalds > > wrote: > >> > >> I wonder if the simple solution is to just > >> > >> (a) always set one of the SYSCALL_WORK_EXIT bits on the child in > >> ptrace (exactly to catch the child on system call exit) > >> > >> (b) basically revert 299155244770 ("entry: Drop usage of TIF flags in > >> the generic syscall code") and have the syscall exit code check the > >> TIF_SINGLESTEP flag > > > > Actually, (b) looks unnecessary - as long as we get to > > syscall_exit_work(), the current code will work fine. > > > > So maybe just add a dummy SYSCALL_WORK_SYSCALL_EXIT_TRAP, and set that > > flag whenever a singestep is requested for a process that is currently > > in a system call? > > > > IOW, make it a very explicit "do TF for system calls", rather than the > > old code that was doing so implicitly and not very obviously. Hmm? > > Linus, > > Does the patch below follows your suggestion? I'm setting the > SYSCALL_WORK shadowing TIF_SINGLESTEP every time, instead of only when > the child is inside a system call. Is this acceptable? > > This seems to pass Kyle's test case. Kyle, can you verify it works with > rr? I will test it later today. > I can also turn Kyle's test case into a selftest, if it is ok with him. Sure. Consider whatever license/copyright/etc you need granted. - Kyle > Thanks, > > -- >8 -- > Subject: [PATCH] entry: Fix missed trap after single-step on a system call > return > > Commit 299155244770 ("entry: Drop usage of TIF flags in the generic > syscall code") introduces a bug on architectures using the generic > syscall entry code, in which processes stopped by PTRACE_SYSCALL do not > trap on syscall return after receiving a TIF_SINGLESTEP. The reason is > the meaning of TIF_SINGLESTEP flag is overloaded to cause the trap after > a system call is executed, but since the above commit, the syscall call > handler only checks for the SYSCALL_WORK flags on the exit work. > > This patch splits the meaning of TIF_SINGLESTEP such that it only means > single-step mode, and creates a new type of SYSCALL_WORK to request a > trap immediately after a syscall in single-step mode. In the current > implementation, the SYSCALL_WORK flag shadows the TIF_SINGLESTEP flag > for simplicity. > > Since x86 is the only code already using the generic syscall handling, > this also updates that architecture to flip this bit when a tracer > enables single step. > > Suggested-by: Linus Torvalds > Fixes: 299155244770 ("entry: Drop usage of TIF flags in the generic syscall > code") > Signed-off-by: Gabriel Krisman Bertazi > --- > arch/x86/include/asm/entry-common.h | 2 -- > arch/x86/kernel/step.c | 10 -- > include/linux/entry-common.h| 1 + > include/linux/thread_info.h | 2 ++ > kernel/entry/common.c | 12 ++-- > 5 files changed, 13 insertions(+), 14 deletions(-) > > diff --git a/arch/x86/include/asm/entry-common.h > b/arch/x86/include/asm/entry-common.h > index 6fe54b2813c1..2b87b191b3b8 100644 > --- a/arch/x86/include/asm/entry-common.h > +++ b/arch/x86/include/asm/entry-common.h > @@ -43,8 +43,6 @@ static __always_inline void arch_check_user_regs(struct > pt_regs *regs) > } > #define arch_check_user_regs arch_check_user_regs > > -#define ARCH_SYSCALL_EXIT_WORK (_TIF_SINGLESTEP) > - > static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, > unsigned long ti_work) > { > diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c > index 60d2c3798ba2..de975b62f10a 100644 > --- a/arch/x86/kernel/step.c > +++ b/arch/x86/kernel/step.c > @@ -127,12 +127,17 @@ static int enable_single_step(struct task_struct *child) > regs->flags |= X86_EFLAGS_TF; > > /* > -* Always set TIF_SINGLESTEP - this guarantees that > -* we single-step system calls etc.. This will also > +* Always set TIF_SINGLESTEP. This will also > * cause us to set TF when returning to user mode. > */ > set_tsk_thread_flag(child, TIF_SINGLESTEP); > > + /* > +* Trigger a trap is triggered once stepping out of a system > +* call prior to executing any user instruction. > +*/ > + set_task_syscall_work(child, SYSCALL_EXIT_TRAP); > + > oflags = regs->flags; > > /* Set TF on the kernel stack.. */
Re: [PATCH] entry: Fix missed trap after single-step on system call return
On Wed, Feb 3, 2021 at 10:00 AM Gabriel Krisman Bertazi wrote: > > Does the patch below follows your suggestion? I'm setting the > SYSCALL_WORK shadowing TIF_SINGLESTEP every time, instead of only when > the child is inside a system call. Is this acceptable? Looks sane to me. My main worry would be about "what about the next system call"? It's not what Kyle's case cares about, but let me just give an example: - task A traces task B, and starts single-stepping. Task B was *not* in a system call at this point. - task B happily executes one instruction at a time, takes a TF fault, everything is good - task B now does a system call. That will disable single-stepping while in the kernel - task B returns from the system call. TF will be set in eflags, but the first instruction *after* the system call will execute unless we go through the system call exit path So I think the tracer basically misses one instruction when single-stepping. I think your patch works for this case (because the SYSCALL_EXIT_TRAP flag stays set until single-stepping is done), so I think it's all good. But can you verify, just to allay my worry? Linus
[PATCH] entry: Fix missed trap after single-step on system call return
Linus Torvalds writes: > On Sun, Jan 31, 2021 at 3:35 PM Linus Torvalds > wrote: >> >> I wonder if the simple solution is to just >> >> (a) always set one of the SYSCALL_WORK_EXIT bits on the child in >> ptrace (exactly to catch the child on system call exit) >> >> (b) basically revert 299155244770 ("entry: Drop usage of TIF flags in >> the generic syscall code") and have the syscall exit code check the >> TIF_SINGLESTEP flag > > Actually, (b) looks unnecessary - as long as we get to > syscall_exit_work(), the current code will work fine. > > So maybe just add a dummy SYSCALL_WORK_SYSCALL_EXIT_TRAP, and set that > flag whenever a singestep is requested for a process that is currently > in a system call? > > IOW, make it a very explicit "do TF for system calls", rather than the > old code that was doing so implicitly and not very obviously. Hmm? Linus, Does the patch below follows your suggestion? I'm setting the SYSCALL_WORK shadowing TIF_SINGLESTEP every time, instead of only when the child is inside a system call. Is this acceptable? This seems to pass Kyle's test case. Kyle, can you verify it works with rr? I can also turn Kyle's test case into a selftest, if it is ok with him. Thanks, -- >8 -- Subject: [PATCH] entry: Fix missed trap after single-step on a system call return Commit 299155244770 ("entry: Drop usage of TIF flags in the generic syscall code") introduces a bug on architectures using the generic syscall entry code, in which processes stopped by PTRACE_SYSCALL do not trap on syscall return after receiving a TIF_SINGLESTEP. The reason is the meaning of TIF_SINGLESTEP flag is overloaded to cause the trap after a system call is executed, but since the above commit, the syscall call handler only checks for the SYSCALL_WORK flags on the exit work. This patch splits the meaning of TIF_SINGLESTEP such that it only means single-step mode, and creates a new type of SYSCALL_WORK to request a trap immediately after a syscall in single-step mode. In the current implementation, the SYSCALL_WORK flag shadows the TIF_SINGLESTEP flag for simplicity. Since x86 is the only code already using the generic syscall handling, this also updates that architecture to flip this bit when a tracer enables single step. Suggested-by: Linus Torvalds Fixes: 299155244770 ("entry: Drop usage of TIF flags in the generic syscall code") Signed-off-by: Gabriel Krisman Bertazi --- arch/x86/include/asm/entry-common.h | 2 -- arch/x86/kernel/step.c | 10 -- include/linux/entry-common.h| 1 + include/linux/thread_info.h | 2 ++ kernel/entry/common.c | 12 ++-- 5 files changed, 13 insertions(+), 14 deletions(-) diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h index 6fe54b2813c1..2b87b191b3b8 100644 --- a/arch/x86/include/asm/entry-common.h +++ b/arch/x86/include/asm/entry-common.h @@ -43,8 +43,6 @@ static __always_inline void arch_check_user_regs(struct pt_regs *regs) } #define arch_check_user_regs arch_check_user_regs -#define ARCH_SYSCALL_EXIT_WORK (_TIF_SINGLESTEP) - static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, unsigned long ti_work) { diff --git a/arch/x86/kernel/step.c b/arch/x86/kernel/step.c index 60d2c3798ba2..de975b62f10a 100644 --- a/arch/x86/kernel/step.c +++ b/arch/x86/kernel/step.c @@ -127,12 +127,17 @@ static int enable_single_step(struct task_struct *child) regs->flags |= X86_EFLAGS_TF; /* -* Always set TIF_SINGLESTEP - this guarantees that -* we single-step system calls etc.. This will also +* Always set TIF_SINGLESTEP. This will also * cause us to set TF when returning to user mode. */ set_tsk_thread_flag(child, TIF_SINGLESTEP); + /* +* Trigger a trap is triggered once stepping out of a system +* call prior to executing any user instruction. +*/ + set_task_syscall_work(child, SYSCALL_EXIT_TRAP); + oflags = regs->flags; /* Set TF on the kernel stack.. */ @@ -230,6 +235,7 @@ void user_disable_single_step(struct task_struct *child) /* Always clear TIF_SINGLESTEP... */ clear_tsk_thread_flag(child, TIF_SINGLESTEP); + clear_task_syscall_work(child, SYSCALL_EXIT_TRAP); /* But touch TF only if it was set by us.. */ if (test_and_clear_tsk_thread_flag(child, TIF_FORCED_TF)) diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h index ca86a00abe86..a104b298019a 100644 --- a/include/linux/entry-common.h +++ b/include/linux/entry-common.h @@ -46,6 +46,7 @@ SYSCALL_WORK_SYSCALL_TRACE | \