On Tue, Jul 29, 2025 at 02:23:10PM -0400, Steven Rostedt wrote:
> @@ -212,32 +225,59 @@ int unwind_deferred_request(struct unwind_work *work,
> u64 *cookie)
>
> *cookie = get_cookie(info);
>
> - /* callback already pending? */
> - pending = READ_ONCE(info->pending);
> - if (p
On Tue, Jul 29, 2025 at 02:23:13PM -0400, Steven Rostedt wrote:
> @@ -230,6 +232,14 @@ int unwind_deferred_request(struct unwind_work *work,
> u64 *cookie)
> if (WARN_ON_ONCE(!CAN_USE_IN_NMI && in_nmi()))
> return -EINVAL;
>
> + /* Do not allow cancelled works to request
On Thu, Sep 18, 2025 at 03:10:18PM -0400, Steven Rostedt wrote:
> On Thu, 18 Sep 2025 19:32:20 +0200
> Peter Zijlstra wrote:
>
> > > Now, task_work_run() is in the exit_to_user_mode_loop() which is notably
> > > *before* exit_to_user_mode() which does the unwind_reset_info().
> > >
> > > What ha
On Fri, Jul 18, 2025 at 10:28:32AM +0200, Jens Remus wrote:
> On 17.07.2025 13:09, Jens Remus wrote:
> > On 17.07.2025 01:01, Josh Poimboeuf wrote:
> >> On Thu, Jul 10, 2025 at 06:35:13PM +0200, Jens Remus wrote:
> >>> +++ b/arch/Kconfig
> >>> @@ -450,6
On Thu, Jul 17, 2025 at 02:20:12PM +0200, Jens Remus wrote:
> >> +done_backchain:
> >>state->topmost = false;
> >>return 0;
> >
> > This feels very grafted on, is there not some way to make it more
> > generic, i.e., to just work with CONFIG_HAVE_UNWIND_USER_FP?
>
> I agree. It could pro
On Thu, Jul 17, 2025 at 02:07:05PM +0200, Jens Remus wrote:
> On 17.07.2025 04:50, Josh Poimboeuf wrote:
> > So the following is wrong:
> >
> > case UNWIND_USER_LOC_STACK:
> > if (!frame->fp.frame_off)
> > goto done;
> >
On Thu, Jul 17, 2025 at 01:28:25PM +0200, Jens Remus wrote:
> >>}
> >>
> >>/* Get the Frame Pointer (FP) */
> >> - if (frame->fp_off && unwind_get_user_long(fp, cfa + frame->fp_off,
> >> state))
> >> + switch (frame->fp.loc) {
> >> + case UNWIND_USER_LOC_NONE:
> >> + break;
>
On Thu, Jul 17, 2025 at 11:27:45AM +0200, Jens Remus wrote:
> On 16.07.2025 23:32, Josh Poimboeuf wrote:
> > On Thu, Jul 10, 2025 at 06:35:12PM +0200, Jens Remus wrote:
> >> Most architectures define their CFA as the value of the stack pointer
> >> (SP) at the call sit
On Wed, Jul 16, 2025 at 11:57:51PM -0400, Steven Rostedt wrote:
> On Wed, 16 Jul 2025 19:01:06 -0700
> Josh Poimboeuf wrote:
>
> > > + if (unwind_user_get_reg(&ra, frame->ra.regnum))
> > > + goto done;
> > > + break;
>
On Wed, Jul 16, 2025 at 07:01:09PM -0700, Josh Poimboeuf wrote:
> > state->ip = ra;
> > state->sp = sp;
> > - if (frame->fp_off)
> > + if (frame->fp.loc != UNWIND_USER_LOC_NONE)
> > state->fp = fp;
>
> Instead of the ex
On Thu, Jul 10, 2025 at 06:35:18PM +0200, Jens Remus wrote:
> @@ -66,12 +73,20 @@ static int unwind_user_next(struct unwind_user_state
> *state)
> /* sframe expects the frame to be local storage */
> frame = &_frame;
> if (sframe_find(state->ip, frame, top
On Thu, Jul 10, 2025 at 06:35:14PM +0200, Jens Remus wrote:
> +#ifndef unwind_user_get_reg
> +
> +/**
> + * generic_unwind_user_get_reg - Get register value.
> + * @val: Register value.
> + * @regnum: DWARF register number to obtain the value from.
> + *
> + * Returns zero if successful. Otherwise
On Thu, Jul 10, 2025 at 06:35:13PM +0200, Jens Remus wrote:
> +++ b/arch/Kconfig
> @@ -450,6 +450,11 @@ config HAVE_UNWIND_USER_SFRAME
> bool
> select UNWIND_USER
>
> +config HAVE_USER_RA_REG
> + bool
> + help
> + The arch passes the return address (RA) in user space in
On Thu, Jul 10, 2025 at 06:35:12PM +0200, Jens Remus wrote:
> Most architectures define their CFA as the value of the stack pointer
> (SP) at the call site in the previous frame, as suggested by the DWARF
> standard:
>
> CFA =
>
> Enable unwinding of user space for architectures, such as s390,
On Tue, Jul 08, 2025 at 12:31:20PM -0400, Steven Rostedt wrote:
> On Tue, 8 Jul 2025 08:53:56 -0700
> Linus Torvalds wrote:
>
> > On Tue, 8 Jul 2025 at 07:41, Steven Rostedt wrote:
> > >
> > > Would something like this work? If someone enables the config to enable
> > > the
> > > validation, I
On Tue, Jul 08, 2025 at 09:23:51AM -0400, Steven Rostedt wrote:
> On Mon, 7 Jul 2025 20:38:35 -0700
> Linus Torvalds wrote:
>
> > On Mon, 7 Jul 2025 at 19:12, Steven Rostedt wrote:
>
> > This patch is disgusting, in other words. It's wrong. STOP IT.
> >
>
> No problem, I can easily drop it.
>
On Thu, Jun 12, 2025 at 02:44:18PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 10, 2025 at 6:03 PM Steven Rostedt wrote:
> >
> >
> > Hi Peter and Ingo,
> >
> > This is the first patch series of a set that will make it possible to be
> > able
> > to use SFrames[1] in the Linux kernel. A quick reca
On Fri, May 09, 2025 at 02:53:38PM -0700, Andrii Nakryiko wrote:
> On Fri, May 9, 2025 at 9:52 AM Steven Rostedt wrote:
> >
> > From: Josh Poimboeuf
> >
> > get_perf_callchain() doesn't support cross-task unwinding for user space
> > stacks, have it retu
On Sun, May 04, 2025 at 12:43:30PM -0400, Steven Rostedt wrote:
> On Sun, 4 May 2025 11:30:32 +0200
> Ingo Molnar wrote:
>
> > > +struct unwind_user_state {
> > > + unsigned long ip;
> > > + unsigned long sp;
> > > + unsigned long fp;
> > > + enum unwind_user_type type;
> > > + bool done;
> > > +
On Mon, Apr 28, 2025 at 09:17:01AM -0700, Josh Poimboeuf wrote:
> On Fri, Apr 25, 2025 at 04:22:22PM -0700, Indu Bhagat wrote:
> > On 4/24/25 7:37 PM, Steven Rostedt wrote:
> > > From: Josh Poimboeuf
> > >
> > > Use the CFI macros instead of the raw .cfi_* di
On Thu, Apr 24, 2025 at 10:37:55PM -0400, Steven Rostedt wrote:
> From: Josh Poimboeuf
>
> Use the CFI macros instead of the raw .cfi_* directives to be consistent
> with the rest of the VDSO asm. It's also easier on the eyes.
>
> No functional changes.
>
>
On Fri, Apr 25, 2025 at 04:22:22PM -0700, Indu Bhagat wrote:
> On 4/24/25 7:37 PM, Steven Rostedt wrote:
> > From: Josh Poimboeuf
> >
> > Use the CFI macros instead of the raw .cfi_* directives to be consistent
> > with the rest of the VDSO asm. It's also
On Fri, Apr 25, 2025 at 10:54:33AM -0400, Steven Rostedt wrote:
> From: Steven Rostedt
>
> To determine if a task is a kernel thread or not, it is more reliable to
> use (current->flags & PF_KTHREAD) than to rely on current->mm being NULL.
> That is because some kernel tasks (io_uring helpers) ma
On Tue, Apr 22, 2025 at 02:34:45PM -0400, Steven Rostedt wrote:
> +++ b/arch/x86/entry/vdso/Makefile
> @@ -47,13 +47,17 @@ quiet_cmd_vdso2c = VDSO2C $@
> $(obj)/vdso-image-%.c: $(obj)/vdso%.so.dbg $(obj)/vdso%.so $(obj)/vdso2c
> FORCE
> $(call if_changed,vdso2c)
>
> +ifeq ($(CONFIG_AS_SF
On Tue, Apr 22, 2025 at 02:34:42PM -0400, Steven Rostedt wrote:
> From: Josh Poimboeuf
>
> The DWARF .cfi_startproc annotation needs to be at the very beginning of
> a function. But with kernel IBT that doesn't happen as ENDBR is
> sneakily embedded in SYM_FUNC_START. A
On Tue, Apr 22, 2025 at 12:15:41PM -0400, Steven Rostedt wrote:
> On Wed, 22 Jan 2025 13:42:28 +0100
> Peter Zijlstra wrote:
>
> > If we hit before schedule(), all just works as expected, if we hit after
> > schedule(), the task will already have the TIF flag set, and we'll hit
> > the return to
On Wed, Feb 05, 2025 at 02:56:36PM +0100, Jens Remus wrote:
> On 29.01.2025 03:02, Josh Poimboeuf wrote:
>
> > Note FDEs aren't even needed here as the unwinder doesn't need to know
> > when a function begins/ends. The only info needed by the unwinder is
> >
On Wed, Feb 05, 2025 at 10:47:58AM +0100, Jens Remus wrote:
> UNSAFE_GET_USER_INC(ra_off, cur, offset_size, Efault);
>
> With offset_size=1 expands into:
>
> __UNSAFE_GET_USER_INC(/*to=*/ra_off, /*from=*/cur, /*type=*/u8,
> /*label=*/Efault);
>
> Expands into:
>
> {
> u8 __to;
> uns
On Thu, Jan 30, 2025 at 03:21:36PM -0500, Steven Rostedt wrote:
> Coming back from this. It would be fine if we could do the back trace when
> we come back from the scheduler, so it should not be an issue if the task
> even has to schedule again to fault in the sframe information.
So there would b
On Thu, Jan 30, 2025 at 01:39:52PM -0800, Indu Bhagat wrote:
> On 1/28/25 6:02 PM, Josh Poimboeuf wrote:
> > However, if we're going that route, we might want to even consider a
> > completely revamped data layout. For example:
> >
> > One insight is that t
On Thu, Jan 30, 2025 at 01:21:21PM -0800, Indu Bhagat wrote:
> > Yeah, and it's actually bothering me quite a lot 🙂 I have a tentative
> > proposal, maybe we can discuss this for SFrame v3? Let me briefly
> > outline the idea.
> >
>
> I looked at the idea below. It could work wrt unaligned acces
On Thu, Jan 30, 2025 at 05:38:24PM +0100, Jens Remus wrote:
> Add a similar debug message for SFrame FDE user copy failures?
>
> diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
>
> @@ -125,6 +125,7 @@ static __always_inline int __find_fde(struct
> sframe_section *sec,
> retu
On Thu, Jan 30, 2025 at 07:51:15PM +, Weinan Liu wrote:
> Nit: swap() might be a simpler way to alternate pointers between two
> fre_addr[] entries.
>
> For example,
>
> static __always_inline int __find_fre(struct sframe_section *sec,
> struct sframe_fde
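
The swap() suggestion above can be illustrated with a small user-space sketch. It is not the kernel's __find_fre(): struct fre, SWAP_PTR() and find_fre() are hypothetical names invented here, and the "read" step is a plain struct copy rather than a user-memory access. It only shows how swapping two pointers lets 'fre' and 'prev' alternate between two buffers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a decoded FRE. */
struct fre {
	unsigned int ip_off;	/* offset from function start */
	int cfa_off;		/* illustrative payload */
};

/* User-space stand-in for the kernel's swap() macro. */
#define SWAP_PTR(a, b) \
	do { struct fre *tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/*
 * Find the last entry whose ip_off is <= the requested ip_off.
 * 'table' is assumed to be sorted by ip_off in ascending order.
 * Returns 0 and fills *out on success, -1 if no entry matches.
 */
static int find_fre(const struct fre *table, size_t n,
		    unsigned int ip_off, struct fre *out)
{
	struct fre bufs[2];
	struct fre *fre = &bufs[0], *prev = &bufs[1];
	bool have_prev = false;

	for (size_t i = 0; i < n; i++) {
		*fre = table[i];	/* "read" the next entry */
		if (fre->ip_off > ip_off)
			break;
		/* The entry just read becomes 'prev'; reuse the other buffer. */
		SWAP_PTR(fre, prev);
		have_prev = true;
	}

	if (!have_prev)
		return -1;
	*out = *prev;
	return 0;
}
```

After each swap, 'prev' points at the most recently accepted entry and 'fre' points at the buffer that is free to be overwritten, so no index toggling is needed.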
On Thu, Jan 30, 2025 at 05:17:33PM +0100, Jens Remus wrote:
> On 22.01.2025 03:31, Josh Poimboeuf wrote:
> > When debugging sframe issues, the error messages aren't all that helpful
> > without knowing what file a corresponding .sframe section belongs to.
> > Prefix deb
On Thu, Jan 30, 2025 at 04:47:00PM +0100, Jens Remus wrote:
> On 22.01.2025 03:31, Josh Poimboeuf wrote:
> > +struct sframe_fre {
> > + unsigned int size;
> > + s32 ip_off;
>
> The IP offset (from function start) in the SFrame V2 FDE is unsigne
On Thu, Jan 30, 2025 at 07:07:32AM -0800, Indu Bhagat wrote:
> On 1/21/25 6:31 PM, Josh Poimboeuf wrote:
> > + for (i = 0; i < fde->fres_num; i++) {
> > + int ret;
> > +
> > + /*
> > +* Alternate between t
On Wed, Jan 29, 2025 at 04:02:34PM -0800, Andrii Nakryiko wrote:
> On Tue, Jan 28, 2025 at 6:02 PM Josh Poimboeuf wrote:
> I'm not sure about this chunked lookup approach for arbitrary user
> space applications. Those executable sections can be a) big and b)
> discontiguous.
On Mon, Jan 27, 2025 at 05:10:27PM -0800, Andrii Nakryiko wrote:
> > Yes, in theory, it is allowed (as per the specification) to have an
> > SFrame section with zero number of FDEs/FREs. But since such a section
> > will not be useful, I share the opinion that it makes sense to disallow
> > it in
On Tue, Jan 28, 2025 at 11:50:25AM +0100, Jens Remus wrote:
> On 28.01.2025 01:39, Andrii Nakryiko wrote:
> > On Fri, Jan 24, 2025 at 1:41 PM Josh Poimboeuf wrote:
> > > On Fri, Jan 24, 2025 at 10:02:46AM -0800, Andrii Nakryiko wrote:
> > > > On Tue, Jan 21,
On Fri, Jan 24, 2025 at 02:46:48PM -0800, Josh Poimboeuf wrote:
> On Fri, Jan 24, 2025 at 04:58:03PM -0500, Steven Rostedt wrote:
> > Now the only thing I could think of is a flag gets set where the task comes
> > out of the scheduler and then does the stack trace. It doesn'
On Fri, Jan 24, 2025 at 04:58:03PM -0500, Steven Rostedt wrote:
> On Thu, 23 Jan 2025 23:13:26 +0100
> Peter Zijlstra wrote:
>
> > -EPONIES, you cannot take faults from the middle of schedule(). They can
> > always use the best effort FP unwind we have today.
>
> Agreed.
>
> Now the only thing
On Fri, Jan 24, 2025 at 03:13:40PM -0500, Steven Rostedt wrote:
> On Fri, 24 Jan 2025 11:21:59 -0800
> Josh Poimboeuf wrote:
>
> > > given SFRAME_F_FRAME_POINTER in the header, is it really that
> > > nonsensical and illegal to have zero FDEs/FREs? Maybe we should allo
On Fri, Jan 24, 2025 at 03:02:11PM -0500, Steven Rostedt wrote:
> On Tue, 21 Jan 2025 18:31:03 -0800
> Josh Poimboeuf wrote:
> > +int unwind_user_start(struct unwind_user_state *state)
> > +{
> > + struct pt_regs *regs = task_pt_regs(current);
> > +
> >
On Fri, Jan 24, 2025 at 10:13:23AM -0800, Andrii Nakryiko wrote:
> On Tue, Jan 21, 2025 at 6:32 PM Josh Poimboeuf wrote:
> > @@ -430,10 +429,8 @@ static long __bpf_get_stack(struct pt_regs *regs,
> > struct task_struct *task,
> > if (task &
On Fri, Jan 24, 2025 at 03:09:27PM -0500, Steven Rostedt wrote:
> On Tue, 21 Jan 2025 18:31:06 -0800
> Josh Poimboeuf wrote:
>
> > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> > index c75c482d4c52..23ac6343cf86 100644
> > --- a/arch/x86/events/core.
On Fri, Jan 24, 2025 at 10:02:46AM -0800, Andrii Nakryiko wrote:
> On Tue, Jan 21, 2025 at 6:32 PM Josh Poimboeuf wrote:
> > +static __always_inline int __read_fde(struct sframe_section *sec,
> > + unsig
On Fri, Jan 24, 2025 at 10:00:52AM -0800, Andrii Nakryiko wrote:
> On Tue, Jan 21, 2025 at 6:32 PM Josh Poimboeuf wrote:
> > +static inline int sframe_add_section(unsigned long sframe_start, unsigned
> > long sframe_end, unsigned long text_start, unsigned long text_end) { retu
On Fri, Jan 24, 2025 at 09:59:28AM -0800, Andrii Nakryiko wrote:
> On Tue, Jan 21, 2025 at 6:32 PM Josh Poimboeuf wrote:
> > +int unwind_user_next(struct unwind_user_state *state)
> > +{
> > + struct unwind_user_frame _frame;
> > + struct unwind_
On Fri, Jan 24, 2025 at 09:59:37AM -0800, Andrii Nakryiko wrote:
> On Tue, Jan 21, 2025 at 6:32 PM Josh Poimboeuf wrote:
> >
> > Add optional support for user space frame pointer unwinding. If
> > supported, the arch needs to enable CONFIG_HAVE_UNWIND_U
On Fri, Jan 24, 2025 at 05:36:38PM +0100, Jens Remus wrote:
> On 22.01.2025 03:31, Josh Poimboeuf wrote:
>
> > diff --git a/include/linux/sframe.h b/include/linux/sframe.h
>
> > @@ -3,11 +3,14 @@
> > #define _LINUX_SFRAME_H
> > #include
> >
On Fri, Jan 24, 2025 at 05:41:29PM +0100, Jens Remus wrote:
> On 22.01.2025 03:31, Josh Poimboeuf wrote:
> > +int unwind_user_next(struct unwind_user_state *state)
> > +{
> > + struct unwind_user_frame _frame;
> > + struct unwind_user_frame *frame = &_frame;
>
On Fri, Jan 24, 2025 at 05:35:37PM +0100, Jens Remus wrote:
> On 22.01.2025 03:31, Josh Poimboeuf wrote:
> > +static inline void unwind_deferred_init(struct unwind_work *work,
> > unwind_callback_t func) {}
> > +static inline int unwind_deferred_request(struct task
On Fri, Jan 24, 2025 at 05:30:36PM +0100, Jens Remus wrote:
> On 22.01.2025 03:31, Josh Poimboeuf wrote:
> > Enable sframe generation in the VDSO library so kernel and user space
> > can unwind through it.
> >
> > Signed-off-by: Josh Poimboeuf
>
> > diff --
On Fri, Jan 24, 2025 at 08:43:35AM -0800, Josh Poimboeuf wrote:
> On Fri, Jan 24, 2025 at 05:00:27PM +0100, Jens Remus wrote:
> > On 22.01.2025 03:31, Josh Poimboeuf wrote:
> > > +#ifdef __x86_64__
> >
> > #if defined(__x86_64__) && defined(CONFIG_AS_S
On Fri, Jan 24, 2025 at 05:08:57PM +0100, Jens Remus wrote:
> On 22.01.2025 03:30, Josh Poimboeuf wrote:
> > -#ifndef BUILD_VDSO
> > - /*
> > -* Emit CFI data in .debug_frame sections, not .eh_frame sections.
> > -* The latter we currently just disc
On Fri, Jan 24, 2025 at 05:00:27PM +0100, Jens Remus wrote:
> On 22.01.2025 03:31, Josh Poimboeuf wrote:
> > diff --git a/arch/x86/include/asm/dwarf2.h b/arch/x86/include/asm/dwarf2.h
> > index b195b3c8677e..1c354f648505 100644
> > --- a/arch/x86/include/asm/dwarf2.h
> >
On Thu, Jan 23, 2025 at 11:17:34PM +0100, Peter Zijlstra wrote:
> On Thu, Jan 23, 2025 at 11:48:07AM -0800, Josh Poimboeuf wrote:
> > > > > cookie = READ_ONCE(current->unwind_info.cookie);
> > > > > do {
> > > > > if (co
On Thu, Jan 23, 2025 at 11:48:10AM -0800, Josh Poimboeuf wrote:
> On Thu, Jan 23, 2025 at 09:40:26AM +0100, Peter Zijlstra wrote:
> > On Wed, Jan 22, 2025 at 02:49:02PM -0800, Josh Poimboeuf wrote:
> > > But also, the nmi_cookie is still needed for the case where the NMI
> >
On Thu, Jan 23, 2025 at 09:40:26AM +0100, Peter Zijlstra wrote:
> On Wed, Jan 22, 2025 at 02:49:02PM -0800, Josh Poimboeuf wrote:
> > On Wed, Jan 22, 2025 at 03:15:05PM +0100, Peter Zijlstra wrote:
> > > On Tue, Jan 21, 2025 at 06:31:22PM -0800, Josh Poimboeuf wrote:
> >
On Thu, Jan 23, 2025 at 09:25:34AM +0100, Peter Zijlstra wrote:
> On Wed, Jan 22, 2025 at 08:05:33PM -0800, Josh Poimboeuf wrote:
>
> > However... would it be a horrible idea for 'next' to unwind 'prev' after
> > the context switch???
>
> The idea
On Thu, Jan 23, 2025 at 09:31:31AM +0100, Peter Zijlstra wrote:
> On Wed, Jan 22, 2025 at 02:36:25PM -0800, Josh Poimboeuf wrote:
> > On Wed, Jan 22, 2025 at 02:57:00PM +0100, Peter Zijlstra wrote:
> > > On Tue, Jan 21, 2025 at 06:31:21PM -0800, Josh Poimboeuf wrote:
> >
On Thu, Jan 23, 2025 at 09:17:18AM +0100, Peter Zijlstra wrote:
> On Wed, Jan 22, 2025 at 02:51:27PM -0800, Josh Poimboeuf wrote:
> > On Wed, Jan 22, 2025 at 03:16:16PM +0100, Peter Zijlstra wrote:
> > The ctx_ctr is always incremented before calling this, so 0 isn't a
> >
On Thu, Jan 23, 2025 at 09:14:03AM +0100, Peter Zijlstra wrote:
> On Wed, Jan 22, 2025 at 12:47:20PM -0800, Josh Poimboeuf wrote:
> > What exactly do you mean by "NMI like"? Is it because a #DB might be
> > basically running in NMI context, if the NMI hit a breakpoint?
>
On Wed, Jan 22, 2025 at 03:13:10PM -0500, Mathieu Desnoyers wrote:
> > +struct unwind_work {
> > + struct callback_head work;
> > + unwind_callback_t func;
> > + int pending;
> > +};
>
> This is a lot of information to keep around per inst
On Wed, Jan 22, 2025 at 03:29:10PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 21, 2025 at 06:31:26PM -0800, Josh Poimboeuf wrote:
> > If the task doesn't have any memory, there's no stack to unwind.
> >
> > Signed-off-by: Josh Poimboeuf
> > ---
> >
On Wed, Jan 22, 2025 at 03:24:18PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 21, 2025 at 06:31:22PM -0800, Josh Poimboeuf wrote:
> > +static int unwind_deferred_request_nmi(struct unwind_work *work, u64
> > *cookie)
> > +{
> > + struct unwind_task_info *info = &current
On Wed, Jan 22, 2025 at 03:16:16PM +0100, Peter Zijlstra wrote:
> On Wed, Jan 22, 2025 at 02:37:30PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 21, 2025 at 06:31:20PM -0800, Josh Poimboeuf wrote:
> > > +/*
> > > + * The context cookie is a unique identifier which allow
On Wed, Jan 22, 2025 at 03:15:05PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 21, 2025 at 06:31:22PM -0800, Josh Poimboeuf wrote:
> Oh gawd. Can we please do something simple like:
>
> guard(irqsave)();
> cpu = raw_smp_processor_id();
> ctr = __this_cpu_r
On Wed, Jan 22, 2025 at 02:57:00PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 21, 2025 at 06:31:21PM -0800, Josh Poimboeuf wrote:
> > Cache the results of the unwind to ensure the unwind is only performed
> > once, even when called by multiple tracers.
> >
> > Si
On Wed, Jan 22, 2025 at 01:03:42PM -0800, Josh Poimboeuf wrote:
> The self-IPI is only needed when the NMI happened in user space, right?
> Would it make sense to have an optimized version of that?
Actually, maybe not, that could be tricky if the NMI hits in the kernel
after task wor
On Wed, Jan 22, 2025 at 02:44:20PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 21, 2025 at 06:31:20PM -0800, Josh Poimboeuf wrote:
>
> > +/* entry-from-user counter */
> > +static DEFINE_PER_CPU(u64, unwind_ctx_ctr);
>
> AFAICT from the below, this thing does *not* c
On Wed, Jan 22, 2025 at 02:37:30PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 21, 2025 at 06:31:20PM -0800, Josh Poimboeuf wrote:
> > +/*
> > + * The context cookie is a unique identifier which allows post-processing
> > to
> > + * correlate kernel trace(s) with user
On Wed, Jan 22, 2025 at 01:51:54PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 21, 2025 at 06:31:06PM -0800, Josh Poimboeuf wrote:
> > get_segment_base() will be used by the unwind_user code, so make it
> > global and rename it so it doesn't conflict with a KVM function
On Wed, Jan 22, 2025 at 01:42:28PM +0100, Peter Zijlstra wrote:
> So I'm a little confused, isn't something like this sufficient?
>
> If we hit before schedule(), all just works as expected, if we hit after
> schedule(), the task will already have the TIF flag set, and we'll hit
> the return to us
On Wed, Jan 22, 2025 at 01:28:21PM +0100, Peter Zijlstra wrote:
> On Tue, Jan 21, 2025 at 06:30:53PM -0800, Josh Poimboeuf wrote:
> > It's possible for irq_work_queue() to fail if the work has already been
> > claimed. That can happen if a TWA_NMI_CURRENT task work is re
On Tue, Jan 21, 2025 at 06:30:52PM -0800, Josh Poimboeuf wrote:
> For testing with user space, here are the latest binutils fixes:
>
> 1785837a2570 ("ld: fix PR/32297")
> 938fb512184d ("ld: fix wrong SFrame info for lazy IBT PLT")
> 47c88752f9ad ("
rf_ioctl
__x64_sys_ioctl
do_syscall_64
entry_SYSCALL_64
__GI___ioctl
Signed-off-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
---
tools/perf/Documentation/perf-script.txt | 5 ++
tools/perf/builtin-script.c | 5 +-
tools/perf/util/cal
failed, error -22
switching off deferred callchain support
Signed-off-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
---
tools/perf/util/evsel.c | 24
tools/perf/util/evsel.h | 1 +
2 files changed, 25 insertions(+)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util
.
0 24344920643 0x5fe0 [0x40]: PERF_RECORD_CALLCHAIN_DEFERRED(IP, 0x2):
801/801: 0
... FP chain: nr:2
. 0: fe00
. 1: 7f45253fd34b
: unhandled!
Signed-off-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
---
tools/lib/perf/include/perf/event.h | 7 +++
/libc.so.6)
Signed-off-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
---
tools/perf/builtin-script.c | 89 +
1 file changed, 89 insertions(+)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 9e47905f75a6..2b9085fa18bd 100644
--- a/tools
Use the new unwind_deferred_request() interface (if available) to defer
unwinds to task context. This allows the use of .sframe (if available)
and also prevents duplicate userspace unwinds.
Suggested-by: Steven Rostedt
Suggested-by: Peter Zijlstra
Signed-off-by: Josh Poimboeuf
---
arch
If the task doesn't have any memory, there's no stack to unwind.
Signed-off-by: Josh Poimboeuf
---
kernel/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 99f0f28feeb5..a886bb83f4d0 100644
--- a/ker
Simplify the get_perf_callchain() user logic a bit. task_pt_regs()
should never be NULL.
Acked-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
---
kernel/events/callchain.c | 20 +---
1 file changed, 9 insertions(+), 11 deletions(-)
diff --git a/kernel/events/callchain.c b
get_perf_callchain() doesn't support cross-task unwinding, so it doesn't
make much sense to have 'crosstask' as an argument.
Acked-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
---
include/linux/perf_event.h | 2 +-
kernel/bpf/stackmap.c | 12
kerne
The 'init_nr' argument has double duty: it's used to initialize both the
number of contexts and the number of stack entries. That's confusing
and the callers always pass zero anyway. Hard code the zero.
Acked-by: Namhyung Kim
Signed-off-by: Josh Poimboeuf
---
include/lin
Make unwind_deferred_request() NMI-safe so tracers in NMI context can
call it to get the cookie immediately rather than have to do the fragile
"schedule irq work and then call unwind_deferred_request()" dance.
Signed-off-by: Josh Poimboeuf
---
include/linux/entry-common.h
Cache the results of the unwind to ensure the unwind is only performed
once, even when called by multiple tracers.
Signed-off-by: Josh Poimboeuf
---
include/linux/unwind_deferred_types.h | 8 +++-
kernel/unwind/deferred.c | 26 --
2 files changed, 27
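
The caching idea in the commit message above ("only performed once, even when called by multiple tracers") can be sketched in user space roughly as follows. This is not the code from kernel/unwind/deferred.c: struct trace_cache, do_unwind(), get_trace() and reset_cache() are hypothetical names, and the "unwind" just fills in made-up addresses. It is a minimal sketch, assuming the cache is invalidated at some well-defined point such as the return to user space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 8

/* Hypothetical per-task cache; the field names are illustrative only. */
struct trace_cache {
	unsigned long entries[MAX_ENTRIES];
	size_t nr;
	bool valid;
};

/* Counts how many times the expensive unwind actually ran. */
static int unwind_count;

/* Stand-in for the real unwinder: fills in a few fake return addresses. */
static size_t do_unwind(unsigned long *entries, size_t max)
{
	size_t n = max < 3 ? max : 3;

	unwind_count++;
	for (size_t i = 0; i < n; i++)
		entries[i] = 0x1000 + i * 0x10;
	return n;
}

/* Every caller gets the same trace; only the first call does the work. */
static const struct trace_cache *get_trace(struct trace_cache *cache)
{
	if (!cache->valid) {
		cache->nr = do_unwind(cache->entries, MAX_ENTRIES);
		cache->valid = true;
	}
	return cache;
}

/* Assumed invalidation point, e.g. once the task returns to user space. */
static void reset_cache(struct trace_cache *cache)
{
	cache->valid = false;
	cache->nr = 0;
}
```

With this shape, two tracers calling get_trace() in the same context share one unwind, and the next context starts fresh after reset_cache().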
, whether called multiple times by the same
caller or by different callers.
- Create a "context cookie" which allows trace post-processing to
correlate kernel unwinds/traces with the user unwind.
Signed-off-by: Josh Poimboeuf
---
include/linux/entry-common.h | 2 +
inc
Add a debug feature to validate all .sframe sections when first loading
the file rather than on demand.
Signed-off-by: Josh Poimboeuf
---
arch/Kconfig | 19 ++
kernel/unwind/sframe.c | 81 ++
2 files changed, 100 insertions(+)
diff
Objtool warns about calling pr_debug() from uaccess-enabled regions, and
rightfully so. Add a dbg_sec_uaccess() macro which temporarily disables
uaccess before doing the dynamic printk, and use that to add debug
messages throughout the uaccess-enabled regions.
Signed-off-by: Josh Poimboeuf
To avoid continued attempted use of a bad .sframe section, remove it
on demand when the first sign of corruption is detected.
Signed-off-by: Josh Poimboeuf
---
kernel/unwind/sframe.c | 4
1 file changed, 4 insertions(+)
diff --git a/kernel/unwind/sframe.c b/kernel/unwind/sframe.c
index
When debugging sframe issues, the error messages aren't all that helpful
without knowing what file a corresponding .sframe section belongs to.
Prefix debug output strings with the file name.
Signed-off-by: Josh Poimboeuf
---
include/linux/sframe.h | 4 +++-
kernel/unwind/sfr
Now that the sframe infrastructure is fully in place, make it work by
hooking it up to the unwind_user interface.
Signed-off-by: Josh Poimboeuf
---
arch/Kconfig | 1 +
include/linux/unwind_user_types.h | 1 +
kernel/unwind/user.c | 22
The x86 sframe 2.0 implementation works fairly well, starting with
binutils 2.41 (though some bugs are getting fixed in later versions).
Enable it.
Signed-off-by: Josh Poimboeuf
---
arch/x86/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index
The kernel doesn't have direct visibility to the ELF contents of shared
libraries. Add some prctl() interfaces which allow glibc to tell the
kernel where to find .sframe sections.
Signed-off-by: Josh Poimboeuf
---
include/uapi/linux/prctl.h | 5 -
kernel/sys.c
When loading an ELF executable, automatically detect an .sframe section
and associate it with the mm_struct.
Signed-off-by: Josh Poimboeuf
---
fs/binfmt_elf.c | 49 +---
include/uapi/linux/elf.h | 1 +
2 files changed, 47 insertions(+), 3 deletions
-enabled
regions would break noinstr validation, so there aren't any debug
messages yet. That will be added in a subsequent commit.
Signed-off-by: Josh Poimboeuf
---
include/linux/sframe.h | 5 +
kernel/unwind/sframe.c | 295 ++-
kernel/u
Associate an sframe section with its mm by adding it to a per-mm maple
tree which is indexed by the corresponding text address range. A single
sframe section can be associated with multiple text ranges.
Signed-off-by: Josh Poimboeuf
---
arch/x86/include/asm/mmu.h | 2 +-
include/linux
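
To make the "indexed by the corresponding text address range" lookup concrete, here is a rough user-space sketch. It deliberately swaps the per-mm maple tree for a sorted array with binary search, so it shows the lookup semantics only, not the kernel data structure. struct text_range, section_id and find_range() are hypothetical names, and the ranges are assumed to be sorted and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>

/* One text range and the (hypothetical) id of the sframe section covering it. */
struct text_range {
	unsigned long text_start;	/* inclusive */
	unsigned long text_end;		/* exclusive */
	int section_id;
};

/*
 * Binary search over ranges sorted by text_start and assumed not to overlap.
 * Returns the range containing 'ip', or NULL if no range covers it.
 */
static const struct text_range *find_range(const struct text_range *ranges,
					   size_t n, unsigned long ip)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ip < ranges[mid].text_start)
			hi = mid;
		else if (ip >= ranges[mid].text_end)
			lo = mid + 1;
		else
			return &ranges[mid];
	}
	return NULL;
}
```

Two entries may carry the same section_id, which mirrors the statement that a single sframe section can be associated with multiple text ranges.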
Use ARCH_INIT_USER_COMPAT_FP_FRAME to describe how frame pointers are
unwound on x86, and implement the hooks needed to add the segment base
addresses. Enable HAVE_UNWIND_USER_COMPAT_FP if the system has compat
mode compiled in.
Signed-off-by: Josh Poimboeuf
---
arch/x86/Kconfig
that.
Signed-off-by: Josh Poimboeuf
---
arch/Kconfig | 3 +
include/linux/sframe.h | 36 +++
kernel/unwind/Makefile | 3 +-
kernel/unwind/sframe.c | 136 +
kernel/unwind/sframe.h | 71 +
5 files changed, 248 inser