On Mon, Apr 29, 2019 at 8:34 AM Bruce Ashfield <[email protected]> wrote:
> > > On Sun, Apr 28, 2019 at 6:50 AM He Zhe <[email protected]> wrote: > >> >> >> On 4/26/19 11:14 PM, Bruce Ashfield wrote: >> > >> > >> > On Wed, Apr 24, 2019 at 9:38 PM He Zhe <[email protected] <mailto: >> [email protected]>> wrote: >> > >> > >> > >> > On 4/24/19 8:34 PM, Bruce Ashfield wrote: >> > > >> > > >> > > On Wed, Apr 24, 2019 at 3:47 AM He Zhe <[email protected] >> <mailto:[email protected]> <mailto:[email protected] <mailto: >> [email protected]>>> wrote: >> > > >> > > This is for standard/base and all sub-level branches. For >> explanation, see the >> > > bottom of the commit log I append. >> > > >> > > >> > > Which kernel versions ? I didn't notice a version it the shortlog >> or temporary section, but I may have overlooked it. >> > >> > >From 4.19 to 5.0 >> > >> > >> > Thanks, this is now merged. >> >> This is missing on v5.0/standard/intel-x86. >> >> > Not in my tree, but I'll double check as I'm merging more changes later > today. > I confirmed that this really is in the branches that I pushed, are you really not seeing commit b470fb994ebd5b9ae8e520c85029449bf847289c in that branch ? https://git.yoctoproject.org/cgit/cgit.cgi/linux-yocto/commit/?h=v5.0/standard/intel-x86&id=b470fb994ebd5b9ae8e520c85029449bf847289c Bruce > > Bruce > > > >> Zhe >> >> > >> > Bruce >> > >> > >> > >> > Zhe >> > >> > > >> > > Bruce >> > > >> > > >> > > >> > > >> > > Zhe >> > > >> > > On 4/24/19 3:42 PM, [email protected] <mailto: >> [email protected]> <mailto:[email protected] <mailto: >> [email protected]>> wrote: >> > > > From: "Steven Rostedt (VMware)" <[email protected] >> <mailto:[email protected]> <mailto:[email protected] <mailto: >> [email protected]>>> >> > > > >> > > > He Zhe reported a crash by enabling trace events and >> selecting >> > > > "userstacktrace" which will read the stack of userspace for >> every trace >> > > > event recorded. Zhe narrowed it down to: >> > > > >> > > > c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints >> and unify their usage") >> > > > >> > > > With the problem config, I was able to also reproduce the >> error. I >> > > > narrowed it down to just having to do the following: >> > > > >> > > > # cd /sys/kernel/tracing >> > > > # echo 1 > options/userstacktrace >> > > > # echo 1 > events/preemptirq/irq_disable/enable >> > > > >> > > > And sure enough, I triggered a crash. Well, it was systemd >> crashing >> > > > with a bad memory access?? >> > > > >> > > > systemd-journal[537]: segfault at ed8cb8 ip >> 00007f7fffc9fef5 sp 00007ffc4062cb10 error 7 >> > > > >> > > > And it would crash similarly each time I tried it, but >> always at a >> > > > different place. After spending the day on this, I finally >> figured it >> > > > out. The bug is happening in entry_64.S right after >> error_entry. >> > > > There's two TRACE_IRQS_OFF in that code path, which if I >> comment out, >> > > > the bug goes away. Then it dawned on me that the crash >> always happens >> > > > when systemd does a normal page fault. We had this bug >> before, and it >> > > > was with the exception trace points. >> > > > >> > > > The issue is that a tracepoint can fault (reading vmalloc >> or whatever). >> > > > And doing a userspace stack trace most definitely will >> fault. But if we >> > > > are coming from a legitimate page fault, the address of >> that fault (in >> > > > the CR2 register) will be lost if we fault before we get to >> the page >> > > > fault handler. That's exactly what is happening. >> > > > >> > > > To solve this, a TRACE_IRQS_OFF_CR2 (and ON for >> consistency) was added >> > > > that saves the CR2 register. A new >> trace_hardirqs_off_thunk_cr2 is >> > > > created that stores the cr2 register, calls the >> > > > trace_hardirqs_off_caller, then on return restores the cr2 >> register if >> > > > it changed, before returning. >> > > > >> > > > On my tests this fixes the issue. I just want to know if >> this is a >> > > > legitimate fix or if someone can come up with a better fix? >> > > > >> > > > Note: this also saves the exception context just like the >> > > > do_page_fault() function does. >> > > > >> > > > Note2: This only gets enabled when lockdep or irq tracing >> is enabled, >> > > > which is not recommended for production environments. >> > > > >> > > > Link: >> http://lkml.kernel.org/r/[email protected] >> > > > >> > > > Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq >> tracepoints and unify their usage") >> > > > Signed-off-by: Steven Rostedt (VMware) <[email protected] >> <mailto:[email protected]> <mailto:[email protected] <mailto: >> [email protected]>>> >> > > > >> > > > Link: >> https://lore.kernel.org/lkml/[email protected]/ >> > > > >> > > > This might not be the final solution. But the upstream >> thread has stopped over >> > > > a month and there is unlikely a final solution in the near >> future. >> > > > >> > > > Since the diff looks quite clear and does not affect other >> functions. It should >> > > > be worth adding this initial patch from the maintainer. >> > > > >> > > > Signed-off-by: He Zhe <[email protected] <mailto: >> [email protected]> <mailto:[email protected] <mailto: >> [email protected]>>> >> > > > --- >> > > > arch/x86/entry/common.c | 26 >> ++++++++++++++++++++++++++ >> > > > arch/x86/entry/entry_64.S | 4 ++-- >> > > > arch/x86/entry/thunk_64.S | 2 ++ >> > > > arch/x86/include/asm/irqflags.h | 4 ++++ >> > > > 4 files changed, 34 insertions(+), 2 deletions(-) >> > > > >> > > > diff --git a/arch/x86/entry/common.c >> b/arch/x86/entry/common.c >> > > > index 7bc105f..7edffec 100644 >> > > > --- a/arch/x86/entry/common.c >> > > > +++ b/arch/x86/entry/common.c >> > > > @@ -292,6 +292,32 @@ __visible void do_syscall_64(unsigned >> long nr, struct pt_regs *regs) >> > > > >> > > > syscall_return_slowpath(regs); >> > > > } >> > > > + >> > > > +extern void trace_hardirqs_on_caller(unsigned long >> caller_addr); >> > > > +__visible void trace_hardirqs_on_caller_cr2(unsigned long >> caller_addr) >> > > > +{ >> > > > + unsigned long address = read_cr2(); /* Get the >> faulting address */ >> > > > + enum ctx_state prev_state; >> > > > + >> > > > + prev_state = exception_enter(); >> > > > + trace_hardirqs_on_caller(caller_addr); >> > > > + if (address != read_cr2()) >> > > > + write_cr2(address); >> > > > + exception_exit(prev_state); >> > > > +} >> > > > + >> > > > +extern void trace_hardirqs_off_caller(unsigned long >> caller_addr); >> > > > +__visible void trace_hardirqs_off_caller_cr2(unsigned long >> caller_addr) >> > > > +{ >> > > > + unsigned long address = read_cr2(); /* Get the >> faulting address */ >> > > > + enum ctx_state prev_state; >> > > > + >> > > > + prev_state = exception_enter(); >> > > > + trace_hardirqs_off_caller(caller_addr); >> > > > + if (address != read_cr2()) >> > > > + write_cr2(address); >> > > > + exception_exit(prev_state); >> > > > +} >> > > > #endif >> > > > >> > > > #if defined(CONFIG_X86_32) || >> defined(CONFIG_IA32_EMULATION) >> > > > diff --git a/arch/x86/entry/entry_64.S >> b/arch/x86/entry/entry_64.S >> > > > index 1f0efdb..73ddf24 100644 >> > > > --- a/arch/x86/entry/entry_64.S >> > > > +++ b/arch/x86/entry/entry_64.S >> > > > @@ -1248,12 +1248,12 @@ ENTRY(error_entry) >> > > > * we fix gsbase, and we should do it before >> enter_from_user_mode >> > > > * (which can take locks). >> > > > */ >> > > > - TRACE_IRQS_OFF >> > > > + TRACE_IRQS_OFF_CR2 >> > > > CALL_enter_from_user_mode >> > > > ret >> > > > >> > > > .Lerror_entry_done: >> > > > - TRACE_IRQS_OFF >> > > > + TRACE_IRQS_OFF_CR2 >> > > > ret >> > > > >> > > > /* >> > > > diff --git a/arch/x86/entry/thunk_64.S >> b/arch/x86/entry/thunk_64.S >> > > > index be36bf4..1300b53 100644 >> > > > --- a/arch/x86/entry/thunk_64.S >> > > > +++ b/arch/x86/entry/thunk_64.S >> > > > @@ -41,6 +41,8 @@ >> > > > #ifdef CONFIG_TRACE_IRQFLAGS >> > > > THUNK >> trace_hardirqs_on_thunk,trace_hardirqs_on_caller,1 >> > > > THUNK >> trace_hardirqs_off_thunk,trace_hardirqs_off_caller,1 >> > > > + THUNK >> trace_hardirqs_on_thunk_cr2,trace_hardirqs_on_caller_cr2,1 >> > > > + THUNK >> trace_hardirqs_off_thunk_cr2,trace_hardirqs_off_caller_cr2,1 >> > > > #endif >> > > > >> > > > #ifdef CONFIG_DEBUG_LOCK_ALLOC >> > > > diff --git a/arch/x86/include/asm/irqflags.h >> b/arch/x86/include/asm/irqflags.h >> > > > index 058e40f..dd51174 100644 >> > > > --- a/arch/x86/include/asm/irqflags.h >> > > > +++ b/arch/x86/include/asm/irqflags.h >> > > > @@ -172,9 +172,13 @@ static inline int >> arch_irqs_disabled(void) >> > > > #ifdef CONFIG_TRACE_IRQFLAGS >> > > > # define TRACE_IRQS_ON call >> trace_hardirqs_on_thunk; >> > > > # define TRACE_IRQS_OFF call trace_hardirqs_off_thunk; >> > > > +# define TRACE_IRQS_ON_CR2 call >> trace_hardirqs_on_thunk_cr2; >> > > > +# define TRACE_IRQS_OFF_CR2 call >> trace_hardirqs_off_thunk_cr2; >> > > > #else >> > > > # define TRACE_IRQS_ON >> > > > # define TRACE_IRQS_OFF >> > > > +# define TRACE_IRQS_ON_CR2 >> > > > +# define TRACE_IRQS_OFF_CR2 >> > > > #endif >> > > > #ifdef CONFIG_DEBUG_LOCK_ALLOC >> > > > # ifdef CONFIG_X86_64 >> > > >> > > >> > > >> > > -- >> > > - Thou shalt not follow the NULL pointer, for chaos and madness >> await thee at its end >> > > - "Use the force Harry" - Gandalf, Star Trek II >> > > >> > >> > >> > >> > -- >> > - Thou shalt not follow the NULL pointer, for chaos and madness await >> thee at its end >> > - "Use the force Harry" - Gandalf, Star Trek II >> > >> >> > > -- > - Thou shalt not follow the NULL pointer, for chaos and madness await thee > at its end > - "Use the force Harry" - Gandalf, Star Trek II > > -- - Thou shalt not follow the NULL pointer, for chaos and madness await thee at its end - "Use the force Harry" - Gandalf, Star Trek II
-- _______________________________________________ linux-yocto mailing list [email protected] https://lists.yoctoproject.org/listinfo/linux-yocto
