On Mon, Oct 20, 2008 at 05:08:34PM +0200, Alexander van Heukelum wrote:
> Mostly use the x86_64 version of oops_begin() and oops_end() on
> i386 too. Changes to the original x86_64 version:
> 
Hey, did a quick review of this this morning.  Didn't find anything major, but I
did find a few little nits.  Comments inline.

>  - move add_taint(TAINT_DIE) into oops_end()
>  - add a conditional crash_kexec() into oops_end()
> 
> Signed-off-by: Alexander van Heukelum <[EMAIL PROTECTED]>
> ---
>  arch/x86/kernel/dumpstack_32.c |   36 +++++++++++++++++++++---------------
>  arch/x86/kernel/dumpstack_64.c |    4 +++-
>  2 files changed, 24 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/x86/kernel/dumpstack_32.c b/arch/x86/kernel/dumpstack_32.c
> index b361475..e45952b 100644
> --- a/arch/x86/kernel/dumpstack_32.c
> +++ b/arch/x86/kernel/dumpstack_32.c
> @@ -289,35 +289,41 @@ static unsigned int die_nest_count;
>  
>  unsigned __kprobes long oops_begin(void)
>  {
> +     int cpu;
>       unsigned long flags;
>  
>       oops_enter();
>  
> -     if (die_owner != raw_smp_processor_id()) {
> -             console_verbose();
> -             raw_local_irq_save(flags);
> -             __raw_spin_lock(&die_lock);
> -             die_owner = smp_processor_id();
> -             die_nest_count = 0;
> -             bust_spinlocks(1);
> -     } else {
> -             raw_local_irq_save(flags);
> +     /* racy, but better than risking deadlock. */
> +     raw_local_irq_save(flags);
> +     cpu = smp_processor_id();
> +     if (!__raw_spin_trylock(&die_lock)) {
> +             if (cpu == die_owner)
> +                     /* nested oops. should stop eventually */;
> +             else
> +                     __raw_spin_lock(&die_lock);
>       }
>       die_nest_count++;
> +     die_owner = cpu;
> +     console_verbose();
> +     bust_spinlocks(1);
>       return flags;
>  }
>  
>  void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
>  {
> -     bust_spinlocks(0);
>       die_owner = -1;
> +     bust_spinlocks(0);
> +     die_nest_count--;
>       add_taint(TAINT_DIE);
> -     __raw_spin_unlock(&die_lock);
> +     if (!die_nest_count)
> +             /* Nest count reaches zero, release the lock. */
> +             __raw_spin_unlock(&die_lock);
>       raw_local_irq_restore(flags);
> -
> -     if (!regs)
> +     if (!regs) {
> +             oops_exit();
>               return;
> -
> +     }
>       if (kexec_should_crash(current))
>               crash_kexec(regs);
Hmm.  I think this reintroduces the same case that I just fixed in my initial
post.  If oops_end() gets used with this in it, it may be possible to call
crash_kexec() with console_sem held, and if that happens we deadlock.  You
should be able to move this clause up above the bust_spinlocks(0) call without
any issue, and that would take care of it.
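Roughly what I have in mind (untested, just moving the lines around on top of
your patch; everything below comes from your hunks, only the order changes):

	void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
	{
		/*
		 * Hand off to kdump before we start poking at the console
		 * state, so we never end up in crash_kexec() with
		 * console_sem held.
		 */
		if (regs && kexec_should_crash(current))
			crash_kexec(regs);

		die_owner = -1;
		bust_spinlocks(0);
		die_nest_count--;
		add_taint(TAINT_DIE);
		if (!die_nest_count)
			/* Nest count reaches zero, release the lock. */
			__raw_spin_unlock(&die_lock);
		raw_local_irq_restore(flags);
		if (!regs) {
			oops_exit();
			return;
		}
		/* ... rest as in your patch ... */
	}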

>       if (in_interrupt())
> @@ -405,6 +411,7 @@ die_nmi(char *str, struct pt_regs *regs, int do_panic)
>               panic("Non maskable interrupt");
>       console_silent();
>       spin_unlock(&nmi_print_lock);
> +     bust_spinlocks(0);
>  
>       /*
>        * If we are in kernel we are probably nested up pretty bad
> @@ -415,7 +422,6 @@ die_nmi(char *str, struct pt_regs *regs, int do_panic)
>               crash_kexec(regs);
>       }
>  
> -     bust_spinlocks(0);
>       do_exit(SIGSEGV);
>  }
>  
This undoes my previous patch.  I realize your second patch fixes it up
properly, so the ordering is correct once oops_begin()/oops_end() are used
here, but if you could rediff so this hunk isn't needed, I'd appreciate it.
If these patches get committed separately, that avoids leaving the tree in a
state where the deadlock can reoccur, even if it is only for one commit.
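To spell out what I mean by "the ordering": die_nmi() should keep the order my
fix established, crash_kexec() before bust_spinlocks(0), e.g. (trimmed, with
the in-kernel check around crash_kexec() left exactly as it is in the tree):

	console_silent();
	spin_unlock(&nmi_print_lock);

	/*
	 * If we are in kernel we are probably nested up pretty bad
	 * ...
	 */
	if (/* in-kernel check, unchanged */) {
		crash_kexec(regs);
	}

	/* only unbust the console once kdump has had its chance */
	bust_spinlocks(0);
	do_exit(SIGSEGV);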

> diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
> index 96a5db7..cd7b46b 100644
> --- a/arch/x86/kernel/dumpstack_64.c
> +++ b/arch/x86/kernel/dumpstack_64.c
> @@ -461,6 +461,7 @@ void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
>       die_owner = -1;
>       bust_spinlocks(0);
>       die_nest_count--;
> +     add_taint(TAINT_DIE);
>       if (!die_nest_count)
>               /* Nest count reaches zero, release the lock. */
>               __raw_spin_unlock(&die_lock);
> @@ -469,6 +470,8 @@ void __kprobes oops_end(unsigned long flags, struct pt_regs *regs, int signr)
>               oops_exit();
>               return;
>       }
> +     if (kexec_should_crash(current))
> +             crash_kexec(regs);
If you're going to add the crash_kexec() here (which, looking at the call
sites, makes sense to me), you should probably remove it from the critical
sections of die() and die_nmi(), just to avoid the redundancy.  The same issue
as in the 32-bit version above also applies: this needs to happen before you
call bust_spinlocks(0).
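In other words, the new calls would move to the very top of the 64-bit
oops_end(), ahead of bust_spinlocks(0), the same shuffle as in the 32-bit
sketch above (again untested, just rearranging your hunks):

	if (regs && kexec_should_crash(current))
		crash_kexec(regs);

	die_owner = -1;
	bust_spinlocks(0);
	die_nest_count--;
	add_taint(TAINT_DIE);
	/* ... */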

Fix those issues, and the rest looks good to me.

Regards
Neil

-- 
/****************************************************
 * Neil Horman <[EMAIL PROTECTED]>
 * Software Engineer, Red Hat
 ****************************************************/
