Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-20 Thread Neil Horman
On Tue, Feb 12, 2008 at 04:08:16PM -0500, Neil Horman wrote: > > > > Neil, is it possible to do some serial console debugging to find out > > where exactly we are hanging? Beats me, what's that operation which can > > not be executed while being in NMI handler and makes system to hang. I am > >

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-20 Thread Neil Horman
On Tue, Feb 12, 2008 at 04:08:16PM -0500, Neil Horman wrote: Neil, is it possible to do some serial console debugging to find out where exactly we are hanging? Beats me, what's that operation which can not be executed while being in NMI handler and makes system to hang. I am also

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-15 Thread Eric W. Biederman
Neil Horman <[EMAIL PROTECTED]> writes: >> >> Neil, is it possible to do some serial console debugging to find out >> where exactly we are hanging? Beats me, what's that operation which can >> not be executed while being in NMI handler and makes system to hang. I am >> also curious to know if it

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-15 Thread Eric W. Biederman
Neil Horman [EMAIL PROTECTED] writes: Neil, is it possible to do some serial console debugging to find out where exactly we are hanging? Beats me, what's that operation which can not be executed while being in NMI handler and makes system to hang. I am also curious to know if it is nested

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-12 Thread Neil Horman
> > Neil, is it possible to do some serial console debugging to find out > where exactly we are hanging? Beats me, what's that operation which can > not be executed while being in NMI handler and makes system to hang. I am > also curious to know if it is nested NMI case. > > Thanks > Vivek >

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-12 Thread Neil Horman
Neil, is it possible to do some serial console debugging to find out where exactly we are hanging? Beats me, what's that operation which can not be executed while being in NMI handler and makes system to hang. I am also curious to know if it is nested NMI case. Thanks Vivek Hey-

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-08 Thread Neil Horman
On Fri, Feb 08, 2008 at 11:45:44AM -0500, Vivek Goyal wrote: > On Fri, Feb 08, 2008 at 11:14:22AM -0500, Neil Horman wrote: > > On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: > > > > > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > > > > > Ingo noted a few posts down the nmi_exit

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-08 Thread Andi Kleen
Ingo Molnar <[EMAIL PROTECTED]> writes: > > try a dummy iret, something like: > > asm volatile ("pushf; push $1f; iret; 1: \n"); > > to get the CPU out of its 'nested NMI' state. (totally untested) Just if you do this while running on the NMI stack (and I think you do if you insert it at the

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-08 Thread Vivek Goyal
On Fri, Feb 08, 2008 at 11:14:22AM -0500, Neil Horman wrote: > On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: > > > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > > > Ingo noted a few posts down the nmi_exit doesn't actually write to the > > > APIC EOI register, so yeah, I

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-08 Thread Neil Horman
On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > Ingo noted a few posts down the nmi_exit doesn't actually write to the > > APIC EOI register, so yeah, I agree, it's bogus (and I apologize, I > > should have checked that more

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-08 Thread Neil Horman
On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: * Neil Horman [EMAIL PROTECTED] wrote: Ingo noted a few posts down the nmi_exit doesn't actually write to the APIC EOI register, so yeah, I agree, it's bogus (and I apologize, I should have checked that more carefully).

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-08 Thread Vivek Goyal
On Fri, Feb 08, 2008 at 11:14:22AM -0500, Neil Horman wrote: On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: * Neil Horman [EMAIL PROTECTED] wrote: Ingo noted a few posts down the nmi_exit doesn't actually write to the APIC EOI register, so yeah, I agree, it's bogus

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-07 Thread Neil Horman
On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > Ingo noted a few posts down the nmi_exit doesn't actually write to the > > APIC EOI register, so yeah, I agree, it's bogus (and I apologize, I > > should have checked that more

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-07 Thread Ingo Molnar
* Neil Horman <[EMAIL PROTECTED]> wrote: > Ingo noted a few posts down the nmi_exit doesn't actually write to the > APIC EOI register, so yeah, I agree, it's bogus (and I apologize, I > should have checked that more carefully). Nevertheless, this patch > consistently allowed a hanging

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-07 Thread Neil Horman
On Wed, Feb 06, 2008 at 05:31:11PM -0700, Eric W. Biederman wrote: > Ingo Molnar <[EMAIL PROTECTED]> writes: > > > * H. Peter Anvin <[EMAIL PROTECTED]> wrote: > > > >>> I am wondering if interrupts are disabled on crashing cpu or if > >>> crashing cpu is inside die_nmi(), how would it

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-07 Thread Neil Horman
On Wed, Feb 06, 2008 at 05:31:11PM -0700, Eric W. Biederman wrote: Ingo Molnar [EMAIL PROTECTED] writes: * H. Peter Anvin [EMAIL PROTECTED] wrote: I am wondering if interrupts are disabled on crashing cpu or if crashing cpu is inside die_nmi(), how would it stop/prevent delivery of

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-07 Thread Ingo Molnar
* Neil Horman [EMAIL PROTECTED] wrote: Ingo noted a few posts down the nmi_exit doesn't actually write to the APIC EOI register, so yeah, I agree, it's bogus (and I apologize, I should have checked that more carefully). Nevertheless, this patch consistently allowed a hanging machine to

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-07 Thread Neil Horman
On Thu, Feb 07, 2008 at 01:24:04PM +0100, Ingo Molnar wrote: * Neil Horman [EMAIL PROTECTED] wrote: Ingo noted a few posts down the nmi_exit doesn't actually write to the APIC EOI register, so yeah, I agree, it's bogus (and I apologize, I should have checked that more carefully).

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Eric W. Biederman
Ingo Molnar <[EMAIL PROTECTED]> writes: > * Eric W. Biederman <[EMAIL PROTECTED]> wrote: > >> Looking at the patch the local_irq_enable() is totally bogus. As soon >> as we hit machine_crash_shutdown the first thing we do is disable >> irqs. > > yeah. > >> I'm wondering if someone was using

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* Eric W. Biederman <[EMAIL PROTECTED]> wrote: > Looking at the patch the local_irq_enable() is totally bogus. As soon > as we hit machine_crash_shutdown the first thing we do is disable > irqs. yeah. > I'm wondering if someone was using the switch cpus on crash patch that > was floating

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Eric W. Biederman
Ingo Molnar <[EMAIL PROTECTED]> writes: > * H. Peter Anvin <[EMAIL PROTECTED]> wrote: > >>> I am wondering if interrupts are disabled on crashing cpu or if >>> crashing cpu is inside die_nmi(), how would it stop/prevent delivery >>> of NMI IPI to other cpus. >> >> I don't see how it would. > >

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Thu, Feb 07, 2008 at 12:36:57AM +0100, Ingo Molnar wrote: > > * H. Peter Anvin <[EMAIL PROTECTED]> wrote: > > >> I am wondering if interrupts are disabled on crashing cpu or if > >> crashing cpu is inside die_nmi(), how would it stop/prevent delivery > >> of NMI IPI to other cpus. > > > > I

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* H. Peter Anvin <[EMAIL PROTECTED]> wrote: >> I am wondering if interrupts are disabled on crashing cpu or if >> crashing cpu is inside die_nmi(), how would it stop/prevent delivery >> of NMI IPI to other cpus. > > I don't see how it would. cross-CPU IPIs are a bit fragile on some PC

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread H. Peter Anvin
Vivek Goyal wrote: I am wondering if interrupts are disabled on crashing cpu or if crashing cpu is inside die_nmi(), how would it stop/prevent delivery of NMI IPI to other cpus. I don't see how it would. -hpa -- To unsubscribe from this list: send the line "unsubscribe linux-kernel"

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* Vivek Goyal <[EMAIL PROTECTED]> wrote: > On Wed, Feb 06, 2008 at 11:00:01PM +0100, Ingo Molnar wrote: > > > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > > > if (!user_mode_vm(regs)) { > > > + nmi_exit(); > > > + local_irq_enable(); > > >

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Wed, Feb 06, 2008 at 11:00:01PM +0100, Ingo Molnar wrote: > > * Neil Horman <[EMAIL PROTECTED]> wrote: > > > if (!user_mode_vm(regs)) { > > + nmi_exit(); > > + local_irq_enable(); > > current->thread.trap_no = 2; > > crash_kexec(regs); > >

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* Neil Horman <[EMAIL PROTECTED]> wrote: > if (!user_mode_vm(regs)) { > + nmi_exit(); > + local_irq_enable(); > current->thread.trap_no = 2; > crash_kexec(regs); looks good to me, but please move the local_irq_enable() to within

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Neil Horman
On Wed, Feb 06, 2008 at 12:21:30PM -0800, H. Peter Anvin wrote: > Neil Horman wrote: > >Can an APIC accept an NMI while already handling an NMI? I didn't think > >they > >would interrupt one another, but rather, pend until such time as the > >previous > >NMI was cleared > > The CPU certainly

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Wed, Feb 06, 2008 at 03:12:23PM -0500, Neil Horman wrote: > On Wed, Feb 06, 2008 at 02:40:40PM -0500, Vivek Goyal wrote: > > On Wed, Feb 06, 2008 at 02:25:55PM -0500, Neil Horman wrote: > > > Hey all- > > > A hang on kdump was reported to me awhile back, only when systems died > > > via nmi

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread H. Peter Anvin
Neil Horman wrote: Can an APIC accept an NMI while already handling an NMI? I didn't think they would interrupt one another, but rather, pend until such time as the previous NMI was cleared The CPU certainly won't (there is a hidden flag that's cleared on IRET which disables NMI; it's

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Neil Horman
On Wed, Feb 06, 2008 at 02:40:40PM -0500, Vivek Goyal wrote: > On Wed, Feb 06, 2008 at 02:25:55PM -0500, Neil Horman wrote: > > Hey all- > > A hang on kdump was reported to me awhile back, only when systems died > > via nmi watchdog panic. The hang wouldn't always be in the same place, but >

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Wed, Feb 06, 2008 at 02:25:55PM -0500, Neil Horman wrote: > Hey all- > A hang on kdump was reported to me awhile back, only when systems died > via nmi watchdog panic. The hang wouldn't always be in the same place, but it > would usually be somewhere down in purgatory. In looking at the

[PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Neil Horman
Hey all- A hang on kdump was reported to me awhile back, only when systems died via nmi watchdog panic. The hang wouldn't always be in the same place, but it would usually be somewhere down in purgatory. In looking at the code, it occurred to me that since, during an nmi interrupt, we

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Wed, Feb 06, 2008 at 02:25:55PM -0500, Neil Horman wrote: Hey all- A hang on kdump was reported to me awhile back, only when systems died via nmi watchdog panic. The hang wouldn't always be in the same place, but it would usually be somewhere down in purgatory. In looking at the

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Wed, Feb 06, 2008 at 03:12:23PM -0500, Neil Horman wrote: On Wed, Feb 06, 2008 at 02:40:40PM -0500, Vivek Goyal wrote: On Wed, Feb 06, 2008 at 02:25:55PM -0500, Neil Horman wrote: Hey all- A hang on kdump was reported to me awhile back, only when systems died via nmi watchdog

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Neil Horman
On Wed, Feb 06, 2008 at 12:21:30PM -0800, H. Peter Anvin wrote: Neil Horman wrote: Can an APIC accept an NMI while already handling an NMI? I didn't think they would interrupt one another, but rather, pend until such time as the previous NMI was cleared The CPU certainly won't (there

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* Neil Horman [EMAIL PROTECTED] wrote: if (!user_mode_vm(regs)) { + nmi_exit(); + local_irq_enable(); current->thread.trap_no = 2; crash_kexec(regs); looks good to me, but please move the local_irq_enable() to within crash_kexec()

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Vivek Goyal
On Wed, Feb 06, 2008 at 11:00:01PM +0100, Ingo Molnar wrote: * Neil Horman [EMAIL PROTECTED] wrote: if (!user_mode_vm(regs)) { + nmi_exit(); + local_irq_enable(); current->thread.trap_no = 2; crash_kexec(regs); looks good to me, but

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* Vivek Goyal [EMAIL PROTECTED] wrote: On Wed, Feb 06, 2008 at 11:00:01PM +0100, Ingo Molnar wrote: * Neil Horman [EMAIL PROTECTED] wrote: if (!user_mode_vm(regs)) { + nmi_exit(); + local_irq_enable(); current->thread.trap_no = 2;

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread H. Peter Anvin
Vivek Goyal wrote: I am wondering if interrupts are disabled on crashing cpu or if crashing cpu is inside die_nmi(), how would it stop/prevent delivery of NMI IPI to other cpus. I don't see how it would. -hpa

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* H. Peter Anvin [EMAIL PROTECTED] wrote: I am wondering if interrupts are disabled on crashing cpu or if crashing cpu is inside die_nmi(), how would it stop/prevent delivery of NMI IPI to other cpus. I don't see how it would. cross-CPU IPIs are a bit fragile on some PC platforms. So if

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Eric W. Biederman
Ingo Molnar [EMAIL PROTECTED] writes: * H. Peter Anvin [EMAIL PROTECTED] wrote: I am wondering if interrupts are disabled on crashing cpu or if crashing cpu is inside die_nmi(), how would it stop/prevent delivery of NMI IPI to other cpus. I don't see how it would. cross-CPU IPIs are a

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Ingo Molnar
* Eric W. Biederman [EMAIL PROTECTED] wrote: Looking at the patch the local_irq_enable() is totally bogus. As soon as we hit machine_crash_shutdown the first thing we do is disable irqs. yeah. I'm wondering if someone was using the switch cpus on crash patch that was floating around.

Re: [PATCH], issue EOI to APIC prior to calling crash_kexec in die_nmi path

2008-02-06 Thread Eric W. Biederman
Ingo Molnar [EMAIL PROTECTED] writes: * Eric W. Biederman [EMAIL PROTECTED] wrote: Looking at the patch the local_irq_enable() is totally bogus. As soon as we hit machine_crash_shutdown the first thing we do is disable irqs. yeah. I'm wondering if someone was using the switch cpus on