Borislav Petkov <[EMAIL PROTECTED]> wrote:
>
> On Monday 11 April 2005 11:43, Andrew Morton wrote:
> > (Please do reply-to-all)
> >
> > "J.A. Magallon" <[EMAIL PROTECTED]> wrote:
> > > On 04.11, Andrew Morton wrote:
> > >  > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc2/2.6.12-rc2-mm3/
> > >
> > >  Is this not needed anymore ?
> > >
> > >  --- 25/arch/i386/kernel/entry.S~nmi_stack_correct-fix    2005-04-05 00:02:48.000000000 -0700
> > >  +++ 25-akpm/arch/i386/kernel/entry.S     2005-04-05 00:02:48.000000000 -0700
> >
> > Hopefully not. fix-crash-in-entrys-restore_all.patch works around the
> > problem. -
> 
> Hello Andrew,
> I don't know whether you remember the mysterious crashes I was telling you 
> about last week and me rookiesh-ly trying to debug them with kgdb over the 
> serial console. Well, today I tried for the n-th time again and after rc2-mm3 
> blocked again while loading, here's what I did:
> 
> <snip>
> [   12.335438] NET: Registered protocol family 17
> [   12.362483] Testing NMI watchdog ... OK.
> [   12.416195] Starting balanced_irq
> [   12.443099] VFS: Mounted root (ext2 filesystem) readonly.
> [   12.472490] Freeing unused kernel memory: 196k freed
> [   12.521004] logips2pp: Detected unknown logitech mouse model 1
> [   12.572581] Warning: unable to open an initial console.
> [   12.972518] input: PS/2 Logitech Mouse on isa0060/serio1
> 
> Program received signal SIGTRAP, Trace/breakpoint trap.
> 0xc0102ee7 in resume_kernelX () at atomic.h:175 <--- this one is wrong for a 
> mysterious reason
> 175     {
> (gdb) p $eip
> $1 = (void *) 0xc0102ee7
> 
> (gdb) disas 0xc0102ee7
> Dump of assembler code for function resume_kernelX:
> 0xc0102ee7 <resume_kernelX+0>:  mov    0x30(%esp),%eax
> 0xc0102eeb <resume_kernelX+4>:  mov    0x38(%esp),%ah
> 0xc0102eef <resume_kernelX+8>:  mov    0x2c(%esp),%al
> 0xc0102ef3 <resume_kernelX+12>: and    $0x20403,%eax
> 0xc0102ef8 <resume_kernelX+17>: cmp    $0x403,%eax
> 0xc0102efd <resume_kernelX+22>: je     0xc0102f0c <ldt_ss>
> End of assembler dump.
> (gdb)  
> 
> And as we see, we're at the "mov    0x30(%esp),%eax" which accesses above the 
> bottom of the stack. After applying nmi_stack_correct-fix.patch, rc2-mm3 
> booted just fine, so I IMHO think that we might still be needing this, after 
> all.
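
For readers following the disassembly quoted above, here is a minimal C sketch of what the six instructions appear to test. It assumes the usual 2.6-era i386 meanings of the offsets (0x30 = EFLAGS, 0x38 = OLDSS, 0x2c = CS) and of the mask bits. The function names and the selector values in the usage note are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VM_MASK   0x00020000u  /* EFLAGS.VM: set while in vm86 mode */
#define SS_TI_BIT (4u << 8)    /* TI bit of the OLDSS selector, as seen in %ah */
#define CS_RPL    3u           /* privilege-level bits of CS, as seen in %al */

/* The immediate in "and $0x20403,%eax" is the OR of the three fields. */
static_assert((VM_MASK | SS_TI_BIT | CS_RPL) == 0x20403u, "mask from the dump");

/* Mimic the three loads: %eax = EFLAGS, then %ah = low byte of OLDSS,
 * then %al = low byte of CS. */
static uint32_t pack_eax(uint32_t eflags, uint16_t oldss, uint16_t cs)
{
    uint32_t eax = eflags & 0xffff0000u;    /* low 16 bits get overwritten */
    eax |= (uint32_t)(oldss & 0xffu) << 8;  /* movb 0x38(%esp),%ah */
    eax |= (uint32_t)(cs & 0xffu);          /* movb 0x2c(%esp),%al */
    return eax;
}

/* True when the "je ldt_ss" would be taken: not vm86, SS selector
 * points into the LDT, and CS has RPL 3 (returning to user mode). */
static bool takes_ldt_ss_branch(uint32_t eflags, uint16_t oldss, uint16_t cs)
{
    uint32_t eax = pack_eax(eflags, oldss, cs);
    return (eax & (VM_MASK | SS_TI_BIT | CS_RPL)) == (SS_TI_BIT | CS_RPL);
}
```

With illustrative selectors, takes_ldt_ss_branch(0x202, 0x0f, 0x73) is true (LDT stack segment, user-mode CS), while a GDT stack segment such as 0x7b, a kernel CS such as 0x60, or EFLAGS with the VM bit set all make it false. The point for this thread is only that the check has to read the OLDSS slot at 0x38(%esp) before it knows whether that slot was ever pushed.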

Interesting.  It could be an interaction between the kgdb patch and the new
vm86 checking code.  (looks.  I don't think that's the case).

Stas, could you please take a look at 2.6.12-rc2-mm3's entry.S sometime,
see if you think my theory is correct?

It seems that you have CONFIG_TRAP_BAD_SYSCALL_EXITS enabled - I can't say
that I've ever used that, and I really should remove it.  But I doubt if
that is the cause of this bug.


The above code is accessing esp+56, but Stas's patch only offsets the stack
pointer by 8 bytes, so I assume this, in copy_thread():

-       p->thread.esp0 = (unsigned long) (childregs+1) - 8;
+       p->thread.esp0 = (unsigned long) (childregs+1) - 15;

fixes it?
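
The offset arithmetic behind that question can be sketched in C. This assumes the 2.6-era i386 register frame layout (fifteen 32-bit slots, with esp and xss present only when the trap came from a lower privilege level). The struct is a hypothetical userspace mirror written for illustration, not the kernel's own definition, and bytes_past_frame() is an invented helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the saved-register frame, for offset checks only. */
struct pt_regs_i386 {
    uint32_t ebx, ecx, edx, esi, edi, ebp, eax;
    uint32_t xds, xes, orig_eax;
    uint32_t eip, xcs, eflags;  /* always pushed by the CPU */
    uint32_t esp, xss;          /* pushed only on a privilege change */
};

/* These match the displacements in the quoted disassembly. */
static_assert(offsetof(struct pt_regs_i386, xcs) == 0x2c, "CS slot");
static_assert(offsetof(struct pt_regs_i386, eflags) == 0x30, "EFLAGS slot");
static_assert(offsetof(struct pt_regs_i386, xss) == 0x38, "OLDSS slot");
static_assert(sizeof(struct pt_regs_i386) == 60, "full frame");

/* Size of a frame taken without a privilege change: no esp/xss slots. */
#define KERNEL_FRAME_BYTES offsetof(struct pt_regs_i386, esp)

/* How many bytes a read of `width` bytes at `offset` extends beyond a
 * frame that is `frame_bytes` long (0 if it stays inside). */
static size_t bytes_past_frame(size_t offset, size_t width, size_t frame_bytes)
{
    size_t end = offset + width;
    return end > frame_bytes ? end - frame_bytes : 0;
}
```

Under these assumptions, the one-byte load from 0x38(%esp) stays inside a full 60-byte frame but ends 5 bytes past a 52-byte same-privilege frame: bytes_past_frame(0x38, 1, KERNEL_FRAME_BYTES) is 5. Whether the slack left by the esp0 adjustment covers that depends on where the frame sits on the stack, which this sketch does not settle.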