Re: kernel crashes inside MV643xx driver

2007-09-05 Thread Satyam Sharma


On Wed, 5 Sep 2007, Dale Farnsworth wrote:

> On Wed, Sep 05, 2007 at 08:24:52AM -0700, Andrew Morton wrote:
> > > On Mon, 20 Aug 2007 14:38:57 +0800 gshan <[EMAIL PROTECTED]> wrote:
> > > Hi All,
> > > 
> > > After I started the NFS server, it crashed:
> > > 
> > > <3>Badness in local_bh_enable at 
> > > /home/cli4/sandbox/main/TelicaRoot/components/mvlinux/cge/devkit/lsp/7xx/linux/kernel/softirq.c:195
> > > Badness in local_bh_enable at 
> > > /home/cli4/sandbox/main/TelicaRoot/components/mvlinux/cge/devkit/lsp/7xx/linux/kernel/softirq.c:195
> > > Call trace:
> > >  [c0005340] check_bug_trap+0xbc/0x11c
> > >  [c0005604] ProgramCheckException+0x264/0x2bc
> > >  [c0004ac4] ret_from_except_full+0x0/0x4c
> > >  [c0022ae4] local_bh_enable+0x18/0x80
> > >  [c024648c] skb_copy_bits+0x168/0x3b8
> > >  [c024db44] __skb_linearize+0x90/0x150
> > >  [c020e8a4] mv643xx_eth_start_xmit+0x4c0/0x5bc
> > >  [c025c934] qdisc_restart+0xac/0x2bc
> > >  [c024de9c] dev_queue_xmit+0x298/0x34c
> > >  [c0269814] ip_finish_output+0x140/0x2b8
> > >  [c026a3ac] ip_fragment+0x3cc/0x6e0
> > >  [c026bac8] ip_push_pending_frames+0x3dc/0x46c
> > >  [c0289ec4] udp_push_pending_frames+0x10c/0x1cc
> > >  [c028a7c4] udp_sendpage+0x104/0x188
> > >  [c0292fc8] inet_sendpage+0x90/0xb8
> > > 
> > > I searched the webs and found the similar problems:

> > > http://www.mail-archive.com/netdev@vger.kernel.org/msg05199.html

> > > http://oss.sgi.com/archives/netdev/2005-09/msg00025.html
> > > 
> > > Who knew there are fixes for the problem?
> > 
> > Well that got a tremendous response, didn't it?

I thought of replying to this post when I saw it couple weeks back, but
didn't because (1) that's an ancient kernel and (2) the first link that
was mentioned up there in the original post itself pointed to a patch to
solve it (which I verified was since applied to mainline too).


> > What do you mean by "crashed"?  The above is a warning and the system
> > should have survived.
> > 
> > Which kernel version is being used?
> 
> That is the key question.  From the pathnames, I suspect that gshan is
> using a MontaVista version.  I'm still (especially) interested, since
> MontaVista pays me.
> 
> BTW, I never received the original message on netdev or linux-kernel.
> Hmm.  Thanks to Andrew for replying.
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: kernel crashes inside MV643xx driver

2007-09-05 Thread Dale Farnsworth
On Wed, Sep 05, 2007 at 08:24:52AM -0700, Andrew Morton wrote:
> > On Mon, 20 Aug 2007 14:38:57 +0800 gshan <[EMAIL PROTECTED]> wrote:
> > Hi All,
> > 
> > After I started the NFS server, it crashed:
> > 
> > <3>Badness in local_bh_enable at 
> > /home/cli4/sandbox/main/TelicaRoot/components/mvlinux/cge/devkit/lsp/7xx/linux/kernel/softirq.c:195
> > Badness in local_bh_enable at 
> > /home/cli4/sandbox/main/TelicaRoot/components/mvlinux/cge/devkit/lsp/7xx/linux/kernel/softirq.c:195
> > Call trace:
> >  [c0005340] check_bug_trap+0xbc/0x11c
> >  [c0005604] ProgramCheckException+0x264/0x2bc
> >  [c0004ac4] ret_from_except_full+0x0/0x4c
> >  [c0022ae4] local_bh_enable+0x18/0x80
> >  [c024648c] skb_copy_bits+0x168/0x3b8
> >  [c024db44] __skb_linearize+0x90/0x150
> >  [c020e8a4] mv643xx_eth_start_xmit+0x4c0/0x5bc
> >  [c025c934] qdisc_restart+0xac/0x2bc
> >  [c024de9c] dev_queue_xmit+0x298/0x34c
> >  [c0269814] ip_finish_output+0x140/0x2b8
> >  [c026a3ac] ip_fragment+0x3cc/0x6e0
> >  [c026bac8] ip_push_pending_frames+0x3dc/0x46c
> >  [c0289ec4] udp_push_pending_frames+0x10c/0x1cc
> >  [c028a7c4] udp_sendpage+0x104/0x188
> >  [c0292fc8] inet_sendpage+0x90/0xb8
> > 
> > I searched the webs and found the similar problems:
> > http://www.mail-archive.com/netdev@vger.kernel.org/msg05199.html
> > http://oss.sgi.com/archives/netdev/2005-09/msg00025.html
> > 
> > Who knew there are fixes for the problem?
> > 
> 
> Well that got a tremendous response, didn't it?
> 
> What do you mean by "crashed"?  The above is a warning and the system
> should have survived.
> 
> Which kernel version is being used?

That is the key question.  From the pathnames, I suspect that gshan is
using a MontaVista version.  I'm still (especially) interested, since
MontaVista pays me.

BTW, I never received the original message on netdev or linux-kernel.
Hmm.  Thanks to Andrew for replying.

-Dale
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: kernel crashes inside MV643xx driver

2007-09-05 Thread Stephen Hemminger
On Wed, 5 Sep 2007 08:24:52 -0700
Andrew Morton <[EMAIL PROTECTED]> wrote:

> > On Mon, 20 Aug 2007 14:38:57 +0800 gshan <[EMAIL PROTECTED]> wrote:
> > Hi All,
> > 
> > After I started the NFS server, it crashed:
> > 
> > <3>Badness in local_bh_enable at 
> > /home/cli4/sandbox/main/TelicaRoot/components/mvlinux/cge/devkit/lsp/7xx/linux/kernel/softirq.c:195
> > Badness in local_bh_enable at 
> > /home/cli4/sandbox/main/TelicaRoot/components/mvlinux/cge/devkit/lsp/7xx/linux/kernel/softirq.c:195
> > Call trace:
> >  [c0005340] check_bug_trap+0xbc/0x11c
> >  [c0005604] ProgramCheckException+0x264/0x2bc
> >  [c0004ac4] ret_from_except_full+0x0/0x4c
> >  [c0022ae4] local_bh_enable+0x18/0x80
> >  [c024648c] skb_copy_bits+0x168/0x3b8
> >  [c024db44] __skb_linearize+0x90/0x150
> >  [c020e8a4] mv643xx_eth_start_xmit+0x4c0/0x5bc
> >  [c025c934] qdisc_restart+0xac/0x2bc
> >  [c024de9c] dev_queue_xmit+0x298/0x34c
> >  [c0269814] ip_finish_output+0x140/0x2b8
> >  [c026a3ac] ip_fragment+0x3cc/0x6e0
> >  [c026bac8] ip_push_pending_frames+0x3dc/0x46c
> >  [c0289ec4] udp_push_pending_frames+0x10c/0x1cc
> >  [c028a7c4] udp_sendpage+0x104/0x188
> >  [c0292fc8] inet_sendpage+0x90/0xb8
> > 
> > I searched the webs and found the similar problems:
> > http://www.mail-archive.com/netdev@vger.kernel.org/msg05199.html
> > http://oss.sgi.com/archives/netdev/2005-09/msg00025.html
> > 
> > Who knew there are fixes for the problem?
> > 
> 
> Well that got a tremendous response, didn't it?
> 
> What do you mean by "crashed"?  The above is a warning and the system
> should have survived.
> 
> Which kernel version is being used?

The transmit start rework look like it should have already fixed the problem.
The driver was calling spin_lock_irqsave before calling skb_linearize, now
it checks first.

commit c8aaea25e0b069e9572caa74f984e109899c1765
Author: Dale Farnsworth <[EMAIL PROTECTED]>
Date:   Fri Mar 3 10:02:05 2006 -0700

[PATCH] mv643xx_eth: Refactor tx command queuing code

Simplify and remove redundant code for filling transmit descriptors.
No changes in features; it's just a code reorganization/cleanup.

Signed-off-by: Dale Farnsworth <[EMAIL PROTECTED]>
Signed-off-by: Jeff Garzik <[EMAIL PROTECTED]>
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: kernel crashes inside MV643xx driver

2007-09-05 Thread Andrew Morton
> On Mon, 20 Aug 2007 14:38:57 +0800 gshan <[EMAIL PROTECTED]> wrote:
> Hi All,
> 
> After I started the NFS server, it crashed:
> 
> <3>Badness in local_bh_enable at 
> /home/cli4/sandbox/main/TelicaRoot/components/mvlinux/cge/devkit/lsp/7xx/linux/kernel/softirq.c:195
> Badness in local_bh_enable at 
> /home/cli4/sandbox/main/TelicaRoot/components/mvlinux/cge/devkit/lsp/7xx/linux/kernel/softirq.c:195
> Call trace:
>  [c0005340] check_bug_trap+0xbc/0x11c
>  [c0005604] ProgramCheckException+0x264/0x2bc
>  [c0004ac4] ret_from_except_full+0x0/0x4c
>  [c0022ae4] local_bh_enable+0x18/0x80
>  [c024648c] skb_copy_bits+0x168/0x3b8
>  [c024db44] __skb_linearize+0x90/0x150
>  [c020e8a4] mv643xx_eth_start_xmit+0x4c0/0x5bc
>  [c025c934] qdisc_restart+0xac/0x2bc
>  [c024de9c] dev_queue_xmit+0x298/0x34c
>  [c0269814] ip_finish_output+0x140/0x2b8
>  [c026a3ac] ip_fragment+0x3cc/0x6e0
>  [c026bac8] ip_push_pending_frames+0x3dc/0x46c
>  [c0289ec4] udp_push_pending_frames+0x10c/0x1cc
>  [c028a7c4] udp_sendpage+0x104/0x188
>  [c0292fc8] inet_sendpage+0x90/0xb8
> 
> I searched the webs and found the similar problems:
> http://www.mail-archive.com/netdev@vger.kernel.org/msg05199.html
> http://oss.sgi.com/archives/netdev/2005-09/msg00025.html
> 
> Who knew there are fixes for the problem?
> 

Well that got a tremendous response, didn't it?

What do you mean by "crashed"?  The above is a warning and the system
should have survived.

Which kernel version is being used?


-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html