RE: [BUG] freeze Alpha ES40 SMP 2.4.4.ac3, another TCP/IP Problem ? ( was 2.4.4 kernel crash , possibly tcp related )

2001-05-03 Thread Cabaniols, Sebastien
> Andrea Arcangeli wrote: > > On Thu, May 03, 2001 at 06:16:02PM +0200, Cabaniols, Sebastien wrote: > > > The only thing that does not work under load is the network TCP/IP ? > > My alpha is running 2.4.4aa3 under very high load (apache beaten from ab in loop via 100mbit

RE: [BUG] freeze Alpha ES40 SMP 2.4.4.ac3, another TCP/IP Problem ? ( was 2.4.4 kernel crash , possibly tcp related )

2001-05-03 Thread Cabaniols, Sebastien
> Silly question, Sebastien - when you do a "show config" at the console, how is your card represented? FWIU, there have been problems with adapters under load that aren't fully supported by SRM... Just a guess. Could you try this with a DE600 (Intel) or a DE500 (tulip)? > - Pete

Re: [BUG] freeze Alpha ES40 SMP 2.4.4.ac3, another TCP/IP Problem ? ( was 2.4.4 kernel crash , possibly tcp related )

2001-05-03 Thread Andrea Arcangeli
On Thu, May 03, 2001 at 06:46:10PM +0200, Andrea Arcangeli wrote: > as well. The only annoying thing is that UP kernel compiles seem not to boot but I hope that will be fixed soon too. Ok, I spotted and fixed the bug that prevented my tree from booting with UP compiles on alpha. The bug is that

Re: [BUG] freeze Alpha ES40 SMP 2.4.4.ac3, another TCP/IP Problem ? ( was 2.4.4 kernel crash , possibly tcp related )

2001-05-03 Thread Andrea Arcangeli
On Thu, May 03, 2001 at 06:16:02PM +0200, Cabaniols, Sebastien wrote: > The only thing that does not work under load is the network TCP/IP ? My alpha is running 2.4.4aa3 under very high load (apache beaten from ab in loop via 100mbit switched network [tulip on the alpha] plus cerberus) and I

[BUG] freeze Alpha ES40 SMP 2.4.4.ac3, another TCP/IP Problem ? (was 2.4.4 kernel crash , possibly tcp related )

2001-05-03 Thread Cabaniols, Sebastien
Hello, I have a bug on an Alpha ES40 SMP 2.4.4.ac3 modified (TCP Bug from lkml) Platform: Linux Version: --- My kernel is 2.4.4-ac3 with the tcp.c file modified as suggested by the following patch. >I see! Dave, please, take the second Andrea's patch (appended). >It is

Re: 2.4.4: Kernel crash, possibly tcp related

2001-05-01 Thread David S. Miller
[EMAIL PROTECTED] writes: > > See? > > I see! Dave, please, take the second Andrea's patch (appended). > It is really the cleanest one. Thanks a lot Andrea and Alexey. I've applied the patch. Later, David S. Miller [EMAIL PROTECTED]

Re: 2.4.4: Kernel crash, possibly tcp related

2001-05-01 Thread kuznet
Hello! > If send_head doesn't point to skb then it is before it (and it cannot > advance under us of course because we hold the sock lock) and so in such > case we didn't clobbered the send_head at all in skb_entail, and so we > don't need to touch send_head in order to undo (we only need to
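A toy model of the undo rule quoted in the message above, assuming a simplified doubly linked write queue. The types and functions (toy_skb, toy_sock, toy_entail, toy_undo_entail) are hypothetical; this is neither the kernel's sk_buff API nor the patch that was applied. It only illustrates why send_head needs resetting solely when it points at the buffer being removed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a socket buffer and its write queue (hypothetical names). */
struct toy_skb {
    struct toy_skb *prev;
    struct toy_skb *next;
    int len;
};

struct toy_sock {
    struct toy_skb *head;      /* oldest queued buffer */
    struct toy_skb *tail;      /* newest queued buffer */
    struct toy_skb *send_head; /* first buffer not yet sent, or NULL */
};

/* Append skb to the tail; send_head is set only if nothing was pending. */
static void toy_entail(struct toy_sock *sk, struct toy_skb *skb)
{
    skb->next = NULL;
    skb->prev = sk->tail;
    if (sk->tail)
        sk->tail->next = skb;
    else
        sk->head = skb;
    sk->tail = skb;
    if (sk->send_head == NULL)
        sk->send_head = skb;
}

/* Undo toy_entail() for the buffer that was queued last. */
static void toy_undo_entail(struct toy_sock *sk, struct toy_skb *skb)
{
    assert(sk->tail == skb); /* only the newest buffer can be undone */

    /* If send_head points elsewhere it is earlier in the queue and was
     * never modified by toy_entail(), so it must be left alone. */
    if (sk->send_head == skb)
        sk->send_head = NULL;

    /* Unlink skb from the tail of the queue. */
    sk->tail = skb->prev;
    if (sk->tail)
        sk->tail->next = NULL;
    else
        sk->head = NULL;
    skb->prev = NULL;
    skb->next = NULL;
}
```

With two buffers queued, undoing the second leaves send_head on the first; undoing a lone buffer sets send_head back to NULL.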

Re: 2.4.4: Kernel crash, possibly tcp related

2001-05-01 Thread Andrea Arcangeli
On Tue, May 01, 2001 at 09:25:43PM +0400, [EMAIL PROTECTED] wrote: > Hello! > > > zero and we are running in such slow path, it is obvious the send_head > > _was_ NULL when we entered the critical section, so it's perfectly fine > > It is not only not obvious, it is not true almost always. > On

Re: 2.4.4: Kernel crash, possibly tcp related

2001-05-01 Thread kuznet
Hello! > zero and we are running in such slow path, it is obvious the send_head > _was_ NULL when we entered the critical section, so it's perfectly fine It is not only not obvious, it is not true almost always. On normally working tcp send_head is almost never NULL, it is NULL only when

Re: 2.4.4: Kernel crash, possibly tcp related

2001-05-01 Thread Andrea Arcangeli
On Tue, May 01, 2001 at 08:44:52PM +0400, [EMAIL PROTECTED] wrote: > Hello! > > > this is the strict fix: > > Andrea, you caught the problem! > > The fix is not right though (it is equivalent to straight > tp->send_head=NULL, as you noticed. It also corrupts queue in > an opposite manner.)

Re: 2.4.4: Kernel crash, possibly tcp related

2001-05-01 Thread kuznet
Hello! > this is the strict fix: Andrea, you caught the problem! The fix is not right though (it is equivalent to straight tp->send_head=NULL, as you noticed. It also corrupts queue in an opposite manner.) Right fix is appended. Explanation: in do_fault we must undo effect of enqueueing new

Re: 2.4.4: Kernel crash, possibly tcp related

2001-05-01 Thread Andrea Arcangeli
On Mon, Apr 30, 2001 at 09:00:09PM +0400, [EMAIL PROTECTED] wrote: > Hello! > > > My current theory is that tcpblast does something erratic when the > > error occurs. > > It has buffer size of 32K, so that it faults at enough large chunk sizes. > > Erratic errno is because this applet prints

Re: 2.4.4: Kernel crash, possibly tcp related

2001-04-30 Thread Ingo Oeser
On Mon, Apr 30, 2001 at 06:46:33PM +0200, Andrea Arcangeli wrote: > On Sun, Apr 29, 2001 at 11:58:20PM -0700, David S. Miller wrote: > > Andrew Morton writes: > > > "David S. Miller" wrote: > > Anyways, I just tried to reproduce Ralf's problem on two of my > > machines. One was an SMP sparc64

Re: 2.4.4: Kernel crash, possibly tcp related

2001-04-30 Thread kuznet
Hello! > My current theory is that tcpblast does something erratic when the > error occurs. It has buffer size of 32K, so that it faults at enough large chunk sizes. Erratic errno is because this applet prints errno on partial write. Oops is apparently because I did something wrong in
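The remark about errno on a partial write can be illustrated with a short sketch. It assumes a POSIX system; write_all is a hypothetical helper and is not taken from tcpblast. The point it shows is that errno is meaningful only when write() returns -1; after a short write it may still hold a stale value from an earlier call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write all len bytes of buf to fd.
 * Returns 0 on success, or -1 with errno set by the failing write(). */
static int write_all(int fd, const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;  /* interrupted before any data was written: retry */
            return -1;     /* real error: errno is valid here */
        }
        /* Short write: not an error, so errno must not be inspected. */
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would report errno (for example with perror) only when write_all returns -1.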

Re: 2.4.4: Kernel crash, possibly tcp related

2001-04-30 Thread Andrea Arcangeli
On Sun, Apr 29, 2001 at 11:58:20PM -0700, David S. Miller wrote: > > Andrew Morton writes: > > "David S. Miller" wrote: > > > > > > I'm having a devil of a time finding the tcpblast sources on the > > > net, can you point me to where I can get them? > > > > I seem to have a copy. > >

Re: 2.4.4: Kernel crash, possibly tcp related

2001-04-30 Thread Ralf Nyren
On Sun, 29 Apr 2001, David S. Miller wrote: [snip] > > Anyways, I just tried to reproduce Ralf's problem on two of my > machines. One was an SMP sparc64 system, and the other was my > uniprocessor Athlon. > > What kind of machine are you reproducing this on Ralf? I'm not > even getting the

Re: 2.4.4: Kernel crash, possibly tcp related

2001-04-30 Thread David S. Miller
Andrew Morton writes: > "David S. Miller" wrote: > > > > I'm having a devil of a time finding the tcpblast sources on the > > net, can you point me to where I can get them? > > I seem to have a copy. > > http://www.zip.com.au/~akpm/tcpblast-19990504.tar.gz Thanks to everyone who

Re: 2.4.4: Kernel crash, possibly tcp related

2001-04-30 Thread J Sloan
"David S. Miller" schrieb: > I'm having a devil of a time finding the tcpblast sources on the > net, can you point me to where I can get them? The one reference > I saw to get the original sources was: > > ftp://ftp.xlink.net/pub/network/tcpblast.shar.gz > > But even that directory no longer

Re: 2.4.4: Kernel crash, possibly tcp related

2001-04-29 Thread David S. Miller
Ralf Nyren writes: > The problem appears when this value is set to 40481 or higher. For ex: > $ tcpblast -d0 -s 40481 another_host 9000 ... > KERNEL: assertion (!skb_queue_empty(&sk->write_queue)) failed at tcp_timer.c(327): tcp_retransmit_timer > Unable to handle kernel NULL pointer

2.4.4: Kernel crash, possibly tcp related

2001-04-29 Thread Ralf Nyren
Greetings, A possibly tcp-related bug causing a kernel crash, possible to trigger from an unprivileged user. Kernel 2.4.4, no patches applied. The problem appeared when performing some network-performance tests with a program called tcpblast. tcpblast has an option to set its "block size".
