Re: Oops in 2.6.23-rc5

2007-09-02 Thread Christian Kujau

On Sun, 2 Sep 2007, Herbert Xu wrote:

> You want this patch (by davem).


I applied the patch and the box has been up for 1hr now. Since I was able to
reproduce the oops pretty reliably with this bittorrent thingy, I
did the same a few times now, but the box did NOT crash :)



> Unfortunately people are travelling so I'm not sure when it'll
> get picked up by Linus.


I've seen this patch only in:
http://article.gmane.org/gmane.linux.network/70781

And, for the archives, a similar looking error report:
http://article.gmane.org/gmane.linux.network/70777

Thanks for the quick reply, Herbert!

Christian.
--
BOFH excuse #297:

Too many interrupts
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Oops in 2.6.23-rc5

2007-09-01 Thread Herbert Xu
Christian Kujau <[EMAIL PROTECTED]> wrote:
> 
> today I switched from 2.6.22.3 to 2.6.23-rc5 (skipped quite a few -rc 
> versions due to lack of time), and the box keeps panicking under certain 
> circumstances. I suspected disk related problems, because: when the box 
> is up, I usually resume ~10 bittorrent files. When doing this, each
> file (~200MB...1GB) is checked and disk activity is pretty high (20MB/s
> or so), and after 1 minute of doing so the box panics. Every time.

You want this patch (by davem).

Unfortunately people are travelling so I'm not sure when it'll
get picked up by Linus.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
From: [EMAIL PROTECTED] (David Miller)

> ip is at tcp_rto_min+0x20/0x40

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1ee7212..bbad2cd 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -560,7 +560,7 @@ static u32 tcp_rto_min(struct sock *sk)
 	struct dst_entry *dst = __sk_dst_get(sk);
 	u32 rto_min = TCP_RTO_MIN;
 
-	if (dst_metric_locked(dst, RTAX_RTO_MIN))
+	if (dst && dst_metric_locked(dst, RTAX_RTO_MIN))
 		rto_min = dst->metrics[RTAX_RTO_MIN-1];
 	return rto_min;
 }
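
For the archives, the one-line change above can be tried outside the kernel. What follows is only a userspace sketch, not the kernel code: struct dst_entry, the constant values, and the helper rto_min_for() are simplified stand-ins I made up for illustration, and the assumption is that the oops came from dereferencing a NULL dst. The point it shows is that && short-circuits, so dst_metric_locked() is never called with a NULL pointer and the default is returned instead.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the values are placeholders, not taken from the kernel. */
#define RTAX_LOCK     1
#define RTAX_RTO_MIN  13
#define RTAX_MAX      16
#define TCP_RTO_MIN   200u

/* Simplified stand-in for the kernel's struct dst_entry. */
struct dst_entry {
	unsigned int metrics[RTAX_MAX];
};

/* Simplified lock test: dereferences dst, so the caller must pass non-NULL. */
static int dst_metric_locked(const struct dst_entry *dst, int metric)
{
	return (dst->metrics[RTAX_LOCK - 1] & (1u << metric)) != 0;
}

/* Hypothetical helper mirroring the patched check. */
static unsigned int rto_min_for(const struct dst_entry *dst)
{
	unsigned int rto_min = TCP_RTO_MIN;

	/* "dst &&" is evaluated first; a NULL dst skips the dereference. */
	if (dst && dst_metric_locked(dst, RTAX_RTO_MIN))
		rto_min = dst->metrics[RTAX_RTO_MIN - 1];
	return rto_min;
}
```

Calling rto_min_for(NULL) returns TCP_RTO_MIN in this sketch. Without the "dst &&" guard the same call would dereference a NULL pointer inside dst_metric_locked().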


Oops in 2.6.23-rc5

2007-09-01 Thread Christian Kujau

Hi,

today I switched from 2.6.22.3 to 2.6.23-rc5 (skipped quite a few -rc 
versions due to lack of time), and the box keeps panicking under certain 
circumstances. I suspected disk related problems, because: when the box 
is up, I usually resume ~10 bittorrent files. When doing this, each
file (~200MB...1GB) is checked and disk activity is pretty high (20MB/s
or so), and after 1 minute of doing so the box panics. Every time.

However, I could not reproduce it while generating disk-io with say tar 
or rsync to the same fs. It always panicked when the torrent client(s) 
start up. As the box would not log anything via remote-syslog before 
halting, I connected a vga display. As I don't have a digital camera, I 
tried to write down some stuff: http://ww.nerdbynature.de/bits/2.6.23-rc5/
(I'll try to write down the full oops to this place, or what was still
visible from it, because the first few(?) lines were lost, display
scrollback was not working, only sysrq was).


The backtrace mentions do_page_fault, error_code, tcp_rtt_estimator, 
tcp_ack_saw_timestamp, tcp_ack, tcp_rcv_established, tcp_v4_do_rcv, 
tcp_v4_rcv, ip_local_delimiter, netif_receive_skb, process_backlog, 
net_rcv_activate, __do_softirq, do_softirq - in that order. As said, the
correct addresses will be put at the URL above (Q: do I really need *all*
the numbers? Or just a few?). These snippets made me suspect network
related issues, because: aside from disk-io, the bittorrent clients will 
establish quite a few (~50 in total) connections to all the peers.


The box is an amd-k7, 2 NICs (forcedeth, 3c59x), 2 GB RAM, ACPI
disabled, gcc-4.1


Thanks for looking into this,
Christian.
--
BOFH excuse #335:

the AA battery in the wallclock sends magnetic interference

