Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines
On 8/26/05, Patrick McHardy <[EMAIL PROTECTED]> wrote:
> Alessandro Suardi wrote:
> > Stack is hand-copied from the dead box's console.
> >
> >  [<c0103714>] die+0xe4/0x170
> >  [<c010381f>] do_trap+0x7f/0xc0
> >  [<c0103b33>] do_invalid_op+0xa3/0xb0
> >  [<c0102faf>] error_code+0x4f/0x54
> >  [<c02eb05b>] kfree_skbmem+0xb/0x20
> >  [<c02eb0cf>] __kfree_skb+0x5f/0xf0
> >  [<c031304a>] tcp_clean_rtx_queue+0x16a/0x470
> >  [<c0313746>] tcp_ack+0xf6/0x360
> >  [<c0315d57>] tcp_rcv_established+0x277/0x7a0
> >  [<c031eba0>] tcp_v4_do_rcv+0xf0/0x110
> >  [<c031f2a0>] tcp_v4_rcv+0x6e0/0x820
> >  [<c0305594>] ip_local_deliver_finish+0x84/0x160
> >  [<c02fbe4a>] nf_reinject+0x13a/0x1c0
> >  [<c033f0d8>] ipq_issue_verdict+0x28/0x40
> >  [<c033f968>] ipq_set_verdict+0x48/0x70
> >  [<c033fa79>] ipq_receive_peer+0x39/0x50
> >  [<c033fc72>] ipq_receive_sk+0x172/0x190
> >  [<c02fffa5>] netlink_data_ready+0x35/0x60
> >  [<c02ff4a4>] netlink_sendskb+0x24/0x60
> >  [<c02ff657>] netlink_unicast+0x127/0x160
> >  [<c02ffcc4>] netlink_sendmsg+0x204/0x2b0
> >  [<c02e6dc0>] sock_sendmsg+0xb0/0xe0
> >  [<c02e83f4>] sys_sendmsg+0x134/0x240
> >  [<c02e88e4>] sys_socketcall+0x224/0x230
> >  [<c0102d3b>] sysenter_past_esp+0x54/0x75
> > Code: 8b 41 0c 85 c0 75 1b 8b 86 94 00 00 00 e8 9e 37 e5 ff 5b 5e c9
> >  c3 89 d0 e8 43 46 e5 ff 8d 76 00 eb d2 89 f0 e8 f7 fe ff ff eb dc <0f>
> >  0b 54 01 16 d2 36 c0 eb b4 8d 74 26 00 8d bc 27 00 00 00 00
> >  <0>Kernel panic - not syncing: Fatal exception in interrupt
> >
> > If there's need for further info I'd be happy to provide it. For now
> >  the box is rebooted into the same kernel and running the same
> >  PG/eD2k programs, if the issue reproduces I'll follow up on my
> >  own message.
>
> Any chance you can get the entire Oops including registers etc
> using netconsole or serial console?

Not right now, as I noticed netconsole requires netpoll and this
 latter can't be modular; but I'll do so before leaving tomorrow
 morning, obviously rebuilding with 2.6.13-rc7-git1 or -git2 if
 the new snapshot comes out.

At the moment, the box has been running for 32 hours with no
 sign of wanting to oops...

[EMAIL PROTECTED] ~]# ps ax | egrep 'peer|edon'
 2416 pts/2    Sl    25:37 peerguardnf -d -l /var/log/pg.log -c /etc/PG.conf
25186 pts/0    R+    76:37 ./edonkey2000
25189 pts/0    S+     0:06 ./edonkey2000
25191 pts/0    S+     9:49 ./edonkey2000
 7007 pts/0    S+     0:00 ./edonkey2000
 7011 pts/3    R+     0:00 egrep peer|edon
[EMAIL PROTECTED] ~]# w
 22:37:53 up 1 day,  7:49,  4 users,  load average: 0.15, 0.18, 0.25
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    donkey:2.0        Thu14   20:15m  1:26m  0.00s bash
root     pts/1    donkey:2.0        Thu14   13:40m  0.41s  1:57  gnome-terminal --sm-config-prefix /gnome-terminal-wBjEOn/ -
root     pts/2    donkey:2.0        Thu14    4:07  25:37   0.49s bash
root     pts/3    192.168.1.6       22:37    0.00s  0.06s  0.01s w

Thanks,

--alessandro

 "Not every smile means I'm laughing inside"

    (Wallflowers - "From The Bottom Of My Heart")
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines
Alessandro Suardi wrote:
> Stack is hand-copied from the dead box's console.
>
>  [<c0103714>] die+0xe4/0x170
>  [<c010381f>] do_trap+0x7f/0xc0
>  [<c0103b33>] do_invalid_op+0xa3/0xb0
>  [<c0102faf>] error_code+0x4f/0x54
>  [<c02eb05b>] kfree_skbmem+0xb/0x20
>  [<c02eb0cf>] __kfree_skb+0x5f/0xf0
>  [<c031304a>] tcp_clean_rtx_queue+0x16a/0x470
>  [<c0313746>] tcp_ack+0xf6/0x360
>  [<c0315d57>] tcp_rcv_established+0x277/0x7a0
>  [<c031eba0>] tcp_v4_do_rcv+0xf0/0x110
>  [<c031f2a0>] tcp_v4_rcv+0x6e0/0x820
>  [<c0305594>] ip_local_deliver_finish+0x84/0x160
>  [<c02fbe4a>] nf_reinject+0x13a/0x1c0
>  [<c033f0d8>] ipq_issue_verdict+0x28/0x40
>  [<c033f968>] ipq_set_verdict+0x48/0x70
>  [<c033fa79>] ipq_receive_peer+0x39/0x50
>  [<c033fc72>] ipq_receive_sk+0x172/0x190
>  [<c02fffa5>] netlink_data_ready+0x35/0x60
>  [<c02ff4a4>] netlink_sendskb+0x24/0x60
>  [<c02ff657>] netlink_unicast+0x127/0x160
>  [<c02ffcc4>] netlink_sendmsg+0x204/0x2b0
>  [<c02e6dc0>] sock_sendmsg+0xb0/0xe0
>  [<c02e83f4>] sys_sendmsg+0x134/0x240
>  [<c02e88e4>] sys_socketcall+0x224/0x230
>  [<c0102d3b>] sysenter_past_esp+0x54/0x75
> Code: 8b 41 0c 85 c0 75 1b 8b 86 94 00 00 00 e8 9e 37 e5 ff 5b 5e c9
>  c3 89 d0 e8 43 46 e5 ff 8d 76 00 eb d2 89 f0 e8 f7 fe ff ff eb dc <0f>
>  0b 54 01 16 d2 36 c0 eb b4 8d 74 26 00 8d bc 27 00 00 00 00
>  <0>Kernel panic - not syncing: Fatal exception in interrupt
>
> If there's need for further info I'd be happy to provide it. For now
>  the box is rebooted into the same kernel and running the same
>  PG/eD2k programs, if the issue reproduces I'll follow up on my
>  own message.

Any chance you can get the entire Oops including registers etc
using netconsole or serial console?
Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines
On Thu, Aug 25, 2005 at 11:02:01PM +0200, Sven Schuster wrote:
>
> Hi Harald,
>
> On Thu, Aug 25, 2005 at 06:55:50PM +0200, Harald Welte told us:
> > Is it true that PeerGuardian is a proprietary application?  I'm not
> > going to debug this problem using a proprietary ip_queue program, sorry.
>
> sorry to jump in here, but I took a quick look at PeerGuardian,
> according to
> http://methlabs.org/wiki/license_information
> it's open source. The source code is available at
> http://methlabs.org/projects/peerguardian-linuxosx/

ok, thanks.  Sorry for the confusion, but the 'official' website is just
a blog that didn't really reveal all that much information.

--
- Harald Welte <[EMAIL PROTECTED]>                 http://netfilter.org/

  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie
Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines
Hi Harald,

On Thu, Aug 25, 2005 at 06:55:50PM +0200, Harald Welte told us:
> Is it true that PeerGuardian is a proprietary application?  I'm not
> going to debug this problem using a proprietary ip_queue program, sorry.

sorry to jump in here, but I took a quick look at PeerGuardian,
according to
http://methlabs.org/wiki/license_information
it's open source. The source code is available at
http://methlabs.org/projects/peerguardian-linuxosx/

HTH

Sven

--
Linux zion.homelinux.com 2.6.13-rc6-mm2 #3 Thu Aug 25 14:53:55 CEST 2005 i686 athlon i386 GNU/Linux
 22:56:18 up  7:40,  1 user,  load average: 0.46, 0.14, 0.04
Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines
On 8/25/05, Harald Welte <[EMAIL PROTECTED]> wrote:
> On Thu, Aug 25, 2005 at 03:39:02PM +0200, Alessandro Suardi wrote:
> > Howdy, and excuse me for crossposting - feel free to zap CC to
> >  unrelated, if any, mailing lists.
> >
> > just gave PeerGuardian a spin on my eDonkey home box and
> >  said box didn't last half a day before oopsing in netlink/nf/tcp
> >  related routines (or so it seems to my untrained eye).
>
> Yes, it indeed could be that there is some fishy interaction between the
> tcp stack and ip_queue causing the oops.
>
> > K7800, 256MB RAM, uptodate FC3 running 2.6.13-rc6-git12,
> >  doing nothing but running MetaMachine's eDonkey 1.4.3 QT gui.
> >  PeerGuardian is the 1.5 beta version available from methlabs.org.
>
> Is it true that PeerGuardian is a proprietary application?  I'm not
> going to debug this problem using a proprietary ip_queue program, sorry.

I'm not sure I understand the issue; I built PG from these sources:

 http://prdownloads.sourceforge.net/peerguardian/pglinux-1.5beta.tar.gz?download

 and I had to install the iptables-devel FC3 rpm to build. The PG
 sources seem to be licensed under GPLv2. But maybe you're
 referring to the fact that whatever PG does, it doesn't show up
 as output from 'iptables -L' ?

> If you can produce a testcase with open source userspace ip_queue code,
> I could look into reproducing the problem locally and debugging the
> problem more thoroughly.

So far the box has been running for over four hours, I'll configure
 my laptop as a netdump server hoping it might capture something
 if the ed2k box crashes again later. I'm afraid I won't be able to
 set up a real testcase (and btw, edonkey v1.4.3 from MetaMachine
 is actually a proprietary program, though entirely in userspace).

> While it definitely is a kernel bug (whatever userspace sends should not
> crash the kernel), it might be something that specifically [only]
> PeerGuardian does to the packet.  Something that ip_queue doesn't check
> (but should check) on packet reinjection and therefore upsets the TCP stack.
>
> Also helpful would be the output of an "strace -f -x -s65535 -e
> trace=sendmsg" on the PeerGuardian (daemon?) process.
>
> >  [<c0103714>] die+0xe4/0x170
> >  [<c010381f>] do_trap+0x7f/0xc0
> >  [<c0103b33>] do_invalid_op+0xa3/0xb0
> >  [<c0102faf>] error_code+0x4f/0x54
> >  [<c02eb05b>] kfree_skbmem+0xb/0x20
> >  [<c02eb0cf>] __kfree_skb+0x5f/0xf0
>
> ok, so something down the chain from kfree_skb() results in an invalid
> operation?  looks more like some compiler problem, bad memory or memory
> corruption to me.  Try to reproduce the problem without PG.

compiler is fc3's latest - gcc-3.4.4-2.fc3. I might have a go at
 memtest86 in the next weeks if more symptoms point at possible
 bad RAM.

> >  [<c031304a>] tcp_clean_rtx_queue+0x16a/0x470
> >  [<c0313746>] tcp_ack+0xf6/0x360
> >  [<c0315d57>] tcp_rcv_established+0x277/0x7a0
> >  [<c031eba0>] tcp_v4_do_rcv+0xf0/0x110
> >  [<c031f2a0>] tcp_v4_rcv+0x6e0/0x820
> >  [<c0305594>] ip_local_deliver_finish+0x84/0x160
>
> so something in the tcp stack ends up doing tcp_clean_rtx_queue()
>
> >  [<c02fbe4a>] nf_reinject+0x13a/0x1c0
> >  [<c033f0d8>] ipq_issue_verdict+0x28/0x40
> >  [<c033f968>] ipq_set_verdict+0x48/0x70
>
> ip_queue reinjects a packet via nf_reinject()
>
> >  [<c033fa79>] ipq_receive_peer+0x39/0x50
> >  [<c033fc72>] ipq_receive_sk+0x172/0x190
>
> ip_queue receives an ipq verdict msg packet from netlink
>
> >  [<c02fffa5>] netlink_data_ready+0x35/0x60
> >  [<c02ff4a4>] netlink_sendskb+0x24/0x60
> >  [<c02ff657>] netlink_unicast+0x127/0x160
> >  [<c02ffcc4>] netlink_sendmsg+0x204/0x2b0
> >  [<c02e6dc0>] sock_sendmsg+0xb0/0xe0
> >  [<c02e83f4>] sys_sendmsg+0x134/0x240
> >  [<c02e88e4>] sys_socketcall+0x224/0x230
> >  [<c0102d3b>] sysenter_past_esp+0x54/0x75
>
> process sendmsg()s on the netlink socket.

Thanks,

--alessandro

 "Not every smile means I'm laughing inside"

    (Wallflowers - "From The Bottom Of My Heart")
Re: oops in 2.6.13-rc6-git12 in tcp/netfilter routines
On Thu, Aug 25, 2005 at 03:39:02PM +0200, Alessandro Suardi wrote:
> Howdy, and excuse me for crossposting - feel free to zap CC to
>  unrelated, if any, mailing lists.
>
> just gave PeerGuardian a spin on my eDonkey home box and
>  said box didn't last half a day before oopsing in netlink/nf/tcp
>  related routines (or so it seems to my untrained eye).

Yes, it indeed could be that there is some fishy interaction between the
tcp stack and ip_queue causing the oops.

> K7800, 256MB RAM, uptodate FC3 running 2.6.13-rc6-git12,
>  doing nothing but running MetaMachine's eDonkey 1.4.3 QT gui.
>  PeerGuardian is the 1.5 beta version available from methlabs.org.

Is it true that PeerGuardian is a proprietary application?  I'm not
going to debug this problem using a proprietary ip_queue program, sorry.

If you can produce a testcase with open source userspace ip_queue code,
I could look into reproducing the problem locally and debugging the
problem more thoroughly.

While it definitely is a kernel bug (whatever userspace sends should not
crash the kernel), it might be something that specifically [only]
PeerGuardian does to the packet.  Something that ip_queue doesn't check
(but should check) on packet reinjection and therefore upsets the TCP stack.

Also helpful would be the output of an "strace -f -x -s65535 -e
trace=sendmsg" on the PeerGuardian (daemon?) process.

>  [<c0103714>] die+0xe4/0x170
>  [<c010381f>] do_trap+0x7f/0xc0
>  [<c0103b33>] do_invalid_op+0xa3/0xb0
>  [<c0102faf>] error_code+0x4f/0x54
>  [<c02eb05b>] kfree_skbmem+0xb/0x20
>  [<c02eb0cf>] __kfree_skb+0x5f/0xf0

ok, so something down the chain from kfree_skb() results in an invalid
operation?  looks more like some compiler problem, bad memory or memory
corruption to me.  Try to reproduce the problem without PG.

>  [<c031304a>] tcp_clean_rtx_queue+0x16a/0x470
>  [<c0313746>] tcp_ack+0xf6/0x360
>  [<c0315d57>] tcp_rcv_established+0x277/0x7a0
>  [<c031eba0>] tcp_v4_do_rcv+0xf0/0x110
>  [<c031f2a0>] tcp_v4_rcv+0x6e0/0x820
>  [<c0305594>] ip_local_deliver_finish+0x84/0x160

so something in the tcp stack ends up doing tcp_clean_rtx_queue()

>  [<c02fbe4a>] nf_reinject+0x13a/0x1c0
>  [<c033f0d8>] ipq_issue_verdict+0x28/0x40
>  [<c033f968>] ipq_set_verdict+0x48/0x70

ip_queue reinjects a packet via nf_reinject()

>  [<c033fa79>] ipq_receive_peer+0x39/0x50
>  [<c033fc72>] ipq_receive_sk+0x172/0x190

ip_queue receives an ipq verdict msg packet from netlink

>  [<c02fffa5>] netlink_data_ready+0x35/0x60
>  [<c02ff4a4>] netlink_sendskb+0x24/0x60
>  [<c02ff657>] netlink_unicast+0x127/0x160
>  [<c02ffcc4>] netlink_sendmsg+0x204/0x2b0
>  [<c02e6dc0>] sock_sendmsg+0xb0/0xe0
>  [<c02e83f4>] sys_sendmsg+0x134/0x240
>  [<c02e88e4>] sys_socketcall+0x224/0x230
>  [<c0102d3b>] sysenter_past_esp+0x54/0x75

process sendmsg()s on the netlink socket.

--
- Harald Welte <[EMAIL PROTECTED]>                 http://netfilter.org/

  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."                    -- Paul Vixie
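
Harald's reading of the trace can be sketched as a grouping of the frames by subsystem, innermost frame first. This is an illustration only, assuming Python 3; the prefix table, the layer labels and the helper names `layer_of` and `summarize` are shorthand chosen for this sketch, not kernel terminology.

```python
from itertools import groupby

# Frames in the order they appear in the posted trace (innermost first).
TRACE = [
    "die", "do_trap", "do_invalid_op", "error_code",
    "kfree_skbmem", "__kfree_skb",
    "tcp_clean_rtx_queue", "tcp_ack", "tcp_rcv_established",
    "tcp_v4_do_rcv", "tcp_v4_rcv",
    "ip_local_deliver_finish",
    "nf_reinject",
    "ipq_issue_verdict", "ipq_set_verdict", "ipq_receive_peer", "ipq_receive_sk",
    "netlink_data_ready", "netlink_sendskb", "netlink_unicast", "netlink_sendmsg",
    "sock_sendmsg", "sys_sendmsg", "sys_socketcall", "sysenter_past_esp",
]

# Hypothetical labels for this sketch; the grouping follows the reply above.
TRAP_FRAMES = {"die", "do_trap", "do_invalid_op", "error_code"}
PREFIX_LAYERS = [
    ("kfree_", "skb free"),
    ("__kfree_", "skb free"),
    ("tcp_", "TCP receive path"),
    ("ipq_", "ip_queue verdict handling"),
    ("ip_", "IP local delivery"),
    ("nf_", "netfilter reinjection"),
    ("netlink_", "netlink delivery"),
    ("sock_", "syscall entry"),
    ("sys_", "syscall entry"),
    ("sysenter_", "syscall entry"),
]


def layer_of(symbol):
    """Map one symbol name to a coarse layer label."""
    if symbol in TRAP_FRAMES:
        return "trap handling"
    for prefix, layer in PREFIX_LAYERS:
        if symbol.startswith(prefix):
            return layer
    return "other"


def summarize(symbols):
    """Collapse consecutive frames of the same layer into (layer, count)."""
    return [(layer, len(list(group)))
            for layer, group in groupby(symbols, key=layer_of)]


if __name__ == "__main__":
    for layer, count in summarize(TRACE):
        print(f"{count:2d} frame(s): {layer}")
```

Read from the last line of the output upwards, this gives the same story as the reply: a sendmsg() on the netlink socket, the verdict handled by ip_queue, reinjection through nf_reinject(), the TCP receive path, and finally the skb free that traps.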
oops in 2.6.13-rc6-git12 in tcp/netfilter routines
Howdy, and excuse me for crossposting - feel free to zap CC to
 unrelated, if any, mailing lists.

just gave PeerGuardian a spin on my eDonkey home box and
 said box didn't last half a day before oopsing in netlink/nf/tcp
 related routines (or so it seems to my untrained eye).

K7800, 256MB RAM, uptodate FC3 running 2.6.13-rc6-git12,
 doing nothing but running MetaMachine's eDonkey 1.4.3 QT gui.
 PeerGuardian is the 1.5 beta version available from methlabs.org.

Stack is hand-copied from the dead box's console.

 [<c0103714>] die+0xe4/0x170
 [<c010381f>] do_trap+0x7f/0xc0
 [<c0103b33>] do_invalid_op+0xa3/0xb0
 [<c0102faf>] error_code+0x4f/0x54
 [<c02eb05b>] kfree_skbmem+0xb/0x20
 [<c02eb0cf>] __kfree_skb+0x5f/0xf0
 [<c031304a>] tcp_clean_rtx_queue+0x16a/0x470
 [<c0313746>] tcp_ack+0xf6/0x360
 [<c0315d57>] tcp_rcv_established+0x277/0x7a0
 [<c031eba0>] tcp_v4_do_rcv+0xf0/0x110
 [<c031f2a0>] tcp_v4_rcv+0x6e0/0x820
 [<c0305594>] ip_local_deliver_finish+0x84/0x160
 [<c02fbe4a>] nf_reinject+0x13a/0x1c0
 [<c033f0d8>] ipq_issue_verdict+0x28/0x40
 [<c033f968>] ipq_set_verdict+0x48/0x70
 [<c033fa79>] ipq_receive_peer+0x39/0x50
 [<c033fc72>] ipq_receive_sk+0x172/0x190
 [<c02fffa5>] netlink_data_ready+0x35/0x60
 [<c02ff4a4>] netlink_sendskb+0x24/0x60
 [<c02ff657>] netlink_unicast+0x127/0x160
 [<c02ffcc4>] netlink_sendmsg+0x204/0x2b0
 [<c02e6dc0>] sock_sendmsg+0xb0/0xe0
 [<c02e83f4>] sys_sendmsg+0x134/0x240
 [<c02e88e4>] sys_socketcall+0x224/0x230
 [<c0102d3b>] sysenter_past_esp+0x54/0x75
Code: 8b 41 0c 85 c0 75 1b 8b 86 94 00 00 00 e8 9e 37 e5 ff 5b 5e c9
 c3 89 d0 e8 43 46 e5 ff 8d 76 00 eb d2 89 f0 e8 f7 fe ff ff eb dc <0f>
 0b 54 01 16 d2 36 c0 eb b4 8d 74 26 00 8d bc 27 00 00 00 00
 <0>Kernel panic - not syncing: Fatal exception in interrupt

If there's need for further info I'd be happy to provide it. For now
 the box is rebooted into the same kernel and running the same
 PG/eD2k programs, if the issue reproduces I'll follow up on my
 own message.

Thanks in advance, ciao,

--alessandro

 "Not every smile means I'm laughing inside"

    (Wallflowers - "From The Bottom Of My Heart")
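
For readers following the trace, here is a minimal sketch (not part of the thread) of how the hand-copied frames and the Code: line could be picked apart. It assumes Python 3; `parse_frames` and `decode_bug` are hypothetical helper names. The decoding step further assumes the i386 BUG() layout of 2.6-era kernels built with CONFIG_DEBUG_BUGVERBOSE, where the `ud2` opcode (0f 0b) is followed by a 16-bit line number and a 32-bit pointer to the file name string. If this kernel was built differently, the decoded numbers mean nothing.

```python
import re

# Matches "symbol+0xoffset/0xsize"; any bracketed address before it is ignored.
FRAME_RE = re.compile(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

SAMPLE_FRAMES = """
kfree_skbmem+0xb/0x20
__kfree_skb+0x5f/0xf0
tcp_clean_rtx_queue+0x16a/0x470
"""

# Tail of the Code: line from the report; <0f> marks the faulting byte.
SAMPLE_CODE = "eb dc <0f> 0b 54 01 16 d2 36 c0 eb b4"


def parse_frames(text):
    """Return (symbol, offset, function_size) for every frame found."""
    return [(sym, int(off, 16), int(size, 16))
            for sym, off, size in FRAME_RE.findall(text)]


def decode_bug(code_line):
    """Decode 'ud2; .word line; .long file' after the <0f> marker.

    Returns (line_number, file_name_pointer), or None when the bytes at
    the marker are not a ud2 followed by six more bytes.
    """
    tokens = code_line.split()
    if "<0f>" not in tokens:
        return None
    i = tokens.index("<0f>")
    tail = tokens[i + 1:i + 8]
    if len(tail) < 7 or tail[0] != "0b":
        return None
    # Both fields are little-endian, so reverse the byte order.
    line = int(tail[2] + tail[1], 16)
    file_ptr = int("".join(reversed(tail[3:7])), 16)
    return line, file_ptr


if __name__ == "__main__":
    for sym, off, size in parse_frames(SAMPLE_FRAMES):
        print(f"{sym}: return address {off:#x} bytes into a {size:#x}-byte function")
    print(decode_bug(SAMPLE_CODE))
```

Under that assumption the trapping instruction would be a BUG() at source line 340, which is consistent with the do_invalid_op frame near the top of the trace. The file name cannot be recovered from the console text alone, since only the pointer to it is present.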