Re: 2.6.23-rc[68]-mm: network hangs
From: Laurent Riffard <[EMAIL PROTECTED]>
Date: Sat, 29 Sep 2007 11:22:06 +0200

> Could a router problem prevent "ping 127.0.0.1" from working ?

Two things that are new and could cause these problems:

1) We dynamically allocate the loopback device now.
2) We have the network namespace stuff.

Another change in this area is that we do routing cache garbage
collection from a workqueue instead of a timer but that was a pretty
straightforward transformation so it is not high on my list of
"suspects".

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 2.6.23-rc[68]-mm: network hangs
On 29.09.2007 10:31, Andrew Morton wrote:
> On Fri, 28 Sep 2007 21:48:18 +0200 Laurent Riffard <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> From time to time, I experience some complete network hangs:
>>
>> Suddenly, all network connections become unresponsive. Even "ping
>> 127.0.0.1" doesn't work. SysRq-w does not show any blocked processes.
>>
>> When such a hang happens, I have to reboot (shutdown does work).
>>
>> This is not easily reproducible: it happens several minutes after
>> boot (could be 45 minutes or 2 hours). I do not use heavy networking
>> apps (like P2P). My typical usage is a Gnome desktop with browser,
>> mailer, IM, video or audio streaming.
>>
>> I have a single PC connected to a DSL router via ethernet (so no LAN
>> with NFS or CIFS).
>>
>> This happens with 2.6.23-rc8-mm2 and 2.6.23-rc6-mm1. I can't remember
>> when I first saw this problem. Maybe 2 months ago.
>>
>> I attached the output of "strace ping 127.0.0.1". How can I collect
>> some more data when this problem happens ?
>>
>
> I have a second report of this from Uwe Bugla (who has been banished from
> all vger lists for various naughtinesses).
>
> Similar story - after a few hours his network router (which is using ppp in
> some fashion) craps out and dhcp queries all time out.
>
> I'd be suspecting a ppp bug in net-2.6.24.

mmm... I don't use PPP at all, nor DHCP. My DSL box acts as a NAT router.
Its local address is 192.168.0.254 and my PC is at 192.168.0.9 (fixed IP).

        /--- telephone line
        |
   +----------+
   | ADSL box |
   +----------+
        | 192.168.0.254
        |
        | 192.168.0.9
     +----+
     | PC |
     +----+

At first I suspected a bug in the DSL box, so I reset it, but that did
not help. I have to reboot the PC (no need to reset the DSL box).

Could a router problem prevent "ping 127.0.0.1" from working ?

[this is becoming somewhat off-topic for LKML, sorry]

Here's my routing table:

linux-2.6-mm$ LANG=C netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
192.168.0.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth0
0.0.0.0         192.168.0.254   0.0.0.0         UG        0 0          0 eth0

It seems to lack a route for 127.0.0.1 through lo, doesn't it? In that
case, if eth0 hangs, 127.0.0.1 hangs too, no? So it could be an Ethernet
driver bug. I'm using ne2k-pci.

--
laurent
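A note on reading that table: `netstat -rn` prints only the main routing table. On Linux, routes for 127.0.0.0/8 are normally kept in the kernel's separate "local" table, which `ip route show table local` lists, so a missing lo entry in the main table is expected and is probably not evidence of a driver problem on its own. Below is a minimal Python sketch for checking that table while the hang is in progress. The helper name `loopback_routes` is hypothetical, and the line format it expects is an assumption based on typical iproute2 output.

```python
import subprocess


def loopback_routes(ip_route_output):
    """Return the route lines whose output device is 'lo'.

    Expects text in the style printed by `ip route show table local`,
    where each route line contains the tokens 'dev <ifname>'.
    """
    routes = []
    for line in ip_route_output.splitlines():
        tokens = line.split()
        if "dev" not in tokens:
            continue
        idx = tokens.index("dev")
        # Guard against a truncated line that ends with 'dev'.
        if idx + 1 < len(tokens) and tokens[idx + 1] == "lo":
            routes.append(line.strip())
    return routes


if __name__ == "__main__":
    # Assumes the iproute2 'ip' binary is installed and on PATH.
    result = subprocess.run(
        ["ip", "route", "show", "table", "local"],
        capture_output=True, text=True, check=True,
    )
    found = loopback_routes(result.stdout)
    if found:
        print("loopback routes present in the local table:")
        for route in found:
            print("  " + route)
    else:
        print("no route via lo found in the local table")
```

If the loopback routes are still listed during a hang, the routing table itself is unlikely to be the cause and attention can move to the loopback device or the packet path.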
Re: 2.6.23-rc[68]-mm: network hangs
On Fri, 28 Sep 2007 21:48:18 +0200 Laurent Riffard <[EMAIL PROTECTED]> wrote:

> Hi,
>
> From time to time, I experience some complete network hangs:
>
> Suddenly, all network connections become unresponsive. Even "ping
> 127.0.0.1" doesn't work. SysRq-w does not show any blocked processes.
>
> When such a hang happens, I have to reboot (shutdown does work).
>
> This is not easily reproducible: it happens several minutes after
> boot (could be 45 minutes or 2 hours). I do not use heavy networking
> apps (like P2P). My typical usage is a Gnome desktop with browser,
> mailer, IM, video or audio streaming.
>
> I have a single PC connected to a DSL router via ethernet (so no LAN
> with NFS or CIFS).
>
> This happens with 2.6.23-rc8-mm2 and 2.6.23-rc6-mm1. I can't remember
> when I first saw this problem. Maybe 2 months ago.
>
> I attached the output of "strace ping 127.0.0.1". How can I collect
> some more data when this problem happens ?
>

I have a second report of this from Uwe Bugla (who has been banished from
all vger lists for various naughtinesses).

Similar story - after a few hours his network router (which is using ppp in
some fashion) craps out and dhcp queries all time out.

I'd be suspecting a ppp bug in net-2.6.24.
2.6.23-rc[68]-mm: network hangs
Hi,

From time to time, I experience some complete network hangs:

Suddenly, all network connections become unresponsive. Even "ping
127.0.0.1" doesn't work. SysRq-w does not show any blocked processes.

When such a hang happens, I have to reboot (shutdown does work).

This is not easily reproducible: it happens several minutes after
boot (could be 45 minutes or 2 hours). I do not use heavy networking
apps (like P2P). My typical usage is a Gnome desktop with browser,
mailer, IM, video or audio streaming.

I have a single PC connected to a DSL router via ethernet (so no LAN
with NFS or CIFS).

This happens with 2.6.23-rc8-mm2 and 2.6.23-rc6-mm1. I can't remember
when I first saw this problem. Maybe 2 months ago.

I attached the output of "strace ping 127.0.0.1". How can I collect
some more data when this problem happens ?

~~
laurent

execve("/bin/ping", ["ping", "127.0.0.1"], [/* 42 vars */]) = 0
brk(0) = 0x93ac000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f9b000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=57464, ...}) = 0
mmap2(NULL, 57464, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f8c000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/cmov/libresolv.so.2", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0!\0\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=67408, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f8b000
mmap2(NULL, 75976, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7f78000
mmap2(0xb7f87000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xf) = 0xb7f87000
mmap2(0xb7f89000, 6344, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7f89000
close(3) = 0
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("/lib/tls/i686/cmov/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0`\1\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=1307104, ...}) = 0
mmap2(NULL, 1312164, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7e37000
mmap2(0xb7f72000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13b) = 0xb7f72000
mmap2(0xb7f75000, 9636, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7f75000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7e36000
set_thread_area({entry_number:-1 -> 6, base_addr:0xb7e366c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
mprotect(0xb7f72000, 4096, PROT_READ) = 0
munmap(0xb7f8c000, 57464) = 0
socket(PF_INET, SOCK_RAW, IPPROTO_ICMP) = 3
getuid32() = 0
setuid32(0) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 4
connect(4, {sa_family=AF_INET, sin_port=htons(1025), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
getsockname(4, {sa_family=AF_INET, sin_port=htons(44103), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
close(4) = 0
setsockopt(3, SOL_RAW, ICMP_FILTER, ~(ICMP_ECHOREPLY|ICMP_DEST_UNREACH|ICMP_SOURCE_QUENCH|ICMP_REDIRECT|ICMP_TIME_EXCEEDED|ICMP_PARAMETERPROB), 4) = 0
setsockopt(3, SOL_IP, IP_RECVERR, [1], 4) = 0
setsockopt(3, SOL_SOCKET, SO_SNDBUF, [324], 4) = 0
setsockopt(3, SOL_SOCKET, SO_RCVBUF, [65536], 4) = 0
getsockopt(3, SOL_SOCKET, SO_RCVBUF, [131072], [4]) = 0
brk(0) = 0x93ac000
brk(0x93cd000) = 0x93cd000
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f9a000
write(1, "PING 127.0.0.1 (127.0.0.1) 56(84"..., 49) = 49
setsockopt(3, SOL_SOCKET, SO_TIMESTAMP, [1], 4) = 0
setsockopt(3, SOL_SOCKET, SO_SNDTIMEO, "\1\0\0\0\0\0\0\0", 8) = 0
setsockopt(3, SOL_SOCKET, SO_RCVTIMEO, "\1\0\0\0\0\0\0\0", 8) = 0
getpid() = 9353
rt_sigaction(SIGINT, {0x804b5b0, [], SA_INTERRUPT}, NULL, 8) = 0
rt_sigaction(SIGALRM, {0x804b5b0, [], SA_INTERRUPT}, NULL, 8) = 0
rt_sigaction(SIGQUIT, {0x804b5c0, [], SA_INTERRUPT}, NULL, 8) = 0
gettimeofday({1190968694, 793502}, NULL) = 0
ioctl(1, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=52, ws_col=157, ws_xpixel=0, ws_ypixel=0}) = 0
gettimeofday({1190968694, 793762}, NULL) = 0
gettimeofday({1190968694, 793824}, NULL) = 0
sendmsg(3, {msg_name(16)={sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")},
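On the question of how to collect more data during a hang, one possible approach (a sketch, not an established procedure from this thread) is to snapshot /proc/net/dev twice while `ping 127.0.0.1` is running and check whether the packet counters for lo and eth0 advance. The helper names `parse_net_dev` and `stalled_interfaces` are hypothetical, and the code assumes the usual /proc/net/dev layout of two header lines followed by 16 counters per interface.

```python
import time


def parse_net_dev(text):
    """Parse /proc/net/dev-style text into {iface: {counter: int}}.

    Assumed layout: two header lines, then 'iface: <8 RX fields> <8 TX fields>'.
    """
    counters = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if len(fields) < 16:
            continue
        values = [int(f) for f in fields[:16]]
        counters[name.strip()] = {
            "rx_bytes": values[0],
            "rx_packets": values[1],
            "rx_errs": values[2],
            "rx_drop": values[3],
            "tx_bytes": values[8],
            "tx_packets": values[9],
            "tx_errs": values[10],
            "tx_drop": values[11],
        }
    return counters


def stalled_interfaces(before, after):
    """Return interfaces whose RX and TX packet counts did not change."""
    return sorted(
        name
        for name in before
        if name in after
        and after[name]["rx_packets"] == before[name]["rx_packets"]
        and after[name]["tx_packets"] == before[name]["tx_packets"]
    )


if __name__ == "__main__":
    # Take two snapshots five seconds apart (the interval is arbitrary).
    with open("/proc/net/dev") as f:
        first = parse_net_dev(f.read())
    time.sleep(5)
    with open("/proc/net/dev") as f:
        second = parse_net_dev(f.read())
    for name in stalled_interfaces(first, second):
        print("no packets counted on %s between snapshots" % name)
    for name, c in sorted(second.items()):
        print("%s: rx_errs=%d rx_drop=%d tx_errs=%d tx_drop=%d"
              % (name, c["rx_errs"], c["rx_drop"], c["tx_errs"], c["tx_drop"]))
```

If the lo transmit counter keeps rising while its receive counter stays flat, that would suggest packets are being queued but never delivered, which is worth including in a follow-up report along with the error and drop counts.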