[Bug 422016] Re: DNS lookups hang network in 2.9-4ubuntu6

2009-09-30 Thread Tollef Fog Heen
** Changed in: glibc (Ubuntu)
   Status: New => Invalid

-- 
DNS lookups hang network in 2.9-4ubuntu6
https://bugs.launchpad.net/bugs/422016
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 422016] Re: DNS lookups hang network in 2.9-4ubuntu6

2009-09-04 Thread Brandon Mitchell
After looking at every possible cause for this problem, I finally
tracked it down to openvpn starting automatically and connecting to a
local machine.  In the past, I discovered this quickly because openvpn
would keep reconnecting and dropping the network, but it seems they've
now figured out how to keep a VPN connection to the local network up
even when it runs over that same local network.  I've since discovered
how to prevent openvpn from starting automatically.

Please feel free to close this bug since this was entirely user error.



[Bug 422016] Re: DNS lookups hang network in 2.9-4ubuntu6

2009-09-01 Thread Brandon Mitchell
Ok, this is really bugging me.  The newer version of glibc reduces the
problem, but it still exists.  Something seems to be blocking network
activity during DNS lookups, perhaps only when the lookup has problems,
though it could just be that normal lookups finish too quickly for the
effect to be noticeable.  I don't know if this is a libc issue, some
other low-level library, or a kernel issue, but I'm now able to
reproduce it in a controlled way.

Both machines in this test are behind the same router, 192.168.234.254
(wrt).  DNS is provided by the ISP at 68.105.28.11, 68.105.29.11, and
68.105.28.12 (cdns*.cox.net).  One machine is an old Debian system which
I'm using as a control and to access the router.  At no point during
these tests does its ping to 192.168.234.254 drop, using the same
commands.

The Ubuntu system has the following network config:
# ifconfig eth0
eth0  Link encap:Ethernet  HWaddr 00:01:6c:ea:99:0b  
  inet addr:192.168.234.12  Bcast:192.168.234.255  Mask:255.255.255.0
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:535502 errors:0 dropped:0 overruns:0 frame:0
  TX packets:400989 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:100 
  RX bytes:532182373 (532.1 MB)  TX bytes:72156008 (72.1 MB)

If I release the ISP-assigned IP on the router, the Debian system
continues to ping.  However, I'm seeing the following on the Ubuntu system:
# window 1:
$ date; ping -n -c 20 192.168.234.254; date
Tue Sep  1 11:24:05 EDT 2009
PING 192.168.234.254 (192.168.234.254) 56(84) bytes of data.
64 bytes from 192.168.234.254: icmp_seq=1 ttl=63 time=2.68 ms
64 bytes from 192.168.234.254: icmp_seq=2 ttl=63 time=2.38 ms

--- 192.168.234.254 ping statistics ---
20 packets transmitted, 2 received, 90% packet loss, time 19002ms
rtt min/avg/max/mdev = 2.384/2.536/2.688/0.152 ms
Tue Sep  1 11:24:25 EDT 2009

# window 2:
$ date; host www.google.com; date
Tue Sep  1 11:24:04 EDT 2009
;; connection timed out; no servers could be reached
Tue Sep  1 11:24:18 EDT 2009

Essentially, the ping to my internal network was succeeding until a few
seconds into the DNS lookup.  At that point, no further pings were
returned.

I then renewed my IP from my ISP on the router from the Debian system (Ubuntu 
can't even reach it by IP in this state as you can see from the ping), and 
after receiving an address from the ISP, ran the same commands again:
# window 1:
$ date; ping -n -c 20 192.168.234.254; date
Tue Sep  1 11:25:11 EDT 2009
PING 192.168.234.254 (192.168.234.254) 56(84) bytes of data.
64 bytes from 192.168.234.254: icmp_seq=6 ttl=63 time=2.95 ms
64 bytes from 192.168.234.254: icmp_seq=7 ttl=63 time=2.13 ms
64 bytes from 192.168.234.254: icmp_seq=8 ttl=63 time=2.45 ms
64 bytes from 192.168.234.254: icmp_seq=9 ttl=63 time=2.18 ms
64 bytes from 192.168.234.254: icmp_seq=10 ttl=63 time=2.13 ms
64 bytes from 192.168.234.254: icmp_seq=11 ttl=63 time=2.15 ms
64 bytes from 192.168.234.254: icmp_seq=12 ttl=63 time=2.28 ms
64 bytes from 192.168.234.254: icmp_seq=13 ttl=63 time=2.19 ms
64 bytes from 192.168.234.254: icmp_seq=14 ttl=63 time=1.80 ms
64 bytes from 192.168.234.254: icmp_seq=15 ttl=63 time=2.08 ms
64 bytes from 192.168.234.254: icmp_seq=16 ttl=63 time=2.17 ms
64 bytes from 192.168.234.254: icmp_seq=17 ttl=63 time=2.17 ms
64 bytes from 192.168.234.254: icmp_seq=18 ttl=63 time=2.35 ms
64 bytes from 192.168.234.254: icmp_seq=19 ttl=63 time=2.08 ms
64 bytes from 192.168.234.254: icmp_seq=20 ttl=63 time=2.18 ms

--- 192.168.234.254 ping statistics ---
20 packets transmitted, 15 received, 25% packet loss, time 19026ms
rtt min/avg/max/mdev = 1.805/2.224/2.955/0.240 ms
Tue Sep  1 11:25:31 EDT 2009

# window 2:
$ date; host www.google.com; date
Tue Sep  1 11:25:10 EDT 2009
www.google.com is an alias for www.l.google.com.
www.l.google.com has address 209.85.225.99
www.l.google.com has address 209.85.225.103
www.l.google.com has address 209.85.225.104
www.l.google.com has address 209.85.225.147
Tue Sep  1 11:25:17 EDT 2009

As you can see from this, the pings resume shortly before the DNS
resolution completes.  I can't quite make out why or how a DNS lookup
would block a ping to an IP address with name resolution disabled (-n),
but that appears to be the case.  I've attached a "tcpdump -i eth0 -v"
of the session as well to help track down what the network is doing in
each scenario.
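
As a rough cross-check of the timing in the two runs above, the lost
probes can be mapped back to wall-clock times.  This is only a minimal
sketch, not part of the original test: the helper names missing_seqs and
send_time are made up for illustration, and it assumes ping sent one
probe per second (its default interval), starting at the timestamp
printed by date.

```python
import re
from datetime import datetime, timedelta

SEQ_RE = re.compile(r"icmp_seq=(\d+)")

def missing_seqs(ping_output, count):
    """Return the icmp_seq numbers in 1..count that got no reply."""
    seen = {int(m.group(1)) for m in SEQ_RE.finditer(ping_output)}
    return [seq for seq in range(1, count + 1) if seq not in seen]

def send_time(start, seq, interval=1.0):
    """Estimate when probe `seq` was sent (assumes a fixed interval)."""
    return start + timedelta(seconds=(seq - 1) * interval)

# Reply lines shaped like the second run in the report (seq 6..20 answered).
second_run = "\n".join(
    f"64 bytes from 192.168.234.254: icmp_seq={n} ttl=63" for n in range(6, 21)
)
start = datetime(2009, 9, 1, 11, 25, 11)  # timestamp printed by `date`

lost = missing_seqs(second_run, 20)
print(lost)                                    # [1, 2, 3, 4, 5]
print(send_time(start, lost[-1] + 1).time())   # 11:25:16
```

Under those assumptions, the second run lost probes 1 through 5
(11:25:11 to 11:25:15), and the first reply answers the probe sent at
about 11:25:16, one second before host returned at 11:25:17.  In the
first run, replies stop after seq 2, so the first lost probe went out at
about 11:24:07, roughly three seconds after the lookup began at 11:24:04.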

The only logs I see in syslog are from the tcpdump and from a cron job that 
does an imap connection to the Debian system:
Sep  1 11:24:03 bmitch-t42 kernel: [82087.299987] device eth0 entered promiscuous mode
Sep  1 11:25:02 bmitch-t42 /USR/SBIN/CRON[3]: (bmitch) CMD (offlineimap >${HOME}/.offlineimap.log 2>&1)
Sep  1 11:25:39 bmitch-t42 kernel: [82183.164614] device eth0 left promiscuous mode

Let me know how else I may assist in tracking down this bug.  Thank you.

** Attachment added: "TCP Dump"
   http://launchpadlibrarian.net/31148982/tcpdump.20090901.g