The interesting part about that PCAP is that the query for microshaft.org isn't answered until 9 seconds later. What was the server doing for those 9 seconds?
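
One way to narrow this down from the client side is to time the TCP connect and the wait for the answer separately. Below is a rough sketch using only the Python standard library. The function names, the query ID, and the 15 second timeout are my own choices for illustration, and it assumes the server accepts a plain A query over TCP on port 53.

```python
import socket
import struct
import time


def build_tcp_query(name, qtype=1, query_id=0x1234):
    """Return a DNS query for `name`, framed for TCP (2-byte length prefix)."""
    # Header: id, flags (RD bit set), 1 question, 0 answer/authority/additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed; the name ends with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    message = header + question
    return struct.pack("!H", len(message)) + message


def recv_exact(sock, count):
    """Read exactly `count` bytes, or raise if the peer closes early."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before full response")
        data += chunk
    return data


def time_tcp_query(server, name, port=53, timeout=15.0):
    """Return (connect_seconds, answer_seconds) for one query over TCP."""
    start = time.monotonic()
    with socket.create_connection((server, port), timeout=timeout) as sock:
        connected = time.monotonic()
        sock.sendall(build_tcp_query(name))
        # The response carries the same 2-byte length prefix as the query.
        (length,) = struct.unpack("!H", recv_exact(sock, 2))
        recv_exact(sock, length)
        answered = time.monotonic()
    return connected - start, answered - connected
```

Calling `time_tcp_query("192.0.2.53", "example.org")` (placeholder address and name) returns the two durations. A long first value points at connection setup, while a long second value means the server accepted the connection and then sat on the query.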
--Augie

On Tue, Dec 13, 2011 at 3:55 PM, Grant Keller <[email protected]> wrote:
> Thanks for the help so far on this.
>
> So this is some information gathered from one of our production servers
> that we have been having problems with. These are the same problems I have
> been working to replicate on a test server.
>
> The pdns.conf file:
> cache-ttl=300
> disable-axfr=yes
> launch=gmysql
> gmysql-socket=/var/lib/mysql/mysql.sock
> gmysql-user=root
> gmysql-dbname=pdns
> local-port=53
> logging-facility=0
> loglevel=4
> max-tcp-connections=1000
> negquery-cache-ttl=600
> out-of-zone-additional-processing=yes
> query-cache-ttl=300
> recursive-cache-ttl=300
> recursor=127.0.0.1:5300
> send-root-referral=no
> setgid=pdns
> setuid=pdns
> webserver=yes
> webserver-address=64.142.56.28
> webserver-port=8081
> version-string=powerdns
>
> This is the netstat -tn output from that server around the time of the
> failure:
>
> Active Internet connections (w/o servers)
> Proto Recv-Q Send-Q Local Address Foreign Address State
> tcp 0 0 127.0.0.1:45810 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45811 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45809 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45814 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45815 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45812 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45813 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45818 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45819 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45816 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45817 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45822 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45823 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45820 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45821 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45843 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45841 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45844 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45827 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45826 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45825 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45824 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45831 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45830 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45829 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45828 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45835 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45834 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45833 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45832 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45839 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45838 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45837 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 127.0.0.1:45836 127.0.0.1:5300 TIME_WAIT
> tcp 0 0 64.142.56.28:34543 64.142.105.21:3306 ESTABLISHED
> tcp 0 0 64.142.56.28:43329 206.188.198.28:53 TIME_WAIT
> tcp 0 0 64.142.56.28:60970 64.142.56.26:179 ESTABLISHED
> tcp 0 0 64.142.56.28:33423 149.20.69.87:49222 TIME_WAIT
> tcp 0 0 208.201.224.11:48469 208.201.224.11:53 TIME_WAIT
> tcp 0 0 64.142.56.28:39078 64.142.56.27:179 ESTABLISHED
> tcp 0 0 64.142.56.28:46383 64.34.211.132:53 TIME_WAIT
> tcp 0 0 ::ffff:64.142.56.28:22 ::ffff:64.142.18.25:35662 ESTABLISHED
> tcp 0 0 ::ffff:64.142.56.28:22 ::ffff:64.142.18.25:36017 ESTABLISHED
>
> This server is mixed auth/recursive. The queries we test for are
> authoritative. The link below is a tcpdump that captured the problem; it is
> just the single TCP stream. If more of that pcap is needed, let me know.
>
> ftp://ftp.sonic.net/pub/users/gkeller/dns/c.ns.sr.pcap
>
>>
>> You really should check the existing TCP connections when the timeouts
>> start. Probably you should also check for TCP connections waiting to be
>> torn down (TIME_WAIT). There also might be issues if connection tracking
>> is enabled and netfilter runs out of memory. Is there some firewall/NAT
>> between the client and the server?
>>
>> What does tcp_timeout really mean? Is the timeout triggered while waiting
>> for the DNS response, or even earlier, during establishment of the TCP
>> connection?
>>
>> regards
>> Klaus
>>
>
> --
> Grant Keller
>
> _______________________________________________
> Pdns-users mailing list
> [email protected]
> http://mailman.powerdns.com/mailman/listinfo/pdns-users

--
Augie Schwer - [email protected] - http://schwer.us
