Hi,

Recently I've noticed some issues in the pound logs where a connection to the back-end server timed out, e.g.:

Nov 4 03:13:11 balance1 pound: backend 166.70.134.194:80 connect: Connection timed out
Nov 4 03:13:16 balance1 pound: BackEnd 166.70.134.194:80 resurrect

After looking into it a bit, it seems that once in a while (say ~200 times per day out of 2,000,000+ requests) iptables on the back-end server blocks a packet from the pound server. I say this because I can see entries like the following in the logs on the back-end web servers:

Nov 4 03:10:11 wwwut3 [1280662.309643] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65011 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
Nov 4 03:10:14 wwwut3 [1280665.307411] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65012 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
Nov 4 03:10:20 wwwut3 [1280671.307415] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65013 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
Nov 4 03:10:32 wwwut3 [1280683.307406] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65014 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
Nov 4 03:10:56 wwwut3 [1280707.307406] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65015 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
Nov 4 03:11:44 wwwut3 [1280755.307410] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65016 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0

which, to my reading, is blocking traffic that would normally be allowed by the firewall:

iptables --list --numeric
Chain INPUT (policy DROP)
target prot opt source destination
ACCEPT all -- 0.0.0.0/0 0.0.0.0/0 state RELATED,ESTABLISHED
...
Cid43519296.0 tcp -- 0.0.0.0/0 0.0.0.0/0 tcp dpt:80 state NEW
...
RULE_5 all -- 0.0.0.0/0 0.0.0.0/0
...

Chain Cid43519296.0 (1 references)
target prot opt source destination
ACCEPT all -- 166.70.134.196 0.0.0.0/0
ACCEPT all -- 166.70.134.197 0.0.0.0/0
...

Chain RULE_5 (3 references)
target prot opt source destination
LOG all -- 0.0.0.0/0 0.0.0.0/0 LOG flags 0 level 6 prefix `RULE 5 -- DENY '
DROP all -- 0.0.0.0/0 0.0.0.0/0

There's always a sequence like that, too - an initial packet gets blocked, followed by another 3 seconds later, then 6 seconds, etc. I believe the sequence ends where it does because I have a 180 second timeout in pound to the back-end servers.

In an effort to troubleshoot this situation, I switched to a different machine running pound, with no effect (both machines use a 2.6.24 kernel). I was previously using pound 2.3.2 but earlier this week upgraded to 2.4.5, again with no effect. The back-end web servers are running kernel versions 2.6.29 to 2.6.30 and all seem affected more or less equally. None of the servers involved are anywhere near capacity (CPU usage below 10%, memory usage below 33%).

I would <love> to know what I could do to identify the reason behind these errors, and even better, eliminate them. Any help or investigative steps to take would be <much> appreciated.

-David Clark
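
As a quick check on the timing described above, here is a minimal Python sketch that computes the gaps between the six dropped SYNs and compares them with the 180 second pound timeout. The timestamps are copied from the log excerpts in this message; the variable and function names are made up for illustration, and the "next retry" figure assumes the doubling pattern would simply have continued, which the logs do not show.

```python
from datetime import datetime

# Timestamps of the six dropped SYNs (all SPT=33942), copied from the wwwut3 log.
DROPPED_SYNS = ["03:10:11", "03:10:14", "03:10:20",
                "03:10:32", "03:10:56", "03:11:44"]
# Time at which pound logged "connect: Connection timed out" on balance1.
POUND_TIMEOUT_LOGGED = "03:13:11"


def seconds_between(earlier, later):
    """Whole seconds between two HH:MM:SS stamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return int(delta.total_seconds())


def gaps(stamps):
    """Gap in seconds between each consecutive pair of timestamps."""
    return [seconds_between(a, b) for a, b in zip(stamps, stamps[1:])]


syn_gaps = gaps(DROPPED_SYNS)
elapsed_at_timeout = seconds_between(DROPPED_SYNS[0], POUND_TIMEOUT_LOGGED)

# Assumption: if the doubling continued, the next retry would be due this
# many seconds after the first SYN.
next_retry_offset = (seconds_between(DROPPED_SYNS[0], DROPPED_SYNS[-1])
                     + 2 * syn_gaps[-1])

print("gaps between dropped SYNs:", syn_gaps)
print("seconds from first SYN to pound timeout:", elapsed_at_timeout)
print("next retry would be due at +%d s" % next_retry_offset)
```

The gaps double each time (3, 6, 12, 24, 48 seconds) and the source port never changes, which looks like one connection attempt retransmitting the same SYN rather than several separate attempts. The pound timeout is logged exactly 180 seconds after the first dropped SYN, before the next retry would have been due at +189 seconds under the doubling assumption.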
