Hi,

Recently I've noticed some issues in the pound logs where a connection
to the back-end server timed out, e.g.:

Nov  4 03:13:11 balance1 pound: backend 166.70.134.194:80 connect: Connection timed out
Nov  4 03:13:16 balance1 pound: BackEnd 166.70.134.194:80 resurrect

After looking into it a bit, it seems that once in a while (say ~200
times per day out of 2,000,000+ requests) iptables on the back-end
server blocks a packet from the pound server. I say this because I can
see entries like the following in the logs on the back-end web servers:

Nov  4 03:10:11 wwwut3 [1280662.309643] RULE 5 -- DENY IN=eth0 OUT=
SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64
ID=65011 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0

Nov  4 03:10:14 wwwut3 [1280665.307411] RULE 5 -- DENY IN=eth0 OUT=
SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64
ID=65012 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0

Nov  4 03:10:20 wwwut3 [1280671.307415] RULE 5 -- DENY IN=eth0 OUT=
SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64
ID=65013 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0

Nov  4 03:10:32 wwwut3 [1280683.307406] RULE 5 -- DENY IN=eth0 OUT=
SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64
ID=65014 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0

Nov  4 03:10:56 wwwut3 [1280707.307406] RULE 5 -- DENY IN=eth0 OUT=
SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64
ID=65015 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0

Nov  4 03:11:44 wwwut3 [1280755.307410] RULE 5 -- DENY IN=eth0 OUT=
SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64
ID=65016 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0

which, as I read it, means the firewall is dropping traffic that its
own rules should allow:

iptables --list --numeric
Chain INPUT (policy DROP)
target         prot opt source               destination
ACCEPT         all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
...
Cid43519296.0  tcp  --  0.0.0.0/0            0.0.0.0/0           tcp dpt:80 state NEW
...
RULE_5         all  --  0.0.0.0/0            0.0.0.0/0
...
Chain Cid43519296.0 (1 references)
target     prot opt source               destination
ACCEPT     all  --  166.70.134.196       0.0.0.0/0
ACCEPT     all  --  166.70.134.197       0.0.0.0/0
...
Chain RULE_5 (3 references)
target     prot opt source               destination
LOG        all  --  0.0.0.0/0            0.0.0.0/0           LOG flags 0 level 6 prefix `RULE 5 -- DENY '
DROP       all  --  0.0.0.0/0            0.0.0.0/0
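
One way to check that the DENY entries quoted earlier all belong to a
single connection attempt is to group them by source, source port,
destination, destination port and TCP flags, then look at the spacing
between them.  The following is only a sketch in Python: the names
ENTRY and retry_gaps are made up for illustration, the sample text is
the log excerpt from above with the wrapped lines rejoined, and the
regular expression assumes the exact "RULE 5 -- DENY" format shown
there.

```python
import re
from collections import defaultdict

# Sample entries copied from the back-end log excerpt, wrapped lines rejoined.
LOG = """\
Nov  4 03:10:11 wwwut3 [1280662.309643] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65011 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
Nov  4 03:10:14 wwwut3 [1280665.307411] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65012 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
Nov  4 03:10:20 wwwut3 [1280671.307415] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65013 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
Nov  4 03:10:32 wwwut3 [1280683.307406] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65014 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
Nov  4 03:10:56 wwwut3 [1280707.307406] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65015 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
Nov  4 03:11:44 wwwut3 [1280755.307410] RULE 5 -- DENY IN=eth0 OUT= SRC=166.70.134.196 DST=166.70.134.194 LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=65016 DF PROTO=TCP SPT=33942 DPT=80 WINDOW=5840 RES=0x00 SYN URGP=0
"""

# Hypothetical pattern: matches only the "RULE 5 -- DENY" format shown above.
ENTRY = re.compile(
    r"\[(?P<uptime>\d+\.\d+)\] RULE 5 -- DENY .*?"
    r"SRC=(?P<src>\S+) DST=(?P<dst>\S+) .*?"
    r"SPT=(?P<spt>\d+) DPT=(?P<dpt>\d+) .*?RES=\S+ (?P<flags>[A-Z ]+?) URGP="
)


def retry_gaps(log_text):
    """Map (src, sport, dst, dport, flags) to the gaps, in whole seconds,
    between consecutive dropped packets, using the kernel uptime stamp."""
    attempts = defaultdict(list)
    for line in log_text.splitlines():
        m = ENTRY.search(line)
        if m:
            key = (m["src"], int(m["spt"]), m["dst"], int(m["dpt"]), m["flags"])
            attempts[key].append(float(m["uptime"]))
    return {
        key: [round(later - earlier) for earlier, later in zip(times, times[1:])]
        for key, times in attempts.items()
    }


if __name__ == "__main__":
    for key, gaps in retry_gaps(LOG).items():
        print(key, gaps)
```

For the sample above this yields a single key, 166.70.134.196 port
33942 to 166.70.134.194 port 80 with only SYN set, and gaps of 3, 6,
12, 24 and 48 seconds.  Run over a full day of logs, the same grouping
would show whether every dropped packet is a retransmitted SYN of this
kind or whether other packet types are being denied as well.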

There's always a sequence like that, too: an initial packet gets
blocked, followed by a retry 3 seconds later, another 6 seconds after
that, and so on, with the interval doubling each time (3, 6, 12, 24 and
48 seconds in the example above).  I believe the sequence ends where it
does because I have a 180-second timeout in pound to the back-end
servers.
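
The timing can be worked through with a little arithmetic.  The sketch
below (Python, with a made-up function name syn_schedule) assumes an
initial retransmission timeout of 3 seconds, as observed in the log,
and 5 SYN retransmissions, which is an assumption about the
net.ipv4.tcp_syn_retries setting on the pound machine and has not been
checked there.  The 180-second figure is the pound timeout mentioned
above.

```python
def syn_schedule(initial_rto=3, retries=5):
    """Return (offsets, give_up): the seconds after the first SYN at which
    each retransmission is sent, and the time at which the sender would
    abandon the connect if no reply ever arrives.

    The timeout doubles after every retransmission (exponential backoff).
    initial_rto and retries are assumptions, not values read from the hosts.
    """
    offsets = []
    elapsed = 0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto          # wait one timeout, then retransmit
        offsets.append(elapsed)
        rto *= 2                # back off: double the next wait
    give_up = elapsed + rto     # one final wait after the last retransmission
    return offsets, give_up


if __name__ == "__main__":
    POUND_TIMEOUT = 180  # seconds, the pound back-end timeout from the post
    offsets, give_up = syn_schedule()
    print("retransmissions at:", offsets)
    print("kernel would give up at:", give_up)
    print("pound timeout fires first:", POUND_TIMEOUT < give_up)
```

Under those assumptions the retransmissions fall 3, 9, 21, 45 and 93
seconds after the first SYN, which matches the six log entries between
03:10:11 and 03:11:44, and the kernel would give up at 189 seconds.
The pound log entry at 03:13:11 is exactly 180 seconds after the first
blocked SYN, so the pound timeout appears to fire just before the
kernel would have abandoned the attempt on its own.  If the retry count
really is 5, the entry at 93 seconds would be the last one logged
regardless of the pound timeout.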

In an effort to troubleshoot this situation, I switched to a different
machine running pound, with no effect (both machines use a 2.6.24
kernel).  I was previously using pound 2.3.2 but earlier this week
upgraded to 2.4.5, again with no effect.  The back-end web servers are
running kernel versions 2.6.29 to 2.6.30 and all seem affected more or
less equally.  None of the servers involved are anywhere near capacity
(CPU usage below 10%, memory usage below 33%).

I would *love* to know what I could do to identify the reason behind
these errors and, even better, eliminate them.  Any help or suggested
investigative steps would be *much* appreciated.

-David Clark