Nigel Kukard wrote:

>I'm unable to reproduce anything similar to this on our side, we have
>some Policyd boxes in production which handle a very large number of
>mails per day and maintain about 500 connections to a single MySQL
>server. I've seen maybe one of those errors in Policyd. Can you maybe
>shove Policyd into debug mode and see if it appears any query is getting
>stuck? possibly may be an idea to tcpdump -w the traffic out and see if
>there are any PSH's or repeated packet transmissions which may indicate
>packet loss?

OK, loads of PSHs, but as I understand it, those are normal and the result of 
the application forcing any data to be sent.

I still see no errors in the policyd logging.

Looking through the packet capture (taken on the backend, and I suddenly 
realise it might have been useful to capture both ends at once), I reckon there 
are (using a bit of grep/sed/sort/uniq) 187 duplicated lines - ie everything 
except packet number and timestamp the same) from a capture of 8.4k packets.

Just searching by port number from a list of duplicate packets, I did find this 
in the logs :
172.16.1.112 is the backend, 172.16.0.85 is one of the mail servers

8118 207.641643  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515539235 TSER=40961195
8119 207.842502  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515539286 TSER=40961195
8120 208.250402  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515539388 TSER=40961195
8121 209.066162  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515539592 TSER=40961195
8122 210.697905  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515540000 TSER=40961195
8123 213.961077  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515540816 TSER=40961195
8124 220.487679  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515542448 TSER=40961195

Which does seem a bit strange !

All the lines containing 53304 are :
5769  76.216021  172.16.0.85 -> 172.16.1.112 TCP 74 74 53304 > 10031 [SYN] 
Seq=0 Win=5840 Len=0 MSS=1460 SACK_PERM=1 TSV=515506372 TSER=0 WS=6
5770  76.216052 172.16.1.112 -> 172.16.0.85  TCP 74 74 10031 > 53304 [SYN, ACK] 
Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 SACK_PERM=1 TSV=40961146 TSER=515506372 WS=6
5771  76.216310  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [ACK] 
Seq=1 Ack=1 Win=5888 Len=0 TSV=515506372 TSER=40961146
5772  76.216500  172.16.0.85 -> 172.16.1.112 TCP 609 609 53304 > 10031 [PSH, 
ACK] Seq=1 Ack=1 Win=5888 Len=543 TSV=515506372 TSER=40961146
5773  76.216518 172.16.1.112 -> 172.16.0.85  TCP 66 66 10031 > 53304 [ACK] 
Seq=1 Ack=544 Win=6912 Len=0 TSV=40961146 TSER=515506372
5781  76.413098 172.16.1.112 -> 172.16.0.85  TCP 80 80 10031 > 53304 [PSH, ACK] 
Seq=1 Ack=544 Win=6912 Len=14 TSV=40961195 TSER=515506372
5782  76.413345 172.16.1.112 -> 172.16.0.85  TCP 66 66 10031 > 53304 [FIN, ACK] 
Seq=15 Ack=544 Win=6912 Len=0 TSV=40961195 TSER=515506372
5783  76.413443  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [ACK] 
Seq=544 Ack=15 Win=5888 Len=0 TSV=515506421 TSER=40961195
5787  76.451696  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515506431 TSER=40961195
8118 207.641643  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515539235 TSER=40961195
8119 207.842502  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515539286 TSER=40961195
8120 208.250402  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515539388 TSER=40961195
8121 209.066162  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515539592 TSER=40961195
8122 210.697905  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515540000 TSER=40961195
8123 213.961077  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515540816 TSER=40961195
8124 220.487679  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515542448 TSER=40961195
8344 233.539700  172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] 
Seq=544 Ack=16 Win=5888 Len=0 TSV=515545712 TSER=40961195

Now I don't know if it's relevant at all, but as I read that, the backend is 
pushing remaining data and then closing the connection (5781, 5782), the mail 
server is ACKing the packets (5783, 5787), and then 5 minutes later the mail 
server is timing out and closing the connection. Seems odd since it's ACK'd a 
FIN packet earlier.


If I transfer a 1/2G file with SCP, it just streams across between the machines 
(either way) at about 11.2MB/s which is about right for a 100mbps network 
connection.

-- 
Simon Hobson

Visit http://www.magpiesnestpublishing.co.uk/ for books by acclaimed
author Gladys Hobson. Novels - poetry - short stories - ideal as
Christmas stocking fillers. Some available as e-books.
_______________________________________________
Users mailing list
[email protected]
http://lists.policyd.org/mailman/listinfo/users

Reply via email to