Andrew,

I've been following this thread, and what strikes me is how similar it is to your previous issue, where some of your remote links were down and dropping packets.

Can you get access to the remote side and try pushing data to a test box on your side to see if you can replicate the problem? It's going to be hard to debug (as we all know!) if it's not repeatable.

In your solution you say that some sockets die, and that the failures move around randomly, but that you have some sockets which have never died. Do you have any logs which show the connections for each socket? Can you sort that data to look for patterns? Maybe there's a single remote host having problems which gets scattered across your setup?

Also, you talk about having multiple VPNs, which is confusing. Do you really mean that you have multiple VLANs, with one set up for internal traffic and the other for external traffic?

Having the window size drop to zero on your client side smells of another problem with the link(s) between you and the source. You say you have tcpdump logs which show the window size starting at 26k and dropping to zero. During this time, are you seeing *any* traffic from the source on that specific connection, or is the client side sending back TCP ACKs and looking for a reply?

If the source systems are reasonably close by, latency-wise, maybe you could run a test to shrink the TCP window to 16k (what size packets are you getting?) or even smaller, to see if that's an issue. Also, are you running jumbo frames at all on these links?

Just a thought... But please keep us in the loop; it's great when we can all share our experiences and solutions.

John
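
P.S. In case it helps with sorting the connection data, here's a rough Python sketch of the kind of tally I mean. It's only an illustration: the log format (timestamp, socket id, remote host, event), the "died" event name, and the sample lines are all made up, since I don't know what your logs actually look like. The idea is just to count dead sockets per remote host and per local socket, so that one misbehaving host would stand out.

```python
from collections import Counter


def tally_deaths(lines):
    """Count 'died' events per remote host and per local socket.

    Assumes a hypothetical log format, one event per line:
        <timestamp> <socket_id> <remote_host> <event>
    Adjust the parsing to whatever your logs really contain.
    """
    by_host = Counter()
    by_socket = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        _timestamp, socket_id, remote_host, event = fields
        if event == "died":
            by_host[remote_host] += 1
            by_socket[socket_id] += 1
    return by_host, by_socket


# Hypothetical sample data, not real logs.
sample = [
    "2011-05-01T10:00:00 sock3 10.0.0.7 opened",
    "2011-05-01T10:05:00 sock3 10.0.0.7 died",
    "2011-05-01T10:06:00 sock9 10.0.0.7 died",
    "2011-05-01T10:07:00 sock1 10.0.0.2 opened",
    "2011-05-01T10:09:00 sock5 10.0.0.7 died",
]

by_host, by_socket = tally_deaths(sample)

# Worst offenders first.
for host, count in by_host.most_common():
    print(host, count)
```

If the deaths pile up on one or two remote hosts while being spread evenly across your local sockets, that would point at the far end or the path to it rather than at your side.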
