On 9/25/2012 8:29 AM, Ralf Hildebrandt wrote:
> * Mikael Bak <m...@inbox.lv>:
>> Hi Stan,
>>
>> On 09/25/2012 08:22 AM, Stan Hoeppner wrote:
>>>
>>> Apparently Linux and Windows TCP window scaling doesn't always work
>>> reliably together. Try disabling TCP window scaling on the Linux box(en):
>>>
>> [snip]
>>
>> Perhaps off topic, but do you have any links to documents or similar
>> that proves that there is a problem between the two operating systems
>> with regard to TCP window scaling. This is the first time I hear about
>> this to be honest.
>
> I was wondering about this as well. I mean, it doesn't happen THAT
> often.

First, this does seem to be a rare issue. Given the behavior you're
seeing, the problem is likely in the TCP stack, and TCP window scaling
mis-negotiation seems a likely culprit. Linux kernels have a workaround
hack for window scaling issues:

man 7 tcp

    tcp_workaround_signed_windows (Boolean; default: disabled; since
    Linux 2.6.26)
        If enabled, assume that no receipt of a window-scaling option
        means that the remote TCP is broken and treats the window as a
        signed quantity. If disabled, assume that the remote TCP is not
        broken even if we do not receive a window scaling option from it.

To me this seems a partial workaround, not a complete fix, which is why
I recommended testing with window scaling totally disabled on one side
of the connection. Since window scaling is designed to maximize
throughput for streaming data transfer applications such as FTP,
disabling it will have little, if any, negative impact on SMTP traffic,
which is transactional and bursty in nature.

Disabling window scaling in your Postfix/Exchange case should simply
limit both hosts to the unscaled 64KB maximum window, the limit that
RFC 1323 window scaling was introduced to exceed. If the problem is
window negotiation, disabling it should fix the problem.

The rarity of manifestation seems to indicate that on occasion you have
long bursts of traffic between the two hosts, bursts long enough for one
or both hosts to open the window past 64KB, which is where the scale
factor exchanged in the SYN handshake comes into play. When this occurs,
and if the two sides disagree about that scale factor, you may see
things break at the application level.

Regarding docs or links, I couldn't find any official documentation
describing this issue, only a few scattered forum posts, which is likely
a direct result of how rarely it occurs.

You could always put a trace on the Linux ethernet interface to confirm
the TCP problem. But given the rarity of occurrence, twice in 4 weeks,
that would yield a rather large file to search.

-- 
Stan
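
For anyone who does capture a trace: the window scale option only
appears in the SYN and SYN/ACK segments, so those are the packets worth
inspecting. Below is a minimal Python sketch, not taken from any tool
mentioned in this thread, that pulls the shift count out of a raw TCP
header and computes the window a given shift implies. The function names
window_scale_from_tcp_header and effective_window are made up for
illustration, and the code assumes you have already extracted the TCP
header bytes from the capture yourself.

```python
from typing import Optional

MAX_SHIFT = 14  # largest shift count the window scale option allows


def window_scale_from_tcp_header(header: bytes) -> Optional[int]:
    """Return the window-scale shift count in a raw TCP header, or None.

    `header` is assumed to start at the first byte of the TCP header
    (source port) and to include the options area.
    """
    if len(header) < 20:
        raise ValueError("a TCP header is at least 20 bytes long")
    # The upper 4 bits of byte 12 hold the header length in 32-bit words.
    header_len = (header[12] >> 4) * 4
    if header_len < 20 or header_len > len(header):
        raise ValueError("invalid or truncated TCP header")

    options = header[20:header_len]
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == 0:        # End of Option List
            break
        if kind == 1:        # No-Operation, a single padding byte
            i += 1
            continue
        if i + 1 >= len(options):
            break            # truncated option, stop parsing
        length = options[i + 1]
        if length < 2 or i + length > len(options):
            break            # malformed option, stop parsing
        if kind == 3 and length == 3:   # Window Scale option
            return options[i + 2]
        i += length
    return None


def effective_window(window_field: int, shift: Optional[int]) -> int:
    """Window in bytes implied by the 16-bit field and a shift count.

    A shift of None means scaling was not negotiated, so the field is
    used as-is. Shift counts above 14 are treated as 14.
    """
    if shift is None:
        return window_field
    return window_field << min(shift, MAX_SHIFT)
```

Without scaling the window tops out at 65,535 bytes; with the maximum
shift of 14 it is 65,535 << 14 = 1,073,725,440 bytes. If the SYN from
one side carries the option and the SYN/ACK from the other does not,
scaling is off for that connection in both directions. A capture filter
along the lines of 'tcp[tcpflags] & tcp-syn != 0 and port 25' (standard
pcap filter syntax; the port is an assumption about your setup) would
limit a long-running trace to handshake packets, which should keep the
file far smaller than a full capture.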