On 9/25/2012 8:29 AM, Ralf Hildebrandt wrote:
> * Mikael Bak <m...@inbox.lv>:
>> Hi Stan,
>>
>> On 09/25/2012 08:22 AM, Stan Hoeppner wrote:
>>>
>>> Apparently Linux and Windows TCP window scaling doesn't always work
>>> reliably together.  Try disabling TCP window scaling on the Linux box(en):
>>>
>> [snip]
>>
>> Perhaps off topic, but do you have any links to documents or similar
>> that prove that there is a problem between the two operating systems
>> with regard to TCP window scaling?  This is the first time I've heard
>> about this, to be honest.
> 
> I was wondering about this as well. I mean, it doesn't happen THAT
> often.

First, this does seem to be a rare issue.  Given the behavior you're
seeing it seems likely the problem is in the TCP stack.  TCP window
scaling mis-negotiation simply seems a likely culprit.  Linux kernels
have a workaround hack for window scaling issues:

man 7 tcp

tcp_workaround_signed_windows (Boolean; default: disabled;
since Linux 2.6.26)

              If enabled, assume that no receipt of a window-scaling
              option means that the remote TCP is broken and treats
              the window as a signed quantity.  If disabled, assume
              that the remote TCP is not broken even if we do not
              receive a window scaling option from it.
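Before changing anything it may help to see what the box is currently
set to.  The following is only a minimal sketch: it assumes the usual
/proc/sys/net/ipv4 layout, and read_tcp_sysctls is a helper name made up
for this example, not part of any library.

```python
from pathlib import Path

# The two settings relevant to this thread, as exposed under /proc/sys.
SYSCTLS = ("tcp_window_scaling", "tcp_workaround_signed_windows")


def read_tcp_sysctls(base="/proc/sys/net/ipv4"):
    """Return {name: integer value, or None if the file cannot be read}."""
    values = {}
    for name in SYSCTLS:
        path = Path(base) / name
        try:
            values[name] = int(path.read_text().strip())
        except (FileNotFoundError, PermissionError):
            # Older kernel, non-Linux host, or restricted /proc.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_tcp_sysctls().items():
        print(f"net.ipv4.{name} = {value}")
```

A value of 1 means the setting is enabled, 0 means disabled.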

To me this seems a partial workaround, not a complete fix, which is why I
recommended testing with window scaling totally disabled on one side of
the connection.  Since window scaling is designed to maximize throughput
for streaming data transfer applications such as FTP, disabling it will
have little, if any, negative impact on SMTP traffic, which is
transactional and bursty in nature.  Disabling window scaling in your
Postfix/Exchange case should simply limit both hosts to the unscaled
64KB maximum window, i.e. the 16-bit window field that the RFC 1323
window scale option would otherwise extend.  If the problem is window
scale negotiation, disabling it should fix the problem.
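To put numbers on the 64KB figure, here is a small worked example of
how the window scale shift turns the 16-bit header field into a byte
count.  It is illustrative arithmetic only, not a model of either TCP
stack; effective_window and the sample values are made up for this
example.

```python
MAX_WINDOW_FIELD = 0xFFFF  # the TCP header window field is 16 bits
MAX_SHIFT = 14             # largest scale shift RFC 1323 allows


def effective_window(window_field, shift=0):
    """Bytes advertised by a 16-bit window field under a scale shift."""
    if not 0 <= window_field <= MAX_WINDOW_FIELD:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift must be between 0 and 14")
    return window_field << shift


# With scaling disabled the shift is 0, so the ceiling is 65535 bytes.
unscaled_max = effective_window(MAX_WINDOW_FIELD)

# Hypothetical mismatch: the receiver means shift 7, but the sender
# interprets the same header field with shift 0.
intended = effective_window(0x2000, shift=7)
misread = effective_window(0x2000, shift=0)

print(unscaled_max, intended, misread)
```

If the two ends ever disagree about the shift, the same header value
means very different byte counts to each of them, which is the kind of
mismatch that disabling scaling rules out.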

The rarity of manifestation seems to indicate that on occasion you have
long bursts of traffic between the two hosts--bursts of sufficient
duration for one or both hosts to open the window past 64KB to increase
throughput.  The scale factor itself is exchanged only in the SYN
packets at connection setup, so this is the point at which it starts to
matter.  When this occurs, and if the two hosts disagree on the scale
factor, you may see things break at the application level.

Regarding docs or links, I couldn't find any official documentation
describing this issue, only a few scattered forum posts, which is likely
directly related to the rarity of occurrence.

You could always put a trace on the Linux ethernet interface to confirm
the TCP problem.  But given the rarity of occurrence, twice in 4 weeks,
that would yield a rather large file to search.
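One way to keep such a trace manageable is to capture headers only, into
a fixed-size ring of files.  The sketch below just assembles a tcpdump
command line.  The interface name, peer address and file name are
placeholders, and the snap length and ring sizes are guesses to adjust,
not recommendations.

```python
import shlex


def build_tcpdump_command(interface, peer, out_file,
                          snaplen=96, file_mb=50, file_count=4):
    """Build a headers-only, ring-buffered tcpdump command for SMTP."""
    # Restrict the capture to SMTP traffic with the one peer.
    capture_filter = f"host {peer} and tcp port 25"
    return [
        "tcpdump", "-n", "-i", interface,
        # 96 bytes covers Ethernet + IPv4 + TCP headers including TCP
        # options (assuming no IP options), so the window field and the
        # window scale option in the SYNs are kept, payload is dropped.
        "-s", str(snaplen),
        # Rotate after roughly file_mb million bytes, keep file_count
        # files, overwriting the oldest.
        "-C", str(file_mb),
        "-W", str(file_count),
        "-w", out_file,
        capture_filter,
    ]


# Placeholder values; substitute the real interface and Exchange host.
cmd = build_tcpdump_command("eth0", "192.0.2.10", "smtp-trace.pcap")
print(shlex.join(cmd))
```

With these defaults the capture never uses more than about 200 MB of
disk.  The catch is that the ring has to be large enough to still hold
the failing connection by the time someone notices the problem.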

-- 
Stan
