On 1/12/2010 12:40 PM, Richard Brittain wrote:
> The combination of properties we have investigated most thoroughly is:
> 
>  Windows XP (32-bit)
>  Symantec EndPoint Protection 11.0.4 firewall (Network Threat Protection)
>  OpenAFS 1.5.34 or later
>  Windows network interface with a non-default MTU set

[...]

> HKLM\System\CurrentControlSet\Services\TransarcAFSDaemon\Parameters\RxMaxMTU
> 
> (default value is 0)

To clarify, the problem is unrelated to the version of OpenAFS for
Windows being used.  What matters is the "RxMaxMTU" value that is set in
the registry.  Beginning in the 1.3 series the default RxMaxMTU value
was changed from 0 to 1260 octets in order to avoid communication
failures that ensued when the then-current version of the Cisco IPSec
VPN client software was in use.  The Cisco IPSec tunnel would not
transfer UDP packets that required fragmentation when the effective MTU
size for the network was reduced by the size of the IPSec headers.

The 1.5.34 release of OpenAFS for Windows restored the RxMaxMTU value to
0, which permits the Rx library to use the largest MTU size that can be
negotiated between the two Rx peers (in most cases 1444 octets).

When the network interface MTU size is hard-coded to a value smaller
than 1500 octets, the UDP packets are fragmented.  When the Symantec
EndPoint Protection 11.0.4 firewall is added, data corruption ensues.

Removing the artificial limit on network interface MTU or forcing the Rx
library to use an RxMaxMTU smaller than 1272 octets prevents the UDP
fragmentation from taking place, and the data corruption is avoided.
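
The arithmetic behind the fragmentation can be sketched as follows.  This
is only an illustration, not code from OpenAFS: it assumes plain IPv4
with no options (20 octets), a UDP header (8 octets) and a 28-octet Rx
header, which is consistent with the 1444-octet figure above for a
1500-octet interface.  The 1400-octet interface MTU is a hypothetical
value, and the sketch does not model how the Rx library itself accounts
for RxMaxMTU.

```python
# Assumed header sizes: IPv4 without options, UDP, and the Rx packet header.
IPV4_HEADER = 20
UDP_HEADER = 8
RX_HEADER = 28


def max_rx_payload(interface_mtu):
    """Largest Rx payload that fits in one unfragmented IPv4/UDP datagram."""
    return interface_mtu - IPV4_HEADER - UDP_HEADER - RX_HEADER


def needs_fragmentation(rx_payload, interface_mtu):
    """True when a datagram carrying rx_payload octets exceeds the MTU."""
    return rx_payload > max_rx_payload(interface_mtu)


if __name__ == "__main__":
    # Default Ethernet-sized interface: a 1444-octet payload fits exactly.
    print(max_rx_payload(1500), needs_fragmentation(1444, 1500))
    # Hypothetical interface with its MTU lowered to 1400 octets:
    # the same payload no longer fits and must be fragmented.
    print(max_rx_payload(1400), needs_fragmentation(1444, 1400))
```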

----

Other observations:

1. I produced a modified cache manager that computed and tracked MD5
hash values for all writes accepted via the SMB server and verified them
prior to passing data into the rx_Write() function.  The corruption is
not occurring in the cache manager file chunk buffers.

2. Data corruption occurs even when "fs setcrypt -on" is active.  This
indicates that the corruption is occurring prior to the rxkad checksum
being computed and added to the packet header.

3. I built a special version of an Rx library from 2008 to use with the
1.5.68 afsd_service.exe.  The data corruption still occurs, so the
problem is not something that was introduced recently into the code base.

4. Even when the network interface MTU is manually set from the
registry, the API calls that report the MTU to afsd_service.exe indicate
the configured MTU for the network subnet router.
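
The hash tracking described in observation 1 can be illustrated roughly
as below.  This is a minimal sketch of the general technique, not the
modified cache manager: the WriteTracker class and the (file id, offset)
key are hypothetical names chosen for the example.

```python
import hashlib


class WriteTracker:
    """Remembers an MD5 digest per accepted write and rechecks it later."""

    def __init__(self):
        self._digests = {}

    def record(self, key, data):
        # Called when a write is accepted (e.g. from the SMB server).
        self._digests[key] = hashlib.md5(data).digest()

    def verify(self, key, data):
        # Called just before the buffer is handed to the network layer.
        # Returns False for an unknown key or a changed buffer.
        return self._digests.get(key) == hashlib.md5(data).digest()


if __name__ == "__main__":
    tracker = WriteTracker()
    key = ("file-1", 0)          # hypothetical (file id, offset) pair
    buf = b"example chunk data"
    tracker.record(key, buf)
    print(tracker.verify(key, buf))              # buffer unchanged
    print(tracker.verify(key, buf + b"\x00"))    # buffer altered
```

If verification succeeds at the point where data leaves the buffers, any
corruption seen on the wire must be introduced later in the path.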

----

Conclusions:

For the time being at least, set the RxMaxMTU value back to 1260 octets.
This will result in a performance penalty but will avoid this data
corruption.  Hopefully one night soon I will wake up in the middle of
the night understanding where and how this data corruption is taking place.
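
One possible way to apply the workaround is sketched below.  It builds a
"reg add" command for the registry path quoted at the top of this
message.  Treating RxMaxMTU as a REG_DWORD value is an assumption here,
as is the "--apply" switch, which exists only in this example; the
command is merely printed unless that switch is given on Windows.  The
service presumably needs a restart before a new value takes effect.

```python
import subprocess
import sys

# Registry key quoted in the original report.
KEY = r"HKLM\System\CurrentControlSet\Services\TransarcAFSDaemon\Parameters"


def rx_max_mtu_command(value=1260):
    """Build a 'reg add' command that sets RxMaxMTU (assumed REG_DWORD)."""
    return ["reg", "add", KEY,
            "/v", "RxMaxMTU",
            "/t", "REG_DWORD",
            "/d", str(value),
            "/f"]  # /f overwrites an existing value without prompting


if __name__ == "__main__":
    cmd = rx_max_mtu_command()
    print(" ".join(cmd))
    # Only touch the registry when explicitly asked to, and only on Windows.
    if sys.platform == "win32" and "--apply" in sys.argv:
        subprocess.run(cmd, check=True)
```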

Jeffrey Altman


