On 7/8/2010 9:31 AM, Matt Renzelmann wrote:
> Hello,
> 
> I've observed the following issue with OpenAFS.  Platform is Windows 7
> x64 "Ultimate" with all the latest Windows Update patches.  The behavior
> occurs with the last three stable releases of OpenAFS recommended for
> Windows:  1.5.75, 1.5.74, and 1.5.73.  Using Network Identity Manager
> 2.0.0.304 per Help -> About – the latest.
> 
> Details of the behavior:
> 
> - If I disable and then reenable the main network adapter--the one that
> AFS is ultimately using to access my AFS data--I observe that Windows
> Explorer gets "stuck."  It appears to be stuck in some kind of busy
> live-lock state.
> 
> - I suspect that if I lose my Internet connection on the same adapter
> for any reason, I get a similar symptom, but I've not confirmed this.
> 
> - Attempting to terminate the explorer process once it's in this state
> fails.  It will not terminate.  Neither Task Manager nor Process
> Explorer, even with administrative elevation, is sufficient.
> 
> - All applications that use Explorer functionality, e.g. file open/save
> windows, will hang as soon as they invoke said functionality.
> 
> - Rebooting resolves the problem, though I often have some difficulty
> rebooting cleanly in this scenario.

This is a known issue which can only be fixed by Microsoft.  It is
documented in the OpenAFS 1.5.75 release notes.

In brief, if any network adapter's link state changes or the DHCP IP
address registration is altered after the OpenAFS SMB Server registers
the "AFS <20>" NetBIOS name, the Microsoft SMB redirector will drop the
connection to the "AFS <20>" SMB server and will never again be able to
resolve the name, even though the name is still locally registered.  The
registration can be checked with

  nbtstat -n

and the active NetBIOS connections can be reported with

  nbtstat -S
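
If you want to script the registration check, here is a minimal Python
sketch.  It is only an illustration, not part of OpenAFS.  The function
names (parse_name_table, afs_name_registered, run_nbtstat_n) are
hypothetical, and the row pattern assumes the usual English-language
"Name <suffix> Type Status" layout printed by nbtstat -n; localized
Windows builds may print something different.

```python
import re
import subprocess

# Assumed layout of one row in the NetBIOS local name table, e.g.
#     AFS            <20>  UNIQUE      Registered
ROW = re.compile(
    r"^\s*(\S+)\s*<([0-9A-Fa-f]{2})>\s+(UNIQUE|GROUP)\s+(\S+)\s*$"
)


def parse_name_table(text):
    """Return (name, suffix, type, status) tuples found in nbtstat -n output."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, suffix, kind, status = match.groups()
            rows.append((name.upper(), suffix.upper(), kind, status))
    return rows


def afs_name_registered(text, name="AFS", suffix="20"):
    """True if the given name/suffix pair is listed with status Registered."""
    return any(
        n == name and s == suffix and status.lower() == "registered"
        for n, s, _kind, status in parse_name_table(text)
    )


def run_nbtstat_n():
    """Run nbtstat -n (Windows only) and return its standard output."""
    result = subprocess.run(
        ["nbtstat", "-n"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a Windows machine, afs_name_registered(run_nbtstat_n()) reports
whether "AFS <20>" appears as Registered.  Note that a True result does
not mean \\AFS is reachable: as described above, the name remains
locally registered even after the redirector has stopped resolving it.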

If you examine the stack (using Sysinternals Process Explorer) of any
thread that is accessing \\AFS after the failure occurs, you will find
that it is blocked in the Microsoft SMB redirector (mrxsmb.sys) in a
function that is attempting to re-establish a connection to the SMB
server.  Since the name "AFS <20>" is never found, the connection is
never established and the thread blocks indefinitely.

This problem cannot be addressed within OpenAFS except by replacing the
AFS SMB server interface with a native kernel file system driver.  The
work to do so is quite far along.  However, the AFS redirector has
uncovered other problems in Microsoft Office that prevent its use with
non-SMB file system interfaces.  Microsoft is working on that problem
but has made no commitment as to when Office 2007 and 2010 might be
patched to correct the failure.

> More background:
> 
> - I'm using the DEBUG version of AFS currently in an effort to resolve
> this.  I've had the problem with 1.5.74/73 using the standard "release"
> version.

The debug version of OpenAFS will not help; in fact, in most cases
running it will hurt.  Please do not use the debug version unless you
are actively debugging a problem with a debugger.

The only thing that can be done is to file reports with Microsoft in an
effort to get them to escalate the problem and decide it is worth
fixing.  Several reports have been filed but as yet no one with
sufficient clout has made enough noise.

Jeffrey Altman
