This is exactly our situation.  We have a single CPU Alpha 600 running
3.4a and are accessing AFS via the translator from our multi-cpu Alpha
8400.  In our situation, after a file is written, you can do several
"ls -l" and see the file, see the file, oops - it's gone (zero
length).  The translator then spits out the "failed to store"
message.  This tends to happen randomly, i.e. not for every file
written.  It also happens more often when you do several things in a
volume at the same time.
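
For anyone who wants to try reproducing this, here is a minimal sketch of the "write, then ls -l a few times" test described above. It assumes Python is available on the NFS client. The function names, the probe path, and the check count and interval are all hypothetical, not anything from our setup. It only observes the symptom and says nothing about the cause.

```python
import os
import time


def watch_file_size(path, payload, checks=10, interval=1.0):
    """Write payload to path, then poll its size, like repeated "ls -l" runs.

    Returns the list of sizes seen; None means the file was missing.
    """
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out before polling

    sizes = []
    for _ in range(checks):
        try:
            sizes.append(os.stat(path).st_size)
        except FileNotFoundError:
            sizes.append(None)
        time.sleep(interval)
    return sizes


def size_changed(sizes, expected):
    """True if any poll reported something other than the expected size."""
    return any(s != expected for s in sizes)
```

Pointing the path at a file under the translator's NFS mount and running it on several files in the same volume at once should show whether a size drops to zero after the first few polls.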

Adding "sync" and "wsize=16384" and increasing the timeo seem to have
helped some, but failures are still frequent enough that we have to
supply UNIX file space for users on the 8400.  (mts at umich
suggested increasing the write size).  It is unclear to me why the
error is coming back as permission denied, and why the translator is
dropping files after getting them from NFS.  It is not because tokens
have expired, or for some other fairly high-level reason.  It
appears to have more to do with the interface between AFS and NFS.  I have
tried virtually all combinations of NFS parameters, and have greatly
reduced the frequency, but not solved the problem.  Since we are just
spinning up AFS here, it's been quite difficult for me to convince
users that this is not the AFS that I have come to know and love (it's
created a lot of headaches).

We are also unable to move files (mv foo foobar).  We see an error on
the translator like:
vmunix:  rename, client address = xxx.xxx.xxx.xxx, errno 2
vmunix: NFS server: stale file handle fs(70,1) file 1 gen 536870993

but I'm not sure if this is an AFS thing or a Digital/NFS thing.

If you hear more, or come up with a solution, please let me know.
Hopefully we will be able to run AFS natively on the 8400 in a few
weeks and this will be less of an issue.  I hope the port is finished
on time and is as robust as the Sun multi-cpu ports.

thanks
-dave
