> Excerpts from ext.misc.info-afs: 27-Apr-95 File delete semantics in AFS
> Todd [EMAIL PROTECTED] (395)
>
> > I am seeing that if a process holds an AFS file open, and that
> > file is deleted externally, that eventually the process can no longer
> > read the file.
>
> I assume by lack of any response that nothing can be done about this
> (i.e. "AFS design")? Oh well, I hear it is fixed in DFS :-).
>
> -todd inglett
>
NFS has the same problem. The major reason one might want
to do this deliberately is for temporary files - and there's relatively
little benefit to storing temporary files in a networked filesystem if
you don't intend to "share" them.
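For the record, the idiom being broken here is the classic local-Unix
temp-file trick. A minimal sketch (the AFS path is made up):

    /* Classic Unix temp-file idiom: create, unlink immediately, keep using
     * the descriptor.  On a local filesystem the data stays readable until
     * the last descriptor is closed; on AFS or NFS the reads may eventually
     * fail once the file is gone on the server. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/afs/some.cell/tmp/scratch.12345"; /* made-up path */
        char buf[16];
        int fd;

        fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
        if (fd < 0) { perror("open"); exit(1); }

        unlink(path);                     /* the name is gone; fd stays open */

        if (write(fd, "hello\n", 6) != 6) perror("write");
        lseek(fd, 0, SEEK_SET);

        if (read(fd, buf, sizeof buf) < 0)
            perror("read");               /* this is what eventually fails */

        close(fd);
        return 0;
    }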
The only time I can see this being a problem is if you're in fact updating
or replacing files that are shared - for instance, installing new
application binaries. But there are loads of other problems there:
if you're updating binaries, do you really want to discover a week
later that someone hit a bug you thought you had fixed - because they
started the program a week ago and never exited it?
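For what it's worth, the usual local-Unix install idiom relies on exactly
that behavior - a rough sketch, with hypothetical paths and names:

    /* Write the new binary under a temporary name, then rename() it over
     * the old one.  rename() is atomic, and any process already running
     * the old binary keeps the old inode until it exits - which is exactly
     * how somebody can still be hitting last week's bug. */
    #include <stdio.h>

    int install_binary(const char *newtmp, const char *target)
    {
        if (rename(newtmp, target) != 0) {  /* atomic on the same filesystem */
            perror("rename");
            return -1;
        }
        return 0;
    }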
The design issues are certainly kind of interesting. NFS is
designed to be as stateless as possible. That means the server
can't keep a deleted file around for clients that still have it
open - doing so would require the server to track which clients
have the file open, and keeping that kind of per-client state
would violate one of its design precepts. The same is more or
less true of AFS - the callback mechanism is primarily
intended to alert a cache manager that a file has
changed - there isn't any real notion in AFS of a cache
manager "owning" any part of a file, and indeed, the
AFS file server is more or less free to discard state
on everything but active RPCs at any time. The advantage
is the same as with NFS - since the server doesn't keep
any long-term state about clients, it's simple to recover
from a network partition or a fileserver crash.
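Very roughly - and this is just an illustrative sketch, not the real AFS
cache-manager code or RPC names - the callback idea amounts to:

    /* Illustrative sketch of callback-style cache invalidation.  The server
     * promises to "break the callback" if the file changes or disappears;
     * the client only trusts its cached copy while that promise stands. */
    #include <string.h>

    struct cache_entry {
        char fid[32];        /* file identifier (hypothetical format) */
        int  callback_valid; /* nonzero while the server's promise holds */
        /* ... cached data, version info ... */
    };

    #define NCACHE 128
    static struct cache_entry cache[NCACHE];

    /* Server -> client notification: the file changed (or was deleted),
     * so any cached copy can no longer be trusted. */
    void break_callback(const char *fid)
    {
        for (int i = 0; i < NCACHE; i++)
            if (strcmp(cache[i].fid, fid) == 0)
                cache[i].callback_valid = 0;
    }

    /* Client read path: use the cache only while the callback is valid;
     * otherwise go back to the file server, which may simply report that
     * the file is gone - hence the behavior Todd observed. */
    int cache_usable(const struct cache_entry *e)
    {
        return e->callback_valid;
    }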
DFS knows more about what the client is doing: whereas in
AFS callbacks apply to an entire file, in DFS they apply
to a byte range, and there is a notion that a cache
manager can "check out" part of a file and "own" it - and
if another client requests that part of the file, the
file server will actually pull it back out of the client
that owns it. That lets DFS model Unix filesystem semantics
much more closely, but there are costs: an application that
uses files to share read-write data between clients will
now work, but it certainly won't perform anywhere near as
well as it would on a local Unix filesystem.
Reliability may also suffer - a network partition that
breaks another client's connection to the server may now
be visible to your client, even though your client's own
connection to the server is still fine. Such problems are
likely to be very application- and site-specific - they are
more likely to show up with canned applications, perhaps
ported from DOS, that like to open files for writing whenever
possible, or that update some sort of "last visited" record,
than with Unix utilities (which don't tend to keep files
open for long, and which rarely open files for writing
unless they actually mean to write to them).
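Again purely as an illustration (made-up types and names, not the real
DCE/DFS token-manager interface), the byte-range token idea looks
roughly like:

    /* Sketch of byte-range tokens in the DFS style.  A client "checks out"
     * a range; if another client asks for a conflicting range, the server
     * recalls the existing token (pulling any dirty data back) before
     * granting the new one. */
    #include <stddef.h>

    struct token {
        int  client;      /* which cache manager holds it */
        long off, len;    /* byte range covered */
        int  write;       /* nonzero if the holder may modify the range */
    };

    static int ranges_overlap(const struct token *a, long off, long len)
    {
        return off < a->off + a->len && a->off < off + len;
    }

    /* Hypothetical RPC: ship any dirty pages in the range back to the
     * server and revoke the token.  That round trip is where the extra
     * cost comes from - and the extra failure mode, if the token's owner
     * happens to be partitioned away. */
    extern void recall_token(struct token *t);

    void grant_token(struct token *held, size_t nheld,
                     int client, long off, long len, int write)
    {
        for (size_t i = 0; i < nheld; i++) {
            if (held[i].client != client &&
                ranges_overlap(&held[i], off, len) &&
                (write || held[i].write))
                recall_token(&held[i]);   /* suck it back first */
        }
        /* ...then record and grant the new token to 'client'... */
    }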
Design tradeoffs are rarely straightforward or obvious - and while
AFS obviously is not perfect, I'd hesitate to call it "broken" or call
for it to be fixed just because it's a bit "different". DFS is
very likely a step forward, and I'm sure the guys at Transarc
put a lot of thought into just how to fix the problems in AFS.
The problems with AFS are at least minor enough that AFS can
be used to solve a lot of real-world problems. DFS will certainly
be an even better fit, but I'm sure it won't be perfect either,
and I expect it will only be a little bit better than AFS - not
because of any lack of effort on Transarc's part, but because
AFS is really pretty good already.
-Marcus Watts
UM ITD RS Umich Systems Group