> Preserving "low numbers" in source code is of limited
> value, because you already have the handy dandy library
> functions to map error codes.  Preserving "low numbers"
> on the wire means you can preserve server/client interoperability.
> That would eliminate the "rev all the clients" incompatibility -
> you would just gradually "work better" as time goes on.

Actually, this could be done in just one revision.  There are really only
two cases where "OS" errors are currently sent over the wire.  The first,
which I believe is actually quite rare, is when a system call fails in
a server.  In that case, it often makes sense to send the actual value
of errno across the wire, because it can make debugging easier.  I would
propose that no change be made here.

The other case is when fileservers report errors to clients.  In this
case, the fileserver often uses UNIX error codes that are close to the
intended meaning.  I believe the correct solution here is to devise
an entirely new table of error codes that the fileserver reports to the
cache manager, and always use those codes rather than UNIX error codes
to report "filesystem" errors.  In addition to promoting better
interoperability, this would allow a set of error codes that more
closely represent reality - conversion to UNIX error codes should be done
by the cache manager, which would then be able to use a richer set of
error codes if the OS has them.

As I said, new error codes could be introduced in a single version,
without a loss of backward compatibility.  For starters, the new error
codes would never overlap the old ones, so it would be possible for new
cache managers to support both sets of codes simultaneously.  Naturally,
UNIX applications would only see UNIX-style error codes; the cache manager
would always do translation of some sort.
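
To make the cache manager side concrete, here is a rough sketch of the
translation step.  Everything in it is made up for illustration: the
FSERR_* names, the FSERR_BASE value, and the function names are
assumptions, not an existing AFS interface.  The only points it is meant
to show are that new codes live in a range the old ones never use, and
that the client picks the best local errno it has (EDQUOT where the OS
defines it, ENOSPC otherwise).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical new-style codes; names and values are illustrative only.
 * FSERR_BASE is assumed to lie above every UNIX errno value, so the two
 * sets can never collide. */
enum {
    FSERR_BASE = 0x10000,
    FSERR_NOT_FOUND = FSERR_BASE + 1,
    FSERR_ACCESS_DENIED,
    FSERR_QUOTA_EXCEEDED,
    FSERR_VOLUME_OFFLINE
};

/* Nonzero if the code comes from the new table rather than from errno. */
static int fs_is_new_style(long code)
{
    return code >= FSERR_BASE;
}

/* Map a new-style code to the closest errno this OS offers. */
static int cm_new_to_errno(long code)
{
    assert(fs_is_new_style(code));
    switch (code) {
    case FSERR_NOT_FOUND:
        return ENOENT;
    case FSERR_ACCESS_DENIED:
        return EACCES;
    case FSERR_QUOTA_EXCEEDED:
#ifdef EDQUOT
        return EDQUOT;      /* richer code, where the OS has one */
#else
        return ENOSPC;      /* closest fallback otherwise */
#endif
    case FSERR_VOLUME_OFFLINE:
        return ENODEV;
    default:
        return EIO;         /* new code this client does not know yet */
    }
}

/* Entry point: accepts both old-style and new-style wire codes. */
int cm_translate_error(long code)
{
    if (code == 0)
        return 0;
    if (fs_is_new_style(code))
        return cm_new_to_errno(code);
    return (int)code;       /* old-style code: handled as it is today */
}
```

A new cache manager talking to an old fileserver simply never takes the
new-style branch.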

The fileserver should always use new-style error codes internally, since
they are supposed to be more precise.  Since each new error code maps to
exactly one UNIX error code, it should be possible to include a function
on the server to translate new error codes to old ones, which would be
called at the end of each RPC to translate the error code before returning
it.
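
The server-side translation could be as simple as a table lookup.  Again,
this is only a sketch under assumptions: the FSERR_* codes, the table
contents, and the name fs_to_old_code are hypothetical, and the particular
UNIX codes chosen are just plausible stand-ins for whatever the fileserver
sends today.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical new-style codes, assumed to sit above all errno values. */
enum {
    FSERR_BASE = 0x10000,
    FSERR_NOT_FOUND = FSERR_BASE + 1,
    FSERR_ACCESS_DENIED,
    FSERR_VOLUME_OFFLINE
};

struct fs_errmap {
    long new_code;  /* what the fileserver uses internally */
    long old_code;  /* the single UNIX code it maps to */
};

/* Each new code appears exactly once, so the mapping is a function. */
static const struct fs_errmap fs_errtab[] = {
    { FSERR_NOT_FOUND,      ENOENT },
    { FSERR_ACCESS_DENIED,  EACCES },
    { FSERR_VOLUME_OFFLINE, ENODEV }
};

/* Called at the end of each RPC, just before the code is returned. */
long fs_to_old_code(long code)
{
    size_t i;

    if (code < FSERR_BASE)
        return code;        /* success, or already an old-style code */

    for (i = 0; i < sizeof fs_errtab / sizeof fs_errtab[0]; i++) {
        /* Old codes must stay out of the new range. */
        assert(fs_errtab[i].old_code < FSERR_BASE);
        if (fs_errtab[i].new_code == code)
            return fs_errtab[i].old_code;
    }
    return EIO;             /* new code with no table entry */
}
```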

Now comes the tricky part.  Using the scheme above, AFS 3.5 could support
new error codes in the cache manager, but not produce them, and AFS 3.6
could produce them.  This would maintain backward compatibility for only one
version, and take two versions to release.  However, there is an alternative
which allows the change to be done in one version, without affecting
backward compatibility ever (theoretically).  Let the fileserver keep track
of whether each client supports new-style error codes.  A client can make
a call at each connection, or when it gets the first connection to a server,
or some such, indicating that it understands new-style error codes.  The
server then remembers this information, and produces new codes for that
client.  This shouldn't be too unreasonable, since the fileserver must
keep track of each client anyway.
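
As a rough illustration of the per-client bookkeeping, something like the
following would do.  The fixed-size table, the host key, and all of the
names (client_announce_new_codes, fs_reply_code, and so on) are
assumptions made for the sketch; a real fileserver would hang the flag off
whatever per-client state it already keeps.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical new-style code, assumed to sit above all errno values. */
enum { FSERR_BASE = 0x10000, FSERR_NOT_FOUND = FSERR_BASE + 1 };

#define MAX_CLIENTS 64

struct client_entry {
    unsigned long host;     /* client identity, e.g. its address */
    int in_use;
    int new_codes;          /* set once the client has announced support */
};

static struct client_entry clients[MAX_CLIENTS];

/* Find the entry for a host, optionally creating it in a free slot. */
static struct client_entry *client_lookup(unsigned long host, int create)
{
    struct client_entry *free_slot = NULL;
    size_t i;

    for (i = 0; i < MAX_CLIENTS; i++) {
        if (clients[i].in_use && clients[i].host == host)
            return &clients[i];
        if (!clients[i].in_use && free_slot == NULL)
            free_slot = &clients[i];
    }
    if (create && free_slot != NULL) {
        free_slot->host = host;
        free_slot->in_use = 1;
        free_slot->new_codes = 0;
        return free_slot;
    }
    return NULL;
}

/* Handler for the "I understand new-style codes" call. */
int client_announce_new_codes(unsigned long host)
{
    struct client_entry *c = client_lookup(host, 1);

    if (c == NULL)
        return -1;          /* table full */
    assert(c->in_use && c->host == host);
    c->new_codes = 1;
    return 0;
}

/* Stand-in for the new-to-old translation function. */
static long fs_to_old_code(long code)
{
    if (code < FSERR_BASE)
        return code;
    return code == FSERR_NOT_FOUND ? ENOENT : EIO;
}

/* Pick the code to put on the wire for this particular client. */
long fs_reply_code(unsigned long host, long internal_code)
{
    const struct client_entry *c = client_lookup(host, 0);

    if (c != NULL && c->new_codes)
        return internal_code;               /* client asked for new codes */
    return fs_to_old_code(internal_code);   /* everyone else: old codes */
}
```

Clients that never make the call keep getting exactly what they get now,
which is what makes the one-version rollout possible.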

-- Jeff
