Re: AMDS problem with AFS3.4a

Marcus Watts Mon, 20 May 1996 07:07:14 GMT
Today, on the wire, RX is already "de facto" using a "standard"
range of error codes - the codes that "most clients" are using.
Here are the three places where it's worth thinking about error codes:

(1) on the server side (say, the fileserver) - what
error numbers can it generate in the source?

(2) on the wire - the error codes passed in the rx packets.

(3) on the client side (say, the cache manager) - what
error numbers does it see here?

Preserving "low numbers" in source code is of limited
value, because you already have the handy dandy library
functions to map error codes.  Preserving "low numbers"
on the wire means you can preserve server/client interoperability.
That would eliminate the "rev all the clients" incompatibility -
you would just gradually "work better" as time goes on.
It does mean that on the server and client sides in the
source, for the error #'s that clash, you would want
to use different manifests (which should be easy to find);
that's mildly ugly, but not nearly as bad as all that.

One other thing worth mentioning is there are some further
peculiarities that relate to com_err.  For instance, MIT
compile_et understands lower-case table names, and can
generate negative error numbers for these.  This has some
unfortunate consequences with regard to AFS - for instance,
ubik (ubikclient.c: ubik_Call_New) thinks that all negative numbers
are "communications" errors and retries the next server,
and thinks that positive numbers are hard errors.  The error message
logic in comerr/error_msg.c: error_message() shares a similar
misunderstanding about negative numbers, and further interprets error
codes 101-111 as volume errors, and not as operating system errors.
One of the places this ugliness rears its head (for us) is in making
an rx-capable version of kadmind for MIT style password changing;
the error numbers returned by the MIT kadm libraries use the
lower-case table name of "kadm", which of course generates
the negative error base of -1783126272.  Fortunately, ubik
never has reason to see kadm style negative numbers, but
compile_et & error_message were both issues.
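
To show where a number like -1783126272 comes from, here is a minimal
sketch in C of turning a table name into an error base.  It assumes the
usual com_err scheme (up to four characters, six bits each, then shifted
left by eight bits to leave room for 256 codes per table) and the
character ordering I believe MIT compile_et uses.  The names
et_char_to_num and et_table_base are made up for this example and are
not the library's own.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed com_err alphabet: upper case first, then lower case, digits, '_'. */
static const char et_char_set[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_";

/* Map a character to 1..63; 0 means "not in the alphabet". */
int et_char_to_num(char c)
{
    const char *p;

    if (c == '\0')
        return 0;
    p = strchr(et_char_set, c);
    return p ? (int)(p - et_char_set) + 1 : 0;
}

/* Pack up to four characters (6 bits each), then shift by 8 bits. */
int32_t et_table_base(const char *name)
{
    uint32_t num = 0;
    int64_t wide;
    int i;

    assert(name != NULL);
    for (i = 0; i < 4 && name[i] != '\0'; i++)
        num = (num << 6) + (uint32_t)et_char_to_num(name[i]);
    num <<= 8;

    /* Reinterpret the 32-bit pattern as signed without relying on overflow. */
    wide = (int64_t)num;
    if (wide >= INT64_C(0x80000000))
        wide -= INT64_C(0x100000000);
    return (int32_t)wide;
}
```

Under these assumptions et_table_base("kadm") gives -1783126272 and
et_table_base("KADM") gives 739314944.  The first character ends up in
the top six bits, so any name whose first character maps to 32 or more
('f' through 'z', the digits, and '_') sets the sign bit; upper-case
names never do.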

On the bright side of things, I like the fact that Transarc's
compile_et doesn't try to run the C compiler.  I've become quite
fond of cross-compilation...

It would be nice if:
        error_msg checked for tables first, and only
                did the negative_message logic AFTER not finding a table.
        compile_et didn't automatically upper-case table
                names, and the AFS *.et files had upper-case
                names in them (compile_et would *then* need
                to lower-case table names, most probably.)
        ubik only considered a small range (perhaps, -500 or -50 to -1)
                as "find another server" type errors.
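
The ubik item can be sketched in C as two predicates: one for the
"every negative number is a communications error" behavior described
above, and one for the narrower range suggested here.  This is only an
illustration, not ubik's actual code; the function names and the
RETRY_FLOOR constant are hypothetical, and -500 is just one of the two
bounds floated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lower bound of the "find another server" range. */
#define RETRY_FLOOR (-500)

/* Behavior as described above: any negative code means "try the next server". */
bool retry_all_negative(int32_t code)
{
    return code < 0;
}

/* Suggested behavior: only a small negative range means "try the next server". */
bool retry_small_range(int32_t code)
{
    return code >= RETRY_FLOOR && code <= -1;
}

/* A few cases showing where the two predicates agree and differ. */
void demo_retry_checks(void)
{
    int32_t kadm_base = -1783126272;  /* negative base from the "kadm" table */

    assert(retry_all_negative(-1) && retry_small_range(-1));
    assert(retry_all_negative(kadm_base));   /* misread as a comm error */
    assert(!retry_small_range(kadm_base));   /* treated as a hard error */
    assert(!retry_all_negative(0) && !retry_small_range(0));
}
```

With the narrower test, a negative com_err code from a lower-case table
would be reported as a hard error instead of causing a pointless retry
against the next server.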

                                -Marcus Watts
                                UM ITD PD&D Umich Systems Group