Jay writes:
> The problem is that all authentication commands (klog, afs login, kas)
> get failed though other functionality (fs, bos) is working fine. When I
> issue 'klog', the output is like this;
>
> Unable to authenticate to AFS because Authentication Server was unavailable.
It seems the subroutine ka_UserAuthenticateGeneral
will return "Authentication Server was unavailable" if it gets
back KAUBIKCALL from the ka wrapper routines. That in turn
happens if the ubik call fails, and returns any error other
than one between 180480 and 180735 (a ka. error). Sadly, that's
not very informative; you don't know if it was ubik, rx, or
something else that failed. I believe you have two strategies: (1)
guess, and (2) find clues. You may be able to manage (1) by some
combination such as noticing that your time is a bit off, or
remembering that you did something funny to CellServDB, or some such.
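The error mapping described at the top of this reply can be sketched
roughly as follows. This is only an illustration of the logic as I
described it, not the actual AFS source: the function name
map_ubik_result is made up, KAUBIKCALL is represented by a placeholder
string rather than its real numeric value, and passing 0 (success)
through unchanged is an assumption.

```python
# Range of ka error codes, as given in the text (treated as inclusive).
KA_ERROR_MIN = 180480
KA_ERROR_MAX = 180735

# Placeholder for the real KAUBIKCALL code; its numeric value is not
# stated in this message, so a symbolic string is used instead.
KAUBIKCALL = "KAUBIKCALL"


def map_ubik_result(code):
    """Hypothetical sketch of what the ka wrapper does with a ubik result.

    ka errors are passed through unchanged. Any other failure is
    collapsed into KAUBIKCALL, which klog then reports as
    "Authentication Server was unavailable".
    """
    if code == 0:
        # Assumption: 0 means success and is returned as-is.
        return 0
    if KA_ERROR_MIN <= code <= KA_ERROR_MAX:
        return code
    # The original ubik/rx/other error code is lost at this point.
    return KAUBIKCALL
```

The last branch is why the message is so uninformative: whatever
actually failed is thrown away before klog ever sees it.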
Strategy (2), finding clues, is best done with a copy of klog that
isn't stripped. Basically, you need to run it under a debugger until
kawrap_ubik_Call returns, then find out what the error code is. This
is something you can best do in cooperation with your Transarc
Customer representative, both in terms of acquiring such a "klog", and
in terms of deciphering the resulting error codes (you may need to
step it through several calls to kawrap_ubik_Call in succession).
If you can't get ahold of such a klog, you may find it simpler to try
to get a packet trace. The header file rx/rx_packet.h describes
the rx packet layout. You will want to watch traffic to/from port
7004 (on the database server), and you will want to look for
records of type RX_PACKET_TYPE_ABORT (4) in the type field.
Such packets should have 60 bytes of UDP data, the byte at offset
20 should be the type, and the last 4 bytes of data should be
the AFS error code. You can translate it into a decimal number
and use translate_et to look up the error message. You should
compare the traffic with that from a healthy machine
to be sure you understand where it differs.
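If you have the raw UDP data from the trace, picking out the error
code can be sketched as below. This is a minimal sketch under the
layout described above (type byte at offset 20, error code in the last
4 bytes); the function name abort_code is made up, and it assumes the
code is a signed 32-bit integer in network byte order. Check
rx/rx_packet.h before trusting it.

```python
import struct

RX_PACKET_TYPE_ABORT = 4   # value from the text above
TYPE_OFFSET = 20           # offset of the type byte in the UDP data


def abort_code(udp_data):
    """Return the AFS error code from an rx abort packet, else None.

    udp_data is the UDP payload only (no IP or UDP headers).
    """
    # Need at least the type byte plus a 4-byte trailing error code.
    if len(udp_data) < TYPE_OFFSET + 1 + 4:
        return None
    if udp_data[TYPE_OFFSET] != RX_PACKET_TYPE_ABORT:
        return None
    # Assumption: signed 32-bit integer, network (big-endian) order.
    (code,) = struct.unpack("!i", udp_data[-4:])
    return code
```

The decimal number this returns is what you would hand to
translate_et.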
Another thing you may as well try, because it's easy, is "trace".
If something obvious like a "sendto" is failing, you may
learn enough from the "trace" to fix it. This might also be
useful in terms of telling you whether you are getting anything
back at all from the database server, etc.
-Marcus Watts
UM ITD PD&D Umich Systems Group