>>  1. Log at DEBUG level the entry into all CentralAuthenticationService
>> methods.
>
> If anything they should most likely be trace

TRACE is fine, even preferable.  You mention CAS4, but this has
sufficient value that it's worth implementing in CAS3 in the
CentralAuthenticationServiceImpl class.  It would probably be easiest
to avoid AOP in CAS3 since it's just a handful of places that need
attention.

>>  2. Log error/failure conditions at INFO or higher.
>
> That's what we do.  Though people have differing definitions of
> error/failure conditions. ;-)

Sure, but it was my intention to make a specific suggestion for
improvement.  In most cases before an exception is thrown in
CentralAuthenticationServiceImpl, a very clear and specific cause for
the error is logged at DEBUG.  We are currently missing that valuable
information in the logs because we don't put
org.jasig.cas.CentralAuthenticationServiceImpl in DEBUG.  I would
imagine most CAS deployers are missing that important information
because it's not logged by the default logger configuration.
Moreover, it would require code review to determine the specific
logger category to enable those messages.  The default configuration
should be sufficient to write INFO (or higher) messages for failures.
Simply replacing log.DEBUG(...) with log.INFO(...) in these cases
would remedy that.

> We've actually found the easiest way to
> diagnose any problem we have is to actually just look at the audit data
> provided by Inspektr.

The logs should tell a complete story, and they don't currently do
that.  For a service ticket validation failure, our logs only tell
half the story:

2009-11-02 00:00:14,164 INFO
[edu.vt.middleware.cas.VTCentralAuthenticationService] - Granted
service ticket [ST-36402-3OGekA5b32k2c6zYEAmp-auth.vt.edu] for service
[https://my.vt.edu/Login] for user [somevtuser]

This is the last entry before a ticket validation failure.  The only
way we know validation failed is by correlating with the client
application.  It's both reasonable and desirable to indicate that
ticket validation failed and to mention the exact cause.

While Inspektr data is an excellent supplement to log output, it is
unhelpful for diagnostics in our case.  The fact that data live in two
fundamentally places in our case, filesystem vs RDBMS, complicates
cross-correlation of data sources for troubleshooting.

>>  3. Place detailed information about failure conditions into
>> exceptions so it may be communicated to logs and/or services; ideally,
>> both.
>
> You example below is partially incorrect.  Each of those ticket exceptions
> has a code associated with it.  That code is mapped and internationalized to
> an error message listed in the CAS specification which is sent back to the
> client

I'm familiar with this implementation.  The CAS 2 protocol supports
both a code and a description, but only the former is passed to
clients in many important cases.  The following snippet from
ServiceValidateController illustrates this:

        } catch (final TicketValidationException e) {
            return generateErrorView(e.getCode(), e.getCode(), new
Object[] {serviceTicketId, e.getOriginalService().getId(),
service.getId()});
        } catch (final TicketException te) {
            return generateErrorView(te.getCode(), te.getCode(),
                new Object[] {serviceTicketId});

The second argument is the message/description, but since the error
code is duplicated, the detailed cause of the failure is never
communicated to clients.  I believe it would be very helpful to send
the detailed cause back in the ticket validation response.  The code
snippet I shared formerly, where the detailed error cause is placed in
the exception, would allow the following:

        } catch (final TicketValidationException e) {
            return generateErrorView(e.getCode(), e.getMessage(), new
Object[] {serviceTicketId, e.getOriginalService().getId(),
service.getId()});
        } catch (final TicketException te) {
            return generateErrorView(te.getCode(), te.getMessage(),
                new Object[] {serviceTicketId});

> For the most part I believe people have been happy with the error messages 
> returned.

I'd like to offer a rebuttal in the form of a goofy food metaphor
that's close to my heart.  You never know what you're missing in apple
pie if you've never had one that's homemade.  The current error
messages aren't bad, but the additional information would be great for
troubleshooting.

M

-- 
You are currently subscribed to [email protected] as: 
[email protected]
To unsubscribe, change settings or access archives, see 
http://www.ja-sig.org/wiki/display/JSG/cas-dev

Reply via email to