Hi Marvin,

Thanks for your reply. I saw your stat page for Virginia Tech while posting to this mailing list.
I would appreciate it if somebody could provide a patch showing all the customizations/diffs needed to provide a health check URL that at least tests the availability of the CAS web application. The patch could be added to this wiki page, https://wiki.jasig.org/display/CASUM/7.+Monitoring+and+Management, so that new users like me can implement CAS easily in high-availability systems.

Should checking the availability of LDAP and the database be part of the health check for CAS? I am not sure, because if the database is down for some reason (or slow enough to trigger the timeout threshold in the load balancer), the proxy/load balancer could take all the CAS nodes out of the cluster.

Thanks,
Mahmudul Hasan

On Fri, Nov 4, 2011 at 7:48 AM, Marvin Addison <[email protected]> wrote:
> > Has anyone implemented any public healthcheck page for CAS ?
>
> We have one that's specific for Virginia Tech but it could be better,
> and I've been meaning to open an issue for an improved one that ships
> with CAS. At the least it should health check all connection pools
> (LDAP, database, etc). Strictly speaking all LDAP context sources
> should be tested since not all LDAP connections are pooled (e.g. LDAP
> auth handlers). Anything else?
>
> What's vitally important is that the controller for the health check
> view be configurable such that it can be tuned by deployers to expose
> health information in a way that supports various monitoring and
> management needs. A numeric scale, or better an HTTP-like scale,
> seems reasonable.
> A straw-man proposal:
>
> - 100 - no data available
> - 200 - normal
> - 4xx - health check succeeded outside of threshold, where the xx
>   digits are the number of checks out of threshold
> - 5xx - health check failure, where the xx digits represent the number
>   of individual check failures
>
> If the result codes above are implemented as HTTP status codes, I
> would imagine most health check tools that want a 2xx response for
> success would work naturally, and more sophisticated tools could map
> codes to custom behavior. This also allows for detailed health check
> results to be returned as text in the response for human review or
> tooling.
>
> Defining thresholds for the 4xx codes may prove tricky, but it seems
> valuable to at least consider. Thoughts on that area welcome.
>
> Please consider this proposal and provide feedback. Once we agree on
> something that has the flexibility to suit most deployer needs, I'll
> create the issue and work on it for the next release.
>
> M
>
> --
> You are currently subscribed to [email protected] as:
> [email protected]
> To unsubscribe, change settings or access archives, see
> http://www.ja-sig.org/wiki/display/JSG/cas-user
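
To make the straw-man scale quoted above concrete, here is a minimal sketch of how individual check results might be folded into a single code. It is only an illustration of the proposal, not anything that ships with CAS: the class HealthStatusSketch, the Outcome enum, and the statusCode method are all hypothetical names. It also assumes two things the proposal does not state: that failures take precedence over out-of-threshold results when both occur, and that the two-digit count is capped at 99.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the proposed scale; none of these names exist in CAS. */
final class HealthStatusSketch {

    /** Result of one individual check, e.g. one LDAP context source or one connection pool. */
    enum Outcome { OK, OUT_OF_THRESHOLD, FAILED }

    static final int NO_DATA = 100; // no data available
    static final int NORMAL = 200;  // every check passed within its threshold

    /** Folds the individual outcomes into one code on the proposed 100/200/4xx/5xx scale. */
    static int statusCode(List<Outcome> outcomes) {
        if (outcomes.isEmpty()) {
            return NO_DATA;
        }
        int outOfThreshold = 0;
        int failed = 0;
        for (Outcome outcome : outcomes) {
            if (outcome == Outcome.FAILED) {
                failed++;
            } else if (outcome == Outcome.OUT_OF_THRESHOLD) {
                outOfThreshold++;
            }
        }
        // Assumption: failures outrank threshold violations; the proposal leaves this open.
        if (failed > 0) {
            return 500 + Math.min(failed, 99); // xx = number of failed checks, capped at 99
        }
        if (outOfThreshold > 0) {
            return 400 + Math.min(outOfThreshold, 99); // xx = number of checks out of threshold
        }
        return NORMAL;
    }

    public static void main(String[] args) {
        System.out.println(statusCode(Collections.<Outcome>emptyList()));                  // 100
        System.out.println(statusCode(Arrays.asList(Outcome.OK, Outcome.OK)));             // 200
        System.out.println(statusCode(Arrays.asList(Outcome.OK, Outcome.OUT_OF_THRESHOLD))); // 401
        System.out.println(statusCode(Arrays.asList(Outcome.FAILED, Outcome.FAILED)));     // 502
    }
}
```

A load balancer that only looks for a 2xx response would treat everything except 200 as unhealthy under this sketch. One thing to keep in mind if the codes are sent as real HTTP statuses: values such as 100, 401 and 503 already have defined HTTP meanings that some clients act on.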
