All,

This morning one of our production SSO servers ran out of Java heap memory while trying to add to the (ehcache) ticket cache, after about 6 days of continuous operation. The server hung in such a way that the other production server locked up as well, waiting for ticket replication to complete. The entire service was down until we killed off the affected server (we actually had to "kill -9" the tomcat process). Needless to say, this is not a good thing, and we need to take steps to make sure it doesn't happen again.
We've found where to increase the maximum heap size for tomcat, but we would also like to monitor memory usage so we can act proactively if we start to run out. We've looked at https://<server>/cas/status but can't tell whether the memory usage reported there includes the ehcache ticket registry or excludes it.

I know I've asked this question before and never gotten an answer (maybe no one knows), but is there any way we can monitor the number of tickets (TGTs, LTs, STs) in an ehcache? The performance monitor that comes with the default server states that it only works for the in-memory and JPA-based ticket registries.

Finally, are there any "best practices" surrounding regular restarts of the servers (any known memory leaks, etc.), and how often would those restarts need to happen?

Dave
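
For reference, here is a minimal sketch of the kind of heap check described above, using only the standard java.lang.management API. The class name HeapMonitor and the 85% threshold are hypothetical, chosen for illustration; nothing here comes from CAS or ehcache. It reports JVM-wide heap usage, which necessarily includes anything cached on-heap in that JVM, but it does not count tickets.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; not part of CAS or ehcache.
class HeapMonitor {

    /** Fraction of the heap ceiling currently in use (greater than 0, at most 1). */
    static double usedFraction() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        // getMax() returns -1 when no maximum is defined; fall back to committed.
        long ceiling = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / ceiling;
    }

    /** True when heap usage has reached the given fraction of the ceiling. */
    static boolean isAboveThreshold(double threshold) {
        return usedFraction() >= threshold;
    }

    public static void main(String[] args) {
        double threshold = 0.85; // assumed alert level, tune to taste
        System.out.printf("heap used: %.1f%%%n", usedFraction() * 100);
        if (isAboveThreshold(threshold)) {
            System.err.println("WARNING: heap usage is above the alert threshold");
        }
    }
}
```

This only reports on the JVM it runs in, so to watch tomcat it would have to run inside the webapp (for example on a timer) or be adapted to read the same MemoryMXBean over remote JMX.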
