Hello,

We have a two-node (active-active) CAS 4.0 environment backed by a JPA ticket registry on an active-active MySQL setup. This setup has been running for about 3 to 4 months now, but we are experiencing problems while cleaning the registry because we have a long TGT timeout: 90 days. The idea is basically that users don't have to log in again if they don't want to.
This ran fine for about a month, until TGTs started to pile up (this was expected of course). After that we experienced the same problems as described here: https://lists.wisc.edu/read/messages?id=11402745. I fixed those by making sure that we are 'streaming' the tickets (forward-only ResultSet) and only adding them to the list of expired tickets if they are actually expired. This ran fine for about another two months, which brings us to now.

I again received a report of OutOfMemory errors and couldn't figure out where it went wrong, since I had tested the previous setup against a database with close to 650K TGTs and the cleaner ran fine. When I got a heap dump I was shocked to see that there were tickets in there which took close to 10 MB of memory each (deserialized). After investigation, pretty much all of this was allocated in the 'services' map that is stored in the TGT.

As far as I understand (and please correct me if I'm wrong) this is a map {ServiceTicketId -> Service}, and an entry is put into this map whenever an ST is granted for this TGT. So this is (seems to be?) an ever-growing map for every TGT, and thus every ticket in the database grows over time. That is not a sustainable situation in our case, since we have this long timeout.

I'm thinking this is because of the Single Logout feature: at logout we can retrieve which STs have actually been given out and whether or not we need to trigger a (backchannel) logout on that service. However, for that use case it's not really logical to keep track of all STs that have ever been given out, right? Consider this case:

1. user has TGT-1
2. user goes to service 1 and generates/validates ST-1 for TGT-1
3. user gets a client session for service 1
4. user goes to lunch and the client session times out for service 1
5. user uses service 1 again and generates/validates ST-2 for TGT-1
6. user gets a new client session for service 1
7. ...
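The steps above can be sketched in a few lines of Java. This is only an illustration of the behaviour as I understand it, not the actual CAS code: the class `SketchTgt`, its methods and the service URL are all made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a TGT; this is NOT the real CAS class.
class SketchTgt {
    // Mirrors the 'services' map as I understand it: ServiceTicketId -> service.
    private final Map<String, String> services = new LinkedHashMap<>();
    private int issued = 0;

    // Every granted ST adds a new entry; nothing ever removes one.
    String grantServiceTicket(String service) {
        String stId = "ST-" + (++issued);
        services.put(stId, service);
        return stId;
    }

    int trackedEntries() {
        return services.size();
    }
}

class ServicesMapGrowthDemo {
    public static void main(String[] args) {
        SketchTgt tgt = new SketchTgt();
        // Assumption for the example: one client session timeout per day
        // over the 90-day TGT lifetime, always on the same service.
        for (int day = 0; day < 90; day++) {
            tgt.grantServiceTicket("https://service1.example.org");
        }
        System.out.println("entries for a single service: " + tgt.trackedEntries());
    }
}
```

Under these assumptions a single service already accounts for 90 entries in one TGT, one per ST ever granted, even though only the last client session can still be alive.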
At this point TGT-1 contains two entries in the 'services' map (ST-1 -> service 1, ST-2 -> service 1). But I would say a logout for the first client session (3) doesn't ever need to happen again. The session timeout is not explicitly signalled to CAS of course, but isn't the fact that another ST is generated/validated for the same service (5) enough of a signal to drop any information about ST-1 from the 'services' map? Is this information still relevant for anything else? If not, I would have expected this map to be structured the other way around, { Service -> ServiceTicket }, to keep track of only the latest ST given out for a given service.

Any insights into how best to remedy this situation? This code is so deep in CAS that I can't really work around it as far as I can see. I was thinking of a background thread (kind of like the cleaner) that prunes these services maps with some nasty reflection stuff :-S, but I'd rather have another solution of course.

-- Auke
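P.S. To make the inverted structure concrete, here is a rough sketch of what I have in mind. It is not based on the CAS sources; `LatestTicketPerService` and its methods are hypothetical names, and services are represented as plain strings for brevity.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, NOT the real CAS TGT class.
class LatestTicketPerService {
    // Inverted structure: service -> most recent ServiceTicketId.
    private final Map<String, String> latestTicketByService = new HashMap<>();

    // A new ST for the same service replaces the previous one.
    // Returns the replaced ST id, or null if this is the first ST for the service.
    String recordServiceTicket(String service, String serviceTicketId) {
        return latestTicketByService.put(service, serviceTicketId);
    }

    // What a single logout would iterate over: at most one entry per service.
    Map<String, String> logoutTargets() {
        return new HashMap<>(latestTicketByService);
    }
}

class LatestTicketPerServiceDemo {
    public static void main(String[] args) {
        LatestTicketPerService tgt = new LatestTicketPerService();
        tgt.recordServiceTicket("service 1", "ST-1");
        tgt.recordServiceTicket("service 1", "ST-2"); // replaces ST-1
        tgt.recordServiceTicket("service 2", "ST-3");
        System.out.println(tgt.logoutTargets().size()); // prints 2
    }
}
```

With this layout the size of the map is bounded by the number of distinct services a user visits, not by the number of STs granted over the lifetime of the TGT.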
