While rate limiting may be a way to patch over the problem, I think everyone is somewhat missing the point here: when an ST for Google is created, something is also created on the other server that does not seem to be GCed in a timely fashion, and that is NOT happening on the server where the ST was created. That's Just Not Right® :-)

The high rate of Google logins just makes the issue more evident. It likely exists for every Google login and, if left uncorrected, will eventually exhaust the available heap memory as surely (if not as quickly) as when there are 3000+ logins for a single user inside of an hour or so. That's assuming that whatever is being created never gets GCed, of course.

We do have the default ticket registry cleaner running on those systems as well, just as an added precaution, but this seems to be something that process does not touch: we've seen at least one successful clean operation on an affected server that did not free more than a tiny percentage of the memory used. That would make sense to me, since this does not appear to be an actual ticket but something related to the Google SAML validation process instead.

On 12/4/14 3:10 PM, Waldbieser, Carl wrote:
> I found login rate limiting info here:
>
> https://wiki.jasig.org/display/CASUM/Throttling+Login+Attempts
>
> Is there a means to limit how many ST validations are allowed per user in a
> given unit of time?
>
> Thanks,
> Carl Waldbieser
> ITS Systems Programmer
> Lafayette College
>
> ----- Original Message -----
> From: "Trenton D. Adams" <[email protected]>
> To: [email protected]
> Sent: Tuesday, December 2, 2014 7:23:40 PM
> Subject: Re: [cas-user] Rapid Memory Consumption and Interpreting Heap Dump
>
> It does have a way of rate limiting per user, check the docs. :D
>
> On 14-12-02 05:17 PM, Carl Waldbieser wrote:
>> Dave,
>>
>> How many logins?
>> We recently had a misconfigured cas client from a vendor almost fill
>> /var. It was tens of thousands of logins.
>>
>> It would be nice if cas had some way to rate limit ST and login requests
>> per user.
>>
>> Thanks,
>> Carl
>>
>> On Dec 2, 2014 3:26 PM, "David A. Kovacic" <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>     I'm not sure how or where you would mark this as a singleton
>>     instance - although if you go back to an actual Google web page
>>     multiple times from the same browser session you reuse the ST if
>>     that's what you mean. This actually looked like multiple logins
>>     from a single user over the span of about 30 minutes. Not sure if
>>     this was some poorly written webapp logging in several times or what.
>>
>>     On 12/2/14 1:32 PM, Erik-Paul Dittmer wrote:
>>>     Rapid heap memory consumption (which are not garbage collected)
>>>     *can* be caused by unfinished Spring Webflow flow sessions; this
>>>     is something we have observed. However, when looking at your
>>>     memory dump, the majority of the instances (and size) is being
>>>     claimed by the GoogleAccountService. Perhaps this is not marked as
>>>     a singleton instance?
>>>
>>>     On Tue, Dec 2, 2014 at 6:38 PM, David A. Kovacic <[email protected]
>>>     <mailto:[email protected]>> wrote:
>>>
>>>         All,
>>>
>>>         Yesterday evening one of our CAS 4.0.0 servers went from under
>>>         a GB of heap usage to 3GB in a matter of about 10 minutes.
>>>         The end result was that again the SSO service died (one server
>>>         with a heap memory OoM error and the other trying to replicate
>>>         the ehcache to the dead server). This was definitely not a
>>>         memory leak issue as the servers had been restarted only
>>>         earlier that morning, so they had only been up for about 17
>>>         hours or so. Our system monitors also indicated that the
>>>         memory usage rather suddenly skyrocketed (over the course of
>>>         about 20 minutes) so we suspect that the memory consumption is
>>>         a symptom of some other issue.
>>>
>>>         We have a heap dump but I am having a bit of trouble trying to
>>>         analyze it with jvisualvm as I have never used the tool
>>>         before. If I am interpreting the dump correctly, it appears
>>>         that tickets only play a very small part of the overall memory
>>>         usage (see screen shot).
>>>
>>>         Has anyone heard or experienced anything like what we are
>>>         seeing? This is becoming increasingly frustrating as every
>>>         time we think we have the issues resolved and turn our
>>>         attention elsewhere one server or the other crashes and takes
>>>         the service down with it.
>>>
>>>         Dave
>>>
>>>         --
>>>         You are currently subscribed to [email protected]
>>>         <mailto:[email protected]> as: [email protected]
>>>         <mailto:[email protected]>
>>>         To unsubscribe, change settings or access archives,
>>>         see http://www.ja-sig.org/wiki/display/JSG/cas-user
>>>
>>>     --
>>>     Erik-Paul Dittmer
>>>     T: REDACTED
>>>
>>>     Visit us at http://www.digitalmisfits.com
>>>
>>>     - - - - - - - - - - - - - - - - - - - - - - - - - -
>>>     Digital Misfits does not accept any liability for any errors,
>>>     omissions, delays of receipt or viruses in the contents of this
>>>     message which arise as a result of e-mail transmission.
