Exactly right. If we see 3000 STs created in an hour for a user, all for Google, the audit logs in the Google admin console also show 3000 successful logins over the same period. As far as I can tell, whatever triggers this (and it may be some form of malware being used to send spam through us) obtains credentials once (only one TGT is ever created) and then somehow performs more than 1000 logins to Google in about an hour without ever logging back out of Google.

As far as we can tell, the logins themselves complete without errors. However, the Google SAML process seems to leave fairly large remnants of instances in the heap, and those remnants are not being garbage-collected in a timely fashion, so we run out of heap memory and the SSO server process locks up, taking the other server with it.

To summarize, the problem seems to be not the SAML process itself but the VOLUME of SAML processes in a very short time.
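
For anyone who wants to try the SAMLRequest comparison suggested in the quoted thread below, here is a rough Python sketch. It assumes the AuthnRequest from Google arrives via the SAML HTTP-Redirect binding (Base64 over raw DEFLATE) and that the access log records the full query string; neither has been confirmed for our setup. The function names are made up for illustration, and the request URLs would have to be pulled out of the access log by hand or by a separate script.

```python
import base64
import urllib.parse
import xml.etree.ElementTree as ET
import zlib
from collections import Counter


def decode_saml_request(b64_value: str) -> str:
    """Turn an already URL-decoded SAMLRequest value into XML text.

    Assumption: HTTP-Redirect binding, i.e. Base64 over raw DEFLATE.
    """
    raw = base64.b64decode(b64_value)
    # wbits=-15 tells zlib to expect a raw DEFLATE stream with no header.
    return zlib.decompress(raw, -15).decode("utf-8")


def request_id(xml_text: str) -> str:
    """Return the ID attribute of the AuthnRequest root element."""
    return ET.fromstring(xml_text).attrib["ID"]


def count_request_ids(urls):
    """Count how often each AuthnRequest ID occurs in the given request URLs."""
    counts = Counter()
    for url in urls:
        query = urllib.parse.urlparse(url).query
        # parse_qs URL-decodes the parameter values for us.
        for value in urllib.parse.parse_qs(query).get("SAMLRequest", []):
            counts[request_id(decode_saml_request(value))] += 1
    return counts
```

If one ID shows up hundreds of times, the same request is being replayed, which would point toward something local (browser, cache). Many distinct IDs would suggest Google is issuing a fresh request each time.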

On 12/11/14 9:01 PM, Sean Baker wrote:
> Now that's interesting -- is that to say that when you see these
> rapidly-generated service tickets for particular users you're seeing
> them logging in as many times to Google as well?
>
> On 12/11/14, 14:17 PM, David A. Kovacic wrote:
>> Google seems to be accepting the assertions each time as we are
>> seeing the same number of logins in Google's audit logs as the number
>> of STs being created. I would expect that if there was something
>> wrong with assertion we would be receiving complaints from the
>> users. I am more inclined at this point to believe some sort of
>> crazy browser loop, but it's definitely not happening with any
>> consistency.
>>
>> We have tried contacting the two people we identified once we started
>> to get a handle on what the issue was, however neither has
>> responded. That's not terribly surprising given that we are in our
>> finals period here and requests for information go pretty much
>> ignored by students and faculty alike at this time.
>>
>> Dave
>>
>> On 12/10/14 8:14 PM, Sean Baker wrote:
>>> Your access logs should show the individual SAMLRequest's generated by
>>> Google; if it's rejecting your assertions in some automated way you
>>> should see a new SAMLRequest each time. If it's the same request over
>>> and over, one might infer a more local issue (not definitively mind you;
>>> just much more likely) [ehcache issue, browser configuration, etc.].
>>>
>>> Has anyone talked with your end users who're triggering these events
>>> about what they experienced?
>>>
>>> On 12/10/14, 15:16 PM, David A. Kovacic wrote:
>>>> Does anyone know what I would need to do to be able to log the
>>>> actual SAML transactions? Is there any way to actually do that?
>>>> We have isolated this issue to only logins to Google and only under
>>>> certain conditions when something seems to start looping and
>>>> generating STs rapidly. We are trying to isolate the conditions
>>>> under which the loop starts.
>>>>
>>>> It would be helpful to actually see the SAML transactions being
>>>> generated so we could begin to get a handle on what Google apps is
>>>> being referenced and if Google is returning any errors or not
>>>> (although Google claims valid logins).
>>>>
>>>>
>>>> On 12/6/14 9:11 AM, Marvin Addison wrote:
>>>>>
>>>>> Second, the massive number of STs are being created on only
>>>>> one server (we can tell by the host name in the logged ST) but
>>>>> the OTHER SERVER is where the memory is growing out of bounds.
>>>>>
>>>>>
>>>>> I'm still working through this thread, but I wanted to point out
>>>>> that the other is hurting likely because of load balancer session
>>>>> affinity. Recall that ticket validation is a back-channel call,
>>>>> and the network source differs from that of the user's browser. In
>>>>> our environment, services typically get stuck on one node causing
>>>>> hot spots. This is because the service is validating tickets
>>>>> frequently enough that the session affinity timeout never kicks in.
>>>>>
>>>>> M
>>>>>
>>>>> --
>>>>> You are currently subscribed to [email protected] as: [email protected]
>>>>> To unsubscribe, change settings or access archives, see
>>>>> http://www.ja-sig.org/wiki/display/JSG/cas-user
