Let me explain the Ticket Serialization collision problem with an example.

We imagine a user we call Tabguy. He is already logged into CAS SSO, but he 
likes to use the option in his browser to open a group of bookmarks as separate 
tabs.
We assume that the browser will manifest the contents of the tabs immediately 
using threading rather than waiting for him to click on the tab and display the 
page.
So more or less simultaneously the browser generates two or more requests for 
applications that use CAS, and because there is a TGT and a cookie they are 
non-interactive requests for new STs.
One of the tabs happens to come first and logs in. Because of the SingleSignOut 
support, this not only generates the ST but also adds an entry to the Map in 
the TGT of issued STs and their Services.
The ST is put into the TicketRegistry and disappears into the mumbleCache (eh-, 
mem-, jboss-) implementation of TicketRegistry. In practice, the default 
(paranoid) configuration of all CAS TicketRegistries causes a synchronous 
replication of the ST, but because the ST has a reference to the TGT the 
writeObject() method also tries to make a copy of the TGT.
[Once upon a time (2.4.2) the TGT also had a table of references to STs, but 
this is no longer the case. So, thankfully, you only get the ST and the TGT 
(and its associated Authentication, Principal, Credentials), but no other STs.]
However, the Web Server is multithreaded, and it has assigned a second thread 
to handle the second tab, and so on. The CPU is multicore, so the threads run 
concurrently.
At some point, the second tab thread issues an ST. As part of that process, the 
thread is trying to add a new ST ID and Service to the Map in the TGT 
maintained for SingleSignOut.
Meanwhile, the first thread is trying to Serialize the ST. Since there is a 
reference to the TGT, it also Serializes the TGT. Because the TGT has a Map, 
serialization internally obtains an iterator over the Map. It starts to iterate 
entries in the Map.
Now back to the second thread: it is trying to add an entry to the Map that the 
first thread is trying to iterate through. That is a big NO-NO in Java. It 
will work 99.9999% of the time, and maybe all the time if the map is small. But 
at some point adding a new element to the Map reorganizes it enough to break 
the iterator, and then you get a ConcurrentModificationException.
TicketRegistries synchronize addTicket operations with each other, but you 
cannot synchronize with the writeObject() because that call is buried somewhere 
in mumbleCache.
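
The iterator failure itself can be reproduced without any threads. The sketch 
below is only a single-threaded stand-in for the race: the for loop plays the 
serializing thread and the put() plays the second tab. The class name, ticket 
IDs and URLs are made up for illustration and are not CAS code.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

public class IteratorCollisionDemo {

    // Returns true if adding to the map in mid-iteration broke the iterator.
    public static boolean addWhileIterating() {
        // Hypothetical stand-in for the TGT's Map of issued STs and Services.
        Map<String, String> services = new HashMap<>();
        services.put("ST-1", "https://app1.example.org");
        services.put("ST-2", "https://app2.example.org");
        services.put("ST-3", "https://app3.example.org");

        boolean added = false;
        try {
            for (String stId : services.keySet()) {
                if (!added) {
                    // Structural modification while an iterator is open.
                    services.put("ST-4", "https://app4.example.org");
                    added = true;
                }
            }
        } catch (ConcurrentModificationException e) {
            // The fail-fast iterator noticed the change on its next step.
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("iterator broken: " + addWhileIterating());
    }
}
```

In the real collision the put() comes from another thread, so whether the 
iterator notices depends on timing rather than being guaranteed as it is here.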

The solution is to modify the class being serialized (TicketGrantingTicketImpl) 
and add a standard bit of boilerplate:

private synchronized void writeObject(ObjectOutputStream s) throws IOException {
    s.defaultWriteObject();
}

This is a Java idiom that tells the Serialization mechanism to lock the object 
before serializing (and iterating through) it.
This automatically synchronizes with any method of the class that is also 
declared to be synchronized.  So if the methods that add an entry to the Map 
are declared to be synchronized then the Map.put will wait for the iteration or 
the iteration will wait for the Map.put to end and all is safe.
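
A minimal sketch of how the two synchronized methods pair up, assuming a 
made-up class (SynchronizedTicketSketch) in place of TicketGrantingTicketImpl 
and a plain ObjectOutputStream in place of mumbleCache. One thread keeps adding 
entries while another keeps serializing; both take the same object lock.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class SynchronizedTicketSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Hypothetical stand-in for the Map of issued STs and their Services.
    private final Map<String, String> services = new HashMap<>();

    // Takes the object lock, so it cannot overlap with writeObject().
    public synchronized void addService(String stId, String service) {
        services.put(stId, service);
    }

    public synchronized int serviceCount() {
        return services.size();
    }

    // The boilerplate: serialization runs while holding the object lock.
    private synchronized void writeObject(ObjectOutputStream s) throws IOException {
        s.defaultWriteObject();
    }

    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes repeatedly while a second thread adds entries, then
    // returns the entry count seen in a final deserialized copy.
    public static int roundTripUnderLoad() {
        SynchronizedTicketSketch tgt = new SynchronizedTicketSketch();
        Thread secondTab = new Thread(() -> {
            for (int i = 0; i < 2000; i++) {
                tgt.addService("ST-" + i, "https://app.example.org/" + i);
            }
        });
        secondTab.start();
        try {
            while (secondTab.isAlive()) {
                serialize(tgt);
            }
            secondTab.join();
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(serialize(tgt)))) {
                return ((SynchronizedTicketSketch) in.readObject()).serviceCount();
            }
        } catch (IOException | ClassNotFoundException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The lock only helps if every method that touches the Map is synchronized on 
the same object; a single unsynchronized put() reopens the window.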

This presupposes that mumbleCache was written by people smart enough not to try 
to serialize one object while they hold the lock on another object in the same 
category. The Java idiom is widely enough used that it is unlikely that anyone 
would be dumb enough to do that, but we have to accept mumbleCache as a black 
box because that is the deal.
THIS IS THE ISSUE. The fix is easy, but everyone has to sign off that they 
believe that none of the TicketRegistry implementations was written by people 
who do not know how to handle synchronized objects. Getting that agreement is 
the entire problem.

Note that this problem is not really load related. Tabguy can create the 
problem on a CAS nobody else is using. It depends on two concurrent threads in 
the Web Server processing two concurrent requests from the same browser 
colliding at exactly the right moment. The window is so small that you rarely 
see it.

[Though back in 2.4.2, when the TGT had a table of references to STs, 
serialization could process hundreds of tickets and megabytes of data, and the 
iterator had to remain valid through the entire process, so the window was 
enormous and it was easy to hit.]

From: Misagh Moayyed [mailto:mmoay...@unicon.net]
Sent: Thursday, December 11, 2014 12:25 PM
To: cas-dev@lists.jasig.org
Subject: RE: [cas-dev] Reducing CASImpl's complexity: ArgExtractors and more

Small note on the serialization issue before diving deeper: Part of the 
difficulty here is the assumption that the entire object could be serializable 
which would make it challenging when things start to cross reference each other 
in a non-trivial way. I suppose if the design separated the actual object model 
from the serialization model this problem would be reduced to some extent. 
Something like a serialization proxy might work well (which I believe is 
something Scott recently did with the CAS client) but it is still lots of 
boilerplate code.
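
For reference, the serialization proxy pattern can be sketched roughly as 
follows. This is a generic illustration under assumed names 
(ProxiedTicketSketch, SerializedForm), not the code from the CAS client: the 
live object is never written directly, and writeReplace() hands the stream a 
small snapshot taken under the object's lock.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public final class ProxiedTicketSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String id;
    private final Map<String, String> services = new HashMap<>();

    public ProxiedTicketSketch(String id) {
        this.id = id;
    }

    public String id() {
        return id;
    }

    public synchronized void addService(String stId, String service) {
        services.put(stId, service);
    }

    // Copy made under the lock, so callers never iterate the live Map.
    public synchronized Map<String, String> services() {
        return new HashMap<>(services);
    }

    // Serialization writes the proxy instead of this object.
    private Object writeReplace() {
        return new SerializedForm(id, services());
    }

    // Reject any stream that tries to bypass the proxy.
    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("Proxy required");
    }

    // The serialization model, kept separate from the object model.
    private static final class SerializedForm implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String id;
        private final HashMap<String, String> snapshot;

        SerializedForm(String id, Map<String, String> services) {
            this.id = id;
            this.snapshot = new HashMap<>(services);
        }

        // Rebuild the real object through its public API on deserialization.
        private Object readResolve() {
            ProxiedTicketSketch ticket = new ProxiedTicketSketch(id);
            snapshot.forEach(ticket::addService);
            return ticket;
        }
    }

    public static ProxiedTicketSketch roundTrip(ProxiedTicketSketch ticket) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(ticket);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ProxiedTicketSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```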

From: Jérôme LELEU [mailto:lel...@gmail.com]
Sent: Thursday, December 11, 2014 6:24 AM
To: cas-dev@lists.jasig.org
Subject: Re: [cas-dev] Reducing CASImpl's complexity: ArgExtractors and more

Hi,

Thanks for jumping into the discussion. All opinions are welcome, and not only 
from committers.

You raise an interesting idea about the relationships between tickets. So far, 
real service ticket objects are held inside TGTs instead of simple 
identifiers. We could use identifiers: it would certainly make things easier 
for Serialization but would require more steps to get the information: two 
steps to get all the service tickets of a TGT instead of one.
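
The identifier approach and its two-step lookup might look roughly like this. 
It is only a sketch: IdLookupSketch and its nested classes are hypothetical 
and do not correspond to the actual CAS ticket or TicketRegistry classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdLookupSketch {

    public static final class ServiceTicket {
        public final String id;
        public final String service;

        public ServiceTicket(String id, String service) {
            this.id = id;
            this.service = service;
        }
    }

    public static final class GrantingTicket {
        public final String id;
        // Only identifiers are held, so serializing a TGT never drags STs along.
        public final Set<String> serviceTicketIds = ConcurrentHashMap.newKeySet();

        public GrantingTicket(String id) {
            this.id = id;
        }
    }

    // Hypothetical registry keyed by ticket identifier.
    private final Map<String, ServiceTicket> registry = new ConcurrentHashMap<>();

    public ServiceTicket grant(GrantingTicket tgt, String stId, String service) {
        ServiceTicket st = new ServiceTicket(stId, service);
        registry.put(stId, st);
        tgt.serviceTicketIds.add(stId);
        return st;
    }

    // Step 1: read the ids from the TGT. Step 2: resolve each id in the registry.
    public List<ServiceTicket> serviceTicketsOf(GrantingTicket tgt) {
        List<ServiceTicket> result = new ArrayList<>();
        for (String stId : tgt.serviceTicketIds) {
            ServiceTicket st = registry.get(stId);
            if (st != null) { // the ticket may have expired or been removed
                result.add(st);
            }
        }
        return result;
    }
}
```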


