Seems reasonable to me. I don't think there is anything inherently bad with synchronous replication; you just need a good reason, like the one you have here, to accept the extra cost of sending the replication data on every cache put.
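For anyone following along, the setting under discussion might look roughly like the fragment below in ehcache.xml. This is only a sketch: the cache name and the sizing/expiry attributes are made up for illustration, and the replication properties are simply the values mentioned in this thread (replicateAsynchronously=false, replicatePuts=true, replicateUpdatesViaCopy=true), not a copy of the actual uPortal configuration.

```xml
<!-- Hypothetical cache name, sizing and TTL; only the replication
     property values are taken from this thread. -->
<cache name="examplePgtCache"
       maxElementsInMemory="1000"
       eternal="false"
       timeToLiveSeconds="60"
       overflowToDisk="false">
  <!-- replicateAsynchronously=false: send the replication message
       immediately on put instead of batching it in the background.
       replicateUpdatesViaCopy=true: copy the element itself to peers
       rather than sending a remove message. -->
  <cacheEventListenerFactory
      class="net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory"
      properties="replicateAsynchronously=false, replicatePuts=true,
                  replicateUpdates=true, replicateUpdatesViaCopy=true,
                  replicateRemovals=true" />
</cache>
```

Because replication is configured per cache, a fragment like this would only affect the one cache it is attached to.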
On Fri, Feb 7, 2014 at 12:24 PM, James Wennmacher <[email protected]> wrote:

> As an update to the group, one minor difference from what Eric suggested that I thought I'd mention to get group feedback on (and to share general knowledge) is that for the CAS Clearpass Proxy-Granting Tickets (PGTs) replication, I'm setting replicateAsynchronously=false to force synchronous replication, which is really just an immediate channel.send() and doesn't wait for the peer acknowledgement (see http://jira.terracotta.org/jira/browse/EHC-874). I don't think this has a significant performance impact, and there is no chance of deadlock; the goal is to reduce the chance that the uPortal server the user is on attempts to obtain the PGT before it has been replicated to it.
>
> I'm adding commented-out configuration and wiki documentation about clustered CAS Clearpass so this should be much easier for anyone who wants to do it in the future.
>
> As a background for those interested (and I'll add this to the wiki documentation), the way the CAS Clearpass works with uPortal is:
>
> 1. When uPortal receives the service ticket in the URL from the CAS redirect after the user enters their credentials in CAS, uPortal does the service ticket validation with CAS to get the userId and verify authentication is good. uPortal always requests the clearpass PGT in the service ticket validation.
>
> 2. If enabled on CAS, CAS will not respond to uPortal until CAS initiates sending a PGT to uPortal and getting a response (unsure what happens if there is a failure). At that point, one of the uPortal servers has the PGT so CAS will reply to the service ticket validation.
>
> 3. uPortal at that point needs access to the PGT if it needs to provide the user's password to a portlet.
>
> I set replicateAsynchronously=false so the uPortal server that CAS has invoked with the PGT will at least fire off the replication request.
> The expectation and hope is that by the time CAS receives the PGT POST response and responds back to the original uPortal server for the service ticket validation, the PGT replication packet will be received and processed by peer uPortal nodes (or at least the one the user has logged into) and the PGT will be available in ehcache when the uPortal server the user has logged into receives the service ticket response from CAS.
>
> I hope all this makes sense. If anyone sees a problem with setting replicateAsynchronously=false let me know.
>
> Thanks!
>
> James Wennmacher - Unicon
> 480.558.2420
>
> On 02/07/2014 11:00 AM, Eric Dalquist wrote:
>
> That is correct. You can set those ports if you need to, I believe we do at UW to deal with TCP firewalls but they are not required in most cases.
>
> One thing to note is the important data in that table is stored in the BLOBs. jGroups only supports using Java serialization to store their PhysicalAddress class. I added the human-readable toString output columns so it would be easier to debug.
>
> For debugging/monitoring all of this take a look at JMX (jConsole) data. The jGroups instance registers itself and exposes a ton of useful data via JMX. You can see who all the members of a current view are, who is the coordinator, traffic stats, etc.
>
> There is also some vestigial code in https://github.com/Jasig/uPortal/tree/8de4d5030be8dbd219b73a28037185e1d2df661d/uportal-war/src/main/java/org/jasig/portal/jgroups/auth which I was trying to use to get group auth and encryption "just working" as well but I never had much luck with it. If that could be reliably enabled then the coordinator node would write out a random auth/crypto token to the database which the other nodes would then use to auth into the group.
>
> On Fri, Feb 7, 2014 at 9:47 AM, James Wennmacher <[email protected]> wrote:
>
>> To verify, we typically don't need to set any of the jgroups property values in portlet.properties for a clustered environment... in particular
>> #uPortal.cacheManager.jgroups.fd_sock.start_port=
>> #uPortal.cacheManager.jgroups.tcp.bind_port=
>>
>> jGroups will by default pick a random port and communicate the port that node chose to the other nodes in the cluster so the nodes can communicate (replicate cache invalidations or cache data, in addition to discovery of new/removed nodes)? I assume that's what is stored in the database (UP_JGROUPS_PING table, PHYSICAL_ADDRESS column - holds a value like fe80:0:0:0:a288:b4ff:febe:ed0%3:43362).
>>
>> Thanks,
>>
>> James Wennmacher - Unicon
>> 480.558.2420
>>
>> On 02/05/2014 06:30 PM, Eric Dalquist wrote:
>>
>> +uportal-dev so everyone sees the background on this.
>>
>> UDP multicast is great ... in theory. In practice across the complex networks in most data centers it is a nightmare. At UW and other places I tested we had constant problems with peer discovery, message routing and other issues.
>>
>> As you said TCP is a pain because you have the discovery issue but uPortal doesn't have this discovery issue. One of the neat things with jGroups is the ability to write your own "protocol" handlers. uPortal provides a custom implementation of PING <https://github.com/Jasig/uPortal/blob/master/uportal-war/src/main/resources/properties/jgroups.xml?source=c#L64>, the jGroups discovery protocol, via DAO_PING <https://github.com/Jasig/uPortal/blob/master/uportal-war/src/main/java/org/jasig/portal/jgroups/protocols/DAO_PING.java>. This handler uses the already shared uPortal database to coordinate node discovery.
>>
>> When a uPortal instance starts up (which starts ehcache, which starts jgroups) an instance of DAO_PING is created and start() is called.
>> This schedules a Timer that runs every 60 seconds (configurable in jgroups.xml) that writes out the JVM's current physical address (as determined by jGroups, again configurable if it auto-discovers the wrong one) to the database via the JdbcPingDao <https://github.com/Jasig/uPortal/blob/master/uportal-war/src/main/java/org/jasig/portal/jgroups/protocols/JdbcPingDao.java>.
>>
>> The next thing jGroups needs to do after start is discover peers. To do this it calls fetchClusterMembers on DAO_PING, which uses the JdbcPingDao to get a list of all of the physical addresses that have been written to that table. jGroups then uses that list to join the cluster.
>>
>> The last part of the process is what the coordinator node (there is always a coordinator that is elected in a jGroups cluster) does. Every time the view (what jgroups calls the list of currently active cluster members) changes, the coordinator purges the database by removing all rows that do not match known members. This handles pruning addresses of old/dead instances.
>>
>> This system has worked very well and effectively anyone running uPortal 4.0.8 or later very likely has a coherent jGroups cluster doing ehcache invalidation with zero extra work: https://issues.jasig.org/browse/UP-3607
>>
>> You can take a look at which caches uPortal uses jGroups for and how they are configured: https://github.com/Jasig/uPortal/blob/master/uportal-war/src/main/resources/properties/ehcache.xml
>>
>> Note that uPortal does not do true replication anywhere. All of the data cached in uPortal can be retrieved from the database or recalculated very quickly, so the caches are configured to do invalidation-based replication: when a key in a cache is replaced with a new value or deleted, a message is sent to the cluster that results in the other caches removing the key, so that the value is reloaded the next time it is needed.
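As an illustration of the invalidation-style setup described in the paragraph above, a cache entry might be configured along the lines of the fragment below. Treat it as a sketch under assumptions rather than the actual uPortal configuration: the cache name and sizing attributes are hypothetical, and the property values are chosen only to match the behaviour described (updates and removals notify peers, no element data is copied).

```xml
<!-- Hypothetical cache name and sizing, for illustration only. -->
<cache name="exampleInvalidatedCache"
       maxElementsInMemory="500"
       eternal="false"
       timeToLiveSeconds="300"
       overflowToDisk="false">
  <!-- replicatePuts=false: newly cached entries are not announced to peers.
       replicateUpdatesViaCopy=false: an update is propagated as a remove
       message, so peers drop the key and reload it when next needed. -->
  <cacheEventListenerFactory
      class="net.sf.ehcache.distribution.jgroups.JGroupsCacheReplicatorFactory"
      properties="replicateAsynchronously=true, replicatePuts=false,
                  replicateUpdates=true, replicateUpdatesViaCopy=false,
                  replicateRemovals=true" />
</cache>
```

The linked ehcache.xml remains the authoritative source for what uPortal really ships.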
>>
>> As for overhead, the recommended approach is to have the replicateAsynchronously flag set to true in which case ehcache batches up replication messages and sends them in the background (very quickly but still in batches).
>>
>> For what you need in CAS tickets which I believe are ephemeral you would need to set replicatePuts=true and replicateUpdatesViaCopy=true to copy the actual data between nodes.
>>
>> As for performance, you configure replication behavior on a cache by cache basis. There are a bunch of caches in uPortal that are not replicated at all either because the data doesn't change or is local to the instance.
>>
>> Something that might be worth investigating is a way to share the jGroups Channel that gets created for ehcache in uPortal across all of the portlets in Tomcat. I had wanted to look into that but never had time to. I doubt it is a simple change but could be VERY valuable in providing cache consistency for portlets as well as uPortal.
>> The general concept I was thinking of was to do the following (large chunk of work)
>>
>> - Have uPortal initialize jGroups at start time (see the ehcache JGroupsCacheManagerPeerProvider)
>> - Have uPortal expose the JChannel as an attribute in the PortletContext each portlet gets access to at init time
>> - You probably need a tomcat context scoped wrapper around it that hides each context's messages from each other context
>> - Write a custom Ehcache replication service (that likely extends the existing jgroups replication service) which has:
>>   - A spring listener that gets the PortletContext injected in, gets the jGroups channel and stores it in some context-global location
>>   - A version of jGroupsCacheManagerPeerProvider that uses the jGroups channel from the global location
>> - This should fail-nice so that if uPortal doesn't provide a jChannel things just don't get replicated
>>
>> Hope that is a helpful wall of text :)
>>
>> On Wed, Feb 5, 2014 at 3:55 PM, James Wennmacher <[email protected]> wrote:
>>
>>> Hi Eric.
>>>
>>> I am starting to configure uPortal for CAS clearpass for a customer. In the CAS documentation (Replicating PGT using "proxyGrantingTicketStorageClass" and Distributed Caching in https://wiki.jasig.org/display/CASC/Configuring+the+Jasig+CAS+Client+for+Java+in+the+web.xml), they reference an example ehcache config (https://github.com/mmoayyed/cas/blob/master/cas-server-integration-ehcache/src/test/resources/ehcache-replicated.xml) that has an option for net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory for multi-cast replication. I noticed you added jGroups UDP multicast replication into uPortal's ehcache.xml, then changed it to TCP later, which has the disadvantage of requiring explicit knowledge of host IP addresses.
>>>
>>> What are the reasons you switched from UDP multicast to TCP? Just looking for background, and possible suggestions.
>>> Also, do you have insight into using jGroups vs. net.sf.ehcache.distribution.RMICacheManagerPeerProviderFactory? Why might I use one vs. the other? I suspect you might have been down this road before ... I haven't started doing research on one approach vs. the other yet.
>>>
>>> I suspect if I have ehcache replication configured it applies to all caches, which will likely be a performance issue. Do you have thoughts/experience on that RE UW? I haven't looked yet into having only the CAS info replicated. I suspect there is a way to do that.
>>>
>>> Thanks,
>>>
>>> --
>>> James Wennmacher - Unicon
>>> 480.558.2420
>>
>> --
>>
>> You are currently subscribed to [email protected] as: [email protected]
>> To unsubscribe, change settings or access archives, see http://www.ja-sig.org/wiki/display/JSG/uportal-dev
