I have finally been able to do more testing. I built a very simple ASP.NET web application that uses the DotNetCasClient out of the box to CASify the application. With only one server in our cluster (behind the load balancer name), things work correctly. As soon as we add the other server, things stop working and I get an unrecognized-ticket error in the response when validating the ST. While troubleshooting, I went into the database behind our two CAS servers and tried to find the ST that was sent back to my application. It was not in the database! (Which explains why the other server couldn't find it to validate it.) I removed the other server from the cluster and ran the same test. The ticket validated, but again it was NOT in the database. This makes it look like the CAS server validated the ticket (in memory?) because it was the one that granted it, but it still hadn't written the ticket to the database.
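In case it helps anyone reproduce this, here is a rough sketch of how the validation step could be checked against each node directly instead of through the load balancer. It assumes the standard CAS 2.0 /serviceValidate endpoint and response format; the node URLs in the comments are hypothetical placeholders, not our real hosts. Keep in mind that a service ticket is single-use, so each check needs a fresh ticket.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace used by CAS 2.0 validation responses.
CAS_NS = {"cas": "http://www.yale.edu/tp/cas"}


def validate_url(node_base, service, ticket):
    """Build a CAS 2.0 /serviceValidate URL for one specific node."""
    query = urllib.parse.urlencode({"service": service, "ticket": ticket})
    return node_base.rstrip("/") + "/serviceValidate?" + query


def parse_validation(xml_text):
    """Return ("success", user), ("failure", code) or ("unknown", "")."""
    root = ET.fromstring(xml_text)
    success = root.find("cas:authenticationSuccess", CAS_NS)
    if success is not None:
        user = success.findtext("cas:user", default="", namespaces=CAS_NS)
        return ("success", user.strip())
    failure = root.find("cas:authenticationFailure", CAS_NS)
    if failure is not None:
        # An unrecognized ticket is reported with code INVALID_TICKET.
        return ("failure", failure.get("code", ""))
    return ("unknown", "")


def check_node(node_base, service, ticket):
    """Validate a ticket against one node, bypassing the load balancer."""
    with urllib.request.urlopen(validate_url(node_base, service, ticket)) as resp:
        return parse_validation(resp.read().decode("utf-8"))


# Hypothetical usage: obtain the ticket via node 1, then validate on node 2.
# check_node("https://cas2.example.edu/cas", "http://www.server.edu", "ST-...")
```

If the second node returns a failure with INVALID_TICKET for a ticket the first node just issued, that would point at the ticket never reaching the shared registry rather than at the client.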
I thought I must have somehow turned off the JpaTicketRegistry in my Spring config and deployerConfigContext files; I checked, and it was still configured. So I ran a normal test through my browser by going to https://server.domain/cas/login?service=http://www.server.edu. I logged in, was redirected to the static HTML page, and grabbed the ticket from the URL. This ticket WAS in the database! I am going to look into this more, but I thought I would post my discovery here in case someone has an idea why some tickets are being persisted and others are not. I also checked cas.log and the catalina logs on the server; no errors were recorded. Any thoughts on what might be going on would be greatly appreciated.

On Fri, Sep 25, 2009 at 1:16 PM, Scott Battaglia <[email protected]> wrote:

> We're currently running two Sun T5120 servers in an active-active
> scenario. Each server holds one instance of CAS and one instance of
> memcached-patched-as-repcache. We've disabled Web Flow sessions so our CAS
> instance requires no Tomcat sessions.
>
> We're currently deployed behind a Cisco Content Service Switch but moving
> to their ACE hardware. We use no sticky-session, and it's most likely
> least-connection load balancing. We have a simple servlet in each CAS
> deployment that the CSS uses to test whether CAS is "up" or not.
>
> CAS currently points to two "virtual LDAP servers". I say virtual because
> one is actually backed by two load balanced machines (our primary LDAP) and
> the other has one LDAP server (these are all ~ V240s +/- a 5). The LDAPs
> point to a ring of Kerberos servers.
>
> Currently, the LDAP servers and the Kerberos servers are located in various
> machine rooms across the campus. Both CAS machines are currently located in
> one machine room. They're testing some additional networking hardware and
> configuration which would make it easier for us to deploy CAS in multiple
> machine rooms.
>
> Hopefully that helps.
>
> Cheers,
> Scott
>
> On Fri, Sep 25, 2009 at 3:03 PM, Marvin Addison <[email protected]> wrote:
>
>> > Thank you that was the problem and it validated the ticket correctly
>> > just as in test 2. This makes me think our cluster should be working.
>> > Right now our load balancer address (https://loadbalancer.domain/cas/)
>> > only has one server active
>>
>> This may explain why you're not getting the error you did formerly.
>> Whatever problems you are having appear to manifest only when both
>> nodes are active.
>>
>> > Is there any way I can get more details as to how you (Scott &
>> > Marvin) have set up your clusters?
>>
>> We use Foundry (now Brocade) ServerIron application switches for load
>> balancing. Our cluster is active-active using least number of
>> connections algorithm for routing requests. We use the host affinity
>> feature (sticky sessions) so that a request from a given source is
>> routed to the same node during a session (30m timeout). There are
>> some other details that I'll reserve since they're not relevant here,
>> but in summary our setup is a common active-active configuration.
>>
>> It is standard practice here to deploy all core services in a high
>> availability configuration. If you value your CAS service, it's worth
>> the effort to invest in clustering.
>>
>> M
>>
>> --
>> You are currently subscribed to [email protected] as:
>> [email protected]
>> To unsubscribe, change settings or access archives, see
>> http://www.ja-sig.org/wiki/display/JSG/cas-user
