On Tue, Sep 29, 2009 at 5:35 PM, Ryan Andreasen <[email protected]> wrote:
> I have been able to do more testing (finally). I have developed a VERY
> simple ASP.NET web application that uses the DotNetCasClient out of the
> box to CASify the application. With only one server in our cluster (behind
> the load balancer name) things work correctly. As soon as we add in the
> other server things do not work and I get an unrecognized ticket in the
> response to validate the ST. In troubleshooting I went into the database
> that is behind our two CAS servers and tried to find the ST that was sent
> back to my application. It was not in the database! (Which explains why
> the other server couldn't find it to validate it.) I removed the other
> server out of the cluster and ran the same test. The ticket validated,
> however, it again was NOT in the database. This made it look like the cas
> server was validating the ticket (in memory?) since it was the one that
> granted it, but it still hadn't written the ticket to the database.
>

If it's not in there, it was validated (or at least it failed validation).
Can you make sure the validation was successful? (It should say in the logs.)

>
> I thought that I must have somehow turned off the JPATicketRegistry in my
> spring config and deployerConfigContext files; I looked but it was still
> configured. So I ran just a normal test through my browser by going to
> https://server.domain/cas/login?service=http://www.server.edu. I logged
> in, got redirected to the static html page and grabbed the ticket from the
> url. This ticket WAS in the database! I am going to look into this more,
> but I thought I would post my discovery here in case someone might have an
> idea as to why some tickets are being persisted and others are not.
>
> Oh, I checked the cas.log and the catalina logs on the server, no errors
> were recorded. Any thoughts on what might be going on would be greatly
> appreciated.
>
> On Fri, Sep 25, 2009 at 1:16 PM, Scott Battaglia <[email protected]> wrote:
>
>> We're currently running two Sun T5120 servers in an active-active
>> scenario. Each server holds one instance of CAS and one instance of
>> memcached-patched-as-repcache. We've disabled Web Flow sessions so our CAS
>> instance requires no Tomcat sessions.
>>
>> We're currently deployed behind a Cisco Content Service Switch but moving
>> to their ACE hardware. We use no sticky-session, and it's most likely
>> least-connection load balancing. We have a simple servlet in each CAS
>> deployment that the CSS uses to test whether CAS is "up" or not.
>>
>> CAS currently points to two "virtual LDAP servers". I say virtual
>> because one is actually backed by two load balanced machines (our primary
>> LDAP) and the other has one LDAP server (these are all ~ V240s +/- a 5).
>> The LDAPs point to a ring of Kerberos servers.
>>
>> Currently, the LDAP servers and the Kerberos servers are located in
>> various machine rooms across the campus. Both CAS machines are currently
>> located in one machine room. They're testing some additional networking
>> hardware and configuration which would make it easier for us to deploy CAS
>> in multiple machine rooms.
>>
>> Hopefully that helps.
>>
>> Cheers,
>> Scott
>>
>>
>> On Fri, Sep 25, 2009 at 3:03 PM, Marvin Addison <[email protected]> wrote:
>>
>>> > Thank you that was the problem and it validated the ticket correctly just as
>>> > in test 2. This makes me think our cluster should be working. Right now
>>> > our load balancer address (https://loadbalancer.domain/cas/) only has one
>>> > server active
>>>
>>> This may explain why you're not getting the error you did formerly.
>>> Whatever problems you are having appear to manifest only when both
>>> nodes are active.
>>>
>>> > Is there any way I can get more details as to how you (Scott &
>>> > Marvin) have set up your clusters?
>>>
>>> We use Foundry (now Brocade) ServerIron application switches for load
>>> balancing. Our cluster is active-active using least number of
>>> connections algorithm for routing requests. We use the host affinity
>>> feature (sticky sessions) so that a request from a given source is
>>> routed to the same node during a session (30m timeout). There are
>>> some other details that I'll reserve since they're not relevant here,
>>> but in summary our setup is a common active-active configuration.
>>>
>>> It is standard practice here to deploy all core services in a high
>>> availability configuration. If you value your CAS service, it's worth
>>> the effort to invest in clustering.
>>>
>>> M
>>>
>>> --
>>> You are currently subscribed to [email protected] as:
>>> [email protected]
>>> To unsubscribe, change settings or access archives, see
>>> http://www.ja-sig.org/wiki/display/JSG/cas-user
