Pat,

We saw a bunch of replication errors as well. We haven't had much time to
dig into them because we had to load-test a large set of scenarios
(distributed, database, single machine).  If you find anything interesting,
or an optimal configuration, please let the list know and update it in
Confluence if you can ;-)

Thanks
-Scott

On Sun, Mar 9, 2008 at 9:24 AM, Pat Hennessy <[EMAIL PROTECTED]> wrote:

>
> So the first message I got was..
>
> 2008-03-06 23:51:20,838 ERROR
> [org.jasig.cas.ticket.registry.JBossCacheTicketRegistry] -
> <org.jboss.cache.ReplicationException: rsp=sender=138.123.130.81:32772,
> retval=null, received=false, suspected=false>
> org.jboss.cache.ReplicationException: rsp=sender=138.123.130.81:32772,
> retval=null, received=false, suspected=false
>
> I found an old message on the list..
>
> http://tp.its.yale.edu/pipermail/cas/2006-September/003412.html
>
> I'm using our VMWare Infrastructure for our two nodes.  I had them on
> two different hosts, so I then moved them to the same VMWare host and
> everything appeared to be better.  So, next week I plan on talking to
> the network guy about using multicast on that switch (I had spoken to
> him earlier and he thought the settings on it should be fine).
>
> I left one of my casified apps open in the browser all night.  This app
> does a page refresh and uses one of the Apache modules, so it's been
> helpful in finding these problems.
>
> The next day, I found another exception...
>
> 2008-03-08 11:02:09,130 ERROR
> [org.apache.catalina.core.ContainerBase.[Catalina].[c-cas-02.dtcc.edu].[/cas].[cas]]
> - <Servlet.service() for servlet cas threw exception>
> org.jboss.cache.lock.TimeoutException: Response timed out:
> sender=138.123.130.81:32772, retval=null, received=false, suspected=false
>
> These are happening on either node, and it does appear that I can get
> tickets and have them validated from both nodes.  So the clustering is
> working OK, but there seems to be a timing issue.
>
> Did some more searching and saw references to the "SyncReplTimeout" value
> on some Japanese site for a totally different application.  I also found
> a document on Red Hat's site about setting up JBoss.  So, I made some
> adjustments to the different timeouts.  I think I read somewhere on a
> JBoss site that some timeouts need to be shorter than others.  I'm no
> Tomcat or JBoss expert, but I set the below settings and I think it's
> been behaving as expected..
>
> <attribute name="InitialStateRetrievalTimeout">15000</attribute>
>
> <attribute name="SyncReplTimeout">20000</attribute>
>
> <attribute name="LockAcquisitionTimeout">25000</attribute>
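>
> For anyone comparing configs, here is a minimal sketch of where those
> three attributes sit in the cache config.  It assumes the JBossCache 1.x
> TreeCache MBean layout that jbossTestCache.xml is based on.  The mbean
> name and the CacheMode line are illustrative assumptions, not a copy of
> my file, and the timeout values (milliseconds) are only the ones I
> tried, not a recommendation.  I have not confirmed which ordering of
> the timeouts JBoss actually recommends.
>
> <server>
>   <!-- assumed MBean declaration; check it against your own file -->
>   <mbean code="org.jboss.cache.TreeCache"
>          name="jboss.cache:service=TreeCache">
>     <!-- assumed: synchronous replication between the two nodes -->
>     <attribute name="CacheMode">REPL_SYNC</attribute>
>     <!-- how long a joining node waits for the initial state -->
>     <attribute name="InitialStateRetrievalTimeout">15000</attribute>
>     <!-- how long a synchronous replication call waits for replies -->
>     <attribute name="SyncReplTimeout">20000</attribute>
>     <!-- how long a caller waits to acquire a lock on a node -->
>     <attribute name="LockAcquisitionTimeout">25000</attribute>
>   </mbean>
> </server>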
>
> I also wonder if the JBoss replication stuff is dependent on the system
> clock.  I noticed one of them was off and had to fiddle with ntp.
>
> My next goal is to load test logins with JMeter.  I did try another
> program that someone posted on the list, but I thought JMeter looked
> better.  I haven't exactly gotten that one working yet.  But just
> hitting the servers with page retrievals doesn't seem to cause any
> exceptions.
>
> Pat
>
> On 3/7/08 5:25 PM, Pat Hennessy wrote:
> > On 7/23/2007 11:45 AM, Brian Donnelly wrote:
> >> Thanks Scott,
> >>
> >> I've attached my jbossCache.xml config file.  It is almost identical to
> >> the jbossTestCache.xml configuration included in CAS 3.0.6.  I did have
> >> to comment out the authentication protocol version tag because it was
> >> generating errors.
> >>
> >> If anyone has any pointers or would be willing to send their JBossCache
> >> configuration parameters, I'd be very appreciative.
> >>
> >
> > Brian,
> >
> > Did you ever find a fix for the org.jboss.cache.ReplicationException
> > error you found?
> >
> > I just set up the JBoss replication using the directions on the CAS wiki
> > (and the jbossTestCache.xml file).  On the dev cluster, I didn't get the
> > error.  After putting it on our new to-be-production cluster, I've been
> > finding the same error showing up as a RuntimeException with some of our
> > test apps.  I don't think we are putting these services under any real
> > load though.
> >
> > Pat
> >
> >> Thanks,
> >>
> >> Brian Donnelly
> >> --
> >> Brian Donnelly
> >> University of California, Davis
> >> Information and Educational Technology
> >> Middleware Team
> >> (530) 754-5909
> >> [EMAIL PROTECTED]
> >>
> >> -----Original Message-----
> >> From: Scott Battaglia [mailto:[EMAIL PROTECTED]
> >> Sent: Fri 7/20/2007 6:29 AM
> >> To: Brian Donnelly; Yale CAS mailing list
> >> Subject: Re: JBossCache Ticket Registry performance under load?
> >>
> >> Brian,
> >>
> >> We don't deploy that at Rutgers so I can't comment on that.  A few
> >> people have deployed it in production without issues.  Maybe you can
> >> include your configuration file and those who have deployed it
> >> successfully can compare it to theirs if they get a minute (hopefully).
> >>
> >> Thanks
> >> -Scott
> >>
> >> On 7/18/07, Brian Donnelly <[EMAIL PROTECTED]> wrote:
> >>> Hi all,
> >>>
> >>> We're getting ready at UC Davis to switch to a JBossCache Clustered
> >>> configuration for our CAS installation.  I have been load testing two
> >>> Redhat EL 5 clustered nodes running CAS 3.0.6 using the default
> >>> JBossCache implementation, (UDP multicast.)
> >>>
> >>> I've been using JMeter to generate ~7 login actions per second.  Both
> >>> clustered servers perform fine for several hours.  Somewhere in the
> >>> third hour of testing, I start seeing the following errors in the
> >>> logs:
> >>>
> >>> 2007-07-18 13:43:54,813 ERROR
> >>> [org.jasig.cas.ticket.registry.JBossCacheTicketRegistry] -
> >>> <org.jboss.cache.ReplicationException: rsp=sender=169.237.104.235:53768,
> >>> retval=null, received=false, suspected=false>
> >>>
> >>> and
> >>>
> >>> 2007-07-18 13:48:33,448 ERROR
> >>> [org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/cas].[cas]]
> >>> - <Servlet.service() for servlet cas threw exception>
> >>>
> >>> These start piling up until both servers stop responding to incoming
> >>> requests.  A restart is required to restore service.
> >>>
> >>> Has anyone else encountered errors of this type in their testing of
> >>> the JBossCache registry?
> >>>
> >>> Thanks,
> >>>
> >>> Brian Donnelly
> >>> --
> >>> Brian Donnelly
> >>> University of California, Davis
> >>> Information and Educational Technology
> >>> Middleware Team
> >>> (530) 754-5909
> >>> [EMAIL PROTECTED]
> >>> _______________________________________________
> >>> Yale CAS mailing list
> >>> [email protected]
> >>> http://tp.its.yale.edu/mailman/listinfo/cas
> >>>
> >>
> >>
> >
> >
>
>
> --
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> Pat Hennessy, RHCE                        ([EMAIL PROTECTED])
>
> Senior Systems Specialist
> Division of Information and Educational Technology
> Delaware Technical and Community College
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
>



-- 
-Scott Battaglia
PGP Public Key Id: 0x383733AA
LinkedIn: http://www.linkedin.com/in/scottbattaglia
