So the first message I got was..

2008-03-06 23:51:20,838 ERROR
[org.jasig.cas.ticket.registry.JBossCacheTicketRegistry] -
<org.jboss.cache.ReplicationException: rsp=sender=138.123.130.81:32772,
retval=null, received=false, suspected=false>
org.jboss.cache.ReplicationException: rsp=sender=138.123.130.81:32772,
retval=null, received=false, suspected=false

I found an old message on the list..

http://tp.its.yale.edu/pipermail/cas/2006-September/003412.html

I'm running our two nodes on our VMware Infrastructure.  I had them on 
two different hosts, so I moved them to the same VMware host and 
everything appeared to be better.  So, next week I plan on talking to 
the network guy about using multicast on that switch (I had spoken to 
him earlier and he thought the settings on it should be fine).

I left one of my CASified apps open in the browser all night.  This app 
automatically refreshes the page and uses one of the Apache modules, so 
it's been helpful in finding these problems.

The next day, I found another exception...

2008-03-08 11:02:09,130 ERROR
[org.apache.catalina.core.ContainerBase.[Catalina].[c-cas-02.dtcc.edu].[/cas].[cas]]
- <Servlet.service() for servlet cas threw exception>
org.jboss.cache.lock.TimeoutException: Response timed out:
sender=138.123.130.81:32772, retval=null, received=false, suspected=false

These happen on either node, and it does appear that I can get tickets 
and have them validated from both nodes.  So the clustering is working 
OK, but there seems to be a timing issue.

I did some more searching and found references to the "SyncReplTimeout" 
value on a Japanese site, for a totally different application.  I also 
found a document on Red Hat's site about setting up JBoss.  So, I made 
some adjustments to the different timeouts.  I think I read somewhere on 
a JBoss site that some timeouts need to be shorter than others.  I'm no 
Tomcat or JBoss expert, but I set the values below and I think it's been 
behaving as expected:

<attribute name="InitialStateRetrievalTimeout">15000</attribute>

<attribute name="SyncReplTimeout">20000</attribute>

<attribute name="LockAcquisitionTimeout">25000</attribute>
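
For anyone who wants to compare their own values, here is a rough Python sketch that pulls these three timeouts out of a config fragment and lists them from shortest to longest. The `<mbean>` wrapper, the `SAMPLE` text, and the function names are my own assumptions for illustration, not the real layout of jbossCache.xml; only the attribute names and values come from the settings above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the real jbossCache.xml has more structure around
# these attributes. The <mbean> wrapper here is only a stand-in root element.
SAMPLE = """
<mbean>
  <attribute name="InitialStateRetrievalTimeout">15000</attribute>
  <attribute name="SyncReplTimeout">20000</attribute>
  <attribute name="LockAcquisitionTimeout">25000</attribute>
</mbean>
"""

# The three timeout attributes discussed in this message.
TIMEOUT_NAMES = (
    "InitialStateRetrievalTimeout",
    "SyncReplTimeout",
    "LockAcquisitionTimeout",
)


def read_timeouts(xml_text):
    """Return {attribute name: milliseconds} for the timeouts of interest."""
    root = ET.fromstring(xml_text)
    found = {}
    for attr in root.iter("attribute"):
        name = attr.get("name")
        if name in TIMEOUT_NAMES and attr.text:
            found[name] = int(attr.text.strip())
    return found


def sorted_timeouts(timeouts):
    """Return (name, milliseconds) pairs ordered from shortest to longest."""
    return sorted(timeouts.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, ms in sorted_timeouts(read_timeouts(SAMPLE)):
        print(f"{name}: {ms} ms")
```

With the values above it lists InitialStateRetrievalTimeout first and LockAcquisitionTimeout last.  Whether that is the ordering JBoss actually wants is something I have not confirmed.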

I also wonder if the JBoss replication stuff is dependent on the system 
clock.  I noticed one of them was off and had to fiddle with ntp.

My next goal is to load test logins with JMeter.  I did try another 
program that someone posted on the list, but I thought JMeter looked 
better.  I haven't exactly gotten that one working yet.  But just 
hitting the servers with page retrievals doesn't seem to cause any 
exceptions.

Pat

On 3/7/08 5:25 PM, Pat Hennessy wrote:
> On 7/23/2007 11:45 AM, Brian Donnelly wrote:
>> Thanks Scott,
>>
>> I've attached my jbossCache.xml config file.  It is almost identical to the 
>> jbossTestCache.xml configuration included in CAS 3.0.6.  I did have to 
>> comment out the authentication protocol version tag because it was 
>> generating errors.
>>
>> If anyone has any pointers or would be willing to send their JBossCache 
>> configuration parameters, I'd be very appreciative.
>>
> 
> Brian,
> 
> Did you ever find a fix for the org.jboss.cache.ReplicationException 
> error you found?
> 
> I just set up the JBoss replication using the directions on the CAS wiki 
> (and the jbossTestCache.xml file).  On the dev cluster, I didn't get the 
> error.  After putting it on our soon-to-be production cluster, I've been 
> finding the same error showing up as a RuntimeException with some of our 
> test apps.  I don't think we're putting these services under any real 
> load, though.
> 
> Pat
> 
>> Thanks,
>>
>> Brian Donnelly
>> --
>> Brian Donnelly
>> University of California, Davis
>> Information and Educational Technology
>> Middleware Team
>> (530) 754-5909
>> [EMAIL PROTECTED]
>>
>> -----Original Message-----
>> From: Scott Battaglia [mailto:[EMAIL PROTECTED]
>> Sent: Fri 7/20/2007 6:29 AM
>> To: Brian Donnelly; Yale CAS mailing list
>> Subject: Re: JBossCache Ticket Registry performance under load?
>>  
>> Brian,
>>
>> We don't deploy that at Rutgers so I can't comment on that.  A few people
>> have deployed it in production without issues.  Maybe you can include your
>> configuration file and those who have deployed it successfully can compare
>> it to theirs if they get a minute (hopefully).
>>
>> Thanks
>> -Scott
>>
>> On 7/18/07, Brian Donnelly <[EMAIL PROTECTED]> wrote:
>>> Hi all,
>>>
>>> We're getting ready at UC Davis to switch to a JBossCache Clustered
>>> configuration for our CAS installation.  I have been load testing two
>>> Redhat EL 5 clustered nodes running CAS 3.0.6 using the default
>>> JBossCache implementation, (UDP multicast.)
>>>
>>> I've been using JMeter to generate ~7 login actions per second.  Both
>>> clustered servers perform fine for several hours.  Somewhere in the
>>> third hour of testing, I start seeing the following errors in the logs:
>>>
>>> 2007-07-18 13:43:54,813 ERROR
>>> [org.jasig.cas.ticket.registry.JBossCacheTicketRegistry] -
>>> <org.jboss.cache.ReplicationException: rsp=sender=169.237.104.235:53768,
>>> retval=null, received=false, suspected=false>
>>>
>>> and
>>>
>>> 2007-07-18 13:48:33,448 ERROR
>>> [org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/cas].[cas]]
>>> - <Servlet.service() for servlet cas threw exception>
>>>
>>> These start piling up until both servers stop responding to incoming
>>> requests.  A restart is required to restore service.
>>>
>>> Has anyone else encountered errors of this type in their testing of the
>>> JBossCache registry?
>>>
>>> Thanks,
>>>
>>> Brian Donnelly
>>> --
>>> Brian Donnelly
>>> University of California, Davis
>>> Information and Educational Technology
>>> Middleware Team
>>> (530) 754-5909
>>> [EMAIL PROTECTED]
>>> _______________________________________________
>>> Yale CAS mailing list
>>> [email protected]
>>> http://tp.its.yale.edu/mailman/listinfo/cas
>>>
>>
>>
> 
> 


-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Pat Hennessy, RHCE                        ([EMAIL PROTECTED])

Senior Systems Specialist
Division of Information and Educational Technology
Delaware Technical and Community College
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=