I'm running a tremendous load on a remote cache cluster, but on much weaker hardware. We haven't restarted the servers in over a month. I don't recommend restarting them nightly...
I'm trying to get clear on the issue you're having. Is your client crashing from an out-of-memory error, or is it reporting that the server encountered an out-of-memory exception? Is the remote server crashing from out-of-memory errors? Do you have the remote servers in a cluster?

When a client is no longer reachable, the server will try 10 times or so to send an event to that client before killing the client queue. Basically, a queue is built for each client. If you are pushing a lot of data through, perhaps you are spiking during the error detection interval.

Turn on verbosegc so we can get a sense of how the memory usage is climbing.

What auxiliaries are you using on the remote cache?

I would upgrade to 1.2.7.9, which is what I'm using. There have been a few subtle remote cache changes since 1.2.7.3. There have also been significant indexed disk and JDBC disk cache improvements. Some of these are documented here:

http://jakarta.apache.org/jcs/changes-report.html

Yes, I recommend INFO level logs for JCS. There is some improved logging in 1.2.7.9 also.

Aaron

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> Sent: Friday, November 10, 2006 3:06 PM
> To: JCS Users List
> Subject: Out-of-memory errors
>
> Hi,
>
> We've been using the JCS cache in our production environment for about 3
> months and have noticed that every so often we get out-of-memory errors on
> both the local application server (Websphere 4.0.7 running IBM JDK 1.3.1
> SR5w) and the remote cache server (IBM JDK 1.3.1 SR8).
>
> The error on the app server appears to be related to the period when we
> need to fail over to a secondary remote cache server when we bounce our
> servers once every 24 hours.
> The exact error message in the app server log file is:
>
> [ERROR] RemoteCacheFailoverRunner - -Trouble trying to deregister old failover listener prior to restoring the primary = kax90070c:1101 <java.rmi.ServerError: Error occurred in server thread; nested exception is: java.lang.OutOfMemoryError>
> java.rmi.ServerError: Error occurred in server thread; nested exception is: java.lang.OutOfMemoryError
> java.lang.OutOfMemoryError
>     at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java(Compiled Code))
>     at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java(Compiled Code))
>     at sun.rmi.server.UnicastRef.invoke(UnicastRef.java(Compiled Code))
>     at org.apache.jcs.auxiliary.remote.server.RemoteCacheServer_Stub.removeCacheListener(Unknown Source)
>     at org.apache.jcs.engine.CacheWatchRepairable.removeCacheListener(CacheWatchRepairable.java(Compiled Code))
>     at org.apache.jcs.auxiliary.remote.RemoteCacheManager.removeRemoteCacheListener(RemoteCacheManager.java(Compiled Code))
>     at org.apache.jcs.auxiliary.remote.RemoteCacheFailoverRunner.restorePrimary(RemoteCacheFailoverRunner.java(Compiled Code))
>     at org.apache.jcs.auxiliary.remote.RemoteCacheFailoverRunner.connectAndRestore(RemoteCacheFailoverRunner.java(Compiled Code))
>     at org.apache.jcs.auxiliary.remote.RemoteCacheFailoverRunner.run(RemoteCacheFailoverRunner.java:104)
>     at java.lang.Thread.run(Thread.java(Compiled Code))
>
> The error on the remote cache server just says "java.lang.OutOfMemory"
> with no additional detail. The remote cache process does not restart and
> the cache appears to continue to function just fine. This only appears
> under conditions of very high load. Should we turn up logging on these
> cache servers in order to get additional detail?
> This happens on all 4 of our remote cache servers from time to time but
> not on every one, even when we have done heavy load testing that brings
> the CPU utilization up to 65% on an AIX 5.2 server with 8 CPUs and 16 gigs
> of memory. The JVM size is 2048, with about 53% of the memory free on
> average during the peak utilization.
>
> Our current version of JCS is 1.2.7.3. Are any of these error conditions
> fixed in a later release?
>
> Thanks for any help.
>
> Paul
>
> CONFIDENTIALITY NOTICE:
> This is a transmission from Kohl's Department Stores, Inc.
> and may contain information which is confidential and proprietary.
> If you are not the addressee, any disclosure, copying or distribution or
> use of the contents of this message is expressly prohibited.
> If you have received this transmission in error, please destroy it and
> notify us immediately at 262-703-7000.
>
> CAUTION:
> Internet and e-mail communications are Kohl's property and Kohl's reserves
> the right to retrieve and read any message created, sent and received.
> Kohl's reserves the right to monitor messages to or from authorized Kohl's
> Associates at any time without any further consent.
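
The per-client queue behavior described in the reply above (one queue per client, roughly 10 failed sends before the queue is killed) can be illustrated with a small Java sketch. This is not the JCS implementation; the class `ClientQueueSketch`, the `Listener` interface, and the `MAX_FAILURES` value of 10 are all hypothetical names and numbers chosen for illustration. It only shows why events could pile up in memory for an unreachable client until the retry limit is reached.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch of per-client event queues; NOT the JCS implementation. */
public class ClientQueueSketch {

    /** Stand-in for a remote client; send() returns false when the client is unreachable. */
    public interface Listener {
        boolean send(String event);
    }

    /** Assumed retry limit, taken from "10 times or so" in the reply. */
    public static final int MAX_FAILURES = 10;

    private final Map<String, Queue<String>> queues = new HashMap<>();
    private final Map<String, Listener> listeners = new HashMap<>();
    private final Map<String, Integer> failures = new HashMap<>();

    /** Build a queue for a newly registered client. */
    public void register(String clientId, Listener listener) {
        queues.put(clientId, new ArrayDeque<>());
        listeners.put(clientId, listener);
        failures.put(clientId, 0);
    }

    /** Queue an event for every registered client. */
    public void publish(String event) {
        for (Queue<String> q : queues.values()) {
            q.add(event);
        }
    }

    /** One delivery pass: try to send the head event of each client's queue. */
    public void deliverOnce() {
        // Iterate over a copy of the keys so a queue can be removed mid-pass.
        for (String clientId : new ArrayList<>(queues.keySet())) {
            Queue<String> q = queues.get(clientId);
            String event = q.peek();
            if (event == null) {
                continue;
            }
            if (listeners.get(clientId).send(event)) {
                q.poll();
                failures.put(clientId, 0);
            } else {
                int count = failures.get(clientId) + 1;
                failures.put(clientId, count);
                if (count >= MAX_FAILURES) {
                    // Give up on this client and release its queued events.
                    queues.remove(clientId);
                    listeners.remove(clientId);
                    failures.remove(clientId);
                }
            }
        }
    }

    /** Total number of events still held in memory across all queues. */
    public int pendingEvents() {
        int total = 0;
        for (Queue<String> q : queues.values()) {
            total += q.size();
        }
        return total;
    }

    public boolean hasQueue(String clientId) {
        return queues.containsKey(clientId);
    }

    public static void main(String[] args) {
        ClientQueueSketch server = new ClientQueueSketch();
        server.register("unreachable", event -> false);
        for (int i = 0; i < 1000; i++) {
            server.publish("event-" + i);
        }
        for (int pass = 1; pass <= MAX_FAILURES; pass++) {
            System.out.println("before pass " + pass + ": pending=" + server.pendingEvents());
            server.deliverOnce();
        }
        System.out.println("after last pass: pending=" + server.pendingEvents());
    }
}
```

In this sketch, everything published while a dead client is still being retried stays queued until the failure limit is hit, which is one way heavy traffic could produce a memory spike during the error detection interval.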