Hi Michael,

The logs are the part that goes haywire. The applications still work at that point, but there is usually not enough time to troubleshoot anything else. The logs can grow by 5-6 GB in about an hour, so we usually just kill the service in a panic and delete the logs before the entire server goes down. (The normal shutdown.sh no longer responds at this point, so we have to kill -9 it.) This time I managed to tail some of the logs. Here is an extract; the same pattern of errors repeats throughout:

Aug 25, 2009 11:44:02 AM org.apache.catalina.ha.session.DeltaRequest reset
SEVERE: Unable to remove element
java.util.NoSuchElementException
        at java.util.LinkedList.remove(LinkedList.java:788)
        at java.util.LinkedList.removeFirst(LinkedList.java:134)
        at org.apache.catalina.ha.session.DeltaRequest.reset(DeltaRequest.java:201)
        at org.apache.catalina.ha.session.DeltaRequest.execute(DeltaRequest.java:195)
        at org.apache.catalina.ha.session.DeltaManager.handleSESSION_DELTA(DeltaManager.java:1364)
        at org.apache.catalina.ha.session.DeltaManager.messageReceived(DeltaManager.java:1320)
        at org.apache.catalina.ha.session.DeltaManager.messageDataReceived(DeltaManager.java:1083)
        at org.apache.catalina.ha.session.ClusterSessionListener.messageReceived(ClusterSessionListener.java:87)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:916)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.messageReceived(SimpleTcpCluster.java:897)
        at org.apache.catalina.tribes.group.GroupChannel.messageReceived(GroupChannel.java:264)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:79)
        at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.messageReceived(TcpFailureDetector.java:110)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:79)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:79)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.messageReceived(ChannelInterceptorBase.java:79)
        at org.apache.catalina.tribes.group.ChannelCoordinator.messageReceived(ChannelCoordinator.java:241)
        at org.apache.catalina.tribes.transport.ReceiverBase.messageDataReceived(ReceiverBase.java:225)
        at org.apache.catalina.tribes.transport.nio.NioReplicationTask.drainChannel(NioReplicationTask.java:188)
        at org.apache.catalina.tribes.transport.nio.NioReplicationTask.run(NioReplicationTask.java:91)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

Wong

On Tue, Aug 25, 2009 at 3:36 PM, Michael Ludwig <m...@as-guides.com> wrote:

> CS Wong wrote:
>
>> Periodically, I'm getting problems with my Tomcat 6 cluster (2 nodes).
>> One of the nodes would just go haywire
>
> Could you elaborate on what "going haywire" means?
>
> Below, you write:
>
>> [The NoSuchElementException is] the only thing that it shows. The
>> other node in the cluster is still active at this time. There's
>> nothing to do but to restart. The large amount of logs has caused
>> disk space issues more than a couple of times too.
>
> So is that server not active any more? Unresponsive? Hyperactive writing
> to the log file? Looping?
>
>> and generate a ton of logs repeating the following:
>>
>> Aug 25, 2009 11:44:10 AM org.apache.catalina.ha.session.DeltaRequest reset
>> SEVERE: Unable to remove element
>> java.util.NoSuchElementException
>>     at java.util.LinkedList.remove(LinkedList.java:788)
>>     at java.util.LinkedList.removeFirst(LinkedList.java:134)
>>     at org.apache.catalina.ha.session.DeltaRequest.reset(DeltaRequest.java:201)
>>     at org.apache.catalina.ha.session.DeltaRequest.execute(DeltaRequest.java:195)
>>     at org.apache.catalina.ha.session.DeltaManager.handleSESSION_DELTA(DeltaManager.java:1364)
>>     at org.apache.catalina.ha.session.DeltaManager.messageReceived(DeltaManager.java:1320)
>>     at org.apache.catalina.ha.session.DeltaManager.messageDataReceived(DeltaManager.java:1083)
>>     at org.apache.catalina.ha.session.ClusterSessionListener.messageReceived(ClusterSessionListener.java:87)
>
> I only found this, which seems to have led you here:
>
> http://stackoverflow.com/questions/1326336/
>
> Maybe it is helpful to others who know about Tomcat internals.
>
> --
> Michael Ludwig