Apr 1, 2009 3:28:42 AM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms.

this is a sign of a thread being stuck, and its confirmed by the session transfer timing out. Do a thread dump on the server where you are seeing this message, that will tell us where your system is stuck

Filip

Jimmy Phillips wrote:
Hi,

I've been having issues with tomcat session replication. I have a number of tomcat servers running in a cluster mode, behind an Apache load balancer. The tomcat version is 6.0.18 on CentOS 5.1. Running the cluster using the DeltaManager seems to be working fine, however when I try to use the BackupManager for session replication, I get the following entries in the logs:

Apr 1, 2009 3:28:42 AM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):sun.nio.ch.selectionkeyi...@62af9d74 last access:2009-04-01 03:28:35.969 Apr 1, 2009 3:28:42 AM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):sun.nio.ch.selectionkeyi...@4c4947d3 last access:2009-04-01 03:28:35.969 Apr 1, 2009 3:29:04 AM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 99, 86, 47}:4000,{10, 99, 86, 47},4000, alive=1380182,id={-121 25 -2 -7 81 -1 76 3 -92 -20 122 69 67 102 -31 -15 }, payload={}, command={}, domain={}, ]] message. Will verify. Apr 1, 2009 3:29:04 AM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, 99, 86, 47}:4000,{10, 99, 86, 47},4000, alive=1380182,id={-121 25 -2 -7 81 -1 76 3 -92 -20 122 69 67 102 -31 -15 }, payload={}, command={}, domain={}, ]] Apr 1, 2009 3:29:04 AM org.apache.catalina.tribes.tipis.AbstractReplicatedMap heartbeat
SEVERE: Unable to send AbstractReplicatedMap.ping message
org.apache.catalina.tribes.ChannelException: Operation has timed out(60000 ms.).; Faulty members:tcp://{10, 99, 86, 47}:4000; at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:97) at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:53) at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80) at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75) at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:61) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75) at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:73) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75) at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87) at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75) at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216) at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175) at org.apache.catalina.tribes.group.RpcChannel.send(RpcChannel.java:89) at org.apache.catalina.tribes.tipis.AbstractReplicatedMap.ping(AbstractReplicatedMap.java:253) at org.apache.catalina.tribes.tipis.AbstractReplicatedMap.heartbeat(AbstractReplicatedMap.java:793) at org.apache.catalina.tribes.group.GroupChannel.heartbeat(GroupChannel.java:153) at org.apache.catalina.tribes.group.GroupChannel$HeartbeatThread.run(GroupChannel.java:661)

Of course the above entry is just one of many, for the different hosts. Searching the mailing lists, I found this post http://markmail.org/message/jv4dykh7fdhr4mvp which looks like the same problem I am having. The outcome of that thread states that the problem is fixed by a patch in revision 618823, so I compiled a version of the current 6.x trunk (rev 759722) and deployed it to all the servers. However, the problem is still appearing. I've attached a copy of the current server.xml ( it is common to all tomcat instances ).

I've done a thread dump on one of the servers when these errors started appearing, and the output is attached, thread_dump.txt (removed threads that were running by our application).

This problem is reproducable each time I restart the servers. At this stage, I'm clueless on what to try next, so I'm looking forward to your replies.


Regards,
Jim.

Attached: server.xml, thread_dump.txt

------------------------------------------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Reply via email to