Re: Cluster Not Syncing After Restart
Yes, you are right! Feedback welcome :-) Peter Am 18.01.2008 um 22:24 schrieb David Rees: On Jan 18, 2008 1:13 PM, David Rees [EMAIL PROTECTED] wrote: So your testing is with the to-be-released 5.5.26? Where can I get a copy of this and the changes going into 5.5.26? I found the changelog, looks like I need to build 5.5.26 from source myself if i want to test it? http://svn.apache.org/viewvc/tomcat/container/tc5.5.x/webapps/docs/ changelog.xml It does look like there are some bugfixes in it which may affect cluster synchronization. -Dave - To start a new topic, e-mail: users@tomcat.apache.org To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Cluster Not Syncing After Restart
Hi David, I have also made a test with two cluster nodes. Is node2 under load as node1 is started the message are normal. Cluster is starts first and receive message, but the application starts later. As application is startet at node1 the manager request node2 for current sessions. All message that received before the sync is complete are dropped. All message has a timestamp and only message dropped before the GET_ALL_SESSION message are created. At next release 5.5.26 the sessionCounter is also set correct after restart. At my test all sessions are correct load after restart. Peter PS: My Cluster Config: Cluster className=org.apache.catalina.cluster.tcp.SimpleTcpCluster doClusterLog=true clusterLogName=clusterlog manager.className=org.apache.catalina.cluster.session.DeltaManager manager.expireSessionsOnShutdown=false manager.notifyListenersOnReplication=false manager.notifySessionListenersOnReplication=false manager.sendAllSessions=false manager.sendAllSessionsSize=500 manager.sendAllSessionsWaitTime=20 Membership className=org.apache.catalina.cluster.mcast.McastService mcastAddr=224.0.0.4 mcastPort=${mcast.port.prefix}7 mcastFrequency=1000 mcastDropTime=3/ Receiver className=org.apache.catalina.cluster.tcp.ReplicationListener tcpListenAddress=auto tcpListenPort=${port.prefix}7 tcpSelectorTimeout=100 tcpThreadCount=6 / Sender className=org.apache.catalina.cluster.tcp.ReplicationTransmitter replicationMode=fastasyncqueue recoverTimeout=5000 recoverCounter=6 doTransmitterProcessingStats=true doProcessingStats=true doWaitAckStats=true queueTimeWait=true queueDoStats=true queueCheckLock=true ackTimeout=15000 waitForAck=true keepAliveTimeout=8 keepAliveMaxRequestCount=-1/ Valve className=org.apache.catalina.cluster.tcp.ReplicationValve filter=.*\.gif;.* \.js;.*\.css;.*\.png;.*\.jpeg;.*\.jpg;.*\.htm;.*\.html;.*\.txt; primaryIndicator=true / Valve className=org.apache.catalina.cluster.session.JvmRouteBinderValve enabled=true / ClusterListener className=org.apache.catalina.cluster.session.ClusterSessionListener / ClusterListener className=org.apache.catalina.cluster.session.JvmRouteSessionIDBinderLi stener / /Cluster Am 17.01.2008 um 09:24 schrieb David Rees: On Jan 17, 2008 12:16 AM, Peter Rossbach [EMAIL PROTECTED] wrote: you can't restart your nodes so quickly. As you shutdown one node you must wait 30-45 sec before next restart. Don't restart both nodes at same time. I am waiting at 4 minutes before restarting node 2 before restarting node 1, so that is not the problem in this case. The sequence goes like this: 04:30: Shut down node 1, it gets restarted automatically immediately after it has completely shut down (typically within 10 seconds). 04:34: Shut down node 2, it gets restarted automatically immediately after it has completely shut down (typically within 10 seconds). -Dave - To start a new topic, e-mail: users@tomcat.apache.org To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Cluster Not Syncing After Restart
On Jan 18, 2008 1:13 PM, David Rees [EMAIL PROTECTED] wrote: So your testing is with the to-be-released 5.5.26? Where can I get a copy of this and the changes going into 5.5.26? I found the changelog, looks like I need to build 5.5.26 from source myself if i want to test it? http://svn.apache.org/viewvc/tomcat/container/tc5.5.x/webapps/docs/changelog.xml It does look like there are some bugfixes in it which may affect cluster synchronization. -Dave - To start a new topic, e-mail: users@tomcat.apache.org To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Cluster Not Syncing After Restart
On Jan 18, 2008 8:17 AM, Peter Rossbach [EMAIL PROTECTED] wrote: I have also made a test with two cluster nodes. Is node2 under load as node1 is started the message are normal. Cluster is starts first and receive message, but the application starts later. As application is startet at node1 the manager request node2 for current sessions. All message that received before the sync is complete are dropped. All message has a timestamp and only message dropped before the GET_ALL_SESSION message are created. Just to be clear, you are able to stop and immediately restart a node multiple times without issue? At next release 5.5.26 the sessionCounter is also set correct after restart. At my test all sessions are correct load after restart. So your testing is with the to-be-released 5.5.26? Where can I get a copy of this and the changes going into 5.5.26? Thanks, Dave - To start a new topic, e-mail: users@tomcat.apache.org To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Cluster Not Syncing After Restart
Hi David, you can't restart your nodes so quickly. As you shutdown one node you must wait 30-45 sec before next restart. Don't restart both nodes at same time. Peter Am 17.01.2008 um 02:25 schrieb David Rees: I'm running a Tomcat 5.5.25 cluster with 2 nodes and the following cluster configuration in the Host/ element: Cluster className=org.apache.catalina.cluster.tcp.SimpleTcpCluster Valve className=org.apache.catalina.cluster.tcp.ReplicationValve filter=.*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.* \.css;.*\.txt;/ Valve className=org.apache.catalina.cluster.session.JvmRouteBinderValve enabled=true/ ClusterListener className=org.apache.catalina.cluster.session.ClusterSessionListener / ClusterListener className=org.apache.catalina.cluster.session.JvmRouteSessionIDBinder Listener/ /Cluster Every night, I restart the two nodes of the cluster, one at a time, using a standard shutdown with a processing monitoring Tomcat to start it back up automatically after it has exited (daemontools for those interested). But after restarting the first node, it does not rejoin the cluster. In the logs of node 1 are this (node was shut down at 04:30): 2008-01-17 04:30:07,212 INFO : SimpleTcpCluster: Cluster is about to start 2008-01-17 04:30:07,317 INFO : SimpleTcpCluster: Add Default ClusterReceiver at cluster localhost 2008-01-17 04:30:07,326 INFO : SimpleTcpCluster: Add Default ClusterSender at cluster localhost 2008-01-17 04:30:07,341 INFO : SocketReplicationListener: Open Socket at [10.1.1.5:8015] 2008-01-17 04:30:07,363 ERROR: ClusterListener: Context manager doesn't exist: ^^^ That line repeats ~25,000 times for a couple seconds 2008-01-17 04:30:10,110 ERROR: ClusterListener: Context manager doesn't exist: 2008-01-17 04:30:11,443 INFO : McastService: membership mbean registered (Catalina:type=ClusterMembership,host=localhost) 2008-01-17 04:30:11,790 INFO : DeltaManager: Starting clustering manager...: 2008-01-17 04:30:11,790 INFO : DeltaManager: Register manager to cluster element Host with name localhost 2008-01-17 04:30:11,790 INFO : DeltaManager: Starting clustering manager at 2008-01-17 04:30:11,794 WARN : DeltaManager: Manager [], requesting session state from org.apache.catalina.cluster.mcast.McastMember[tcp:// 10.1.1.6:8015,catalina,10.1.1.6,8015, alive=86161577]. This operation will timeout if no session state has been received within 60 seconds. 2008-01-17 04:30:21,894 ERROR: DeltaManager: Manager []: No session state send at 17/01/08 04:30 received, timing out after 10,102 ms. While node 1 is shutting down, ~4000 messages like this are generated on node 2: 2008-01-17 04:30:06,233 WARN : FastAsyncSocketSender: Unable to asynchronously send session with id=[x.c-web2-1200438737718] - message will be ignored. java.lang.NullPointerException at org.apache.catalina.cluster.tcp.DataSender.pushMessage (DataSender.java:1057) at org.apache.catalina.cluster.tcp.FastAsyncSocketSender $FastQueueThread.pushQueuedMessages(FastAsyncSocketSender.java:524) at org.apache.catalina.cluster.tcp.FastAsyncSocketSender $FastQueueThread.run(FastAsyncSocketSender.java:487) 2008-01-17 04:30:06,235 WARN : FastAsyncSocketSender: Unable to asynchronously send session with id=[x.c-web2-1200438739053] - message will be ignored. java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java: 333) at java.net.PlainSocketImpl.connectToAddress (PlainSocketImpl.java:195) at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) at java.net.Socket.connect(Socket.java:519) at org.apache.catalina.cluster.tcp.DataSender.createSocket (DataSender.java:897) at org.apache.catalina.cluster.tcp.DataSender.openSocket (DataSender.java:866) at org.apache.catalina.cluster.tcp.DataSender.pushMessage (DataSender.java:1009) at org.apache.catalina.cluster.tcp.FastAsyncSocketSender $FastQueueThread.pushQueuedMessages(FastAsyncSocketSender.java:524) at org.apache.catalina.cluster.tcp.FastAsyncSocketSender $FastQueueThread.run(FastAsyncSocketSender.java:487) These stop after about 1 second (last one at 04:30:07.340), and then a decent number of Send stats lines are then printed on node 2: 2008-01-17 04:30:07,376 INFO : DataSender: Send stats from [10.1.1.5:8,015], Nr of bytes sent=20,441,242 over 26,000 = 786 bytes/request, processing time 58,669,413 msec, avg processing time 2,257 msec !-- More lines similar to the above and below -- 2008-01-17 04:30:10,053 INFO : DataSender: Send stats from [10.1.1.5:8,015], Nr of bytes sent=39,242,737 over 50,000 = 784 bytes/request, processing time 58,672,059 msec, avg processing time 1,173 msec When node 2 restarts a couple minutes later, it is
Re: Cluster Not Syncing After Restart
On Jan 17, 2008 12:16 AM, Peter Rossbach [EMAIL PROTECTED] wrote: you can't restart your nodes so quickly. As you shutdown one node you must wait 30-45 sec before next restart. Don't restart both nodes at same time. I am waiting at 4 minutes before restarting node 2 before restarting node 1, so that is not the problem in this case. The sequence goes like this: 04:30: Shut down node 1, it gets restarted automatically immediately after it has completely shut down (typically within 10 seconds). 04:34: Shut down node 2, it gets restarted automatically immediately after it has completely shut down (typically within 10 seconds). -Dave - To start a new topic, e-mail: users@tomcat.apache.org To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Cluster Not Syncing After Restart
I'm running a Tomcat 5.5.25 cluster with 2 nodes and the following cluster configuration in the Host/ element: Cluster className=org.apache.catalina.cluster.tcp.SimpleTcpCluster Valve className=org.apache.catalina.cluster.tcp.ReplicationValve filter=.*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;/ Valve className=org.apache.catalina.cluster.session.JvmRouteBinderValve enabled=true/ ClusterListener className=org.apache.catalina.cluster.session.ClusterSessionListener/ ClusterListener className=org.apache.catalina.cluster.session.JvmRouteSessionIDBinderListener/ /Cluster Every night, I restart the two nodes of the cluster, one at a time, using a standard shutdown with a processing monitoring Tomcat to start it back up automatically after it has exited (daemontools for those interested). But after restarting the first node, it does not rejoin the cluster. In the logs of node 1 are this (node was shut down at 04:30): 2008-01-17 04:30:07,212 INFO : SimpleTcpCluster: Cluster is about to start 2008-01-17 04:30:07,317 INFO : SimpleTcpCluster: Add Default ClusterReceiver at cluster localhost 2008-01-17 04:30:07,326 INFO : SimpleTcpCluster: Add Default ClusterSender at cluster localhost 2008-01-17 04:30:07,341 INFO : SocketReplicationListener: Open Socket at [10.1.1.5:8015] 2008-01-17 04:30:07,363 ERROR: ClusterListener: Context manager doesn't exist: ^^^ That line repeats ~25,000 times for a couple seconds 2008-01-17 04:30:10,110 ERROR: ClusterListener: Context manager doesn't exist: 2008-01-17 04:30:11,443 INFO : McastService: membership mbean registered (Catalina:type=ClusterMembership,host=localhost) 2008-01-17 04:30:11,790 INFO : DeltaManager: Starting clustering manager...: 2008-01-17 04:30:11,790 INFO : DeltaManager: Register manager to cluster element Host with name localhost 2008-01-17 04:30:11,790 INFO : DeltaManager: Starting clustering manager at 2008-01-17 04:30:11,794 WARN : DeltaManager: Manager [], requesting session state from org.apache.catalina.cluster.mcast.McastMember[tcp://10.1.1.6:8015,catalina,10.1.1.6,8015, alive=86161577]. This operation will timeout if no session state has been received within 60 seconds. 2008-01-17 04:30:21,894 ERROR: DeltaManager: Manager []: No session state send at 17/01/08 04:30 received, timing out after 10,102 ms. While node 1 is shutting down, ~4000 messages like this are generated on node 2: 2008-01-17 04:30:06,233 WARN : FastAsyncSocketSender: Unable to asynchronously send session with id=[x.c-web2-1200438737718] - message will be ignored. java.lang.NullPointerException at org.apache.catalina.cluster.tcp.DataSender.pushMessage(DataSender.java:1057) at org.apache.catalina.cluster.tcp.FastAsyncSocketSender$FastQueueThread.pushQueuedMessages(FastAsyncSocketSender.java:524) at org.apache.catalina.cluster.tcp.FastAsyncSocketSender$FastQueueThread.run(FastAsyncSocketSender.java:487) 2008-01-17 04:30:06,235 WARN : FastAsyncSocketSender: Unable to asynchronously send session with id=[x.c-web2-1200438739053] - message will be ignored. java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333) at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195) at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366) at java.net.Socket.connect(Socket.java:519) at org.apache.catalina.cluster.tcp.DataSender.createSocket(DataSender.java:897) at org.apache.catalina.cluster.tcp.DataSender.openSocket(DataSender.java:866) at org.apache.catalina.cluster.tcp.DataSender.pushMessage(DataSender.java:1009) at org.apache.catalina.cluster.tcp.FastAsyncSocketSender$FastQueueThread.pushQueuedMessages(FastAsyncSocketSender.java:524) at org.apache.catalina.cluster.tcp.FastAsyncSocketSender$FastQueueThread.run(FastAsyncSocketSender.java:487) These stop after about 1 second (last one at 04:30:07.340), and then a decent number of Send stats lines are then printed on node 2: 2008-01-17 04:30:07,376 INFO : DataSender: Send stats from [10.1.1.5:8,015], Nr of bytes sent=20,441,242 over 26,000 = 786 bytes/request, processing time 58,669,413 msec, avg processing time 2,257 msec !-- More lines similar to the above and below -- 2008-01-17 04:30:10,053 INFO : DataSender: Send stats from [10.1.1.5:8,015], Nr of bytes sent=39,242,737 over 50,000 = 784 bytes/request, processing time 58,672,059 msec, avg processing time 1,173 msec When node 2 restarts a couple minutes later, it is able to successfully rejoin the cluster: 2008-01-17 04:34:05,433 INFO : SimpleTcpCluster: Cluster is about to start 2008-01-17 04:34:05,515 INFO : SimpleTcpCluster: Add Default ClusterReceiver at cluster localhost 2008-01-17 04:34:05,521 INFO :