On Sep 5, 2012, at 8:48 PM, kharp...@oreillyauto.com wrote:

> I'm working with Lee on this as well, so I can help answer most of that.
>
> In short: Yes, all our replication is working well. We have keepalived
> acting as a vrrp device (no round-robin dns) in front of a few web servers
> (apache 2.2.x, mod_proxy/mod_ajp) which are using stickysessions and
> BalancerMembers. Replication (DeltaManager/SimpleTcpCluster) is working
> as intended on the tomcat side (6.0.24).

Jumping in a little late on this thread, but have you considered trying the
BackupManager instead of DeltaManager? The DeltaManager is going to replicate
session data to all cluster members while BackupManager will only replicate
to one backup cluster member. This might save you some time on restart.

https://tomcat.apache.org/tomcat-7.0-doc/config/cluster-manager.html#org.apache.catalina.ha.session.BackupManager_Attributes

Dan

>
> After further research, the problem we're seeing is performance with
> replication when the number of sessions is larger than around 2000. Using
> Jmeter on our test servers I can reproduce the problem. Here are the times
> it takes to replicate X number of sessions when an application is
> restarted:
>
>   Sess    Time (sec)
>     10     0.101
>    125     0.401
>    500     1.302
>   1500     2.104
>   1800     5.308
>   1800     6.709
>   2400    15.02
>   3600    30.285
>   3600    27.238
>
> The times make sense until around 1500. The time it takes to replicate
> more than 1500 sessions becomes exponentially worse. Here is our cluster
> configuration from "node1":
>
> <Engine name="Catalina" defaultHost="localhost"
>         jvmRoute="tntest-app-a-1">
>   <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
>            channelSendOptions="8">
>     <Manager className="org.apache.catalina.ha.session.DeltaManager"
>              stateTransferTimeout="45"
>              expireSessionsOnShutdown="false"
>              notifyListenersOnReplication="true" />
>     <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>       <Membership
>           className="org.apache.catalina.tribes.membership.McastService"
>           address="239.255.0.1"
>           port="45564"
>           frequency="500"
>           dropTime="3000" />
>
>       <Receiver
>           className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>           address="auto"
>           port="4000"
>           autoBind="100"
>           selectorTimeout="5000"
>           maxThreads="6" />
>
>       <Sender
>           className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
>         <Transport
>             className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
>             timeout="45000" />
>       </Sender>
>
>       <Interceptor
>           className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
>       <Interceptor
>           className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
>     </Channel>
>
>     <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
>            filter=""/>
>     <Valve
>         className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
>
>     <ClusterListener
>         className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
>     <ClusterListener
>         className="org.apache.catalina.ha.session.ClusterSessionListener"/>
>   </Cluster>
>
> The best time we got for 3600 sessions was 24 seconds, and that's when I
> added the following to the Manager tag (stole this from the 5.5 docs; not
> even sure it's valid in 6.x):
>
>   sendAllSessions="false"
>   sendAllSessionsSize="500"
>   sendAllSessionsWait="20"
>
> What has me stumped is why the time required to do more sessions is
> exponentially higher beyond 1500 sessions. Using JMeter I can simulate
> 3600 new users (all creating a session) and the two servers can serve the
> requests AND generate/replicate the sessions in under 19 seconds. Any
> ideas would be greatly appreciated. I have a full test environment to
> simulate anything you might recommend.
>
> Sincerely,
> Kyle Harper
>
> From: Igor Cicimov <icici...@gmail.com>
> To: Tomcat Users List <users@tomcat.apache.org>
> Date: 09/05/2012 07:12 PM
> Subject: Re: Tuning session replication on clusters
>
> On Thu, Sep 6, 2012 at 5:51 AM, <llow...@oreillyauto.com> wrote:
>
>> I have a small cluster of 3 nodes running tomcat 6.0.24 with openJDK
>> 1.6.0_20 on Ubuntu 10.04 LTS.
>>
>> I have roughly 5,000-6,000 sessions at any given time, and when I restart
>> one of the nodes I am finding that not all sessions are getting
>> replicated, even when I have the state transfer timeout set to 60
>> seconds.
>>
>> It seems that only sessions that have been touched recently are replicated,
>> even if the session is still otherwise valid. I did one test where I
>> created about 1,500 sessions and then took out one node. When I brought it
>> back online, it only replicated the 4-5 sessions that were from active
>> users on the test cluster. It did not replicate the idle sessions that
>> were still valid that my prior test had created.
>>
>> I am wanting to tune my settings, but I am unsure where would be the best
>> place to start. Should I start with the threads available to the NIO
>> Receiver, or would I be better off focusing on a different set of
>> attributes first, such as the send or receive timeout values?
>>
>> Any tips or pointers as to which setting might be the most productive would
>> be greatly appreciated.
>>
>> Lee Lowder
>> O'Reilly Auto Parts
>> Web Systems Administrator
>> (417) 862-2674 x1858
>>
>> This communication and any attachments are confidential, protected by
>> Communications Privacy Act 18 USCS § 2510, solely for the use of the
>> intended recipient, and may contain legally privileged material. If you are
>> not the intended recipient, please return or destroy it immediately. Thank
>> you.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
>> For additional commands, e-mail: users-h...@tomcat.apache.org
>>
> For starters, does your cluster satisfy the requirements below?
>
> To run session replication in your Tomcat 6.0 container, the following
> steps should be completed:
>
> - All your session attributes must implement java.io.Serializable
> - Uncomment the Cluster element in server.xml
> - If you have defined custom cluster valves, make sure you have the
>   ReplicationValve defined as well under the Cluster element in server.xml
> - If your Tomcat instances are running on the same machine, make sure
>   the tcpListenPort attribute is unique for each instance; in most cases
>   Tomcat is smart enough to resolve this on its own by autodetecting
>   available ports in the range 4000-4100
> - Make sure your web.xml has the <distributable/> element
> - If you are using mod_jk, make sure that jvmRoute attribute is set at
>   your Engine <Engine name="Catalina" jvmRoute="node01" > and that the
>   jvmRoute attribute value matches your worker name in workers.properties
> - Make sure that all nodes have the same time and sync with NTP service!
> - Make sure that your loadbalancer is configured for sticky session
>   mode.
>
> Also, you don't say what you are using for load balancing. It would help
> to post your cluster definition as well.
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
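
For reference, the BackupManager suggestion at the top of this message might look roughly like the fragment below in server.xml. This is a minimal, untested sketch. It assumes the Manager element inside the quoted Cluster configuration is simply swapped out and everything else is left alone. The mapSendOptions attribute and its value are not part of the quoted config; they are taken from my recollection of the Tomcat clustering documentation, so please check them against the docs for your version. The DeltaManager-specific stateTransferTimeout is dropped.

```xml
<!-- Sketch only: replaces the DeltaManager <Manager> element inside <Cluster>. -->
<!-- mapSendOptions="6" is an assumed value, not taken from the quoted config. -->
<Manager className="org.apache.catalina.ha.session.BackupManager"
         notifyListenersOnReplication="true"
         mapSendOptions="6" />
```

With this manager each session should be held by its primary node plus one backup node, rather than copied to every member, which is why a restarting node may have less state to pull in.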