OK, I just had to re-read the comments in server.xml and think about my setup a bit more. It still seems like it should have worked the other way [shrug]. Basically I had to tell the cluster setup to use only the network adapters that represent the private link between the two servers (ignoring the other NIC that is my outlet to the Internet).
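For anyone unsure which address to put in mcastBindAddr and tcpListenAddress, here is a minimal Java sketch (not part of Tomcat) that lists the machine's non-loopback IPv4 addresses and picks the first site-local one (10.x, 172.16-31.x, 192.168.x). The class name BindAddressPicker and its methods are hypothetical, and the sketch assumes the private link uses a site-local range, as the 192.168.11.3 address in this setup does.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper for illustration only; not a Tomcat class.
class BindAddressPicker {

    /** Returns the first site-local address among the given IP literals, or null if none. */
    public static String pickPrivate(String... ipLiterals) {
        for (String literal : ipLiterals) {
            try {
                // For a literal IP address, getByName only validates the format (no DNS lookup).
                InetAddress addr = InetAddress.getByName(literal);
                if (addr.isSiteLocalAddress()) {
                    return addr.getHostAddress();
                }
            } catch (UnknownHostException e) {
                // Not a valid address literal; skip it.
            }
        }
        return null;
    }

    /** Collects the non-loopback IPv4 addresses of all local network interfaces. */
    public static List<String> localIpv4Addresses() throws SocketException {
        List<String> result = new ArrayList<>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return result; // no interfaces reported
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                if (addr instanceof Inet4Address && !addr.isLoopbackAddress()) {
                    result.add(addr.getHostAddress());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        List<String> addrs = localIpv4Addresses();
        System.out.println("Local IPv4 addresses: " + addrs);
        System.out.println("Candidate bind address: "
                + pickPrivate(addrs.toArray(new String[0])));
    }
}
```

Run it on each server and treat the printed candidate only as a starting point; with more than one private NIC you would still have to choose by hand.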
Anyway, I changed server.xml as follows:

1) Added mcastBindAddr to the <Membership> tag.
2) Changed tcpListenAddress from "auto" to the address of the NIC that I wanted it to listen on.

Here's the "cluster" section of my server.xml in case anyone can benefit from it:

<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
         managerClassName="org.apache.catalina.cluster.session.DeltaManager"
         expireSessionsOnShutdown="false"
         useDirtyFlag="true"
         notifyListenersOnReplication="true">

    <Membership className="org.apache.catalina.cluster.mcast.McastService"
                mcastAddr="228.0.0.4"
                mcastPort="45564"
                mcastBindAddr="192.168.11.3"
                mcastFrequency="500"
                mcastDropTime="3000"/>

    <Receiver className="org.apache.catalina.cluster.tcp.ReplicationListener"
              tcpListenAddress="192.168.11.3"
              tcpListenPort="4001"
              tcpSelectorTimeout="100"
              tcpThreadCount="6"/>

    <Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
            replicationMode="pooled"
            ackTimeout="15000"/>

    <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
           filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>

    <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
              tempDir="/tmp/war-temp/"
              deployDir="/home/ltojsw/jakarta-tomcat-5.5.7/webapps"
              watchDir="/home/ltojsw/jakarta-tomcat-5.5.7-deployer/build/webapp"
              watchEnabled="false"/>
</Cluster>

--- Richard

Richard Mixon (qwest) wrote:
> OK, we still have one more issue with our Tomcat cluster as we move
> to our Linux environment.
>
> For some reason, both instances (jvmRoute=srv1 and jvmRoute=srv2) see
> each other at startup. We see that they each join the cluster just
> fine. But when the first request comes through we get a timeout
> exception trying to replicate.
>
> Of course it works fine in our Windows development environment, but
> now we are moving to our testing and production environments - SuSE
> Linux SLES9.
>
> Any ideas and suggestions are much appreciated.
> The catalina.log messages for both Tomcat instances are below.
>
> Thanks - Richard
>
> CLUSTER MEMBER 1 (jvmRoute=srv1):
>
> INFO: Server startup in 7332 ms
> Feb 21, 2005 9:02:58 PM org.apache.catalina.cluster.tcp.SimpleTcpCluster memberAdded
> INFO: Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://140.99.50.58:4001,140.99.50.58,4001, alive=2]
> 21:03:36,258 INFO [TP-Processor3] UserCounterListener:137 - Before increment, User Count: 0
> 21:03:36,262 INFO [TP-Processor3] UserCounterListener:140 - After increment, User Count: 1
> 21:03:36,263 INFO [TP-Processor3] UserCounterListener:73 - sessionCreated - Session info: id: '6615ABC7BD43B096AB54C031B7BE02C5.srv1'; createdAt '21:03:36'; lastAccessedAt '21:03:36'; currentTime '21:03:36; session count: '1
> 21:03:36,264 INFO [TP-Processor3] UserCounterListener:76 - sessionCreated - Session info: id: '6615ABC7BD43B096AB54C031B7BE02C5.srv1'; createdAt '21:03:36'; lastAccessedAt '21:03:36'; currentTime '21:03:36; session count: '1
> 21:05:14,482 INFO [TP-Processor2] UserCounterListener:137 - Before increment, User Count: 1
> 21:05:14,483 INFO [TP-Processor2] UserCounterListener:140 - After increment, User Count: 2
> 21:05:14,484 INFO [TP-Processor2] UserCounterListener:73 - sessionCreated - Session info: id: '61B3F35D9B0AAAE46F75AAA19FFC7D1B.srv1'; createdAt '21:05:14'; lastAccessedAt '21:05:14'; currentTime '21:05:14; session count: '2
> 21:05:14,485 INFO [TP-Processor2] UserCounterListener:76 - sessionCreated - Session info: id: '61B3F35D9B0AAAE46F75AAA19FFC7D1B.srv1'; createdAt '21:05:14'; lastAccessedAt '21:05:14'; currentTime '21:05:14; session count: '2
> Feb 21, 2005 9:06:45 PM org.apache.catalina.cluster.tcp.ReplicationTransmitter sendMessageData
> WARNING: Unable to send replicated message, is server down?
> java.net.ConnectException: Connection timed out
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>     at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>     at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:364)
>     at java.net.Socket.connect(Socket.java:507)
>     at java.net.Socket.connect(Socket.java:457)
>     at java.net.Socket.<init>(Socket.java:365)
>     at java.net.Socket.<init>(Socket.java:207)
>     at org.apache.catalina.cluster.tcp.SocketSender.connect(SocketSender.java:110)
>     at org.apache.catalina.cluster.tcp.SocketSender.sendMessage(SocketSender.java:157)
>     at org.apache.catalina.cluster.tcp.PooledSocketSender.sendMessage(PooledSocketSender.java:147)
>     at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageData(ReplicationTransmitter.java:247)
>     at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:281)
>     at org.apache.catalina.cluster.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:454)
>     at org.apache.catalina.cluster.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:467)
>     at org.apache.catalina.cluster.session.DeltaManager.createSession(DeltaManager.java:290)
>     at org.apache.catalina.cluster.session.DeltaManager.createSession(DeltaManager.java:239)
>     at org.apache.catalina.connector.Request.doGetSession(Request.java:2199)
>     at org.apache.catalina.connector.Request.getSessionInternal(Request.java:2150)
>     at org.apache.catalina.authenticator.FormAuthenticator.authenticate(FormAuthenticator.java:230)
>     at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:446)
>     at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
>     at org.apache.catalina.cluster.tcp.ReplicationValve.invoke(ReplicationValve.java:130)
>     at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
>     at org.apache.catalina.valves.FastCommonAccessLogValve.invoke(FastCommonAccessLogValve.java:481)
>     at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
>     at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
>     at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:306)
>     at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:385)
>     at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:745)
>     at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:675)
>     at org.apache.jk.common.SocketConnection.runIt(ChannelSocket.java:868)
>     at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
>     at java.lang.Thread.run(Thread.java:595)
>
> CLUSTER MEMBER 2 (jvmRoute=srv2):
>
> INFO: Cluster is about to start
> Feb 21, 2005 9:02:56 PM org.apache.catalina.cluster.mcast.McastService start
> INFO: Sleeping for 2000 secs to establish cluster membership
> Feb 21, 2005 9:02:56 PM org.apache.catalina.cluster.tcp.SimpleTcpCluster memberAdded
> INFO: Replication member added:org.apache.catalina.cluster.mcast.McastMember[tcp://140.99.50.60:4001,140.99.50.60,4001, alive=34160]
> Feb 21, 2005 9:02:58 PM org.apache.catalina.cluster.deploy.FarmWarDeployer start
> INFO: Cluster FarmWarDeployer started.
> Feb 21, 2005 9:06:10 PM org.apache.catalina.cluster.tcp.ReplicationTransmitter sendMessageData
> WARNING: Unable to send replicated message, is server down?
> java.net.ConnectException: Connection timed out
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
>     at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
>     at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:364)
>     at java.net.Socket.connect(Socket.java:507)
>     at java.net.Socket.connect(Socket.java:457)
>     at java.net.Socket.<init>(Socket.java:365)
>     at java.net.Socket.<init>(Socket.java:207)
>     at org.apache.catalina.cluster.tcp.SocketSender.connect(SocketSender.java:110)
>     at org.apache.catalina.cluster.tcp.SocketSender.sendMessage(SocketSender.java:157)
>     at org.apache.catalina.cluster.tcp.PooledSocketSender.sendMessage(PooledSocketSender.java:147)
>     at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessageData(ReplicationTransmitter.java:247)
>     at org.apache.catalina.cluster.tcp.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:270)
>     at org.apache.catalina.cluster.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:451)
>     at org.apache.catalina.cluster.session.DeltaManager.start(DeltaManager.java:600)
>     at org.apache.catalina.core.ContainerBase.setManager(ContainerBase.java:431)
>     at org.apache.catalina.startup.ContextConfig.managerConfig(ContextConfig.java:347)
>     at org.apache.catalina.startup.ContextConfig.start(ContextConfig.java:970)
>     at org.apache.catalina.startup.ContextConfig.lifecycleEvent(ContextConfig.java:249)
>     at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
>     at org.apache.catalina.core.StandardContext.start(StandardContext.java:4020)
>     at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:759)
>     at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:739)
>     at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:524)
>     at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:590)
>     at org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:535)
>     at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:470)
>     at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1106)
>     at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:310)
>     at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:119)
>     at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1019)
>     at org.apache.catalina.core.StandardHost.start(StandardHost.java:718)
>     at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1011)
>     at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:440)
>     at org.apache.catalina.core.StandardService.start(StandardService.java:450)
>     at org.apache.catalina.core.StandardServer.start(StandardServer.java:683)
>     at org.apache.catalina.startup.Catalina.start(Catalina.java:537)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:585)
>     at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:271)
>     at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:409)
> 21:06:10,700 WARN [main] DeltaManager:602 - Manager[/stars], requesting session state from org.apache.catalina.cluster.mcast.McastMember[tcp://140.99.50.60:4001,140.99.50.60,4001, alive=228016]. This operation will timeout if no session state has been received within 60 seconds
> 21:07:10,781 ERROR [main] DeltaManager:616 - Manager[/stars], No session state received, timing out.
> [Filter: profiling] Using parameter [app_profile]
> [Filter: profiling] defaulting to off [autostart=false]
> [Filter: profiling] Turning filter off [app_profile=off]
> Feb 21, 2005 9:07:13 PM org.apache.coyote.http11.Http11Protocol start
> INFO: Starting Coyote HTTP/1.1 on http-8080
> Feb 21, 2005 9:07:13 PM org.apache.jk.common.ChannelSocket init
> INFO: JK2: ajp13 listening on /0.0.0.0:8009
> Feb 21, 2005 9:07:13 PM org.apache.jk.server.JkMain start
> INFO: Jk running ID=0 time=0/140 config=null
> Feb 21, 2005 9:07:13 PM org.apache.catalina.storeconfig.StoreLoader load
> INFO: Find registry server-registry.xml at classpath resource
> Feb 21, 2005 9:07:13 PM org.apache.catalina.startup.Catalina start
> INFO: Server startup in 257629 ms
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

Richard Mixon (qwest) wrote:
> BTW,
>
> Each of my two servers has two network cards:
> a) One facing the Internet;
> b) the second is a private connection between the two servers.
>
> The second connection is intended for session replication.
>
> Also, I did not specify an mcastBindAddr - though it probably should
> be specified as the second network card.
>
> Thanks - Richard Mixon
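
The "Connection timed out" traces quoted above come from a plain java.net.Socket connect to the other member's replication port, so the same connection can be tried outside Tomcat. Below is a minimal sketch under that assumption; the class name ReplicationPortProbe is hypothetical, and the default host and port are simply the tcpListenAddress and tcpListenPort values from the server.xml in this thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic for illustration only; not a Tomcat class.
class ReplicationPortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    public static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() with an explicit timeout fails fast instead of waiting
            // for the operating system's default connect timeout.
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, unreachable hosts, and timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults taken from the server.xml posted in this thread.
        String host = args.length > 0 ? args[0] : "192.168.11.3";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 4001;
        boolean ok = canConnect(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

Running it from each server against the address the other member advertises in its "Replication member added" log line should show whether that address is reachable at all, or only the private one.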