cluster problems: memberDisappeared errors

Joshua Szmajda Mon, 18 Jul 2005 16:58:13 -0700

I'm using the cluster fix patch on Tomcat 5.5.9 (from http://issues.apache.org/bugzilla/show_bug.cgi?id=34389) with 8 hosts clustered together. I was seeing a lot of memberDisappeared errors before I applied this patch; now I'm still seeing them, but with more detail.

Here's an example error from catalina.out:

Jul 18, 2005 5:40:51 PM org.apache.catalina.cluster.tcp.SimpleTcpCluster memberDisappeared
INFO: Received member disappeared:org.apache.catalina.cluster.mcast.McastMember[tcp://10.0.0.15:4002,10.0.0.15,4002,alive=1018550]
Jul 18, 2005 5:40:51 PM org.apache.catalina.cluster.tcp.DataSender pushMessage
INFO: resending 782 bytes to 10.0.0.15:4002 from 55784
java.net.SocketException: Socket closed
       at java.net.SocketInputStream.read(SocketInputStream.java:162)
       at java.net.SocketInputStream.read(SocketInputStream.java:182)
       at org.apache.catalina.cluster.tcp.DataSender.waitForAck(DataSender.java:542)
       at org.apache.catalina.cluster.tcp.DataSender.pushMessage(DataSender.java:504)
       at org.apache.catalina.cluster.tcp.FastAsyncSocketSender$FastQueueThread.run(FastAsyncSocketSender.java:401)


A typical cluster config is:

<Cluster className="org.apache.catalina.cluster.tcp.SimpleTcpCluster"
         name="hydraNation"
         managerClassName="org.apache.catalina.cluster.session.DeltaManager"
         expireSessionsOnShutdown="false"
         useDirtyFlag="true"
         notifyListenersOnReplication="true">

    <Membership
        className="org.apache.catalina.cluster.mcast.McastService"
        mcastAddr="228.0.0.4"
        mcastPort="45564"
        mcastFrequency="700"
        mcastDropTime="5000"/>

    <Receiver
        className="org.apache.catalina.cluster.tcp.Jdk13ReplicationListener"
        tcpListenAddress="10.0.0.12"
        compress="false"
        tcpListenPort="4002"/>

    <Sender
        className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
        replicationMode="fastasyncqueue"
        compress="false"
        doProcessingStats="true"
        queueTimeWait="true"
        maxQueueLength="1000"
        queueDoStats="true"
        queueCheckLock="true"
        ackTimeout="15000"
        waitForAck="true"
        autoConnect="false"
        keepAliveTimeout="@node.ackTimeout@"
        keepAliveMaxRequestCount="-1"/>

    <Valve className="org.apache.catalina.cluster.tcp.ReplicationValve"
           filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>

    <Deployer className="org.apache.catalina.cluster.deploy.FarmWarDeployer"
              tempDir="/tmp/war-temp/"
              deployDir="/tmp/war-deploy/"
              watchDir="/tmp/war-listen/"
              watchEnabled="false"/>
</Cluster>
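
For reference, the Membership timing in this config can be turned into a rough heartbeat budget. This is only a sketch. It assumes mcastFrequency is the interval between multicast heartbeats in milliseconds, and that mcastDropTime is how long a member may stay silent before it is reported as disappeared. The class and method names (HeartbeatBudget, heartbeatsPerDropWindow) are made up for illustration and are not part of Tomcat.

```java
public class HeartbeatBudget {

    /**
     * Hypothetical helper (not a Tomcat API): how many whole heartbeat
     * intervals fit into the drop window. Assumes both values are in
     * milliseconds, as in the Membership element.
     */
    static long heartbeatsPerDropWindow(long mcastFrequencyMs, long mcastDropTimeMs) {
        if (mcastFrequencyMs <= 0 || mcastDropTimeMs < 0) {
            throw new IllegalArgumentException("frequency must be > 0 and drop time >= 0");
        }
        // Integer division: partial intervals do not count as a full heartbeat.
        return mcastDropTimeMs / mcastFrequencyMs;
    }

    public static void main(String[] args) {
        // Values taken from the Membership element in this config.
        long heartbeats = heartbeatsPerDropWindow(700, 5000);
        System.out.println("heartbeats per drop window: " + heartbeats);
    }
}
```

With mcastFrequency="700" and mcastDropTime="5000" this gives 7. Under the assumptions above, a node would have to lose about seven consecutive heartbeats, or stall for roughly five seconds, before its peers log memberDisappeared.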

Any ideas? I'm thinking there's something wrong with my multicast setup, but everything was working fine this morning... The servers are running RHEL3, all 2-way AMD64 machines with 4 GB RAM each. They each have two network interfaces: each eth0 is connected to one gigabit switch, each eth1 to another (internal) gigabit switch. I don't think I should be hitting any network bottlenecks..? There is a lot of load on the site being served in general, but no big jump in hits today.


Should I be using a fastasyncqueue? What are the tradeoffs in Sender modes?

Thanks in advance!
