I'll take this offline with you, and if we resolve it, we'll post the solution here.

Filip

Raúl García wrote:
Hi again,

I tried the config with keepAliveTime set to 10:

<Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
           timeout="60000" keepAliveTime="10"
           keepAliveCount="0"/>

Once again, the cluster is not working. The big problem is that I cannot
reproduce the error on my backup server, which works perfectly.

Node 2 logs an error at 12:58 PM; at the same time, node 1 reports
"ClusterError" continuously (the errors occur on every hit; the server handles 1 hit per second).

Logs:

NODE 2 - LOG
=============
Jan 31, 2008 12:58:13 PM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):[EMAIL PROTECTED] last access:2008-01-31 12:58:10.208


NODE 1 - LOG
=============
Jan 31, 2008 12:58:04 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101194547,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]] message. Will verify.
Jan 31, 2008 12:58:04 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101194547,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]]
Jan 31, 2008 12:58:04 PM org.apache.catalina.ha.tcp.SimpleTcpCluster send
SEVERE: Unable to send message through cluster sender.
org.apache.catalina.tribes.ChannelException: Operation has timed out(60000 ms.).; Faulty members:tcp://localhost:4002;
        at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:97)
        at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:53)
        at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80)
        at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:61)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:73)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216)
        at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:835)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:814)
        at org.apache.catalina.ha.tcp.ReplicationValve.send(ReplicationValve.java:551)
        at org.apache.catalina.ha.tcp.ReplicationValve.sendMessage(ReplicationValve.java:535)
        at org.apache.catalina.ha.tcp.ReplicationValve.sendSessionReplicationMessage(ReplicationValve.java:517)
        at org.apache.catalina.ha.tcp.ReplicationValve.sendReplicationMessage(ReplicationValve.java:428)
        at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:362)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)
Jan 31, 2008 12:58:07 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101197553,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]] message. Will verify.
Jan 31, 2008 12:58:07 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=101197553,id={123 -66 95 -10 88 24 77 -32 -93 16 -13 -112 90 52 -18 78 }, payload={}, command={}, domain={}, ]]
[...] repeats on every hit.
========================

I cannot understand the node 2 log. Why is node 2 crashing?

What can I do?

Thanks in advance.

Raúl.


-----Original message-----
From: Filip Hanik - Dev Lists [mailto:[EMAIL PROTECTED]
Sent: Monday, January 28, 2008 1:45
To: Tomcat Users List
Subject: Re: Tomcat 6 - Cluster error.

I'd set keepAliveTime to 10 as well,

Filip

Raúl García wrote:
Hi again, and once again thanks for your time, but we still have problems.

We applied the "keepAliveCount=0" parameter, and last Wednesday, 23 Jan, we
restarted both nodes.

Around 11 hours after startup, node 1 reported a new error, but both nodes
are working perfectly.

I cannot imagine why the member disappears unexpectedly. I am reposting the
error and the config files.

INSTANCE 1 - LOG
================
Jan 24, 2008 10:25:54 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=123412856,id={-31 -91 -122 -60 -58 -5 68 25 -87 13 -20 -12 -100 5 -16 94 }, payload={}, command={}, domain={}, ]] message. Will verify.
Jan 24, 2008 10:25:54 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://localhost:4002,localhost,4002, alive=123412856,id={-31 -91 -122 -60 -58 -5 68 25 -87 13 -20 -12 -100 5 -16 94 }, payload={}, command={}, domain={}, ]]
Jan 24, 2008 10:25:54 PM org.apache.catalina.ha.tcp.SimpleTcpCluster send
SEVERE: Unable to send message through cluster sender.
org.apache.catalina.tribes.ChannelException: Operation has timed out(60000 ms.).; Faulty members:tcp://localhost:4002;
        at org.apache.catalina.tribes.transport.nio.ParallelNioSender.sendMessage(ParallelNioSender.java:97)
        at org.apache.catalina.tribes.transport.nio.PooledParallelSender.sendMessage(PooledParallelSender.java:53)
        at org.apache.catalina.tribes.transport.ReplicationTransmitter.sendMessage(ReplicationTransmitter.java:80)
        at org.apache.catalina.tribes.group.ChannelCoordinator.sendMessage(ChannelCoordinator.java:78)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor.sendMessage(ThroughputInterceptor.java:61)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.MessageDispatchInterceptor.sendMessage(MessageDispatchInterceptor.java:73)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.interceptors.TcpFailureDetector.sendMessage(TcpFailureDetector.java:87)
        at org.apache.catalina.tribes.group.ChannelInterceptorBase.sendMessage(ChannelInterceptorBase.java:75)
        at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:216)
        at org.apache.catalina.tribes.group.GroupChannel.send(GroupChannel.java:175)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.send(SimpleTcpCluster.java:835)
        at org.apache.catalina.ha.tcp.SimpleTcpCluster.sendClusterDomain(SimpleTcpCluster.java:814)
        at org.apache.catalina.ha.tcp.ReplicationValve.send(ReplicationValve.java:551)
        at org.apache.catalina.ha.tcp.ReplicationValve.sendMessage(ReplicationValve.java:535)
        at org.apache.catalina.ha.tcp.ReplicationValve.sendSessionReplicationMessage(ReplicationValve.java:517)
        at org.apache.catalina.ha.tcp.ReplicationValve.sendReplicationMessage(ReplicationValve.java:428)
        at org.apache.catalina.ha.tcp.ReplicationValve.invoke(ReplicationValve.java:362)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)

Jan 24, 2008 10:26:54 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared [...] repeats only once again.

Jan 25, 2008 5:37:52 AM org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor report
INFO: ThroughputInterceptor Report[
        Tx Msg:66167 messages
        Sent:37.02 MB (total)
        Sent:37.02 MB (application)
        Time:118.53 seconds
        Tx Speed:0.31 MB/sec (total)
        TxSpeed:0.31 MB/sec (application)
        Error Msg:2
        Rx Msg:90000 messages
        Rx Speed:0.00 MB/sec (since 1st msg)
        Received:41.06 MB]




INSTANCE-1 --- Server.xml
==========================
NOTE: 111.111.111.111 is the server IP address.
==========================
<Server port="8006" shutdown="SHUTDOWN" debug="0">
  <Listener className="org.apache.catalina.core.JasperListener"
debug="0"/>
  <Listener className="org.apache.catalina.mbeans.ServerLifecycleListener"
debug="0"/>
  <Listener
className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener"
debug="0"/>

  <GlobalNamingResources>
    <Environment name="InstanceName" type="java.lang.String"
value="pro1"/>
    <Resource name="UserDatabase" auth="Container"
              type="org.apache.catalina.UserDatabase"
              description="User database that can be updated and saved"

factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
              pathname="conf/tomcat-users.xml" />
  </GlobalNamingResources>

  <Service name="Catalina">

    <Connector port="8081" protocol="HTTP/1.1" maxHttpHeaderSize="8192"
emptySessionPath="true"
               maxThreads="150" minSpareThreads="100"
maxSpareThreads="300"
               enableLookups="false" redirectPort="81443"
acceptCount="1000"
               debug="0" connectionTimeout="20000"
disableUploadTimeout="true"
               compression="on"
                           compressionMinSize="2048"
                           noCompressionUserAgents="gozilla, traviata"
                           compressableMimeType="text/html,text/xml" />

    <Engine name="Catalina" defaultHost="localhost" debug="0"
jvmRoute="PR1">
                        <Cluster
className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                 channelSendOptions="6">

          <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>

          <Channel
className="org.apache.catalina.tribes.group.GroupChannel">
            <Membership
className="org.apache.catalina.tribes.membership.McastService"
                        address="228.0.0.4"
                        port="45564"
                        frequency="1000"
                        dropTime="30000"/>
            <Receiver
className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                      address="127.0.0.1"
                      port="4001"
                      autoBind="100"
                      selectorTimeout="5000"
                      maxThreads="12"/>

            <Sender
className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
              <Transport
className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
timeout="60000" keepAliveCount="0"/>
            </Sender>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
          </Channel>

          <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
          <Deployer
className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                    tempDir="/tmp/war-temp/"
                    deployDir="/tmp/war-deploy/"
                    watchDir="/tmp/war-listen/"
                    watchEnabled="false"/>

          <ClusterListener

className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
          <ClusterListener
className="org.apache.catalina.ha.session.ClusterSessionListener"/>
        </Cluster>
      <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
             debug="0" resourceName="UserDatabase"/>
      <Host name="localhost" debug="0" appBase="webapps"
            unpackWARs="true" autoDeploy="true"
            xmlValidation="false" xmlNamespaceAware="false">
          <Valve className="org.apache.catalina.valves.RemoteAddrValve"
                   allow="10.0.0.*,127.0.0.1,228.0.0.4,111.111.111.111"/>
      </Host>
    </Engine>
  </Service>
</Server>
==============================================


INSTANCE-2 server.xml
=====================
<Server port="8007" shutdown="SHUTDOWN" debug="0">

  <Listener className="org.apache.catalina.core.JasperListener"
debug="0"/>
  <Listener className="org.apache.catalina.mbeans.ServerLifecycleListener"
debug="0"/>
  <Listener
className="org.apache.catalina.mbeans.GlobalResourcesLifecycleListener"
debug="0"/>

  <GlobalNamingResources>

    <Environment name="InstanceName" type="java.lang.String"
value="pro2"/>
    <Resource name="UserDatabase" auth="Container"
              type="org.apache.catalina.UserDatabase"
              description="User database that can be updated and saved"

factory="org.apache.catalina.users.MemoryUserDatabaseFactory"
              pathname="conf/tomcat-users.xml"/>
  </GlobalNamingResources>

  <Service name="Catalina">

    <Connector port="8082" protocol="HTTP/1.1" maxHttpHeaderSize="8192"
emptySessionPath="true"
               maxThreads="150" minSpareThreads="100"
maxSpareThreads="300"
               enableLookups="false" redirectPort="82443"
acceptCount="1000"
               debug="0" connectionTimeout="20000"
disableUploadTimeout="true"
               compression="on"
                           compressionMinSize="2048"
                           noCompressionUserAgents="gozilla, traviata"
                           compressableMimeType="text/html,text/xml" />
    <Engine name="Catalina" defaultHost="localhost" debug="0"
jvmRoute="PR2">

                        <Cluster
className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
                 channelSendOptions="6">


          <Manager className="org.apache.catalina.ha.session.DeltaManager"
                   expireSessionsOnShutdown="false"
                   notifyListenersOnReplication="true"/>

          <Channel
className="org.apache.catalina.tribes.group.GroupChannel">
            <Membership
className="org.apache.catalina.tribes.membership.McastService"
                        address="228.0.0.4"
                        port="45564"
                        frequency="1000"
                        dropTime="30000"/>
            <Receiver
className="org.apache.catalina.tribes.transport.nio.NioReceiver"
                      address="127.0.0.1"
                      port="4002"
                      autoBind="100"
                      selectorTimeout="5000"
                      maxThreads="12"/>

            <Sender
className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
              <Transport
className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
timeout="60000" keepAliveCount="0"/>
            </Sender>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
            <Interceptor className="org.apache.catalina.tribes.group.interceptors.ThroughputInterceptor"/>
          </Channel>

          <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
filter=".*\.gif;.*\.js;.*\.jpg;.*\.png;.*\.htm;.*\.html;.*\.css;.*\.txt;"/>
          <!-- <Valve
className="org.apache.catalina.ha.session.JvmRouteBinderValve"/> -->

          <Deployer
className="org.apache.catalina.ha.deploy.FarmWarDeployer"
                    tempDir="/tmp/war-temp/"
                    deployDir="/tmp/war-deploy/"
                    watchDir="/tmp/war-listen/"
                    watchEnabled="false"/>

          <ClusterListener

className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
          <ClusterListener
className="org.apache.catalina.ha.session.ClusterSessionListener"/>
        </Cluster>

      <Realm className="org.apache.catalina.realm.UserDatabaseRealm"
             resourceName="UserDatabase" debug="0"/>

      <Host name="localhost" debug="0" appBase="webapps"
            unpackWARs="true" autoDeploy="true"
            xmlValidation="false" xmlNamespaceAware="false">

          <Valve className="org.apache.catalina.valves.RemoteAddrValve"
                   allow="10.0.0.*,127.0.0.1,228.0.0.4,111.111.111.111"/>
      </Host>
    </Engine>
  </Service>
</Server>
===============================

-----Original message-----
From: Filip Hanik - Dev Lists [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 17, 2008 19:01
To: Tomcat Users List
Subject: Re: Tomcat 6 - Cluster error.

already replied to your old thread

OK, it looks like you might have ended up with a rogue socket.
Any message sent to that socket just gets lost in the ether, since the socket doesn't have any interest ops. There are two workarounds for this: turn off keep-alives altogether, or implement a keep-alive timeout.

Option 1 - no keep-alives at all

<Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
          timeout="60000"
          keepAliveCount="0"/>

Option 2 - implement a keep-alive timeout

<Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
          timeout="60000"
          keepAliveTime="120000"/>

or make a combination of both values

either option should work for you.
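
For reference, the combination mentioned above could be sketched as follows. This is only an illustration that reuses the values from Option 1 and Option 2; as I understand the Tomcat cluster sender documentation, keepAliveCount limits how many requests go through a socket before it is closed and reopened, and keepAliveTime limits how many milliseconds a connection stays open. Whether these particular values suit a given cluster is an assumption to verify.

```xml
<!-- Sketch only: combines the keepAliveCount from Option 1 with the
     keepAliveTime from Option 2; the values are not tuned recommendations -->
<Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"
           timeout="60000"
           keepAliveCount="0"
           keepAliveTime="120000"/>
```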

On a side note, I'm interested in whether the scenario you ran into is reproducible. If it keeps happening over and over again, then, if possible, I'd like to get some debug logs from you.

Filip




---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



