I have two Tomcat 6.0.16 servers set up in a cluster, running on a Windows
2003 VM as a Windows service with Java 1.6.0_10.  After 10 - 14 days of
running, one of the Tomcat instances will start using 100% of the server
CPU.

 

Through JConsole I see that the NioReceiver thread is the top CPU-consuming
thread, whereas it is usually at the bottom with next to no CPU usage.
When I restart the Tomcat6 Windows service everything goes back to
normal, but a couple of days later the other server in the cluster will
need to be restarted.  I searched for similar occurrences but was only
able to find a problem with the NIO selector on Linux, and it was
supposed to have been fixed in an earlier build of 1.6.
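
For anyone who wants to capture the same per-thread CPU figures without JConsole, here is a minimal sketch using the standard `java.lang.management.ThreadMXBean`. It reports cumulative CPU time for the threads of the JVM it runs in, so to inspect Tomcat it would have to run inside the Tomcat JVM (or the same MXBean would have to be read over a remote JMX connection). The class name `ThreadCpuTop` is only illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class ThreadCpuTop {

    /** One row of the report: thread name and CPU time consumed so far. */
    static final class Row {
        final String name;
        final long cpuNanos;

        Row(String name, long cpuNanos) {
            this.name = name;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns the live threads of this JVM, highest cumulative CPU time first. */
    static List<Row> snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Row> rows = new ArrayList<Row>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return rows; // CPU timing is not available on this JVM
        }
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id);     // -1 if the thread has died
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread has died
            if (cpu >= 0 && info != null) {
                rows.add(new Row(info.getThreadName(), cpu));
            }
        }
        // Anonymous comparator keeps the sketch compatible with Java 6.
        Collections.sort(rows, new Comparator<Row>() {
            public int compare(Row a, Row b) {
                return a.cpuNanos < b.cpuNanos ? 1 : (a.cpuNanos > b.cpuNanos ? -1 : 0);
            }
        });
        return rows;
    }

    public static void main(String[] args) {
        for (Row r : snapshot()) {
            System.out.printf("%-40s %10d ms%n", r.name, r.cpuNanos / 1000000L);
        }
    }
}
```

Because the values are cumulative, taking two snapshots a few seconds apart and comparing them shows which thread is burning CPU right now rather than over the whole uptime.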

 

I used the cluster setup from the Tomcat manual, with the exception of
using synchronous replication.

 

<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelSendOptions="4">

  <Manager className="org.apache.catalina.ha.session.DeltaManager"
           expireSessionsOnShutdown="false"
           notifyListenersOnReplication="true" />

  <Channel className="org.apache.catalina.tribes.group.GroupChannel">

    <Membership className="org.apache.catalina.tribes.membership.McastService"
                address="228.0.0.4" port="45564"
                frequency="500" dropTime="3000" />

    <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
              address="auto" port="4000"
              autoBind="100" selectorTimeout="5000"
              maxThreads="6" />

    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender" />
    </Sender>

    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector" />

    <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor" />

  </Channel>

  <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter="" />

  <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve" />

  <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener" />

  <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener" />

</Cluster>
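
The `channelSendOptions="4"` value above is a bit mask. The sketch below decodes such a value into flag names. The constants are local copies that are assumed to match the `SEND_OPTIONS_*` values on `org.apache.catalina.tribes.Channel`; they are duplicated here only so the example compiles without Tomcat on the classpath, and the class name `SendOptions` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SendOptions {
    // Assumed to mirror the SEND_OPTIONS_* constants of the Tribes Channel
    // interface; verify against the Tomcat version actually in use.
    static final int BYTE_MESSAGE = 0x0001;
    static final int USE_ACK = 0x0002;
    static final int SYNCHRONIZED_ACK = 0x0004;
    static final int ASYNCHRONOUS = 0x0008;

    /** Lists the names of the flags that are set in the given bit mask. */
    static List<String> decode(int options) {
        List<String> names = new ArrayList<String>();
        if ((options & BYTE_MESSAGE) != 0) names.add("BYTE_MESSAGE");
        if ((options & USE_ACK) != 0) names.add("USE_ACK");
        if ((options & SYNCHRONIZED_ACK) != 0) names.add("SYNCHRONIZED_ACK");
        if ((options & ASYNCHRONOUS) != 0) names.add("ASYNCHRONOUS");
        return names;
    }

    public static void main(String[] args) {
        System.out.println("4 -> " + decode(4));
        System.out.println("8 -> " + decode(8));
    }
}
```

Under that assumption, 4 selects synchronized-ack sending, which is consistent with the synchronous replication described above.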

 

I took a thread dump during the most recent occurrence (logged at
2010-05-04 07:49:40):

"NioReceiver" daemon prio=6 tid=0x54f9b400 nid=0x2e8 runnable [0x5522f000..0x5522fa18]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
        at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(Unknown Source)
        at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(Unknown Source)
        at sun.nio.ch.WindowsSelectorImpl.doSelect(Unknown Source)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
        - locked <0x07563448> (a sun.nio.ch.Util$1)
        - locked <0x07563458> (a java.util.Collections$UnmodifiableSet)
        - locked <0x075633d0> (a sun.nio.ch.WindowsSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(Unknown Source)
        at org.apache.catalina.tribes.transport.nio.NioReceiver.listen(NioReceiver.java:243)
        at org.apache.catalina.tribes.transport.nio.NioReceiver.run(NioReceiver.java:353)
        at java.lang.Thread.run(Unknown Source)
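
A single dump like this cannot tell a selector that is blocked in the native poll from one that returns immediately in a tight loop, because a thread inside native code is reported as RUNNABLE either way. One way to tell the two apart is to time each `select()` call and count consecutive calls that return no ready keys long before the timeout. The following is a minimal sketch of that idea, not the Tribes implementation; the class name `SelectSpinDetector`, the threshold, and the "less than half the timeout" rule are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.nio.channels.Selector;

class SelectSpinDetector {
    private final int threshold;
    private int prematureEmptyReturns;

    SelectSpinDetector(int threshold) {
        this.threshold = threshold;
    }

    /**
     * Records the outcome of one select() call. Returns true once
     * `threshold` consecutive calls have returned zero ready keys in
     * less than half of the requested timeout.
     */
    boolean record(int readyKeys, long elapsedMillis, long timeoutMillis) {
        boolean premature = readyKeys == 0 && elapsedMillis < timeoutMillis / 2;
        prematureEmptyReturns = premature ? prematureEmptyReturns + 1 : 0;
        return prematureEmptyReturns >= threshold;
    }

    public static void main(String[] args) throws IOException {
        long timeoutMillis = 200;
        SelectSpinDetector detector = new SelectSpinDetector(100);
        Selector selector = Selector.open();
        try {
            // No channels are registered, so each call should block for
            // roughly the full timeout on a healthy selector.
            for (int i = 0; i < 5; i++) {
                long start = System.nanoTime();
                int ready = selector.select(timeoutMillis);
                long elapsed = (System.nanoTime() - start) / 1000000L;
                if (detector.record(ready, elapsed, timeoutMillis)) {
                    System.out.println("select() appears to be spinning");
                }
            }
        } finally {
            selector.close();
        }
    }
}
```

Calls to `Selector.wakeup()` also cause early empty returns, so a real check would need a threshold high enough to ignore occasional legitimate wakeups.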

 

 

The only other thing I have noticed is that every evening around the
same time I see the following messages posted in the catalina log for 5
- 30 minutes:

 

Apr 28, 2010 6:47:16 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, -116, 111, 42}:4000,{10, -116, 111, 42},4000, alive=155973672,id={78 -71 -19 48 57 82 65 122 -80 52 -24 28 -126 95 77 27 }, payload={}, command={}, domain={}, ]] message. Will verify.

 

Apr 28, 2010 6:47:16 PM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):sun.nio.ch.selectionkeyi...@a3ae07 last access:2010-04-28 18:47:10.283

 

 

And this is the last message I see every day:

 

Apr 28, 2010 6:47:29 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, -116, 111, 42}:4000,{10, -116, 111, 42},4000, alive=156009672,id={78 -71 -19 48 57 82 65 122 -80 52 -24 28 -126 95 77 27 }, payload={}, command={}, domain={}, ]]
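
The member address in these messages is printed as signed Java bytes, which makes it hard to match against network tooling. A small sketch of the conversion, using only the JDK (the class name `MemberAddress` is illustrative):

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class MemberAddress {

    /** Converts a raw 4-byte (or 16-byte) address to its textual form. */
    static String toHostAddress(byte[] raw) {
        try {
            return InetAddress.getByAddress(raw).getHostAddress();
        } catch (UnknownHostException e) {
            // Thrown only when the array length is not 4 or 16.
            throw new IllegalArgumentException("not a valid IP address length", e);
        }
    }

    /** Interprets a signed Java byte as an unsigned octet (0..255). */
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte[] member = {10, -116, 111, 42}; // as printed in the log
        System.out.println(toHostAddress(member)); // prints 10.140.111.42
    }
}
```

So `{10, -116, 111, 42}` is the host 10.140.111.42, since -116 read as an unsigned octet is 140.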

 

 

I'm trying to track down what in our environment is preventing the two
instances from communicating, and I'm not sure whether this is what
causes the NioReceiver to use all the CPU.

 

Any help identifying what is causing the CPU usage increase would be
appreciated.  

 

 

Thanks,

 

Ryan

 

 

 


