On 10/05/2010 18:58, Schoenemann, Ryan wrote:
> I have 2 Tomcat 6.0.16 servers set up in a cluster running on a Windows
> 2003 VM as a Windows service, with Java version 1.6.0_10. After 10 - 14
> days of running, one of the Tomcat instances will start using 100% of the
> server CPU.

Can you upgrade to the latest version? 6.0.16 is getting on a bit...

p

> Through JConsole I see that the NioReceiver thread is the top CPU-using
> thread, where it is usually at the bottom with next to no CPU usage.
> When I restart the Tomcat6 Windows service everything goes back to
> normal, but a couple of days later the other server in the cluster will
> need to be restarted. I searched for similar occurrences but I was only
> able to find a problem with the NIO selector while running on Linux, and
> it was supposed to be fixed in a previous build of 1.6.
>
> I used the cluster setup from the Tomcat manual, with the exception of
> using synchronous replication.
>
> <Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
>          channelSendOptions="4">
>   <Manager className="org.apache.catalina.ha.session.DeltaManager"
>            expireSessionsOnShutdown="false"
>            notifyListenersOnReplication="true" />
>   <Channel className="org.apache.catalina.tribes.group.GroupChannel">
>     <Membership className="org.apache.catalina.tribes.membership.McastService"
>                 address="228.0.0.4" port="45564"
>                 frequency="500" dropTime="3000" />
>     <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
>               address="auto" port="4000"
>               autoBind="100" selectorTimeout="5000"
>               maxThreads="6" />
>     <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
>       <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender" />
>     </Sender>
>     <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector" />
>     <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor" />
>   </Channel>
>   <Valve className="org.apache.catalina.ha.tcp.ReplicationValve" filter="" />
>   <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve" />
>   <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener" />
>   <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener" />
> </Cluster>
>
> I took a thread dump during the most recent occurrence:
>
> [2010-05-04 07:49:40] [info] "NioReceiver"
> [2010-05-04 07:49:40] [info] daemon
> [2010-05-04 07:49:40] [info] prio=6 tid=0x54f9b400
> [2010-05-04 07:49:40] [info] nid=0x2e8
> [2010-05-04 07:49:40] [info] runnable
> [2010-05-04 07:49:40] [info] [0x5522f000..0x5522fa18]
> [2010-05-04 07:49:40] [info] java.lang.Thread.State: RUNNABLE
> [2010-05-04 07:49:40] [info] at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(Native Method)
> [2010-05-04 07:49:40] [info] at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(Unknown Source)
> [2010-05-04 07:49:40] [info] at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(Unknown Source)
> [2010-05-04 07:49:40] [info] at sun.nio.ch.WindowsSelectorImpl.doSelect(Unknown Source)
> [2010-05-04 07:49:40] [info] at sun.nio.ch.SelectorImpl.lockAndDoSelect(Unknown Source)
> [2010-05-04 07:49:40] [info] - locked <0x07563448>
> [2010-05-04 07:49:40] [info] (a sun.nio.ch.Util$1)
> [2010-05-04 07:49:40] [info] - locked <0x07563458>
> [2010-05-04 07:49:40] [info] (a java.util.Collections$UnmodifiableSet)
> [2010-05-04 07:49:40] [info] - locked <0x075633d0>
> [2010-05-04 07:49:40] [info] (a sun.nio.ch.WindowsSelectorImpl)
> [2010-05-04 07:49:40] [info] at sun.nio.ch.SelectorImpl.select(Unknown Source)
> [2010-05-04 07:49:40] [info] at org.apache.catalina.tribes.transport.nio.NioReceiver.listen(NioReceiver.java:243)
> [2010-05-04 07:49:40] [info] at org.apache.catalina.tribes.transport.nio.NioReceiver.run(NioReceiver.java:353)
> [2010-05-04 07:49:40] [info] at java.lang.Thread.run(Unknown Source)
>
> The only other thing I have noticed is that every evening around the
> same time I see the following messages posted in the catalina log for 5
> - 30 minutes:
>
> Apr 28, 2010 6:47:16 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
> INFO: Received memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, -116, 111, 42}:4000,{10, -116, 111, 42},4000, alive=155973672,id={78 -71 -19 48 57 82 65 122 -80 52 -24 28 -126 95 77 27 }, payload={}, command={}, domain={}, ]] message. Will verify.
>
> Apr 28, 2010 6:47:16 PM org.apache.catalina.tribes.transport.nio.NioReceiver socketTimeouts
> WARNING: Channel key is registered, but has had no interest ops for the last 3000 ms. (cancelled:false):sun.nio.ch.selectionkeyi...@a3ae07 last access:2010-04-28 18:47:10.283
>
> And this is the last message I see every day:
>
> Apr 28, 2010 6:47:29 PM org.apache.catalina.tribes.group.interceptors.TcpFailureDetector memberDisappeared
> INFO: Verification complete. Member still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://{10, -116, 111, 42}:4000,{10, -116, 111, 42},4000, alive=156009672,id={78 -71 -19 48 57 82 65 122 -80 52 -24 28 -126 95 77 27 }, payload={}, command={}, domain={}, ]]
>
> I'm trying to track down what in our environment is causing the two
> instances not to be able to communicate, and I'm not sure if this is
> what causes the NioReceiver to use all the CPU.
>
> Any help identifying what is causing the CPU usage increase would be
> appreciated.
>
> Thanks,
>
> Ryan
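
The per-thread CPU comparison described above can also be done in code with the standard java.lang.management.ThreadMXBean, which may help when capturing the state of a spinning instance before it is restarted. What follows is only a minimal sketch, assuming JDK 7 or later; the class name ThreadCpuRanking and its Entry type are made up for illustration and are not part of Tomcat or Tribes. It reports on the JVM it runs in, so to look at Tomcat it would have to run inside the Tomcat process or be adapted to a remote JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Tomcat): ranks live threads by CPU time.
public class ThreadCpuRanking {

    /** One report row: a thread name and its accumulated CPU time in nanoseconds. */
    public static final class Entry implements Comparable<Entry> {
        public final String name;
        public final long cpuNanos;

        Entry(String name, long cpuNanos) {
            this.name = name;
            this.cpuNanos = cpuNanos;
        }

        @Override
        public int compareTo(Entry other) {
            // Reverse order so the busiest thread sorts first.
            return Long.compare(other.cpuNanos, cpuNanos);
        }
    }

    /** Returns live threads of the current JVM, busiest first. */
    public static List<Entry> rank() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Entry> entries = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported()) {
            return entries; // this JVM cannot measure per-thread CPU time
        }
        if (!mx.isThreadCpuTimeEnabled()) {
            mx.setThreadCpuTimeEnabled(true);
        }
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id);     // -1 if the thread has died
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread has died
            if (cpu >= 0 && info != null) {
                entries.add(new Entry(info.getThreadName(), cpu));
            }
        }
        Collections.sort(entries);
        return entries;
    }

    public static void main(String[] args) {
        for (Entry e : rank()) {
            System.out.printf("%-40s %8d ms%n", e.name, e.cpuNanos / 1_000_000);
        }
    }
}
```

The figures are totals since each thread started, not a current rate. To see which thread is busy right now, take two samples a few seconds apart and compare the differences.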