More content for you Filip.

I've checked and followed the code of the listen event in ReplicationListener.java.

Here's what's happening:

selector.select(timeout) -> returns immediately with one SelectionKey available.
That key is not acceptable and not readable, so the code immediately skips those ifs and loops back to the beginning.


I've put traces in, and this loop executes about once every millisecond, hence the 100% load on the server.
Just to make sure, I put a Thread.sleep(10) at the end of the loop and the CPU dropped back to 0%. Replication still worked nicely, though probably a little slower because of the 10 ms wait.


I don't know much about the NIO packages, but it seems like select(timeout) shouldn't return a SelectionKey in that state without any waiting.
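For what it's worth, one classic way for select(timeout) to return instantly with a key that is neither acceptable nor readable is an interest op the loop never services — OP_WRITE on an idle socket is the usual suspect, since sockets are almost always writable. The sketch below only reproduces the symptom; it is not necessarily the actual bug in ReplicationListener.java, and the class name SelectorSpinDemo and the whole setup are mine:

```java
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class SelectorSpinDemo {

    /**
     * Registers OP_WRITE on an idle connected socket and does one select().
     * An idle socket is almost always writable, so select(timeout) returns
     * immediately with a key that is neither acceptable nor readable --
     * exactly the state described above. Returns the elapsed time in ms.
     */
    static long selectOnce() throws Exception {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.socket().bind(new InetSocketAddress("127.0.0.1", 0));
            int port = server.socket().getLocalPort();
            try (SocketChannel client =
                     SocketChannel.open(new InetSocketAddress("127.0.0.1", port))) {
                client.configureBlocking(false);
                client.register(selector, SelectionKey.OP_WRITE);

                long start = System.nanoTime();
                int n = selector.select(5000);  // returns right away, not after 5 s
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;

                SelectionKey key = selector.selectedKeys().iterator().next();
                System.out.println("select() returned " + n + " key(s) in "
                        + elapsedMs + " ms; acceptable=" + key.isAcceptable()
                        + " readable=" + key.isReadable()
                        + " writable=" + key.isWritable());
                // A loop that only checks isAcceptable()/isReadable() skips
                // this key, loops back, and select() returns immediately
                // again: a busy loop pinning one CPU. Only registering
                // interest ops the loop actually handles avoids the spin.
                return elapsedMs;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        selectOnce();
    }
}
```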

Let me know what you can dig up from this.

Jean-Philippe Bélanger

[EMAIL PROTECTED] wrote:

Hi Filip.

I did some profiling of 40 minutes of Tomcat with and without a 2nd node up. Here are the results with -Xrunhprof:cpu=samples,thread=y,file=/u01/portal/java.hprof.txt,depth=10:

Those numbers are cpu=times and not samples, since the latter freezes on my systems.
So the list shows the time spent in each method.


Major difference: the calls to the sun.nio.ch.PollArrayWrapper class. I don't know much about the NIO packages, but 819,692 calls in 40 minutes is a lot.
The socket interface was called more than twice as often with 2 hosts as with a single one, which seems normal.


Maybe this can help.
If you need the complete hprof files I can send them to you.

1 host in cluster:
CPU TIME (ms) BEGIN (total = 19701) Thu Jan  8 10:00:59 2004
rank   self  accum   count trace method
   1 11.48% 11.48%      54    85 java.lang.Object.wait
   2 11.46% 22.94%     117    86 java.lang.Object.wait
   3 10.95% 33.89%    4115   215 java.net.PlainDatagramSocketImpl.receive
   4 10.93% 44.81%    4114   224 java.lang.Thread.sleep
   5 10.91% 55.73%   19005   214 sun.nio.ch.PollArrayWrapper.poll0
   6  7.37% 63.09%      28   495 java.lang.Object.wait
   7  7.24% 70.34%      10   576 java.lang.Object.wait
   8  4.57% 74.90%      90   716 java.lang.Thread.sleep
   9  4.48% 79.38%       1   909 java.lang.Object.wait
  10  4.48% 83.86%       1   908 java.lang.Object.wait
  11  4.48% 88.34%      15   810 java.lang.Object.wait
  12  4.47% 92.81%       1   910 java.net.PlainSocketImpl.socketAccept
  13  0.71% 93.52%       2   623 java.lang.Object.wait
  14  0.56% 94.08%       2   706 java.lang.Object.wait
  15  0.38% 94.46%       2   914 java.lang.Object.wait
  16  0.24% 94.70%     775   913 java.lang.String.toCharArray
  17  0.23% 94.93%       3   475 java.lang.Thread.sleep
  18  0.16% 95.09%       2   472 java.lang.Object.wait
  19  0.15% 95.24%       2   595 java.lang.Thread.sleep
  20  0.15% 95.40%       2   586 java.lang.Thread.sleep
  21  0.15% 95.55%       2   703 java.lang.Thread.sleep
  22  0.15% 95.70%       2   476 java.lang.Thread.sleep
  23  0.15% 95.85%       2   692 java.lang.Thread.sleep
  24  0.12% 95.97%  218595   385 java.lang.CharacterDataLatin1.toLowerCase
  25  0.12% 96.09%  218595   408 java.lang.Character.toLowerCase
  26  0.11% 96.20%  218595   433 java.lang.CharacterDataLatin1.getProperties
  27  0.10% 96.30%  210925   389 java.lang.String.equalsIgnoreCase
  28  0.08% 96.38%  157259   387 java.lang.String.charAt
  29  0.08% 96.46%       1   646 java.lang.Thread.sleep
  30  0.08% 96.53%       1   634 java.lang.Thread.sleep
  31  0.08% 96.61%       1   903 java.lang.Thread.sleep
  32  0.08% 96.69%       1   714 java.lang.Thread.sleep
  33  0.08% 96.76%       1   811 java.lang.Thread.sleep
  34  0.08% 96.84%       1   715 java.lang.Thread.sleep


2 hosts:
CPU TIME (ms) BEGIN (total = 37247) Thu Jan  8 11:01:28 2004
rank   self  accum   count trace method
  1  9.56%  9.56%      52    85 java.lang.Object.wait
  2  9.56% 19.12%      29    86 java.lang.Object.wait
  3  9.30% 28.43%       3   267 java.lang.Object.wait
  4  9.25% 37.68%    6644   224 java.lang.Thread.sleep
  5  9.23% 46.91%   13116   215 java.net.PlainDatagramSocketImpl.receive
  6  7.67% 54.58%       3   266 java.lang.Object.wait
  7  5.90% 60.47%      39   847 java.lang.Object.wait
  8  5.76% 66.24%      12   503 java.lang.Object.wait
  9  3.90% 70.14%     145   975 java.lang.Thread.sleep
 10  3.90% 74.04%       1  1174 java.lang.Object.wait
 11  3.90% 77.94%       1  1173 java.lang.Object.wait
 12  3.90% 81.84%      25   973 java.lang.Object.wait
 13  3.90% 85.74%       1  1175 java.net.PlainSocketImpl.socketAccept
 14  3.88% 89.62%  819692   214 sun.nio.ch.PollArrayWrapper.poll0
 15  0.75% 90.37%       2   958 java.lang.Object.wait
 16  0.28% 90.65%       2   457 java.lang.Object.wait
 17  0.26% 90.91%       2  1181 java.lang.Object.wait

Filip Hanik wrote:

I'll try to get an instance going today. Will let you know how it goes.
Also, try asynchronous replication: does it still go to 100%?

Filip
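For reference, here is a sketch of how asynchronous mode would be selected in server.xml, assuming Tomcat 5.0's SimpleTcpCluster (the Sender element and its replicationMode attribute; check the cluster docs for your exact build):

```xml
<!-- Inside the <Cluster> element in server.xml.
     replicationMode is typically "synchronous", "asynchronous", or "pooled". -->
<Sender className="org.apache.catalina.cluster.tcp.ReplicationTransmitter"
        replicationMode="asynchronous"/>
```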

-----Original Message-----
From: Steve Nelson [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 07, 2004 12:08 PM
To: 'Tomcat Users List'
Subject: RE: tomcat 5.0.16 Replication




Okay, did that, and got this:


BEGIN TO RECEIVE
SENT:Default 1
RECEIVED:Default 1 FROM /10.0.0.110:5555
SENT:Default 2
BEGIN TO RECEIVE
RECEIVED:Default 2 FROM /10.0.0.110:5555
SENT:Default 3
BEGIN TO RECEIVE
RECEIVED:Default 3 FROM /10.0.0.110:5555
SENT:Default 4
BEGIN TO RECEIVE
RECEIVED:Default 4 FROM /10.0.0.110:5555

*shrug*

BTW, it didn't go to 100% CPU utilization before I started using the code from CVS.
Of course, the Manager would almost always time out before it would receive
the message.


Now it gets the message right away, but maxes my machine out.




-----Original Message-----
From: Filip Hanik [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 07, 2004 1:58 PM
To: Tomcat Users List
Subject: RE: tomcat 5.0.16 Replication


100% CPU can mean that you have a multicast problem. Try running:


java -cp tomcat-replication.jar MCaster

download the jar from http://cvs.apache.org/~fhanik/

Filip

-----Original Message-----
From: Steve Nelson [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 07, 2004 6:51 AM
To: '[EMAIL PROTECTED]'
Subject: tomcat 5.0.16 Replication



I was having random problems with clustering when starting up. Mostly it had
to do with timing out when the manager was starting up. I built the CVS
version and it solved that problem, but it has caused some serious
performance problems.


First a little background.

I have 2 servers, dual 300 MHz Compaq ProLiants, both running Red Hat 9 and
Tomcat 5.0.16 (with catalina-cluster.jar built from CVS). The multicast
packets are restricted to a crossover link between the servers. There are 3
hosts in the server.xml, all with clustering set up. They all function just
fine.


But... the CPUs spike up to 100% if I start up both servers. I know this
didn't happen without the new catalina-cluster.jar. If I shut down 1 server
(doesn't matter which), everything returns to normal. But when both are
running, both servers are at 100% CPU. I am trying to profile it now, but I
figured if someone has already experienced this they could save me some
time.


Oh, and there isn't anything relevant in my logs. It's not throwing millions
of errors or anything.


-Steve Nelson



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]












--
Jean-Philippe Bélanger
(514)228-8800 ext 3060
111 Duke
CGI




