Hey Shapira,

First of all, thanks for the response.

> I was surprised to see tomcat 3.x and apache 1.x with such massive
> hardware and elaborate setup. Any particular reason why you don't have
> more recent versions of tomcat and apache httpd?
We have a client dependency and cannot make some of these decisions on
our side. There are other reasons too, for example not enough time for
performance testing and the other test cycles an upgrade needs, which
could delay a release. We will be upgrading to the newer versions, but
first we would like to understand the causes of this behaviour.

> You're doing -Xmx, -Xms, -XX:NewSize, right? Not exactly as you typed
> above. You should keep tuning to reduce your GC pause: the train
> (-Xincgc) or parallel collectors would probably be good for you.

Yes, we are using -Xmx, -Xms and -XX:NewSize, not the forms I typed
earlier; sorry about the confusion. We are in the process of moving to
one of the parallel GC algorithms with JVM 1.4.2 (see the command-line
sketch further below), but we only have theoretical grounds to believe
it will be better (I'd be glad if you could point me to something
concrete). A parallel collector should suit us, since we have 4 CPUs
per machine, perhaps together with further parameters such as
compaction.

> If -Xmx=778m, that's the max heap. 778m is a strange number BTW: how
> did you arrive at that? Anyways, the output from top represents the
> whole JVM, i.e. heap + stack + symbols + OS overhead, so it's always
> more than the heap itself. However, a nearly 25% overhead
> (778mb->1024mb) is pretty wild. The difference between -Xmx and top
> when the heap is maxed out on Solaris is usually closer to 12-15%.

At the moment I don't know how we arrived at 778m. But I am more
interested in understanding the output pmap returns, so I am enclosing
one (shortened) with this mail:

  pmap 16803 (output shortened)
  16803: .../java/bin/java -server -XX:PermSize=64m -XX:NewSize=128m -X
  00010000     40K read/exec         .../j2sdk1.4.2/bin/java
  00028000      8K read/write/exec   .../j2sdk1.4.2/bin/java
  0002A000 124312K read/write/exec   [ heap ]   <- actual heap size: OS or VM or both???
  C4218000    424K read/write/exec   [ anon ]
  C4496000    432K read/write/exec   [ anon ]
  C4780000      8K read/write/exec   [ anon ]
  F9800000  26752K read/write/exec   [ anon ]   <- some swapping here, not sure???
  FC400000  26136K read/shared dev:85,35 ino:53053   <- some swapping here, not sure???
  C4800000 266240K read/write/exec   [ anon ]   <- some swapping here, not sure???
  D4C00000 532480K read/write/exec   [ anon ]   <- some swapping here, not sure???
  F5400000  65536K read/write/exec   [ anon ]   <- PermSize segment, I think, but again not sure???

124m + 266m + 532m + 26m + 26m = 974m; add the 64m perm segment and
some more small chunks and you get the 1066m total. (Any ideas if this
reading is correct???)

The total process size under top is 1066m (pid 16803). The segments
annotated "not sure" above: what do they actually signify? Excess
swapping? And why is the [ heap ] size in the pmap output not equal to
the SIZE column in top? What I meant was: if swapping is causing
problems here, do we need more memory, or do we need to tune the
application more? Comments???

> You have your causality mixed up here. High GC does not cause a high
> number of threads.

Yes, I think you are right; the high thread count doesn't cause the GC.
GC is invoked when the default percentage of the heap fills up. But
more or less all of these threads actually map to database connections,
and some to downloads.
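Coming back to the GC point for a moment, here is a sketch of the kind
of command line we are planning to test for the parallel collector on
1.4.2. The heap numbers are just our current settings carried over, and
whether -XX:+UseParallelGC beats -Xincgc for us is exactly what remains
to be measured, so please read this as an assumption under test, not a
recommendation:

  java -server \
       -Xms256m -Xmx778m \
       -XX:NewSize=128m -XX:PermSize=64m \
       -XX:+UseParallelGC \
       -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
       ... (tomcat startup class and arguments as today)

As I understand it, on 1.4.2 -XX:+UseParallelGC is the throughput
collector (parallel scavenging of the young generation, which should
use our 4 CPUs), while -Xincgc selects the incremental/train collector;
they are alternatives, so we would test one at a time.

On the pmap side, Solaris pmap also takes -x, which adds per-mapping
resident (RSS) and anon columns; that should show how much of each of
those big [ anon ] segments is actually resident rather than merely
reserved. Watching the sr (scan rate) column of vmstat should tell
whether the page scanner is really running, i.e. whether there is true
memory pressure:

  pmap -x 16803
  vmstat 5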
> Why are they unable to come back? Are they not released back into the
> pool? Do they go bad but not abandoned by the pool?

This is exactly what I want to know. It seems the connections are held
up and not released, or something else is going on; I would appreciate
any pointers on what to look for in such cases. One thing: we are using
the pool the Oracle JDBC driver provides, with min and max connections
set at 30 and 100, and the dynamic scheme, which allows the pool to
grow beyond 100. Should we restrict it to 100 with the fixed scheme?
Even if we do, we don't know what the aftermath of that would be.
Overall these connections are never restored, and that alone forces us
to kill this tomcat, which brings down all the connections in no time.

> What are the crash messages? If they're internal JVM crashes (producing
> hs_err_[pid] files), make sure you have the latest Solaris patches
> required for your JDK. There are a bunch of them required for JDK
> 1.4.x.

The crash message and bug id were checked: it was an active bug in
Sun's bug database, but it seems they have corrected it in JDK 1.4.2.
Since we haven't had any sudden deaths since the upgrade, it may indeed
have been fixed, or...? Whenever the process died it used to leave a
core file, and running pmap on it printed these anon and heap details,
which I was not sure how to interpret; I am still trying to find more
answers. Please help me find a way out of this, or give me a checklist
of what needs to be done/checked at this stage.
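On the pool scheme question above, here is a minimal sketch of what
switching from the dynamic to the fixed-wait scheme would look like,
assuming the oracle.jdbc.pool.OracleConnectionCacheImpl cache we are
using (method and constant names as I read them in the 9i JDBC docs;
the URL and credentials are of course made up), together with the
release pattern we are auditing the webapp code for:

  import java.sql.Connection;
  import java.sql.SQLException;
  import oracle.jdbc.pool.OracleConnectionCacheImpl;

  public class PoolSketch {
      public static void main(String[] args) throws SQLException {
          OracleConnectionCacheImpl cache = new OracleConnectionCacheImpl();
          cache.setURL("jdbc:oracle:thin:@dbhost:1521:ORCL"); // hypothetical host/SID
          cache.setUser("appuser");                           // hypothetical credentials
          cache.setPassword("secret");
          cache.setMinLimit(30);
          cache.setMaxLimit(100);
          // FIXED_WAIT: once 100 connections are open, callers block in
          // getConnection() until one is released, instead of growing past
          // the limit the way DYNAMIC_SCHEME does.
          cache.setCacheScheme(OracleConnectionCacheImpl.FIXED_WAIT_SCHEME);

          Connection conn = cache.getConnection();
          try {
              // ... use the connection ...
          } finally {
              conn.close(); // returns the connection to the cache; a missing
                            // close() on some code path is exactly the kind of
                            // leak that would explain connections never coming back
          }
      }
  }

The aftermath of the fixed-wait scheme would be that under load request
threads queue on getConnection() instead of opening connection 101;
whether that is better or worse than today depends on whether the held
connections are leaked (never closed) or merely slow. A thread dump
(kill -QUIT <tomcat pid> on Solaris, printed to the JVM's stdout)
should show where all those threads are actually stuck.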
Thanks in advance.

Regards,
Arnab

________________________________
From: Shapira, Yoav [mailto:[EMAIL PROTECTED]
Sent: Thu 10/23/2003 6:07 PM
To: Tomcat Users List
Subject: RE: Issues | Suggestion any?

Howdy,

>The (4) production servers are running Solaris OS with 4CPU, 4GB RAM and
>7GB of swap space. In all we have 12 tomcats and 4 Apaches.
>
>Each machine is equipped with one apache and 3 tomcats.
>
>The Database machines is also running Solaris with 16CPU's, 20 GB RAM
>and 20GB swap space.
>
>We have apache (1.3.27) and tomcat(3.3) in our production servers with
>JVM (1.4.2 recently upgraded). The frequent problems we face are:

I was surprised to see tomcat 3.x and apache 1.x with such massive
hardware and elaborate setup. Any particular reason why you don't have
more recent versions of tomcat and apache httpd?

>- High GC (increased pause time when the NewSize=128m -ms=256m -mx=778m
>and using the default LWP synchronization scheme with GC parameters
>PrintGCDetails and Time Stamps to analyse in GCportal)(these setting are
>for individual tomcats)

You're doing -Xmx, -Xms, -XX:NewSize, right? Not exactly as you typed
above. You should keep tuning to reduce your GC pause: the train
(-Xincgc) or parallel collectors would probably be good for you.

>- The Process running the JVM reached 1GB of size in the 'TOP' list
>recently, which at this point had to be killed on one of the tomcats.

If -Xmx=778m, that's the max heap. 778m is a strange number BTW: how
did you arrive at that? Anyways, the output from top represents the
whole JVM, i.e. heap + stack + symbols + OS overhead, so it's always
more than the heap itself. However, a nearly 25% overhead
(778mb->1024mb) is pretty wild. The difference between -Xmx and top
when the heap is maxed out on Solaris is usually closer to 12-15%.

>and sometime with high GC and CPU usage. The 3rd production machine
>causes high number of thread due to High GC most of the times and on

You have your causality mixed up here. High GC does not cause a high
number of threads.

>Once the database connections are high with an increased number of
>threads, they are unable to come back to the normal condition and we
>have to kill this tomcat as after some time when the load increases,

Why are they unable to come back? Are they not released back into the
pool? Do they go bad but not abandoned by the pool?

>One last thing, there are some occasional tomcat death (JVM crashes)
>once in a while.

What are the crash messages? If they're internal JVM crashes (producing
hs_err_[pid] files), make sure you have the latest Solaris patches
required for your JDK. There are a bunch of them required for JDK
1.4.x.

Yoav Shapira
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]