Re: Issues in tomcat 5.0.19

2004-04-07 Thread David Rees
Carl Olivier wrote, On 4/6/2004 10:30 AM:
> Could the problem be that too many high processor-requirement threads are
> being started, and as such each gets less time on the processor - thus
> taking longer to process... and thus, should we not set the AJP worker
> maxThreads DOWN thus allowing the processor to finish each processor
> intensive task quicker?  Maybe set the acceptCount up... hmmm - thoughts?

I think you've found the problem.  If you have a page which takes 10s to 
process (and you say that it's CPU bound, not IO bound), once you have 
more threads running than you have CPUs on your server, that 10s is 
going to start taking longer.  You said your server was a single-CPU 
machine, so if you're running 2 concurrent threads, it will now take 
about 20s per request to process.
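
The arithmetic above can be written down as a back-of-the-envelope model. This is only a sketch: it assumes perfect time-slicing with no context-switch cost and no I/O waits, and the class and method names are made up for illustration.

```java
// Rough model of CPU-bound requests sharing a fixed number of CPUs.
// Assumption: perfect time-slicing, no context-switch cost, no I/O waits.
final class CpuBoundLatency {

    /**
     * Estimated wall-clock seconds per request when `concurrent` CPU-bound
     * requests share `cpus` processors.
     */
    static double perRequestSeconds(double cpuSecondsPerRequest, int concurrent, int cpus) {
        if (concurrent <= 0 || cpus <= 0) {
            throw new IllegalArgumentException("concurrent and cpus must be positive");
        }
        // Up to `cpus` requests each get a whole processor; beyond that they share.
        double slowdown = Math.max(1.0, (double) concurrent / cpus);
        return cpuSecondsPerRequest * slowdown;
    }

    public static void main(String[] args) {
        int cpus = 1; // single-CPU server, as described in the thread
        for (int concurrent : new int[] {1, 2, 5, 10}) {
            System.out.printf("%d concurrent -> about %.0fs per request%n",
                    concurrent, perRequestSeconds(10.0, concurrent, cpus));
        }
    }
}
```

Real numbers will be worse than this model once context switching becomes significant, so treat it as a lower bound.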

There isn't much you can do.  You need to limit the number of concurrent 
requests so that at maximum load, your slow page takes a reasonable time 
to complete.  Once you hit that limit you either need to start queueing 
requests or rejecting them.  If you don't, the server will just bog 
further and further down.  Once you get more than 10 or so CPU hogging 
threads going at a time, performance will really start to degrade and 
requests will take more than 10x as long as usual to complete, due to 
context switching overhead.

You can limit the number of concurrent requests at the connector level 
and then you might want to increase the accept count as you suggested.
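
The "limit, then queue, then reject" idea can be illustrated outside Tomcat with a plain Java thread pool. This is a hedged analogy, not Tomcat's implementation: `maxThreads` and `acceptCount` are borrowed here only as parameter names, the class is hypothetical, and in Tomcat the waiting happens in the connector's incoming connection queue rather than in a Java queue. It also needs `java.util.concurrent`, so a Java 5 or later JVM.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class BoundedRequestPool {

    /** Fixed number of workers plus a bounded waiting queue; extra work is rejected. */
    static ThreadPoolExecutor create(int maxThreads, int acceptCount) {
        return new ThreadPoolExecutor(
                maxThreads, maxThreads,                  // never more than maxThreads running
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(acceptCount),   // bounded backlog of waiting work
                new ThreadPoolExecutor.AbortPolicy());   // reject once the backlog is full
    }

    /** Submits `requests` slow tasks at once and returns how many were rejected. */
    static int countRejected(int maxThreads, int acceptCount, int requests) {
        ThreadPoolExecutor pool = create(maxThreads, acceptCount);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();             // stand-in for a slow page
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;                          // neither a worker nor a queue slot was free
                }
            }
        } finally {
            release.countDown();                         // let the "slow pages" finish
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 2 workers + 3 queue slots, 10 simultaneous requests.
        System.out.println("rejected: " + countRejected(2, 3, 10));
    }
}
```

With 2 workers and 3 queue slots, only 5 of 10 simultaneous slow requests are admitted; the rest fail fast instead of dragging the admitted ones down.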

The next thing you'll want to do is figure out how to turn those 10s 
page requests into 1s or less.  ;-)

-Dave

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


RE: Issues in tomcat 5.0.19

2004-04-07 Thread Carl Olivier
Hi David.

Ok, well I am trying our stress testing out with a LOWER maxThreads count and
a higher acceptCount.  Nice to have some confirmation about the theory!
Thanks for your feedback.

Yes - those 10 second pages need to be optimised - the problem is that those
pages are dependent upon data retrieval - with the AMOUNT of data being
variable.  Also the implementation of our tag library by a web developer
could be done in a number of different ways - so we will need to get the
subroutines optimised as much as possible!

Anyway - thanks for your reply - appreciated.

Regards,

Carl

-Original Message-
From: David Rees [mailto:[EMAIL PROTECTED] 
Sent: 07 April 2004 08:59 AM
To: Tomcat Users List
Subject: Re: Issues in tomcat 5.0.19





RE: Issues in tomcat 5.0.19

2004-04-06 Thread Shapira, Yoav

Hi,
That was a nice, detailed message and explanation.

> In Tomcat we have all 25 hosts set up using the Tomcat <Host/> blocks -
> each <Host/> has its own <Context/> pointing at the correct location on
> the drive for the site context ROOT.

25 hosts/contexts for tomcat shouldn't be a problem.

> Each Apache VirtualHost sends the Tomcat related requests to Tomcat using
> the same worker (ajp13:localhost:8009) - is this a potential problem?

Yes, though I don't know enough to say for sure.  Have you tried using
multiple workers and rerunning your stress test?

> In order to get around memory issues Tomcat is started with the following
> VM Settings:
>
> -Xms128m -Xmx256m -XX:+UseParallelGC

Does performance change if you allocate more memory to the heap, i.e.
are you memory-bound, cpu-bound, i/o-bound, or a combination thereof?
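
As a quick way to see what the heap settings actually give the JVM, something like the following can be run (or the same calls logged from inside the web app). It is a minimal sketch using only `java.lang.Runtime`; the class name is hypothetical.

```java
final class HeapReport {

    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** One-line summary of the CPU count, heap limits and current heap usage of this JVM. */
    static String summary() {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();          // roughly the -Xmx ceiling
        long committed = rt.totalMemory();  // heap currently reserved by the JVM
        long used = committed - rt.freeMemory();
        return String.format("cpus=%d heap used=%dMiB committed=%dMiB max=%dMiB",
                rt.availableProcessors(), toMiB(used), toMiB(committed), toMiB(max));
    }

    public static void main(String[] args) {
        // Run with the same flags as Tomcat, e.g. java -Xms128m -Xmx256m HeapReport
        System.out.println(summary());
    }
}
```

If "used" stays close to "max" under load, that points towards memory-bound; if it stays low while the CPU is pegged, it points towards cpu-bound.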

> 1.  First requests to sites are causing problems with TEI classloading -
> even though the classes ARE in fact in the WEB-INF/classes location of the
> relevant web app.

This I'm completely unfamiliar with.  Does it happen with tomcat
standalone?  If so, can you create a small WAR that will allow us to
reproduce the problem?

> 2.  Load - when we run load tests on the server we have major issues.  The
> sites run REALLY slowly - our test is a custom written load tester that
> parses apache request logs for the exact test sites we have on the box (the
> live equivs) and duplicates the requests - except we can configure the
> number of concurrent requests to run the test on.  The page responses are
> REALLY slow after a while and we also get IOExceptions occurring.

A couple of notes: JMeter's last release has something that will parse
web server access logs to create comparable load in a stress test plan
for you to run, so you might be able to ditch your custom tester.
(Which is good, because no one can reproduce results obtained with your
custom tester).

Are you saying the requests run fast at first, and then slow down, when
they should all be taking the same time?

> 1.  The number of concurrent requests and the fact that pages and the
> request response times are getting slow cause a major bottleneck in Tomcat -
> which leads to the piling up of queued requests - until the acceptCount is
> reached - and the server becomes totally unresponsive.

Are the requests slow because your tags take a while (you mentioned they
were big tags, which I assume means they do a lot of work)?  Or are
they slow because tomcat is churning?  I'm not sure of the causality in
your statement above.  What happens if you raise
maxProcessors/maxThreads as well as the acceptCount?


> 2.  GC - the JVM is not able to handle the GC of so many concurrent threads
> - which in turn leads to unresponsive threads/requests.

I highly doubt that.  How many threads are we talking about?  We have
apps that have hundreds of threads in the JVM and gigabytes in the heap,
and GC doesn't even cause a discernible pause.  Granted, we have big
hardware.
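
Whether GC is really the culprit can be measured rather than guessed. The sketch below reads the JVM's own GC counters; it assumes a Java 5 or later JVM (the `java.lang.management` API is not available before that), and the class name is hypothetical. On older JVMs, starting with `-verbose:gc` gives similar information in the log.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcReport {

    /** Total milliseconds spent in GC so far, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if this collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC share of uptime: %.2f%%%n",
                100.0 * totalGcMillis() / Math.max(1L, uptime));
    }
}
```

A GC share of a few percent would not explain requests slowing down by an order of magnitude.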

> 3.  The classloading issue in tomcat for TEI classes - ???  Is this a known
> bug in 5.0.19?

I don't think so.

Some things I would do in your place:

- Eliminate Apache and mod_jk, and run your tests on tomcat standalone
for now, to help diagnose the classloading, maxProcessors, acceptCount
issues.
- Set your contexts to reloadable=false to take away some threads and
internal tomcat processing, and improve performance.
- Try to come up with a tiny, as simple as possible test that brings up
the TEI classloading error.
- Try to use a standard tool like JMeter or AB for your stress testing,
so that at least we're on the same page regarding your results.

Yoav Shapira








RE: Issues in tomcat 5.0.19

2004-04-06 Thread Carl Olivier
Hi Yoav.

Thanks for your response - was hoping someone would wade through my rather
lengthy email!

Anyway, just some additional info/responses that may assist with
understanding our problem(s):

1. We have tested Tomcat standalone. We probably shouldn't have even
mentioned the Apache/AJP setup as it just confuses the issue.

2. There are several pages on our server which take over 10 seconds to
process. If we leave these pages out of the stress test, we don't have any
problems. If these are included in the stress test, they seem to create
bottlenecks which eventually result in Tomcat not responding (or responding
VERY slowly) and needing to be restarted.  Surely Tomcat should be able to
process the slower connections/requests without affecting the other, faster
requests.  Many of our tags do database retrievals (currently the database
server IS on the same server - sorry, forgot to include this in my last post -
we are running MSSQL 2000 - using the JDBC Type 4 driver from MS - through a
highly efficient connection pool).  We have run a profiler on the SQL Server -
and that definitely does not have a bottleneck - very quick indeed.

3. Memory: this used to be our biggest problem; however, increasing the heap
and using the -XX:+UseParallelGC switch seemed to solve most of our memory
problems. In fact, with this garbage collector, we rarely see any errors
during our test; it simply goes slower and slower over time.

4. CPU usage goes to 100% during the stress test and stays that way for a
minute or two after we stop the test.

5. If we delete Tomcat's working folder and restart (thus forcing
recompilation), it cannot handle our stress test (lots of timeouts and
exceptions) and we have to leave it for a minute before trying again.

6. Our TLD file contains over 200 tags but only 6 TEIs. The first time we
request a JSP on any site after restarting Tomcat, we get a
ClassNotFoundException for a TEI, even if that JSP does not use a tag with a
TEI. The second JSP request for that site always succeeds. This behaviour
occurs 100% of the time.
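
For the ClassNotFoundException on the TEI classes, a small probe like the following could help narrow down which class loader can see the class. It is only a sketch: the class name `com.example.tags.MyTagExtraInfo` is hypothetical, and to be meaningful the same check would have to be called from a JSP or servlet inside the affected web app rather than from the command line.

```java
final class ClassVisibility {

    /** True if `className` can be loaded through `loader` (without initialising it). */
    static boolean isLoadable(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical TEI class name; pass the real one as the first argument.
        String name = args.length > 0 ? args[0] : "com.example.tags.MyTagExtraInfo";
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        ClassLoader own = ClassVisibility.class.getClassLoader();
        System.out.println("context class loader sees it: " + isLoadable(name, context));
        System.out.println("this class's loader sees it:  " + isLoadable(name, own));
    }
}
```

If the two loaders disagree on the first request but agree on the second, that would suggest the lookup is going through the wrong loader the first time.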

Ok, well - I will look into sending you something to try to emulate the TEI
problem (probably tomorrow).

Thanks a stack for your help so far - I will try some of your suggestions.

Regards,

Carl Olivier

-Original Message-
From: Shapira, Yoav [mailto:[EMAIL PROTECTED] 
Sent: 06 April 2004 04:53 PM
To: Tomcat Users List
Subject: RE: Issues in tomcat 5.0.19




RE: Issues in tomcat 5.0.19

2004-04-06 Thread Carl Olivier
Oh, by the way - one thing:

When we tested Tomcat standalone it was actually worse than when we run it
through Apache and mod_jk.  The reason is clearly that as Tomcat slows down,
even the static content (non-JSP/servlet) takes ages to come back - whereas
through Apache, Apache would serve that quickly without sending to tomcat
(thus taking some load off).

An interesting thing we noted recently is that it seems to be the processing
time more than anything else that causes trouble, not throughput (which is
good).

Could the problem be that too many high processor-requirement threads are
being started, and as such each gets less time on the processor - thus
taking longer to process... and thus, should we not set the AJP worker
maxThreads DOWN thus allowing the processor to finish each processor
intensive task quicker?  Maybe set the acceptCount up... hmmm - thoughts?
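
This hypothesis can be checked in isolation with a small experiment: run the same batch of CPU-bound jobs on pools of different sizes and compare how long each job takes once it has started. This is a hedged sketch, not a Tomcat benchmark: `burn` is an arbitrary stand-in for a processor-intensive page, all names are hypothetical, and the timings depend entirely on the machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PoolSizeExperiment {

    // Written to so the JIT cannot discard the busy loop as dead code.
    static volatile long sink;

    /** Pure CPU work: a stand-in for a processor-intensive page. */
    static long burn(int iterations) {
        long x = 1;
        for (int i = 0; i < iterations; i++) {
            x = x * 6364136223846793005L + 1442695040888963407L;
        }
        return x;
    }

    /** Runs `tasks` CPU-bound jobs on `threads` workers; returns mean run time per job in ms. */
    static double meanTaskMillis(int threads, int tasks, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Callable<Long> job = () -> {
                    long start = System.nanoTime();   // time from start of execution, not from submit
                    sink = burn(iterations);
                    return System.nanoTime() - start;
                };
                futures.add(pool.submit(job));
            }
            long totalNanos = 0;
            for (Future<Long> f : futures) {
                totalNanos += f.get();
            }
            return totalNanos / 1e6 / tasks;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("experiment interrupted or failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int threads : new int[] {1, 2, 4, 8}) {
            System.out.printf("%d threads: %.1f ms per job%n",
                    threads, meanTaskMillis(threads, 8, 50_000_000));
        }
    }
}
```

Note that this measures run time per job, not time spent waiting in the queue; with fewer threads each job finishes sooner once started, but more jobs wait their turn.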

Thanks!

Regards,

Carl

-Original Message-
From: Carl Olivier [mailto:[EMAIL PROTECTED] 
Sent: 06 April 2004 06:01 PM
To: 'Tomcat Users List'
Subject: RE: Issues in tomcat 5.0.19

