Re: Tomcat 5.0 under load

2004-10-15 Thread Remy Maucherat
Keith Wannamaker wrote:
> Last month I took Yoav's advice and attempted to upgrade our
> production server from 4.1 to 5.0.  The production server handles 5 -
> 10 requests a second across 300 threads.  The problem I had then, and
> the problem I have now is that the server's accept thread will die
> within a short time after server start.  I hate to think I am the only
> person running tomcat 5 under a heavy load, but it sure looks that way.
>
> I initially blamed threadpool's bulletproofing, but because 4.1.31 and
> 5.0.28 share the same threadpool, and 4.1.31 runs indefinitely, there
> is a problem in core 5.0 that this load is exercising.
>
> I very much want to be able to recommend that tomcat 5.0 is
> production-ready but since we can't run it, I certainly am not in a
> position to do that.  I have reserved the next day or two for
> bulletproofing tomcat 5.0, so the point of this is to solicit any
> comments from those who may have been faced with the same problem and
> have looked at the problem themselves.

First, I didn't see any previous message about any of your problems.
Second, why do I even have to ask which OS / VM / connector / etc. you
are running? Also, please explain exactly what "the server's accept
thread will die" means.

BTW, jboss.org is running on Tomcat 4.1 (inside JBoss), with more load 
than your site, on RedHat EL 3 (we did have many issues on RH 9, which 
LD_ASSUME_KERNEL largely solved). Except when we make a configuration 
mistake (which happens more often than I would like), we don't have 
many issues. We do find TC 4.1 annoying compared to 5.0; I am told 
we'll upgrade soon, so I'll let people know how well it goes.

Rémy
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


RE: Tomcat 5.0 under load

2004-10-15 Thread Shapira, Yoav

Hi,
There are certainly other sites running Tomcat 5.0 under heavy load,
such as the ones listed on our wiki.  I personally have put Tomcat
5.0-based apps in production that have handled the load you describe
(and much higher peak bursty loads) for months at a time without need to
restart.

However, it could very well be your specific app or configuration is
exercising parts of Tomcat in ways other apps aren't.  Every app load
profile is unique.  So this should definitely result in an improvement
to Tomcat, or maybe the connectors if you're running Tomcat behind a
front-end web server.

I have no specific advice beyond the usual, which is to start with
something reproducible.  Can you get the accept thread to die every time
within a given time window after the server start?  Does it happen with
Tomcat standalone as well as behind a front-end server, or just the
latter?  Does it happen with the out-of-the-box server.xml or a heavily
modified one?

Yoav Shapira http://www.yoavshapira.com




Re: Tomcat 5.0 under load

2004-10-15 Thread Remy Maucherat
Shapira, Yoav wrote:
> Hi,
> There are certainly other sites running Tomcat 5.0 under heavy load,
> such as the ones listed on our wiki.  I personally have put Tomcat
> 5.0-based apps in production that have handled the load you describe
> (and much higher peak bursty loads) for months at a time without need to
> restart.
>
> However, it could very well be your specific app or configuration is
> exercising parts of Tomcat in ways other apps aren't.  Every app load
> profile is unique.  So this should definitely result in an improvement
> to Tomcat, or maybe the connectors if you're running Tomcat behind a
> front-end web server.
>
> I have no specific advice beyond the usual, which is to start with
> something reproducible.  Can you get the accept thread to die every time
> within a given time window after the server start?  Does it happen with
> Tomcat standalone as well as behind a front-end server, or just the
> latter?  Does it happen with the out-of-the-box server.xml or a heavily
> modified one?

Some facts:
- the higher level code cannot cause the accept thread to die
- the code for the whole threadpool is shared
So there's nothing that can be inherently more broken in TC 5.
However, what differs is timing: TC 5 will generally be faster, and 
since this issue would involve thread scheduling, this likely matters ...

If Keith is feeling like experimenting a little (without too much risk 
involved, though): try 5.5.3 with strategy=ms on the Connector. This 
will use the old TC 4.0 thread pool strategy, which is far less fancy, 
and was never reported as having trouble on stuff like RH 9.

Rémy


Re: Tomcat 5.0 under load

2004-10-15 Thread Keith Wannamaker
The time window is within about 15 minutes.  We run Tomcat standalone, 
with the standard HTTP/1.1 connector.  The server.xml is minimal.  I 
agree with the reproducible angle; that is always a good place to start.

Keith


Re: Tomcat 5.0 under load

2004-10-15 Thread Keith Wannamaker
Hey Remy,
> Some facts:
> - the higher level code cannot cause the accept thread to die

Thread dump from T0 shows the two expected accepts, from main and from 
the HttpConnector; thread dump at T(5mins) shows the main accept, many 
idle Http handler threads waiting for work, and a few long-running 
uploads and downloads, but no Http accept.  So, indeed, the thread is 
either stopping itself with no message or being killed with no message.
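
The before-and-after thread dump comparison described above can also be automated from inside the running JVM. The following is only a minimal sketch, assuming Java 5 or later (for `Thread.getAllStackTraces()`) and assuming the accept thread can be recognised by a fragment of its name. The class name `AcceptThreadCheck` and the fragment `http-8080` are hypothetical; use whatever name your own thread dump shows. The check only sees threads in its own JVM, so it would have to run inside Tomcat (for example from a servlet or a background thread), not as a separate process.

```java
import java.util.Map;

class AcceptThreadCheck {
    /** Returns true if any live thread's name contains the given fragment. */
    public static boolean hasLiveThread(String nameFragment) {
        // Snapshot of all live threads in this JVM (available since Java 5).
        Map<Thread, StackTraceElement[]> dump = Thread.getAllStackTraces();
        for (Thread t : dump.keySet()) {
            if (t.isAlive() && t.getName().contains(nameFragment)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "http-8080" is a hypothetical name fragment, not taken from the thread.
        String fragment = args.length > 0 ? args[0] : "http-8080";
        System.out.println("thread matching '" + fragment + "' alive: "
                + hasLiveThread(fragment));
    }
}
```

Calling `hasLiveThread` periodically and logging the first time it returns false would narrow down when the thread disappears, which a pair of manual dumps at T0 and T(5mins) cannot do.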

Interesting note on logging.  I tried today to use both jdk14 logger and 
simple log to show the progress of the accept.  The behavior of going 
through either logger is that init messages come through but runIt 
messages don't.  I thought it was the logger config so I reverted to 
System.out and still got that behavior.  I wonder if the underlying 
exception causing the problem is being squelched by whatever is 
squelching my messages.  Ever seen this?

> If Keith is feeling like experimenting a little (without too much risk
> involved, though): try 5.5.3 with strategy=ms on the Connector. This
> will use the old TC 4.0 thread pool strategy, which is far less fancy,
> and was never reported as having trouble on stuff like RH 9.

I may try this.
Thanks,
Keith


Re: Tomcat 5.0 under load

2004-10-15 Thread Remy Maucherat
Keith Wannamaker wrote:
> Hey Remy,
>> Some facts:
>> - the higher level code cannot cause the accept thread to die
>
> Thread dump from T0 shows the two expected accepts, from main and from
> the HttpConnector; thread dump at T(5mins) shows the main accept, many
> idle Http handler threads waiting for work, and a few long-running
> uploads and downloads, but no Http accept.  So, indeed, the thread is
> either stopping itself with no message or being killed with no message.

This is the same as the RH 9 bug. Which OS are you using?

> Interesting note on logging.  I tried today to use both jdk14 logger
> and simple log to show the progress of the accept.  The behavior of
> going through either logger is that init messages come through but
> runIt messages don't.  I thought it was the logger config so I
> reverted to System.out and still got that behavior.  I wonder if the
> underlying exception causing the problem is being squelched by
> whatever is squelching my messages.  Ever seen this?

Well, no. I have to admit I didn't look in detail at what everything 
does, since I didn't write that algorithm, and never quite understood 
why it works.


>> If Keith is feeling like experimenting a little (without too much
>> risk involved, though): try 5.5.3 with strategy=ms on the
>> Connector. This will use the old TC 4.0 thread pool strategy, which
>> is far less fancy, and was never reported as having trouble on stuff
>> like RH 9.
>
> I may try this.

That algorithm is really stupid. OTOH, it does seem to have more syncing 
(not that I can see any performance impact from it).

Rémy


Re: Tomcat 5.0 under load

2004-10-15 Thread Keith Wannamaker
Hey Remy, by "RH 9 bug" do you mean the problem with JDK 1.4 and NPTL on 
RH 9?  (The OS is RH 9.)  I am running without NPTL.

I did some more tracing and what I am seeing is that notify is called, 
the thread is waiting, but it never wakes up.
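
For reference, a stalled hand-off like this can at least be made visible with a guarded, timed wait. The following is a minimal sketch of the general Java pattern, not Tomcat's ThreadPool code; the class `GuardedWait` and its method names are hypothetical. The flag records the signal even if `notify` fires before the waiter is actually waiting, and the timeout turns a wakeup that never arrives into a return value that can be logged. On a healthy JVM a plain `notify` does wake a waiting thread, so this does not explain the symptom; it only helps pin down where the stall occurs.

```java
class GuardedWait {
    private final Object lock = new Object();
    private boolean workAvailable = false; // guarded by lock

    /** Records the signal and wakes one waiter, if any. */
    public void signal() {
        synchronized (lock) {
            workAvailable = true;
            lock.notify();
        }
    }

    /**
     * Waits until signalled or until timeoutMillis elapses.
     * Returns true if a signal was consumed, false on timeout or interrupt.
     */
    public boolean await(long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (lock) {
            while (!workAvailable) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false; // timed out: worth logging as a stalled hand-off
                }
                try {
                    lock.wait(remaining);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return false;
                }
            }
            workAvailable = false; // consume the signal
            return true;
        }
    }

    public static void main(String[] args) {
        GuardedWait g = new GuardedWait();
        g.signal();                       // signal arrives before anyone waits
        System.out.println(g.await(100)); // true: the flag preserved the signal
        System.out.println(g.await(100)); // false: nothing pending, times out
    }
}
```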

Keith


Re: Tomcat 5.0 under load

2004-10-15 Thread Remy Maucherat
Keith Wannamaker wrote:
> Hey Remy, by RH 9 bug do you mean the problem with jdk14 and nptl on
> RH9?  (os is RH9)  I am running without nptl.

It doesn't matter which JDK version you're running. On RH 9, you need 
LD_ASSUME_KERNEL=2.4.1. This disables the nptl backport, which doesn't 
give any performance increase anyway.

jboss.org was very unstable (with TC 4.1 embedded) without it, and is 
working fine with it. This is not needed on RH EL 3.0, I am told.
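
Because the variable has to be present in the environment of the process that launches the JVM, it can be worth confirming that it actually reached Tomcat. The following is a small sketch, assuming Java 5 or later, where `System.getenv(String)` is available; the class name `KernelEnvCheck` is hypothetical, and the expected value `2.4.1` is the one given above.

```java
class KernelEnvCheck {
    /** Describes how the actual LD_ASSUME_KERNEL value compares to the expected one. */
    public static String describe(String actual, String expected) {
        if (actual == null) {
            return "LD_ASSUME_KERNEL is not set";
        }
        if (actual.equals(expected)) {
            return "LD_ASSUME_KERNEL=" + actual + " (as expected)";
        }
        return "LD_ASSUME_KERNEL=" + actual + " (expected " + expected + ")";
    }

    public static void main(String[] args) {
        // The JVM inherits the environment of the process that started it.
        System.out.println(describe(System.getenv("LD_ASSUME_KERNEL"), "2.4.1"));
        System.out.println("os.name=" + System.getProperty("os.name")
                + " os.version=" + System.getProperty("os.version"));
    }
}
```

Run with the same user and startup script as Tomcat, this shows whether the setting is being exported where the JVM can see it.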

> I did some more tracing and what I am seeing is that notify is called,
> the thread is waiting, but it never wakes up.

Rémy