Re: Tomcat 5.0 under load
Keith Wannamaker wrote:

> Last month I took Yoav's advice and attempted to upgrade our production server from 4.1 to 5.0. The production server handles 5 - 10 requests a second across 300 threads. The problem I had then, and the problem I have now, is that the server's accept thread will die within a short time after server start. I hate to think I am the only person running Tomcat 5 under a heavy load, but it sure looks that way.
>
> I initially blamed threadpool's bulletproofing, but because 4.1.31 and 5.0.28 share the same threadpool, and 4.1.31 runs indefinitely, there is a problem in core 5.0 that this load is exercising. I very much want to be able to recommend that Tomcat 5.0 is production-ready, but since we can't run it, I certainly am not in a position to do that.
>
> I have reserved the next day or two for bulletproofing Tomcat 5.0, so the point of this is to solicit any comments from those who may have been faced with the same problem and have looked at the problem themselves.

First, I didn't see any previous message about any of your problems. Then, why do I even have to ask which OS / VM / connector / etc. you are running? Also, please state exactly what "the server's accept thread will die" means.

BTW, jboss.org is running on Tomcat 4.1 (inside JBoss), with more load than your site, on RedHat EL 3 (we did have many issues on RH 9, which LD_ASSUME_KERNEL solved in large part). Except when we make a configuration mistake (it happens more often than I would like), we don't have many issues (we do have the annoyances of TC 4.1 compared to 5.0; we'll upgrade soon, I am being told, so I'll let people know how well it goes).

Rémy

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Tomcat 5.0 under load
Hi,

There are certainly other sites running Tomcat 5.0 under heavy load, such as the ones listed on our wiki. I personally have put Tomcat 5.0-based apps in production that have handled the load you describe (and much higher peak bursty loads) for months at a time without need to restart.

However, it could very well be your specific app or configuration is exercising parts of Tomcat in ways other apps aren't. Every app load profile is unique. So this should definitely result in an improvement to Tomcat, or maybe the connectors if you're running Tomcat behind a front-end web server.

I have no specific advice beyond the usual, which is to start with something reproducible. Can you get the accept thread to die every time within a given time window after the server start? Does it happen with Tomcat standalone as well as behind a front-end server, or just the latter? Does it happen with the out-of-the-box server.xml or a heavily modified one?

Yoav Shapira
http://www.yoavshapira.com

-----Original Message-----
From: Keith Wannamaker [mailto:[EMAIL PROTECTED]
Sent: Friday, October 15, 2004 10:42 AM
To: [EMAIL PROTECTED]
Subject: Tomcat 5.0 under load

Last month I took Yoav's advice and attempted to upgrade our production server from 4.1 to 5.0. The production server handles 5 - 10 requests a second across 300 threads. The problem I had then, and the problem I have now is that the server's accept thread will die within a short time after server start. I hate to think I am the only person running tomcat 5 under a heavy load, but it sure looks that way.

I initially blamed threadpool's bulletproofing, but because 4.1.31 and 5.0.28 share the same threadpool, and 4.1.31 runs indefinitely, there is a problem in core 5.0 that this load is exercising. I very much want to be able to recommend that tomcat 5.0 is production-ready but since we can't run it, I certainly am not in a position to do that.

I have reserved the next day or two for bulletproofing tomcat 5.0, so the point of this is to solicit any comments from those who may have been faced with the same problem and have looked at the problem themselves.

Thanks for any input,
Keith
Re: Tomcat 5.0 under load
Shapira, Yoav wrote:

> Hi,
>
> There are certainly other sites running Tomcat 5.0 under heavy load, such as the ones listed on our wiki. I personally have put Tomcat 5.0-based apps in production that have handled the load you describe (and much higher peak bursty loads) for months at a time without need to restart.
>
> However, it could very well be your specific app or configuration is exercising parts of Tomcat in ways other apps aren't. Every app load profile is unique. So this should definitely result in an improvement to Tomcat, or maybe the connectors if you're running Tomcat behind a front-end web server.
>
> I have no specific advice beyond the usual, which is to start with something reproducible. Can you get the accept thread to die every time within a given time window after the server start? Does it happen with Tomcat standalone as well as behind a front-end server, or just the latter? Does it happen with the out-of-the-box server.xml or a heavily modified one?

Some facts:
- the higher-level code cannot cause the accept thread to die
- the code for the whole threadpool is shared

So there's nothing which can be inherently more broken in TC 5. However, what differs is timing: TC 5 will generally be faster, and since this issue would involve scheduling and the like, this likely matters.

If Keith is feeling like experimenting a little (without too much risk involved, though): try 5.5.3 with strategy=ms on the Connector. This will use the old TC 4.0 thread pool strategy, which is far less fancy, and was never reported as having trouble on stuff like RH 9.

Rémy
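For readers unfamiliar with the "far less fancy" approach mentioned above, here is a minimal Java sketch of a master-listener design: one dedicated thread does nothing but accept connections and hand them to a pool of workers. This is not Tomcat's code. The class name MasterListenerSketch, the thread name, and the "ok" response are all hypothetical, and the sketch assumes Java 8 or later, which postdates this thread.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative only: NOT Tomcat's thread pool. All names here are hypothetical.
class MasterListenerSketch {
    private final ServerSocket server;
    private final ExecutorService workers;
    private final Thread master;

    MasterListenerSketch(int port, int workerCount) throws IOException {
        server = new ServerSocket(port);   // port 0 asks the OS for a free port
        workers = Executors.newFixedThreadPool(workerCount);
        master = new Thread(this::acceptLoop, "master-acceptor");
        master.start();
    }

    int port() { return server.getLocalPort(); }

    boolean acceptorAlive() { return master.isAlive(); }

    // The single master thread only accepts and hands off; it never serves requests.
    private void acceptLoop() {
        while (!server.isClosed()) {
            try {
                Socket s = server.accept();
                workers.execute(() -> handle(s));
            } catch (IOException e) {
                if (!server.isClosed()) {
                    e.printStackTrace();   // report the failure, then keep accepting
                }
            }
        }
    }

    // Worker side: write a fixed reply and close the connection.
    private static void handle(Socket s) {
        try (Socket socket = s; OutputStream out = socket.getOutputStream()) {
            out.write("ok\n".getBytes(StandardCharsets.US_ASCII));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    void stop() throws IOException {
        server.close();      // unblocks accept() so the master thread can exit
        workers.shutdown();
    }

    // Starts the sketch on a free port, makes one request, returns the reply line.
    static String demo() {
        try {
            MasterListenerSketch m = new MasterListenerSketch(0, 4);
            try (Socket c = new Socket("127.0.0.1", m.port());
                 BufferedReader in = new BufferedReader(new InputStreamReader(
                         c.getInputStream(), StandardCharsets.US_ASCII))) {
                return in.readLine();
            } finally {
                m.stop();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because only one thread ever calls accept(), there is no hand-over of the accept role between threads, which is the part of the fancier design that depends on wait/notify timing.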
Re: Tomcat 5.0 under load
The time window is within about 15 minutes. We run Tomcat standalone, with the standard HTTP/1.1 connector. The server.xml is minimal. I agree with the reproducible angle; that is always a good place to start.

Keith
Re: Tomcat 5.0 under load
Hey Remy,

> Some facts:
> - the higher level code cannot cause the accept thread to die

Thread dump from T0 shows the two expected accepts, from main and from the HttpConnector; thread dump at T(5mins) shows the main accept, many idle Http handler threads waiting for work, and a few long-running uploads and downloads, but no Http accept. So, indeed, the thread is either stopping itself with no message or being killed with no message.

Interesting note on logging. I tried today to use both the jdk14 logger and simple log to show the progress of the accept. With either logger, init messages come through but runIt messages don't. I thought it was the logger config, so I reverted to System.out and still got that behavior. I wonder if the underlying exception causing the problem is being squelched by whatever is squelching my messages. Ever seen this?

> If Keith is feeling like experimenting a little (without too much risk involved, though): try 5.5.3 with strategy=ms on the Connector. This will use the old TC 4.0 thread pool strategy, which is far less fancy, and was never reported as having trouble on stuff like RH 9.

I may try this.

Thanks,
Keith
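The thread-dump comparison described above can also be done from inside the JVM. The following is a rough sketch, assuming Java 5 or later (Thread.getAllStackTraces does not exist on the 1.4 JVMs of this era). The class name AcceptThreadCheck is hypothetical, and the name fragment to search for is an assumption: use whatever name your own thread dumps show for the accepting thread.

```java
import java.util.Map;

// Hypothetical helper: looks for a live thread by a fragment of its name.
class AcceptThreadCheck {

    // True if any live thread's name contains the given fragment.
    static boolean hasLiveThread(String nameFragment) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && t.getName().contains(nameFragment)) {
                return true;
            }
        }
        return false;
    }

    // Name, state and stack of every matching thread, similar to one thread-dump entry.
    // The state shows whether the thread is blocked in accept, waiting on a monitor, etc.
    static String describe(String nameFragment) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            if (!t.getName().contains(nameFragment)) {
                continue;
            }
            sb.append(t.getName()).append(" [").append(t.getState()).append("]\n");
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }
}
```

Calling hasLiveThread on a timer and logging describe when the answer changes would narrow down when the accepting thread disappears, and whether it is gone or merely parked.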
Re: Tomcat 5.0 under load
Keith Wannamaker wrote:

> Hey Remy,
>
>> Some facts:
>> - the higher level code cannot cause the accept thread to die
>
> Thread dump from T0 shows the two expected accepts, from main and from the HttpConnector; thread dump at T(5mins) shows the main accept, many idle Http handler threads waiting for work, and a few long-running uploads downloads, but no Http accept. So, indeed, the thread is either stopping itself with no message or being killed with no message.

This is the same as the RH 9 bug. Which OS are you using?

> Interesting note on logging. I tried today to use both jdk14 logger and simple log to show the progress of the accept. The behavior of going through either logger is that init messages come through but runIt messages don't. I thought it was the logger config so I reverted to System.out and still got that behavior. I wonder if the underlying exception causing the problem is being squelched by whatever is squelching my messages. Ever seen this?

Well, no. I have to admit I didn't look in detail at what everything does, since I didn't write that algorithm, and never quite understood why it works.

>> If Keith is feeling like experimenting a little (without too much risk involved, though): try 5.5.3 with strategy=ms on the Connector. This will use the old TC 4.0 thread pool strategy, which is far less fancy, and was never reported as having trouble on stuff like RH 9.
>
> I may try this.

That algorithm is really stupid. OTOH, it does seem to have more syncing (not that I can see any performance impact from it).

Rémy
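On the question of a squelched exception: one way to rule it in or out is a JVM-wide handler for uncaught throwables. The sketch below assumes Java 5 or later for Thread.setDefaultUncaughtExceptionHandler (and Java 8 for the lambda), so it would not run on a 1.4 JVM. The class name SilentDeathTrap is hypothetical. It only sees throwables that escape a thread's run method; anything caught and discarded inside the pool code stays invisible.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper: records and prints whatever terminates a thread.
class SilentDeathTrap {
    // Most recent throwable that killed a thread, kept for later inspection.
    static final AtomicReference<Throwable> LAST = new AtomicReference<>();

    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            LAST.set(error);
            // Write straight to stderr so a misconfigured logger cannot hide it.
            System.err.println("Thread '" + thread.getName()
                    + "' terminated by: " + error);
            error.printStackTrace();
        });
    }
}
```

If the handler never fires while the accepting thread still vanishes from the dumps, the thread is more likely parked somewhere than dead.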
Re: Tomcat 5.0 under load
Hey Remy,

By "RH 9 bug", do you mean the problem with jdk14 and NPTL on RH9? (The OS is RH9.) I am running without NPTL.

I did some more tracing, and what I am seeing is that notify is called and the thread is waiting, but it never wakes up.

Keith
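To make the "notify is called, but the waiter never wakes" symptom concrete, here is a minimal sketch of the standard guarded-wait idiom with a timed wait added as a safety net. It is not taken from Tomcat's thread pool. The class name GuardedHandoff and its methods are hypothetical, and it assumes Java 8 or later. With a correct JVM and threading library the measured latency is tiny; a latency close to the wait slice would suggest the notification was lost below the Java level.

```java
// Hypothetical sketch of a one-shot handoff between two threads.
class GuardedHandoff {
    private final Object lock = new Object();
    private boolean ready;       // the condition, guarded by lock
    private long signalledAt;    // System.nanoTime() when the condition was set

    // Producer side: set the condition and notify while holding the lock.
    void signal() {
        synchronized (lock) {
            ready = true;
            signalledAt = System.nanoTime();
            lock.notifyAll();
        }
    }

    // Consumer side: re-check the condition in a loop. The timed wait means a
    // missed wakeup costs at most one slice instead of hanging forever.
    // Returns the delay between signal() and this thread noticing it.
    long awaitLatencyNanos(long sliceMillis) throws InterruptedException {
        synchronized (lock) {
            while (!ready) {
                lock.wait(sliceMillis);
            }
            ready = false;
            return System.nanoTime() - signalledAt;
        }
    }

    // Signals from a second thread after about 50 ms and reports the wake-up delay.
    static long demoLatencyMillis() {
        GuardedHandoff h = new GuardedHandoff();
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                return;
            }
            h.signal();
        }, "demo-producer");
        producer.start();
        try {
            return h.awaitLatencyNanos(10_000) / 1_000_000;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The condition flag matters as much as the timeout: a bare wait() with no flag can miss a notify that arrives before the waiter has started waiting, which looks the same from the outside.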
Re: Tomcat 5.0 under load
Keith Wannamaker wrote:

> Hey Remy, by RH 9 bug do you mean the problem with jdk14 and nptl on RH9? (os is RH9) I am running without nptl.

It doesn't matter which JDK version you're running. On RH 9, you need LD_ASSUME_KERNEL=2.4.1. This disables the nptl backport, which doesn't give any performance increase anyway. jboss.org was very unstable (with TC 4.1 embedded) without it, and is working fine with it. This is not needed on RH EL 3.0, I am told.

> I did some more tracing and what I am seeing is that notify is called, the thread is waiting, but it never wakes up.

Rémy
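LD_ASSUME_KERNEL is read by the dynamic linker when the process starts, so it has to be in the environment of whatever launches the JVM; setting it afterwards has no effect. A small check like the following can confirm that the running JVM actually inherited it. This is a sketch with a hypothetical class name, NptlEnvCheck. System.getenv(String) works on Java 5 and later; on 1.4 it was deprecated and threw an Error.

```java
// Hypothetical startup check: reports whether LD_ASSUME_KERNEL reached this JVM.
class NptlEnvCheck {
    static final String VAR = "LD_ASSUME_KERNEL";

    // Turns the raw environment value into a one-line report.
    static String describe(String value) {
        if (value == null || value.isEmpty()) {
            return VAR + " is not set in this JVM's environment";
        }
        return VAR + "=" + value;
    }

    public static void main(String[] args) {
        System.out.println(describe(System.getenv(VAR)));
    }
}
```

Running it with the same user, shell and startup script as Tomcat gives the most reliable answer, since init scripts often start with a different environment than an interactive login.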