Hi Gerhardus,

You allow 575 parallel requests in Apache. If something gets slow and more and more processes end up in the "W" state, that parallelism will actually be used. Each parallel request needs a connection to your Tomcat (unless Apache serves it directly), and each connection to Tomcat needs a thread to handle the request.

You didn't configure the number of threads (maxThreads) for Tomcat's AJP connector. The default is something like 200 threads. So you allow much more incoming parallelism on Apache than you configured on your backend. Once you go above 200 parallel requests, you should see messages like "all threads are busy" in your Tomcat log files.
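
A quick way to check whether the thread pool has already been exhausted is to count such messages in the log. This is only a sketch: the function name count_busy_messages is made up for illustration, and the search pattern is an assumption, since the exact wording of the message depends on the Tomcat version.

```shell
#!/bin/sh
# Sketch: count log lines that report an exhausted connector thread pool.
# Usage: count_busy_messages LOGFILE
count_busy_messages() {
    logfile=$1
    # Assumed pattern: a line containing "all threads" followed by "busy",
    # matched case-insensitively. Adjust it to the wording in your logs.
    # grep exits non-zero when nothing matches, but still prints 0.
    grep -ci 'all threads.*busy' "$logfile" || true
}
```

It would be called with the path to catalina.out on one of the Tomcat blades, for example count_busy_messages /path/to/tomcat/logs/catalina.out (the path is a placeholder).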

But this doesn't really explain why you get into a situation where that many requests need to run in parallel. If your requests start to queue up (more "W" than normal), you should take Java thread dumps of Tomcat (by sending kill -QUIT to the Tomcat process); they will be written to catalina.out. Those will tell you, us, or your webapp developers which parts of the code in Tomcat, or more likely in your webapp, are getting slow.

Java thread dumps only give a snapshot, so it is good to take a few of them (3-5), a few seconds apart.
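
Taking the dumps can be scripted along these lines. This is a minimal sketch, not a tested production script: the function name take_thread_dumps and its defaults (4 dumps, 5 seconds apart) are my own choices, and you have to pass in the PID of Tomcat's java process yourself.

```shell
#!/bin/sh
# Sketch: request several JVM thread dumps a few seconds apart.
# Usage: take_thread_dumps PID [COUNT] [INTERVAL_SECONDS]
take_thread_dumps() {
    pid=$1
    count=${2:-4}       # assumed default: 4 dumps
    interval=${3:-5}    # assumed default: 5 seconds between dumps
    i=1
    while [ "$i" -le "$count" ]; do
        # SIGQUIT makes the JVM print a thread dump to its stdout,
        # which the Tomcat start scripts redirect to catalina.out.
        kill -QUIT "$pid" || return 1
        echo "thread dump $i of $count requested for PID $pid"
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

For example, take_thread_dumps 12345 4 5 would request four dumps, five seconds apart, from the process with PID 12345 (a placeholder). The dumps themselves end up in catalina.out, not in the output of the function.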

Regards,

Rainer

[EMAIL PROTECTED] wrote:
Hi
I'm kind of between a rock and a hard place.
We have a problem in our production system that occurs quite regularly.
Apache's connections all get into a Sending Reply ( W ) state, which
makes the application unresponsive.
We have an Apache 2.0.52 fronting 12 Tomcat 5.5 instances, all on CentOS 4.5,
using mod_jk with, dare I say it, default settings. :-(
We have a stripped down apache installed on each Tomcat blade and what
is interesting is that when we reach this stage of all connections in
"W" state we can access the application using the local apache on the
tomcat blade using port 8080 but not access it on port 8009 using the
local apache on the tomcat blade. This to me points to a connector
problem.
I believe that the problem is related to our mod_jk settings, or lack
thereof, and also the version we are using.
What makes matters a bit difficult for me is that we are unable to
recreate the problems we are seeing in production on our test systems
which makes it very difficult to push out changes to production.
Management is quite strict in allowing production changes, which is
understandable because downtime is expensive.
We are using mod_jk-1.2.22-2.0.52-linux-x86_64.so
httpd-2.0.52-28.ent prefork
Tomcat 5.5

Questions:
~~~~~~~~~~
* Do you agree that it is the mod_jk settings?
* What more information do I need, or what should I look at, to determine
the problem? (The developers regularly scrutinize the thread dumps we make.)
* The mod_jk docs say: mod_jk-1.2.25-httpd-2.0.59.so is for Apache 2.0.x
and works with Apache 2.0.59 and later.
 Will using httpd-2.0.52-28.ent be a problem?

Settings
~~~~~~
httpd.conf
~~~~~~~~~~
<IfModule prefork.c>
StartServers       8
MinSpareServers    8
MaxSpareServers   300
ServerLimit      575
MaxClients       575
MaxRequestsPerChild  4000
</IfModule>

workers.properties
~~~~~~~~~~~~~~~~~~
# Worker list
worker.list=xml-gta,jkstatus

# Worker definitions
worker.xml-gta.type=lb
worker.xml-gta.method=Busyness
worker.xml-gta.balanced_workers=lonstct01agx,lonstct01bgx,lonstct01cgx,lonstct01dgx,lonstct01egx,lonstct01fgx,lonstct01ggx,lonstct01hgx,lonstct01igx,lonstct01jgx,lonstct01kgx,lonstct01lgx
worker.jkstatus.type=status

# Balance workers
worker.lonstct01agx.port=8009
worker.lonstct01agx.host=xx.xx.xx.xx
worker.lonstct01agx.type=ajp13
worker.lonstct01agx.lbfactor=1

...

worker.lonstct01lgx.port=8009
worker.lonstct01lgx.host=xx.xx.xx.xx
worker.lonstct01lgx.type=ajp13
worker.lonstct01lgx.lbfactor=1

server.xml
~~~~~~~~~~
<Connector port="8009"
enableLookups="false" redirectPort="8443" protocol="AJP/1.3" />


Suggested Changes I want to make (but still need approval for)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Upgrade to mod_jk 1.2.25

Change workers.properties to:

workers.properties
~~~~~~~~~~~~~~~~~~
# Worker list
worker.list=xml-oct-gta,jkstatus

# Worker definitions
worker.xml-oct-gta.type=lb
worker.xml-oct-gta.method=Busyness
worker.xml-oct-gta.balance_workers=longtct02c,longtct02d
worker.xml-oct-gta.lock=Pessimistic
worker.xml-oct-gta.max_reply_timeouts=10

worker.jkstatus.type=status


# Worker Template
worker.reference.port=8009
worker.reference.type=ajp13
worker.reference.lbfactor=1
worker.reference.socket_timeout=60
worker.reference.socket_keepalive=true
worker.reference.connect_timeout=500
worker.reference.prepost_timeout=500
worker.reference.reply_timeout=32000
# recovery_options=27 is the bit mask 16 + 8 + 2 + 1
worker.reference.recovery_options=27
worker.reference.retries=12

# Balance workers
worker.longtct02c.reference=worker.reference
worker.longtct02c.host=xx.xx.xx.xx
... (there are 10 other servers not listed here for space saving
purposes)
worker.longtct02d.reference=worker.reference
worker.longtct02d.host=xx.xx.xx.xx

Regards

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
