Gang,
I have a somewhat heavily loaded Tomcat application that uses a
fairly standard Apache 1.3.33 / mod_jk 1.2.15 / Tomcat 5.0.28 setup. I
am doing load balancing in mod_jk, going to 2 Red Hat boxes (kernel
2.6.9) that run the Tomcats. The load is split evenly across the two
Tomcat boxes. The site processes anywhere between 400,000 and 700,000
JSP pages per day. I've included the configuration files at the bottom
of this message.
Usually, the site runs great. No problems whatsoever (well, my code
occasionally crashes, but that's to be expected). But once in a while,
mod_jk suddenly starts to "lose" connections to my Tomcats.
mod_jk.log shows this:
[Tue Apr 04 18:23:35 2006] [info] ajp_process_callback::jk_ajp_common.c
(1384): Connection aborted or network problems
[Tue Apr 04 18:23:35 2006] [info] ajp_service::jk_ajp_common.c (1731):
Receiving from tomcat failed, because of client error without recovery
in send loop 0
[Tue Apr 04 18:23:35 2006] [info] service::jk_lb_worker.c (711):
unrecoverable error 400, request failed. Client failed in the middle of
request, we can't recover to another instance.
[Tue Apr 04 18:23:35 2006] [info] jk_handler::mod_jk.c (1841): Aborting
connection for worker=loadbalancer
[Tue Apr 04 18:23:35 2006] [info] ajp_process_callback::jk_ajp_common.c
(1384): Connection aborted or network problems
[Tue Apr 04 18:23:35 2006] [info] ajp_service::jk_ajp_common.c (1731):
Receiving from tomcat failed, because of client error without recovery
in send loop 0
[Tue Apr 04 18:23:35 2006] [info] service::jk_lb_worker.c (711):
unrecoverable error 400, request failed. Client failed in the middle of
request, we can't recover to another instance.
[Tue Apr 04 18:23:35 2006] [info] jk_handler::mod_jk.c (1841): Aborting
connection for worker=loadbalancer
[Tue Apr 04 18:23:35 2006] [info] ajp_process_callback::jk_ajp_common.c
(1384): Connection aborted or network problems
[Tue Apr 04 18:23:35 2006] [info] ajp_service::jk_ajp_common.c (1731):
Receiving from tomcat failed, because of client error without recovery
in send loop 0
[Tue Apr 04 18:23:35 2006] [info] service::jk_lb_worker.c (711):
unrecoverable error 400, request failed. Client failed in the middle of
request, we can't recover to another instance.
[Tue Apr 04 18:23:35 2006] [info] jk_handler::mod_jk.c (1841): Aborting
connection for worker=loadbalancer
...when this happens, my Apache scoreboard fills up with slots in the
"W" state. All new requests seem to get stuck in this state. Apache
quickly fills up and new requests are locked out. If I look at my
Tomcat logs, I see that processing has slowed and then stopped. One
interesting thing is that if I kill and restart Apache, the Tomcat
logs go crazy with output, as if killing Apache somehow "unsticks"
them.
Is something running out of sockets? If I had to guess, I'd say that it
feels like mod_jk can't receive the data back from the Tomcats, but
that's just a hunch. I've seen the Apache mod_jk config page, which
seems to have a bunch of different parameters for dealing with "stuck"
Tomcats, but I'm unsure which one to use, because I don't really know
what's happening or what the problem is. I've found other posts on the
web that describe similar problems, but haven't seen any "Oh, this is
the problem and this is the simple solution". I haven't yet dug in with
netstat to see what's going on network-wise; I'm hoping there's some
magic bullet configuration parameter that will help. C'moooonnnnn
magic bullet!!
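For the netstat check I haven't done yet, here is a minimal sketch of
what I'd probably run on the Apache box. It assumes the usual Linux
`netstat -ant` column layout (Proto, Recv-Q, Send-Q, Local Address,
Foreign Address, State); the function name and the idea of grouping by
state are just my own illustration, not anything from mod_jk.

```python
from collections import Counter

AJP_PORT = 8089  # AJP connector port from the configs in this message


def count_ajp_states(netstat_output, port=AJP_PORT):
    """Count TCP states of connections whose remote end is the AJP port.

    Assumes Linux `netstat -ant` style rows:
    Proto Recv-Q Send-Q Local-Address Foreign-Address State
    """
    states = Counter()
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        # fields[4] is the foreign (remote) address, fields[5] the state.
        if fields[4].endswith(suffix):
            states[fields[5]] += 1
    return states


def print_ajp_states(netstat_output):
    """Print one line per TCP state, most common first."""
    for state, count in count_ajp_states(netstat_output).most_common():
        print(state, count)
```

The idea would be to save `netstat -ant` output while the scoreboard is
full of "W" entries, pass the text to print_ajp_states(), and compare
the counts per state against the same capture taken when the site is
healthy.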
Thanks for any insight you can provide!
/kurt
Relevant configs:
My Apache is set to 1024 simultaneous connections and the JkMounts are
configured properly. I have my maximum number of file descriptors
(sockets) set to 65535 on all machines.
Tomcat AJP connector configs on machines tc1 and tc2
<Connector port="8089" protocol="AJP/1.3" maxThreads="400"
maxProcessors="0" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8593"/>
workers.properties config
worker.list=tc1-w1,tc2-w1,status,loadbalancer
# tomcat 1 on host tc1
worker.tc1-w1.host=192.168.1.254
worker.tc1-w1.port=8089
worker.tc1-w1.type=ajp13
worker.tc1-w1.lbfactor=8
# tomcat 1 on host tc2
worker.tc2-w1.host=192.168.1.247
worker.tc2-w1.port=8089
worker.tc2-w1.type=ajp13
worker.tc2-w1.lbfactor=8
# status and loadbalancer workers
worker.status.type=status
worker.loadbalancer.type=lb
worker.loadbalancer.balanced_workers=tc1-w1,tc2-w1