André, thanks for the suggestions. I removed all the timeouts except for connection_pool_timeout, which I set to 120. In server.xml I changed connectionTimeout to 120000 and maxThreads to 200. Things are running much more smoothly now. I still see the error messages, but less frequently: I used to see them every 1 to 2 minutes; now the interval is between 5 minutes and 1 hour.
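
As a quick sanity check on the units, here is a minimal sketch. It assumes, per the workers.properties reference, that connection_pool_timeout is given in seconds while the connector's connectionTimeout is given in milliseconds, and that the two are meant to match. The function name is hypothetical, just for illustration.

```python
def timeouts_in_sync(connection_pool_timeout_s: int, connection_timeout_ms: int) -> bool:
    """Return True when the mod_jk pool timeout (assumed seconds) equals
    the Tomcat AJP connectionTimeout (assumed milliseconds)."""
    return connection_pool_timeout_s * 1000 == connection_timeout_ms


# Values taken from the settings described in this message.
print(timeouts_in_sync(120, 120000))
```

This only compares the two values; it says nothing about whether 120 seconds is the right choice for this load.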
Where to go from here? How do I determine which timeout needs to be adjusted?

Another, more serious issue that I have not solved is the sudden surge of HTTP connections, from ~30 to over 200 in a matter of a few seconds, causing all kinds of read timeout and service unavailable errors.

Any suggestions on how to go about fixing or determining the root cause of either of the two issues?

Here are the latest configurations:

--- workers.properties-----
worker.list=loadbl,status

worker.template.port=8009
worker.template.type=ajp13
worker.template.lbfactor=1
worker.template.ping_mode=A
worker.template.connection_pool_timeout=120

worker.worker1.reference=worker.template
worker.worker1.host=jboss_server1

worker.worker2.reference=worker.template
worker.worker2.host=jboss_server2

worker.loadbl.type=lb
worker.loadbl.balance_workers=worker1,worker2
worker.loadbl.sticky_session=True

worker.status.type=status
----------------------------

---- server.xml ------
<!-- Define an AJP 1.3 Connector on port 8009 -->
<Connector port="8009" address="${jboss.bind.address}" protocol="AJP/1.3"
           emptySessionPath="true" enableLookups="false" redirectPort="8443"
           maxThreads="200" connectionTimeout="120000"/>
---------------------

---------- error messages ----------
[Thu Oct 28 05:51:57 2010][19938:3086886672] [info] ajp_service::jk_ajp_common.c (2540): (worker1) sending request to tomcat failed (unrecoverable), because of client write error (attempt=1)
[Thu Oct 28 05:51:57 2010][19938:3086886672] [info] service::jk_lb_worker.c (1388): service failed, worker worker1 is in local error state
[Thu Oct 28 05:51:57 2010][19938:3086886672] [info] service::jk_lb_worker.c (1407): unrecoverable error 200, request failed. Client failed in the middle of request, we can't recover to another instance.
[Thu Oct 28 05:51:57 2010][19938:3086886672] [info] jk_handler::mod_jk.c (2611): Aborting connection for worker=loadbl
[Thu Oct 28 06:03:06 2010][27490:3086886672] [info] ajp_process_callback::jk_ajp_common.c (1882): Writing to client aborted or client network problems
[Thu Oct 28 06:03:06 2010][27490:3086886672] [info] ajp_service::jk_ajp_common.c (2540): (worker1) sending request to tomcat failed (unrecoverable), because of client write error (attempt=1)
[Thu Oct 28 06:03:06 2010][27490:3086886672] [info] service::jk_lb_worker.c (1388): service failed, worker worker1 is in local error state
[Thu Oct 28 06:03:06 2010][27490:3086886672] [info] service::jk_lb_worker.c (1407): unrecoverable error 200, request failed. Client failed in the middle of request, we can't recover to another instance.
[Thu Oct 28 06:03:06 2010][27490:3086886672] [info] jk_handler::mod_jk.c (2611): Aborting connection for worker=loadbl
[Thu Oct 28 06:38:49 2010][25752:3086886672] [info] ajp_handle_cping_cpong::jk_ajp_common.c (879): timeout in reply cpong
[Thu Oct 28 06:38:51 2010][25752:3086886672] [info] ajp_send_request::jk_ajp_common.c (1518): (worker1) failed sending request, socket -1 prepost cping/cpong failure (errno=110)
[Thu Oct 28 06:38:51 2010][25752:3086886672] [info] ajp_send_request::jk_ajp_common.c (1574): (worker1) all endpoints are disconnected, detected by connect check (0), cping (1), send (0)

Thanks,
-mo

-----Original Message-----
From: André Warnier [mailto:a...@ice-sa.com]
Sent: Tuesday, October 26, 2010 3:15 AM
To: Tomcat Users List
Subject: Re: mod_jk 1.2.28 errors

Pid wrote:
> On 26/10/2010 00:05, Hannaoui, Mo wrote:
...
>>
>> worker.template.ping_mode=A
>>
>> worker.template.reply_timeout=30000
>> worker.template.socket_connect_timeout=10000
>> worker.template.socket_timeout=10
>> worker.template.connection_pool_timeout=600
>
> I can't get to the jk docs at the moment, but that socket_timeout
> seems a little low. Are those the defaults?
>

1) What happens when you just leave the line
>> worker.template.ping_mode=A
and *remove* all the other timeout-related lines (to let the defaults be configured)?

2) About the Tomcat-side configuration:

>> <!-- Define an AJP 1.3 Connector on port 8009 -->
>> <Connector port="8009" address="${jboss.bind.address}" protocol="AJP/1.3"
>> emptySessionPath="true" enableLookups="false" redirectPort="8443"
>> maxThreads="800" connectionTimeout="600000"/>
>

You have MaxClients=250 at the Apache side, and maxThreads=800 at the Tomcat side (2 times, because 2 Tomcats). Unless each Apache client can issue several requests to Tomcat(s) at the same time, that seems a bit unbalanced.

Also, connectionTimeout="600000" means that when a client makes a connection but does not send a request on it, Tomcat is going to keep a thread busy, waiting 600 seconds (10 minutes) until the client deigns to send something.

After you change it, you should then go back to the explanation of connection_pool_timeout in http://tomcat.apache.org/connectors-doc/reference/workers.html, to resynchronise that side.

Better yet probably, restart from the default values for everything, and start modifying from the defaults only if you really have a problem. The default values are chosen sensibly, for a range of situations. Playing around with them usually makes things worse rather than better.

I would leave the worker.template.ping_mode=A as it is, however.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org