Hi André, On 22.04.2009 18:29, André Warnier wrote: > As (maybe) part of another issue which I am still trying to track down > with the concerned network people (client write errors, Sample 2 below),
Before commenting Sample 1: Although we generally respond to the "client errors" topic with "Users press reload or click on another link before answer returns - check whether your responses take to long", we now observed two cases, where there were real network problems (I generally avoid to bash the network people, but it is still possible that the trouble comes from the network): - once a bad cable - once a bad driver for a network card used by a VMware guest system (problems with TCP offloading). TCP offloading problems seem to become a new trend :( > I find the following kind of messages regularly in the mod_jk logfile > (Sample 1). > I just want to know if this indicates a problem somewhere, or if these > are normal occurrences. > These [info] messages do not always come as often as shown below, but > seem to come with some regularity nevertheless. > > Basically, my question is whether this indicates that there might be a > discrepancy between front-end and back(-)end or something of the kind, > or if there is some other parameter to adjust to make these go away. > > Apache 2.x prefork MPM > mod_jk 1.2.27 > Tomcat 5.5.x > under RHEL, single host > > JkLogLevel info > and > worker.ajp13.port=8009 > worker.ajp13.host=localhost > worker.ajp13.type=ajp13 > # new options since 1.2.27 : > worker.ajp13.ping_mode=A > (single worker, no load balancing) > > back-end Connector : > <Connector port="8009" > enableLookups="false" redirectPort="8443" protocol="AJP/1.3" > maxThreads="150" minSpareThreads="25" maxSpareThreads="100" > backlog="100" /> The configuration contains no means for shutting down idle workers. Neither mod_jk (connection_pool_timeout) nor Tomcat (connectTimeout) will close idle connections. So it's possible, that an intermediate system, e.g. firewall drops connections. > Sample 1 : > -------- > > [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.007701 > 4897 > [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.004344 > 1457 > [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.101926 > 1945 > [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.322939 > 296975 > [Tue Apr 21 13:35:14 2009] POST /myapp/normen/servlet.myapp 200 0.032625 > 1576 > [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.003084 > 4176 It would be nicer to put the JK access logging into the access log and add the pid (process id) there, so you could identify exaclty which access log line relates to which JK info/error message block. not critical though. > [Tue Apr 21 13:35:14 2009] [18753:2537034048] [info] > ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for > socket 16 returned -32 (errno=32) > [Tue Apr 21 13:35:14 2009] [18753:2537034048] [info] > ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query > [Tue Apr 21 13:35:14 2009] [18753:2537034048] [info] > ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request, > socket -1 keepalive cping/cpong failure (errno=32) As you already noticed, errno=32 is a broken pipe error. Sending the cping packet uses the write() system call. The linux man pages for write(2) on my systems say: EPIPE fd is connected to a pipe or socket whose reading end is closed. When this happens the writing process will also receive a SIGPIPE signal. (Thus, the write return value is seen only if the program catches, blocks or ignores this signal.) So we should have "reading end is closed", which is Tomcat, so the connection should bw in CLOSE_WAIT from the point of view of mod_jk. I would expect a RST send by a firewall or similar between mod_jk and Tomcat to also result in EPIPE. > [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.010538 > 1366 > [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.002768 > 2507 > [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.002658 > 1999 > [Tue Apr 21 13:35:17 2009] POST /myapp/normen/servlet.myapp 200 0.060688 > 21560 > [Tue Apr 21 13:35:19 2009] POST /myapp/normen/servlet.myapp 200 0.017692 > 23635 > [Tue Apr 21 13:35:22 2009] POST /myapp/normen/servlet.myapp 200 1.032645 > 21400 > [Tue Apr 21 13:35:34 2009] [30410:2537034048] [info] > ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for > socket 16 returned -32 (errno=32) > [Tue Apr 21 13:35:34 2009] [30410:2537034048] [info] > ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query > [Tue Apr 21 13:35:34 2009] [30410:2537034048] [info] > ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request, > socket -1 keepalive cping/cpong failure (errno=32) Same again. > [Tue Apr 21 13:35:34 2009] POST /myapp/normen/servlet.myapp 200 0.071825 > 20753 > [Tue Apr 21 13:35:36 2009] POST /myapp/normen/servlet.myapp 200 0.115782 > 14137 > [Tue Apr 21 13:35:56 2009] [30413:2537034048] [info] > ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for > socket 1 returned -32 (errno=32) > [Tue Apr 21 13:35:56 2009] [30413:2537034048] [info] > ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query > [Tue Apr 21 13:35:56 2009] [30413:2537034048] [info] > ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request, > socket -1 keepalive cping/cpong failure (errno=32) Again > [Tue Apr 21 13:35:56 2009] POST /myapp/normen/servlet.myapp 200 0.040233 > 20753 > [Tue Apr 21 13:35:57 2009] [29827:2537034048] [info] > ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for > socket 16 returned -32 (errno=32) > [Tue Apr 21 13:35:57 2009] [29827:2537034048] [info] > ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query > [Tue Apr 21 13:35:57 2009] [29827:2537034048] [info] > ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request, > socket -1 keepalive cping/cpong failure (errno=32) Again > [Tue Apr 21 13:35:57 2009] POST /myapp/normen/servlet.myapp 200 0.036396 > 12264 > [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.292587 > 1409 > [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.010833 > 6729 > [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.015750 > 3484 > [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.032983 > 2915 Note that all 4 cases were logged by a diffrent PID (process id), and in the prefork case, each process has its own pool of size 1. So 4 different processes experienced a connection closed by the backend or something in between. If you would log the JK access data and the PID in the apache access log, you could verify, when those processes used the backend connection the last time previously to verify the idleness theory. > Sample 2 : (for illustration of the other issue only) > -------- > [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.016737 > 6729 > [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.019133 > 3484 > [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.010458 > 2915 > [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.112756 > 18897 > [Tue Apr 21 13:15:44 2009] [29831:2537034048] [info] > ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for > socket 1 returned -32 (errno=32) > [Tue Apr 21 13:15:44 2009] [29831:2537034048] [info] > ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query > [Tue Apr 21 13:15:44 2009] [29831:2537034048] [info] > ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request, > socket -1 keepalive cping/cpong failure (errno=32) The same for pid 29831 > [Tue Apr 21 13:15:44 2009] [30401:2537034048] [info] > ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for > socket 17 returned -32 (errno=32) > [Tue Apr 21 13:15:44 2009] [30401:2537034048] [info] > ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query > [Tue Apr 21 13:15:44 2009] [30401:2537034048] [info] > ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request, > socket -1 keepalive cping/cpong failure (errno=32) pid 30401 > [Tue Apr 21 13:15:49 2009] [29827:2537034048] [info] > ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for > socket 16 returned -32 (errno=32) > [Tue Apr 21 13:15:49 2009] [29827:2537034048] [info] > ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query > [Tue Apr 21 13:15:49 2009] [29827:2537034048] [info] > ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request, > socket -1 keepalive cping/cpong failure (errno=32) pid 29827 > [Tue Apr 21 13:15:49 2009] [29831:2537034048] [info] > ajp_process_callback::jk_ajp_common.c (1748): Writing to client aborted > or client network problems > [Tue Apr 21 13:15:49 2009] [29831:2537034048] [info] > ajp_service::jk_ajp_common.c (2407): (ajp13) sending request to tomcat > failed (unrecoverable), because of client write error (attempt=1) Now a client write error, again pid 29831. Possibly, the cping write error above too a long time to be detected and that's why the response retrieved via a fresh connection wasn't consumed by the client? Although the below response times seem to tell us, that the longest request in the snippet here took 9 seconds, which is not that long. There is no mod_jk internal relation between the cping send failure and a client write error. It could only be in some indirect way, like both being caused by a general network problem (occuring on the path to the browser and on the path to the backend - but then why don't we see lots of other communication errors as well?) or by some timing issue triggering user behaviour - not very likely in light of the blow duration lines. > [Tue Apr 21 13:15:49 2009] POST /myapp/normen/servlet.myapp 200 5.210358 > 8184 > [Tue Apr 21 13:15:49 2009] [29831:2537034048] [info] > jk_handler::mod_jk.c (2469): Aborting connection for worker=ajp13 This is a consequence of the client write error, so next time this process will open a fresh connection to the backend. > [Tue Apr 21 13:15:49 2009] POST /myapp/normen/servlet.myapp 200 0.363977 > 20753 > [Tue Apr 21 13:15:54 2009] POST /myapp/normen/servlet.myapp 200 9.057331 > 20755 > [Tue Apr 21 13:15:55 2009] POST /myapp/normen/servlet.myapp 200 0.215937 > 14166 Regards, Rainer --------------------------------------------------------------------- To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org