Hi André,

On 25.04.2009 13:58, André Warnier wrote:
> Hi Rainer.
> 
> Thanks very much for the time you took to analyse the case below.
> However, I am still puzzled :
> 
> This is indeed a system where there appear to be problems at the level
> of the connection between the client workstations/browser, and the
> distant server.  Those are the "client write" errors.
> We are in the process (right, with the network people), of
> analysing/discussing that (automated tests, Wireshark etc..).
> 
> However, it is also a single host with both Apache and mod_jk and Tomcat
> on the same host.  So what I am troubled about is the "cping/cpong" and
> "sendfull" kind of messages. Are they not strictly between mod_jk and
> Tomcat ?  And if yes, considering these are on the same host, do these
> messages also indicate that something else is wrong ?

They are strictly between Apache and Tomcat. "write" failed during
sending cping from Apache to Tomcat overa previously established
connection, and EPIPE should indicate a remote socket/connection close
(or maybe reset?).

Maybe Tomcat has a default connectTimeout? Not sure at the moment.

Regards,

Rainer

> Rainer Jung wrote:
>> Hi André,
>>
>> On 22.04.2009 18:29, André Warnier wrote:
>>> As (maybe) part of another issue which I am still trying to track down
>>> with the concerned network people (client write errors, Sample 2 below),
>>
>> Before commenting Sample 1: Although we generally respond to the "client
>> errors" topic with "Users press reload or click on another link before
>> answer returns - check whether your responses take to long", we now
>> observed two cases, where there were real network problems (I generally
>> avoid to bash the network people, but it is still possible that the
>> trouble comes from the network):
>>
>> - once a bad cable
>> - once a bad driver for a network card used by a VMware guest system
>> (problems with TCP offloading). TCP offloading problems seem to become a
>> new trend :(
>>
>>> I find the following kind of messages regularly in the mod_jk logfile
>>> (Sample 1).
>>> I just want to know if this indicates a problem somewhere, or if these
>>> are normal occurrences.
>>> These [info] messages do not always come as often as shown below, but
>>> seem to come with some regularity nevertheless.
>>>
>>> Basically, my question is whether this indicates that there might be a
>>> discrepancy between front-end and back(-)end or something of the kind,
>>> or if there is some other parameter to adjust to make these go away.
>>>
>>> Apache 2.x prefork MPM
>>> mod_jk 1.2.27
>>> Tomcat 5.5.x
>>> under RHEL, single host
>>>
>>> JkLogLevel  info
>>> and
>>> worker.ajp13.port=8009
>>> worker.ajp13.host=localhost
>>> worker.ajp13.type=ajp13
>>> # new options since 1.2.27 :
>>> worker.ajp13.ping_mode=A
>>> (single worker, no load balancing)
>>>
>>> back-end Connector :
>>> <Connector port="8009"
>>>  enableLookups="false" redirectPort="8443" protocol="AJP/1.3"
>>>  maxThreads="150" minSpareThreads="25" maxSpareThreads="100"
>>> backlog="100" />
>>
>> The configuration contains no means for shutting down idle workers.
>> Neither mod_jk (connection_pool_timeout) nor Tomcat (connectTimeout)
>> will close idle connections. So it's possible, that an intermediate
>> system, e.g. firewall drops connections.
>>
>>> Sample 1 :
>>> --------
>>>
>>> [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.007701
>>> 4897
>>> [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.004344
>>> 1457
>>> [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.101926
>>> 1945
>>> [Tue Apr 21 13:34:59 2009] GET /myapp/normen/servlet.myapp 200 0.322939
>>> 296975
>>> [Tue Apr 21 13:35:14 2009] POST /myapp/normen/servlet.myapp 200 0.032625
>>> 1576
>>> [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.003084
>>> 4176
>>
>> It would be nicer to put the JK access logging into the access log and
>> add the pid (process id) there, so you could identify exaclty which
>> access log line relates to which JK info/error message block. not
>> critical though.
>>
>>> [Tue Apr 21 13:35:14 2009] [18753:2537034048] [info]
>>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>>> socket 16 returned -32 (errno=32)
>>> [Tue Apr 21 13:35:14 2009] [18753:2537034048] [info]
>>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>>> [Tue Apr 21 13:35:14 2009] [18753:2537034048] [info]
>>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>>> socket -1 keepalive cping/cpong failure (errno=32)
>>
>> As you already noticed, errno=32 is a broken pipe error. Sending the
>> cping packet uses the write() system call. The linux man pages for
>> write(2) on my systems say:
>>
>> EPIPE  fd is connected to a pipe or socket whose reading end is closed.
>>  When this happens the writing process will also receive a SIGPIPE
>> signal.  (Thus, the write return value is seen only if the program
>> catches, blocks or ignores this signal.)
>>
>> So we should have "reading end is closed", which is Tomcat, so the
>> connection should bw in CLOSE_WAIT from the point of view of mod_jk. I
>> would expect a RST send by a firewall or similar between mod_jk and
>> Tomcat to also result in EPIPE.
>>
>>> [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.010538
>>> 1366
>>> [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.002768
>>> 2507
>>> [Tue Apr 21 13:35:14 2009] GET /myapp/normen/servlet.myapp 200 0.002658
>>> 1999
>>> [Tue Apr 21 13:35:17 2009] POST /myapp/normen/servlet.myapp 200 0.060688
>>> 21560
>>> [Tue Apr 21 13:35:19 2009] POST /myapp/normen/servlet.myapp 200 0.017692
>>> 23635
>>> [Tue Apr 21 13:35:22 2009] POST /myapp/normen/servlet.myapp 200 1.032645
>>> 21400
>>> [Tue Apr 21 13:35:34 2009] [30410:2537034048] [info]
>>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>>> socket 16 returned -32 (errno=32)
>>> [Tue Apr 21 13:35:34 2009] [30410:2537034048] [info]
>>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>>> [Tue Apr 21 13:35:34 2009] [30410:2537034048] [info]
>>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>>> socket -1 keepalive cping/cpong failure (errno=32)
>>
>> Same again.
>>
>>> [Tue Apr 21 13:35:34 2009] POST /myapp/normen/servlet.myapp 200 0.071825
>>> 20753
>>> [Tue Apr 21 13:35:36 2009] POST /myapp/normen/servlet.myapp 200 0.115782
>>> 14137
>>> [Tue Apr 21 13:35:56 2009] [30413:2537034048] [info]
>>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>>> socket 1 returned -32 (errno=32)
>>> [Tue Apr 21 13:35:56 2009] [30413:2537034048] [info]
>>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>>> [Tue Apr 21 13:35:56 2009] [30413:2537034048] [info]
>>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>>> socket -1 keepalive cping/cpong failure (errno=32)
>>
>> Again
>>
>>> [Tue Apr 21 13:35:56 2009] POST /myapp/normen/servlet.myapp 200 0.040233
>>> 20753
>>> [Tue Apr 21 13:35:57 2009] [29827:2537034048] [info]
>>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>>> socket 16 returned -32 (errno=32)
>>> [Tue Apr 21 13:35:57 2009] [29827:2537034048] [info]
>>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>>> [Tue Apr 21 13:35:57 2009] [29827:2537034048] [info]
>>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>>> socket -1 keepalive cping/cpong failure (errno=32)
>>
>> Again
>>
>>> [Tue Apr 21 13:35:57 2009] POST /myapp/normen/servlet.myapp 200 0.036396
>>> 12264
>>> [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.292587
>>> 1409
>>> [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.010833
>>> 6729
>>> [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.015750
>>> 3484
>>> [Tue Apr 21 13:36:36 2009] GET /myapp/normen/servlet.myapp 200 0.032983
>>> 2915
>>
>> Note that all 4 cases were logged by a diffrent PID (process id), and in
>> the prefork case, each process has its own pool of size 1. So 4
>> different processes experienced a connection closed by the backend or
>> something in between. If you would log the JK access data and the PID in
>> the apache access log, you could verify, when those processes used the
>> backend connection the last time previously to verify the idleness
>> theory.
>>
>>> Sample 2 : (for illustration of the other issue only)
>>> --------
>>> [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.016737
>>> 6729
>>> [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.019133
>>> 3484
>>> [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.010458
>>> 2915
>>> [Tue Apr 21 13:15:32 2009] GET /myapp/normen/servlet.myapp 200 0.112756
>>> 18897
>>> [Tue Apr 21 13:15:44 2009] [29831:2537034048] [info]
>>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>>> socket 1 returned -32 (errno=32)
>>> [Tue Apr 21 13:15:44 2009] [29831:2537034048] [info]
>>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>>> [Tue Apr 21 13:15:44 2009] [29831:2537034048] [info]
>>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>>> socket -1 keepalive cping/cpong failure (errno=32)
>>
>> The same for pid 29831
>>
>>> [Tue Apr 21 13:15:44 2009] [30401:2537034048] [info]
>>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>>> socket 17 returned -32 (errno=32)
>>> [Tue Apr 21 13:15:44 2009] [30401:2537034048] [info]
>>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>>> [Tue Apr 21 13:15:44 2009] [30401:2537034048] [info]
>>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>>> socket -1 keepalive cping/cpong failure (errno=32)
>>
>> pid 30401
>>
>>> [Tue Apr 21 13:15:49 2009] [29827:2537034048] [info]
>>> ajp_connection_tcp_send_message::jk_ajp_common.c (1064): sendfull for
>>> socket 16 returned -32 (errno=32)
>>> [Tue Apr 21 13:15:49 2009] [29827:2537034048] [info]
>>> ajp_handle_cping_cpong::jk_ajp_common.c (855): can't send cping query
>>> [Tue Apr 21 13:15:49 2009] [29827:2537034048] [info]
>>> ajp_maintain::jk_ajp_common.c (3046): (ajp13) failed sending request,
>>> socket -1 keepalive cping/cpong failure (errno=32)
>>
>> pid 29827
>>
>>> [Tue Apr 21 13:15:49 2009] [29831:2537034048] [info]
>>> ajp_process_callback::jk_ajp_common.c (1748): Writing to client aborted
>>> or client network problems
>>> [Tue Apr 21 13:15:49 2009] [29831:2537034048] [info]
>>> ajp_service::jk_ajp_common.c (2407): (ajp13) sending request to tomcat
>>> failed (unrecoverable), because of client write error (attempt=1)
>>
>> Now a client write error, again pid 29831. Possibly, the cping write
>> error above too a long time to be detected and that's why the response
>> retrieved via a fresh connection wasn't consumed by the client? Although
>> the below response times seem to tell us, that the longest request in
>> the snippet here took 9 seconds, which is not that long.
>>
>> There is no mod_jk internal relation between the cping send failure and
>> a client write error. It could only be in some indirect way, like both
>> being caused by a general network problem (occuring on the path to the
>> browser and on the path to the backend - but then why don't we see lots
>> of other communication errors as well?) or by some timing issue
>> triggering user behaviour - not very likely in light of the blow
>> duration lines.
>>
>>> [Tue Apr 21 13:15:49 2009] POST /myapp/normen/servlet.myapp 200 5.210358
>>> 8184
>>> [Tue Apr 21 13:15:49 2009] [29831:2537034048] [info]
>>> jk_handler::mod_jk.c (2469): Aborting connection for worker=ajp13
>>
>> This is a consequence of the client write error, so next time this
>> process will open a fresh connection to the backend.
>>
>>> [Tue Apr 21 13:15:49 2009] POST /myapp/normen/servlet.myapp 200 0.363977
>>> 20753
>>> [Tue Apr 21 13:15:54 2009] POST /myapp/normen/servlet.myapp 200 9.057331
>>> 20755
>>> [Tue Apr 21 13:15:55 2009] POST /myapp/normen/servlet.myapp 200 0.215937
>>> 14166
>>
>> Regards,
>>
>> Rainer
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
>> For additional commands, e-mail: users-h...@tomcat.apache.org
>>
>>
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
> 
> 

-- 
kippdata
informationstechnologie GmbH   Tel: 0228 98549 -0
Bornheimer Str. 33a            Fax: 0228 98549 -50
53111 Bonn                     www.kippdata.de

HRB 8018 Amtsgericht Bonn / USt.-IdNr. DE 196 457 417
Geschäftsführer: Dr. Thomas Höfer, Rainer Jung, Sven Maurmann
===============================
kippdata
informationstechnologie GmbH   Tel: +49 228 98549 -0
Bornheimer Str. 33a            Fax: +49 228 98549 -50
D-53111 Bonn                   www.kippdata.de

HRB 8018 Amtsgericht Bonn / USt.-IdNr. DE 196 457 417
Geschäftsführer: Dr. Thomas Höfer, Rainer Jung, Sven Maurmann


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Reply via email to