Re: What causes "client errors" with mod_jk

Christopher Schultz Thu, 26 May 2022 14:04:04 -0700

All,

On 5/26/22 15:49, Christopher Schultz wrote:

Rainer,
On 5/16/22 13:48, Christopher Schultz wrote:
Rainer,

I've been looking into this a little more in my production environment.
These errors are not super common, but there seems to be a steadytrickle of errors from my two services that have human users. I see 0errors for my API-based services, which makes me think that I don'thave any performance issues... I probably have human usersdisappearing for random reasons.
The errors in mod_jk.log look like this:
[Sun May 15 04:19:15.643 2022] [5859:139625025315392] [info]ajp_process_callback::jk_ajp_common.c (2077): (myworker) Writing toclient aborted or client network problems[Sun May 15 04:19:15.644 2022] [5859:139625025315392] [info]ajp_service::jk_ajp_common.c (2773): (myworker) sending request totomcat failed (unrecoverable), because of client write error (attempt=1)[Sun May 15 04:19:15.646 2022] [5859:139625025315392] [info]service::jk_lb_worker.c (1595): service failed, worker myworker is inlocal error state[Sun May 15 04:19:15.646 2022] [5859:139625025315392] [info]service::jk_lb_worker.c (1614): unrecoverable error 200, requestfailed. Client failed in the middle of request, we can't recover toanother instance.[Sun May 15 04:19:15.646 2022] [5859:139625025315392] [info]jk_handler::mod_jk.c (2984): Aborting connection for worker=myworker
(Note that the last message "Aborting connection for worker=myworker"may have a bug; my actual worker name has a name containing a hyphen(-) and only the text before the hyphen is being emitted in that errormessage.)
Anyway, when researching these errors, it would be helpful to me toknow which requests are failing. By looking at the access_log, I onlysee a single request logged for 04:19:15 on that server so it'sprobably the right one, but httpd says that the response code is 302instead of e.g. 50x for an error.
When we log these kinds of errors, it would be great to know a fewthings IMO:
1. What was the URL of the request
2. How long did the client wait for the response before we found wecouldn't write to the stream (or, conversely, the start-timestamp ofthe request as well as the timestamp of the error, which I think isalready in the log itself)
I see the place in the code where the error is generated, but I'm notfamiliar enough with the code to know how to add that kind of thing.The function in question (ajp_process_callback) has a pointer to ajk_ws_service_t structure. Is it as simple as also logging like this?
   /* convert start-time to a string */
   char[32] timestamp;
apr_strftime(timestamp, NULL, 32, "%Y-%m-%d %H:%M:%S",r->r->request_time);
   /* emit a detailed log message */
   jk_log(l, JK_LOG_INFO,
          "(%s) Request to (%s %s) started at %s,%ld",
ae->worker->name, r->method, r->req_uri, timestamp,r->r->request_time.tm_usec);
Does anyone think this might be generally useful?
It looks like I needed a few more things in order to get this to work:

   {
     char *timestamp = malloc(32);
     apr_time_exp_t *timerec = malloc(sizeof(apr_time_exp_t));
     apache_private_data_t *ap = (apache_private_data_t*)r->ws_private;
     apr_time_exp_gmt(timerec, ap->r->request_time);
     apr_strftime(timestamp, NULL, 32, "%Y-%m-%d %H:%M:%S", timerec);
     /* emit a detailed log message */
     jk_log(l, JK_LOG_INFO,
            "(%s) Request to (%s %s) started at %s",
            ae->worker->name, r->method, r->req_uri, timestamp);
     free(timerec);
     free(timestamp);
   }
The compiler wouldn't let me use an automatically-allocated char[32] soI had to malloc it. I also had to convert from long to apr_time_exp_twhich required another structure/malloc as well.
Finally, I had to cast the r->ws_private to apache_private_data_t* whichrequired me to copy/paste the definition of apache_private_data_t frommod_jk.h into jk_ajp_common.c -- just as a test for now.
Obviously, this works only for Apache httpd.
I have it running on one of my web servers for now -- hoping to catch anerror and see this additional logging to see if it's helpful to me. (LOLjust segfaulted; time to go back and look.)
What would be the best way to get the "start time" of the request in aplatform-independent way? Maybe just expand the jk_ws_service structureto include it?


Here is my latest code which hasn't segfaulted in a while:

{
  /* emit a detailed log message */
  char timestamp[32];     // For formatted timestamp string
  apr_time_exp_t timerec; // Required for apr_strftime
  apr_size_t retsize;     // Required for apr_strftime
  apache_private_data_t *ap = (apache_private_data_t*)r->ws_private;

apr_time_exp_gmt(&timerec, ap->r->request_time); // Convertrequest_time to apr_time_exp_tapr_strftime(timestamp, &retsize, 32, "%Y-%m-%d %H:%M:%S", &timerec);// Format the timestamp

  if(0 == retsize) {
    strcpy(timestamp, "(overflow)");
  }

  jk_log(l, JK_LOG_INFO,
         "(%s) Request to (%s %s%s%s) started at %s,%ld",
         ae->worker->name, r->method, r->req_uri,
         r->query_string ? "?" : "",
         r->query_string ? r->query_string : "",
         timestamp, timerec.tm_usec);
}

There are no more calls to malloc in there, so the compiler is handlingthe memory for all the locals I'm inventing in this code block.

Back to my original investigation... the small number or errors thus farseem to indicate pathological behavior on the user's part rather thanany fault in my own application. In most cases, the (same?) userlaunches a bunch of requests to the same URI and most of them fail. Isuspect it's a person multi-clicking a link to a specific resource.Based upon the URI and our user patterns, it's not likely mobile clients.


-chris

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org

Re: What causes "client errors" with mod_jk

Reply via email to