Re: jk_handler::mod_jk.c (2917): Could not get endpoint for worker ...

Rainer Jung Fri, 21 Sep 2018 08:35:04 -0700

On 15.09.2018 at 12:50, Clemens Wyss DEV wrote:

Hi all,
we are seeing quite a few:
"[Mon Sep 10 15:19:46 2018] [27562:140532026529536] [error] jk_handler::mod_jk.c (2917): Could not get endpoint for worker=testAPJ"


errors in our mod_jk.log. Worker properties are as follows:

...
worker.list=testAPJ

worker.testAPJ.port=8009
worker.testAPJ.host=127.0.0.1
worker.testAPJ.type=ajp13
worker.testAPJ.socket_keepalive=1
worker.testAJP.connection_pool_timeout=600
...

At that point Apache seems to be stuck/struggling (but our tomcat does not seem 
to be under pressure). Restarting Apache solves the issue ... till it pops up 
again ...

What is happening? What needs to be tuned?

Apache 2.4.34, tried both event- and worker-MPM


Assuming this is mod_jk 1.2.44? Are there more settings for worker testAPJ?

Normally mod_jk creates as many local connection structures (named endpoints) in each Apache httpd child process as that process has worker threads. When an httpd worker thread wants to talk to Tomcat, it retrieves such an endpoint and uses it to create and handle the communication.

The error you observe means that all endpoints were already in use. Since we create as many structures as there are worker threads (everything is per httpd process), this should not happen (and I don't remember any case where it did happen).
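To make the slot arithmetic concrete, here is a toy model in Python. It is only a sketch of the idea, not mod_jk's actual C code; the names EndpointPool and simulate are made up for illustration. It assumes the worst case, in which every worker thread of one httpd process wants an endpoint at the same moment.

```python
import queue


class EndpointPool:
    """Toy model of the endpoint slots of one httpd child process.

    Hypothetical illustration only; mod_jk's real implementation is in C.
    """

    def __init__(self, slots):
        self._free = queue.Queue()
        for index in range(slots):
            self._free.put(index)

    def acquire(self):
        """Return a free slot index, or None if every slot is in use."""
        try:
            return self._free.get_nowait()
        except queue.Empty:
            return None

    def release(self, slot):
        """Hand a slot back so another thread can use it."""
        self._free.put(slot)


def simulate(threads, slots):
    """Let `threads` workers each grab an endpoint at once; count failures."""
    pool = EndpointPool(slots)
    held = [pool.acquire() for _ in range(threads)]
    return sum(1 for slot in held if slot is None)
```

With as many slots as threads, simulate(32, 32) reports no failures. With fewer slots than threads, for example simulate(32, 25), the surplus threads come away empty-handed, which corresponds to the situation behind the error message.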


Ideas what could go wrong:

- setting the worker property connection_pool_size or the deprecated cachesize for worker testAPJ to a smaller value than your httpd ThreadsPerChild (32 from your config snippet). If not set, mod_jk automatically detects the number of httpd worker threads

- setting connection_acquire_timeout to a small value. By default it equals retries*retry_interval, which in turn defaults to 2*100 milliseconds. Before it shows you the error message, mod_jk will retry getting an endpoint "retries" times with a sleep pause of "retry_interval" milliseconds, but for no longer than connection_acquire_timeout milliseconds.

- retrieving an endpoint must acquire a lock first. On some platforms locking can lead to problems like false positives in deadlock detection. But I think this can't happen here since the code doesn't check the return value of the locking.


- memory shortage leading to failing allocations (not likely but possible)
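The retry timing from the connection_acquire_timeout item can be sketched as follows. This is a simplified Python model under stated assumptions, not the real mod_jk code: the function acquire_with_retry and its try_acquire callback are hypothetical, and only the parameter names and defaults (retries=2, retry_interval=100 ms) are taken from the description above.

```python
import time


def acquire_with_retry(try_acquire, retries=2, retry_interval_ms=100,
                       acquire_timeout_ms=None, sleep=time.sleep):
    """Try to get an endpoint, pausing between attempts.

    Hypothetical model: try_acquire() returns an endpoint or None.
    Returns (endpoint_or_None, milliseconds_spent_sleeping).
    """
    if acquire_timeout_ms is None:
        # Default described above: retries * retry_interval (2 * 100 ms).
        acquire_timeout_ms = retries * retry_interval_ms

    waited_ms = 0
    attempts = 0
    while attempts < retries and waited_ms < acquire_timeout_ms:
        endpoint = try_acquire()
        attempts += 1
        if endpoint is not None:
            return endpoint, waited_ms
        # Never sleep past the overall acquire timeout.
        pause_ms = min(retry_interval_ms, acquire_timeout_ms - waited_ms)
        sleep(pause_ms / 1000.0)
        waited_ms += pause_ms
    return None, waited_ms
```

In this model the defaults give up after about 200 milliseconds of waiting, so a very small connection_acquire_timeout leaves a busy process almost no time for an endpoint to become free.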

Do you see any other log messages? Any in the httpd error log or especially the mod_jk log? There should be a WARN message of type "Unable to get the free endpoint for worker %s from %u slots", but maybe there are more before that final problem happens? What do you see with JkLogLevel info?

Does the problem happen under high load or when your backend gets slow? What does "netstat -anp | grep 8009" show when the hang occurs?


Regards,

Rainer

