Hello Jetty Friends,

My Awesome Jetty Based Proxy Server 2.0 has been in production for a while 
now, delivering traffic like there's no tomorrow, and mostly it runs great.

However, we occasionally suffer transient outages, where one of our backing 
servers suddenly becomes very slow to respond.  We do have limits, but even 
with reasonably aggressive timeouts (on the order of 10 seconds), we handle 
enough traffic that the fronting proxy server's queue fills quite quickly if 
one of its backends goes away.

Normally, that's fine -- we're just a proxy server -- but under periods of 
high load we are additionally observing "leaks", both in our internal tracking 
metrics and in actual file descriptors.

Roughly pseudocoding, we do:

    class AwesomeJettyProxyServer20 extends AsyncMiddleManServlet {
        @Override
        protected void service(...) {
            // increment inFlight, then delegate to super.service(...)
        }

        @Override
        protected void onProxyResponseSuccess(...) {
            // decrement inFlight
        }

        @Override
        protected void onProxyResponseFailure(...) {
            // decrement inFlight
        }
    }
In normal operation this works fine, and we see all our inFlight metrics hover 
around 0, as they should.
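
For what it's worth, the accounting follows roughly this pattern (a plain-Java 
sketch with hypothetical names, not our actual code): increment on arrival, 
and decrement exactly once on whichever terminal callback fires, guarded 
against double-decrement:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the inFlight accounting pattern: increment on
// arrival, decrement exactly once on either terminal callback, so the
// metric returns to 0 even when a request finishes via the failure path.
class InFlightTracker {
    private final AtomicLong inFlight = new AtomicLong();

    long current() { return inFlight.get(); }

    // One Completion per request guards against double-decrement if both
    // callbacks (or the same callback twice) ever fire.
    Completion begin() {
        inFlight.incrementAndGet();
        return new Completion();
    }

    class Completion {
        private final AtomicBoolean done = new AtomicBoolean();
        void complete() {
            if (done.compareAndSet(false, true))
                inFlight.decrementAndGet();
        }
    }

    public static void main(String[] args) {
        InFlightTracker tracker = new InFlightTracker();
        Completion c = tracker.begin();
        c.complete();   // success path fires
        c.complete();   // failure path firing too: no double-decrement
        System.out.println(tracker.current()); // prints 0
    }
}
```
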
However, under load we find that eventually our queue fills with work and 
rejects new tasks.  Exactly as designed!
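
The rejection itself can be reproduced in miniature with a plain 
java.util.concurrent pool (a sketch, not Jetty's QueuedThreadPool): once all 
threads are busy and the bounded queue is full, the default AbortPolicy throws 
the same RejectedExecutionException we see in the log:

```java
import java.util.concurrent.*;

// Plain java.util.concurrent sketch (not Jetty's QueuedThreadPool) of the
// failure mode: once maxThreads are busy and the bounded queue is full,
// the default AbortPolicy throws RejectedExecutionException.
public class RejectionDemo {
    public static void main(String[] args) throws Exception {
        CountDownLatch block = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),        // queue capacity 1
                new ThreadPoolExecutor.AbortPolicy());

        // Occupy the single worker thread.
        pool.execute(() -> { try { block.await(); } catch (InterruptedException ignored) {} });
        pool.execute(() -> {});                     // fills the queue
        try {
            pool.execute(() -> {});                 // no thread, no queue slot
            System.out.println("accepted");
        } catch (RejectedExecutionException e) {
            System.out.println("rejected");         // prints "rejected"
        } finally {
            block.countDown();
            pool.shutdown();
        }
    }
}
```
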

    2018-02-12T22:23:49.385Z WARN  <ed9d2edc-8d56-40b4-a0a5-e051055b3f08> [default-pool-47] o.e.j.util.thread.QueuedThreadPool - QueuedThreadPool@default-pool{STARTED,192<=192<=192,i=163,q=128} rejected ... of -1}]]:runFillable:BLOCKING
    2018-02-12T22:23:49.386Z WARN  <ed9d2edc-8d56-40b4-a0a5-e051055b3f08> [default-pool-47] o.e.j.u.t.strategy.EatWhatYouKill -
    java.util.concurrent.RejectedExecutionException: ... of -1}]]:runFillable:BLOCKING
            at ...
            at ...
            at java.lang.Thread.run(Thread.java:748)

[ Stack trace is an example, in reality we get them from all over the place ]

But oh no, this seems (maybe?) to totally break the Jetty handling flow -- 
after these sorts of exceptions start filling the logs, we never again return 
to our baseline of 0 in-flight connections -- and if you 'lsof' the process 
later, you see a number of:

java    4729 root  688u  IPv4          263035204      0t0       TCP> (CLOSE_WAIT)
java    4729 root  690u  IPv4          263106017      0t0       TCP> (CLOSE_WAIT)
java    4729 root  716u  IPv4          263175957      0t0       TCP> (CLOSE_WAIT)
java    4729 root  740u  IPv4          263189945      0t0       TCP> (CLOSE_WAIT)

Eventually, the process runs out of file descriptors and effectively dies.

We don't have proof positive that this execution rejection is causing the 
problem, but it seems like a very likely cause -- failing to execute a 
callback could easily leak a half-shutdown socket, or cause one of the proxy 
servlet callbacks to never be invoked, and we've noticed a strong correlation 
between seeing it in our logs and finding mysteriously unhealthy instances.
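
One mitigation we've been sketching (a hypothetical pattern on our side, not 
something Jetty provides; executeOrClose is a made-up name): at submission 
points we control, catch the rejection and close the resource the task owned, 
so a rejected task can't strand a socket in CLOSE_WAIT:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical defensive wrapper for submission points we control: if the
// pool rejects the task, close the resource the task was responsible for,
// instead of leaking a half-shutdown socket.
public class CleanupOnReject {
    static void executeOrClose(Executor pool, Runnable task, Closeable resource) {
        try {
            pool.execute(task);
        } catch (RejectedExecutionException e) {
            try { resource.close(); } catch (IOException ignored) {}
        }
    }

    public static void main(String[] args) {
        boolean[] closed = {false};
        // An executor that always rejects, standing in for a saturated pool.
        Executor rejecting = task -> { throw new RejectedExecutionException(); };
        executeOrClose(rejecting, () -> {}, () -> closed[0] = true);
        System.out.println(closed[0]);  // prints "true"
    }
}
```
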

Does this sound like a plausible explanation for our observed file descriptor 
and accounting "leak"?  What's the right way to configure Jetty to reject 
excess load while still cleaning up after itself?  It seems that tasks related 
to already-existing requests should preferentially continue to execute over 
accepting new entries into the queue, but I don't see how to implement 
something like that.  Should we switch from the AbortPolicy to e.g. 
CallerRunsPolicy?  I've observed (what I think were) deadlocks with a small 
number of threads (< 10); would we risk similar phenomena with CallerRunsPolicy?
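
For context, my understanding of CallerRunsPolicy (a plain java.util.concurrent 
sketch) is that the overflow task runs synchronously on the submitting thread. 
That applies backpressure, but if the submitter were an I/O selector thread, it 
would stop selecting while it runs the task -- which is the kind of wedging I'm 
worried about:

```java
import java.util.concurrent.*;

// Sketch of CallerRunsPolicy: the overflow task runs on the submitting
// thread. That throttles intake, but the submitter does the work inline,
// so a selector/acceptor thread could end up stalled on request work.
public class CallerRunsDemo {
    public static void main(String[] args) throws Exception {
        CountDownLatch block = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                new ThreadPoolExecutor.CallerRunsPolicy());

        // Occupy the single worker, then fill the queue.
        pool.execute(() -> { try { block.await(); } catch (InterruptedException ignored) {} });
        pool.execute(() -> {});

        String[] ranOn = new String[1];
        pool.execute(() -> ranOn[0] = Thread.currentThread().getName()); // overflow
        // The overflow task ran inline on the submitting (main) thread.
        System.out.println(ranOn[0].equals(Thread.currentThread().getName())); // prints "true"

        block.countDown();
        pool.shutdown();
    }
}
```
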

Finally, we also configure the Client and Server to share the same thread 
pool.  Are there additional dangers lurking here if the Jetty Client gets 
similar RejectedExecutionExceptions from the pool?

Thanks for any advice!


jetty-users mailing list