On Wed, 2021-03-31 at 13:42 +0200, Oleg Kalnichevski wrote:
> On Tue, 2021-03-30 at 15:54 -0700, Ryan Schmitt wrote:
> > Reproducer just dropped:
> > 
> > https://github.com/rschmitt/httpclient-benchmark
> > 
> > Just run the `benchmark` target as usual, using JDK11. I've tested it on
> > Linux and macOS; Windows *will not work*. The output should look something
> > like this:
> > 
> 
> I can reproduce the issue and am trying to find its cause.
> 
> Oleg
> 

Hi Ryan

I believe I have found the root cause of the leak. It is a classic race
condition: a connection request completes and its requester gives up on
it (the request times out) at about the same time.
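
A minimal sketch of this kind of race, using only java.util.concurrent.
All names here (LeaseRaceSketch, ToyPool, completeLeaky, completeSafe)
are hypothetical and this is not the actual HttpClient pool code. The
losing interleaving is replayed deterministically: the requester cancels
first, then the pool tries to hand over the connection.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class LeaseRaceSketch {

    // Hypothetical toy pool; it only counts connections it considers leased.
    static final class ToyPool {
        final AtomicInteger leased = new AtomicInteger();

        // Leaky hand-off: the connection is marked as leased and offered to
        // the pending request. If the requester has already given up, the
        // offer is rejected, nobody owns the connection, and it is never
        // released back to the pool.
        void completeLeaky(CompletableFuture<String> request, String conn) {
            leased.incrementAndGet();
            request.complete(conn); // result ignored
        }

        // Safer hand-off: complete() returns false when the future was
        // already completed or cancelled, so the pool takes the connection back.
        void completeSafe(CompletableFuture<String> request, String conn) {
            leased.incrementAndGet();
            if (!request.complete(conn)) {
                release(conn);
            }
        }

        void release(String conn) {
            leased.decrementAndGet();
        }
    }

    // Replays "requester times out, then the lease completes" and returns
    // how many connections the pool still counts as leased afterwards.
    static int leakedAfterTimeout(boolean safe) {
        ToyPool pool = new ToyPool();
        CompletableFuture<String> request = new CompletableFuture<>();
        request.cancel(true); // the requester gives up on the request
        if (safe) {
            pool.completeSafe(request, "conn-1");
        } else {
            pool.completeLeaky(request, "conn-1");
        }
        return pool.leased.get();
    }

    public static void main(String[] args) {
        System.out.println("leaky hand-off, still leased: " + leakedAfterTimeout(false));
        System.out.println("safe hand-off, still leased: " + leakedAfterTimeout(true));
    }
}
```

In the sketch, the leaky variant ends with one connection still counted
as leased and the safe variant ends with none. In a real pool the two
steps run on different threads, so the leak only shows up under load
with short timeouts.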

I am working on a fix.

Did you have any luck reproducing the other defect?

Oleg


> > > > Task :benchmark
> > > =================================
> > > HTTP agent: Apache HttpClient (ver: 5.0)
> > > =================================
> > > 12800 GET requests
> > > ---------------------------------
> > > No connection leak detected...
> > > Connection leak detected!
> > > Connection leak detected
> > > [leased: 3; pending: 0; available: 0; max: 8]
> > > Document URI:           http://localhost:8888/rnd?c=2000
> > > Document Length:        0 bytes
> > > 
> > > Concurrency level:      64
> > > Time taken for tests:   0.349 seconds
> > > Complete requests:      0
> > > Failed requests:        128
> > > Content transferred:    0 bytes
> > > Requests per second:    0.0 [#/sec] (mean)
> > > 
> > > BUILD SUCCESSFUL in 4s
> > > 3 actionable tasks: 2 executed, 1 up-to-date
> > 
> > On Tue, Mar 30, 2021 at 3:08 PM Ryan Schmitt <[email protected]>
> > wrote:
> > 
> > > No need to exchange messages. It turns out that you can reproduce this
> > > issue purely with connection timeouts, or TLS handshake timeouts. It
> > > appears that both the strict and the lax connection pools can leak
> > > connections, but it appears easier to reproduce with the strict one.
> > > 
> > > On Tue, Mar 30, 2021 at 1:52 PM Oleg Kalnichevski <[email protected]>
> > > wrote:
> > > 
> > > > On Tue, 2021-03-30 at 13:35 -0700, Ryan Schmitt wrote:
> > > > > Good news, actually: I think I *just* reproduced it now. I ran a
> > > > > hacked-up benchmark that sends 100,000 HTTPS requests across 50
> > > > > threads with various randomized timeouts and delays, and after
> > > > > everything was done there were still two "leased" connections in
> > > > > the connection pool. This is exactly what I was looking for. A
> > > > > turnkey repro and a fix might not be far off now.
> > > > > 
> > > > 
> > > > All connections have a unique ID assigned to them at construction time
> > > > which is also used in the context logs as a correlation id.
> > > >
> > > > If you could dump the ids of the connections still leased from the pool
> > > > at the end of a benchmark run, you could look for abnormalities in
> > > > message exchanges over those connections.
> > > > 
> > > > Oleg
> > > > 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
> 


