Linton Miller created HTTPCORE-590:
--------------------------------------

             Summary: LaxConnPool may leak connections
                 Key: HTTPCORE-590
                 URL: https://issues.apache.org/jira/browse/HTTPCORE-590
             Project: HttpComponents HttpCore
          Issue Type: Bug
          Components: HttpCore
    Affects Versions: 5.0-beta8
            Reporter: Linton Miller


LaxConnPool may leak connections when a caller cancels a pending lease request.

Consider a pool with a maximum of just 1 connection, currently leased to Thread2.

Thread1:
future = lease
  ... no available connection, so the request is added to pending and an unsatisfied future is returned ...

Thread2:
release
  ... put connection back in available list ...
  servicePendingRequest
    pending.poll() -> Thread1's outstanding LeaseRequest
    leaseRequest.isDone() = false
    leaseRequest.getDeadline().isExpired() = false
    entry = getAvailableEntry()
    addLeased(entry)

Thread1:
  future.cancel
    ...
    future.completed = true
    ...
    return true;
  ... future successfully cancelled; Thread1 has given up waiting for a connection and moves on without one ...

Thread2:
  leaseRequest.completed(entry)
    future.completed(entry)
      if (future.completed) return false;

At this point the completed call has failed (returned false), meaning the entry was not received by any caller waiting on the future. But that failure is ignored and the code proceeds as if the handover succeeded, leaving the entry in the leased list, where it now stays permanently because no caller will ever release it.
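
The handover failure can be seen in isolation with org.apache.hc.core5.concurrent.BasicFuture, which as far as I can tell backs the lease request. This standalone snippet is my own illustration of the cancel/complete semantics, not pool code:
{code:java}
import org.apache.hc.core5.concurrent.BasicFuture;

public class CancelCompleteRace {
  public static void main(String[] args) {
    final BasicFuture<String> future = new BasicFuture<>(null);
    // Thread1: the caller gives up and cancels its pending lease request
    System.out.println(future.cancel(true));       // true
    // Thread2: the pool then tries to hand over the entry, but it is too
    // late; completed() reports the failure that LaxConnPool ignores
    System.out.println(future.completed("entry")); // false
  }
}
{code}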


The code must test the result of attempting to complete the future, and if it 
fails, must return the entry to the available pool.
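
In terms of the trace above: after addLeased(entry), a false return from leaseRequest.completed(entry) means the caller cancelled in the meantime, and the lease must be undone. Below is a self-contained sketch of that pattern, with the pool internals reduced to two queues; all names here are illustrative, not HttpCore's actual internals:
{code:java}
import org.apache.hc.core5.concurrent.BasicFuture;

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class SafeHandoverSketch {
  // Stand-ins for the pool's per-route available/leased sets
  static final Queue<String> available = new ConcurrentLinkedQueue<>();
  static final Queue<String> leased = new ConcurrentLinkedQueue<>();

  static void servicePendingRequest(final BasicFuture<String> future) {
    final String entry = available.poll();
    if (entry == null) {
      return;
    }
    leased.add(entry);
    if (!future.completed(entry)) {
      // The caller cancelled between poll() and completed(): undo the
      // lease and put the entry back, otherwise it is leaked forever.
      leased.remove(entry);
      available.add(entry);
    }
  }
}
{code}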


The error can be recreated with the following test code:
{code:java}
import org.apache.hc.core5.io.CloseMode;
import org.apache.hc.core5.io.ModalCloseable;
import org.apache.hc.core5.pool.LaxConnPool;
import org.apache.hc.core5.pool.PoolEntry;
import org.apache.hc.core5.util.TimeValue;

import java.util.concurrent.Future;
import java.util.concurrent.Phaser;


public class PoolCancelTest {

  // Minimal connection stand-in: the pool only needs a ModalCloseable.
  private static class TestConn implements ModalCloseable {
    @Override
    public void close() {
    }
    @Override
    public void close(CloseMode mode) {
    }
  }

  private static LaxConnPool<String,TestConn> pool;
  private static Phaser sync = new Phaser(1);

  private static void dbg(String msg) {
    System.out.println(Thread.currentThread().getName() + ": " + System.currentTimeMillis() + ": " + msg);
  }

  public static void main(String[] args) {
    final int poolSize = Integer.parseInt(args[0]);
    pool = new LaxConnPool<>(poolSize);
    // Create a pool entry purely to prime class loading
    new PoolEntry<>("", TimeValue.NEG_ONE_MILLISECONDS);
    
    for (int i=0;i<poolSize*2;i++) {
      sync.register();
      new Thread(String.format("req-%04d", i)) {
        @Override
        public void run() {
          dbg("Started");
          Future<PoolEntry<String,TestConn>> req = pool.lease("somehost",null);
          PoolEntry<String,TestConn> entry = null;
          if (req.isDone()) {
            try {
              entry = req.get();
              if (!entry.hasConnection()) {
                entry.assignConnection(new TestConn());
              }
            }
            catch (Exception shouldNotHappen) {
              dbg("Should not happen!");
              shouldNotHappen.printStackTrace();
            }
          }
          sync.arriveAndAwaitAdvance();
          if (entry != null) {
            pool.release(entry, true);
            dbg("Released pool entry "+System.identityHashCode(entry));
          }
          else {
            if (req.cancel(true)) {
              req = pool.lease("somehost",null);
            }
          }
          try {
            dbg("Assigned pool entry "+System.identityHashCode(req.get()));
          }
          catch (Exception fail) {
            dbg("Getting pool entry failed");
            fail.printStackTrace();
          }
        }
      }.start();
    }
    sync.arriveAndDeregister();
  }
}
{code}
It generally takes a significant number of threads for the right timing sequence to occur reliably, but on my machine, running with a command-line argument of 300 (i.e. java PoolCancelTest 300) causes the code to lock up on 50+% of runs: a connection is leaked from the pool, so a thread ends up in an infinite wait on its future.

 


