[ 
https://issues.apache.org/jira/browse/HTTPCORE-750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gaojie Liu updated HTTPCORE-750:
--------------------------------
        Fix Version/s:     (was: 5.2-alpha1)
                           (was: 5.1.2)
    Affects Version/s: 5.2.1
                           (was: 5.1.1)
          Description: 
We are using httpcore5 5.2.1 and we observed that at application startup time, 
httpclient5 could create more than 1 conn per endpoint and in some cases, too 
many connections caused OOM issue since httpclient5 creates a pair of 
input/output buffer per conn.

The regression is introduced by HTTPCORE-750, which made this change:

[https://github.com/apache/httpcomponents-core/commit/14caf43eae407c544161c7f92329e8beb42a3534]

 
{code:java}
 if (poolEntry.session != null) {
                callback.completed(poolEntry.session);
            } else {
                poolEntry.requestQueue.add(callback);
                if (poolEntry.sessionFuture != null && 
poolEntry.sessionFuture.isDone()) {
                    poolEntry.sessionFuture = null;
                } {code}
When poolEntry.sessionFuture.isDone() is true, the connection is already ready, 
but the existing logic will abandon it and create a new one, and under high 
load, this logic could create a lot of conns per endpoint, which consumes a lot 
of memory.

 

The proposed fix:
{code:java}
if (poolEntry.session != null) {
                callback.completed(poolEntry.session);
            } else {
                poolEntry.requestQueue.add(callback);
                if (poolEntry.sessionFuture != null && 
poolEntry.sessionFuture.isDone()) {
                    // Check whether we should recreate a new conn or not
                    try {
                        poolEntry.session = poolEntry.sessionFuture.get();
                        while (true) {
                            final FutureCallback<IOSession> pendingCallback = 
poolEntry.requestQueue.poll();
                            if (pendingCallback != null) {
                                pendingCallback.completed(poolEntry.session);
                            } else {
                                break;
                            }
                        }
                    } catch (final Exception e) {
                        poolEntry.sessionFuture = null;
                    }
                } {code}
I am planning to cut a PR for this.

 

  was:
There is an implementation problem in 
*AbstractIOSessionPool.getSessionInternal()*

Wrong pattern (starting line 190 in httpcore5-5.1.1) :

if (poolEntry.sessionFuture == null) {
                     *poolEntry.sessionFuture =* connectSession(
                             namedEndpoint,
                             connectTimeout,
                             new FutureCallback<IOSession>() {
 
                                ...
 
                                 @Override
                                 public void failed(final Exception ex) {
                                     synchronized (poolEntry) {
                                         poolEntry.session = null;
                                         *poolEntry.sessionFuture = null;*
                                         ...
                                     }
                                 }
                              ...

                            });

In case of a synchronous failure during connectSession (for example : 
UnknownHostException), the sessionFuture cleanup in the failed() callback (line 
216) is executed +before+ poolEntry.sessionFuture is assigned with the return 
value of connectSession (line 191)

Thus, after the failure, a cancelled future is assigned to 
poolEntry.sessionFuture and never cleaned up !

This result in a stalled entry in the pool because the _if 
(poolEntry.sessionFuture == null)_ condition will no longer return true.

 

*Quick Reproducer*

       H2AsyncClientBuilder httpClientBuilder = HttpAsyncClients.customHttp2()
                         .setDefaultRequestConfig(RequestConfig.custom()
                                         .setConnectTimeout(5000, 
TimeUnit.MILLISECONDS)
                                         .setResponseTimeout(5000, 
TimeUnit.MILLISECONDS)
                                         .build())
                         .setH2Config(H2Config.custom()
                                         .setPushEnabled(false)
                                         .build());
                 
         CloseableHttpAsyncClient client = httpClientBuilder.build();
         client.start();
         
         for(int n=1;n<=2;n++) 
         {
             System.out.println("Try #"+n);    
             
             HttpHost targetHost = new HttpHost(URIScheme.HTTPS.getId(), 
"invalidhost-gfhsgfxyzd.com", 443);
             SimpleHttpRequest request = new SimpleHttpRequest("GET", 
targetHost, "/");
     
             Future<SimpleHttpResponse> responseFuture = 
client.execute(request, null);
             try
             {
                 responseFuture.get();  // {color:#FF0000}*HANGS FOREVER at Try 
#2*{color}
             }
             catch (ExecutionException e)
             {
                 e.printStackTrace();
             }
         }

 

*Quick Fix*

Cleaning the sessionFuture immediately after connectSession() returns in case 
its already done fixes the issue :

(Line 234)

+ if (poolEntry.sessionFuture != null && poolEntry.sessionFuture.isDone())
 +                       poolEntry.sessionFuture = null;


> H2ConnPool creates too many conns under high load at initialization time
> ------------------------------------------------------------------------
>
>                 Key: HTTPCORE-750
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-750
>             Project: HttpComponents HttpCore
>          Issue Type: Bug
>          Components: HttpCore
>    Affects Versions: 5.2.1
>         Environment: Linux / Ubuntu / openjdk version "1.8.0_292"
>            Reporter: Gaojie Liu
>            Priority: Major
>
> We are using httpcore5 5.2.1 and we observed that at application startup 
> time, httpclient5 could create more than 1 conn per endpoint and in some 
> cases, too many connections caused OOM issue since httpclient5 creates a pair 
> of input/output buffer per conn.
> The regression is introduced by HTTPCORE-750, which made this change:
> [https://github.com/apache/httpcomponents-core/commit/14caf43eae407c544161c7f92329e8beb42a3534]
>  
> {code:java}
>  if (poolEntry.session != null) {
>                 callback.completed(poolEntry.session);
>             } else {
>                 poolEntry.requestQueue.add(callback);
>                 if (poolEntry.sessionFuture != null && 
> poolEntry.sessionFuture.isDone()) {
>                     poolEntry.sessionFuture = null;
>                 } {code}
> When poolEntry.sessionFuture.isDone() is true, the connection is already 
> ready, but the existing logic will abandon it and create a new one, and under 
> high load, this logic could create a lot of conns per endpoint, which 
> consumes a lot of memory.
>  
> The proposed fix:
> {code:java}
> if (poolEntry.session != null) {
>                 callback.completed(poolEntry.session);
>             } else {
>                 poolEntry.requestQueue.add(callback);
>                 if (poolEntry.sessionFuture != null && 
> poolEntry.sessionFuture.isDone()) {
>                     // Check whether we should recreate a new conn or not
>                     try {
>                         poolEntry.session = poolEntry.sessionFuture.get();
>                         while (true) {
>                             final FutureCallback<IOSession> pendingCallback = 
> poolEntry.requestQueue.poll();
>                             if (pendingCallback != null) {
>                                 pendingCallback.completed(poolEntry.session);
>                             } else {
>                                 break;
>                             }
>                         }
>                     } catch (final Exception e) {
>                         poolEntry.sessionFuture = null;
>                     }
>                 } {code}
> I am planning to cut a PR for this.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to