[Jakarta-httpclient Wiki] Update of "ConnectionManagementDesign" by RolandWeber

Apache Wiki Sat, 06 Jan 2007 11:40:18 -0800

The following page has been changed by RolandWeber:
http://wiki.apache.org/jakarta-httpclient/ConnectionManagementDesign

New page:
This page is for collecting ideas for and thoughts on connection management in 
HttpComponents.
For the time being, the focus is on client side connection management.

----
[[TableOfContents]]
----

= Introduction =

== Purpose of Connection Management ==

 1. Enforce connection limits.[[BR]]Typical limits are a maximum number of 
connections in total, and a maximum number of connections to a target host.
 1. Re-use open connections to increase performance.[[BR]]Opening a connection 
is time-consuming, in particular for TLS/SSL connections. The connection 
keep-alive feature of HTTP allows for multiple requests to be sent over a 
connection.

== Terminology ==

Although the definitions talk about ''sending'', communication is always 
bidirectional. After an HTTP request is sent, the response needs to be received 
on the same connection.

 target:: HTTP server, the endpoint for HTTP communication

 proxy:: HTTP proxy server, a transit point for HTTP communication. By default, 
a proxy server will receive HTTP requests to process and forward them. Proxies 
can also be ''tunnelled'', see below. 

 connection:: communication link for sending data towards the target. 
Technically, an open connection corresponds to a connected TCP/IP or TLS/SSL 
socket.

 tunnel:: logical communication link that protects data from interpretation and 
modification by a proxy. A tunnel must be ''established'' by sending a CONNECT 
request to the proxy. The proxy will create a communication link to the next 
hop, and forward all data without interpretation. Tunnels are used to send 
encrypted data through proxies.

 route:: a sequence of proxies leading to a target. For all practical purposes, 
a route is either direct to the target or via one proxy. A route uses either 
plain HTTP, TLS/SSL directly to the target, or TLS/SSL over a tunnel via the 
proxy. Tunnelling a plain HTTP connection via a proxy is possible but unusual.
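
The route definition can be captured in a small value type. The following is a minimal sketch only, not an HttpComponents class: the name {{{RouteSketch}}}, its fields, and its factory methods are assumptions made for illustration. It covers the cases named in the definition (direct, plain via one proxy, tunnelled via one proxy).

```java
import java.util.Objects;

/** Hypothetical value type describing a route; not an HttpComponents class. */
final class RouteSketch {
    final String target;      // host:port of the HTTP server
    final String proxy;       // host:port of the proxy, or null for a direct route
    final boolean secure;     // true if TLS/SSL is used to the target
    final boolean tunnelled;  // true if a CONNECT tunnel through the proxy is used

    private RouteSketch(String target, String proxy, boolean secure, boolean tunnelled) {
        this.target = Objects.requireNonNull(target, "target");
        this.proxy = proxy;
        this.secure = secure;
        this.tunnelled = tunnelled;
    }

    /** Direct to the target, plain HTTP or TLS/SSL. */
    static RouteSketch direct(String target, boolean secure) {
        return new RouteSketch(target, null, secure, false);
    }

    /** Plain HTTP, with the proxy receiving and forwarding the requests. */
    static RouteSketch viaProxy(String target, String proxy) {
        return new RouteSketch(target, Objects.requireNonNull(proxy, "proxy"), false, false);
    }

    /** Tunnelled through the proxy; usually secure, plain is possible but unusual. */
    static RouteSketch tunnelledVia(String target, String proxy, boolean secure) {
        return new RouteSketch(target, Objects.requireNonNull(proxy, "proxy"), secure, true);
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof RouteSketch)) return false;
        RouteSketch r = (RouteSketch) o;
        return target.equals(r.target) && Objects.equals(proxy, r.proxy)
                && secure == r.secure && tunnelled == r.tunnelled;
    }

    @Override public int hashCode() {
        return Objects.hash(target, proxy, secure, tunnelled);
    }
}
```

Value equality matters here because re-use decisions compare the route of an open connection with the route a request needs.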


= Discussion =

== Usage Patterns ==

In the traditional usage pattern, a thread that needs to send an HTTP request 
to a target will ask the connection manager for a connection to that target. 
The thread blocks until a connection becomes available. If the connection is 
already open to the target, it is used immediately. Otherwise it needs to be 
opened first, which might require tunnelling a TLS/SSL connection through a 
proxy. When the thread is done with the HTTP communication, the connection is 
returned to the connection manager. If there are no side conditions that 
prevent re-use, the connection is kept open for a limited time so that 
subsequent requests to the same target can be sent immediately.
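
The traditional pattern can be sketched as a blocking acquire/release pair. This is a minimal illustration under stated assumptions, not the !HttpClient implementation: {{{BlockingManagerSketch}}}, {{{Conn}}} and {{{maxLeased}}} are hypothetical names, routes are plain strings, the limit counts only connections currently handed out, and the time limit for keeping idle connections open is not modelled.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical blocking connection manager; names are illustrative only. */
final class BlockingManagerSketch {
    /** Stand-in for a real connection; it only tracks its route and open state. */
    static final class Conn {
        final String route;
        boolean open;
        Conn(String route) { this.route = route; }
    }

    private final int maxLeased;  // assumed limit on connections handed out at once
    private int leased;
    private final Map<String, Deque<Conn>> idle = new HashMap<>();

    BlockingManagerSketch(int maxLeased) { this.maxLeased = maxLeased; }

    /** Blocks until a connection may be handed out, preferring an idle one for the route. */
    synchronized Conn acquire(String route) {
        while (leased >= maxLeased) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a connection", e);
            }
        }
        leased++;
        Deque<Conn> queue = idle.get(route);
        Conn conn = (queue == null) ? null : queue.pollFirst();
        // A new connection is returned closed; the caller has to open it first.
        return (conn != null) ? conn : new Conn(route);
    }

    /** Returns a connection; open ones are kept for re-use on the same route. */
    synchronized void release(Conn conn) {
        leased--;
        if (conn.open) {
            idle.computeIfAbsent(conn.route, k -> new ArrayDeque<>()).addLast(conn);
        }
        notifyAll();  // wake threads blocked in acquire()
    }
}
```

A caller would acquire a connection, open it if {{{open}}} is false, perform the HTTP exchange, and release the connection in a {{{finally}}} block so that waiting threads are not starved.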

''!HttpCore-NIO'': The NIO module of !HttpCore hides the details of NIO 
communication from applications. The usage pattern will be similar to the 
traditional pattern, except that the application thread will not establish a 
connection itself. IO is performed by a background thread.

''!HttpAsync'': In !HttpAsync, the traditional usage pattern is reversed. 
Instead of requests looking for a connection, there are connections looking for 
requests. Requests generated by the application are collected in a pool, and 
the application threads don't block. Background threads choose requests 
that can be sent over an open connection. If there are none for that target, 
the connection can be closed and re-opened to a different target in order to 
send other requests.
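
The reversed pattern can be sketched as a request pool that a background thread queries on behalf of a connection. This is an illustration only and not the !HttpAsync code: {{{RequestPoolSketch}}}, {{{submit}}} and {{{pick}}} are hypothetical names, and targets are plain strings. The fallback choice (a request for the target that has been waiting longest) is an assumption; the text above does not prescribe a selection policy.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical request pool: connections look for requests, not the reverse. */
final class RequestPoolSketch {
    static final class Request {
        final String target;
        final String path;
        Request(String target, String path) { this.target = target; this.path = path; }
    }

    // Pending requests grouped by target; empty queues are removed eagerly.
    private final Map<String, Queue<Request>> pending = new LinkedHashMap<>();

    /** Called by application threads; never blocks waiting for a connection. */
    synchronized void submit(Request request) {
        pending.computeIfAbsent(request.target, k -> new ArrayDeque<>()).add(request);
    }

    /**
     * Called by a background thread that holds a connection open to openTarget
     * (null if the connection is closed). Returns a request for that target if
     * one is pending, otherwise a request for another target, in which case the
     * caller must close and re-open the connection. Returns null if idle.
     */
    synchronized Request pick(String openTarget) {
        Queue<Request> sameTarget = pending.get(openTarget);
        if (sameTarget != null) {
            Request request = sameTarget.poll();
            if (sameTarget.isEmpty()) pending.remove(openTarget);
            return request;
        }
        Iterator<Queue<Request>> it = pending.values().iterator();
        if (!it.hasNext()) return null;
        Queue<Request> other = it.next();
        Request request = other.poll();
        if (other.isEmpty()) it.remove();
        return request;
    }
}
```

The caller compares the returned request's {{{target}}} with the target of its open connection to decide whether the connection can be used as is.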

== Connection States ==

Opening a connection for a route can require several steps. The type of route 
that is established determines for which routes the connection can be re-used.

 closed:: The connection is not open. It can be used for any route, but it 
needs to be opened first.

 direct to target:: The connection is opened directly to the target, either for 
plain HTTP communication or with TLS/SSL. It can be re-used for exactly this 
route.

 plain to proxy:: The connection is opened to a proxy for plain HTTP 
communication. It can be re-used for all plain HTTP routes through that proxy. 
It can also be upgraded to a tunnelled TLS/SSL communication via that proxy.

 tunnelled via proxy:: The connection is tunnelled via a proxy to a target, 
typically for communication with TLS/SSL. It can be re-used for exactly this 
route.
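
The re-use rules for the four states can be written down as a single decision function. The sketch below is one possible reading of the list above, not existing API: {{{ConnState}}}, {{{Reuse}}}, {{{ReuseSketch}}} and its nested {{{Route}}} are hypothetical names. Authentication state is deliberately left out.

```java
import java.util.Objects;

/** The four connection states from the list above. */
enum ConnState { CLOSED, DIRECT_TO_TARGET, PLAIN_TO_PROXY, TUNNELLED_VIA_PROXY }

/** What has to happen before the connection can serve the wanted route. */
enum Reuse { NO, AS_IS, AFTER_OPEN, AFTER_TUNNEL }

final class ReuseSketch {
    /** Hypothetical route description; proxy is null for a direct route. */
    static final class Route {
        final String target;
        final String proxy;
        final boolean secure;
        final boolean tunnelled;
        Route(String target, String proxy, boolean secure, boolean tunnelled) {
            this.target = target;
            this.proxy = proxy;
            this.secure = secure;
            this.tunnelled = tunnelled;
        }
        boolean sameAs(Route o) {
            return target.equals(o.target) && Objects.equals(proxy, o.proxy)
                    && secure == o.secure && tunnelled == o.tunnelled;
        }
    }

    /** current is the route the connection is established for; ignored when CLOSED. */
    static Reuse check(ConnState state, Route current, Route wanted) {
        switch (state) {
            case CLOSED:
                return Reuse.AFTER_OPEN;          // usable for any route once opened
            case DIRECT_TO_TARGET:
            case TUNNELLED_VIA_PROXY:
                return current.sameAs(wanted) ? Reuse.AS_IS : Reuse.NO;  // exact route only
            case PLAIN_TO_PROXY:
                if (!Objects.equals(current.proxy, wanted.proxy)) return Reuse.NO;
                if (wanted.tunnelled) return Reuse.AFTER_TUNNEL;  // upgrade via CONNECT
                return wanted.secure ? Reuse.NO : Reuse.AS_IS;    // any plain route via the proxy
            default:
                throw new AssertionError(state);
        }
    }
}
```

Returning a four-valued result instead of a boolean keeps the distinction between a connection that can be used immediately and one that needs an extra step first.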

Reusability can also be affected by the authentication state of a connection.
If TLS/SSL with client authentication is used, a connection identifies the user 
to the server.
[[BR]]
''NTLM authentication is connection based too, in some way at least. Details?''

== Implementations ==

The {{{SimpleHttpConnectionManager}}} in !HttpClient 3.x has exactly one 
connection. It is re-used only for exactly the same route. No limits need to be 
enforced.

The {{{MultiThreadedHttpConnectionManager}}} in !HttpClient 3.x keeps a pool of 
connections. Re-use depends on exact matches of the route. Limits are enforced 
per route. In other words, the maximum connections per host are interpreted as 
maximum connections per route. If multiple proxy servers are available, the 
maximum per host can be exceeded by using different routes to the target host. 
This has not been a problem in the past.
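
The effect of interpreting the per-host maximum as a per-route maximum can be shown with a small counter. This is a sketch of the behaviour described above, not the code of {{{MultiThreadedHttpConnectionManager}}}: {{{PerRouteLimitSketch}}}, {{{tryLease}}} and {{{leasedTo}}} are hypothetical names, and a route is reduced to a target host plus an optional proxy.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical limit bookkeeping keyed by route rather than by target host. */
final class PerRouteLimitSketch {
    private final int maxPerRoute;
    private final Map<String, Integer> leasedPerRoute = new HashMap<>();
    private final Map<String, Integer> leasedPerHost = new HashMap<>();

    PerRouteLimitSketch(int maxPerRoute) { this.maxPerRoute = maxPerRoute; }

    /** Tries to lease a connection; proxy is null for a direct route. */
    boolean tryLease(String targetHost, String proxy) {
        String routeKey = targetHost + " via " + (proxy == null ? "direct" : proxy);
        int current = leasedPerRoute.getOrDefault(routeKey, 0);
        if (current >= maxPerRoute) return false;  // the limit is checked per route only
        leasedPerRoute.put(routeKey, current + 1);
        leasedPerHost.merge(targetHost, 1, Integer::sum);
        return true;
    }

    /** Number of connections currently leased to a host over all routes. */
    int leasedTo(String targetHost) {
        return leasedPerHost.getOrDefault(targetHost, 0);
    }
}
```

With a limit of 2 and two proxies leading to the same target host, four connections to that host can be leased at once, twice the configured maximum.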

A more sophisticated connection manager could provide performance benefits for 
!HttpClient 4.x. In particular, re-use of plain connections to a proxy for all 
routes via that proxy has potential. Even more so if the proxy requires NTLM 
authentication and the previous authentication state can be re-used.
