[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-2142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carter Kozak updated HTTPCLIENT-2142:
-------------------------------------
    Description: 
This ticket is based on a discussion from the mailing list: 
http://mail-archives.apache.org/mod_mbox/hc-dev/202103.mbox/%3C60fa1a2bad046869a2da9269b7e90b9484a050ca.camel%40apache.org%3E
h3. Background

Currently several connection managers support a validation interval for idle 
connections, which is used to check whether they're still open before 
attempting to send a request. The Java Socket API provides no way for a client 
to detect that the remote side has closed a connection without attempting a 
blocking read operation. In my experience the most common failure is that a 
remote server _cleanly_ closes idle connections sooner than the client expects 
(either due to a low idle connection timeout or a server shutdown), resulting 
in failed requests and retries.

A stale check on a healthy Java socket takes at least 1 millisecond because 
socket timeouts have millisecond resolution, and the only way (that I'm aware 
of) to check if a connection has been closed is to do a blocking read. A 
{{SocketTimeoutException}} tells us the connection is intact, while 
end-of-stream or any other {{IOException}} suggests the connection is unusable. 
In many environments a 1ms check is far more expensive than the median request 
to a remote server and causes additional context switching.
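
A minimal sketch of the blocking-read check described above, assuming a plain 
{{java.net.Socket}}. The names {{StaleCheckSketch}} and {{isStale}} are 
illustrative and are not HttpClient API. Treating unexpected data on an idle 
connection as stale is also an assumption of this sketch, since the probe 
consumes the byte it reads.

```java
import java.io.IOException;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class StaleCheckSketch {

    private StaleCheckSketch() {}

    /**
     * Returns true if the idle connection should not be reused.
     * Hypothetical helper, not part of HttpClient.
     */
    static boolean isStale(Socket socket) {
        int previousTimeout;
        try {
            previousTimeout = socket.getSoTimeout();
        } catch (IOException e) {
            return true; // socket is already closed or broken
        }
        try {
            // 1 ms is the smallest non-zero timeout; 0 would block indefinitely.
            socket.setSoTimeout(1);
            // Returns -1 if the remote side closed cleanly. Any other value
            // is unexpected data on an idle connection; the byte has been
            // consumed, so this sketch treats the connection as unusable too.
            socket.getInputStream().read();
            return true;
        } catch (SocketTimeoutException e) {
            return false; // nothing arrived within 1 ms: connection is open
        } catch (IOException e) {
            return true; // reset or other I/O failure
        } finally {
            try {
                socket.setSoTimeout(previousTimeout);
            } catch (IOException ignored) {
                // the socket is unusable anyway
            }
        }
    }
}
```

Every check of a healthy connection waits out the full timeout, which is where 
the 1ms floor comes from.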

When the validation interval is set to a higher value to avoid performance 
pitfalls, idle connections may build up in the connection pool. If the remote 
server unexpectedly restarts, the client may need to churn through hundreds of 
requests, burning through retries, before the higher validation interval is 
reached or the remaining connections are closed upon reaching the keep-alive 
timeout.
h3. Potential Ideas
h4. Route state awareness

Track connection health state per route such that stale checks are required 
before using a connection that has been idle longer than any connection that 
has been determined to be stale. This requires functionality to detect 
{{ConnectionClosedException}} and {{NoHttpResponseException}} produced by 
attempted requests, and to report them back to the connection pool. This allows 
retryable requests to be executed optimistically (in cases where it's safe to 
do so) and to pre-validate only after a failure has occurred, limiting the 
number of retries that may be consumed.
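
One way this bookkeeping could look, as a rough sketch only. 
{{RouteStalenessTracker}} and its methods are hypothetical names, a plain 
string stands in for the route, and the {{reset}} hook is an assumption that 
the idea above does not spell out.

```java
import java.time.Duration;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class RouteStalenessTracker {

    // Per route: the shortest idle time after which a connection was
    // found to have been closed by the remote side.
    private final ConcurrentMap<String, Duration> shortestStaleIdle =
            new ConcurrentHashMap<>();

    /**
     * Called when a request on a reused connection fails with, for example,
     * ConnectionClosedException or NoHttpResponseException.
     */
    void recordStale(String route, Duration idleBeforeFailure) {
        shortestStaleIdle.merge(route, idleBeforeFailure,
                (known, observed) -> observed.compareTo(known) < 0 ? observed : known);
    }

    /**
     * Returns true if a pooled connection on this route has been idle at
     * least as long as one that was already found stale.
     */
    boolean requiresValidation(String route, Duration idle) {
        Duration threshold = shortestStaleIdle.get(route);
        return threshold != null && idle.compareTo(threshold) >= 0;
    }

    /** Forgets what was learned about a route (assumed hook). */
    void reset(String route) {
        shortestStaleIdle.remove(route);
    }
}
```

Until a failure is recorded for a route, {{requiresValidation}} returns false 
and requests run optimistically.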
h4. Request-aware stale checking predicate

Provide a configurable mechanism to force stale checks based on the incoming 
request. For example, I may always want to validate connections before making 
requests to non-idempotent endpoints or when the request body is not considered 
repeatable, but have more risk tolerance when I'm confident a retry is 
possible. Perhaps also force stale connection checks for all retries.
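
A sketch of what such a predicate might look like. {{StaleCheckPolicy}} and 
{{RequestInfo}} are hypothetical and deliberately simplified; they are not 
existing HttpClient types, and a request without a body is assumed to count as 
repeatable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.function.Predicate;

final class StaleCheckPolicy {

    /** Hypothetical, simplified view of an outgoing request. */
    static final class RequestInfo {
        final String method;
        final boolean bodyRepeatable;
        final int attempt; // 1 for the first attempt, 2+ for retries

        RequestInfo(String method, boolean bodyRepeatable, int attempt) {
            this.method = method.toUpperCase(Locale.ROOT);
            this.bodyRepeatable = bodyRepeatable;
            this.attempt = attempt;
        }
    }

    // Methods defined as idempotent by the HTTP specification.
    private static final Set<String> IDEMPOTENT = new HashSet<>(
            Arrays.asList("GET", "HEAD", "OPTIONS", "PUT", "DELETE", "TRACE"));

    static final Predicate<RequestInfo> NON_IDEMPOTENT =
            r -> !IDEMPOTENT.contains(r.method);
    static final Predicate<RequestInfo> NON_REPEATABLE_BODY =
            r -> !r.bodyRepeatable;
    static final Predicate<RequestInfo> IS_RETRY =
            r -> r.attempt > 1;

    /** Force a stale check if any of the three conditions holds. */
    static final Predicate<RequestInfo> DEFAULT =
            NON_IDEMPOTENT.or(NON_REPEATABLE_BODY).or(IS_RETRY);

    private StaleCheckPolicy() {}
}
```

Because the pieces are ordinary {{Predicate}} instances, a user with more risk 
tolerance could supply only {{IS_RETRY}}, or compose their own.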
h4. Alternative forms of stale checks

We may consider making the stale-check mechanism itself configurable: a 
blocking read could be used, or an HTTP/2 ping when available, or even a full 
{{OPTIONS}} or {{HEAD}} request.

It's not entirely clear if any of this should be shared between the classic and 
async clients. I think the NIO interfaces already handle the common case 
(cleanly closed connections), which the blocking socket API cannot detect 
without a blocking read.
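
To illustrate the NIO point, here is a simplified sketch using a non-blocking 
{{SocketChannel}}. {{NioCloseDetection}} and {{isClosedByPeer}} are 
hypothetical names, and a real async client would normally learn about the 
close from its selector rather than by probing.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SocketChannel;

final class NioCloseDetection {

    private NioCloseDetection() {}

    /**
     * Returns true if the remote side has closed the connection or the
     * channel is otherwise broken. The channel must be non-blocking.
     * Note: if data is pending, this probe consumes one byte of it.
     */
    static boolean isClosedByPeer(SocketChannel channel) {
        if (channel.isBlocking()) {
            throw new IllegalStateException("channel must be non-blocking");
        }
        ByteBuffer probe = ByteBuffer.allocate(1);
        try {
            // Non-blocking read: 0 means no data yet, -1 means end-of-stream.
            return channel.read(probe) < 0;
        } catch (IOException e) {
            return true; // reset, closed channel, or other I/O failure
        }
    }
}
```

The read returns immediately in both cases, so there is no 1ms wait.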

What do you think? Additional ideas are appreciated.


> Design expressive idle connection validation support
> ----------------------------------------------------
>
>                 Key: HTTPCLIENT-2142
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2142
>             Project: HttpComponents HttpClient
>          Issue Type: New Feature
>            Reporter: Carter Kozak
>            Assignee: Carter Kozak
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
