Hi Patrick, hi Rachel,
so with these two patches applied on top of the previous one, I get the
behaviour that we discussed here.
Specifically, we differentiate client-read timeout, server-write timeouts
and server read timeouts during the data forwarding phase. Also, we disable
server read timeout until the client has sent its whole request. That way
I'm seeing the following flags in the logs :
- cH when client does not send everything before the server starts to
respond, which is OK. Status=408 there.
- cD when client stops sending data after the server starts to respond,
or if the client stops reading data, which in both case is a clear
client timeout. In both cases, the status is unaltered and nothing
is emitted since the beginning of the response was already transmitted ;
- sH when the server does not respond, including if it stops reading the
message body (eg: process stuck). Then we have 504.
- sD if the server stops reading or sending data during the data phase.
The changes were a bit tricky, so any confirmation from any of you would
make me more comfortable merging them into mainline. I'm attaching these
two extra patches, please give them a try.
Thanks,
Willy
>From b9edf8fbecc9d1b5c82794735adcc367a80a4ae2 Mon Sep 17 00:00:00 2001
From: Willy Tarreau <[email protected]>
Date: Wed, 7 May 2014 14:24:16 +0200
Subject: BUG/MEDIUM: http: correctly report request body timeouts
This is the continuation of previous patch "BUG/MEDIUM: http/session:
disable client-side expiration only after body".
This one takes care of properly reporting the client-side read timeout
when waiting for a body from the client. Since the timeout may happen
before or after the server starts to respond, we have to take care of
the situation in three different ways :
- if the server does not read our data fast enough, we emit a 504
if we're waiting for headers, or we simply break the connection
if headers were already received. We report either sH or sD
depending on whether we've seen headers or not.
- if the server has not yet started to respond, but has read all of
the client's data and we're still waiting for more data from the
client, we can safely emit a 408 and abort the request ;
- if the server has already started to respond (thus it's a transfer
timeout during a bidirectional exchange), then we silently break
the connection, and only the session flags will indicate in the
logs that something went wrong with client or server side.
This bug is tagged MEDIUM because it touches very sensible areas, however
its impact is very low. It might be worth performing a careful backport
to 1.4 once it has been confirmed that everything is correct and that it
does not introduce any regression.
---
src/proto_http.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 74 insertions(+), 2 deletions(-)
diff --git a/src/proto_http.c b/src/proto_http.c
index e473228..797b3b8 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -5175,6 +5175,13 @@ int http_request_forward_body(struct session *s, struct
channel *req, int an_bit
*/
msg->msg_state = HTTP_MSG_ERROR;
http_resync_states(s);
+
+ if (req->flags & CF_READ_TIMEOUT)
+ goto cli_timeout;
+
+ if (req->flags & CF_WRITE_TIMEOUT)
+ goto srv_timeout;
+
return 1;
}
@@ -5455,6 +5462,68 @@ int http_request_forward_body(struct session *s, struct
channel *req, int an_bit
s->flags |= SN_FINST_D;
}
return 0;
+
+ cli_timeout:
+ if (!(s->flags & SN_ERR_MASK))
+ s->flags |= SN_ERR_CLITO;
+
+ if (!(s->flags & SN_FINST_MASK)) {
+ if (txn->rsp.msg_state < HTTP_MSG_ERROR)
+ s->flags |= SN_FINST_H;
+ else
+ s->flags |= SN_FINST_D;
+ }
+
+ if (txn->status > 0) {
+ /* Don't send any error message if something was already sent */
+ stream_int_retnclose(req->prod, NULL);
+ }
+ else {
+ txn->status = 408;
+ stream_int_retnclose(req->prod, http_error_message(s,
HTTP_ERR_408));
+ }
+
+ msg->msg_state = HTTP_MSG_ERROR;
+ req->analysers = 0;
+ s->rep->analysers = 0; /* we're in data phase, we want to abort both
directions */
+
+ session_inc_http_err_ctr(s);
+ s->fe->fe_counters.failed_req++;
+ s->be->be_counters.failed_req++;
+ if (s->listener->counters)
+ s->listener->counters->failed_req++;
+ return 0;
+
+ srv_timeout:
+ if (!(s->flags & SN_ERR_MASK))
+ s->flags |= SN_ERR_SRVTO;
+
+ if (!(s->flags & SN_FINST_MASK)) {
+ if (txn->rsp.msg_state < HTTP_MSG_ERROR)
+ s->flags |= SN_FINST_H;
+ else
+ s->flags |= SN_FINST_D;
+ }
+
+ if (txn->status > 0) {
+ /* Don't send any error message if something was already sent */
+ stream_int_retnclose(req->prod, NULL);
+ }
+ else {
+ txn->status = 504;
+ stream_int_retnclose(req->prod, http_error_message(s,
HTTP_ERR_504));
+ }
+
+ msg->msg_state = HTTP_MSG_ERROR;
+ req->analysers = 0;
+ s->rep->analysers = 0; /* we're in data phase, we want to abort both
directions */
+
+ s->be->be_counters.failed_resp++;
+ if (objt_server(s->target)) {
+ objt_server(s->target)->counters.failed_resp++;
+ health_adjust(objt_server(s->target),
HANA_STATUS_HTTP_READ_TIMEOUT);
+ }
+ return 0;
}
/* This stream analyser waits for a complete HTTP response. It returns 1 if the
@@ -5622,8 +5691,11 @@ int http_wait_for_response(struct session *s, struct
channel *rep, int an_bit)
return 0;
}
- /* read timeout : return a 504 to the client. */
- else if (rep->flags & CF_READ_TIMEOUT) {
+ /* read/write timeout : return a 504 to the client.
+ * The write timeout may happen when we're uploading POST
+ * data that the server is not consuming fast enough.
+ */
+ else if (rep->flags & (CF_READ_TIMEOUT|CF_WRITE_TIMEOUT)) {
if (msg->err_pos >= 0)
http_capture_bad_message(&s->be->invalid_rep,
s, msg, msg->msg_state, s->fe);
else if (txn->flags & TX_NOT_FIRST)
--
1.7.12.1
>From 3bed5e9337fd6eeab0f0006ebefcbe98ee5c4f9f Mon Sep 17 00:00:00 2001
From: Willy Tarreau <[email protected]>
Date: Wed, 7 May 2014 14:50:10 +0200
Subject: BUG/MEDIUM: http: disable server-side expiration until client has
sent the body
It's the final part of the 2 previous patches. We prevent the server from
timing out if we still have some data to pass to it. That way, even if the
server runs with a short timeout and the client with a large one, the server
side timeout will only start to count once the client sends everything. This
ensures we don't report a 504 before the server gets the whole request.
It is not certain whether the 1.4 state machine is fully compatible with
this change. Since the purpose is only to ensure that we never report a
server error before a client error if some data are missing from the client
and when the server-side timeout is smaller than or equal to the client's,
it's probably not worth attempting the backport.
---
src/proto_http.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/src/proto_http.c b/src/proto_http.c
index 797b3b8..1b96e3f 100644
--- a/src/proto_http.c
+++ b/src/proto_http.c
@@ -5791,6 +5791,12 @@ int http_wait_for_response(struct session *s, struct
channel *rep, int an_bit)
return 0;
}
+ /* we don't want to expire on the server side first until the
client
+ * has sent all the expected message body.
+ */
+ if (txn->req.msg_state >= HTTP_MSG_BODY && txn->req.msg_state <
HTTP_MSG_DONE)
+ rep->flags |= CF_READ_NOEXP;
+
channel_dont_close(rep);
rep->flags |= CF_READ_DONTWAIT; /* try to get back here ASAP */
return 0;
--
1.7.12.1