Re: [PATCH] Do not reuse persistent connections for PUTs

2012-09-10 Thread Alex Rousskov
 ons 2012-08-29 klockan 12:42 -0600 skrev Alex Rousskov:
 Hello,

 I saw bogus ERR_ZERO_SIZE_OBJECT responses while testing Squid v3.1,
 but the same problem ought to be present in v3.2 as well.

 A compliant proxy may retry PUTs, but Squid lacks the [rather
 complicated] code required to protect the PUT request body from being
 nibbled during the first try, when pconn races are possible.
 
 +1 on the patch from me.

Committed to trunk as r12319.


 Today, we have a choice:

   A. Fail some PUTs. Close fewer pconns (current code).
   B. Handle all PUTs. Open more connections (patched code).

 If this patch is accepted, performance bug 3398 will resurface. Henrik,
 do you think committing the patch is the right decision here even though
 it will reopen bug 3398?
 
 Yes, but with a plan to fix it correctly.
...

 The bug title is wrong. There is a long discussion in the bug report
 about what the bug is really about. I think a better bug title would be:
 persistent server connection closed before PUT.
 
 yes.
 
 Fix the bug title, reopen it with comment above and restore the check.

Done. Quality patches preserving PUT bodies (via 100-continue and/or
buffering) to optimize Squid pconn handling are welcomed.


Thank you,

Alex.



Re: [PATCH] Do not reuse persistent connections for PUTs

2012-09-01 Thread Henrik Nordström
fre 2012-08-31 klockan 10:58 +0300 skrev Tsantilas Christos:

  1) To decide if it can reuse a healthy persistent connection. Inside
 FwdState::connectStart, we are getting a persistant connection and even
 if it is healthy, if we want to send eg a PUT request we are closing the
 persistent connection and we are opening a new one.

How do we know it's healthy?

The issue here is that we cannot retry some kinds of requests, and HTTP
requires us to handle when the server closes the connection while we are
sending the request. It is a speed of light problem, we cannot know if
the server is just about to close the connection or if it have already
closed the connection but the FIN have not yet been seen on our side.

 From what I can understand there is not any reason to not reuse a
 healthy persistent connection, for PUT requests.  Am I correct? If yes
 then the problem is only in the case (2).

There is, because we often cannot retry the PUT because we have not kep
a copy of the already partially sent PUT body, and it's expected to
sometimes fail tp send requestso n a persistent connection due to above
race condition.

THe race condition between HTTP server and client (squid) is fairly big
and easily fits many MB of traffic.

Regards
Henrik



Re: [PATCH] Do not reuse persistent connections for PUTs

2012-09-01 Thread Henrik Nordström
fre 2012-08-31 klockan 21:44 +0300 skrev Tsantilas Christos:

 So looks that a good solution should be (similar to) the solution
 proposed by Henrik...

100 Continue aviods the race entirely on requests with bodies, leaving
only bodyless requests in the we may not retry this on failure but we
need to.. problem.

Keeping a copy of the sent body allows the request to be retried in
cases where 100 Continue were not used or could not be used.

Regards
Henrik



Re: [PATCH] Do not reuse persistent connections for PUTs

2012-09-01 Thread Alex Rousskov
On 09/01/2012 07:11 AM, Henrik Nordström wrote:
 fre 2012-08-31 klockan 21:44 +0300 skrev Tsantilas Christos:
 
 So looks that a good solution should be (similar to) the solution
 proposed by Henrik...
 
 100 Continue aviods the race entirely on requests with bodies, leaving
 only bodyless requests in the we may not retry this on failure but we
 need to.. problem.

There are at least two problems with 100-continue:

Performance: Recall that we got into this mess because of the
performance complaint. 100-continue will not cause more connections to
be opened, but it will introduce a significant delay in some environments.

Compatibility: I am sure there will be cases where HTTP/1.1 servers will
not respond with 100 Continue. While Squid will not be at fault from
compliance point of view, the but it used to work fine or it works
without a proxy arguments will force us to add exceptions. I do not
know whether the problems will be widespread enough to warrant default
exceptions.


 Keeping a copy of the sent body allows the request to be retried in
 cases where 100 Continue were not used or could not be used.

I suspect the ideal solution will involve a combination of 100-continue
(where possible and desired) and buffering (otherwise). Unfortunately,
this combination doubles the amount of rather complex coding required to
arrive at that ideal.


Cheers,

Alex.



Re: [PATCH] Do not reuse persistent connections for PUTs

2012-08-31 Thread Tsantilas Christos
On 08/30/2012 06:11 AM, Henrik Nordström wrote:
 ons 2012-08-29 klockan 15:32 -0600 skrev Alex Rousskov:
 
 Today, we have a choice:

   A. Fail some PUTs. Close fewer pconns (current code).
   B. Handle all PUTs. Open more connections (patched code).

The FwdState::checkRetriable() method is used in the following two cases:

 1) To decide if it can reuse a healthy persistent connection. Inside
FwdState::connectStart, we are getting a persistant connection and even
if it is healthy, if we want to send eg a PUT request we are closing the
persistent connection and we are opening a new one.

 2) To decide if a connection can be retried (inside FwdState::checkRetry())

From what I can understand there is not any reason to not reuse a
healthy persistent connection, for PUT requests.  Am I correct? If yes
then the problem is only in the case (2).

In the case (2) we are using the if (request-bodyNibbled()) to
examine if any body data consumed and probably sent to the server. So we
should not have any problem at all.
However in the case the HttpRequest::bodyNibbled() is not enough we can
add the check if (request-body_pipe != NULL) inside
FwdState::checkRetry() method after calling checkRetriable.

I think this is easier, at least for a short term fix, than delaying
sending body...


 If this patch is accepted, performance bug 3398 will resurface. Henrik,
 do you think committing the patch is the right decision here even though
 it will reopen bug 3398?
 
 Yes, but with a plan to fix it correctly.
 
 A suggested correct fix for 3398 is to make use of Expect: 100-continue.
 This delays sending the body a bit, allowing detection of broken
 connections before starting to send the body.
 
 An optimization from that is to add some buffering of request bodies to
 allow skipping the 100-continue to speed up forwarding.
 
 Regards
 Henrik
 
 The bug title is wrong. There is a long discussion in the bug report
 about what the bug is really about. I think a better bug title would be:
 persistent server connection closed before PUT.
 
 yes.
 
 Fix the bug title, reopen it with comment above and restore the check.
 
 Regatds
 Henrik
 
 



Re: [PATCH] Do not reuse persistent connections for PUTs

2012-08-31 Thread Alex Rousskov
On 08/31/2012 01:58 AM, Tsantilas Christos wrote:
 On 08/30/2012 06:11 AM, Henrik Nordström wrote:
 ons 2012-08-29 klockan 15:32 -0600 skrev Alex Rousskov:

 Today, we have a choice:

   A. Fail some PUTs. Close fewer pconns (current code).
   B. Handle all PUTs. Open more connections (patched code).
 
 The FwdState::checkRetriable() method is used in the following two cases:
 
  1) To decide if it can reuse a healthy persistent connection. Inside
 FwdState::connectStart, we are getting a persistant connection and even
 if it is healthy, if we want to send eg a PUT request we are closing the
 persistent connection and we are opening a new one.
 
  2) To decide if a connection can be retried (inside FwdState::checkRetry())
 
 From what I can understand there is not any reason to not reuse a
 healthy persistent connection, for PUT requests.  Am I correct?

I am afraid you are not correct. There is a reason. Today, the code may
destroy PUT content before we can detect a pconn race. If we reuse a
pconn for PUT, we may later discover a pconn race but would be unable to
retry the failed PUT because the PUT content would have been already
nibbled by then.

In other words (and simplifying a bit), the current not nibbled
condition in checkRetriable() should be replaced with not nibbled and
will not be nibbled condition. Since we cannot guarantee the will not
part, we have to exclude all requests carrying body from being sent on
reused pconns.

I can reproduce this bug in the lab so it is not just a theory:
HttpStateData consumes PUT request body and only then gets a zero-length
read from the server, resulting in ERR_ZERO_SIZE_OBJECT returned to the
innocent client.



 In the case (2) we are using the if (request-bodyNibbled()) to
 examine if any body data consumed and probably sent to the server. So we
 should not have any problem at all.
 However in the case the HttpRequest::bodyNibbled() is not enough we can
 add the check if (request-body_pipe != NULL) inside
 FwdState::checkRetry() method after calling checkRetriable.

The current bodyNibbled() check is enough to avoid sending a nibbled
request to the server. Unfortunately, that means we will not retry some
failed PUTs, and we must retry them.

In other words, after (the pconn race happened and we nibbled) it is too
late to fix the situation by retrying a PUT. Thus, we have to prevent
either pconn races or nibbling. We can easily prevent races (the
proposed patch does that). Eventually, we will prevent nibbling (to get
a performance boost).


Does this clarify?

Alex.



Re: [PATCH] Do not reuse persistent connections for PUTs

2012-08-31 Thread Tsantilas Christos
On 08/31/2012 05:31 PM, Alex Rousskov wrote:
 On 08/31/2012 01:58 AM, Tsantilas Christos wrote:
 On 08/30/2012 06:11 AM, Henrik Nordström wrote:
 ons 2012-08-29 klockan 15:32 -0600 skrev Alex Rousskov:

 Today, we have a choice:

   A. Fail some PUTs. Close fewer pconns (current code).
   B. Handle all PUTs. Open more connections (patched code).

 The FwdState::checkRetriable() method is used in the following two cases:

  1) To decide if it can reuse a healthy persistent connection. Inside
 FwdState::connectStart, we are getting a persistant connection and even
 if it is healthy, if we want to send eg a PUT request we are closing the
 persistent connection and we are opening a new one.

  2) To decide if a connection can be retried (inside FwdState::checkRetry())

 From what I can understand there is not any reason to not reuse a
 healthy persistent connection, for PUT requests.  Am I correct?
 
 I am afraid you are not correct. There is a reason. Today, the code may
 destroy PUT content before we can detect a pconn race. If we reuse a
 pconn for PUT, we may later discover a pconn race but would be unable to
 retry the failed PUT because the PUT content would have been already
 nibbled by then.

I see...

 
 In other words (and simplifying a bit), the current not nibbled
 condition in checkRetriable() should be replaced with not nibbled and
 will not be nibbled condition. Since we cannot guarantee the will not
 part, we have to exclude all requests carrying body from being sent on
 reused pconns.
 
 I can reproduce this bug in the lab so it is not just a theory:
 HttpStateData consumes PUT request body and only then gets a zero-length
 read from the server, resulting in ERR_ZERO_SIZE_OBJECT returned to the
 innocent client.
 
 
 
 In the case (2) we are using the if (request-bodyNibbled()) to
 examine if any body data consumed and probably sent to the server. So we
 should not have any problem at all.
 However in the case the HttpRequest::bodyNibbled() is not enough we can
 add the check if (request-body_pipe != NULL) inside
 FwdState::checkRetry() method after calling checkRetriable.
 
 The current bodyNibbled() check is enough to avoid sending a nibbled
 request to the server. Unfortunately, that means we will not retry some
 failed PUTs, and we must retry them.
 
 In other words, after (the pconn race happened and we nibbled) it is too
 late to fix the situation by retrying a PUT. Thus, we have to prevent
 either pconn races or nibbling. We can easily prevent races (the
 proposed patch does that). Eventually, we will prevent nibbling (to get
 a performance boost).
 
 
 Does this clarify?

Yep.
So looks that a good solution should be (similar to) the solution
proposed by Henrik...


 
 Alex.
 
 



[PATCH] Do not reuse persistent connections for PUTs

2012-08-29 Thread Alex Rousskov
Hello,

I saw bogus ERR_ZERO_SIZE_OBJECT responses while testing Squid v3.1,
but the same problem ought to be present in v3.2 as well.

A compliant proxy may retry PUTs, but Squid lacks the [rather
complicated] code required to protect the PUT request body from being
nibbled during the first try, when pconn races are possible. Thus, Squid
cannot safely retry some PUTs today, and FwdState::checkRetriable() must
return false for all PUTs, to avoid bogus ERR_ZERO_SIZE_OBJECT errors
(especially for clients that did not reuse a pconn and, hence, may not
be ready to handle/retry an error response).

In theory, requests with safe or idempotent methods other than PUT might
have bodies so the patch applies the same logic to them as well.


Thank you,

Alex.
Do not reuse persistent connections for PUTs to avoid ERR_ZERO_SIZE_OBJECT.

A compliant proxy may retry PUTs, but Squid lacks the [rather complicated]
code required to protect the PUT request body from being nibbled during the
first try. Thus, Squid cannot safely retry some PUTs today, and
FwdState::checkRetriable() must return false for all PUTs, to avoid
bogus ERR_ZERO_SIZE_OBJECT errors (especially for clients that did not
reuse a pconn and, hence, may not be ready to handle/retry an error response).

In theory, requests with safe or idempotent methods other than PUT might have
bodies so we apply the same logic to them as well.

=== modified file 'src/forward.cc'
--- src/forward.cc	2012-08-23 00:01:00 +
+++ src/forward.cc	2012-08-29 18:10:42 +
@@ -459,60 +459,66 @@
  * Return TRUE if this is the kind of request that can be retried
  * after a failure.  If the request is not retriable then we don't
  * want to risk sending it on a persistent connection.  Instead we'll
  * force it to go on a new HTTP connection.
  */
 bool
 FwdState::checkRetriable()
 {
 /* RFC2616 9.1 Safe and Idempotent Methods */
 switch (request-method.id()) {
 /* 9.1.1 Safe Methods */
 
 case METHOD_GET:
 
 case METHOD_HEAD:
 /* 9.1.2 Idempotent Methods */
 
 case METHOD_PUT:
 
 case METHOD_DELETE:
 
 case METHOD_OPTIONS:
 
 case METHOD_TRACE:
 break;
 
 default:
 return false;
 }
 
+// Optimize: A compliant proxy may retry PUTs, but Squid lacks the [rather
+// complicated] code required to protect the PUT request body from being
+// nibbled during the first try. Thus, Squid cannot retry some PUTs today.
+if (request-body_pipe != NULL)
+return false;
+
 return true;
 }
 
 void
 FwdState::serverClosed(int fd)
 {
 debugs(17, 2, fwdServerClosed: FD   fd entry-url());
 assert(server_fd == fd);
 server_fd = -1;
 
 retryOrBail();
 }
 
 void
 FwdState::retryOrBail()
 {
 if (checkRetry()) {
 int originserver = (servers-_peer == NULL);
 debugs(17, 3, fwdServerClosed: re-forwarding (  n_tries   tries,   (squid_curtime - start_t)   secs));
 
 if (servers-next) {
 /* use next, or cycle if origin server isn't last */
 FwdServer *fs = servers;
 FwdServer **T, *T2 = NULL;
 servers = fs-next;
 
 for (T = servers; *T; T2 = *T, T = (*T)-next);
 if (T2  T2-_peer) {
 /* cycle */
 *T = fs;



Re: [PATCH] Do not reuse persistent connections for PUTs

2012-08-29 Thread Henrik Nordström
ons 2012-08-29 klockan 12:42 -0600 skrev Alex Rousskov:
 Hello,
 
 I saw bogus ERR_ZERO_SIZE_OBJECT responses while testing Squid v3.1,
 but the same problem ought to be present in v3.2 as well.
 
 A compliant proxy may retry PUTs, but Squid lacks the [rather
 complicated] code required to protect the PUT request body from being
 nibbled during the first try, when pconn races are possible.

+1 on the patch from me.

Regards
Henrik



Re: [PATCH] Do not reuse persistent connections for PUTs

2012-08-29 Thread Henrik Nordström
ons 2012-08-29 klockan 22:20 +0200 skrev Henrik Nordström:
 ons 2012-08-29 klockan 12:42 -0600 skrev Alex Rousskov:
  Hello,
  
  I saw bogus ERR_ZERO_SIZE_OBJECT responses while testing Squid v3.1,
  but the same problem ought to be present in v3.2 as well.
  
  A compliant proxy may retry PUTs, but Squid lacks the [rather
  complicated] code required to protect the PUT request body from being
  nibbled during the first try, when pconn races are possible.
 
 +1 on the patch from me.

It's a reincarnation of Bug #569, kind of. The check for a request body
have always been there (well since that bug was fixed), but was
deleted in revision 11589 for some reason.

Log message on the delete says

  Bug 3398: persistent server connection closed after PUT/DELETE

Regards
Henrik



Re: [PATCH] Do not reuse persistent connections for PUTs

2012-08-29 Thread Alex Rousskov
On 08/29/2012 02:32 PM, Henrik Nordström wrote:
 ons 2012-08-29 klockan 22:20 +0200 skrev Henrik Nordström:
 ons 2012-08-29 klockan 12:42 -0600 skrev Alex Rousskov:
 I saw bogus ERR_ZERO_SIZE_OBJECT responses while testing Squid v3.1,
 but the same problem ought to be present in v3.2 as well.

 A compliant proxy may retry PUTs, but Squid lacks the [rather
 complicated] code required to protect the PUT request body from being
 nibbled during the first try, when pconn races are possible.

 +1 on the patch from me.
 
 It's a reincarnation of Bug #569, kind of. The check for a request body
 have always been there (well since that bug was fixed), but was
 deleted in revision 11589 for some reason.

Indeed! I think Christos deleted the check to address the complaint that
Squid was closing a previously established pconn before sending a new
POST or PUT.

What nobody (including me) realized is that Squid is not yet ready to
retry some PUTs that fail due to a pconn race. There is code to check
that we do not retry a PUT when we cannot retry it, but there is no code
to preserve PUT so that it can be retried if needed. That missing code
would be quite tricky to write correctly.

Today, we have a choice:

  A. Fail some PUTs. Close fewer pconns (current code).
  B. Handle all PUTs. Open more connections (patched code).

If this patch is accepted, performance bug 3398 will resurface. Henrik,
do you think committing the patch is the right decision here even though
it will reopen bug 3398?


Regardless of the A/B decision today, it would be nice to add code to
protect [small] PUT contents if necessary, so that we can do (B) without
sacrificing performance in common cases. That requires a lot more work
as I already mentioned above.


 Log message on the delete says
 
   Bug 3398: persistent server connection closed after PUT/DELETE


The bug title is wrong. There is a long discussion in the bug report
about what the bug is really about. I think a better bug title would be:
persistent server connection closed before PUT.


Thank you,

Alex.



Re: [PATCH] Do not reuse persistent connections for PUTs

2012-08-29 Thread Henrik Nordström
ons 2012-08-29 klockan 15:32 -0600 skrev Alex Rousskov:

 Today, we have a choice:
 
   A. Fail some PUTs. Close fewer pconns (current code).
   B. Handle all PUTs. Open more connections (patched code).
 
 If this patch is accepted, performance bug 3398 will resurface. Henrik,
 do you think committing the patch is the right decision here even though
 it will reopen bug 3398?

Yes, but with a plan to fix it correctly.

A suggested correct fix for 3398 is to make use of Expect: 100-continue.
This delays sending the body a bit, allowing detection of broken
connections before starting to send the body.

An optimization from that is to add some buffering of request bodies to
allow skipping the 100-continue to speed up forwarding.

Regards
Henrik

 The bug title is wrong. There is a long discussion in the bug report
 about what the bug is really about. I think a better bug title would be:
 persistent server connection closed before PUT.

yes.

Fix the bug title, reopen it with comment above and restore the check.

Regatds
Henrik