Updated patch attached that addresses most of these issues.

On 21/05/2013 9:47 a.m., Alex Rousskov wrote:
On 05/16/2013 08:51 AM, Amos Jeffries wrote:

Attached is an update which goes straight to the end and makes the
pipeline_prefetch directive accept an integer count of pipeline length.

+    // if client_persistent_connections OFF, then we cannot service any pipeline >1
Please replace "we cannot service any pipeline >1" with "we do not
expect any pipelined requests" to avoid confusing "pipeline>1" math:
Does pipeline==2 mean two concurrent requests or three, since we are not
counting the already read request? Your math is correct, but this
phrasing is confusing IMO.

Used "we cannot service any more of the pipeline" instead, to make it clear that some has been serviced but no more will be, regardless of how far down the pipeline we are.



+        const bool result = (conn->getConcurrentRequestCount() < 1);
+        if (!result)
+            debugs(33, 3, conn->clientConnection << " persistence disabled. Concurrent requests discarded.");
We do not know whether we discarded something or not. Drop the second
sentence.

We do know that.

result is only false when one request has already been parsed on the connection and Connection:close will be sent in its reply (client_persistent_connections off). This check is called when a second request is being parsed, so at least that second one *will* be dropped by the TCP close. But we are already missing a lot of cases like that, and I don't think it is critical that we catch all of them; just the ones which are easily identified.

However...

If this code stays, please compute getConcurrentRequestCount() once,
before all the if-statements, and then use the computed number as needed.
Okay. Done this anyway.


+    if (!Config.onoff.client_pconns) {
+        const bool result = (conn->getConcurrentRequestCount() < 1);
+        if (!result)
+            debugs(33, 3, conn->clientConnection << " persistence disabled. Concurrent requests discarded.");
+        return result;
+    }
The above addition to connOkToAddRequest() seems odd: If the caller does
not check for persistency, then the above check is not sufficient --
there are other reasons the connection may not be persistent (and the
old code was broken since it lacked any checks at all!). If the caller
checks for persistency, the above check is not needed.

Hmm. Thinking about this a bit more, I've decided to move it up into the config consistency checks next to the pipeline vs auth checks. It makes a lot more sense to only check once, and to provide -k parse warnings about the config option clash. It will also catch the three-way edge-case races between TCP packet delivery, response writing, and close handler events, when a second request comes in *after* the first has already been responded to and a close scheduled for it.


Consider the case where Config.onoff.client_pconns is "on" (default) but
the client sent "Connection:close" or Squid is already responding with
"Connection:close".

If the caller does not check, I do not think you have to be responsible
for fixing this, but if you add the above code without adding other (far
more relevant in default cases) checks, then please at least add a XXX
documenting that other checks are missing here.

Indeed.
I've added a loop to scan for Connection:close (and Proxy-Connection:close), and the CONNECT method, as blockers on further pipeline reads. There may be other conditions we find later.

It's not the most efficient way to do it, but before we can implement the better way we have to verify that request processing reaches the header interpretation functionality before the next read() operation is done by ConnStateData, to prevent a race condition.



+LOC: Config.pipeline_max_prefetch
How about "pipeline_prefetch_max" in case we need to add more
pipeline_prefetch options later.

Then IMHO they would use different names (pipeline_something_else), or better, the single option would be split into a structure with all of the feature's config options inside it:

struct pipeline_ {
    int maxPrefetch;  // max queue length
    int delayTimeout; // e.g. if found waiting more than N seconds in queue, do something special
} pipeline;


        To boost the performance of pipelined requests to closer
        match that of a non-proxied environment Squid can try to fetch
-       up to two requests in parallel from a pipeline.
+       several requests in parallel from a pipeline. This sets the
+       maximum number of requests to be fetched in advance.
I think this needs to be clarified to better explain the confusion
between the pipeline "depth" and the number of concurrent requests:

   HTTP clients may send a pipeline of 1+N requests to Squid using a
   single connection, without waiting for Squid to respond to the first
   of those requests. This option limits the number of concurrent
   requests Squid will try to handle in parallel. If set to N, Squid
   will try to receive and process up to 1+N requests on the same
   connection concurrently.

Great. Thank you. Done.
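Under that wording, a hypothetical squid.conf fragment illustrating the 1+N rule would read:

```
# allow 1 request being serviced plus up to 2 prefetched requests,
# i.e. up to 3 concurrent requests per client connection
pipeline_prefetch 2
```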


+        debugs(0, DBG_PARSE_NOTE(DBG_IMPORTANT), "WARNING: 'pipeline_prefetch on' is deprecated. Please update to use a number.");
Perhaps we should indicate what the replacement is, like we do in
pipeline_prefetch off case? We could say "Please update to use 1 (or a
higher number)."

Done.

PS. awaiting your parser update patch before commit of this.

Thank you
Amos
=== modified file 'doc/release-notes/release-3.4.sgml'
--- doc/release-notes/release-3.4.sgml  2013-05-14 17:54:30 +0000
+++ doc/release-notes/release-3.4.sgml  2013-05-16 03:38:06 +0000
@@ -179,6 +179,9 @@
        <p>New format code <em>%note</em> to log a transaction annotation linked to the
           transaction by ICAP, eCAP, a helper, or the <em>note</em> squid.conf directive.
 
+       <tag>pipeline_prefetch</tag>
+       <p>Updated to take a numeric count of prefetched pipeline requests instead of ON/OFF.
+
        <tag>unlinkd_program</tag>
        <p>New helper response format utilizing result codes <em>OK</em> and <em>BH</em>,
           to signal helper lookup results. Also, key-value response values to return

=== modified file 'src/Parsing.cc'
--- src/Parsing.cc      2013-05-04 13:14:23 +0000
+++ src/Parsing.cc      2013-05-16 13:55:49 +0000
@@ -33,6 +33,7 @@
 #include "squid.h"
 #include "cache_cf.h"
 #include "compat/strtoll.h"
+#include "ConfigParser.h"
 #include "Parsing.h"
 #include "globals.h"
 #include "Debug.h"
@@ -161,7 +162,7 @@
 int
 GetInteger(void)
 {
-    char *token = strtok(NULL, w_space);
+    char *token = ConfigParser::strtokFile();
     int i;
 
     if (token == NULL)

=== modified file 'src/SquidConfig.h'
--- src/SquidConfig.h   2013-05-13 03:57:03 +0000
+++ src/SquidConfig.h   2013-05-16 03:21:38 +0000
@@ -332,7 +332,6 @@
 
         int ie_refresh;
         int vary_ignore_expire;
-        int pipeline_prefetch;
         int surrogate_is_remote;
         int request_entities;
         int detect_broken_server_pconns;
@@ -361,6 +360,8 @@
         int client_dst_passthru;
     } onoff;
 
+    int pipeline_max_prefetch;
+
     int forward_max_tries;
     int connect_retries;
 

=== modified file 'src/cache_cf.cc'
--- src/cache_cf.cc     2013-05-14 18:36:45 +0000
+++ src/cache_cf.cc     2013-05-21 03:56:03 +0000
@@ -965,6 +965,16 @@
                (uint32_t)Config.maxRequestBufferSize, (uint32_t)Config.maxRequestHeaderSize);
     }
 
+    /*
+     * Disable client side request pipelining if client_persistent_connections OFF.
+     * Waste of resources queueing any pipelined requests when the first will close the connection.
+     */
+    if (Config.pipeline_max_prefetch > 0 && !Config.onoff.client_pconns) {
+        debugs(3, DBG_PARSE_NOTE(DBG_IMPORTANT), "WARNING: pipeline_prefetch " << Config.pipeline_max_prefetch <<
+               " requires client_persistent_connections ON. Forced pipeline_prefetch 0.");
+        Config.pipeline_max_prefetch = 0;
+    }
+
 #if USE_AUTH
     /*
      * disable client side request pipelining. There is a race with
@@ -973,12 +983,12 @@
      * pipelining OFF, the client may fail to authenticate, but squid's
      * state will be preserved.
      */
-    if (Config.onoff.pipeline_prefetch) {
+    if (Config.pipeline_max_prefetch > 0) {
         Auth::Config *nego = Auth::Config::Find("Negotiate");
         Auth::Config *ntlm = Auth::Config::Find("NTLM");
         if ((nego && nego->active()) || (ntlm && ntlm->active())) {
-            debugs(3, DBG_IMPORTANT, "WARNING: pipeline_prefetch breaks NTLM and Negotiate authentication. Forced OFF.");
-            Config.onoff.pipeline_prefetch = 0;
+            debugs(3, DBG_PARSE_NOTE(DBG_IMPORTANT), "WARNING: pipeline_prefetch breaks NTLM and Negotiate authentication. Forced pipeline_prefetch 0.");
+            Config.pipeline_max_prefetch = 0;
         }
     }
 #endif
@@ -2691,6 +2701,29 @@
 
 #define free_tristate free_int
 
+void
+parse_pipelinePrefetch(int *var)
+{
+    char *token = ConfigParser::strtokFile();
+
+    if (token == NULL)
+        self_destruct();
+
+    if (!strcmp(token, "on")) {
+        debugs(0, DBG_PARSE_NOTE(DBG_IMPORTANT), "WARNING: 'pipeline_prefetch on' is deprecated. Please update to use 1 (or a higher number).");
+        *var = 1;
+    } else if (!strcmp(token, "off")) {
+        debugs(0, DBG_PARSE_NOTE(2), "WARNING: 'pipeline_prefetch off' is deprecated. Please update to use '0'.");
+        *var = 0;
+        *var = 0;
+    } else {
+        ConfigParser::strtokFileUndo();
+        parse_int(var);
+    }
+}
+
+#define free_pipelinePrefetch free_int
+#define dump_pipelinePrefetch dump_int
+
 static void
 dump_refreshpattern(StoreEntry * entry, const char *name, RefreshPattern * head)
 {

=== modified file 'src/cf.data.depend'
--- src/cf.data.depend  2013-05-12 05:04:14 +0000
+++ src/cf.data.depend  2013-05-16 04:00:13 +0000
@@ -51,6 +51,7 @@
 onoff
 peer
 peer_access            cache_peer acl
+pipelinePrefetch
 PortCfg
 QosConfig
 refreshpattern

=== modified file 'src/cf.data.pre'
--- src/cf.data.pre     2013-05-14 17:53:18 +0000
+++ src/cf.data.pre     2013-05-21 02:50:03 +0000
@@ -8701,17 +8701,23 @@
 DOC_END
 
 NAME: pipeline_prefetch
-TYPE: onoff
-LOC: Config.onoff.pipeline_prefetch
-DEFAULT: off
+TYPE: pipelinePrefetch
+LOC: Config.pipeline_max_prefetch
+DEFAULT: 0
+DEFAULT_DOC: Do not pre-parse pipelined requests.
 DOC_START
-       To boost the performance of pipelined requests to closer
-       match that of a non-proxied environment Squid can try to fetch
-       up to two requests in parallel from a pipeline.
+       HTTP clients may send a pipeline of 1+N requests to Squid using a
+       single connection, without waiting for Squid to respond to the first
+       of those requests. This option limits the number of concurrent
+       requests Squid will try to handle in parallel. If set to N, Squid
+       will try to receive and process up to 1+N requests on the same
+       connection concurrently.
 
-       Defaults to off for bandwidth management and access logging
+       Defaults to 0 (off) for bandwidth management and access logging
        reasons.
 
+       NOTE: pipelining requires persistent connections to clients.
+
        WARNING: pipelining breaks NTLM and Negotiate/Kerberos authentication.
 DOC_END
 

=== modified file 'src/client_side.cc'
--- src/client_side.cc  2013-05-13 03:57:03 +0000
+++ src/client_side.cc  2013-05-21 04:48:42 +0000
@@ -2944,14 +2944,46 @@
     }
 }
 
-static int
+static bool
 connOkToAddRequest(ConnStateData * conn)
 {
-    int result = conn->getConcurrentRequestCount() < (Config.onoff.pipeline_prefetch ? 2 : 1);
-
+    const int existingRequestCount = conn->getConcurrentRequestCount();
+
+    // check that no earlier request used Connection:close.
+    // If client is going to disappear, then we cannot service any more of the pipeline
+    typedef ClientSocketContext::Pointer CSCP;
+    for (CSCP c = conn->getCurrentContext(); c.getRaw(); c = c->next) {
+
+        // TODO: we should not have to loop here.
+        //  It may be sufficient to set a ConnStateData flag when interpreting
+        //  the request headers for each parsed request, and check that flag here.
+        //  But that interpretation may occur too late; for now do this the slow way.
+
+        // XXX: should never happen. But not 100% certain of that yet.
+        if (c->http == NULL || c->http->request == NULL)
+            continue;
+
+        // CONNECT will block the connection indefinitely and probably close it.
+        if (c->http->request->method == Http::METHOD_CONNECT)
+            return false;
+
+        // client has sent Connection:close or equivalent
+        if (!c->http->request->persistent())
+            return false;
+    }
+
+    // XXX: check that pipeline queue length is no more than M seconds long already.
+    // For long-delay queues we need to push back at the client via leaving things in TCP buffers.
+    // By time M all queued objects are already at risk of getting client-timeout errors.
+
+    // default to the configured pipeline size.
+    // add 1 because the head of the pipeline is counted in concurrent requests and not the prefetch queue
+    const int concurrentRequestLimit = Config.pipeline_max_prefetch + 1;
+
+    const bool result = (existingRequestCount < concurrentRequestLimit);
     if (!result) {
-        debugs(33, 3, HERE << conn->clientConnection << " max concurrent requests reached");
-        debugs(33, 5, HERE << conn->clientConnection << " defering new request until one is done");
+        debugs(33, 3, conn->clientConnection << " max concurrent requests reached (" << concurrentRequestLimit << ")");
+        debugs(33, 5, conn->clientConnection << " deferring new request until one is done");
     }
 
     return result;
@@ -2979,7 +3011,7 @@
         if (in.notYetUsed == 0)
             break;
 
-        /* Limit the number of concurrent requests to 2 */
+        /* Limit the number of concurrent requests */
         if (!connOkToAddRequest(this)) {
             break;
         }

=== modified file 'src/client_side.h'
--- src/client_side.h   2013-04-04 06:15:00 +0000
+++ src/client_side.h   2013-05-21 04:48:54 +0000
@@ -178,7 +178,7 @@
 /**
  * Manages a connection to a client.
  *
- * Multiple requests (up to 2) can be pipelined. This object is responsible for managing
+ * Multiple requests (up to pipeline_prefetch) can be pipelined. This object is responsible for managing
  * which one is currently being fulfilled and what happens to the queue if the current one
  * causes the client connection to be closed early.
  *
