Hello,
One large reply again. Attached is a patch with all the earlier
changes recommended by Alex. Most notably, this removes the change to
the read_ahead_gap default, leaving it at 16 KB, and uses
read_ahead_gap for the buffer size rather than introducing a new option.
Alex,
>> Make the size of the buffer on Http::Stream configurable.
>
> s/buffer on Http::Stream/ClientStream buffer/
That would make it inaccurate - there's still one use of
clientStreamInit that does not use this option, in
client_side_request.cc:clientBeginRequest, that works back to
ESISegment::buf, which is a char[HTTP_REQBUF_SZ]. I could convert
ESISegment::buf to a MemBlob and fix the call location in
ESIInclude::Start, but I would not be able to test the code at all, as
I've not used ESI before.
On 4 April 2016 at 10:32, Alex Rousskov
<[email protected]> wrote:
> On 03/30/2016 11:50 PM, Nathan Hoad wrote:
>
>> Alex, I've tried 8, 16, 32, 128 and 512 KB values - all sizes leading
>> up to 64 KB scaled appropriately. 128 and 512 were the same or
>> slightly worse than 64, so I think 64 KB is the "best value".
>
> Sounds good, but it is even more important that you have explained *why*
> below.
>
>
>> On 30 March 2016 at 21:29, Amos Jeffries <[email protected]> wrote:
>>> One thing you need to keep in mind with all this is that the above
>>> macros *does not* configure the network I/O buffers.
>
>> I don't think this is quite true - I don't think it's intentional, but
>> I am led to believe that HTTP_REQBUF_SZ does influence network I/O
>> buffers in some way. See below.
>
> You are probably both right: HTTP_REQBUF_SZ influences network I/O but
> not necessarily all network I/O buffer capacities. In other words,
> increasing HTTP_REQBUF_SZ improves uncachable miss performance until
> HTTP_REQBUF_SZ matches the size of the second smallest buffer (or I/O
> size) in the buffer chain. In your test, that limit was read_ahead_gap.
>
>
>>> In the long-term plan those internal uses will be replaced by SBuf
>
> ... or MemBlob or something that does not exist yet but uses MemBlob.
>
> SBuf is trying to provide "everything to everybody" and, naturally,
> becomes a bad choice in some situations with strict requirements (e.g.,
> when one has to preserve a raw buffer pointer but does not want single
> buffer content ownership).
>
>
>> Looking purely at system calls, it shows the reads from the upstream
>> server are being read in 16 KB chunks, whereas writes to the client
>> are done in 4 KB chunks. With the patch, the writes to the client
>> increase to 16 KB, so it appears that HTTP_REQBUF_SZ does influence
>> network IO in this way.
>
> Naturally: One cannot write using 16 KB chunks when the data goes
> through a 4 KB buffer. Please note that the position of that 4 KB buffer
> in the buffer chain is almost irrelevant -- the weakest link in the
> chain determines "bytes pipelining" or "bytes streaming" speed.
>
>
>
>> However post-patch the improvement is quite
>> substantial, as the write(2) calls are now using the full 64 KB
>> buffer.
>
> This is an important finding! It shows that read_ahead_gap documentation
> lies or at least misleads: Evidently, that directive controls not just
> accumulation of data but [maximum] network read sizes, at least for
> HTTP. Please fix that documentation in your next patch revision.
Done.
>
>
>> At this stage, I'm not entirely sure what the best course of action
>> is. I'm happy to investigate things further, if people have
>> suggestions. read_ahead_gap appears to influence downstream write
>> buffer sizes, at least up to the maximum of HTTP_REQBUF_SZ.
>
> In other words, Squid Client (http.cc and clients/*) is able to give
> Squid Server (client_side*cc and servers/*) the results of a single
> Client network read (i.e., a single read from the origin server or
> peer). The result is copied to the ClientStream StoreIOBuffer. Thus, if
> we do not want to slow Squid down artificially, the StoreIOBuffer
> capacity should match the maximum Client read I/O size.
>
> Do we use that StoreIOBuffer to write(2) to the HTTP client? If not,
> what controls the I/O buffer size for writing to the HTTP client?
We do, yes - that buffer passes through
client_side.cc:clientSocketRecipient, Http::One::Server::handleReply,
and finally Http::Stream::sendBody, where the actual write is
performed.
>
>> It would
>> be nice if that buffer size was independently run-time configurable
>
> Actually, it would be nice if the buffer sizes just matched, avoiding
> both artificial slowdown and the need for careful configuration in most
> cases.
>
> Whether configuration is also desirable in some special situations is a
> separate question. If nobody comes forward with such a special
> situation/need, then we may be better off avoiding adding yet another
> tunable and simply tying ClientStream buffer capacity to the
> read_ahead_gap value. The only reason to add a configuration option
> [that nobody we know needs] would be backward compatibility.
I've opted to remove the configuration option completely, making the
buffer size in Http::Stream obey read_ahead_gap. If such a use case
does come forward, it's trivial to add a configuration option for it.
>
> If backward compatibility is deemed important here, we can add a
> configuration option, but change the _default_ buffer size to track
> read_ahead_gap and, hence, avoid artificial slowdown (at least in v4).
>
>
> How much impact does increasing read_ahead_gap and HTTP_REQBUF_SZ by B
> and G bytes, respectively, have on overall Squid memory usage? Is that
> B+G per concurrent transaction, roughly? Are there any other buffer
> capacities that depend on those two parameters?
>
> If the increase is B+G then your changes increase Squid RAM footprint by
> C*(60+48) KB where C is the number of concurrent master transactions. A
> Squid dealing with 1000 concurrent transactions would see its RAM usage
> increase by about 100 MB, which is not terrible (and decreased response
> times may reduce the number of concurrent transactions, partially
> mitigating that increase). The commit message (and release notes) should
> disclose the estimated increase though.
Looking at it, yes that is correct - the increase is indeed C*(60+48)
KB. We could instead opt (and I have) to leave the default for
read_ahead_gap at 16 KB, at which point the increase is C*12 KB, which
feels nicer to me, with a note in the documentation that increasing
this number may improve throughput at the cost of memory (and should
be tested).
>
>
> Thank you,
>
> Alex.
> AFAICT, reply_header_max_size affects the HTTP response reading only
> until the headers are parsed:
>
>> const int limitBuffer = (flags.headers_parsed ? Config.readAheadGap :
>> Config.maxReplyHeaderSize);
>
> This is actually kind of wrong because the first read should be
> something like the maximum of those two values: Restricting response
> header size should not restrict response body data prefetch during the
> first I/O. Nathan, if you agree, fixing this would be in your patch
> scope IMO!
That line is... not very nice. It's assigning to an int,
Config.readAheadGap is an int64_t, and Config.maxReplyHeaderSize is
size_t. Making the change you recommended reveals this via a
compilation error:
http.cc: In member function ‘bool HttpStateData::maybeMakeSpaceAvailable(bool)’:
http.cc:1547:125: error: no matching function for call to
‘max(size_t&, int64_t&)’
const int limitBuffer = (flags.headers_parsed ?
Config.readAheadGap : max(Config.maxReplyHeaderSize,
Config.readAheadGap));
^
In file included from ../compat/compat.h:100:0,
from ../include/squid.h:43,
from http.cc:16:
../compat/compat_shared.h:142:1: note: candidate: template<class A>
const A& max(const A&, const A&)
max(A const & lhs, A const & rhs)
^
../compat/compat_shared.h:142:1: note: template argument
deduction/substitution failed:
http.cc:1547:125: note: deduced conflicting types for parameter
‘const A’ (‘long unsigned int’ and ‘int64_t {aka long int}’)
const int limitBuffer = (flags.headers_parsed ?
Config.readAheadGap : max(Config.maxReplyHeaderSize,
Config.readAheadGap));
I'm happy to include the change - just let me know which way I should
go with this and I'll resubmit.
Thank you,
Nathan.
Convert Http::Stream::reqbuf to a MemBlob, making it configurable at runtime.
This also makes several other auxiliary changes:
* Increases the size of Http::Stream::requestBuffer to match that of
read_ahead_gap. Previously this was a 4 KB fixed-size buffer. As a result,
the overhead for a single client connection has been increased by 12 KB in
the default configuration, but Squid will no longer artificially slow down
client responses in this situation by fragmenting the read(2)/write(2)
calls.
* Improves the performance of large uncacheable replies. This was achieved by
increasing the buffer size to 16 KB as mentioned above, but it is worth
mentioning separately. Specifically, for a server, client and proxy all
running on my local machine, this patch increases throughput on a 5 GB file
from ~110 MB/s to ~340 MB/s.
* Documents the influence that read_ahead_gap had on the size of read(2) calls
for HTTP, and now the size of write(2) calls.
* Prevents read_ahead_gap from being set to 0. Previously this would result in
hung requests.
This work is submitted on behalf of Bloomberg L.P.
diff --git a/src/cache_cf.cc b/src/cache_cf.cc
--- a/src/cache_cf.cc
+++ b/src/cache_cf.cc
@@ -946,6 +946,10 @@ configDoConfigure(void)
}
}
#endif
+
+ if (Config.readAheadGap <= 0) {
+ fatalf("read_ahead_gap must be greater than 0 bytes");
+ }
}
/** Parse a line containing an obsolete directive.
diff --git a/src/cf.data.pre b/src/cf.data.pre
--- a/src/cf.data.pre
+++ b/src/cf.data.pre
@@ -5598,6 +5598,17 @@ DEFAULT: 16 KB
DOC_START
The amount of data the cache will buffer ahead of what has been
sent to the client when retrieving an object from another server.
+
+ This also influences the maximum network read(2)/write(2) sizes in some
+ circumstances. Reducing the size of this buffer will decrease
+ per-connection memory usage at the cost of more read(2)/write(2) calls.
+ Conversely, increasing the size of this buffer will decrease the number of
+ read(2)/write(2) calls at the cost of memory usage, potentially improving
+ performance.
+
+ Squid does not slow down the response delivery to the client in order to
+ fill the buffer.
+
DOC_END
NAME: negative_ttl
diff --git a/src/client_side.cc b/src/client_side.cc
--- a/src/client_side.cc
+++ b/src/client_side.cc
@@ -1011,12 +1011,9 @@ ConnStateData::abortRequestParsing(const
http->uri = xstrdup(uri);
setLogUri (http, uri);
auto *context = new Http::Stream(clientConnection, http);
- StoreIOBuffer tempBuffer;
- tempBuffer.data = context->reqbuf;
- tempBuffer.length = HTTP_REQBUF_SZ;
clientStreamInit(&http->client_stream, clientGetMoreData, clientReplyDetach,
clientReplyStatus, new clientReplyContext(http), clientSocketRecipient,
- clientSocketDetach, context, tempBuffer);
+ clientSocketDetach, context, context->getClientStreamBuffer());
return context;
}
@@ -1358,15 +1355,11 @@ parseHttpRequest(ConnStateData *csd, con
http->req_sz = hp->messageHeaderSize();
Http::Stream *result = new Http::Stream(csd->clientConnection, http);
- StoreIOBuffer tempBuffer;
- tempBuffer.data = result->reqbuf;
- tempBuffer.length = HTTP_REQBUF_SZ;
-
ClientStreamData newServer = new clientReplyContext(http);
ClientStreamData newClient = result;
clientStreamInit(&http->client_stream, clientGetMoreData, clientReplyDetach,
clientReplyStatus, newServer, clientSocketRecipient,
- clientSocketDetach, newClient, tempBuffer);
+ clientSocketDetach, newClient, result->getClientStreamBuffer());
/* set url */
debugs(33,5, "Prepare absolute URL from " <<
diff --git a/src/http/Stream.cc b/src/http/Stream.cc
--- a/src/http/Stream.cc
+++ b/src/http/Stream.cc
@@ -7,6 +7,7 @@
*/
#include "squid.h"
+#include "SquidConfig.h"
#include "client_side_request.h"
#include "http/Stream.h"
#include "HttpHdrContRange.h"
@@ -20,10 +21,10 @@ Http::Stream::Stream(const Comm::Connect
reply(nullptr),
writtenToSocket(0),
mayUseConnection_(false),
- connRegistered_(false)
+ connRegistered_(false),
+ requestBuffer(nullptr)
{
assert(http != nullptr);
- memset(reqbuf, '\0', sizeof (reqbuf));
flags.deferred = 0;
flags.parsed_ok = 0;
deferredparams.node = nullptr;
@@ -109,12 +110,10 @@ Http::Stream::pullData()
debugs(33, 5, reply << " written " << http->out.size << " into " << clientConnection);
/* More data will be coming from the stream. */
- StoreIOBuffer readBuffer;
+ StoreIOBuffer readBuffer = getClientStreamBuffer();
/* XXX: Next requested byte in the range sequence */
/* XXX: length = getmaximumrangelenfgth */
readBuffer.offset = getNextRangeOffset();
- readBuffer.length = HTTP_REQBUF_SZ;
- readBuffer.data = reqbuf;
/* we may note we have reached the end of the wanted ranges */
clientStreamRead(getTail(), http, readBuffer);
}
@@ -568,6 +567,18 @@ Http::Stream::deferRecipientForLater(cli
deferredparams.queuedBuffer = receivedData;
}
+StoreIOBuffer
+Http::Stream::getClientStreamBuffer()
+{
+ if (!requestBuffer) {
+ requestBuffer = new MemBlob(Config.readAheadGap);
+ }
+ StoreIOBuffer tempBuffer;
+ tempBuffer.data = requestBuffer->mem;
+ tempBuffer.length = requestBuffer->spaceSize();
+ return tempBuffer;
+}
+
void
Http::Stream::prepareReply(HttpReply *rep)
{
diff --git a/src/http/Stream.h b/src/http/Stream.h
--- a/src/http/Stream.h
+++ b/src/http/Stream.h
@@ -120,12 +120,13 @@ public:
void deferRecipientForLater(clientStreamNode *, HttpReply *, StoreIOBuffer receivedData);
+ StoreIOBuffer getClientStreamBuffer();
+
public: // HTTP/1.x state data
Comm::ConnectionPointer clientConnection; ///< details about the client connection socket
ClientHttpRequest *http; /* we pretend to own that Job */
HttpReply *reply;
- char reqbuf[HTTP_REQBUF_SZ];
struct {
unsigned deferred:1; ///< This is a pipelined request waiting for the current object to complete
unsigned parsed_ok:1; ///< Was this parsed correctly?
@@ -158,6 +159,8 @@ private:
bool mayUseConnection_; /* This request may use the connection. Don't read anymore requests for now */
bool connRegistered_;
+
+ MemBlob::Pointer requestBuffer;
};
} // namespace Http
diff --git a/src/servers/FtpServer.cc b/src/servers/FtpServer.cc
--- a/src/servers/FtpServer.cc
+++ b/src/servers/FtpServer.cc
@@ -751,15 +751,11 @@ Ftp::Server::parseOneRequest()
Http::Stream *const result =
new Http::Stream(clientConnection, http);
- StoreIOBuffer tempBuffer;
- tempBuffer.data = result->reqbuf;
- tempBuffer.length = HTTP_REQBUF_SZ;
-
ClientStreamData newServer = new clientReplyContext(http);
ClientStreamData newClient = result;
clientStreamInit(&http->client_stream, clientGetMoreData, clientReplyDetach,
clientReplyStatus, newServer, clientSocketRecipient,
- clientSocketDetach, newClient, tempBuffer);
+ clientSocketDetach, newClient, result->getClientStreamBuffer());
result->flags.parsed_ok = 1;
return result;
_______________________________________________
squid-dev mailing list
[email protected]
http://lists.squid-cache.org/listinfo/squid-dev