On 02/19/2017 01:37 PM, Niklas Edmundsson wrote:
On Thu, 16 Feb 2017, Jacob Champion wrote:
So, I had already hacked my O_DIRECT bucket case to just be a copy of
APR's file bucket, minus the mmap() logic. I tried making this change
on top of it...

...and holy crap, for regular HTTP it's *faster* than our current
mmap() implementation. HTTPS is still slower than with mmap, but
faster than it was without the change. (And the HTTPS performance has
been really variable.)

I'm guessing that this is with a low-latency storage device, say a
local SSD with low load? O_DIRECT on anything with latency would require
way bigger blocks to hide the latency... You really want the OS
readahead in the generic case, simply because it performs reasonably
well in most cases.

I described my setup really poorly. I've ditched O_DIRECT entirely. The bucket type I created to use O_DIRECT has been repurposed to just be a copy of the APR file bucket, with the mmap optimization removed entirely, and with the new 64K bucket buffer limit. This new "no-mmap-plus-64K-block" file bucket type performs better on my machine than the old "mmap-enabled" file bucket type.

(But yes, my testing is all local, with a nice SSD. Hopefully that gets a little closer to isolating the CPU parts of this equation, which is the thing we have the most influence over.)

I think the big win here is to use appropriate block sizes, you do more
useful work and less housekeeping. I have no clue on when the block size
choices were made, but it's likely that it was a while ago. Assuming
that things will continue to evolve, I'd say making hard-coded numbers
tunable is a Good Thing to do.

Agreed.

Is there interest in more real-life numbers with increasing
FILE_BUCKET_BUFF_SIZE, or are you already on it?

Yes please! My laptop probably isn't representative of most servers; it can do nearly 3 GB/s AES-128-GCM. The more machines we test, the better.

I have an older server
that can do 600 MB/s aes-128-gcm per core, but it is only able to deliver
300 MB/s https single-stream via its 10 Gbps interface. My guess is that
too-small blocks cause CPU cycles to be spent on housekeeping rather than
delivering data...

Right. To give you an idea of where I am in testing at the moment: I have a basic test server written with OpenSSL. It sends a 10 MiB response body from memory (*not* from disk) for every GET it receives. I also have a copy of httpd trunk that's serving an actual 10 MiB file from disk.

My test call is just `h2load --h1 -n 100 https://localhost/`, which should send 100 requests over a single TLS connection. The ciphersuite selected for all test cases is ECDHE-RSA-AES256-GCM-SHA384. For reference, I can do in-memory AES-256-GCM at 2.1 GiB/s.

- The OpenSSL test server, writing from memory: 1.2 GiB/s
- httpd trunk with `EnableMMAP on` and serving from disk: 850 MiB/s
- httpd trunk with `EnableMMAP off` and serving from disk: 580 MiB/s
- httpd trunk with my no-mmap-64K-block file bucket: 810 MiB/s

So just bumping the block size gets me almost to the speed of mmap, without the downside of a potential SIGBUS. Meanwhile, the OpenSSL test server seems to suggest a performance ceiling about 50% above where we are now.

Even with the test server serving responses from memory, that seems like plenty of room to grow. I'm working on a version of the test server that serves files from disk so that I'm not comparing apples to oranges, but my prior testing leads me to believe that disk access is not the limiting factor on my machine.

--Jacob
