On 02/03/2017 12:30 AM, Niklas Edmundsson wrote:
Methinks this makes mmap+ssl a VERY bad combination if the thing
SIGBUS:es due to a simple IO error, I'll proceed with disabling mmap and
see if that is a viable way to go for our workload...

(Pulling from a parallel conversation, with permission)

The question has been raised: is our mmap() optimization really giving us enough benefit to justify the instability it costs us? Stefan had this to say:

On 02/03/2017 08:32 AM, Stefan Eissing wrote:
Experimented on my Ubuntu 14.04 image on Parallels on MacOS 10.12,
MacBook Pro mid 2012. Loading a 10 MB file 1000 times over 8
connections:

h2load -c 8 -t 8 -n 1000 -m1 http://xxx/10mb.file

using HTTP/1.1 and HTTP/2 (limit of 1 stream at a time per
connection). Plain and with TLS1.2, transfer speeds in GByte/sec
from localhost:

            H1Plain  H1SSL  H2Plain  H2SSL
MMAP on     4.3      1.5    3.8      1.3
MMAP off    3.5      1.1    3.8      1.3

HTTP/2 seems rather unaffected, while HTTP/1.1 experiences
significant differences. Hmm...

and I replied:

On 02/03/2017 09:47 PM, Jacob Champion <[email protected]> wrote:
Weird. I can't see any difference for plain HTTP/1.1 when just
toggling EnableMMAP, even with EnableSendfile off. I *do* see a
significant difference for TLS+HTTP/1.1. That doesn't really make
sense to me; is there some other optimization kicking in?

sendfile blows the mmap optimization out of the water, but naturally
it can't kick in for TLS. I would be interested to see if an
O_DIRECT-aware file bucket could speed up the TLS side of things
without exposing people to mmap instability.

I was also interested to see if there was some mmap() flag we were missing that could fix the problem for us. Turns out a few systems (used to?) have one called MAP_COPY. Linus had a few choice words about it:

    http://yarchive.net/comp/linux/map_copy.html

Linus-insult-rant aside, his point applies here too, I think. We're using mmap() as an optimized read(). We should be focusing on how to use read() in an optimized way. And surely read() for modern systems has come a long way since that thread in 2001?
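
To make the difference concrete: with read(), a failing disk or a file truncated underneath us shows up as an error return or a short read that the caller can handle, whereas with mmap() the same condition arrives as a SIGBUS at whichever instruction touches the page. A minimal sketch of the read() side follows, in Python for brevity rather than httpd's C. The name `read_file_chunks` and the 64 KiB chunk size are made up for illustration and are not anything in APR or httpd.

```python
import os


def read_file_chunks(path, chunk_size=64 * 1024):
    """Yield the contents of `path` in chunks using plain read() calls."""
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            # An I/O failure raises OSError here (EIO, for example),
            # which the caller can catch like any other exception.
            chunk = os.read(fd, chunk_size)
            if not chunk:
                # EOF, including the case where the file was truncated
                # after we opened it: a normal return, not a signal.
                return
            yield chunk
    finally:
        os.close(fd)
```

A caller would wrap the iteration in an ordinary `try`/`except OSError` block and abort the response cleanly, which is the part we cannot easily do once the failure is a signal raised in the middle of an SSL write.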

Considering the massive amount of caching that's built into the entire HTTP ecosystem already, O_DIRECT *might* be an effective way to do that (in which we give up filesystem optimizations and caching in return for a DMA into userspace). I have a PoC about halfway done, but I need to split my time this week between this and the FCGI stuff I've been neglecting.
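
For anyone who wants to experiment with the idea, here is a rough sketch of what an O_DIRECT read loop looks like. It is not the PoC mentioned above and not httpd code; it is Python for brevity, and `read_direct` and its 1 MiB default chunk size are hypothetical. It assumes Linux, and it assumes the page size is a multiple of the device's logical block size, since O_DIRECT wants the buffer, offset, and length aligned. Filesystems that refuse O_DIRECT get a plain buffered read instead.

```python
import errno
import fcntl
import mmap
import os


def read_direct(path, chunk_size=1 << 20):
    """Yield successive chunks of `path`, bypassing the page cache if possible."""
    if chunk_size <= 0 or chunk_size % mmap.PAGESIZE:
        raise ValueError("chunk_size must be a positive multiple of the page size")
    o_direct = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    try:
        fd = os.open(path, os.O_RDONLY | o_direct)
    except OSError as exc:
        if exc.errno != errno.EINVAL:
            raise
        # Some filesystems reject O_DIRECT at open(); use a buffered read.
        fd = os.open(path, os.O_RDONLY)
        o_direct = 0
    # An anonymous mapping is page-aligned, which O_DIRECT requires of the
    # destination buffer (assumption: page size >= device block size).
    buf = mmap.mmap(-1, chunk_size)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                n = os.preadv(fd, [buf], offset)
            except OSError as exc:
                if exc.errno != errno.EINVAL or not o_direct:
                    raise
                # Others only reject it at read time; drop the flag and retry.
                flags = fcntl.fcntl(fd, fcntl.F_GETFL)
                fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~o_direct)
                o_direct = 0
                continue
            if n == 0:
                break  # file shrank underneath us: ordinary EOF, no signal
            yield buf[:n]
            offset += n
    finally:
        buf.close()
        os.close(fd)
```

Whether this actually beats a buffered read() for the TLS case is exactly the open question; the sketch only shows the alignment bookkeeping involved, not a performance claim.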

--Jacob
