On 02/03/2017 12:30 AM, Niklas Edmundsson wrote:
Methinks this makes mmap+ssl a VERY bad combination if the thing
SIGBUS:es due to a simple IO error, I'll proceed with disabling mmap and
see if that is a viable way to go for our workload...
(Pulling from a parallel conversation, with permission)
The question has been raised: is our mmap() optimization really giving
us the utility we want for the additional instability we pay? Stefan had
this to say:
On 02/03/2017 08:32 AM, Stefan Eissing wrote:
Experimented on my Ubuntu 14.04 image on Parallels on MacOS 10.12,
MacBook Pro mid 2012. Loading a 10 MB file 1000 times over 8
connections:
h2load -c 8 -t 8 -n 1000 -m1 http://xxx/10mb.file
using HTTP/1.1 and HTTP/2 (limit of 1 stream at a time per
connection). Plain and with TLS1.2, transfer speeds in GByte/sec
from localhost:
            H1Plain  H1SSL  H2Plain  H2SSL
  MMAP on     4.3     1.5     3.8     1.3
  MMAP off    3.5     1.1     3.8     1.3
HTTP/2 seems rather unaffected, while HTTP/1.1 experiences
significant differences. Hmm...
and I replied:
On 03.02.2017 at 21:47, Jacob Champion wrote:
Weird. I can't see any difference for plain HTTP/1.1 when just
toggling EnableMMAP, even with EnableSendfile off. I *do* see a
significant difference for TLS+HTTP/1.1. That doesn't really make
sense to me; is there some other optimization kicking in?
sendfile blows the mmap optimization out of the water, but naturally
it can't kick in for TLS. I would be interested to see if an
O_DIRECT-aware file bucket could speed up the TLS side of things
without exposing people to mmap instability.
I was also interested to see if there was some mmap() flag we were
missing that could fix the problem for us. Turns out a few systems (used
to?) have one called MAP_COPY. Linus had a few choice words about it:
http://yarchive.net/comp/linux/map_copy.html
Linus-insult-rant aside, his point applies here too, I think. We're
using mmap() as an optimized read(). We should be focusing on how to use
read() in an optimized way. And surely read() for modern systems has
come a long way since that thread in 2001?
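
To make the difference in failure modes concrete, here is a minimal C sketch (not httpd or APR code). It uses file truncation as a stand-in for the I/O error Niklas hit, which is an assumption made purely so the demo is reproducible: touching a mapped page that has lost its backing data raises SIGBUS, while the same request through pread() simply returns a value the caller can check. The helper names mmap_touch_after_truncate and read_after_truncate are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

static sigjmp_buf bus_env;

static void on_sigbus(int signo)
{
    (void)signo;
    siglongjmp(bus_env, 1);          /* jump back out of the faulting access */
}

/*
 * Hypothetical demo helper: map `len` bytes of `fd`, truncate the file to
 * zero (standing in for an I/O error), then touch the first mapped byte.
 * Returns 1 if the access raised SIGBUS, 0 if it did not, -1 on setup error.
 */
int mmap_touch_after_truncate(int fd, size_t len)
{
    struct sigaction sa, old;
    volatile int faulted = 0;
    char *p;

    p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED)
        return -1;
    if (ftruncate(fd, 0) != 0) {
        munmap(p, len);
        return -1;
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0) {
        munmap(p, len);
        return -1;
    }

    if (sigsetjmp(bus_env, 1) == 0) {
        /* The page no longer has file data behind it, so this load faults. */
        volatile char c = *(volatile char *)p;
        (void)c;
    } else {
        faulted = 1;                 /* we arrived here from the handler */
    }

    sigaction(SIGBUS, &old, NULL);   /* restore the previous disposition */
    munmap(p, len);
    return faulted;
}

/* The same situation through read(): nothing but a return value to check. */
ssize_t read_after_truncate(int fd, void *buf, size_t len)
{
    return pread(fd, buf, len, 0);
}
```

A server cannot realistically wrap every access to a mapped bucket in a handler like this, which is why the read() path looks attractive despite the extra copy.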
Considering the massive amount of caching that's built into the entire
HTTP ecosystem already, O_DIRECT *might* be an effective way to do that
(in which we give up filesystem optimizations and caching in return for
a DMA into userspace). I have a PoC about halfway done, but I need to
split my time this week between this and the FCGI stuff I've been
neglecting.
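
For readers who have not used O_DIRECT, the following is a rough, Linux-oriented sketch of what reading a file that way involves. It is not the PoC mentioned above and not an APR file bucket; read_file_direct, chunk_sink, sum_sink and DIRECT_ALIGN are hypothetical names, and the 4096-byte alignment is an assumption (real code should query the filesystem or device). It falls back to ordinary buffered reads when the filesystem refuses O_DIRECT.

```c
#define _GNU_SOURCE                  /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0                   /* no O_DIRECT here: plain buffered reads */
#endif

#define DIRECT_ALIGN 4096            /* assumed buffer/length alignment */

/* Hypothetical callback that consumes each chunk; nonzero return aborts. */
typedef int (*chunk_sink)(const void *data, size_t len, void *ctx);

/* Example sink: adds every byte into an unsigned long long passed as ctx. */
int sum_sink(const void *data, size_t len, void *ctx)
{
    const unsigned char *p = data;
    unsigned long long *sum = ctx;
    for (size_t i = 0; i < len; i++)
        *sum += p[i];
    return 0;
}

/* Streams `path` through `sink`; returns bytes read, or -1 with errno set. */
long long read_file_direct(const char *path, size_t chunk,
                           chunk_sink sink, void *ctx)
{
    struct stat st;
    void *buf = NULL;
    long long total = 0;
    int saved = 0;
    int fd;

    /* O_DIRECT wants the request length aligned as well as the buffer. */
    assert(chunk > 0 && chunk % DIRECT_ALIGN == 0);

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_RDONLY);   /* filesystem refuses O_DIRECT */
    if (fd < 0)
        return -1;

    if (fstat(fd, &st) != 0) { saved = errno; goto fail; }
    if (posix_memalign(&buf, DIRECT_ALIGN, chunk) != 0) {
        saved = ENOMEM;
        goto fail;
    }

    while (total < (long long)st.st_size) {
        ssize_t n = read(fd, buf, chunk);
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && errno == EINVAL) {
            /* Flag accepted at open() but rejected at read(): drop it once. */
            int fl = fcntl(fd, F_GETFL);
            if (fl != -1 && (fl & O_DIRECT) &&
                fcntl(fd, F_SETFL, fl & ~O_DIRECT) == 0)
                continue;
            errno = EINVAL;
        }
        if (n < 0) { saved = errno; goto fail; }
        if (n == 0)
            break;                   /* file shrank underneath us */
        if (sink(buf, (size_t)n, ctx) != 0) { saved = ECANCELED; goto fail; }
        total += n;
    }
    free(buf);
    close(fd);
    return total;

fail:
    free(buf);
    close(fd);
    errno = saved;
    return -1;
}
```

Something like read_file_direct("10mb.file", 64 * 1024, sum_sink, &sum) would stream the file in 64 KiB chunks. An I/O error then surfaces as a -1 return with errno set rather than as a signal, at the cost of bypassing the page cache.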
--Jacob