Your thoughts on the following?
Current OCSP stapling behavior that I think needs to be fixed: mod_ssl holds the single stapling global mutex while looking up a cached entry, deserializing it, checking its validity, and (when it is missing or expired) talking to the OCSP responder to get a new response.
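To make the problem concrete, the current sequence looks roughly like this; the helpers below are schematic stand-ins, not the actual mod_ssl symbols:

    #include "apr_global_mutex.h"

    /* Hypothetical stand-ins for the real cache/OCSP code: */
    void *cache_lookup_and_deserialize(const char *cert_id);
    int   response_still_usable(void *resp);
    void *query_ocsp_responder(const char *cert_id);
    void  cache_store(const char *cert_id, void *resp);

    /* Current behaviour, schematically: everything, including the
     * network round trip to the responder, happens under the one
     * global stapling mutex. */
    static void current_stapling_flow(apr_global_mutex_t *stapling_mutex,
                                      const char *cert_id)
    {
        void *resp;

        apr_global_mutex_lock(stapling_mutex);
        resp = cache_lookup_and_deserialize(cert_id);
        if (resp == NULL || !response_still_usable(resp)) {
            resp = query_ocsp_responder(cert_id);  /* network I/O under the mutex */
            cache_store(cert_id, resp);
        }
        apr_global_mutex_unlock(stapling_mutex);
    }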
1. mod_ssl shouldn't hold the single stapling global mutex when talking to the OCSP responder. This will stall ALL initial handshakes in all stapling-enabled vhosts, regardless of the certificate they use.
2. For the cache itself, mod_ssl shouldn't hold the single stapling global mutex when looking up a cached entry unless the socache type requires it for its own purposes. (memcached and distcache do not require it.)
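For point 2, a minimal sketch of the idea, assuming the decision can key off the AP_SOCACHE_FLAG_NOTMPSAFE provider flag that ap_socache.h already defines (the wrapper itself is hypothetical):

    #include "httpd.h"
    #include "ap_socache.h"
    #include "apr_global_mutex.h"

    /* Only serialize cache access when the provider itself is not
     * thread/process safe; the memcached and distcache providers do
     * not need the extra serialization. */
    static apr_status_t stapling_cache_retrieve(
            const ap_socache_provider_t *provider,
            ap_socache_instance_t *instance,
            server_rec *s, apr_global_mutex_t *cache_mutex,
            const unsigned char *id, unsigned int idlen,
            unsigned char *data, unsigned int *datalen,
            apr_pool_t *pool)
    {
        int need_lock = (provider->flags & AP_SOCACHE_FLAG_NOTMPSAFE) != 0;
        apr_status_t rv;

        if (need_lock)
            apr_global_mutex_lock(cache_mutex);
        rv = provider->retrieve(instance, s, id, idlen, data, datalen, pool);
        if (need_lock)
            apr_global_mutex_unlock(cache_mutex);
        return rv;
    }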
Assumption: The cache can be shared among different httpd instances (e.g., via memcached), but getting different instances to agree on which instance refreshes the cache is not worth handling for now. (Let multiple instances refresh if the timing is unlucky.)
What must be serialized globally within an httpd instance?
1. If the socache provider requires it: Any access to the stapling cache.
2. A thread claiming responsibility for refreshing the cached entry.
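For item 2, one way to "claim" responsibility without making other threads queue up behind the refresher would be a non-blocking attempt on a dedicated mutex; a sketch only, with the caveat that apr_global_mutex_trylock can return APR_ENOTIMPL for some lock mechanisms:

    #include "apr_errno.h"
    #include "apr_global_mutex.h"

    /* The thread that wins the trylock does the OCSP fetch (and later
     * unlocks); everyone else keeps serving the old response or falls
     * back to the existing "no response available" handling. */
    static int try_claim_refresh(apr_global_mutex_t *refresh_mutex)
    {
        apr_status_t rv = apr_global_mutex_trylock(refresh_mutex);

        if (rv == APR_SUCCESS)
            return 1;                 /* we refresh */
        if (APR_STATUS_IS_EBUSY(rv))
            return 0;                 /* someone else is refreshing */
        return -1;                    /* not supported: use a blocking scheme */
    }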
Why no global mutex per certificate?
1. There could be a large number of certificates, and lots of global mutexes could be very surprising or even require OS tuning with some mutex types.
2. A single mutex is required to interact with the cache anyway (when the cache requires a mutex).
3. It doesn't resolve which thread fetches a new response anyway.
Solution A: Prefetching in a daemon process/thread per httpd instance
The request processing flow would be most unlikely to block for stapling if a daemon is responsible for maintaining the cache and the request thread never has to look anything up. That leaves a race between the first prefetch and requests hitting the server right after startup. (Browsers may report an error to the user when tryLater is returned.)
The daemon would try to renew stapling responses ahead of the time when the existing response could no longer be used. If it can't, the error path on the request thread would be the same as the current handling of an inability to fetch a new response.
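A very rough sketch of the daemon loop; every name below is hypothetical, and in practice this might be hung off something like mod_watchdog:

    #include "apr_time.h"
    #include "apr_thread_proc.h"

    /* Hypothetical helpers standing in for the real certificate list,
     * cache and OCSP client: */
    int         stapling_cert_count(void);
    const char *stapling_cert_id(int i);
    apr_time_t  cached_response_expiry(const char *cert_id); /* 0 if none */
    int         refresh_response(const char *cert_id);       /* fetch + store */

    /* Wake up periodically and refresh any response that would stop
     * being usable before the next couple of wakeups, so request
     * threads (almost) never have to talk to the responder. */
    static void * APR_THREAD_FUNC prefetch_daemon(apr_thread_t *t, void *data)
    {
        const apr_interval_time_t interval = apr_time_from_sec(60);

        for (;;) {
            int i;
            for (i = 0; i < stapling_cert_count(); i++) {
                const char *id = stapling_cert_id(i);
                apr_time_t expiry = cached_response_expiry(id);

                /* On failure, the request-thread error path stays
                 * exactly as it is today. */
                if (expiry == 0 || expiry < apr_time_now() + 2 * interval) {
                    (void)refresh_response(id);
                }
            }
            apr_sleep(interval);
        }
        return NULL;
    }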
Solution B: Fetch on demand largely like current code, but with a separate Fetch mutex
Hold the stapling cache mutex only while reading from/writing to the cache; grab the Fetch mutex when a new response has to be fetched from the responder. (After obtaining the Fetch mutex, you'd need to look in the cache again to see whether another request thread did the lookup/store while you were waiting for the Fetch mutex.)
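In pseudo-C, the Solution B sequence for a request thread looks something like this; cache_get/cache_put/query_ocsp_responder are hypothetical stand-ins that take the cache mutex internally only when the provider requires it, as in the earlier sketch:

    #include "apr_global_mutex.h"

    void *cache_get(const char *cert_id);             /* cache mutex inside, if needed */
    void  cache_put(const char *cert_id, void *resp); /* cache mutex inside, if needed */
    void *query_ocsp_responder(const char *cert_id);

    /* Cache mutex only around cache I/O; Fetch mutex only around the
     * responder round trip, with a re-check after acquiring it. */
    static void *get_stapling_response(apr_global_mutex_t *fetch_mutex,
                                       const char *cert_id)
    {
        void *resp = cache_get(cert_id);
        if (resp != NULL)
            return resp;               /* common case: no Fetch mutex at all */

        apr_global_mutex_lock(fetch_mutex);

        /* Another thread may have fetched and stored while we waited. */
        resp = cache_get(cert_id);
        if (resp == NULL) {
            resp = query_ocsp_responder(cert_id);  /* no cache mutex held here */
            if (resp != NULL)
                cache_put(cert_id, resp);
        }

        apr_global_mutex_unlock(fetch_mutex);
        return resp;
    }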
By itself this doesn't solve the potential blocking of a bunch of initial handshakes while a lookup is in progress, but at least requests whose (different) certificate already has a cached response are no longer blocked by that lookup.
A fairly simple improvement to this would be to have a small number of Fetch mutexes, where each certificate maps to a specific Fetch mutex (but not vice versa), so that lookups for multiple certificates could be done at once. This doesn't solve blocking all initial handshakes for a certificate that needs a fresh response, or completely solve blocking those for other certificates that need a fresh response (since multiple certificates could map to the same Fetch mutex).
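The certificate-to-mutex mapping could be as simple as hashing a stable identifier of the certificate, e.g. (assuming a string key per certificate; mutex creation at startup is not shown):

    #include "apr_hash.h"
    #include "apr_global_mutex.h"

    #define NUM_FETCH_MUTEXES 8   /* small and fixed, not one per certificate */

    static apr_global_mutex_t *fetch_mutexes[NUM_FETCH_MUTEXES];

    /* Each certificate always maps to the same Fetch mutex, but several
     * certificates may share one, so collisions can still block
     * unrelated fetches. */
    static apr_global_mutex_t *fetch_mutex_for(const char *cert_id)
    {
        apr_ssize_t len = APR_HASH_KEY_STRING;
        unsigned int h = apr_hashfunc_default(cert_id, &len);

        return fetch_mutexes[h % NUM_FETCH_MUTEXES];
    }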
Solution C: Hybrid of A and B
The request thread implements Solution B, but generally a lookup on the request thread won't be needed since the daemon has already done the work. At server startup, though, the daemon and the request threads might fight over the Fetch mutex until responses for commonly-used certificates have been obtained and cached. This addresses the potential lack of responses right after server startup.
Since the request thread is able to do the work in a pinch, this lends itself to an "SSLStaplingPrefetch On|Off" directive that could be used to disable the prefetch daemon.
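If that seems reasonable, the directive itself would be unremarkable; a hypothetical registration just to make the shape concrete (neither the directive nor the storage below exists in mod_ssl today):

    #include "httpd.h"
    #include "http_config.h"

    static int stapling_prefetch_enabled = 1;  /* hypothetical; real code would
                                                * live in the per-server config */

    static const char *set_stapling_prefetch(cmd_parms *cmd, void *dcfg, int flag)
    {
        stapling_prefetch_enabled = flag;
        return NULL;
    }

    static const command_rec stapling_prefetch_cmds[] = {
        AP_INIT_FLAG("SSLStaplingPrefetch", set_stapling_prefetch, NULL,
                     RSRC_CONF,
                     "Enable or disable the OCSP stapling prefetch daemon"),
        { NULL }
    };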