Your thoughts on the following?
Current OCSP stapling behavior that I think needs to be fixed: mod_ssl holds the single stapling global mutex while looking up a cached entry, deserializing it, checking its validity, and (when it is missing or expired) talking to the OCSP responder to get a new response.
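To make the problem concrete, the current sequence looks roughly like this; the helpers below are schematic stand-ins, not the actual mod_ssl symbols:

    #include "apr_global_mutex.h"

    /* Hypothetical stand-ins for the real cache/OCSP code: */
    void *cache_lookup_and_deserialize(const char *cert_id);
    int   response_still_usable(void *resp);
    void *query_ocsp_responder(const char *cert_id);
    void  cache_store(const char *cert_id, void *resp);

    /* Current behaviour, schematically: everything, including the
     * network round trip to the responder, happens under the one
     * global stapling mutex. */
    static void current_stapling_flow(apr_global_mutex_t *stapling_mutex,
                                      const char *cert_id)
    {
        void *resp;

        apr_global_mutex_lock(stapling_mutex);
        resp = cache_lookup_and_deserialize(cert_id);
        if (resp == NULL || !response_still_usable(resp)) {
            resp = query_ocsp_responder(cert_id);  /* network I/O under the mutex */
            cache_store(cert_id, resp);
        }
        apr_global_mutex_unlock(stapling_mutex);
    }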
1. mod_ssl shouldn't hold the single stapling global mutex when talking to the OCSP responder. This will stall ALL initial handshakes in all stapling-enabled vhosts, regardless of the certificate they use.
2. For the cache itself, mod_ssl shouldn't hold the single stapling global mutex when looking up a cached entry unless the socache type requires it for its own purposes. (memcached and distcache do not require it.)
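For point 2, a minimal sketch of the idea, assuming the decision can key off the AP_SOCACHE_FLAG_NOTMPSAFE provider flag that ap_socache.h already defines (the wrapper itself is hypothetical):

    #include "httpd.h"
    #include "ap_socache.h"
    #include "apr_global_mutex.h"

    /* Only serialize cache access when the provider itself is not
     * thread/process safe; the memcached and distcache providers do
     * not need the extra serialization. */
    static apr_status_t stapling_cache_retrieve(
            const ap_socache_provider_t *provider,
            ap_socache_instance_t *instance,
            server_rec *s, apr_global_mutex_t *cache_mutex,
            const unsigned char *id, unsigned int idlen,
            unsigned char *data, unsigned int *datalen,
            apr_pool_t *pool)
    {
        int need_lock = (provider->flags & AP_SOCACHE_FLAG_NOTMPSAFE) != 0;
        apr_status_t rv;

        if (need_lock)
            apr_global_mutex_lock(cache_mutex);
        rv = provider->retrieve(instance, s, id, idlen, data, datalen, pool);
        if (need_lock)
            apr_global_mutex_unlock(cache_mutex);
        return rv;
    }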
Assumption: The cache can be shared among different httpd instances (e.g., via memcached), but getting different instances to agree on which instance refreshes the cache is not worth handling for now. (Let multiple instances refresh if the timing is unlucky.)
What must be serialized globally within an httpd instance?
1. If the socache provider requires it: Any access to the stapling cache.
2. A thread claiming responsibility for refreshing the cached entry.
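For item 2, one way to "claim" responsibility without making other threads queue up behind the refresher would be a non-blocking attempt on a dedicated mutex; a sketch only, with the caveat that apr_global_mutex_trylock can return APR_ENOTIMPL for some lock mechanisms:

    #include "apr_errno.h"
    #include "apr_global_mutex.h"

    /* The thread that wins the trylock does the OCSP fetch (and later
     * unlocks); everyone else keeps serving the old response or falls
     * back to the existing "no response available" handling. */
    static int try_claim_refresh(apr_global_mutex_t *refresh_mutex)
    {
        apr_status_t rv = apr_global_mutex_trylock(refresh_mutex);

        if (rv == APR_SUCCESS)
            return 1;                 /* we refresh */
        if (APR_STATUS_IS_EBUSY(rv))
            return 0;                 /* someone else is refreshing */
        return -1;                    /* not supported: use a blocking scheme */
    }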
Why no global mutex per certificate?
1. There could be a large number of certificates, and lots of global mutexes could be very surprising or even require OS tuning with some mutex types.
2. A single mutex is required to interact with the cache anyway (when the cache requires a mutex).
3. It doesn't resolve which thread fetches a new response anyway.
Solution A: Prefetching in a daemon process/thread per httpd instance
The request processing flow would be most unlikely to block for stapling if a daemon is responsible for maintaining the cache and the request thread never has to look anything up. That leaves a race between the first prefetch and requests hitting the server right after startup. (Browsers may report an error to the user when tryLater is returned.)
The daemon would try to renew stapling responses ahead of the time when the existing response could no longer be used. If it can't, the error path on the request thread would be the same as the current handling of an inability to fetch a new response.
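A very rough sketch of the daemon loop; every name below is hypothetical, and in practice this might be hung off something like mod_watchdog:

    #include "apr_time.h"
    #include "apr_thread_proc.h"

    /* Hypothetical helpers standing in for the real certificate list,
     * cache and OCSP client: */
    int         stapling_cert_count(void);
    const char *stapling_cert_id(int i);
    apr_time_t  cached_response_expiry(const char *cert_id); /* 0 if none */
    int         refresh_response(const char *cert_id);       /* fetch + store */

    /* Wake up periodically and refresh any response that would stop
     * being usable before the next couple of wakeups, so request
     * threads (almost) never have to talk to the responder. */
    static void * APR_THREAD_FUNC prefetch_daemon(apr_thread_t *t, void *data)
    {
        const apr_interval_time_t interval = apr_time_from_sec(60);

        for (;;) {
            int i;
            for (i = 0; i < stapling_cert_count(); i++) {
                const char *id = stapling_cert_id(i);
                apr_time_t expiry = cached_response_expiry(id);

                /* On failure, the request-thread error path stays
                 * exactly as it is today. */
                if (expiry == 0 || expiry < apr_time_now() + 2 * interval) {
                    (void)refresh_response(id);
                }
            }
            apr_sleep(interval);
        }
        return NULL;
    }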
Solution B: Fetch on demand largely like current code, but with a separate Fetch mutex
Hold the stapling cache mutex only while reading from/writing to the cache; grab the Fetch mutex when a new response has to be fetched from the responder. (After obtaining the Fetch mutex, you'd need to look in the cache again to see whether another request thread did the lookup/store while you were waiting for the Fetch mutex.)
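In pseudo-C, the Solution B sequence for a request thread looks something like this; cache_get/cache_put/query_ocsp_responder are hypothetical stand-ins that take the cache mutex internally only when the provider requires it, as in the earlier sketch:

    #include "apr_global_mutex.h"

    void *cache_get(const char *cert_id);             /* cache mutex inside, if needed */
    void  cache_put(const char *cert_id, void *resp); /* cache mutex inside, if needed */
    void *query_ocsp_responder(const char *cert_id);

    /* Cache mutex only around cache I/O; Fetch mutex only around the
     * responder round trip, with a re-check after acquiring it. */
    static void *get_stapling_response(apr_global_mutex_t *fetch_mutex,
                                       const char *cert_id)
    {
        void *resp = cache_get(cert_id);
        if (resp != NULL)
            return resp;               /* common case: no Fetch mutex at all */

        apr_global_mutex_lock(fetch_mutex);

        /* Another thread may have fetched and stored while we waited. */
        resp = cache_get(cert_id);
        if (resp == NULL) {
            resp = query_ocsp_responder(cert_id);  /* no cache mutex held here */
            if (resp != NULL)
                cache_put(cert_id, resp);
        }

        apr_global_mutex_unlock(fetch_mutex);
        return resp;
    }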
By itself this doesn't solve the potential blocking of a bunch of initial handshakes while a lookup is in progress, but at least requests whose (different) certificate already has a cached response are no longer blocked by that lookup.
A fairly simple improvement to this would be to have a small number of Fetch mutexes, where each certificate maps to a specific Fetch mutex (but not vice versa), so that lookups for multiple certificates could be done at once. This doesn't solve blocking all initial handshakes for a certificate that needs a fresh response, or completely solve blocking those for other certificates that need a fresh response (since multiple certificates could map to the same Fetch mutex).
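The certificate-to-mutex mapping could be as simple as hashing a stable identifier of the certificate, e.g. (assuming a string key per certificate; mutex creation at startup is not shown):

    #include "apr_hash.h"
    #include "apr_global_mutex.h"

    #define NUM_FETCH_MUTEXES 8   /* small and fixed, not one per certificate */

    static apr_global_mutex_t *fetch_mutexes[NUM_FETCH_MUTEXES];

    /* Each certificate always maps to the same Fetch mutex, but several
     * certificates may share one, so collisions can still block
     * unrelated fetches. */
    static apr_global_mutex_t *fetch_mutex_for(const char *cert_id)
    {
        apr_ssize_t len = APR_HASH_KEY_STRING;
        unsigned int h = apr_hashfunc_default(cert_id, &len);

        return fetch_mutexes[h % NUM_FETCH_MUTEXES];
    }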
Solution C: Hybrid of A and B
The request thread implements Solution B, but generally a lookup on the request thread won't be needed since the daemon has already done the work. At server startup, though, the daemon and the request threads might fight over the Fetch mutex until responses for commonly-used certificates have been obtained and cached. This addresses the potential lack of responses right after server startup.
Since the request thread is able to do the work in a pinch, this lends itself to an "SSLStaplingPrefetch On|Off" directive that could be used to disable the prefetch daemon.
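If that seems reasonable, the directive itself would be unremarkable; a hypothetical registration just to make the shape concrete (neither the directive nor the storage below exists in mod_ssl today):

    #include "httpd.h"
    #include "http_config.h"

    static int stapling_prefetch_enabled = 1;  /* hypothetical; real code would
                                                * live in the per-server config */

    static const char *set_stapling_prefetch(cmd_parms *cmd, void *dcfg, int flag)
    {
        stapling_prefetch_enabled = flag;
        return NULL;
    }

    static const command_rec stapling_prefetch_cmds[] = {
        AP_INIT_FLAG("SSLStaplingPrefetch", set_stapling_prefetch, NULL,
                     RSRC_CONF,
                     "Enable or disable the OCSP stapling prefetch daemon"),
        { NULL }
    };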