Hey there,

Warning: long mail ahead. I've been meaning to explain some details of shmcb for
a while and here it is. I can now recede further into my woodwork knowing that
I've brain dumped a little :-) If you're at all interested in this stuff please
take a squint through this. It may also help those wanting to understand the
differences between shmcb and shmht, in particular the hidden weaknesses of
shmht that caused shmcb to be, and also give you some implicit tips on *other*
things you need to consider when benchmarking and capacity-planning. At the end
there's also a possible bug alert that I am personally unable to reproduce or
test - please take a look. Having said that, I was actually able to reproducibly
crash mod_ssl completely (every https request to any httpd process core-dumped)
using shmht the last time I tried, so I'm not going to say shmcb is too risky
by comparison :-) (NB: the crash required an extremely perverse timing of
requests that no attacker could reproduce unless they were the only client ever
hitting your site - don't panic :-).

On Thu, 22 Mar 2001, David Rees wrote:

> If you compile the mm library into mod_ssl, and turn on the SSL_EXPERIMENTAL
> flag during the configuration stage of apache, you get another shared memory
> cache ("shmcb") which is supposed to be faster and more robust than the
> standard shared memory cache.  This code was donated by the folks at
> Stronghold (who use mod_ssl in their server) and should be better under load
> than the standard shared memory cache.  I didn't see any performance
> difference with this cache over the standard "shmht" cache.

yup - there's a lot more at play than straight-line speed, and most "normal"
tests would struggle to spot a significant speed improvement. However, speed
improvements do exist depending on your tests. Using hardware acceleration for
crypto, tuning the Apache installation to remove lots of other sludgey
slow-downs (DirectoryIndex is really *slow* with a few entries for example),
using a high degree of parallelism in the benchmarking, and keeping the tests
running over a long time relative to session expiry times, I've seen shmht
performance degrade over time to the point that it stabilised at around 10-15%
slower than shmcb. That's requests/sec speed too, which takes into account *all*
processing involved in serving an HTTPS request (I was usually using a repeating
sequence of "SRRSR" where S=new session, R=resume session). Considering how
small a piece of the puzzle session cache operations are supposed to be, 10-15%
is a significant difference - I believe, though cannot currently provide
statistics to prove it, that what this actually represents is that shmcb was
effectively using virtually no CPU time relative to other Apache, OpenSSL, and
mod_ssl operations at run-time, whereas shmht was using closer to 10% of CPU
time after it stabilised.

Also, shmcb performance was uniform in all tests I ran - irrespective of cache
utilisation, expiry times, etc. shmht on the other hand would slow down and
speed up depending on a number of factors such as cache utilisation and when it
would decide to do an "expiry run" on the cache. Also, at start-up, shmht ran
faster than it did after tests had been running against it for a while - I
think this is due to fragmentation in the shared-memory segment, but it could
well be other factors too.

> FWIW, I've been using the "shmcb" cache in all my servers (various IRIX and
> Linux machines) with no problems under various light to moderate (1 million
> hits/day) load.
> 
> As for tuning advice for the size of the shared memory cache, it seems that
> every ssl_session uses right around 140-150 bytes per session.  This means
> that with the default session cache size of 512000 bytes, you can support
> about 3500 concurrent users before the cache fills up and the server starts
> expiring sessions early.

This is one of the things people should know about shmcb vs shmht. One of the
major problems with shmht's design is its handling of expired sessions.

Basically, it's pretty broken from a design point of view - due in part to the
way the session cache is organised and indexed. Session expiry is supposed to be
a security measure, not an optimisation measure. However, shmht will never
forcibly expire sessions from the cache to make room for new sessions, it will
only expire a session after the expiry time. The consequence is that if the
cache fills up (which would probably suggest that your server is *busy*) then
future session negotiations will not be cached, which means fewer and fewer
session resumes succeed, which means more and more renegotiations take place -
all of which will likely make the server busier than it already was. So there's
a good chance your server performance will nose-dive when the cache fills up,
effectively forcing you to choose expiry times and cache sizes based on expected
loads, and not at all based on what session timeouts are supposed to mean.

shmcb handles this differently - if the cache (more precisely, if a sub-cache)
is full and an attempt to store a new session would otherwise fail, it will
start scrolling out the oldest sessions until there is enough space for the new
session. The upshot of this is that if your server is busy and your cache is
filling up so that, for example, the oldest sessions are about 3 minutes old
(and expiry time is set to five minutes), then at that load, sessions will
actually be getting expired from the cache when they are about 3 minutes old.
In other words, the cache is lowering the effective expiry time of sessions on
the fly if the load is too high for the cache. Of course, if the load drops down
again, then eventually the cache/load balance will return to a state where the
cache can store all required sessions, so that they're only being expired when
the expiry time is up. The point of all this is that you can set the expiry time
as high as you want - at high loads it will be overruled by the server so that
new sessions can always be stored. Ie. the expiry setting becomes a *maximum*
rather than an absolute, which is more in keeping with the cryptographic
purpose: the expiry time is the admin's way of saying "If the
client tries to keep using a session longer than 2 hours, I want to *force* them
to renegotiate a new session anyway". Saying "this session *must* remain in the
cache for the next 2 hours" is counter-intuitive and risks forcing the server
into a situation where new sessions don't get cached *at all*.
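
To make the "scrolling" behaviour concrete, here's a minimal sketch of the
store logic described above - the subcache type and helper names are my own
illustration, *not* mod_ssl's actual API:

 #include <stddef.h>

 struct subcache;                                 /* opaque for this sketch */
 size_t subcache_free_space(struct subcache *sc);
 size_t subcache_used(struct subcache *sc);
 void   subcache_drop_oldest(struct subcache *sc);
 void   subcache_append(struct subcache *sc, const void *sess, size_t len);

 int subcache_store(struct subcache *sc, const void *sess, size_t len)
 {
     /* roll the oldest sessions off the front of the cyclic buffer until
      * the new session fits - under load this is what silently lowers the
      * effective expiry time */
     while (subcache_free_space(sc) < len && subcache_used(sc) > 0)
         subcache_drop_oldest(sc);
     if (subcache_free_space(sc) < len)
         return 0;                    /* session can never fit, give up */
     subcache_append(sc, sess, len);  /* tag onto the tail of the buffer */
     return 1;
 }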

Err, what else. Performance ... you *will* notice a significant performance
difference if you run benchmarks that keep the session cache reasonably full.
The reason for this is that the existing shmht cache uses poor indexing, and
iteration operations are *very* slow. Eg. when an expiry is performed in shmht,
it will run across every single SSL_SESSION in the cache one-by-one, ASN1
decoding the serialised forms into SSL_SESSION structures (this is an expensive
operation), and checking the expiry times to decide whether to delete or not.
This is not only expensive, but also (again) performs at its worst in the
condition when you need it at its fastest - namely when the cache is quite full.
It is also the reason for another problem with shmht - namely, that it only
checks expiry times every "SSLCacheTimeout" seconds.

Eg. if the cache has session timeouts set to 300 seconds, then the first time a
real test is performed on the cache to expire old sessions, a time_t flag is set
so that no future tests will actually do any real work until after another 300
seconds have passed. If another test is called for by the code after 302
seconds, then a real test will be performed and the time_t value reset again.
The problem with this is that if you are getting sessions reasonably uniformly
(which is the normal scenario), and you are utilising your cache at nearly full
capacity, then the cache effectively loses 33% of its storage: on average a
dead session lingers for half an expiry interval (150 seconds) beyond its
300-second useful life, so 150 of every 450 seconds a session spends in the
cache are wasted. Consider a real expiry run performed on the cache at
12:00:00pm followed by a new session added to the cache at 12:00:05pm. If the
next *real* expiry run happens at 12:05:01pm (here we are using 5 minute expiry
times), then that session will be *nearly* ready to expire but not quite, so it
will stay in the cache for another 5 minutes (at least). But 4 seconds later,
that session is unusable and is simply occupying space that could block new
sessions from being added to the cache.
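
A minimal sketch of that guard logic, as I remember it (the names here are
illustrative, not shmht's actual code):

 #include <time.h>

 static time_t next_expiry_run = 0;

 void cache_maybe_expire(time_t now, long timeout)
 {
     if (now < next_expiry_run)
         return;                 /* skip the expensive iteration entirely */
     next_expiry_run = now + timeout;
     /* ... otherwise iterate across *every* session in the cache,
      * ASN1-decoding each one just to check its expiry time ... */
 }

So between real runs, sessions that have passed their timeout sit in the cache
as dead weight.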

What else ... concurrency - shmcb chooses a bit-mask value at startup based on
the cache size to divide the shared memory segment into a number of smaller
caches - "subcaches". The motivation for this was as follows: one of the
problems with shmht is that the first step in any add/get/delete/expire
operation is to grab the global mutex lock (this prevents other processes from
operating on the cache at the same time). It then performs its operations,
which, as I've explained, can be time-consuming given the structure of shmht's
data management, and finally releases the lock so other processes can get to
the cache too. If you're benchmarking using only 1 request at a time, you may
notice this sort of problem a lot less than if you're benchmarking with a lot
of parallelism. However, shmcb doesn't actually do as much to improve this as
it could - the global mutex lock is still there despite the fact that every
single operation works on only one sub-cache, so in theory we could use a lock
for each sub-cache (or at least a lock for each group of "n" sub-caches) so
that on average lots of cache operations can be happening at the same time (see
the sketch below). Still, at least by using smaller caches each operation is
going to be a lot faster than if it had to work on a single much larger cache.
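
Here's what that per-sub-cache locking *could* look like - a sketch of the
idea only (shmcb as it stands keeps the single global lock, and real mod_ssl
code would use mm's cross-process mutexes rather than pthreads):

 #include <pthread.h>

 #define NUM_SUBCACHES 16   /* must be a power of two for the mask trick */

 /* each mutex initialised elsewhere with pthread_mutex_init() */
 static pthread_mutex_t subcache_locks[NUM_SUBCACHES];

 void cache_operation(const unsigned char *session_id)
 {
     /* lock only the sub-cache this session ID maps to, so operations on
      * other sub-caches can proceed concurrently */
     unsigned int idx = session_id[0] & (NUM_SUBCACHES - 1);
     pthread_mutex_lock(&subcache_locks[idx]);
     /* ... add/get/delete on sub-cache "idx" ... */
     pthread_mutex_unlock(&subcache_locks[idx]);
 }
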
Also, shmcb uses both expiry times and session hints in the indexing structure
so that it never has to do ASN1 operations during expiries, and will virtually
never do an ASN1 decode during a "get" only to find it has decoded the wrong
session. The other major win is that expiries in shmcb run in constant time
irrespective of cache utilisation. Expiries in shmht require an iteration
across every session in the cache (including ASN1 decodes), which as mentioned
is why shmht will not do real expiry-iterations very often (resulting in wasted
space). shmcb expires old sessions in a (sub-)cache by simply incrementing the
cyclic buffer's "start" index past the expired entries, requiring no memcpy()
or memmove() operations, no ASN1 decodes, and no iterations. For this reason,
expiries happen implicitly during every single (sub-)cache operation.
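
In sketch form, assuming an index array ordered oldest-first (again, the names
are illustrative rather than shmcb's actual structures):

 #include <time.h>

 #define INDEX_SIZE 256

 typedef struct {
     time_t expiry[INDEX_SIZE];  /* expiry time of each index entry */
     unsigned int start;         /* index of the oldest live session */
     unsigned int used;          /* number of live index entries */
 } subcache_index_t;

 /* advance "start" past expired entries: no memcpy(), no ASN1 decodes,
  * and no iteration over the live sessions */
 void subcache_expire(subcache_index_t *sc, time_t now)
 {
     while (sc->used > 0 && sc->expiry[sc->start] <= now) {
         sc->start = (sc->start + 1) % INDEX_SIZE;
         sc->used--;
     }
 }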

BTW: shmht also uses a generic hash library for storing sessions in the
shared-memory segment (the hash library uses callbacks into "mm" functions to
actually malloc() and free() storage for individual items in the shared memory
segment on the fly). The result of this is that the shared memory segment can
become fragmented over time, particularly if the cache rarely empties out.
This results in progressively slower malloc() callbacks, and also wasted space
in the cache. shmcb on the other hand keeps all session data packed - each
subcache tags new session data onto the tail of the cyclic buffer and removes
old session data off the front of the cyclic buffer before moving the "front"
index forward. So fragmentation doesn't happen, and indexing tricks become a
whole tonne-load easier. The other cost of using a generic hash library is that
it is quite obviously unnecessary in the scenario of SSL/TLS session caching -
the session ID is chosen by the server during an SSL/TLS handshake using
OpenSSL's PRNG - ie. it is random. So you can use however many of the
session-ID's bytes as you like as a hash-value, and it's about as good a hash
value as you're ever going to get. :-) shmcb uses the first byte of the session
id to determine which of the sub-caches the operation applies to (whether it's
being stored into the cache, queried out of the cache by session ID, or deleted
from the cache by session ID). It also stores the second byte of the session id
in the index so that "get" operations can run a quick check for a match on the
second byte without yet doing an ASN decode of the actual session - so you will
only ASN decode the session to find the wrong session (and thus go back to
looking at indexes some more) if you are, as we say, "unlucky".
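
In code, the trick is about as simple as it sounds - a sketch under the same
caveats as before (illustrative names, not shmcb's actual functions):

 /* byte 0 of the (PRNG-generated) session ID picks the sub-cache; the
  * mask is (number_of_subcaches - 1), ie. a power of two minus one */
 unsigned int subcache_for(const unsigned char *id, unsigned int mask)
 {
     return id[0] & mask;
 }

 /* compare byte 1 against the copy stored in the index before paying
  * for an ASN1 decode - only an "unlucky" collision wastes a decode */
 int index_quick_match(unsigned char indexed_byte, const unsigned char *id)
 {
     return indexed_byte == id[1];
 }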

> There is no limit on the number of sessions cached when using the dbm cache.
> 
> I usually double the size of the session cache (1024000) and also double the
> length that a session can be cached for.  You'll want to avoid letting the
> cache get too full (over 75-80%) since the performance of the cache will
> likely start to drop at that point.

See my above points :-)

FWIW: I've also heard of some outstanding SIGBUS problems on a couple of "odd"
systems (where odd equals any commercial OS that requires sums of money and/or
hardware I don't want or can't afford, or both ;-). These SIGBUS errors are the
reason the "safe" functions are in shmcb ...

The data stored in the shared-memory segment in shmcb is kept packed, and all
pointer arithmetic inside the cache uses "sizeof()" operators. However, some
systems require pointers to be aligned on 2- or 4-byte boundaries when writing
or reading some primitive types (other than 1-byte types of course). The safe
functions were, if memory serves, written in the pseudo-form:

 some_type shmcb_get_safe_some_type(some_type *ptr) {
     some_type toreturn;
     /* go via an unsigned char pointer so the copy happens byte-by-byte */
     unsigned char *src = (unsigned char *)ptr;
     memcpy(&toreturn, src, sizeof(some_type));
     return toreturn;
 }

The point being that this would read and write "primitive" types with alignment
restrictions as memcpy() operations, ie. byte-by-byte. The problem that came up
is that gcc, in all its infinite wisdom, was in a particular case realising what
was going on, and in fact kindly optimising this function to the "equivalent"
form I was trying to avoid in the first place, ie.

 some_type shmcb_get_safe_some_type(some_type *ptr) {
     return *ptr;
 }

which of course lands the code right back in SIGBUS territory. I mention this so
that if anyone else is out there playing with this sort of code and spots a
SIGBUS - please recompile with debugging (eg. "-g -ggdb3" in gcc) and check the
back-trace of any core files - if it is a problem such as this we should fix it.
Of course, it's possible debugging may disable the "optimisation", but in the
case I saw way back, it still occurred even with debugging enabled.
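
If anyone needs a workaround in the meantime, one approach (a sketch only -
this is not what shmcb currently does, and the function name is mine) is to
force the copy through a volatile byte pointer, which the compiler cannot
legally collapse into a single, possibly unaligned, load:

 unsigned int shmcb_get_safe_uint(const void *ptr)
 {
     unsigned int toreturn;
     const volatile unsigned char *src = (const volatile unsigned char *)ptr;
     unsigned char *dst = (unsigned char *)&toreturn;
     unsigned int i;
     for (i = 0; i < sizeof(toreturn); i++)
         dst[i] = src[i];   /* volatile forces genuine byte accesses */
     return toreturn;
 }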

That's all that springs to mind right now - and much of it is operating from
distant memory (it's been a while since I looked at this stuff). Please send
me any comments or corrections.

Cheers,
Geoff



______________________________________________________________________
Apache Interface to OpenSSL (mod_ssl)                   www.modssl.org
User Support Mailing List                      [EMAIL PROTECTED]
Automated List Manager                            [EMAIL PROTECTED]
