socache/shmap & ZooKeeper provider

Chris Darroch Thu, 01 May 2008 14:30:52 -0700

Hi --

  I wanted to get a little experience with the "socache" (small object
cache) providers which Joe Orton recently refactored out of mod_ssl
and so I've written a pair of modules, mod_shmap and mod_socache_zookeeper,
which are available here:


http://people.apache.org/~chrisd/projects/shared_map/

  The mod_shmap module allows HTTP clients to access any configured
socache provider's storage using GET, PUT, and DELETE requests; the
URI path is used as the ID key.  This might be a useful way for
clients without a native API to access these providers.  My immediate
purpose, though, was to write a module that served as a test harness
for the various providers and an example of how to use them without
making people slog through mod_ssl.
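
  As a rough illustration of the method-to-operation mapping described
above, here is a minimal sketch in plain C.  The enum and function names
are hypothetical and are not taken from mod_shmap; the sketch only shows
how GET, PUT, and DELETE could each select one provider call.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for illustration; not taken from mod_shmap. */
typedef enum { OP_RETRIEVE, OP_STORE, OP_DELETE, OP_UNSUPPORTED } shmap_op;

/* Pick the provider operation that an HTTP method would trigger. */
shmap_op op_for_method(const char *method)
{
    assert(method != NULL);
    if (strcmp(method, "GET") == 0)    return OP_RETRIEVE;
    if (strcmp(method, "PUT") == 0)    return OP_STORE;
    if (strcmp(method, "DELETE") == 0) return OP_DELETE;
    return OP_UNSUPPORTED;   /* e.g. POST: no matching provider call */
}
```

  In this sketch the URI path would be passed unchanged as the ID key
to whichever operation is selected.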

  The mod_socache_zookeeper module is an experimental socache provider
that uses ZooKeeper as its data store.[1][2]  ZooKeeper is a distributed,
highly reliable coordination service similar to Google's Chubby lock
service.[3]  Like Chubby, it appears to implement the Paxos consensus
algorithm.[4][5][6]  (ZooKeeper's documentation and code comments are
a little sketchy in this regard, however.)

  A key caveat about mod_socache_zookeeper is that it simply ignores
expiry times at the moment.  An enhancement would be to have a
background thread that periodically culled expired nodes.
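
  The culling pass such a background thread might run can be sketched
as follows.  This is only an outline under stated assumptions: node_t is
a hypothetical in-memory stand-in, and a real implementation would have
to record each node's expiry time alongside its data in ZooKeeper and
delete nodes through the ZooKeeper client rather than flipping a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical in-memory stand-in for a stored node. */
typedef struct {
    const char *id;
    time_t expiry;   /* absolute expiry time */
    int in_use;      /* non-zero while the node exists */
} node_t;

/* One culling pass: drop every live node whose expiry time has passed.
 * Returns the number of nodes dropped. */
size_t cull_expired(node_t *nodes, size_t n, time_t now)
{
    size_t culled = 0;
    assert(nodes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].in_use && nodes[i].expiry <= now) {
            nodes[i].in_use = 0;   /* stand-in for deleting the node */
            culled++;
        }
    }
    return culled;
}
```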

  Here's an example configuration that maps all HTTP requests to
ZooKeeper, except for those under /shm, which go to a shared-memory
cyclic buffer cache:

SharedMapProvider zookeeper zk1.example.com:7000

<Location /shm>
   SharedMapProvider shmcb /tmp/shm
</Location>

SetHandler shmap

  Based on the experience of writing these modules, I have a few
thoughts and notes for discussion, in no particular order.


  I confess that "socache" still sits oddly with me as a name,
both because of its similarity to mod_so and .so files, and because
I continue to doubt everyone will treat these providers as always
implementing caches only.  It's true that some providers will
always impose data size limits, but that could be something the
caller can interrogate and reject or require as necessary.
It would also be valuable, I think, to disambiguate these providers
from the different functionality of mod_cache and its related modules.
So I'd again suggest modules/shmap (for "shared map") as a possible
location and naming scheme.

  Another very minor naming issue is that AP_SOCACHE_FLAG_NOTMPSAFE
reads as "no temp safe" on first glance; perhaps NOT_MP_SAFE
or NOT_ATOMIC would be more readable?

  I ran into three particular inconsistencies when coding which
I think could be addressed quickly:

a) The store() call should take an apr_pool_t argument like
  retrieve() and delete() for temporary allocations.

b) The delete() call should return an apr_status_t like the other
  two, since complex providers may fail here.

c) All providers should always return APR_NOTFOUND from retrieve() and
  delete() when data is not found.  Currently at least the shmcb
  provider returns APR_EGENERAL in this case, which makes it
  impossible to distinguish the "not found" case from serious errors.
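
  Taken together, points (a) through (c) suggest an interface shaped
roughly like the sketch below.  It is written in plain C with stand-in
types: status_t, pool_t, provider_t, and the ST_* codes are hypothetical
substitutes for apr_status_t, apr_pool_t, the provider structure, and the
APR status values, and the single-slot "provider" exists only to make the
example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for apr_status_t codes and apr_pool_t (assumptions). */
typedef int status_t;
enum { ST_SUCCESS = 0, ST_NOTFOUND = 1, ST_EGENERAL = 2 };
typedef struct pool pool_t;   /* opaque; never dereferenced here */

/* All three calls take a temporary pool and return a status. */
typedef struct {
    status_t (*store)(void *ctx, const char *id, const char *data, pool_t *tmp);
    status_t (*retrieve)(void *ctx, const char *id, const char **data, pool_t *tmp);
    status_t (*del)(void *ctx, const char *id, pool_t *tmp);
} provider_t;

/* Toy provider holding a single entry. */
typedef struct { char id[64]; char data[64]; int used; } slot_t;

static status_t slot_store(void *ctx, const char *id, const char *data, pool_t *tmp)
{
    slot_t *s = ctx;
    (void)tmp;
    assert(s != NULL && id != NULL && data != NULL);
    if (strlen(id) >= sizeof s->id || strlen(data) >= sizeof s->data)
        return ST_EGENERAL;            /* a genuine failure */
    strcpy(s->id, id);
    strcpy(s->data, data);
    s->used = 1;
    return ST_SUCCESS;
}

static status_t slot_retrieve(void *ctx, const char *id, const char **data, pool_t *tmp)
{
    slot_t *s = ctx;
    (void)tmp;
    if (!s->used || strcmp(s->id, id) != 0)
        return ST_NOTFOUND;            /* distinct from ST_EGENERAL */
    *data = s->data;
    return ST_SUCCESS;
}

static status_t slot_del(void *ctx, const char *id, pool_t *tmp)
{
    slot_t *s = ctx;
    (void)tmp;
    if (!s->used || strcmp(s->id, id) != 0)
        return ST_NOTFOUND;
    s->used = 0;
    return ST_SUCCESS;
}

const provider_t slot_provider = { slot_store, slot_retrieve, slot_del };
```

  With this shape a caller can treat ST_NOTFOUND as an ordinary miss
and reserve error handling for everything else.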

  Another minor problem is that many of the error messages in
the providers betray their mod_ssl origins, such as
"SSLSessionCache: Failed to Create Server" and so forth.

  Following my instinct that some users may not care about the
caching/expiry side of things, I think allowing expiry = 0 to mean
"retain as long as possible" would be useful.

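  The proposed expiry = 0 convention amounts to a one-line special
case in whatever predicate a provider uses to decide that an entry is
stale.  A minimal sketch, assuming a hypothetical entry_t that records
an absolute expiry time:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical entry record; expiry == 0 means "retain as long as possible". */
typedef struct { time_t expiry; } entry_t;

/* Returns 1 if the entry should be treated as expired, else 0. */
int entry_expired(const entry_t *e, time_t now)
{
    assert(e != NULL);
    if (e->expiry == 0)
        return 0;              /* never expires under the proposed convention */
    return e->expiry <= now;
}
```
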
  The namespace and hints arguments to the init() call are
somewhat underused and also rather specific to the existing providers.
I briefly thought I might be able to pass reslist min/max values
in the hints but the ap_socache_hints structure isn't the right
place; instead they'd need to be packed into the single string
argument passed to create().  At the moment the memcached provider
just hard-codes these values; so does my ZooKeeper provider.
I wonder if there's a way to open this up a little and make
per-provider-instance configuration easier, but I don't have
a specific idea here.
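
  For illustration only, packing such values into the create() string
might look like the sketch below.  The "server min=N max=N" syntax, the
zk_args_t structure, and the default values are all invented for this
example; neither provider accepts such a string today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical parsed form of a create() argument string. */
typedef struct { char server[128]; int min; int max; } zk_args_t;

/* Parse "server[ min=N[ max=N]]"; returns 0 on success, -1 on bad input. */
int parse_create_arg(const char *arg, zk_args_t *out)
{
    assert(arg != NULL && out != NULL);
    out->min = 1;    /* invented defaults, used when the options are omitted */
    out->max = 4;
    if (sscanf(arg, "%127s min=%d max=%d",
               out->server, &out->min, &out->max) < 1)
        return -1;   /* no server name at all */
    if (out->min < 0 || out->max < out->min)
        return -1;   /* nonsensical pool bounds */
    return 0;
}
```

  A call such as parse_create_arg("zk1.example.com:7000 min=2 max=8", &a)
would fill in all three fields, while a bare server name keeps the
defaults.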

  In a related vein, I think a naive user is going to find invoking
create() and init() at the right time a little tricky.  The
create() calls usually create an instance structure and parse
arguments, but don't otherwise initialize.  In order for their
messages to make it to the console at startup time, one needs
to invoke create() in the check_config phase, not post_config,
since by then stderr is redirected to the logs.

  In mod_ssl, create() is called with s->process->pool, which
means that if a provider is later unloaded, the structures it
allocates in create() remain around forever.  Global mutexes
are also created out of s->process->pool, and similarly remain
around indefinitely.  In mod_shmap I use pconf exclusively to
try to iron out these issues.

  Meanwhile, init() should ideally be called only during the
second and subsequent configuration passes, so you need to
do some magic with userdata in s->process->pool to avoid the
initial configuration pass.  Here pconf is used by mod_ssl and
mod_shmap, and that's good, except that it does introduce some
potential (if unlikely) interactions between graceful restarts
and shared-memory segments on some platforms.  Specifically,
if APR is using a named segment from shmget(), and ftok() returns
a different key after a graceful restart, then new processes
are attached to a new segment, while lingering processes from the
previous generation write to the old segment.  This is really
a complexity with shared memory, I think, and should be addressed
(if at all) within the provider; pconf is still the right pool
to pass to init() generally, I believe.
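
  The userdata "magic" mentioned above usually takes the following
shape.  In a real module the marker would be read and written with
apr_pool_userdata_get() and apr_pool_userdata_set() on s->process->pool;
the sketch below replaces that with a hypothetical one-slot structure and
an invented key name so that it compiles without APR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One-slot stand-in for userdata attached to the process pool (assumption). */
typedef struct { const char *key; const void *data; } userdata_slot_t;

/* Returns 0 on the first configuration pass (skip init()) and
 * 1 on every later pass (safe to call init()). */
int should_run_init(userdata_slot_t *ppool)
{
    static const char *marker_key = "shmap_init_marker";   /* invented key */
    assert(ppool != NULL);
    if (ppool->key == NULL || strcmp(ppool->key, marker_key) != 0) {
        /* First pass: leave a marker that survives the reload of pconf. */
        ppool->key = marker_key;
        ppool->data = marker_key;
        return 0;
    }
    return 1;
}
```

  The marker has to live in the process pool rather than pconf because
pconf is cleared between the two passes.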

  Finally, just as a note, I used some tricks from mod_dbd in
mod_socache_zookeeper.  In particular, rather than opening connections
in init(), init() just creates a singly-linked list of instances,
and a child_init hook then creates a reslist of connections to
ZooKeeper in each child process.  This makes destroy() a no-op,
among other things.  However, it does require working around a problem
that mod_dbd used to have: avoiding both leaked resource structures and
double-free segfaults on shutdown.  This problem stems from the
fact that resources' sub-pools are destroyed prior to the reslist's
cleanup function being invoked, at which point the cleanup then invokes
each resource's destructor.[7][8]


  OK, well, I hope someone gets some utility out of these things,
and please email me with any bugs or suggestions.

Chris.

[1] http://zookeeper.sourceforge.net/
[2] http://zookeeper.wiki.sourceforge.net/
[3] http://labs.google.com/papers/chubby.html
[4] http://labs.google.com/papers/paxos_made_live.html
[5] http://research.microsoft.com/users/lamport/pubs/pubs.html#paxos-simple
[6] http://en.wikipedia.org/wiki/Paxos_algorithm
[7] http://mail-archives.apache.org/mod_mbox/apr-dev/200612.mbox/[EMAIL PROTECTED]
[8] http://mail-archives.apache.org/mod_mbox/apr-dev/200609.mbox/[EMAIL PROTECTED]

--
GPG Key ID: 366A375B
GPG Key Fingerprint: 485E 5041 17E1 E2BB C263  E4DE C8E3 FA36 366A 375B
