On Tue, Feb 23, 2010 at 7:46 PM, Henrik Schröder <[email protected]> wrote:

> Yes, if you use automatic recovery from failover your cache can get
> unsynchronized as different parts of your application discover that a
> previous failing server is now back up at different points in time.
>
> If synchronization is very important to your application, make sure you
> don't use automatic recovery from failover, and you won't get this problem.
> The flipside to that is that when you want to put back servers in the
> cluster, you need to restart your application so that all parts of it get
> the updated server list at the same time.
>
> Another way of solving the problem is to not use failover at all. If your
> application is fine with more cache misses as long as one of your cache
> servers is down, then that solution is the best. You will never have
> synchronization problems, and you don't have to restart your application to
> bring back servers into the cluster.
>

Another option is to make your configuration dynamically reloadable.  As
long as your client code doesn't hold any instance for longer than a
request, it should be fairly easy to change at runtime.
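
As a rough sketch of what I mean (Python; ServerListHolder, FakeClient and
handle_request are made-up names for illustration, and FakeClient is only a
stand-in for whatever real memcached client you use): keep the server list
in one swappable place and build the client from the current snapshot at the
start of each request.

```python
import threading


class ServerListHolder:
    """Holds the current server list; reads and swaps happen under a lock."""

    def __init__(self, servers):
        self._lock = threading.Lock()
        self._servers = tuple(servers)

    def get(self):
        with self._lock:
            return self._servers

    def reload(self, servers):
        # Replace the whole list at once; requests never see a partial update.
        with self._lock:
            self._servers = tuple(servers)


class FakeClient:
    """Hypothetical stand-in for a real client; it only records its servers."""

    def __init__(self, servers):
        self.servers = servers


def handle_request(holder):
    # The client lives no longer than the request, so the next request
    # picks up whatever list is current at that time.
    client = FakeClient(holder.get())
    return client.servers
```

Because nothing holds a client across requests, a reload takes effect on the
very next request without restarting the app.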

We built a configuration mechanism in our system such that any app we bring
up listens on an "admin configuration" topic in our message queue, so all we
need to do is push a new config to all machines and issue a ReloadMcConfig
command on the queue. All clients will be dynamically updated with the new
config within a few seconds and synchronization problems are basically
nonexistent.
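
The listener side could look roughly like this. It is only a sketch under
assumptions: Python's in-process queue.Queue stands in for the real message
queue topic, and admin_listener, load_servers and apply_servers are
hypothetical names, not our actual code. Only the ReloadMcConfig command name
comes from our setup.

```python
import queue
import threading


def admin_listener(topic, load_servers, apply_servers):
    """Consume admin commands from the topic until a None sentinel arrives."""
    while True:
        command = topic.get()
        if command is None:
            break  # shutdown signal for this sketch
        if command == "ReloadMcConfig":
            # Re-read the pushed config and hand the new list to the app.
            apply_servers(load_servers())
        # Any other command is ignored here.


if __name__ == "__main__":
    topic = queue.Queue()  # stand-in for the "admin configuration" topic
    worker = threading.Thread(
        target=admin_listener,
        args=(topic, lambda: ["cache1:11211", "cache3:11211"], print),
    )
    worker.start()
    topic.put("ReloadMcConfig")
    topic.put(None)
    worker.join()
```

Every app instance runs the same listener, so one command on the topic
reaches all of them at about the same time.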

> In reality though, I have never seen a memcached server crash in production.
> It is very, very stable. Normally you don't have to worry about what happens
> if one server goes down, because they never do.
>

Yes, it's very, very rare that memcached fails. I don't know that memcached
itself has ever actually crashed on us in production, though we have had
hardware failures or the like.  More often than not, we only use this
mechanism to add new machines to the queue or to alter pool setups.

As an aside, is there an FAQ entry anywhere about this synchronization
scenario?  It seems like almost everybody who's first introduced to
memcached jumps through these same mental hoops and thinks they've found a
fatal flaw in the design.  I feel like it'd be advantageous if there were a
help item somewhere that explained how memcached works as well as it does
precisely because machines are completely unaware of each other; simplicity
and consistency are the keys (pardon the pun) to memcached.
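
To illustrate the point for such an FAQ, here is a deliberately simplified
sketch (Python). The crc32-modulo scheme and the name pick_server are
assumptions for illustration; real clients use their own hashing schemes,
some of them consistent hashing. The client alone decides which server owns
a key, using nothing but the key and its own server list.

```python
import zlib


def pick_server(key, servers):
    """Map a key to one server using only the key and this client's list."""
    index = zlib.crc32(key.encode("utf-8")) % len(servers)
    return servers[index]


if __name__ == "__main__":
    full = ["cache1:11211", "cache2:11211", "cache3:11211"]
    reduced = ["cache1:11211", "cache2:11211"]  # a client that dropped cache3

    # Two clients with the same list always agree on where a key lives.
    print(pick_server("user:42", full) == pick_server("user:42", list(full)))

    # Clients with different lists can disagree, which is the
    # synchronization scenario discussed in this thread.
    moved = [k for k in (f"key{i}" for i in range(100))
             if pick_server(k, full) != pick_server(k, reduced)]
    print(len(moved), "of 100 keys map to different servers")
```

The servers never talk to each other; agreement comes entirely from every
client having the same list.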

-- 
awl
