I would hope that your datacenters are connected by some form of private
link...
Anyway, I *do* in fact like keeping the mysql updates paired with the
memcached updates. This only applies if your remote datacenters are
allowed to update their cache using local read slaves (which they should,
I'd guess). Odds are whatever other system you implement will at least
sometimes replay a remote mysql delete before the mysql update hits the
remote datacenter. This lag is also why Facebook's article talks about
having that cookie, which pins you to your active datacenter for a few
seconds after an update...
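
A rough sketch of that pinning idea in Python, in case it helps. This is not
Facebook's code: the cookie name, the function names, and the 20-second
window are all made up for illustration, and the window just needs to be
longer than your usual replication lag.

```python
import time

# Assumed value for illustration; pick something above typical replication lag.
PIN_SECONDS = 20


def pin_after_write(cookies, now=None):
    """Record the time of the user's last write in a (hypothetical) cookie."""
    now = time.time() if now is None else now
    cookies["last_write"] = str(now)
    return cookies


def choose_datacenter(cookies, local_dc, master_dc, now=None):
    """Send recent writers to the master datacenter, everyone else locally."""
    now = time.time() if now is None else now
    try:
        last_write = float(cookies.get("last_write", ""))
    except ValueError:
        # No cookie or a garbled one: nothing to wait for, stay local.
        return local_dc
    if now - last_write < PIN_SECONDS:
        # The local read slave may not have replayed this user's write yet.
        return master_dc
    return local_dc
```

A user who just wrote keeps reading from the master side until the window
expires, so they never see their own update go missing from a lagging slave.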
There's some text on this... was hoping to put up a nice FAQ entry or post
on it, but.. time... urgh.. anyway.
Instead of modifying the MySQL grammar to do it, you can use the
libmemcached MySQL UDFs by Brian and Patrick and embed DELETE or SET
commands in with your INSERTs. So long as all of your clients are
libmemcached-based, your server lists should hash correctly. It's a little
complicated but doable.
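
One way this could look, sketched as a small Python helper that generates a
trigger calling the memc_delete() UDF. Assumptions: the UDFs are installed on
every MySQL server, each server has had memc_servers_set() called with its
local memcached list, and the table name, key prefix and column here are
hypothetical. Whether the trigger fires on your slaves depends on your
replication setup, so test that before relying on it.

```python
def invalidation_trigger_sql(table, key_prefix, event="UPDATE", id_column="id"):
    """Build a CREATE TRIGGER statement that deletes a cache key via memc_delete().

    Identifiers are interpolated directly into the SQL, so only pass
    trusted constants here, never user input.
    """
    if event not in ("INSERT", "UPDATE"):
        # DELETE triggers only expose OLD.<column>, which this sketch skips.
        raise ValueError("only INSERT and UPDATE expose NEW.<column>")
    return (
        "CREATE TRIGGER {table}_memc_{ev} AFTER {event} ON {table} "
        "FOR EACH ROW SET @memc_ignored = "
        "memc_delete(CONCAT('{prefix}', NEW.{col}))"
    ).format(
        table=table,
        ev=event.lower(),
        event=event,
        prefix=key_prefix,
        col=id_column,
    )


if __name__ == "__main__":
    # Hypothetical table "users" cached under keys like "user:42".
    print(invalidation_trigger_sql("users", "user:"))
```

The point of doing it inside MySQL is that the cache delete travels with the
row change instead of racing it through some separate channel.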
-Dormando
On Thu, 11 Sep 2008, Chris wrote:
Another "nice to have" - security. Each datacenter has its own
private network, and memcached only listens on the private networks.
It would be nice to configure any services listening on public
interfaces to only accept connections from a specific IP address
range, or only from a list of users who authenticate themselves with a
signed certificate.
ActiveMQ may accomplish these goals - I'll take a look.
On Sep 11, 3:44 pm, "Gavin M. Roy" <[EMAIL PROTECTED]> wrote:
You could use something like Apache ActiveMQ and consumer scripts to
accomplish this. You could support the whole memcache grammar and have a
consumer that just repeats commands into distributed memcached clusters.
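
Roughly what I mean by a consumer, sketched in Python. To keep it
self-contained, queue.Queue stands in for the ActiveMQ destination and
FakeCluster stands in for a real memcached client; the JSON message format
and all the names are made up for the example, and only set and delete are
handled rather than the whole grammar.

```python
import json
import queue


class FakeCluster:
    """In-memory stand-in for a memcached client talking to one cluster."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


def replay(message, clusters):
    """Apply one JSON-encoded command to every cluster."""
    cmd = json.loads(message)
    op = cmd["op"]
    if op not in ("set", "delete"):
        # Reject before touching any cluster so nothing is half-applied.
        raise ValueError("unsupported op: %r" % op)
    for cluster in clusters:
        if op == "set":
            cluster.set(cmd["key"], cmd["value"])
        else:
            cluster.delete(cmd["key"])


def consume(q, clusters):
    """Drain the queue, replaying each message; returns how many were applied."""
    count = 0
    while True:
        try:
            message = q.get_nowait()
        except queue.Empty:
            return count
        replay(message, clusters)
        count += 1
```

A real consumer would block on the broker rather than drain and return, and
would need to decide what to do when one cluster is unreachable.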
Regards,
Gavin
On Thu, Sep 11, 2008 at 3:27 PM, Chris <[EMAIL PROTECTED]> wrote:
I was wondering if anyone had any better solutions for cache
consistency with geographically distributed memcached clusters.
The problem: Having just one big memcached cluster is great if you
only have one datacenter, but if you have datacenters in a couple of
different locations around the world, latency becomes a big problem.
Making a couple of memcached queries from US -> Europe for a single
client request can make page loads unacceptably slow.
Our current solution to this problem is to have multiple memcached
clusters, one for each geographic region (Europe/US/Asia).
Unfortunately, keeping them in sync with the underlying data (mysql,
using replication) is an unpleasant problem.
Facebook had a solution to this that they wrote about on their
engineering blog (http://www.facebook.com/note.php?note_id=23844338919
). They modified the mysql query grammar to support a list of keys to
invalidate.
Does anyone have any other interesting solutions to this problem?
(Keeping in mind that "only using one memcached cluster" likely won't
work because there is too much latency)