> I have two servers in AWS.  One is a live production server (a multi site 
> WordPress installation with hundreds of sites and about 5,000 users) and the 
> other is a clone of prod that is being configured for a test server.  The 
> live one has four array servers, an Elastic
> Load Balancer and is connected to a large RDS in AWS.  And until yesterday, I 
> naively thought our caching was being handled via APC and a WordPress plugin 
> here and there.  But no.  Turns out someone here had added AWS's ElastiCache 
> to our live server.  For those not in the cloud, ElastiCache is
> essentially AWS's hosted memcache.
>
> Anyway, we tried to enable caching on our test server two days ago and it 
> introduced a really strange bug (a redirect mysteriously appeared on our live 
> site's main admin dashboard that then went to our test server).  So once we 
> realized the bug was most likely
> related to a caching system we didn't know we had, we disabled caching.  As 
> it turned out, when we enabled caching on our test server, it used the same 
> Elasticache server our live server was using (because test was a clone of 
> live).  So we disabled it when we
> removed/renamed the object-cache.php file.
>
> Disabling it solved our redirect issue, but suddenly, many (not all) of our 
> 5,000 users could no longer log into their individual sites.  For some 
> reason, the values that were in our database were not working for a good 
> percentage of users, forcing them to have to
> reset their passwords instead.  Obviously, this is huge with 5,000 users in 
> the mix.  So we reenabled caching on our live instance and decided to fix our 
> cached redirect with WP configuration changes instead (we 
> added define('RELOCATE',true); into the config to force
> the redirection to our test server to be overridden).  
>
> One of the things we noticed with memcache was that it kept updating our 
> wp_options table with the domain for the test server in place of our live 
> one.  In fact, it's still doing it whenever I run a query to find the string 
> for the test domain and update it to the
> live domain. Every few minutes, the caching changes it back. Scary. But it 
> looks like our configuration change for now forces an override.  The really 
> concerning thing about all this was the fact that it seems memcache is 
> drawing from its own key:value pairs for the
> user passwords instead of directly from the database.  I mean with caching 
> enabled, the users can get in.  Without it, many users are forced to reset 
> their passwords.
>
> Does anyone have any ideas for me as to how to effectively understand what's 
> going on with memcache in this case and how to fix it so the database gets 
> written to appropriately and so password info isn't just being held in the 
> cache?  To my thinking it's a ticking
> time bomb.  All it would take is one flush_all command to make life very, 
> very painful for most of my users.
>
> We are on Nginx with MySQL on the RDS.

This sounds like an issue with WordPress's usage of memcached, not
something about using memcached itself. You'll probably get a lot further
asking WordPress people who are familiar with how it uses memcached.

Memcached itself doesn't talk to any databases nor does it have any logic,
it's just a binary key/value blob store with an API.
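
To make that concrete, here is a minimal sketch in Python (not WordPress code, and not a real memcached client) of how two applications sharing one key/value store can end up reading each other's values. The `KeyValueStore` and `App` classes, the `siteurl` option name, the example domains, and the `key_prefix` parameter are all hypothetical stand-ins chosen for illustration; the real behavior depends on whatever object-cache drop-in your install uses.

```python
class KeyValueStore:
    """Toy stand-in for memcached: opaque bytes stored under string keys.

    It has no idea who wrote a value or what it means, and never touches
    a database. All of that logic lives in the application.
    """

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def flush_all(self):
        self._data.clear()


class App:
    """Hypothetical application doing cache-aside reads of its options."""

    def __init__(self, name, cache, database, key_prefix=""):
        self.name = name
        self.cache = cache
        self.database = database      # plain dict standing in for MySQL
        self.key_prefix = key_prefix  # assumption: per-environment namespace

    def get_option(self, option):
        key = self.key_prefix + option
        cached = self.cache.get(key)
        if cached is not None:
            return cached.decode()    # cache hit: the database is never asked
        value = self.database[option]  # cache miss: read the database
        self.cache.set(key, value.encode())
        return value


LIVE_DB = {"siteurl": "https://live.example.com"}
TEST_DB = {"siteurl": "https://test.example.com"}

# Both environments point at the same cache and use the same keys.
shared = KeyValueStore()
live = App("live", shared, LIVE_DB)
test = App("test", shared, TEST_DB)

test.get_option("siteurl")              # miss: test fills the shared cache
live_sees = live.get_option("siteurl")  # hit: live reads test's value

# Same shared cache, but each environment namespaces its keys.
shared.flush_all()
isolated_live = App("live", shared, LIVE_DB, key_prefix="live:")
isolated_test = App("test", shared, TEST_DB, key_prefix="test:")

isolated_test.get_option("siteurl")
live_isolated_sees = isolated_live.get_option("siteurl")

print(live_sees)           # test's URL leaked into live
print(live_isolated_sees)  # live's own URL
```

The store does exactly what it is told in both runs; the difference is entirely in how the application builds its keys and when it decides to trust a cached value over the database. Giving the test clone its own cache instance, or a distinct key prefix, is the kind of fix to look for on the WordPress side.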
