Well, I see that you are not getting many replies, so I will step in.

What I do is run a small cron job on each server that performs a few Memcached actions:

    Set key
    Get key
    Get status
    ....

and watches the output for issues. I use plain commands like:

    echo "stats" | nc localhost 11211

This way I see all of the interaction with no client software in the way. When an error shows up in these simple commands, I send an alert to the RightScale Monitor System I use. You may have Munin or another monitoring system.

Now I see the exact moment when Memcached goes sour, which is a big hint as to why it is happening. The issues I see are:

    Out of memory errors.
    Time out on connect.
    No connect.

After a few weeks of looking at the issues, I can get all of them fixed and the code works great. It does take a few hits at first to find issues like DNS failures and setup values that are set too low.

Please ask for help. Memcache is a great cache....

Edward M. Goldberg
http://myCloudWatcher.com/
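
For anyone who wants a starting point, here is a rough sketch of the kind of set / get / stats check described above, written as a shell script around nc. It is not the exact job from this post. The key name, the 60 second expiry, the 5 second timeout, the function names and the send_alert hook mentioned in the comments are all assumptions made for illustration, and it assumes the coreutils timeout command is installed.

```shell
#!/usr/bin/env bash
# Sketch of a cron-driven memcached health check over the raw text protocol.
# Hypothetical names and values throughout; adjust to your own setup.

MC_HOST="${MC_HOST:-localhost}"
MC_PORT="${MC_PORT:-11211}"

# Send protocol text from stdin to memcached and print the reply.
# The trailing "quit" makes the server close the connection so nc exits;
# "timeout 5" bounds the whole exchange (assumed value).
mc_send() {
    { cat; printf 'quit\r\n'; } | timeout 5 nc "$MC_HOST" "$MC_PORT" | tr -d '\r'
}

# Classify the replies of one set / get / stats round trip.
# Prints "OK", or a short description of the first problem found.
classify_replies() {
    local set_reply="$1" get_reply="$2" stats_reply="$3" expected="$4"

    case "$set_reply" in
        STORED*) ;;
        *"out of memory"*) echo "SET failed: out of memory"; return 1 ;;
        "") echo "SET failed: no reply (connect error or timeout)"; return 1 ;;
        *) echo "SET failed: $set_reply"; return 1 ;;
    esac

    # The stored value must come back as a full line of the get reply.
    if ! printf '%s\n' "$get_reply" | grep -qxF -- "$expected"; then
        echo "GET failed: value not returned"
        return 1
    fi

    case "$stats_reply" in
        *"STAT uptime "*) ;;
        *) echo "STATS failed: no uptime in reply"; return 1 ;;
    esac

    echo "OK"
}

# Run one round trip against the server and classify the result.
check_memcached() {
    local key="healthcheck_$$" value
    local set_reply get_reply stats_reply
    value="ping-$(date +%s)"

    set_reply=$(printf 'set %s 0 60 %d\r\n%s\r\n' "$key" "${#value}" "$value" | mc_send)
    get_reply=$(printf 'get %s\r\n' "$key" | mc_send)
    stats_reply=$(printf 'stats\r\n' | mc_send)

    classify_replies "$set_reply" "$get_reply" "$stats_reply" "$value"
}

# Example use from the script that cron runs (send_alert is a placeholder
# for whatever your monitoring system provides):
#   result=$(check_memcached) || send_alert "memcached on $MC_HOST: $result"
```

Keeping the classification separate from the network calls means you can log the raw replies alongside the verdict, which helps when you later try to work out why a check failed at a particular moment.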
