We're testing version 1.5.10 as a possible upgrade candidate for our older
memcached servers, using a pool of 9 servers.  They are running in parallel
with the production pool, also 9 servers.  For the test, all read requests
go to the production pool, and all updates (set, delete, etc.) are sent to
one server in the production pool and one server in the 1.5.10 pool, each
selected via the key hashing algorithm.
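
To make the traffic split concrete, here is a rough sketch of the routing
described above.  The host names and the hash (MD5 modulo pool size) are
placeholders chosen for illustration only; the real client's hashing
algorithm is not specified here and may well differ (e.g. consistent
hashing).

```python
import hashlib

# Hypothetical host names; the real pools are not listed in this message.
PROD_POOL = [f"prod-{i}:11211" for i in range(9)]
TEST_POOL = [f"test-{i}:11211" for i in range(9)]  # the 1.5.10 pool


def pick(pool, key):
    """Pick one server for a key (assumed scheme: MD5 of key, modulo pool size)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return pool[int.from_bytes(digest[:4], "big") % len(pool)]


def route(op, key):
    """Return the servers that should receive this operation."""
    if op == "get":
        # Reads only ever touch production.
        return [pick(PROD_POOL, key)]
    # Updates (set, delete, ...) go to one server in each pool.
    return [pick(PROD_POOL, key), pick(TEST_POOL, key)]
```

With this layout the 1.5.10 pool sees only write traffic, which may matter
when comparing its connection behavior against production.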

That setup had been running without incident for about 12 days, then
yesterday two of the servers experienced a mass of CLOSE_WAIT connections
similar to what's been described here.  We were able to collect some data,
but not enough to figure out what's happening.  So I'm hoping to kickstart
a discussion here about how to diagnose what's going on.  Until we can find
a way to explain (and prevent) another problem like this, we're unable to
upgrade.
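
As one possible starting point for collecting more data next time, a small
script along these lines could tally CLOSE_WAIT sockets per client address
on an affected server.  It is only a sketch: it assumes Linux
(/proc/net/tcp), IPv4, a little-endian host, and the default memcached port
11211, none of which are stated above.

```python
import socket
import struct
from collections import Counter

CLOSE_WAIT = "08"  # TCP state code for CLOSE_WAIT in /proc/net/tcp


def hex_to_ip(hex_addr):
    """Decode an IPv4 address from /proc/net/tcp (assumes little-endian host)."""
    return socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))


def count_close_wait(proc_net_tcp_text, local_port=11211):
    """Count CLOSE_WAIT sockets on local_port, grouped by remote IP."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        local, remote, state = fields[1], fields[2], fields[3]
        if state != CLOSE_WAIT:
            continue
        if int(local.split(":")[1], 16) != local_port:
            continue
        counts[hex_to_ip(remote.split(":")[0])] += 1
    return counts
```

Running `count_close_wait(open("/proc/net/tcp").read())` periodically and
logging the result would show whether the stuck connections come from a few
clients or from all of them.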

I can provide more information about our configuration.  I'm just not sure 
what bits are useful/interesting.  I will note that we're using "extstore" 
functionality on the new servers.

-jj
