The machine isn't actually swapping. I'll try to "catch" it happening next time and see if I can get more information about the connections in use, and I'll also look into upgrading to 1.4.1 in the hope that that helps.
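
For reference, here's a rough sketch of what I'm planning to run when I do catch it (assuming netstat and vmstat are available on these boxes and memcached is listening on 11211; adjust the port to match your setup):

# Connections to memcached, broken down by TCP state
# (TIME_WAIT, CLOSE_WAIT, ESTABLISHED, ...)
netstat -ant | awk '$5 ~ /:11211$/ {state[$6]++} END {for (s in state) print s, state[s]}'

# Overall TIME_WAIT count on the box, to compare against the ephemeral port pool
netstat -ant | grep -c TIME_WAIT

# Confirm whether the box is actually swapping at the time
vmstat 1 5

The per-state breakdown should at least make it obvious whether this looks like TIME_WAIT/ephemeral port exhaustion or something else entirely.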
On Sep 15, 6:19 pm, Vladimir <[email protected]> wrote:
> I do question whether those would actually cause load to spike up.
> Perhaps connection refused, but I suspect those two, i.e. load spike and
> connection refused, are linked. Please correct me if I am wrong. I just
> checked my tcp_time_wait metrics and they peak around 600 even during
> these load spikes.
>
> Eric Day wrote:
> > If you discover this is a TIME_WAIT issue (too many TCP sockets
> > waiting around in the kernel), you can tweak this in the kernel:
> >
> > # cat /proc/sys/net/ipv4/tcp_fin_timeout
> > 60
> >
> > # cat /proc/sys/net/ipv4/ip_local_port_range
> > 32768 61000
> >
> > 61000 - 32768 = 28232
> >
> > (These are the defaults on Debian Linux.)
> >
> > So you only have a pool of 28232 sockets to work with, and each will
> > linger around for 60 seconds in a TIME_WAIT state even after being
> > close()d on both ends. You can increase your port range and lower
> > your TIME_WAIT value to buy yourself a larger window. Something to
> > keep in mind, though, for any clients/servers that have a high
> > connect rate.
> >
> > -Eric
> >
> > On Tue, Sep 15, 2009 at 08:48:39PM -0400, Vladimir wrote:
> >> Too many connections in CLOSE_WAIT state?
> >>
> >> Anyway, I would highly recommend installing something like Ganglia
> >> to get some kind of metrics.
> >>
> >> Also, at a load of 35-50 the machine is not doing much other than
> >> swapping.
> >>
> >> Stephen Johnston wrote:
> >>
> >> This is a total long shot, but we spent a lot of time figuring out a
> >> similar issue that ended up being ephemeral port exhaustion.
> >>
> >> Stephen Johnston
> >>
> >> On Tue, Sep 15, 2009 at 8:27 PM, Vladimir <[email protected]> wrote:
> >>
> >> nsheth wrote:
> >>
> >> About once a day, usually during peak traffic times, I hit some
> >> major load issues. I'm running memcached on the same boxes as my
> >> webservers. Load usually spikes to 35-50, and I see the Apache error
> >> log flooded with messages like the following:
> >>
> >> [Sun Sep 13 14:54:34 2009] [error] [client 10.0.0.2] PHP Warning:
> >> memcache_pconnect() [<a href='function.memcache-pconnect'>
> >> function.memcache-pconnect</a>]: Can't connect to 10.0.0.5:11211,
> >> Unknown error (0) in /var/www/html/memcache.php on line 174,
> >> referer: xxxx
> >>
> >> Any thoughts? Restart Apache, and everything clears up.
> >>
> >> It's PHP. I have seen something similar, but in the last couple of
> >> weeks it has "cleared" itself. It could be coincidental with using
> >> memcached 1.4.1, code changes, etc. I actually have some Ganglia
> >> snapshots of the behavior you are describing here:
> >>
> >> http://2tu.us/pgr
> >>
> >> The reason load goes to 35-50 is that Apache starts consuming
> >> greater and greater amounts of memory, indicating a PHP memory leak.
> >> Granted, it could also have something to do with session garbage
> >> collection.
> >>
> >> I'm running memcached 1.2.5 currently (which looks to be a bit out
> >> of date at this point, so perhaps an update is in order).
> >>
> >> I think that would be a wise choice.
> >> Vladimir
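
And for anyone who finds this thread later: if it does turn out to be ephemeral port exhaustion, my reading of Eric's suggestion comes down to something like the following (just a sketch; these are system-wide settings, so the values below are examples, not recommendations):

# Widen the ephemeral port range from the 32768-61000 default
# (example values only)
echo "15000 65000" > /proc/sys/net/ipv4/ip_local_port_range

# Lower tcp_fin_timeout from the 60-second default
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout

# Or make it persistent across reboots in /etc/sysctl.conf:
#   net.ipv4.ip_local_port_range = 15000 65000
#   net.ipv4.tcp_fin_timeout = 30

I'm not 100% sure how much lowering tcp_fin_timeout actually buys for sockets stuck in TIME_WAIT, so I'd measure the socket-state counts before and after any change.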
