How did you notice the swapping? I struggled with a similar issue; we finally disabled one of our monitoring tools (Hyperic HQ), which solved the problem, but we never found out why or how the monitoring tool caused it. Anything you find out would be helpful.
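One way to check whether a process has been partly swapped out on Linux is to read the VmSwap field from /proc/<pid>/status. The following is only a minimal sketch, assuming a kernel recent enough to expose VmSwap; the function names `parse_vmswap_kb` and `swapped_kb` are made up for illustration and are not part of any tool mentioned in this thread.

```python
from pathlib import Path
from typing import Optional


def parse_vmswap_kb(status_text: str) -> Optional[int]:
    """Return the VmSwap value in kB from /proc/<pid>/status text, or None if absent."""
    for line in status_text.splitlines():
        if line.startswith("VmSwap:"):
            # The line looks like "VmSwap:\t     128 kB"; the number is the second field.
            return int(line.split()[1])
    return None


def swapped_kb(pid: int) -> Optional[int]:
    """Read how many kB of the given process are currently swapped out (Linux only)."""
    return parse_vmswap_kb(Path(f"/proc/{pid}/status").read_text())
```

Calling `swapped_kb` with the pid stored in the memcached pid file (the path given with -P at startup) and getting a nonzero value would suggest that part of the process has been swapped out. A `None` result means the kernel does not report VmSwap, not that nothing is swapped.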
Peter

On Mon, Apr 12, 2010 at 1:52 AM, Ryan Tomayko <[email protected]> wrote:

> On Wed, Apr 7, 2010 at 3:02 AM, Ryan Tomayko <[email protected]> wrote:
> > On Wed, Apr 7, 2010 at 2:14 AM, dormando <[email protected]> wrote:
> >> Just about all responses should happen sub-ms (excepting for network
> >> jitter).
> >
> > Thanks for the quick response.
> >
> >> Some stuff you can check for offhand:
> >>
> >> - List versions of all related software you're running (memcached proper,
> >> libmemcached, ruby client)
> >
> > memcached 1.4.0 amd64 built from custom debian package
> > memcached ruby lib is 0.17.3 (bundles libmemcached 0.32)
> >
> > I used the memslap from libmemcached 0.38 in my benchmarks
> >
> >> - Your full startup arguments to memcached
> >
> > /usr/bin/memcached -d -m 12288 -c 200 -l 172.17.0.139 -p 11211 -U
> > 11211 -P /var/run/memcached.pid -u nobody
> >
> >> - Narrow down if these timeouts happen if it's initiating a new connection
> >> to memcached, or when reusing a persistent connection, or both (may not be
> >> easy).
> >
> > Hmm. I'll need to give this some thought. Let me tackle this tomorrow.
> >
> >> - If your memcached is (hopefully) new enough, is 'listen_disabled_num'
> >> under the `stats` command nonzero? If so, you're hitting maxconns and
> >> memcached is blocking new connections until old ones disconnect. Seems
> >> unlikely for your case.
> >
> > Yep. I checked listen_disabled_num during my tests and frequently in
> > the past because -c 200 always seemed low to me. I've never seen it at
> > anything but 0 and increasing the connections doesn't seem to affect
> > the tests.
> >
> >> Check dmesg and syslogs on the hosts to ensure iptables isn't complaining
> >> and TIME_WAIT buckets aren't overflowing anywhere, clients or servers.
> >
> > Nothing in dmesg or syslogs on either side.
> >
> >> If all software is new and blah blah blah, would you mind running a test
> >> using a pure client (ruby or whatever, just no libmemcached) over
> >> localhost to see if you can reproduce the issue there.
> >
> > Sure thing. I'll do that and get an answer to the new vs persistent
> > connection question first thing tomorrow.
>
> Quick update to close this thread down: we think we've tracked this to
> the memcached processes becoming just barely swapped out in some weird
> circumstances. Sorry. I'm not sure how I missed it originally.
>
> Thanks for the feedback and help troubleshooting.
>
> Ryan
>
>
> --
> To unsubscribe, reply using "remove me" as the subject.
>
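For reference, the listen_disabled_num check discussed in the quoted thread can be done with a plain socket and no libmemcached, since the memcached text protocol answers `stats` with `STAT <name> <value>` lines followed by `END`. This is only a rough sketch: the helper names `parse_stats` and `fetch_stats` and the one-second timeout are assumptions for illustration, not anything used in the thread.

```python
import socket
from typing import Dict


def parse_stats(raw: bytes) -> Dict[str, str]:
    """Parse a memcached text-protocol `stats` reply into a name -> value dict."""
    stats: Dict[str, str] = {}
    for line in raw.decode("ascii").split("\r\n"):
        if line == "END":
            break
        parts = line.split(" ", 2)
        # Each statistic arrives as "STAT <name> <value>".
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host: str, port: int = 11211, timeout: float = 1.0) -> Dict[str, str]:
    """Open a fresh TCP connection, send `stats`, and read until the END marker."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        buf = b""
        while not buf.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection early
                break
            buf += chunk
    return parse_stats(buf)
```

Something like `fetch_stats("172.17.0.139")["listen_disabled_num"]` would then return the counter as a string. Because `fetch_stats` opens a new connection on every call, timing it in a loop also gives a crude view of the new-connection case raised above.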
