packet mangling via iptables, or something else?

On Thu, 1 Mar 2012, Joseph Brower wrote:
> It looked like we had some packet mangling going on. Talk about a crazy bug
> to track down. I appreciate everyone's help! It's all resolved now.
>
> Thanks,
>
> Joseph Brower
>
> On 03/01/2012 12:34 PM, Joseph Brower wrote:
> > To rule out the extension causing issues, I actually used a class I found
> > that is slow, but doesn't rely on the PECL extensions at all (or the
> > memcached extension). The issue still persisted. I'll see if I can change
> > the version. I'll also see if there is any packet mangling that might be
> > occurring.
> >
> > Thanks,
> >
> > Joseph Brower
> >
> > On 03/01/2012 12:26 PM, dormando wrote:
> > > Can you upgrade to .13 and try again?
> > >
> > > You pasted some protocol errors... what version of pecl/memcache is that?
> > > 3.x might have trouble with the binary protocol as it was alpha
> > > abandonware.
> > >
> > > If they still happen with .13, it might be worth getting a log from
> > > memcached. run it in screen with -vv and redirect the output to a logfile
> > > or pipe through logger to syslog. it could be a lot of lines if things are
> > > busy.
> > >
> > > that should tell you if the server sees anything at all.
> > >
> > > On Thu, 1 Mar 2012, Joseph Brower wrote:
> > >
> > > > Yup. That's how I've got the production environment set up. We have two
> > > > memcache servers, each with a decent amount of RAM. The same thing
> > > > happens there (though not always to both memcache servers; sometimes it
> > > > happens to one or the other). Also, I've written a small memcache test
> > > > script that just tries to set and get a very small value. That works
> > > > sometimes, and it fails other times. That's what I find so odd.
> > > >
> > > > Thanks,
> > > >
> > > > Joseph Brower
> > > >
> > > > On 03/01/2012 01:34 AM, Yiftach Shoolman wrote:
> > > > Hi Joseph,
> > > > I guess you know that your Memcached size is only 10MB (STAT
> > > > limit_maxbytes 10485760).
> > > > Magento Zend cache (your objects) tests this size prior to setting an
> > > > object, and if the limit is reached (STAT bytes) you cannot set any new
> > > > object in the cache, but you can set new sessions - so that might be
> > > > your problem, though not according to the stats you sent.
> > > >
> > > > One more thing, it is better to deploy Magento with 2 Memcached servers,
> > > > one for the cache and one for the session, so whenever you upgrade your
> > > > site and flush the objects, you don't need to flush your sessions as
> > > > well - see the typical configuration of one of our customers below.
> > > >
> > > > Best,
> > > >
> > > > Yiftach
> > > >
> > > > <config>
> > > >   <global>
> > > >     <session_cache_limiter></session_cache_limiter>
> > > >     <session_save><![CDATA[memcache]]></session_save>
> > > >     <session_save_path><![CDATA[DNSADDRESS]]></session_save_path>
> > > >     <cache>
> > > >       <backend>memcached</backend><!-- apc / memcached / xcache / empty=file -->
> > > >       <slow_backend>database</slow_backend>
> > > >       <!-- database / file (default) - used for 2 levels cache setup,
> > > >            necessary for all shared memory storages -->
> > > >       <slow_backend_store_data>0</slow_backend_store_data>
> > > >       <memcached><!-- memcached cache backend related config -->
> > > >         <servers><!-- any number of server nodes can be included -->
> > > >           <server>
> > > >             <host><![CDATA[NSADDRESS]]></host>
> > > >             <port><![CDATA[10245]]></port>
> > > >             <persistent><![CDATA[1]]></persistent>
> > > >             <weight><![CDATA[1]]></weight>
> > > >             <timeout><![CDATA[10]]></timeout>
> > > >             <retry_interval><![CDATA[10]]></retry_interval>
> > > >             <status><![CDATA[1]]></status>
> > > >           </server>
> > > >         </servers>
> > > >         <compression><![CDATA[0]]></compression>
> > > >         <cache_dir><![CDATA[]]></cache_dir>
> > > >         <hashed_directory_level><![CDATA[]]></hashed_directory_level>
> > > >         <hashed_directory_umask><![CDATA[]]></hashed_directory_umask>
> > > >         <file_name_prefix><![CDATA[]]></file_name_prefix>
> > > >       </memcached>
> > > >     </cache>
> > > >
> > > > On Thu, Mar 1, 2012 at 9:59 AM, Joseph Brower wrote:
> > > > Thanks for the response.
> > > >
> > > > I've been testing as best I can and I've found that setting and
> > > > getting fail. I get either no output, or a
> > > >
> > > > Notice: Memcache::set(): Server my.memcachehost.com (tcp 11211)
> > > > failed with: Received malformed response (0) in /var/www/memcache.php
> > > > on line 5
> > > >
> > > > I'm able to continue setting and getting via telnet without any
> > > > issues. Also, if I redeploy my webserver (onto somewhere else in our
> > > > cluster) things sometimes are happy, sometimes they continue to fail.
> > > > When I look at netstat, I don't see the connections in memcache. When
> > > > looking at the output from memcached, it doesn't show any additional
> > > > output (as if the connection never reaches it.) I'm confident it's not
> > > > my firewall rules, as I've got everything automated so that my
> > > > configuration is consistent between versions. I've also ruled out the
> > > > extension being used. It happens using the memcached, memcache, and an
> > > > extensionless method that I found.
> > > >
> > > > I'm running on Ubuntu 10.04. All of the other services on this
> > > > cluster don't have any connection issues (mysql, http, load balancer,
> > > > ssl terminator) and they all use my same script for configuring the
> > > > firewall rules appropriately.
> > > >
> > > > All of the stats look ok. I'm not maxing out the connection
> > > > limit, and I am nowhere near memory limits. This happens when using
> > > > memcache for sessions and for page cache.
> > > >
> > > > STAT pid 126
> > > > STAT uptime 2017
> > > > STAT time 1330588738
> > > > STAT version 1.4.10
> > > > STAT libevent 1.4.13-stable
> > > > STAT pointer_size 64
> > > > STAT rusage_user 0.040000
> > > > STAT rusage_system 0.160000
> > > > STAT curr_connections 10
> > > > STAT total_connections 37
> > > > STAT connection_structures 11
> > > > STAT reserved_fds 20
> > > > STAT cmd_get 45
> > > > STAT cmd_set 35
> > > > STAT cmd_flush 0
> > > > STAT cmd_touch 0
> > > > STAT get_hits 40
> > > > STAT get_misses 5
> > > > STAT delete_misses 0
> > > > STAT delete_hits 0
> > > > STAT incr_misses 0
> > > > STAT incr_hits 10
> > > > STAT decr_misses 0
> > > > STAT decr_hits 0
> > > > STAT cas_misses 0
> > > > STAT cas_hits 0
> > > > STAT cas_badval 0
> > > > STAT touch_hits 0
> > > > STAT touch_misses 0
> > > > STAT auth_cmds 0
> > > > STAT auth_errors 0
> > > > STAT bytes_read 1942
> > > > STAT bytes_written 1672
> > > > STAT limit_maxbytes 10485760
> > > > STAT accepting_conns 1
> > > > STAT listen_disabled_num 0
> > > > STAT threads 4
> > > > STAT conn_yields 0
> > > > STAT hash_power_level 16
> > > > STAT hash_bytes 524288
> > > > STAT hash_is_expanding 0
> > > > STAT expired_unfetched 0
> > > > STAT evicted_unfetched 0
> > > > STAT bytes 303
> > > > STAT curr_items 4
> > > > STAT total_items 27
> > > > STAT evictions 0
> > > > STAT reclaimed 0
> > > >
> > > > That's how some of my stats are. I've tried various sizes; this
> > > > is an exceptionally small one that I was using only for testing.
> > > >
> > > > Thanks,
> > > >
> > > > Joseph Brower
> > > >
> > > >
> > > > On 02/29/2012 11:16 PM, Yiftach Shoolman wrote:
> > > > Hi Joseph,
> > > > Can you elaborate a bit more on your problem? What do you mean by
> > > > unavailable: can you set/get keys? Are your app-->memcached TCP
> > > > connections disconnected? Have you reached your memcached memory limit
> > > > (please send memcached stats)? Something else?
> > > > Also, a specific question about Magento: does it happen on the session
> > > > caching (I guess so) or on the object caching, the part that is based
> > > > on Zend caching?
> > > >
> > > > Yiftach
> > > >
> > > > On Thu, Mar 1, 2012 at 3:41 AM, Joseph Brower wrote:
> > > > When I'm using Memcache (the PECL extension) with Magento, everything
> > > > works well for an indeterminate amount of time. After some time
> > > > passes, Memcached becomes unavailable. This is the odd part though: I
> > > > can still telnet into Memcached and issue commands. I have 4
> > > > webservers all connecting to one memcache instance. Does anyone have
> > > > any ideas what might be going on?
> > > >
> > > > Thanks,
> > > >
> > > > Joseph Brower
> > > >
> > > > --
> > > > Yiftach Shoolman
> > > > +972-54-7634621
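
For anyone retracing this thread: the extensionless set/get test Joseph describes could be approximated with a raw socket speaking the memcached text protocol, which takes the PECL extensions out of the picture entirely. The sketch below is not the script or class used in the thread; the function names (build_set, parse_get, probe) and the key "probe_key" are made up for illustration, and it assumes the server accepts the plain text protocol on the given port.

```python
import socket


def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a text-protocol 'set' request: header line, data block, CRLF."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def parse_get(reply: bytes, key: str):
    """Return the stored bytes, None on a clean miss, or raise ValueError
    if the reply does not look like a well-formed 'get' response."""
    if reply == b"END\r\n":
        return None
    header, sep, rest = reply.partition(b"\r\n")
    parts = header.split()
    if (not sep or len(parts) != 4 or parts[0] != b"VALUE"
            or parts[1] != key.encode("ascii")):
        raise ValueError(f"malformed response: {reply!r}")
    size = int(parts[3])
    if rest[size:] != b"\r\nEND\r\n":
        raise ValueError(f"malformed response: {reply!r}")
    return rest[:size]


def _read_until(sock: socket.socket, terminator: bytes) -> bytes:
    """Read until the buffer ends with terminator; the socket timeout
    covers the case where the expected terminator never arrives."""
    buf = b""
    while not buf.endswith(terminator):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("connection closed mid-reply")
        buf += chunk
    return buf


def probe(host: str, port: int = 11211, timeout: float = 2.0) -> str:
    """Set and then get one tiny value; return 'ok' or a short description
    of what went wrong (connect failure, timeout, malformed reply)."""
    key, value = "probe_key", b"ok"  # hypothetical key, chosen for this sketch
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(build_set(key, value))
            stored = _read_until(sock, b"\r\n")
            if stored != b"STORED\r\n":
                return f"set failed: {stored!r}"
            sock.sendall(f"get {key}\r\n".encode("ascii"))
            got = parse_get(_read_until(sock, b"END\r\n"), key)
            return "ok" if got == value else f"get mismatch: {got!r}"
    except (OSError, ValueError) as exc:
        return f"error: {exc}"
```

Calling probe("my.memcachehost.com") in a loop from each webserver, while memcached runs with -vv as dormando suggests, would show whether a failing request ever reaches the server. If the probe reports a malformed or missing reply and the server log shows nothing, that points at something between the two hosts (such as the packet mangling that turned out to be the cause here) rather than at memcached or the PHP extension.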
