+1

On Tue, Jul 6, 2010 at 2:47 PM, dormando <[email protected]> wrote:

> Or you could disable the "failover" feature...
>
> On Tue, 6 Jul 2010, Darryl Kuhn wrote:
>
> > FYI - we made the change on one server and it does appear to have
> resolved premature key expiration.
> >
> > Effectively what appears to have been happening was that every so often a
> client was unable to connect to one or more of the memcached servers. When
> this happened it changed the key distribution. Because
> > the connection was persistent it meant that subsequent requests would use
> the same connection handle with the reduced server pool. Turning off
> persistent connections ensures that a if we are unable to
> > connect to a server in one instance the failure does not persist for
> subsequent connections.
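[Editor's note: a minimal sketch of why a shrunken pool remaps keys. The modulo scheme below is a made-up illustration, not the PHP client's actual hashing; server names are hypothetical.]

```python
import hashlib

def pick_server(key, servers):
    # Toy modulo distribution over the server list (an assumption;
    # real clients may use consistent hashing instead).
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return servers[h % len(servers)]

full_pool = ["cache1:11211", "cache2:11211", "cache3:11211"]
reduced_pool = full_pool[:2]  # one server dropped after a failed connect

# A persistent connection that keeps the reduced pool will send many
# keys to a different server than the full pool would, so previously
# cached values appear to have "expired".
key = "domain_host:www.bestbuyskins.com"
print(pick_server(key, full_pool))
print(pick_server(key, reduced_pool))
```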
> >
> > We'll be rolling this change out to the entire server pool and I'll give
> the list another update with our findings.
> >
> > Thanks,
> > Darryl
> >
> > On Fri, Jul 2, 2010 at 8:34 AM, Darryl Kuhn <[email protected]>
> wrote:
> >       Found the reset call - that was me being an idiot (I actually
> introduced it when I added logging to debug this issue)... That's been
> removed; however, there was no flush command. Somebody else
> >       suggested it may have to do with the fact that we're running
> persistent connections; and that if a failure occurred that failure would
> persist and alter hashing rules for subsequent requests on
> >       that connection. I do see a limited number of connection failures
> (~5-15) throughout the day. I'm going to alter the config to make
> connections non-persistent and see if it makes a difference
> >       (however I'm doubtful this is the issue as we've run with memcache
> server pools with a single instance - which would make it impossible to
> alter the hashing distribution).
> >
> >       I'll report back what I find - thanks for your continued input!
> >
> >       -Darryl
> >
> >
> > On Thu, Jul 1, 2010 at 12:28 PM, dormando <[email protected]> wrote:
> >       > Dormando... Thanks for the response. I've moved one of our
> servers to use an upgraded version running 1.4.5. Couple of things:
> >       >  *  I turned on logging last night
> >       >  *  I'm only running -vv at the moment; -vvv generated way more
> logging than we could handle. As it stands we've generated ~6GB of logs
> since last night (using -vv). I'm looking at ways
> >       of reducing log
> >       >     volume by logging only specific data or perhaps standing up
> 10 or 20 instances on one machine (using multiple ports) and turning on -vvv
> on only one instance. Any suggestions there?
> >
> > Oh. I thought given your stats output that you had reproduced it on a
> > server that was on a dev instance or local machine... but I guess that's
> > related to below. Running logs on a production instance with a lot of
> > traffic isn't that great of an idea, sorry about that :/
> >
> > > Looking at the logs two things jump out at me.
> > >  *  While I had -vvv turned on I saw "stats reset" command being issued
> constantly (at least once a second). Nothing in the code that we have does
> this - do you know if the PHP client does
> > this perhaps? Is
> > >     this something you've seen in the past?
> >
> > No, you probably have some code that's doing something intensely wrong.
> > Now we should probably add a counter for the number of times a "stats
> > reset" has been called...
> >
> > >  *  Second with -vv on I get something like this:
> > >      +  <71 get resourceCategoryPath21:984097:
> > >         >71 sending key resourceCategoryPath21:984097:
> > >         >71 END
> > >         <71 set 
> > > popularProducts:2010-06-28:skinit.com:styleskins:en::2000:image_wall:0__type
> 0 86400 5
> > >         >71 STORED
> > >         <71 set 
> > > popularProducts:2010-06-28:skinit.com:styleskins:en::2000:image_wall:0
> 1 86400 130230
> > >         <59 get domain_host:www.bestbuyskins.com
> > >         >59 sending key domain_host:www.bestbuyskins.com
> > >         >59 END
> > >  *  Two questions on the output - what's the "71" and "59"? Second - I
> would have thought I'd see an "END" after each "get" and "set" however you
> can see that's not the case.
> > >
> > > Last question... other than trolling through code is there a good place
> to go to understand how to parse out these log files (I'd prefer to
> self-help rather than bugging you)?
> >
> > Looks like you figured that out. The numbers are the file descriptors
> > (connections). END/STORED/etc are the responses.
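[Editor's note: a rough parser for -vv lines of the shape shown above, e.g. "<71 get key" and ">59 END". The format is inferred from this sample, not a formal spec: "<" marks data from the client on that file descriptor, ">" marks the server's side.]

```python
import re

LINE = re.compile(r'^(?P<dir>[<>])(?P<fd>\d+)\s+(?P<rest>.*)$')

def parse(line):
    # Returns None for lines that don't look like -vv traffic lines.
    m = LINE.match(line.strip())
    if not m:
        return None
    return {
        "direction": "request" if m.group("dir") == "<" else "response",
        "fd": int(m.group("fd")),   # file descriptor = connection
        "text": m.group("rest"),
    }

print(parse("<71 get resourceCategoryPath21:984097:"))
print(parse(">59 END"))
```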
> >
> > Honestly I'm going to take a wild guess that something on your end is
> > constantly trying to reset the memcached instance.. it's probably doing a
> > "flush_all" then a "stats reset" which would hide the flush counter. Do
> > you see "flush_all" being called in the logs anywhere?
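[Editor's note: a hypothetical helper for that hunt — scan a -vv log for "flush_all" and "stats reset" requests to confirm whether some client is wiping the cache. The sample lines are invented for illustration.]

```python
from collections import Counter

def suspicious_commands(lines):
    counts = Counter()
    for line in lines:
        if line.startswith("<"):  # "<fd ..." marks a client request
            _, _, cmd = line.partition(" ")
            if cmd.startswith("flush_all") or cmd.startswith("stats reset"):
                counts[cmd.strip()] += 1
    return counts

sample = [
    "<71 flush_all",
    ">71 OK",
    "<71 stats reset",
    "<59 get domain_host:www.bestbuyskins.com",
]
print(suspicious_commands(sample))
```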
> >
> > Go find where you're calling stats reset and make it stop... that'll
> > probably help bubble up what the real problem is.
> >
> >
> >
> >
> >
>



-- 
awl
