Thanks, I will take a look at it.

Making it a new behavior makes it much more likely that I can accept it.

Cheers,
        -Brian

On Dec 12, 2011, at 1:36 PM, Trevor North wrote:

> I have a branch which more or less achieves what is described above by
> way of a new dead server retry timeout behaviour.
> 
> As per the current behaviour with consistent distribution and auto
> ejection, keys on the dead server are moved after we hit the initial
> failure limit by taking that host out of the continuum.  We then reset
> the continuum to force a retry every time we hit the dead server retry
> timeout in the same manner as is done for standard connection retries.
> Each dead server retry will result in a miss if the host is not actually
> available which isn't ideal but I wanted to achieve this whilst
> maintaining compatibility with current behaviour so have kept the
> changes to a bare minimum.
> 
> It's worth noting here that there are a couple of instances where an IO
> failure would incorrectly reset a server state to new even if it was
> already in timeout.  I've corrected this when setting the state although
> I suspect the IO in question probably shouldn't be being attempted in
> the first place in some cases.
> 
> I've made no attempt to leave keys on their newly allocated servers once
> the dead server is brought back to life and I don't believe it would be
> sensible to do so.  With multiple clients running, network flapping
> would result in effectively random distribution if we attempted to do
> this, negating the point of using consistent distribution.
> 
> Bar the correction to the server state reset on IO failure when in
> timeout, the changes introduced do not alter the behaviour currently
> seen if the new dead retry timeout is not used, so they should be
> completely backwards compatible.
> 
> The branch is available at
> https://code.launchpad.net/~trevor/libmemcached/dead-retry and I've
> attached a patch which will apply to 1.0.2.
> 
> Feedback would be welcome as ideally this isn't something I want to have
> to maintain separately.
> 
> ** Attachment added: "Add dead server retry"
>   
> https://bugs.launchpad.net/libmemcached/+bug/881983/+attachment/2629812/+files/backoff-dead-reconnect
> 
> -- 
> You received this bug notification because you are subscribed to
> libmemcached.
> https://bugs.launchpad.net/bugs/881983
> 
> Title:
>  libmemcached resets continuum with dead server
> 
> Status in libmemcached - A C and C++ client library for memcached:
>  New
> Status in “libmemcached” package in Ubuntu:
>  New
> 
> Bug description:
>  Testing with pylibmc
>  Using libmemcached 0.53
> 
>  import pylibmc
>  hosts = ["IP_of_host_1_here","IP_of_host_2_here","IP_of_host_3_here"]
>  mc = pylibmc.Client(hosts,binary=False)
>  mc.behaviors['remove_failed']=3
>  mc.behaviors['hash']='md5'
>  mc.behaviors['distribution']='consistent ketama'
> 
>  last_exception = None
>  # Loop forever so the failover behaviour can be observed.
>  while True:
>     try:
>       mc.set("key","value")
>       print(mc.get("key"))
>     except Exception as e:
>       print(e)
> 
>  With 3 servers running, this works fine.
>  Take down the server handling the load, and
>  libmemcached returns
>
>  error 47 from memcached_set: SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY
>
>  until the number of retries has been reached, at which point the server
>  is removed from the pool and the continuum is recalculated.
> 
>  A different server starts handling the key, until... the retry timeout
>  expires again, at which point the continuum is recalculated with the
>  dead server back in, and now all calls fail with
> 
>  error 35 from memcached_set: SERVER IS MARKED DEAD
> 
>  What should happen is that after the timeouts there should be a single
>  return of
> 
>  error 35 from memcached_set: SERVER IS MARKED DEAD
> 
>  After which the continuum is recalculated and values go to the new
>  server.
> 
>  Fix is to mark the server as dead, and exclude dead servers whenever
>  recalculating the continuum (only works for consistent distributions -
>  but that's what I'm using)
> 
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/libmemcached/+bug/881983/+subscriptions
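
For anyone following the thread, the dead server retry behaviour
described in the quoted message can be modelled roughly as below.  This
is a toy Python sketch, not libmemcached code: the Pool class, its
method names, and the failure_limit / dead_timeout parameters are all
made up for illustration, and the ring is a simplified stand-in for
ketama.  It only shows the idea of ejecting a host after the failure
limit and putting it back into the continuum once the dead timeout
expires.

```python
import hashlib
from bisect import bisect


class Pool:
    """Toy model of ejection plus dead-server retry (hypothetical names)."""

    def __init__(self, hosts, failure_limit=3, dead_timeout=30):
        self.hosts = list(hosts)
        self.failure_limit = failure_limit
        self.dead_timeout = dead_timeout
        self.failures = {h: 0 for h in self.hosts}
        self.dead_until = {}  # host -> time at which it is next retried

    def _ring(self, now):
        # Hosts still inside their dead timeout are left out of the ring.
        live = [h for h in self.hosts if self.dead_until.get(h, 0) <= now]
        points = []
        for host in live:
            for i in range(100):  # 100 virtual points per host (arbitrary)
                digest = hashlib.md5(f"{host}-{i}".encode()).hexdigest()
                points.append((int(digest[:8], 16), host))
        return sorted(points)

    def host_for(self, key, now):
        ring = self._ring(now)
        if not ring:
            raise RuntimeError("no live hosts")
        point = int(hashlib.md5(key.encode()).hexdigest()[:8], 16)
        # First ring point after the key's hash, wrapping around at the end.
        return ring[bisect(ring, (point, "")) % len(ring)][1]

    def report_failure(self, host, now):
        self.failures[host] += 1
        if self.failures[host] >= self.failure_limit:
            # Eject the host, or re-eject it after a failed retry.
            self.dead_until[host] = now + self.dead_timeout

    def report_success(self, host):
        self.failures[host] = 0
        self.dead_until.pop(host, None)
```

In this model a key moves to another host once its owner hits the
failure limit, goes back to the owner for one attempt when the dead
timeout expires, and moves away again if that attempt fails.  That is
the one miss per retry mentioned in the quoted message.  A successful
retry returns the key to its original host.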

