Re: About client side hash function - non consistent - strategies

Henrik Schröder Tue, 23 Jun 2009 04:24:15 -0700

Again, if you want clients with good server selection algorithms, you should
take a look at the ones that implement the libketama method:
http://www.last.fm/user/RJ/journal/2007/04/10/rz_libketama_-_a_consistent_hashing_algo_for_memcache_clients
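For illustration only, here is a minimal Python sketch of the idea behind ketama-style consistent hashing. It is not libketama's actual code: the class name `HashRing`, the helper `moved_keys`, the default `points_per_server` value and the use of MD5 truncated to 32 bits are all assumptions chosen to keep the example short.

```python
import bisect
import hashlib


def _hash32(value: str) -> int:
    """Map a string to a 32-bit integer (first 4 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:4], "big")


class HashRing:
    """Hypothetical consistent-hash ring; not the libketama implementation."""

    def __init__(self, servers, points_per_server=100):
        # Each server is placed at many points on the ring to even out load.
        self._ring = sorted(
            (_hash32(f"{server}-{i}"), server)
            for server in servers
            for i in range(points_per_server)
        )
        self._points = [point for point, _ in self._ring]

    def get_server(self, key: str) -> str:
        # Walk clockwise to the first point at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        index = bisect.bisect_left(self._points, _hash32(key)) % len(self._ring)
        return self._ring[index][1]


def moved_keys(before: HashRing, after: HashRing, keys) -> int:
    """Count keys whose server differs between two ring configurations."""
    return sum(before.get_server(k) != after.get_server(k) for k in keys)
```

With a ring like this, removing one server should only move the keys that lived on that server; every other key keeps its mapping. `moved_keys(HashRing(servers), HashRing(servers[:-1]), keys)` is one way to check that on a sample of keys.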


You may be right in that the below failover code in the PHP client is bad,
but so what? It's still built on the naive server selection algorithm, which
is way worse, so why spend time fixing this part of it? Failover is an
extremely rare edge-case, unless you have a huge memcached farm, and in that
case you probably aren't using these clients anyway, so they're good enough
for the task at hand which is handling small clusters.
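To make "way worse" a little more concrete, here is a small sketch, assuming the naive scheme is `hash(key) % n_servers`. CRC32 is used only as a stand-in hash, and `naive_server` and `remapped_fraction` are hypothetical names, not functions from any of the clients.

```python
import zlib


def naive_server(key: str, n_servers: int) -> int:
    """Naive selection: hash the key and take it modulo the pool size."""
    return zlib.crc32(key.encode("utf-8")) % n_servers


def remapped_fraction(keys, old_n: int, new_n: int) -> float:
    """Fraction of keys whose server index changes when the pool is resized."""
    moved = sum(naive_server(k, old_n) != naive_server(k, new_n) for k in keys)
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
print(f"{remapped_fraction(keys, 4, 5):.0%} of keys moved")
```

With a uniformly distributed hash h, a key stays on the same index only when h % 4 == h % 5, which happens for 4 out of every 20 residues, so roughly four in five keys would be expected to move when a pool grows from 4 to 5 servers.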

If you still want to fix it, well, do it, and submit the patches to the
maintainers of those clients.


/Henrik

On Tue, Jun 23, 2009 at 12:19, Pau Freixes <[email protected]> wrote:

> Hi Henrik,
>
> I agree about the probable cache data "inconsistency" in a failover
> environment, but I wanted to hear more opinions about the current
> implementations of the hash function and server selection on the client
> side. Searching for the next server may be dangerous for the consistency
> of cached data in a memcached cluster, but that is another problem
> implicit in the current memcached architecture.
>
> The current implementation in the C client library [1] uses an
> incrementing iteration to search for the next available server, starting
> at offset 0 from the original hash value, whereas the php/python clients
> use this "amazing" method ;)
>
> Thinking about failover: without additional components, such as a caching
> proxy, to deal with the likely cache inconsistency problems, maybe the
> best solution is to avoid failover.
>
> Bye
>
> [1] http://people.freebsd.org/~seanc/libmemcache/
>
>
> On Tue, Jun 23, 2009 at 11:33 AM, Henrik Schröder <[email protected]> wrote:
>
>> If you want good failover, you should use the ketama method for server
>> selection instead, the below naive server selection algorithm is bad in many
>> ways.
>>
>> However, if you think a bit more on failover, you'll also soon realize
>> that it in itself will lead to unexpected behaviour. If a memcached server
>> goes down and you have automatic failover, then your entire application can
>> discover this at the same time and fail over at the same time, which is
>> fine. But when that server comes back up again, it will automatically be
>> restored into the cluster at different times by different parts of your
>> application, which leads to your cache data being unsynchronized, which may
>> or may not be fatal for your application.
>>
>> So if you want to make any changes to these clients, the best thing you
>> could do is add an option to disable failover. :-)
>>
>>
>> /Henrik
>>
>>
>> On Mon, Jun 22, 2009 at 19:31, Pau <[email protected]> wrote:
>>
>>>
>>> Hi all, I'm new to the list.
>>>
>>> Over the last few days I have spent some time thinking about how
>>> memcache clients implement failover strategies across a pool of
>>> memcached servers. Yes, I have read the memcached FAQ, which explains
>>> many things about memcached, including that it is not a distributed
>>> system and hence it is the client's responsibility to build a
>>> "consistent" cache architecture. But the php and python clients (pure
>>> library clients that do not link to libmemcache) were written by the
>>> memcache team, so can you help me with the following question?
>>>
>>> At first I thought the client API used some simple approach to build
>>> a "failover" hash strategy to avoid connection errors, for example:
>>>
>>> i = 0
>>> server = hash(key) % n_servers
>>> /* keep retrying while connect() fails */
>>> while( connect(server, ...) fails AND i < max_retries )
>>> { i++; server = (server + 1) % n_servers; }
>>>
>>> But I was wrong; the php and python clients use a different approach,
>>> something like this:
>>>
>>> i = 0;
>>> server = hash(key);
>>> /* keep retrying while connect() fails */
>>> while( connect(server % n_servers, ...) fails AND i < max_retries )
>>> {
>>>     i++;
>>>     server += hash(key + str(i));
>>> }
>>>
>>> This kind of approach can have unexpected behaviour: the server values
>>> from all iterations can be projected onto the same integer range, so
>>> the client only tries to connect to a subgroup of the server pool that
>>> is much smaller than the whole pool.
>>>
>>> The Python and php hash strategies differ: one appends a string value
>>> to the end of the hashed value and the other prepends a string to the
>>> start, with some bit-rotation tricks, but ultimately all of them have
>>> the same behaviour.
>>>
>>> What do you think about this?
>>>
>>>
>>>
>>
>
>
> --
> --pau
>
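A rough Python sketch of the two retry loops from the original question quoted above, to show how many distinct servers each one can reach when every connection attempt fails. Everything here is illustrative: CRC32 stands in for the clients' own hash functions, and `probe_sequence`, `linear_sequence`, `n_servers` and `max_retries` are hypothetical names, not code from the PHP or Python clients.

```python
import zlib


def _hash(value: str) -> int:
    """Stand-in hash function (CRC32); the real clients use their own."""
    return zlib.crc32(value.encode("utf-8"))


def probe_sequence(key: str, n_servers: int, max_retries: int) -> list:
    """Server indexes tried by the rehash loop if every connect fails."""
    server = _hash(key)
    tried = [server % n_servers]
    for i in range(1, max_retries + 1):
        # Add a fresh hash of key + retry counter, as in the quoted loop.
        server += _hash(key + str(i))
        tried.append(server % n_servers)
    return tried


def linear_sequence(key: str, n_servers: int, max_retries: int) -> list:
    """Server indexes tried by the "next server" loop, wrapping around."""
    first = _hash(key) % n_servers
    return [(first + i) % n_servers for i in range(max_retries + 1)]


if __name__ == "__main__":
    key, n_servers = "session:42", 10
    rehash = probe_sequence(key, n_servers, n_servers - 1)
    linear = linear_sequence(key, n_servers, n_servers - 1)
    print("rehash probes:", rehash, "distinct:", len(set(rehash)))
    print("linear probes:", linear, "distinct:", len(set(linear)))
```

The linear loop visits every server once in `n_servers` attempts. The rehash loop picks each next index pseudo-randomly, so repeats are likely and some servers may never be tried within the retry budget, which is the behaviour the question describes.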
