Thanks for your replies so far. Failover is deactivated in our configuration, so that cannot be the reason. I think I need to describe the circumstances in a bit more detail:
Our consistent-hashing cluster of 50+ servers is very reliable in normal operation: incr/decr, get, set, multiget, etc. are not a problem. If keys were landing on the wrong servers in the continuum, we would see many more problems than we currently do. The cluster is always under relatively high load (the number of connections, for example, is very high because of the 160+ webservers in front of it).

We are now experiencing, in a very few cases, that this locking mechanism does not work. Two different clients try to lock the same object. (If you want to prevent multiple inserts into a database on the same primary key, you have to explicitly use one key that is valid for all clients, not a key with unique hashes in it.) It works millions of times as expected; we generate a large number of user-triggered database inserts (~60/sec.) with this construct. But a handful of locks do not work and show the behaviour described.

So my question is again: is it conceivable, even if very implausible, that a multithreaded memcached does not provide a 100% atomic add()?

Kind regards,
Jerome
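
P.S. To make the construct concrete, here is a minimal sketch of the locking pattern I am describing. It is not our production code: `FakeMemcache`, `try_lock`, `unlock`, the `lock:` key prefix, and the 30-second TTL are all hypothetical names and values chosen for illustration. The in-process stand-in only mimics the expected semantics of add() (store the value only if the key is absent), so the sketch runs without a server.

```python
import threading
import time


class FakeMemcache:
    """Hypothetical in-process stand-in with memcached-like add/delete semantics."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def add(self, key, value, ttl):
        """Store value only if key is absent or expired; return True on success."""
        now = time.monotonic()
        with self._mutex:
            entry = self._data.get(key)
            if entry is not None and entry[1] > now:
                return False  # someone else already holds the key
            self._data[key] = (value, now + ttl)
            return True

    def delete(self, key):
        """Remove key; return True if it existed."""
        with self._mutex:
            return self._data.pop(key, None) is not None


def lock_key(primary_key):
    # One key shared by all clients for a given primary key,
    # with no per-client unique hash in it.
    return "lock:%s" % primary_key


def try_lock(client, primary_key, owner, ttl=30):
    """Try to take the lock; only the first add() for the key should succeed."""
    return client.add(lock_key(primary_key), owner, ttl)


def unlock(client, primary_key):
    """Release the lock so the next client can take it."""
    client.delete(lock_key(primary_key))
```

A client would call `try_lock(client, 42, "web-017")` before the insert for primary key 42, do the insert only if it returned True, and call `unlock(client, 42)` afterwards. The failure we observe corresponds to two clients both getting True for the same key.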
