I'm not familiar with the source code, but I would suspect that the global lock 
is to maintain the linked list in LRU order.  As every read mutates the list, a 
reader/writer lock isn't appropriate.  If that's the case, you may want to 
consider trading the slightly better hit rate for more concurrency (such as by 
using the Second Chance FIFO algorithm) or adopting a CAS-based LRU algorithm 
to reduce the critical section on the tail lock.




________________________________
From: Toru Maesaka <[email protected]>
To: [email protected]
Sent: Thursday, March 5, 2009 7:11:09 PM
Subject: Ideas on Concurrency Control Strategy


Hi!

I was going to reply to Dustin's thread but noticed we haven't talked
about development related goodness in a while so I'm starting a new
thread.

So, to begin with I'm going to describe what I'm going to experiment
to get memcached scaling even further in modern hardware. As most of
you know, at the moment we globally lock the cache with
pthread_mutex_lock() to control concurrency. This is working fine at
the moment but taking into account that in most use cases, the
majority of the requests are GET related, I have a feeling that using
pthread_rwlock_wrlock() and pthread_rwlock_rdlock() might give us
substantial performance benefits without changing the codebase too
much.

For those that aren't familiar with pthread's read-write locking, it's
a really simple mechanism that allows multiple threads to fetch a lock
for reading IF a write lock isn't obtained by another thread. This
means our worker threads can perform GET related commands without
being blocked (again, if a write lock isn't taken by someone else). So
in brief, fetching data from memcached should scale quite well in
modern SMP architecture with this scheme.

The downside of pthread's read-write locking is that it consumes more
CPU than simple locking, and perhaps people are already happy with the
current performance of memcached. For the rest of us that wants more
performance out of memcached, we could design the codebase to only use
read-write locking when the server is started with more than certain
number of threads (e.g. > 8 threads).

As for the global stats_lock, I think its safe to assume that most of
the operations are write related so read-write locking is definitely
not the answer here. I think Facebook was dead-on with their solution
on scaling this area up (use scoreboarding per thread basis and
aggregate the results when needed).

There are other locks that needs to be looked at but I haven't got
that far yet so I'll stop now :)

To begin with I'm going to only experiment with using rw-locking for
cache_lock. I'll test the performance with libmemcached's memslap and
let you all know how it turned out... hopefully in not so distant
future.

Cheers,
Toru



      

Reply via email to