On Wednesday, March 14, 2012 1:35:38 PM UTC-7, Wendy Cheng wrote:

> ok, found the place where it deadlocked. Our completion thread tried
> to complete the io (by notify_io_complete()) for the memcached worker
> thread while holding the cache_lock. The worker thread locked itself
> (thread->mutex) before entering "process_command()"; then tried to
> obtain the cache_lock. This implies I can't call notify_io_complete()
> while holding cache_lock. Guess I need to spend time getting myself
> familiar with the network protocol logic, instead of staying isolated
> within the storage engine itself.
>
  The intended use of notify_io_complete() makes it fairly hard to deadlock.
 Typically, a request comes in and you service it immediately.  If you 
can't service the request immediately, you can ask another thread to 
perform the work (e.g. an existing service pool, or a temporary thread if 
it's an infrequent thing) and then return EWOULDBLOCK.  This causes 
memcached to remove the connection from its libevent set completely (that 
is, you are now completely responsible for it).  After this, that thread 
may call notify_io_complete() to have the connection added back into 
libevent and have memcached reissue the request against your engine.
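
  As a rough illustration of that flow (not memcached code), here is a 
minimal pthread sketch.  Everything prefixed demo_ is a hypothetical 
stand-in I made up for this example: demo_conn plays the connection and its 
thread->mutex, demo_notify_io_complete() plays notify_io_complete(), and the 
condition variable only simulates the connection being handed back to the 
worker.  The point it tries to show is that the completion thread drops 
cache_lock before it notifies, so the two locks are never held together in 
the opposite order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real engine API has its own types and names. */
typedef enum { DEMO_SUCCESS, DEMO_EWOULDBLOCK } demo_status;

typedef struct {
    pthread_mutex_t mutex;    /* stands in for the worker's thread->mutex */
    pthread_cond_t  wakeup;   /* simulates "added back into libevent" */
    pthread_t       helper;   /* background thread doing the slow work */
    bool            started;  /* background work has been handed off */
    bool            io_done;  /* background work has finished */
} demo_conn;

static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static int cache_value;       /* data guarded by cache_lock */

/* Stand-in for notify_io_complete(): it needs the connection's mutex. */
static void demo_notify_io_complete(demo_conn *c) {
    pthread_mutex_lock(&c->mutex);
    c->io_done = true;
    pthread_cond_signal(&c->wakeup);
    pthread_mutex_unlock(&c->mutex);
}

/* Completion thread: update the cache, release cache_lock, THEN notify. */
static void *demo_completion_thread(void *arg) {
    demo_conn *c = arg;
    pthread_mutex_lock(&cache_lock);
    cache_value = 42;
    pthread_mutex_unlock(&cache_lock);  /* never notify while holding this */
    demo_notify_io_complete(c);
    return NULL;
}

/* Engine entry point; the caller already holds c->mutex. */
static demo_status demo_engine_get(demo_conn *c, int *out) {
    if (!c->started) {
        /* Can't service immediately: hand off the work and defer. */
        if (pthread_create(&c->helper, NULL, demo_completion_thread, c) == 0)
            c->started = true;
        return DEMO_EWOULDBLOCK;
    }
    if (!c->io_done)
        return DEMO_EWOULDBLOCK;
    pthread_mutex_lock(&cache_lock);    /* order: conn mutex, then cache_lock */
    *out = cache_value;
    pthread_mutex_unlock(&cache_lock);
    return DEMO_SUCCESS;
}

/* Simulated worker: issue the request, wait for the notify, reissue it. */
int demo_run(void) {
    demo_conn c;
    int value = -1;

    pthread_mutex_init(&c.mutex, NULL);
    pthread_cond_init(&c.wakeup, NULL);
    c.started = false;
    c.io_done = false;

    pthread_mutex_lock(&c.mutex);
    demo_status first = demo_engine_get(&c, &value);
    assert(first == DEMO_EWOULDBLOCK);  /* the first call must defer */
    (void)first;
    if (!c.started) {                   /* thread creation failed */
        pthread_mutex_unlock(&c.mutex);
        return -1;
    }
    while (!c.io_done)                  /* releases c.mutex while waiting */
        pthread_cond_wait(&c.wakeup, &c.mutex);
    demo_status second = demo_engine_get(&c, &value);
    pthread_mutex_unlock(&c.mutex);

    pthread_join(c.helper, NULL);
    pthread_cond_destroy(&c.wakeup);
    pthread_mutex_destroy(&c.mutex);
    return second == DEMO_SUCCESS ? value : -1;
}
```

  Calling demo_run() from a main() and building with -pthread should be 
enough to try it.  If the unlock in demo_completion_thread() were moved to 
after the notify, the helper would wait for the connection mutex while 
holding cache_lock, and the worker would wait for cache_lock while holding 
the connection mutex, which is the deadlock described above.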
 

> It would be nice to know what is a tap thread but I'm ok for now.
>
  The tap thread owns all the tap connections.  It's different from the 
normal protocol worker threads.  There's a writeup on tap here: 
 http://code.google.com/p/memcached/wiki/Tap 
