Seems like you're right.. I'd re-arranged where the LRU lock (cache_lock)
is called then forgot to update that one bit. Most of the do_item_unlink
code is safe there, until it gets into the LRU bits. It's unlikely anyone
actually saw a crash from this as it's a narrow race though.
That's easy to fix. It's still necessary to delete it, since threads can
stack around a handful of objects and cause rebalance to hang.
Thanks!
On Thu, 3 Jul 2014, Zhiwei Chan wrote:
> the item lock can only protect the hash list, but what about the LRU list?
> As far as i know, if trying to delete a node from a doubly-linked-list,
> it is necessary to lock at least 3 node: node, node->pre, node->next. I will
> try to check if it may crash the LRU list in gdb next week .
> And I think in do_item_get it is not necessary to delete the item that is
> re-balanced, just leave it there and return NULL seems better.
>
> 在 2014年7月3日星期四UTC+8下午1时30分29秒,Dormando写道:
> the item lock is already held for that key when do_item_get is called,
> which is why the nolock code is called there.
>
> slab rebalance has that second short-circuiting of fetches to ensure
> very
> hot items don't permanently jam a page move.
>
> On Wed, 2 Jul 2014, Zhiwei Chan wrote:
>
> > Hi all, I have thought carefully about the the thread-safe
> memcached recently, and found that if the re-balance is running, it may
> not
> > thread-safety. The code "do_item_get->do_item_unlink_nolock" may
> corrupt the hash table. Whenever it trying to modify the hash table,
> it should get
> > cache_lock, but the function do_item_get have not got the cache_lock.
> > Please tell me if anything i neglected.
> >
> > /** wrapper around assoc_find which does the lazy expiration logic */
> > item *do_item_get(const char *key, const size_t nkey, const uint32_t
> hv) {
> > //mutex_lock(&cache_lock);
> > item *it = assoc_find(key, nkey, hv);
> > if (it != NULL) {
> > refcount_incr(&it->refcount);
> > /* Optimization for slab reassignment. prevents popular items
> from
> > * jamming in busy wait. Can only do this here to satisfy
> lock order
> > * of item_lock, cache_lock, slabs_lock. */
> > if (slab_rebalance_signal &&
> > ((void *)it >= slab_rebal.slab_start && (void *)it <
> slab_rebal.slab_end)) {
> > do_item_unlink_nolock(it, hv);
> -------------------------------------------------------------------> no lock
> before
> unlink.
> > do_item_remove(it);
> > it = NULL;
> > }
> > }
> >
> > --
> >
> > ---
> > You received this message because you are subscribed to the Google
> Groups "memcached" group.
> > To unsubscribe from this group and stop receiving emails from it,
> send an email to [email protected].
> > For more options, visit https://groups.google.com/d/optout.
> >
> >
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "memcached" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
>
--
---
You received this message because you are subscribed to the Google Groups
"memcached" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.