Re: Issue 370 in memcached: memcached(v1.4.15) dead loop bug.

memcached Mon, 30 Jun 2014 11:57:41 -0700

Comment #7 on issue 370 by [email protected]: memcached(v1.4.15) deadloop bug.

http://code.google.com/p/memcached/issues/detail?id=370

Thanks for dormando to pull me back to the do_item_alloc function, and ialso read again carefully about how to reproduce and fix the issue #260. Itis awful to find the extremely path to reproduce the bug. But i m afraidthere MAY have other problems, even at vervion .20, more higherprobability.　　

reproduce step by step in v 1.4.20:

1) hash expanding, switch the item_lock to global_lock;

2) In thread A, do_item_alloc try to get a new item, and try to get a ITEMfrom the LRU tail. but it does not lock the ITEM, because it try tohold the item_locks[];3) after do_item_alloc check the refcount(after line items.c:139), otherthread B try to hold the ITEM(use global_lock), and increase the refcount,and believe itself hold a refcount;4) In thread A, do_item_alloc evicted the ITEM, and initialize theITEM(reset the refcount to 1), as a new one returning to the caller;5) In thread B, it call item_remove to de-reference the refcount, and makethe refcount to 0, so the ITEM is free! (the thread A is holding the ITEM,and the ITME is in the hash table yet);

6) So, any terrible thing may happen, including crash and dead loop.

In version lower than 1.4.19, it easier to occurr even if there is no hashexpanding. Because the code in do_item_alloc as following:


1) if the cur_hv == cur_hv, (it is possible at replace operation);
or

2) if the first loop hv!=cur_hv, and hold a lock; then the second loophv==cur_hv(without a lock, but the lock's pointer is not reset in the firstloop!), then release the lock wrong.


    for (; tries > 0 && search != NULL; tries--, search=search->prev) {
        uint32_t hv = hash(ITEM_key(search), search->nkey, 0);
        /* Attempt to hash item lock the "search" item. If locked, no
         * other callers can incr the refcount
         */
        /* FIXME: I think we need to mask the hv here for comparison? */

if (hv != cur_hv && (hold_lock = item_trylock(hv)) == NULL)------>

            continue;
        /* Now see if the item is refcount locked */
        if (refcount_incr(&search->refcount) != 2) {
            refcount_decr(&search->refcount);
            /* Old rare bug could cause a refcount leak. We haven't seen
             * it in years, but we leave this code in to prevent failures
             * just in case */
            if (search->time + TAIL_REPAIR_TIME < current_time) {
                itemstats[id].tailrepairs++;
                search->refcount = 1;
                do_item_unlink_nolock(search, hv);
            }
            if (hold_lock)
                item_trylock_unlock(hold_lock);
            continue;
        }

I think it's clear enough, but i also try to show more detail as following,and try to reproduce the phenomenon in gdb:

I don't know why the function "assoc_maintenance_thread" write asfollowing, but other threads will get the global_lockbefore "assoc_maintenance_thread" get it again.


static void *assoc_maintenance_thread(void *arg) {

    while (do_run_maintenance_thread) {
        int ii = 0;

        /* Lock the cache, and bulk move multiple buckets to the new
         * hash table. */

item_lock_global(); ----------------> step 5) tryto get the global_lock, it have to wait for other threads to release thelock.

        mutex_lock(&cache_lock);

        for (ii = 0; ii < hash_bulk_move && expanding; ++ii) {
           .....
        }

        mutex_unlock(&cache_lock);
        item_unlock_global();

        if (!expanding) {

/* finished expanding. tell all threads to use fine-grainedlocks */

            switch_item_lock_type(ITEM_LOCK_GRANULAR);
            slabs_rebalancer_resume();
            /* We are done expanding.. just wait for next invocation */
            mutex_lock(&cache_lock);
            started_expanding = false;

pthread_cond_wait(&maintenance_cond, &cache_lock);------------>step 1) wait here for expanding notify.

            /* Before doing anything, tell threads to use a global lock */
            mutex_unlock(&cache_lock);
            slabs_rebalancer_pause();

switch_item_lock_type(ITEM_LOCK_GLOBAL); ----------->step 2)switch to the global_lock, without holding the global_lock.---- other threadwill get the global_lock first. it is not thread-safe.

            mutex_lock(&cache_lock);

assoc_expand(); ---->step 3) expandthe hash size, but the items are not moved to the new buckets.mutex_unlock(&cache_lock); --->step 4) releasethe lock, MAY be not thread-safe.

        }
    }
    return NULL;
}
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

and in the function "do_item_alloc", using the function "item_trylock"to get the item's lock. please look carefully at "item_trylock", itdirectly access "item_locks", hoping a "no-op" as it said in the comment.the item(search) is not lock at all.

So, after the checker "if (refcount_incr(&search->refcount) != 2) ",other threads may hold the item and increase the refcounte, that will make


item *do_item_alloc(char *key, const size_t nkey, const int flags,
                    const rel_time_t exptime, const int nbytes,
                    const uint32_t cur_hv) {
    uint8_t nsuffix;
    item *it = NULL;
    char suffix[40];

size_t ntotal = item_make_header(nkey + 1, flags, nbytes, suffix,&nsuffix);

    if (settings.use_cas) {
        ntotal += sizeof(uint64_t);
    }

    unsigned int id = slabs_clsid(ntotal);
    if (id == 0)
        return 0;

    mutex_lock(&cache_lock);
    /* do a quick check if we have any expired items in the tail.. */
    int tries = 5;
    int tried_alloc = 0;
    item *search;
    void *hold_lock = NULL;
    rel_time_t oldest_live = settings.oldest_live;

    search = tails[id];
    /* We walk up *only* for locked items. Never searching for expired.
     * Waste of CPU for almost all deployments */
    for (; tries > 0 && search != NULL; tries--, search=search->prev) {

if (search->nbytes == 0 && search->nkey == 0 && search->it_flags ==1) {

            /* We are a crawler, ignore it. */
            tries++;
            continue;
        }
        uint32_t hv = hash(ITEM_key(search), search->nkey);
        /* Attempt to hash item lock the "search" item. If locked, no
         * other callers can incr the refcount
         */

/* Don't accidentally grab ourselves, or bail if we can't quicklock*/if (hv == cur_hv || (hold_lock = item_trylock(hv)) == NULL)---------------> 1) item_trylock always get lock from item_locks[], notglobal_lock

            continue;
        /* Now see if the item is refcount locked */

if (refcount_incr(&search->refcount) != 2) {---------------> 2) now the serch->refcount==2, means only the lru-linkreference the item.

            refcount_decr(&search->refcount);
            /* Old rare bug could cause a refcount leak. We haven't seen
             * it in years, but we leave this code in to prevent failures
             * just in case */
            if (settings.tail_repair_time &&

search->time + settings.tail_repair_time <current_time) {

                itemstats[id].tailrepairs++;
                search->refcount = 1;
                do_item_unlink_nolock(search, hv);
            }
            if (hold_lock)
                item_trylock_unlock(hold_lock);
            continue;
        }

        /* Expired or flushed */

if ((search->exptime != 0 && search->exptime < current_time)------------------> 3) after this line, other thread may got hold the itemand increase the refcount.

----------- ------- so if the item is alloc as a new item (refcount resetto 1), then the other thread(-------------------hold the item) call do_item_remove, it will freethe "new" item(because therefcount-------------------is 0). this is very highly possible at the item evictedcase.

|| (search->time <= oldest_live && oldest_live <=current_time)) {

            itemstats[id].reclaimed++;
            if ((search->it_flags & ITEM_FETCHED) == 0) {
                itemstats[id].expired_unfetched++;
            }
            it = search;

slabs_adjust_mem_requested(it->slabs_clsid, ITEM_ntotal(it),ntotal);

            do_item_unlink_nolock(it, hv);
            /* Initialize the item block: */
            it->slabs_clsid = 0;
        } else if ((it = slabs_alloc(ntotal, id)) == NULL) {
            tried_alloc = 1;
            if (settings.evict_to_free == 0) {
                itemstats[id].outofmemory++;
            } else {
                itemstats[id].evicted++;
                itemstats[id].evicted_time = current_time - search->time;
                if (search->exptime != 0)
                    itemstats[id].evicted_nonzero++;
                if ((search->it_flags & ITEM_FETCHED) == 0) {
                    itemstats[id].evicted_unfetched++;
                }

/* Special case. When ITEM_LOCK_GLOBAL mode is enabled, this should become a
 * no-op, as it's only called from within the item lock if necessary.

* However, we can't mix a no-op and threads which are still synchronizingto

 * GLOBAL. So instead we just always try to lock. When in GLOBAL mode this

* turns into an effective no-op. Threads re-synchronize after the powerlevel

 * switch so it should stay safe.
 */
void *item_trylock(uint32_t hv) {
    pthread_mutex_t *lock = &item_locks[hv & hashmask(item_lock_hashpower)];
    if (pthread_mutex_trylock(lock) == 0) {
        return lock;
    }
    return NULL;
}
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

 gdb reproduce(partial):

b assoc.c:211
b item_get
b items.c:158
(gdb) info thread
  Id   Target Id         Frame
  6    Thread 0x7ffff4d71700 (LWP 12647) "memcached" (running)
  5    Thread 0x7ffff5572700 (LWP 12646) "memcached" (running)
  4    Thread 0x7ffff5d73700 (LWP 12645) "memcached" (running)
* 3    Thread 0x7ffff6574700 (LWP 12644) "memcached" (running)
  2    Thread 0x7ffff6d75700 (LWP 12643) "memcached" (running)

1 Thread 0x7ffff7fe0740 (LWP 12642) "memcached" 0x00007ffff76ad9a3 inepoll_wait () at ../sysdeps/unix/syscall-template.S:81

////step 1////add a new item, modify the value of hash_items to 100000, tomaking a hash expanding.


(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff6574700 (LWP 12644))]

#0 do_item_alloc (key=0x7ffff0026414 "e", nkey=1, flags=0, exptime=0,nbytes=3, cur_hv=0) at items.c:159

159                tried_alloc = 1;
(gdb) c
Continuing.

Breakpoint 4, assoc_insert (it=0x7ffff7f35f70, hv=1187334374) at assoc.c:170
170        if (! expanding && hash_items > (hashsize(hashpower) * 3) / 2) {
(gdb) 508        it = do_item_get(key, nkey, hv);
p hash_items
$4 = 1000000
Breakpoint 1, assoc_maintenance_thread (arg=0x0) at assoc.c:211
211            item_lock_global();
(gdb) info thread
  Id   Target Id         Frame

6 Thread 0x7ffff4d71700 (LWP 12647) "memcached"assoc_maintenance_thread (arg=0x0) at assoc.c:211

  5    Thread 0x7ffff5572700 (LWP 12646) "memcached" (running)
  4    Thread 0x7ffff5d73700 (LWP 12645) "memcached" (running)
* 3    Thread 0x7ffff6574700 (LWP 12644) "memcached" (running)
  2    Thread 0x7ffff6d75700 (LWP 12643) "memcached" (running)
  1    Thread 0x7ffff7fe0740 (LWP 12642) "memcached" (running)

/////delete all key in the cache.

/////step 2/// add a item, key="71912_yhd.serial.product.get_1.0_0", to theslot 2, as the first item now.

////step 3///get the key="71912_yhd.serial.product.get_1.0_0",

///step 4 //and try to add a new key="Y_ORDER_225426358_02" to the slot 2.

Breakpoint 5, do_item_alloc(key=0x7ffff0026414 "71912_yhd.serial.product.get_1.0_0", nkey=34, flags=0,exptime=0, nbytes=3, cur_hv=0) at items.c:92

92                        const uint32_t cur_hv) {
(gdb) n

....

Breakpoint 3, item_get(key=0x7fffe0026414 "71912_yhd.serial.product.get_1.0_0", nkey=34) atthread.c:506

506        hv = hash(key, nkey);
(gdb) info thread
  Id   Target Id         Frame

6 Thread 0x7ffff4d71700 (LWP 12647) "memcached"assoc_maintenance_thread (arg=0x0) at assoc.c:211

  5    Thread 0x7ffff5572700 (LWP 12646) "memcached" (running)
  4    Thread 0x7ffff5d73700 (LWP 12645) "memcached" (running)

3 Thread 0x7ffff6574700 (LWP 12644) "memcached" do_item_alloc(key=0x7ffff0026414 "Y_ORDER_225426358_02", nkey=20, flags=0, exptime=0,nbytes=22, cur_hv=0) at items.c:147* 2 Thread 0x7ffff6d75700 (LWP 12643) "memcached" item_get(key=0x7fffe0026414 "71912_yhd.serial.product.get_1.0_0", nkey=34) atthread.c:506

  1    Thread 0x7ffff7fe0740 (LWP 12642) "memcached" (running)
(gdb) n
507        item_lock(hv);
(gdb)
508        it = do_item_get(key, nkey, hv);
(gdb)
509        item_unlock(hv);



(gdb) thread 3
[Switching to thread 3 (Thread 0x7ffff6574700 (LWP 12644))]

#0 do_item_alloc (key=0x7ffff0026414 "Y_ORDER_225426358_02", nkey=20,flags=0, exptime=0, nbytes=22, cur_hv=0) at items.c:147

147            if ((search->exptime != 0 && search->exptime < current_time)
(gdb) p search->refcount

$21 = 3 -------> now the refcount iswrong. (may be in the evicted process is better).

(gdb) bt

#0 do_item_alloc (key=0x7ffff0026414 "Y_ORDER_225426358_02", nkey=20,flags=0, exptime=0, nbytes=22, cur_hv=0) at items.c:147#1 0x0000000000417c26 in item_alloc(key=0x7ffff0026414 "Y_ORDER_225426358_02", nkey=20, flags=0, exptime=0,nbytes=22) at thread.c:495#2 0x000000000040ace9 in process_update_command (c=0x7ffff0026200,tokens=0x7ffff6573be0, ntokens=6, comm=2, handle_cas=false) atmemcached.c:3084#3 0x000000000040bde2 in process_command (c=0x7ffff0026200,command=0x7ffff0026410 "set") at memcached.c:3437#4 0x000000000040cc47 in try_read_command (c=0x7ffff0026200) atmemcached.c:3763#5 0x000000000040d958 in drive_machine (c=0x7ffff0026200) atmemcached.c:4108#6 0x000000000040e4f7 in event_handler (fd=37, which=2,arg=0x7ffff0026200) at memcached.c:4353#7 0x00007ffff7ba3f24 in event_base_loop () from/usr/lib/x86_64-linux-gnu/libevent-2.0.so.5

#8  0x0000000000417926 in worker_libevent (arg=0x635ea0) at thread.c:386

#9 0x00007ffff7980182 in start_thread (arg=0x7ffff6574700) atpthread_create.c:312#10 0x00007ffff76ad30d in clone ()at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

(gdb)

<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Connected to 127.0.0.1.
Escape character is '^]'.
set a 0 0 1
1
STORED
^[[A^[[B
ERROR
set b  0 0 1
2
STORED

get 71912_yhd.serial.product.get_1.0_0





jason@gy:~$ telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
set c 0 0 1
3
STORED

set 71912_yhd.serial.product.get_1.0_0 0 0 1
3
STORED
delete 71912_yhd.serial.product.get_1.0_0
DELETED
set 71912_yhd.serial.product.get_1.0_0 0 0 1
3
STORED
delete Y_ORDER_225426358_02
delete a
delete b
delete c
NOT_FOUND
DELETED
NOT_FOUND
NOT_FOUND
set Y_ORDER_225426358_02 0 0 20
aaaaaaaaaaaaaaaaaaaa

//////////////////////////////////////////

I will try to write a test script to reproduce it, and try to commit apatch to fix it.

--

You received this message because this project is configured to send allissue notifications to this address.

You may adjust your notification preferences at:
https://code.google.com/hosting/settings

--

---You received this message because you are subscribed to the Google Groups "memcached" group.

To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Re: Issue 370 in memcached: memcached(v1.4.15) dead loop bug.

Reply via email to