Troy and I stumbled across a bug that, at least for our
configurations, causes a double-free on the server when cleaning up
'stale' connections.

It seems that some logic is duplicated in the code that cleans up
connections that have disappeared from the server's point of view:

#if !MEMCACHE_BOUNCEBUF
            if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE)  /* --- here --- */
                memcache_deregister(ib_device->memcache, &rq->buflist);
#  if MEMCACHE_EARLY_REG
            /* pin on post, dereg all these */
            if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
                rq->state.recv == RQ_RTS_WAITING_RTS_DONE)  /* --- and here --- */
                memcache_deregister(ib_device->memcache, &rq->buflist);
            if (rq->state.recv == RQ_WAITING_INCOMING
              && rq->buflist.tot_len > ib_device->eager_buf_payload)
                memcache_deregister(ib_device->memcache, &rq->buflist);
#  endif

The patch below cleans it up, and compiles and runs *here*, but we
still have some weird hanging issues after closing connections:
basically, no new unexpected requests are processed until after the
BMItimeout and/or FlowTimeout (pvfs2.conf) have expired.  We're not
sure if that is a separate issue.  Without this patch the server hangs
and dies; with it, we just hang.  This fixes the segfault, which is the
more major issue.
The patch is against CVS head from 20 minutes ago.


Index: src/io/bmi/bmi_ib/ib.c
===================================================================
RCS file: /anoncvs/pvfs2/src/io/bmi/bmi_ib/ib.c,v
retrieving revision 1.68
diff -r1.68 ib.c
1493,1494d1488
<           if (sq->state.send == SQ_WAITING_DATA_SEND_COMPLETION)
<               memcache_deregister(ib_device->memcache, &sq->buflist);
1502c1496,1499
< #  endif
---
> #  else /* MEMCACHE_EARLY_REG */
>           if (sq->state.send == SQ_WAITING_DATA_SEND_COMPLETION)
>               memcache_deregister(ib_device->memcache, &sq->buflist);
> #  endif /* !MEMCACHE_EARLY_REG */
1511,1512d1507
<           if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE)
<               memcache_deregister(ib_device->memcache, &rq->buflist);
1521c1516,1519
< #  endif
---
> #  else  /* MEMCACHE_EARLY_REG */
>           if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE)
>               memcache_deregister(ib_device->memcache, &rq->buflist);
> #  endif /* !MEMCACHE_EARLY_REG */
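
To show the overlap outside of ib.c, here is a minimal standalone
sketch of the receive-side logic before and after the patch.  Everything
in it is a hypothetical stand-in: struct mock_rq, mock_deregister(),
cleanup_before() and cleanup_after() are not real bmi_ib names, the
deregistration is replaced by a call counter, and the eager-buffer
check for RQ_WAITING_INCOMING is left out.  It assumes
MEMCACHE_EARLY_REG is enabled, which is the case where the two tests
overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: early registration is on, as in the failing configuration. */
#ifndef MEMCACHE_EARLY_REG
#define MEMCACHE_EARLY_REG 1
#endif

/* Hypothetical subset of the receive states named in the excerpt. */
enum recv_state {
    RQ_WAITING_INCOMING,
    RQ_RTS_WAITING_CTS_SEND_COMPLETION,
    RQ_RTS_WAITING_RTS_DONE
};

/* Mock receive request: counts deregistrations instead of freeing. */
struct mock_rq {
    enum recv_state state;
    int dereg_calls;
};

/* Stand-in for memcache_deregister(); only records that it was called. */
static void mock_deregister(struct mock_rq *rq)
{
    rq->dereg_calls++;
}

/* Pre-patch shape: an unconditional test followed by the early-reg block. */
static int cleanup_before(struct mock_rq *rq)
{
    assert(rq != NULL);
    rq->dereg_calls = 0;
    if (rq->state == RQ_RTS_WAITING_RTS_DONE)
        mock_deregister(rq);
#if MEMCACHE_EARLY_REG
    if (rq->state == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
        rq->state == RQ_RTS_WAITING_RTS_DONE)
        mock_deregister(rq);   /* second call for the same buflist */
#endif
    return rq->dereg_calls;
}

/* Post-patch shape: the single-state test moves into the #else branch. */
static int cleanup_after(struct mock_rq *rq)
{
    assert(rq != NULL);
    rq->dereg_calls = 0;
#if MEMCACHE_EARLY_REG
    if (rq->state == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
        rq->state == RQ_RTS_WAITING_RTS_DONE)
        mock_deregister(rq);
#else
    if (rq->state == RQ_RTS_WAITING_RTS_DONE)
        mock_deregister(rq);
#endif
    return rq->dereg_calls;
}
```

With the pre-patch shape, a request in RQ_RTS_WAITING_RTS_DONE is
counted twice; with the post-patch shape each state is deregistered at
most once, whichever way MEMCACHE_EARLY_REG is set.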



Thanks,
    +=Kyle

-- 
Kyle Schochenmaier
