Troy and I stumbled across a bug that, at least for our
configurations, causes a double-free on the server when cleaning up
'stale' connections.
It seems that some logic is duplicated in the code that cleans up
connections that have disappeared from the server's point of view:
#if !MEMCACHE_BOUNCEBUF
    if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE)  /* <--- here */
        memcache_deregister(ib_device->memcache, &rq->buflist);
#  if MEMCACHE_EARLY_REG
    /* pin on post, dereg all these */
    if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
        rq->state.recv == RQ_RTS_WAITING_RTS_DONE)  /* <--- and here */
        memcache_deregister(ib_device->memcache, &rq->buflist);
    if (rq->state.recv == RQ_WAITING_INCOMING
        && rq->buflist.tot_len > ib_device->eager_buf_payload)
        memcache_deregister(ib_device->memcache, &rq->buflist);
#  endif
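To make the duplication concrete, here is a minimal standalone sketch of
the receive-side logic, not the real ib.c code. It assumes a build with
MEMCACHE_BOUNCEBUF off and MEMCACHE_EARLY_REG on. The names
mock_deregister, cleanup_original, cleanup_patched and count_deregs are
hypothetical stand-ins, the mock only counts calls instead of unpinning
memory, and the RQ_WAITING_INCOMING branch is left out for brevity.

```c
/* Assumed build configuration that exposes the duplicate call. */
#define MEMCACHE_BOUNCEBUF 0
#define MEMCACHE_EARLY_REG 1

/* Simplified stand-ins for the receive states used in ib.c. */
enum recv_state {
    RQ_WAITING_INCOMING,
    RQ_RTS_WAITING_CTS_SEND_COMPLETION,
    RQ_RTS_WAITING_RTS_DONE
};

/* Hypothetical buffer list that only records deregistration calls. */
struct buflist {
    int dereg_calls;
};

/* Mock of memcache_deregister(): counts calls, frees nothing. */
static void mock_deregister(struct buflist *bl)
{
    bl->dereg_calls++;
}

/* Cleanup logic shaped like the code before the patch. */
static void cleanup_original(enum recv_state st, struct buflist *bl)
{
#if !MEMCACHE_BOUNCEBUF
    if (st == RQ_RTS_WAITING_RTS_DONE)
        mock_deregister(bl);            /* first deregistration */
#  if MEMCACHE_EARLY_REG
    if (st == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
        st == RQ_RTS_WAITING_RTS_DONE)
        mock_deregister(bl);            /* second one, same state */
#  endif
#endif
}

/* Cleanup logic shaped like the patch: the plain check moves to #else. */
static void cleanup_patched(enum recv_state st, struct buflist *bl)
{
#if !MEMCACHE_BOUNCEBUF
#  if MEMCACHE_EARLY_REG
    if (st == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
        st == RQ_RTS_WAITING_RTS_DONE)
        mock_deregister(bl);
#  else
    if (st == RQ_RTS_WAITING_RTS_DONE)
        mock_deregister(bl);
#  endif
#endif
}

/* Run one cleanup variant on a fresh buffer list and report the count. */
static int count_deregs(void (*cleanup)(enum recv_state, struct buflist *),
                        enum recv_state st)
{
    struct buflist bl = { 0 };
    cleanup(st, &bl);
    return bl.dereg_calls;
}
```

Under these assumptions, count_deregs(cleanup_original,
RQ_RTS_WAITING_RTS_DONE) reports two deregistrations of the same buffer
list, while cleanup_patched reports one. The other states behave the
same in both variants.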
The patch below cleans it up, and compiles and runs *here*, but we
still see some weird hangs after closing connections: basically, no
new unexpected requests are processed until the BMItimeout and/or
FlowTimeout (pvfs2.conf) have expired. We're not sure whether that is
a separate issue. Without this patch the server hangs and then dies;
with it, we just hang. The patch fixes the segfault, which is the more
serious problem.
Patch is against CVS head from 20 minutes ago.
Index: src/io/bmi/bmi_ib/ib.c
===================================================================
RCS file: /anoncvs/pvfs2/src/io/bmi/bmi_ib/ib.c,v
retrieving revision 1.68
diff -r1.68 ib.c
1493,1494d1488
< if (sq->state.send == SQ_WAITING_DATA_SEND_COMPLETION)
< memcache_deregister(ib_device->memcache, &sq->buflist);
1502c1496,1499
< # endif
---
> # else /* MEMCACHE_EARLY_REG */
> if (sq->state.send == SQ_WAITING_DATA_SEND_COMPLETION)
> memcache_deregister(ib_device->memcache, &sq->buflist);
> # endif /* !MEMCACHE_EARLY_REG */
1511,1512d1507
< if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE)
< memcache_deregister(ib_device->memcache, &rq->buflist);
1521c1516,1519
< # endif
---
> # else /* MEMCACHE_EARLY_REG */
> if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE)
> memcache_deregister(ib_device->memcache, &rq->buflist);
> # endif /* !MEMCACHE_EARLY_REG */
Thanks,
+=Kyle
--
Kyle Schochenmaier