I think this is a fairly critical fix and will affect the majority of the ib users I know of, who are all running cvs head for various reasons. The patch only moves two #defines around so I'm not sure why we should hold off on it, maybe I'm missing something?
Kyle On Tue, Apr 8, 2008 at 1:47 PM, Sam Lang <[EMAIL PROTECTED]> wrote: > > Does this need to go in the next release? > -sam > > > > On Apr 8, 2008, at 1:43 PM, Pete Wyckoff wrote: > > > > > > > > > [EMAIL PROTECTED] wrote on Tue, 08 Apr 2008 13:17 -0500: > > > > > Troy and I stumbled across this bug, that at least for our > > > configurations, causes a double-free on the server when cleaning up > > > 'stale' connections. > > > > > > It seems that some logic is duplicated here when cleaning up > > > connections that disappear to the server: > > > > > > #if !MEMCACHE_BOUNCEBUF > > > if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE) /* > > > --------------- here-------------- */ > > > memcache_deregister(ib_device->memcache, &rq->buflist); > > > # if MEMCACHE_EARLY_REG > > > /* pin on post, dereg all these */ > > > if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION || > > > rq->state.recv == RQ_RTS_WAITING_RTS_DONE) /* > > > ----------------- and here ------------ */ > > > memcache_deregister(ib_device->memcache, &rq->buflist); > > > if (rq->state.recv == RQ_WAITING_INCOMING > > > && rq->buflist.tot_len > ib_device->eager_buf_payload) > > > memcache_deregister(ib_device->memcache, &rq->buflist); > > > # endif > > > > > > The patch below worked cleans it up, and compiles and runs *here*, but > > > we have some weird hanging issues after closing connections, basically > > > no new unexpected requests are processed until after the BMItimeout > > > and/or FlowTimeout (pvfs2.conf) have expired. We're not sure if that > > > is a seperate issue, it hangs and dies w/o this patch, now we just > > > hang. This fixes the segfault which is a more major issue. > > > Patch is against CVS head from 20 minutes ago. > > > > > > > Yep, good stuff. I was too eager in adding extra cancel states > > back when Troy pointed out there were problems. This fixes the > > bugs that went in with that patch on 15 feb 08. I think it should > > do the saame thing as what you were aiming at. Let me know if > > you like it. > > > > -- Pete > > > > P.S. Put a line "diff -uNp" in your ~/.cvsrc. Unified diffs are > > so much easier to read. > > > > commit 0872d4cace8e9cc98ac1a0831d56fb784f2fc0ff > > Author: Pete Wyckoff <[EMAIL PROTECTED]> > > Date: Tue Apr 8 14:40:37 2008 -0400 > > > > bmi ib: fix double free in cancel > > > > Earlier checkin to clean up more thoroughly during a cancel > > errantly introduced a double free in certain states. Remove > > those. > > > > diff --git a/src/io/bmi/bmi_ib/ib.c b/src/io/bmi/bmi_ib/ib.c > > index f8d6b41..76deb7d 100644 > > --- a/src/io/bmi/bmi_ib/ib.c > > +++ b/src/io/bmi/bmi_ib/ib.c > > @@ -1496,8 +1496,7 @@ BMI_ib_cancel(bmi_op_id_t id, bmi_context_id > context_id __unused) > > /* pin when sending rts, so also must dereg in this state */ > > if (sq->state.send == SQ_WAITING_RTS_SEND_COMPLETION || > > sq->state.send == SQ_WAITING_RTS_SEND_COMPLETION_GOT_CTS || > > - sq->state.send == SQ_WAITING_CTS || > > - sq->state.send == SQ_WAITING_DATA_SEND_COMPLETION) > > + sq->state.send == SQ_WAITING_CTS) > > memcache_deregister(ib_device->memcache, &sq->buflist); > > # endif > > #endif > > @@ -1512,8 +1511,7 @@ BMI_ib_cancel(bmi_op_id_t id, bmi_context_id > context_id __unused) > > memcache_deregister(ib_device->memcache, &rq->buflist); > > # if MEMCACHE_EARLY_REG > > /* pin on post, dereg all these */ > > - if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION || > > - rq->state.recv == RQ_RTS_WAITING_RTS_DONE) > > + if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION) > > memcache_deregister(ib_device->memcache, &rq->buflist); > > if (rq->state.recv == RQ_WAITING_INCOMING > > && rq->buflist.tot_len > ib_device->eager_buf_payload) > > _______________________________________________ > > Pvfs2-developers mailing list > > [email protected] > > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers > > > > > > -- Kyle Schochenmaier _______________________________________________ Pvfs2-developers mailing list [email protected] http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
