I think this is a fairly critical fix and will affect the majority of
the ib users I know of, who are all running cvs head for various
reasons.
The patch only moves two #defines around so I'm not sure why we should
hold off on it, maybe I'm missing something?

Kyle

On Tue, Apr 8, 2008 at 1:47 PM, Sam Lang <[EMAIL PROTECTED]> wrote:
>
>  Does this need to go in the next release?
>  -sam
>
>
>
>  On Apr 8, 2008, at 1:43 PM, Pete Wyckoff wrote:
>
> >
> >
> >
> > [EMAIL PROTECTED] wrote on Tue, 08 Apr 2008 13:17 -0500:
> >
> > > Troy and I stumbled across this bug, that at least for our
> > > configurations, causes a double-free on the server when cleaning up
> > > 'stale' connections.
> > >
> > > It seems that some logic is duplicated here when cleaning up
> > > connections that disappear to the server:
> > >
> > > #if !MEMCACHE_BOUNCEBUF
> > >           if (rq->state.recv == RQ_RTS_WAITING_RTS_DONE)  /*
> > > --------------- here-------------- */
> > >               memcache_deregister(ib_device->memcache, &rq->buflist);
> > > #  if MEMCACHE_EARLY_REG
> > >           /* pin on post, dereg all these */
> > >           if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
> > >               rq->state.recv == RQ_RTS_WAITING_RTS_DONE)    /*
> > > ----------------- and here ------------ */
> > >               memcache_deregister(ib_device->memcache, &rq->buflist);
> > >           if (rq->state.recv == RQ_WAITING_INCOMING
> > >             && rq->buflist.tot_len > ib_device->eager_buf_payload)
> > >               memcache_deregister(ib_device->memcache, &rq->buflist);
> > > #  endif
> > >
> > > The patch below worked cleans it up, and compiles and runs *here*, but
> > > we have some weird hanging issues after closing connections, basically
> > > no new unexpected requests are processed until after the BMItimeout
> > > and/or FlowTimeout (pvfs2.conf)  have expired.  We're not sure if that
> > > is a seperate issue, it hangs and dies w/o this patch, now we just
> > > hang.  This fixes the segfault which is a more major issue.
> > > Patch is against CVS head from 20 minutes ago.
> > >
> >
> > Yep, good stuff.  I was too eager in adding extra cancel states
> > back when Troy pointed out there were problems.  This fixes the
> > bugs that went in with that patch on 15 feb 08.  I think it should
> > do the saame thing as what you were aiming at.  Let me know if
> > you like it.
> >
> >                -- Pete
> >
> > P.S.  Put a line "diff -uNp" in your ~/.cvsrc.  Unified diffs are
> > so much easier to read.
> >
> > commit 0872d4cace8e9cc98ac1a0831d56fb784f2fc0ff
> > Author: Pete Wyckoff <[EMAIL PROTECTED]>
> > Date:   Tue Apr 8 14:40:37 2008 -0400
> >
> >   bmi ib: fix double free in cancel
> >
> >   Earlier checkin to clean up more thoroughly during a cancel
> >   errantly introduced a double free in certain states.  Remove
> >   those.
> >
> > diff --git a/src/io/bmi/bmi_ib/ib.c b/src/io/bmi/bmi_ib/ib.c
> > index f8d6b41..76deb7d 100644
> > --- a/src/io/bmi/bmi_ib/ib.c
> > +++ b/src/io/bmi/bmi_ib/ib.c
> > @@ -1496,8 +1496,7 @@ BMI_ib_cancel(bmi_op_id_t id, bmi_context_id
> context_id __unused)
> >            /* pin when sending rts, so also must dereg in this state */
> >            if (sq->state.send == SQ_WAITING_RTS_SEND_COMPLETION ||
> >                sq->state.send == SQ_WAITING_RTS_SEND_COMPLETION_GOT_CTS ||
> > -               sq->state.send == SQ_WAITING_CTS ||
> > -               sq->state.send == SQ_WAITING_DATA_SEND_COMPLETION)
> > +               sq->state.send == SQ_WAITING_CTS)
> >                memcache_deregister(ib_device->memcache, &sq->buflist);
> > #  endif
> > #endif
> > @@ -1512,8 +1511,7 @@ BMI_ib_cancel(bmi_op_id_t id, bmi_context_id
> context_id __unused)
> >                memcache_deregister(ib_device->memcache, &rq->buflist);
> > #  if MEMCACHE_EARLY_REG
> >            /* pin on post, dereg all these */
> > -           if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION ||
> > -               rq->state.recv == RQ_RTS_WAITING_RTS_DONE)
> > +           if (rq->state.recv == RQ_RTS_WAITING_CTS_SEND_COMPLETION)
> >                memcache_deregister(ib_device->memcache, &rq->buflist);
> >            if (rq->state.recv == RQ_WAITING_INCOMING
> >              && rq->buflist.tot_len > ib_device->eager_buf_payload)
> > _______________________________________________
> > Pvfs2-developers mailing list
> > [email protected]
> > http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
> >
> >
>
>



-- 
Kyle Schochenmaier
_______________________________________________
Pvfs2-developers mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers

Reply via email to