Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Thu, 28 Oct 2004 10:42:23 -0400 Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> On Thu, 2004-10-28 at 02:01, Sean Hefty wrote:
> > I've run into a few other issues trying to use separate send queues.
> >
> > One of note is that receives are posted to the QP outside of the
> > lock that inserts them onto the recv_posted_mad_list.
>
> I couldn't find where you were referring to. Can you point me at it?

I think it was in ib_mad_post_receive_mad.

> Also, if this is locked, should we go to finer grained locks?
> Currently there is a lock for the receive list, but might a lock per
> receive list per QP be better?

I've changed the code to use a lock per QP. So, each QP now has its own send list, receive list, send lock, and receive lock.

- Sean

___
openib-general mailing list
[EMAIL PROTECTED]
http://openib.org/mailman/listinfo/openib-general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
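A minimal userspace sketch of the per-QP layout described in this message: each QP carries its own send list and receive list, each with its own lock, and a receive is posted and queued under the same lock. Everything here is an illustrative assumption rather than the actual ib_mad code: the struct and function names are hypothetical, a C11 `atomic_flag` spinlock stands in for a kernel spinlock, and `fake_post_recv` stands in for the real post to the hardware queue.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list node; the kernel code would embed a struct list_head. */
struct mad_node {
    struct mad_node *next;
};

struct mad_list {
    atomic_flag lock;        /* spinlock stand-in; protects head, tail, count */
    struct mad_node *head;
    struct mad_node *tail;
    int count;
};

/* One of these per QP, so QP0 and QP1 never contend on the same lock. */
struct qp_info {
    struct mad_list send;
    struct mad_list recv;
};

static void list_lock(struct mad_list *l)
{
    while (atomic_flag_test_and_set_explicit(&l->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void list_unlock(struct mad_list *l)
{
    atomic_flag_clear_explicit(&l->lock, memory_order_release);
}

static void mad_list_init(struct mad_list *l)
{
    atomic_flag_clear(&l->lock);
    l->head = l->tail = NULL;
    l->count = 0;
}

static void qp_info_init(struct qp_info *qp)
{
    mad_list_init(&qp->send);
    mad_list_init(&qp->recv);
}

/* Stand-in for posting to the hardware receive queue; always succeeds here. */
static int fake_post_recv(struct mad_node *n)
{
    (void)n;
    return 0;
}

/* Post and enqueue under one lock so the list order matches the post order. */
static int post_receive(struct qp_info *qp, struct mad_node *n)
{
    int ret;

    list_lock(&qp->recv);
    ret = fake_post_recv(n);
    if (ret == 0) {
        n->next = NULL;
        if (qp->recv.tail)
            qp->recv.tail->next = n;
        else
            qp->recv.head = n;
        qp->recv.tail = n;
        qp->recv.count++;
    }
    list_unlock(&qp->recv);
    return ret;
}
```

Holding the lock across both the post and the list insert is what closes the window mentioned earlier in the thread, where a receive could reach the QP before it appeared on the list.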
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Thu, 2004-10-28 at 02:01, Sean Hefty wrote:
> I've run into a few other issues trying to use separate send queues.
> One of note is that receives are posted to the QP outside of the lock
> that inserts them onto the recv_posted_mad_list.

I couldn't find where you were referring to. Can you point me at it?

> I don't think that this causes a problem at the moment, since receives are
> always re-posted from the completion handler, which is single threaded.
>
> Question then, should I go ahead and fix this so that it would work in a
> multi-threaded case, or assume that completion handling will be single
> threaded and optimize for this by removing unnecessary locking?

Also, if this is locked, should we go to finer grained locks? Currently there is a lock for the receive list, but might a lock per receive list per QP be better?

> (Currently, my patch fixes the locking, but it should be noted that the
> code won't actually test that the locking is correct as it's written.)

I guess we'll just do it by code inspection or someone should develop test(s) or real case(s) for this.

-- Hal
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Wed, 27 Oct 2004 09:53:28 -0700 Sean Hefty <[EMAIL PROTECTED]> wrote:
> I'll create a patch that uses separate send_posted_mad_list's for
> QP0/1, but try to keep the changes fairly minimal. I'll do this after
> changing the completion handling to use the current workqueue, rather
> than allocating a separate thread. (I've canned my user-mode work,
> since Roland is further along.)

I've run into a few other issues trying to use separate send queues.

One of note is that receives are posted to the QP outside of the lock that inserts them onto the recv_posted_mad_list. I don't think that this causes a problem at the moment, since receives are always re-posted from the completion handler, which is single threaded.

Question then, should I go ahead and fix this so that it would work in a multi-threaded case, or assume that completion handling will be single threaded and optimize for this by removing unnecessary locking?

(Currently, my patch fixes the locking, but it should be noted that the code won't actually test that the locking is correct as it's written.)

- Sean
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Tue, 26 Oct 2004 13:14:00 -0400 Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> On Tue, 2004-10-26 at 13:10, Roland Dreier wrote:
> > Sean> As a suggestion, we can allocate 2 CQs per QP, one for
> > Sean> receives, and one for sends. This would let us separate
> > Sean> send from receive completions based on the callback.
> >
> > That's one solution, and another way to handle it is to have a way
> > of distinguishing sends from receives based on wr_id (that's what
> > the Topspin stack does).
>
> That's where I was heading with this. It implies a "stolen" bit in the
> WRID.
>
> > Not sure which is better really.
>
> Me neither but Sean seems to feel strongly about the CQ separation.

Just to make sure that we don't have duplicate efforts, I've been working on the patch to fix handling of send completions. My plan is to use one send_posted_mad_list per QP, to make it faster/easier to find the correct send completion, plus allow for easier error handling when one of the special QPs goes into the error state. The code currently maintains a single CQ per port.

- Sean
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Wed, 27 Oct 2004 10:20:05 -0400 Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> On Tue, 2004-10-26 at 12:50, Sean Hefty wrote:
> > I think we have other issues with the completion handling as well.
> > Since we use a single CQ for both QPs, I think that we need to search
> > the send_posted_mad_list to find the corresponding completion.
> > We cannot assume that the completion matches with the request at the
> > head of the list.
> >
> > This appears to be broken in the non-error case as well.
> >
> > I will happily create a patch to fix these issues.
>
> Is it worth fixing this for the current approach or should I just wait
> for this patch?

I'll create a patch that uses separate send_posted_mad_list's for QP0/1, but try to keep the changes fairly minimal. I'll do this after changing the completion handling to use the current workqueue, rather than allocating a separate thread. (I've canned my user-mode work, since Roland is further along.)

- Sean
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Tue, 2004-10-26 at 12:50, Sean Hefty wrote:
> I think we have other issues with the completion handling as well.
> Since we use a single CQ for both QPs, I think that we need to search
> the send_posted_mad_list to find the corresponding completion.
> We cannot assume that the completion matches with the request at the
> head of the list.
>
> This appears to be broken in the non-error case as well.
>
> I will happily create a patch to fix these issues.

Is it worth fixing this for the current approach or should I just wait for this patch?

Thanks.

-- Hal
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Tue, 26 Oct 2004 13:03:58 -0400 Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> On Tue, 2004-10-26 at 12:50, Sean Hefty wrote:
> > > Another alternative is to assume it is a receive if a send is
> > > not matched.
> >
> > I think we have other issues with the completion handling as well.
> > Since we use a single CQ for both QPs, I think that we need to search
> > the send_posted_mad_list to find the corresponding completion.
> > We cannot assume that the completion matches with the request at the
> > head of the list.
> >
> > This appears to be broken in the non-error case as well.
>
> Right.
>
> > I will happily create a patch to fix these issues.
>
> Just wondering... will the patch change to a CQ/QP or leave it as 1
> CQ/port? (BTW, there was a patch a long time ago on this which was lost
> in the shuffle. Sorry).

I was just looking at the other error handling cases to see what would make the most sense. At a minimum, I think that we want two send_posted_mad_list's, one per QP, in order to recover from errors on one of the QPs. Having a single list makes it more complicated to restart a QP.

From a software viewpoint, I think that 2 CQs per QP, for a total of 4 per port, would make the code the simplest, and probably allow for the most optimization wrt completion processing and QP size. (My assumption is that the memory cost for 4 smaller CQs would basically be the same as 1 or 2 larger CQs.)

Of course, we can always use a single CQ and just set the wr_id to something that can differentiate between which send/receive queue we're trying to process.

- Sean
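To illustrate why one send list per QP simplifies error recovery, here is a small sketch in which a QP that enters the error state has only its own outstanding sends flushed, while the other QP's list is left alone. The types and helpers (`struct port`, `post_send`, `flush_qp`) are hypothetical names invented for this example, not the ib_mad interfaces.

```c
#include <stddef.h>

#define NUM_SPECIAL_QPS 2   /* QP0 and QP1 */

/* Hypothetical outstanding send request. */
struct send_wr {
    struct send_wr *next;
    int flushed;             /* set when completed with a flush error */
};

/* One send list per special QP instead of a single list shared per port. */
struct port {
    struct send_wr *send_list[NUM_SPECIAL_QPS];
};

/* Track a send on the list that belongs to the given QP index (0 or 1). */
static void post_send(struct port *p, int qp_index, struct send_wr *wr)
{
    wr->flushed = 0;
    wr->next = p->send_list[qp_index];
    p->send_list[qp_index] = wr;
}

/* Complete every send still outstanding on one QP; returns how many. */
static int flush_qp(struct port *p, int qp_index)
{
    int n = 0;

    while (p->send_list[qp_index]) {
        struct send_wr *wr = p->send_list[qp_index];

        p->send_list[qp_index] = wr->next;
        wr->next = NULL;
        wr->flushed = 1;     /* stand-in for reporting a flush error upward */
        n++;
    }
    return n;
}
```

With a single shared list, the same flush would have to walk every entry and test which QP it was posted on before deciding whether to complete it.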
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
Hello!
Quoting r. Roland Dreier ([EMAIL PROTECTED]) "Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler":
> Sean> As a suggestion, we can allocate 2 CQs per QP, one for
> Sean> receives, and one for sends. This would let us separate
> Sean> send from receive completions based on the callback.
>
> That's one solution, and another way to handle it is to have a way of
> distinguishing sends from receives based on wr_id (that's what the
> Topspin stack does).
>
> Not sure which is better really.
>
> - Roland

If you have 2 CQs you could have separate threads handling sends and receives, waking up only the relevant one.

MST
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Tue, 2004-10-26 at 13:10, Roland Dreier wrote:
> Sean> As a suggestion, we can allocate 2 CQs per QP, one for
> Sean> receives, and one for sends. This would let us separate
> Sean> send from receive completions based on the callback.
>
> That's one solution, and another way to handle it is to have a way of
> distinguishing sends from receives based on wr_id (that's what the
> Topspin stack does).

That's where I was heading with this. It implies a "stolen" bit in the WRID.

> Not sure which is better really.

Me neither but Sean seems to feel strongly about the CQ separation.

-- Hal
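The "stolen" bit idea can be sketched as follows: one bit of the 64-bit work request ID is reserved to mark the direction, and the remaining bits carry the caller's identifier. Which bit is used, and the helper names, are assumptions made for this example; the thread does not say how the Topspin stack encodes it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the top bit of wr_id is reserved as the send/receive flag. */
#define WRID_SEND_BIT (UINT64_C(1) << 63)

/* Tag an identifier as belonging to a send work request. */
static uint64_t make_send_wrid(uint64_t id)
{
    return id | WRID_SEND_BIT;
}

/* Tag an identifier as belonging to a receive work request. */
static uint64_t make_recv_wrid(uint64_t id)
{
    return id & ~WRID_SEND_BIT;
}

/* True when the completion's wr_id was produced by make_send_wrid(). */
static bool wrid_is_send(uint64_t wr_id)
{
    return (wr_id & WRID_SEND_BIT) != 0;
}

/* Recover the caller's identifier with the flag bit masked off. */
static uint64_t wrid_id(uint64_t wr_id)
{
    return wr_id & ~WRID_SEND_BIT;
}
```

The cost is that identifiers are limited to 63 bits, so a wr_id that would otherwise hold an arbitrary 64-bit value (a kernel pointer, for instance) needs a different free bit or an index instead. The benefit is that a single CQ can still tell sends from receives, including when the completion status is an error.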
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
Sean> As a suggestion, we can allocate 2 CQs per QP, one for
Sean> receives, and one for sends. This would let us separate
Sean> send from receive completions based on the callback.

That's one solution, and another way to handle it is to have a way of distinguishing sends from receives based on wr_id (that's what the Topspin stack does).

Not sure which is better really.

- Roland
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Tue, 2004-10-26 at 12:50, Sean Hefty wrote:
> > Another alternative is to assume it is a receive if a send is
> > not matched.
>
> I think we have other issues with the completion handling as well.
> Since we use a single CQ for both QPs, I think that we need to search
> the send_posted_mad_list to find the corresponding completion.
> We cannot assume that the completion matches with the request at the
> head of the list.
>
> This appears to be broken in the non-error case as well.

Right.

> I will happily create a patch to fix these issues.

Just wondering... will the patch change to a CQ/QP or leave it as 1 CQ/port? (BTW, there was a patch a long time ago on this which was lost in the shuffle. Sorry).

-- Hal
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Tue, 26 Oct 2004 12:44:56 -0400 Hal Rosenstock <[EMAIL PROTECTED]> wrote:
> On Tue, 2004-10-26 at 12:30, Sean Hefty wrote:
> > > I think this is still not quite right: what if a receive fails?
> >
> > As a suggestion, we can allocate 2 CQs per QP, one for receives,
> > and one for sends. This would let us separate send from receive
> > completions based on the callback.
>
> Another alternative is to assume it is a receive if a send is
> not matched.

I think we have other issues with the completion handling as well. Since we use a single CQ for both QPs, I think that we need to search the send_posted_mad_list to find the corresponding completion. We cannot assume that the completion matches with the request at the head of the list.

This appears to be broken in the non-error case as well.

I will happily create a patch to fix these issues.

- Sean
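The problem with taking the head of a shared list can be shown with a short sketch: when sends from both QPs sit on one list and complete through one CQ, the completion has to be matched by wr_id rather than by position. The list type and function names below are hypothetical and only model the lookup, not the real send_posted_mad_list handling.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical posted send, identified by the wr_id given to the QP. */
struct send_wr {
    uint64_t wr_id;
    struct send_wr *next;
};

struct send_list {
    struct send_wr *head;
};

/* Append at the tail so the list reflects the order of posting. */
static void send_list_add(struct send_list *l, struct send_wr *wr)
{
    struct send_wr **pp = &l->head;

    while (*pp)
        pp = &(*pp)->next;
    wr->next = NULL;
    *pp = wr;
}

/* Find and unlink the entry matching a completion; NULL if none matches. */
static struct send_wr *send_list_take(struct send_list *l, uint64_t wr_id)
{
    struct send_wr **pp;

    for (pp = &l->head; *pp; pp = &(*pp)->next) {
        if ((*pp)->wr_id == wr_id) {
            struct send_wr *found = *pp;

            *pp = found->next;
            found->next = NULL;
            return found;
        }
    }
    return NULL;
}
```

If a send posted on QP1 completes before an earlier send posted on QP0, the match is found in the middle of the list, which is exactly the case a head-only lookup gets wrong. A NULL result is also the hook for Hal's alternative of treating an unmatched completion as a receive.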
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Tue, 2004-10-26 at 12:30, Sean Hefty wrote:
> > I think this is still not quite right: what if a receive fails?
>
> As a suggestion, we can allocate 2 CQs per QP, one for receives,
> and one for sends. This would let us separate send from receive
> completions based on the callback.

Another alternative is to assume it is a receive if a send is not matched.

-- Hal
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
On Tue, 26 Oct 2004 09:14:03 -0700 Roland Dreier <[EMAIL PROTECTED]> wrote:
>         if (wc.status != IB_WC_SUCCESS) {
>                 printk(KERN_ERR PFX "Completion error %d WRID 0x%Lx\n",
>                        wc.status, (unsigned long long)
> wc.wr_id);
> +               ib_mad_send_done_handler(port_priv, &wc);
>         } else {
>
> I think this is still not quite right: what if a receive fails?

As a suggestion, we can allocate 2 CQs per QP, one for receives, and one for sends. This would let us separate send from receive completions based on the callback.
Re: [openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
                if (wc.status != IB_WC_SUCCESS) {
                        printk(KERN_ERR PFX "Completion error %d WRID 0x%Lx\n",
                               wc.status, (unsigned long long) wc.wr_id);
+                       ib_mad_send_done_handler(port_priv, &wc);
                } else {

I think this is still not quite right: what if a receive fails?

- R.
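The concern raised here can be sketched as a dispatch that decides the direction first and only then looks at the status, so that a failed receive is never handed to the send path. This is an illustration under stated assumptions: the `is_send` field is assumed to be known already (for example from a separate CQ or from the wr_id), and the enum, struct and counter names are hypothetical, not the ib_mad or verbs definitions.

```c
/* Hypothetical status codes for this sketch. */
enum wc_status {
    WC_OK = 0,
    WC_ERROR = 1
};

/* Hypothetical completion record. */
struct completion {
    enum wc_status status;
    int is_send;             /* assumption: direction is known up front */
};

/* Counters standing in for the real send-done and receive-done handlers. */
struct port_stats {
    int sends_completed;     /* sends reported back to the agent */
    int sends_failed;        /* subset of the above that carried an error */
    int recvs_delivered;     /* receives passed up to a consumer */
    int recvs_dropped;       /* failed receives released, not delivered */
};

static void handle_completion(struct port_stats *st,
                              const struct completion *wc)
{
    if (wc->is_send) {
        /* Always complete a send, even on error, so nothing waits forever. */
        st->sends_completed++;
        if (wc->status != WC_OK)
            st->sends_failed++;
    } else if (wc->status == WC_OK) {
        st->recvs_delivered++;
    } else {
        /* A failed receive has no data to deliver; just release the buffer. */
        st->recvs_dropped++;
    }
}
```

Calling the send-done handler for every error, as the patch does, corresponds to dropping the `is_send` test, which is the case this reply questions.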
[openib-general] [PATCH] ib_mad: In completion handler, when status != success call send done handler
ib_mad: In completion handler, when status != success call send done handler so that if a request fails we'll complete it. E.g., if a consumer uses a bad L_Key for a send request, the send will complete (and if the consumer ever does ib_unregister_mad_agent it will not hang waiting for the send to finish).

Index: ib_mad.c
===================================================================
--- ib_mad.c    (revision 1070)
+++ ib_mad.c    (working copy)
@@ -1168,6 +1168,7 @@
                 if (wc.status != IB_WC_SUCCESS) {
                         printk(KERN_ERR PFX "Completion error %d WRID 0x%Lx\n",
                                wc.status, (unsigned long long) wc.wr_id);
+                        ib_mad_send_done_handler(port_priv, &wc);
                 } else {
                         printk(KERN_DEBUG PFX "Completion opcode 0x%x WRID 0x%Lx\n",
                                wc.opcode, (unsigned long long) wc.wr_id);