> > - struct ib_recv_wr *bad_wr;
> > + struct ib_recv_wr *bad_wr, rx_wr;
> > + struct ib_sge rx_sge[IPOIB_CM_RX_SG];
>
> I worry about this putting an extra 300 bytes or so on the stack... I
> think it would be nicer if there were a way to just make sure receive
> posting was single-threaded, but I still don't know which contexts are
> racing against each other...
The race basically occurs between ipoib_poll() and
ipoib_cm_nonsrq_init_rx()
as Shirley mentioned. Below is sort of seq diagram for a possible race:
ipoib_poll()
ipoib_cm_req_handler()
ipoib_cm_handle_rx_wc() ipoib_cm_nonsrq_init_rx()
ipoib_cm_post_receive_nonsrq()
ipoib_cm_post_receive_nonsrq()
priv->cm.rx_wr.wr_id = id <<<<
>>>> priv->cm.rx_wr.wr_id = id
ib_post_recv()
ib_post_recv()
Makes sense?
Yes, we should avoid allocating those vars on the stack.
Either we use same pattern as done for SRQ, ie rx_wr_arr[conn]
or something like this:
- enhance ipoib_cm_post_receive_nonsrq() with wr and sge passed to by
the caller, ie ipoib_poll() -> ipoib_cm_handle_rx_wc() resp.
ipoib_cm_nonsrq_init_rx()
- ipoib_cm_handle_rx_wc() passes rx_wr and rx_sge as already defined
in struct ipoib_cm_dev_priv to ipoib_cm_post_receive_nonsrq()
since we use one cq, there should not be a race in this path
- ipoib_cm_nonsrq_init_rx() allocates wr and sge on the heap or slabcache,
passes them to ipoib_cm_post_receive_nonsrq() and frees them thereafter
If you want me to make a patch, glad to do.
Nam
_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general