Or Gerlitz wrote:
Steve Wise wrote:
Support for the IB BMME and iWARP equivalent memory extensions to non
shared memory regions. Usage Model:
- MR allocated with ib_alloc_mr()
- Page lists allocated via ib_alloc_fast_reg_page_list().
- MR made VALID and bound to a specific page list via
ib_post_send(IB_WR_FAST_REG_MR)
- MR made INVALID via ib_post_send(IB_WR_INVALIDATE_MR)
Steve,
I am trying to further understand what would be a real life ULP design
here, and I think there are some more issues to clarify/define for the
case of ULP which has to create a mapping for a list of pages and send
this mapping (e.g. an IB rkey or an iWARP stag) to a remote party that uses it
for RDMA.
AFAIK, the idea was to let the ULP post --two-- work requests, where
the first creates the mapping and the second sends this mapping to the
remote side, such that the second does not start before the first
completes (i.e. a fence).
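To make the ordering requirement concrete, here is a toy userspace sketch in C of a send queue in which a fenced work request may not start until everything posted before it has completed. It is only a model of the ordering described above, not the kernel verbs API: struct toy_wr and toy_may_start() are hypothetical names, and the fence rules the IB spec actually defines are narrower than this simplification.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; these names are hypothetical, not kernel verbs. */
struct toy_wr {
	int fence;	/* 1: wait for all earlier WRs to complete */
	int started;	/* 1: hardware has begun processing this WR */
	int completed;	/* 1: this WR has finished */
};

/*
 * Return 1 if q[idx] may start now, 0 otherwise.
 * Assumption of this sketch: WRs start in posting order, and a fenced
 * WR additionally waits for every earlier WR to complete.
 */
int toy_may_start(const struct toy_wr *q, size_t idx)
{
	size_t i;

	/* In-order start: the previous WR must at least have started. */
	if (idx > 0 && !q[idx - 1].started)
		return 0;

	/* An unfenced WR may overlap with earlier, still-running WRs. */
	if (!q[idx].fence)
		return 1;

	/* A fenced WR waits until every earlier WR has completed. */
	for (i = 0; i < idx; i++)
		if (!q[i].completed)
			return 0;
	return 1;
}
```

With q[0] standing for the fast-register request and q[1] for the fenced send that carries the rkey/stag, toy_may_start(q, 1) stays 0 until q[0].completed is set.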
Now, the above scheme means that the ULP knows the value of the
rkey/stag at the time of posting these two work requests (since it has
to encode it in the second one), so something has to be clarified about
the rkey/stag here: does it change each time this MR is used? How many
bits can be changed, etc.?
The ULP knows the rkey/stag because it's returned up front by
ib_alloc_fast_reg_mr(). And it doesn't change (ignoring the key issue
which we haven't exposed yet to the ULP). The same rkey/stag can be
used for multiple mappings. It can be made invalid at any point in time
via the IB_WR_INVALIDATE_MR so the fact that you're leaving the same
rkey/stag advertised is not a risk.
So you allocate the rkey/stag up front, allocate page_lists up front,
then as needed you populate your page list and bind it to the rkey/stag
via IB_WR_FAST_REG_MR, and invalidate that mapping via
IB_WR_INVALIDATE_MR. You can do this any number of times, and with
proper fencing, you can pipeline these mappings. Eventually when
you're done doing IO (like for NFSRDMA when the mount is unmounted) you
free up the page list(s) and mr/rkey/stag.
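As a rough illustration of that lifecycle, here is a toy userspace model in C. It is a sketch under stated assumptions, not the kernel verbs API: every toy_* name is hypothetical, the page "addresses" are plain integers, and the rule that an MR must be invalidated before it can be rebound is an assumption of the model. It shows the points made above: the rkey is fixed at allocation, the MR only grants access while a page list is bound, and the bind/invalidate cycle can repeat with the same rkey.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy userspace model of the lifecycle; NOT the kernel verbs API.
 * Every toy_* name here is hypothetical. */
#define TOY_MAX_PAGES 8

enum toy_mr_state { TOY_MR_INVALID, TOY_MR_VALID };

struct toy_page_list {
	uint64_t pages[TOY_MAX_PAGES];	/* stand-ins for DMA addresses */
	size_t npages;
};

struct toy_mr {
	uint32_t rkey;			/* fixed when the MR is allocated */
	enum toy_mr_state state;
	const struct toy_page_list *pl;	/* bound page list, NULL if invalid */
};

/* Allocate: the rkey is known up front, but the MR starts out INVALID. */
void toy_alloc_mr(struct toy_mr *mr, uint32_t rkey)
{
	mr->rkey = rkey;
	mr->state = TOY_MR_INVALID;
	mr->pl = NULL;
}

/* Models IB_WR_FAST_REG_MR: bind a page list and make the MR VALID. */
int toy_fast_reg(struct toy_mr *mr, const struct toy_page_list *pl)
{
	assert(pl->npages <= TOY_MAX_PAGES);
	if (mr->state == TOY_MR_VALID)
		return -1;	/* model assumption: invalidate before rebinding */
	mr->pl = pl;
	mr->state = TOY_MR_VALID;
	return 0;
}

/* Models IB_WR_INVALIDATE_MR: drop the binding; the rkey is unchanged. */
int toy_invalidate(struct toy_mr *mr)
{
	if (mr->state == TOY_MR_INVALID)
		return -1;
	mr->pl = NULL;
	mr->state = TOY_MR_INVALID;
	return 0;
}

/* A remote access succeeds only with the right rkey on a VALID MR. */
int toy_remote_access(const struct toy_mr *mr, uint32_t rkey,
		      size_t idx, uint64_t *addr)
{
	if (mr->state != TOY_MR_VALID || rkey != mr->rkey ||
	    idx >= mr->pl->npages)
		return -1;
	*addr = mr->pl->pages[idx];
	return 0;
}
```

A per-RPC user would call toy_alloc_mr() once, then loop over toy_fast_reg(), the IO, and toy_invalidate(); toy_remote_access() fails whenever the MR is INVALID, which is why leaving the same rkey advertised is harmless in this model.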
So NFSRDMA will keep these fast_reg_mrs and page_list structs
pre-allocated and hung off some context so that per RPC, they can be
bound/registered, the IO executed, and then the MR invalidated as part
of processing the RPC.
I guess my questions are to some extent RTFM ones, but, first, after a
quick look through the IB spec I did not manage to find enough
answers (pointers appreciated...) and second, you are proposing an
implementation here, so I think it makes sense to review the actual
usage model to see that all aspects needed for ULPs are covered...
Speaking of usage, do you plan to patch the mainline nfs-rdma code to
use these verbs?
Yes. Tom Tucker will be doing this. Jon Mason is implementing RDS
changes to utilize this too. The hope is all this makes 2.6.27/ofed-1.4.
I can also post test code (krping module) if anyone is interested. I'm
developing that now.
Steve.