Or Gerlitz wrote:
Steve Wise wrote:
Support for the IB BMME and iWARP equivalent memory extensions to non-shared memory regions. Usage Model:
- MR allocated with ib_alloc_mr()
- Page lists allocated via ib_alloc_fast_reg_page_list().
- MR made VALID and bound to a specific page list via ib_post_send(IB_WR_FAST_REG_MR)
- MR made INVALID via ib_post_send(IB_WR_INVALIDATE_MR)
Steve,

I am trying to further understand what a real-life ULP design would look like here, and I think there are some more issues to clarify/define for the case of a ULP that has to create a mapping for a list of pages and send this mapping (e.g. IB rkey / iWARP stag) to a remote party that uses it for RDMA.

AFAIK, the idea was to let the ULP post --two-- work requests, where the first creates the mapping and the second sends this mapping to the remote side, such that the second does not start before the first completes (i.e., a fence).
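
To make the ordering concrete, here is a toy userspace model of the two work requests described above. It is only a sketch of the fencing rule as stated in this mail, not kernel code: `toy_wr`, `toy_op` and `toy_can_start()` are hypothetical names and are not part of the verbs API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a send queue; these names are illustrative, not verbs API. */
enum toy_op { TOY_OP_FAST_REG, TOY_OP_SEND };

struct toy_wr {
    enum toy_op op;
    bool fence;      /* wait for all earlier WRs before starting */
    bool completed;  /* set once the WR has finished */
};

/* A WR may start if it is not fenced, or if every earlier WR has completed. */
static bool toy_can_start(const struct toy_wr *q, size_t idx)
{
    assert(q != NULL);
    if (!q[idx].fence)
        return true;
    for (size_t i = 0; i < idx; i++)
        if (!q[i].completed)
            return false;
    return true;
}
```

With a queue holding a registration WR followed by a fenced send WR, `toy_can_start()` refuses the send until the registration entry is marked completed.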

Now, the above scheme means that the ULP knows the value of the rkey/stag at the time of posting these two work requests (since it has to encode it in the second one), so something has to be clarified regarding the rkey/stag here: does it change each time this MR is used? How many bits can be changed? Etc.

The ULP knows the rkey/stag because it's returned up front by ib_alloc_fast_reg_mr(). And it doesn't change (ignoring the key issue which we haven't exposed yet to the ULP). The same rkey/stag can be used for multiple mappings. It can be made invalid at any point in time via IB_WR_INVALIDATE_MR, so the fact that you're leaving the same rkey/stag advertised is not a risk.

So you allocate the rkey/stag up front, allocate page_lists up front, then as needed you populate your page list and bind it to the rkey/stag via IB_WR_FAST_REG_MR, and invalidate that mapping via IB_WR_INVALIDATE_MR. You can do this any number of times, and with proper fencing, you can pipeline these mappings. Eventually when you're done doing IO (like for NFSRDMA when the mount is unmounted) you free up the page list(s) and mr/rkey/stag.
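
The lifecycle described above can be pictured as a small state machine. What follows is a userspace toy sketch, not kernel code: `model_mr`, `model_alloc()`, `model_fast_reg()`, `model_invalidate()` and `model_remote_access_ok()` are hypothetical names chosen for illustration and are not part of the verbs API. The model assumes a registration is only accepted while the MR is INVALID, and it only shows that the rkey stays fixed across repeated bind/invalidate cycles while the validity of the mapping changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy userspace model; none of these names exist in the kernel verbs API. */
enum model_mr_state { MODEL_MR_INVALID, MODEL_MR_VALID };

struct model_mr {
    uint32_t rkey;               /* fixed at allocation time */
    enum model_mr_state state;   /* toggled by fast-reg / invalidate */
    const uint64_t *pages;       /* currently bound page list, or NULL */
    size_t npages;
};

/* Models allocation: the rkey is known up front, the MR starts INVALID. */
static void model_alloc(struct model_mr *mr, uint32_t rkey)
{
    assert(mr != NULL);
    mr->rkey = rkey;
    mr->state = MODEL_MR_INVALID;
    mr->pages = NULL;
    mr->npages = 0;
}

/* Models IB_WR_FAST_REG_MR: bind a page list; rejected unless INVALID. */
static int model_fast_reg(struct model_mr *mr, const uint64_t *pages,
                          size_t npages)
{
    if (mr->state != MODEL_MR_INVALID || pages == NULL || npages == 0)
        return -1;
    mr->pages = pages;
    mr->npages = npages;
    mr->state = MODEL_MR_VALID;
    return 0;
}

/* Models IB_WR_INVALIDATE_MR: drop the mapping; the rkey is kept. */
static int model_invalidate(struct model_mr *mr)
{
    if (mr->state != MODEL_MR_VALID)
        return -1;
    mr->pages = NULL;
    mr->npages = 0;
    mr->state = MODEL_MR_INVALID;
    return 0;
}

/* Models a remote access check: only a VALID mapping may be used. */
static int model_remote_access_ok(const struct model_mr *mr, uint32_t rkey)
{
    return mr->state == MODEL_MR_VALID && mr->rkey == rkey;
}
```

In this model a ULP would call `model_alloc()` once, then repeat `model_fast_reg()` / `model_invalidate()` per IO with different page lists, and the advertised rkey is only usable between those two calls.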

So NFSRDMA will keep these fast_reg_mrs and page_list structs pre-allocated and hung off some context so that per RPC, they can be bound/registered, the IO executed, and then the MR invalidated as part of processing the RPC.


I guess my questions are to some extent RTFM ones, but first, a quick look through the IB spec did not give me enough answers (pointers appreciated...), and second, you are proposing an implementation here, so I think it makes sense to review the actual usage model to see that all aspects needed for ULPs are covered...

Speaking of usage, do you plan to patch the mainline nfs-rdma code to use these verbs?

Yes. Tom Tucker will be doing this. Jon Mason is implementing RDS changes to utilize this too. The hope is all this makes 2.6.27/ofed-1.4.

I can also post test code (krping module) if anyone is interested. I'm developing that now.

Steve.


