[ofa-general] Re: Incorrect max_sge reported in mthca device query
On Mon, 2007-04-02 at 09:08 +0300, Michael S. Tsirkin wrote:
> > On Sun, 2007-04-01 at 09:43 +0300, Michael S. Tsirkin wrote:
> [...snip...]
>
> I think that if we extend the API, we need to design it carefully to
> cover as many use cases as possible.
>
> Tom, could you explain what you are trying to do? Why does your
> application need as many SGEs as possible?

Mike:

The application is NFS-RDMA. NFS keeps its data as non-contiguous
arrays of pages. So the motivation is that having a larger SGL allows
you to support larger data transfers with a single operation.

The challenge with the current query/request method is that, as we've
discussed, the advertised max may not work. What makes the adjust/retry
unworkable is that you don't know which of the advertised maxes caused
the request to fail. So when you retry, which qp_attr do you adjust?
The send sge? The recv sge? The qp depth?

So what I'm proposing, and I think it is similar if not identical to
what other folks have talked about, is having an interface that treats
the qp_attr values as requested sizes that can be adjusted by the
provider. So for example, if I ask for a send_sge of 30, but you can
only do 28, you give me 28 and adjust the qp_attr structure so that I
know what I got. This would allow me to perform a predictable sequence
of 1. query, 2. request, 3. adjust in my code.

BTW, I think it needs to be a new provider method to be done
efficiently. Also, what's a good name, ib_request_qp?

Thanks,
Tom

> Also - what about out of resources cases described above? Would you
> expect the verbs API to retry the request for you?

___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
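The request/adjust behaviour proposed in this message can be modelled in a few lines. This is only a sketch in Python, not an existing verbs call: `request_qp` and `PROVIDER_LIMITS` are hypothetical names, the limit values are invented, and only the field names are borrowed from `struct ib_qp_cap`. It also assumes each field can be clamped independently of the others.

```python
from dataclasses import dataclass, fields


@dataclass
class QpCap:
    # Field names borrowed from struct ib_qp_cap; the rest is illustrative.
    max_send_wr: int
    max_recv_wr: int
    max_send_sge: int
    max_recv_sge: int


# Hypothetical per-device limits; the numbers are invented for this example.
PROVIDER_LIMITS = QpCap(max_send_wr=16384, max_recv_wr=16384,
                        max_send_sge=28, max_recv_sge=28)


def request_qp(requested: QpCap, limits: QpCap = PROVIDER_LIMITS) -> QpCap:
    """Treat each field as a requested size and clamp it to the limit.

    Returns the sizes that were actually granted, so the caller can see
    what it got without guessing which field caused a failure.
    """
    return QpCap(**{
        f.name: min(getattr(requested, f.name), getattr(limits, f.name))
        for f in fields(QpCap)
    })


# 1. query (here: PROVIDER_LIMITS), 2. request, 3. adjust to what was granted.
asked = QpCap(max_send_wr=256, max_recv_wr=256,
              max_send_sge=30, max_recv_sge=30)
granted = request_qp(asked)
print(granted)
```

With these made-up limits, asking for a send_sge of 30 yields 28, while the queue depths come back unchanged.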
RE: [ofa-general] Re: Incorrect max_sge reported in mthca device query
> The challenge with the current query/request method is that, as we've
> discussed, the advertised max may not work. What makes the
> adjust/retry unworkable is that you don't know which of the advertised
> maxes caused the request to fail. So when you retry, which qp_attr do
> you adjust? The send sge? The recv sge? The qp depth?
>
> So what I'm proposing, and I think it is similar if not identical to
> what other folks have talked about, is having an interface that treats
> the qp_attr values as requested sizes that can be adjusted by the
> provider. So for example, if I ask for a send_sge of 30, but you can
> only do 28, you give me 28 and adjust the qp_attr structure so that I
> know what I got. This would allow me to perform a predictable sequence
> of 1. query, 2. request, 3. adjust in my code.

If the send sge/recv sge/qp depth/etc. aren't independent though, this
pushes the problem and policy decision down to the provider. I can't
think of an easy solution to this.

- Sean
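Sean's concern can be made concrete with a toy model in which queue depth and SGE count draw on one shared budget. The sketch below assumes a 16-byte cost per SGE slot and a 64 KB budget; both figures, and both function names, are invented for illustration and do not come from this thread. Two equally defensible provider policies hand back different QPs for the same request.

```python
SGE_BYTES = 16             # assumed cost of one SGE slot
BUDGET_BYTES = 64 * 1024   # assumed memory budget shared by depth and SGEs


def fit_prefer_sge(depth, sge):
    """Keep the SGE count and shrink the queue depth until it fits."""
    return min(depth, BUDGET_BYTES // (sge * SGE_BYTES)), sge


def fit_prefer_depth(depth, sge):
    """Keep the queue depth and shrink the SGE count until it fits."""
    return depth, min(sge, BUDGET_BYTES // (depth * SGE_BYTES))


# Same request, two different answers depending on provider policy.
print(fit_prefer_sge(256, 30))
print(fit_prefer_depth(256, 30))
```

Neither result is wrong, which is the policy decision that ends up in the provider once the attributes are coupled.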
RE: [ofa-general] Re: Incorrect max_sge reported in mthca device query
On Thu, 2007-04-05 at 09:27 -0700, Sean Hefty wrote:
> > The challenge with the current query/request method is that, as
> > we've discussed, the advertised max may not work. What makes the
> > adjust/retry unworkable is that you don't know which of the
> > advertised maxes caused the request to fail. So when you retry,
> > which qp_attr do you adjust? The send sge? The recv sge? The qp
> > depth?
> >
> > So what I'm proposing, and I think it is similar if not identical to
> > what other folks have talked about, is having an interface that
> > treats the qp_attr values as requested sizes that can be adjusted by
> > the provider. So for example, if I ask for a send_sge of 30, but you
> > can only do 28, you give me 28 and adjust the qp_attr structure so
> > that I know what I got. This would allow me to perform a predictable
> > sequence of 1. query, 2. request, 3. adjust in my code.
>
> If the send sge/recv sge/qp depth/etc. aren't independent though, this
> pushes the problem and policy decision down to the provider. I can't
> think of an easy solution to this.

Agreed. But practically I think they are independent. I think the SGE
max is driven off the max size of a WR and the type of QP. This is true
of the iWARP adapters as well.

But taking the bait... even if you didn't push it down to the provider,
how do you expose the inter-relationships to the consumer? An approach
in this vein is a could_you_would_you/why_not interface that would
return whether or not the specified qp_attr would work and, if it
didn't, some indication of which resource(s) caused the problem. The
problems there are a) the resource may be gone when you go back with
what you just had approved, and b) you still have to fuss with multiple
whacks at it if you couldn't get what you asked for.

I think something simpler, although arguably not perfect, is the way to
go.

Tom

> - Sean
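For comparison, the could_you_would_you/why_not idea from this message might look like the sketch below. The name `why_not` and the limit table are hypothetical, and independent per-field limits are assumed. The check reports which attributes were refused, but it reserves nothing, so problem a) still applies: a create issued after a clean check can fail.

```python
# Hypothetical limits, invented for this example.
LIMITS = {"max_send_wr": 16384, "max_recv_wr": 16384,
          "max_send_sge": 28, "max_recv_sge": 28}


def why_not(requested, limits=LIMITS):
    """Return {field: largest acceptable value} for every field that is
    too big. An empty dict means the request would be accepted right now.
    """
    return {name: limits[name]
            for name, value in requested.items()
            if value > limits[name]}


req = {"max_send_wr": 256, "max_recv_wr": 256,
       "max_send_sge": 30, "max_recv_sge": 30}

refused = why_not(req)    # first whack: find out what was wrong
print(refused)

req.update(refused)       # adjust the offending fields
print(why_not(req))       # second whack: now acceptable
```

Even in this simple form the caller needs two round trips, which is problem b).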
Re: [ofa-general] Re: Incorrect max_sge reported in mthca device query
[EMAIL PROTECTED] wrote on Thu, 05 Apr 2007 09:45 -0500:
> The challenge with the current query/request method is that as we've
> discussed the advertised max may not work. What makes the
> adjust/retry unworkable is that you don't know which of the advertised
> maxes caused the request to fail. So when you retry, which qp_attr do
> you adjust? The send sge? The recv sge? The qp depth?

As an aside, we discussed this topic in June 2006. See the thread

    http://lists.openfabrics.org/pipermail/general/2006-June/thread.html#23417

for some insightful comments from MST and Tom Talpey. No conclusion was
reached regarding the ideal form of the API.

-- Pete
[ofa-general] Re: Incorrect max_sge reported in mthca device query
Michael:

Thanks for the detailed reply. How about if we added an interface that
would treat the SGE counts/WR counts as requests and then update the
qp_init_attr struct with what was actually created? That would allow
the app to request the max, but settle for what the device was capable
of at the time.

On Sun, 2007-04-01 at 09:43 +0300, Michael S. Tsirkin wrote:
> Quoting Tom Tucker [EMAIL PROTECTED]:
> Subject: Incorrect max_sge reported in mthca device query
>
> > Roland:
> >
> > I think the max_sge reported by mthca_query_device is off by one. If
> > you try to create a QP with the reported max, it fails with -EINVAL.
> > I think the reason is that the mthca_alloc_wqe_buf function reserves
> > a slot for a bind request and this pushes the WQE size over the 496B
> > limit when the user requests the max (30) when allocating the QP.
> >
> > Please let me know if I'm confused about what max_sge really means.
> >
> > Thanks,
> > Tom
>
> Tom, max_sge reported by mthca_query_device is the upper bound for all
> QP types. I have not tested this, but I think you can create a UD type
> QP with this number of SGEs.
>
> I'd like to add that there can be no hard guarantee that creating a QP
> with a specific set of max_sge/max_wr always succeeds even if it is
> within the range of values reported by mthca_query_device: for
> example, for userspace QPs, the system administrator might have
> limited the amount of memory that can be locked up by these QPs, and
> QP allocation requests with large max_sge/max_wr values will always
> fail. There are other examples of this.
>
> Thus, an application that wants to use as large a number of SGEs/WRs
> as possible in a robust fashion currently has no other choice except a
> trial and error approach, handling failures gracefully.
>
> Finally, as a side note, it is *also* inefficient to request
> allocation of more sge entries than the ULP will typically use - for
> reasons such as cache utilization, and many others. How this overhead
> trades off against the ULP's need to sometimes post multiple WRs will
> depend on both the ULP and the hardware used.
>
> This need to tune the ULP to a specific HCA is annoying, and might be
> something that we want to try and solve at the API level. However,
> max_sge/max_wr values in query device are unlikely to be the
> appropriate API for this.
>
> One way out could be to extend the API for create_qp and friends,
> passing in both min and max values for some parameters, and allowing
> the verbs provider to choose the optimal combination of these. I think
> I floated a similar proposal once already, but there didn't appear to
> be sufficient user support for such a large API extension.
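The min/max extension floated at the end of the quoted message could be modelled as below. This is a sketch only: `create_qp_ranged`, the `(min, max)` tuple convention and the limit values are assumptions made for illustration, and the "optimal combination" is reduced to taking the largest value the provider allows for each field independently.

```python
# Hypothetical limits, invented for this example.
LIMITS = {"max_send_sge": 28, "max_recv_sge": 28, "max_send_wr": 16384}


def create_qp_ranged(ranges, limits=LIMITS):
    """ranges maps a field name to a (min, max) pair.

    Returns the value chosen for each field, or raises ValueError when
    the provider cannot satisfy one of the minimums.
    """
    chosen = {}
    for name, (lo, hi) in ranges.items():
        value = min(hi, limits[name])
        if lo > value:
            raise ValueError(
                f"{name}: need at least {lo}, provider offers {limits[name]}")
        chosen[name] = value
    return chosen


# The ULP works with as few as 4 SGEs but would like up to 30.
result = create_qp_ranged({"max_send_sge": (4, 30),
                           "max_send_wr": (64, 256)})
print(result)
```

Compared with a plain clamp, the minimum lets the consumer state the point below which the QP is useless, so the failure case is explicit instead of being discovered by trial and error.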