[ofa-general] Re: Incorrect max_sge reported in mthca device query

2007-04-05 Thread Tom Tucker
On Mon, 2007-04-02 at 09:08 +0300, Michael S. Tsirkin wrote:
  On Sun, 2007-04-01 at 09:43 +0300, Michael S. Tsirkin wrote:
[...snip...]
 I think that if we extend the API, we need to design it carefully
 to cover as many use cases as possible.
 Tom, could you explain what you are trying to do?
 Why does your application need as many SGEs as possible?
 
Mike:

The application is NFS-RDMA. NFS keeps its data as non-contiguous
arrays of pages. So the motivation is that having a larger SGL allows
you to support larger data transfers with a single operation. 

The challenge with the current query/request method is that, as we've
discussed, the advertised max may not work. What makes the adjust/retry
unworkable is that you don't know which of the advertised maxes caused
the request to fail. So when you retry, which qp_attr do you adjust? The
send sge? The recv sge? The qp depth?

So what I'm proposing, which I think is similar if not identical to what
other folks have talked about, is an interface that treats the
qp_attr values as requested sizes that can be adjusted by the provider.
So for example, if I ask for a send_sge of 30, but you can only do 28,
you give me 28 and adjust the qp_attr structure so that I know what I
got. This would allow me to perform a predictable sequence of 1. query,
2. request, 3. adjust in my code.
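
The request/adjust idea above could look roughly like the following. This is only a sketch under stated assumptions: `struct qp_caps`, `clamp_field` and `request_qp` are hypothetical names invented for illustration, not part of the verbs API, and the provider is simulated by a fixed table of limits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability set (illustration only, not a verbs structure). */
struct qp_caps {
    unsigned max_send_wr;
    unsigned max_recv_wr;
    unsigned max_send_sge;
    unsigned max_recv_sge;
};

/* Lower *want to limit if needed; return 1 if it was reduced, else 0. */
static int clamp_field(unsigned *want, unsigned limit)
{
    if (*want <= limit)
        return 0;
    *want = limit;
    return 1;
}

/*
 * Hypothetical "request" call: treat *req as requested sizes, clamp each
 * one to what the simulated provider can do, and write the granted values
 * back into *req. Returns the number of fields that were reduced.
 */
static int request_qp(struct qp_caps *req, const struct qp_caps *limit)
{
    assert(req != NULL && limit != NULL);

    int reduced = 0;
    reduced += clamp_field(&req->max_send_wr, limit->max_send_wr);
    reduced += clamp_field(&req->max_recv_wr, limit->max_recv_wr);
    reduced += clamp_field(&req->max_send_sge, limit->max_send_sge);
    reduced += clamp_field(&req->max_recv_sge, limit->max_recv_sge);
    return reduced;
}
```

With a send SGE limit of 28 and a request of 30, the caller gets 28 back in the structure and can size its transfers accordingly, with no retry loop.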

BTW, I think it needs to be a new provider method to be done efficiently.
Also, what's a good name, ib_request_qp? 

Thanks,
Tom

 Also - what about out of resources cases described above?
 Would you expect the verbs API to retry the request for you?
 



___
general mailing list
general@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general


RE: [ofa-general] Re: Incorrect max_sge reported in mthca device query

2007-04-05 Thread Sean Hefty
The challenge with the current query/request method is that as we've
discussed the advertised max may not work. What makes the adjust/retry
unworkable is that you don't know which of the advertised maxes caused
the request to fail. So when you retry, which qp_attr do you adjust? The
send sge? The recv sge? The qp depth?

So what I'm proposing, and I think is similar if not identical to what
other folks have talked about is having an interface that treats the
qp_attr values as requested-sizes that can be adjusted by the provider.
So for example, if I ask for a send_sge of 30, but you can only do 28,
you give me 28 and adjust the qp_attr structure so that I know what I
got. This would allow me to perform a predictable sequence of 1. query,
2. request, 3. adjust in my code.

If the send sge/recv sge/qp depth/etc. aren't independent though, this pushes
the problem and policy decision down to the provider.  I can't think of an easy
solution to this.

- Sean


RE: [ofa-general] Re: Incorrect max_sge reported in mthca device query

2007-04-05 Thread Tom Tucker
On Thu, 2007-04-05 at 09:27 -0700, Sean Hefty wrote:
 The challenge with the current query/request method is that as we've
 discussed the advertised max may not work. What makes the adjust/retry
 unworkable is that you don't know which of the advertised maxes caused
 the request to fail. So when you retry, which qp_attr do you adjust? The
 send sge? The recv sge? The qp depth?
 
 So what I'm proposing, and I think is similar if not identical to what
 other folks have talked about is having an interface that treats the
 qp_attr values as requested-sizes that can be adjusted by the provider.
 So for example, if I ask for a send_sge of 30, but you can only do 28,
 you give me 28 and adjust the qp_attr structure so that I know what I
 got. This would allow me to perform a predictable sequence of 1. query,
 2. request, 3. adjust in my code.
 
 If the send sge/recv sge/qp depth/etc. aren't independent though, this pushes
 the problem and policy decision down to the provider.  I can't think of an
 easy solution to this.

Agreed. But practically I think they are. I think the SGE max is driven
off the max size of a WR and type of QP. This is true of the iWARP
adapters as well.  

But taking the bait...even if you didn't push it down to the provider,
how do you expose the inter-relationships to the consumer? An approach
in this vein is a could_you_would_you/why_not interface that would
return whether or not the specified qp_attr would work and if it didn't
some indication of which resource(s) caused the problem. The problems
there are a) the resource may be gone when you go back with what you
just had approved, and b) you still have to fuss with multiple whacks
at it if you couldn't get what you asked for.
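
For comparison, a could_you_would_you/why_not style check might be sketched as below. All names here (`struct qp_caps`, `enum why_not`, `check_qp_caps`) are hypothetical, and the provider is again simulated by a table of limits that treats the attributes as independent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability set (illustration only). */
struct qp_caps {
    unsigned max_send_wr;
    unsigned max_recv_wr;
    unsigned max_send_sge;
    unsigned max_recv_sge;
};

/* Hypothetical "why not" flags, one bit per attribute that was refused. */
enum why_not {
    WHY_SEND_WR  = 1 << 0,
    WHY_RECV_WR  = 1 << 1,
    WHY_SEND_SGE = 1 << 2,
    WHY_RECV_SGE = 1 << 3
};

/*
 * Return 0 if the requested caps would currently fit, otherwise a bitmask
 * naming the attributes that exceed the simulated provider limits.
 * Nothing is reserved, so the answer is advisory only.
 */
static unsigned check_qp_caps(const struct qp_caps *want,
                              const struct qp_caps *limit)
{
    assert(want != NULL && limit != NULL);

    unsigned why = 0;
    if (want->max_send_wr > limit->max_send_wr)
        why |= WHY_SEND_WR;
    if (want->max_recv_wr > limit->max_recv_wr)
        why |= WHY_RECV_WR;
    if (want->max_send_sge > limit->max_send_sge)
        why |= WHY_SEND_SGE;
    if (want->max_recv_sge > limit->max_recv_sge)
        why |= WHY_RECV_SGE;
    return why;
}
```

Because the check reserves nothing, both problems named above remain: the resource can be gone by the time the real create call is made, and the caller still needs a second call after adjusting.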

I think something simpler, although arguably not perfect, is the way to
go.

Tom

 
 - Sean



Re: [ofa-general] Re: Incorrect max_sge reported in mthca device query

2007-04-05 Thread Pete Wyckoff
[EMAIL PROTECTED] wrote on Thu, 05 Apr 2007 09:45 -0500:
 The challenge with the current query/request method is that as we've
 discussed the advertised max may not work. What makes the adjust/retry
 unworkable is that you don't know which of the advertised maxes caused
 the request to fail. So when you retry, which qp_attr do you adjust? The
 send sge? The recv sge? The qp depth?

As an aside, we discussed this topic in June 2006.  See the thread

http://lists.openfabrics.org/pipermail/general/2006-June/thread.html#23417

for some insightful comments from MST and Tom Talpey.  No conclusion
was reached regarding the ideal form of the API.

-- Pete


[ofa-general] Re: Incorrect max_sge reported in mthca device query

2007-04-01 Thread Tom Tucker
Michael:

Thanks for the detailed reply.

How about if we added an interface that would treat the SGE counts/WR
counts as requests and then update the qp_init_attr struct with what
was actually created? That would allow the app to request the max, but
settle for what the device was capable of at the time. 

On Sun, 2007-04-01 at 09:43 +0300, Michael S. Tsirkin wrote:
  Quoting Tom Tucker [EMAIL PROTECTED]:
  Subject: Incorrect max_sge reported in mthca device query
  
  
  Roland:
  
  I think the max_sge reported by mthca_query_device is off by one. If you
  try to create a QP with the reported max, it fails with -EINVAL. I think
  the reason is that the mthca_alloc_wqe_buf function reserves a slot for
  a bind request and this pushes the WQE size over the 496B limit when
  the user requests the max (30) when allocating the QP.
  
  Please let me know if I'm confused about what max_sge really means.
  
  Thanks,
  Tom
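
The off-by-one described in the report above can be illustrated with a little arithmetic. Only the 496B limit and the advertised maximum of 30 come from the report; the 16-byte segment sizes and the size of the reserved space are assumptions made for this sketch, and the actual computation in mthca_alloc_wqe_buf may differ.

```c
#include <assert.h>

enum {
    WQE_LIMIT_BYTES = 496, /* limit quoted in the report */
    NEXT_SEG_BYTES  = 16,  /* assumption: per-WQE header segment */
    DATA_SEG_BYTES  = 16,  /* assumption: one scatter/gather entry */
    EXTRA_SEG_BYTES = 16   /* assumption: space reserved beyond the SGEs */
};

/* Bytes needed for a WQE with num_sge entries plus extra reserved bytes. */
static int wqe_bytes(int num_sge, int extra_bytes)
{
    assert(num_sge >= 0 && extra_bytes >= 0);
    return NEXT_SEG_BYTES + num_sge * DATA_SEG_BYTES + extra_bytes;
}

/* Largest SGE count whose WQE still fits under the limit. */
static int max_sge_that_fits(int extra_bytes)
{
    int n = 0;
    while (wqe_bytes(n + 1, extra_bytes) <= WQE_LIMIT_BYTES)
        n++;
    return n;
}
```

Under these assumptions, 30 entries fit exactly when nothing extra is reserved (16 + 30 * 16 = 496), while reserving one more 16-byte segment leaves room for only 29, which would match the failure seen when requesting the advertised maximum.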
 
 Tom,
   max_sge reported by mthca_query_device is the upper bound
   for all QP types. I have not tested this, but think you can
   create a UD type QP with this number of SGEs.
 
   I'd like to add that there can be no hard guarantee that
   creating a QP with a specific set of max_sge/max_wr always
   succeeds even if it is within the range of values reported
   by mthca_query_device: for example, for userspace QPs, the
   system administrator might have limited the amount of
   memory that can be locked up by these QPs, and
   QP allocation requests with large max_sge/max_wr
   values will always fail. There are other examples of this.
   Thus, an application that wants to use as large a number of SGEs/WRs as
   possible in a robust fashion currently has no other choice except
   a trial and error approach, handling failures gracefully.
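
A minimal sketch of such a trial and error loop, for a single parameter, is shown below. The callback type `create_fn`, the simulated provider `fake_create` and the helper `create_with_backoff` are hypothetical and exist only for illustration; a real consumer would call its QP creation routine instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical creation callback: 0 on success, negative errno on failure. */
typedef int (*create_fn)(unsigned max_sge, void *ctx);

/* Simulated provider: succeeds only if max_sge is within *(unsigned *)ctx. */
static int fake_create(unsigned max_sge, void *ctx)
{
    const unsigned *cap = ctx;
    return max_sge <= *cap ? 0 : -EINVAL;
}

/*
 * Try max_sge values from start down to lowest. Returns the first value
 * that succeeds, or the last negative errno if none does.
 */
static int create_with_backoff(create_fn create, void *ctx,
                               unsigned start, unsigned lowest)
{
    assert(create != NULL && lowest >= 1 && start >= lowest);

    int err = -EINVAL;
    for (unsigned sge = start; sge >= lowest; sge--) {
        err = create(sge, ctx);
        if (err == 0)
            return (int)sge;
        if (err != -EINVAL && err != -ENOMEM)
            break; /* not a sizing problem, so stop retrying */
    }
    return err;
}
```

The loop is simple only because one value is varied. Once send SGEs, receive SGEs and queue depths can each be the cause of a failure, the caller no longer knows which one to reduce.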
 
   Finally, as a side note, it is *also* inefficient to request
   allocation of more sge entries than ULP will typically
   use - for reasons such as cache utilization, and many others.
   How this overhead trades off against the need for the ULP to sometimes
   post multiple WRs will depend on both the ULP and the hardware
   used. This need to tune the ULP to a specific HCA is annoying,
   and might be something that we want to try and solve at
   the API level. However, max_sge/max_wr values in query device
   are unlikely to be the appropriate API for this.
 
   One way out could be to extend the API for create_qp and friends,
   passing in both min and max values for some parameters,
   and allowing the verbs provider to choose the optimal combination
   of these. I think I floated a similar proposal once already, but there
   didn't appear to be sufficient user support for such a large API
   extension.
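
The min/max idea could be sketched for a single parameter as follows. `struct range` and `choose_in_range` are hypothetical names, and the sketch assumes the provider simply grants the largest value it can within the range; choosing an optimal combination across several interdependent parameters is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request: the caller accepts any value in [min, max]. */
struct range {
    unsigned min;
    unsigned max;
};

/*
 * Pick the largest value in [r->min, r->max] that does not exceed the
 * simulated provider limit. Stores it in *out and returns 0, or returns
 * -1 (leaving *out untouched) if even r->min cannot be satisfied.
 */
static int choose_in_range(const struct range *r, unsigned limit,
                           unsigned *out)
{
    assert(r != NULL && out != NULL && r->min <= r->max);

    if (limit < r->min)
        return -1;
    *out = r->max < limit ? r->max : limit;
    return 0;
}
```

Compared with a plain requested size, the minimum lets the call fail cleanly when the caller could not make use of what the provider has left.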
 
