Sean Hefty wrote:
FYI - It is my intention to implement the host side portion of QoS support. (It's one of my path forward objectives.) I plan on implementing the host side as outlined below. If anyone has any comments, I would like to get them as soon as possible.

Sean,

From what I understand from reading your proposal, it is quite different from what was suggested in the original RFC. I don't think it makes sense to implement the host side of this before there's agreement on the overall solution, namely how the host side design/code plugs into the management scheme at the SM side.

Basically, the SM people have not really reacted to your proposal, which is a problem...

One more thing that bothers me is backward compatibility with an SM/SA that does not support the not-yet-published IBTA QoS extensions. Were you thinking of first probing the SA capabilities to see whether it supports QoS path queries, or do you think that is overdoing it?
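
To make the probe-then-fall-back question concrete, here is a rough sketch only. The `sa_supports_qos` flag and the field names are hypothetical placeholders; they are not the ib_sa interface, and no IBTA capability bit is assumed. The idea is simply that the QoS fields are left out of the PathRecord query when the SA has not advertised support for them.

```python
def build_path_query(sgid, dgid, pkey, sa_supports_qos,
                     service_id=None, qos_class=None):
    """Return the fields to place in a PathRecord query.

    All names here are hypothetical; `sa_supports_qos` stands for the
    result of whatever capability probe ends up being agreed on.
    """
    # Fields every SA understands today.
    query = {"sgid": sgid, "dgid": dgid, "pkey": pkey}

    # Only add the QoS extension fields when the SA is known to handle them,
    # so an older SM/SA never sees a query it cannot parse.
    if sa_supports_qos:
        if service_id is not None:
            query["service_id"] = service_id
        if qos_class is not None:
            query["qos_class"] = qos_class
    return query
```

With an older SA the caller gets a plain query back, and the host behaves as it does today.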

Or.
Sean Hefty wrote:
2. Architecture ----------------

This is a higher level approach to the problem, but I came up with the
following QoS relationship hierarchy, where '->' means 'maps to'.

Application Service -> Service ID (or range)
Service ID -> desired QoS
QoS, SGID, DGID, PKey -> SGID, DGID, TClass, FlowLabel, PKey
SGID, DGID, TC, FL, PKey -> SLID, DLID, SL (set if crossing subnets)
SLID, DLID, SL -> MTU, Rate, VL, PacketLifeTime
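
The hierarchy above can be read as a chain of lookup tables, each level keyed by the output of the one before it. The following is a minimal sketch under that reading; the class and field names are made up for illustration and do not correspond to any existing SA or host data structure.

```python
from typing import NamedTuple


class GlobalRoute(NamedTuple):
    sgid: str
    dgid: str
    tclass: int
    flow_label: int
    pkey: int


class LocalRoute(NamedTuple):
    slid: int
    dlid: int
    sl: int


class PathAttrs(NamedTuple):
    mtu: int
    rate: int
    vl: int
    packet_life_time: int


class QosTables:
    """One dict per '->' relationship in the hierarchy (hypothetical layout)."""

    def __init__(self):
        self.service_to_qos = {}    # service_id -> desired QoS
        self.qos_to_global = {}     # (qos, sgid, dgid, pkey) -> GlobalRoute
        self.global_to_local = {}   # GlobalRoute -> LocalRoute
        self.local_to_attrs = {}    # LocalRoute -> PathAttrs

    def resolve(self, service_id, sgid, dgid, pkey):
        """Walk the tables top-down; raises KeyError if any level is missing."""
        qos = self.service_to_qos[service_id]
        global_route = self.qos_to_global[(qos, sgid, dgid, pkey)]
        local_route = self.global_to_local[global_route]
        attrs = self.local_to_attrs[local_route]
        return global_route, local_route, attrs
```

Because each level is a plain table, the tables could be built in one place and copied elsewhere, which is what a distributed lookup would need.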

I use these relationships below:

4. IPoIB ---------

IPoIB already queries the SA for its broadcast group information. The additional functionality required is for IPoIB to provide the
broadcast group SL, MTU, and RATE in every subsequent PathRecord query
performed when a new UDAV is needed by IPoIB. We could assign a
special Service-ID for IPoIB use but since all communication on the
same IPoIB interface shares the same QoS-Level without the ability to
differentiate it by target service we can ignore it for simplicity.

Rather than IPoIB specifying SL, MTU, and rate with PR queries, it should specify TClass and FlowLabel. This is necessary for IPoIB to span IB subnets.

5. CMA features ----------------

The CMA interface supports Service-ID through the notion of port
space as a prefix to the port_num which is part of the sockaddr
provided to rdma_resolve_addr(). What is missing is the explicit
request for a QoS-Class that should allow the ULP (like SDP) to
propagate a specific request for a class of service. A mechanism for
providing the QoS-Class is available in the IPv6 address, so we could
use that address field. Another option is to implement a special connection options API for CMA.

What the CMA is missing is the use of the provided QoS-Class
and Service-ID in the PR/MPR it sends. When a response is obtained it is
an existing requirement for the CMA to use the PR/MPR from the
response in setting up the QP address vector.

I think the RDMA CM needs two solutions, depending on which address family is used. For IPv6, the existing interface is sufficient, and works for both IB and iWarp. The RDMA CM only needs to include the TC and FL as part of its PR query. For IPv4, to remain transport neutral, I think we should add an rdma_set_option() routine to specify the QoS field. The RDMA CM would include the QoS field for PR query under this condition.
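
As a sketch of the two cases: for IPv6 the traffic class and flow label can be pulled out of the address's flow information (the low 28 bits hold an 8-bit traffic class followed by a 20-bit flow label), while for IPv4 the value would come from whatever was set through the proposed rdma_set_option(). The function and parameter names below are illustrative assumptions, and `flowinfo` is assumed to be in host byte order already.

```python
import socket


def split_flowinfo(flowinfo):
    """Split a host-byte-order IPv6 flowinfo value into (tclass, flow_label)."""
    tclass = (flowinfo >> 20) & 0xFF      # 8-bit traffic class
    flow_label = flowinfo & 0xFFFFF       # 20-bit flow label
    return tclass, flow_label


def qos_for_query(family, flowinfo=0, qos_option=None):
    """Pick the QoS-related fields to add to a PR query (hypothetical names).

    `qos_option` stands for a value the ULP supplied through the proposed
    rdma_set_option() routine; None means the ULP did not set one.
    """
    if family == socket.AF_INET6:
        tclass, flow_label = split_flowinfo(flowinfo)
        return {"tclass": tclass, "flow_label": flow_label}
    if family == socket.AF_INET:
        return {} if qos_option is None else {"qos": qos_option}
    raise ValueError("unsupported address family")
```

The IPv4 branch returns nothing extra when no option was set, so existing callers would see no change in their queries.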

For IB, this requires changes to the ib_sa to support the new PR extensions. I don't think we gain anything having the RDMA CM include service IDs as part of the query.

6. SDP -------

SDP uses CMA for building its connections. The Service-ID for SDP is
0x000000000001PPPP, where PPPP are 4 hex digits holding the remote
TCP/IP Port Number to connect to. SDP might be provided with
SO_PRIORITY socket option. In that case the value provided should be
sent to the CMA as the TClass option of that connection.

SDP would specify the QoS through the IPv6 address or the rdma_set_option() routine.
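
For reference, the SDP Service-ID layout quoted above (0x000000000001PPPP) amounts to OR-ing the 16-bit port into a fixed prefix. A small sketch, with helper names that are purely illustrative:

```python
# Fixed prefix from the 0x000000000001PPPP layout; PPPP is the TCP/IP port.
SDP_SERVICE_ID_BASE = 0x0000000000010000


def sdp_service_id(port):
    """Build the 64-bit SDP Service-ID for a remote TCP/IP port."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port must fit in 16 bits")
    return SDP_SERVICE_ID_BASE | port


def format_service_id(service_id):
    """Render a Service-ID as 16 hex digits, matching the notation above."""
    return f"0x{service_id:016x}"
```

For example, port 0x1234 gives the Service-ID 0x0000000000011234.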

7. SRP -------

The current SRP implementation uses its own CM callbacks (not CMA). So
SRP should fill in the Service-ID in the PR/MPR by itself and use
that information in setting up the QP. The T10 SRP standard defines
the SRP Service-ID to be defined by the SRP target I/O Controller
(but they should also comply with IBTA Service-ID rules). Anyway,
the Service-ID is reported by the I/O Controller in the ServiceEntries DMA attribute and should be used in the PR/MPR if the
SA reports its ability to handle QoS PR/MPRs.

I agree.

8. iSER --------

iSER uses CMA and thus should be very close to SDP.
The Service-ID for iSER should be TBD.

See RDMA CM and SDP.

3.2. PR/MPR query handling: OpenSM should be able to enforce the
provided policy on client request. The overall flow for such requests
is: first the request is matched against the defined match rules such
that the target QoS-Level definition is found. Given the QoS-Level a
path(s) search is performed with the given restrictions imposed by
that level. The following two sections describe these steps.

If we use the QoS hierarchy outlined above, I think we can construct some fairly simple tables to guide our PR selection. The SA may need to construct the tables starting at the bottom and working up, but I *think* it could be done. And by distributing the tables, we can support a more distributed (a la local SA) operation.

From an administration point of view, I would be happier seeing something where the administrator defines a QoS level in terms of latency or bandwidth requirements and relative priority. Then, if desired, the administrator could provide more details, such as indicating which nodes would use which services, minimum required MTUs, etc. It would then be up to the SA to map these requirements to specific TC, FL, SL, VL values.

In general, though, I'm personally far less concerned with the QoS specification interface to the SA, versus the operation that takes place on the hosts.

Comments on using this approach on the host side?
