Hi Hal, This is the mail that I was talking about (QoS info for OpenSM man page). Sasha has reviewed it, and posted his answer to the mailing list.
-- Yevgeny

-------- Original Message --------
Subject: [ofa-general] [PATCH] opensm/man: Adding QoS-related info to opensm man pages
Date: Wed, 26 Mar 2008 02:47:08 +0200
From: Yevgeny Kliteynik <[EMAIL PROTECTED]>
To: Sasha Khapyorsky <[EMAIL PROTECTED]>
CC: OpenIB <[email protected]>

Hi Sasha,

I've added QoS related info to opensm man pages: enhanced existing part
(that was talking about VL arbitration) and added description of QoS
manager in accordance with QoS annex.

Please apply to ofed_1_3 and master.

Signed-off-by: Yevgeny Kliteynik <[EMAIL PROTECTED]>
---
 opensm/man/opensm.8.in |  501 +++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 457 insertions(+), 44 deletions(-)

diff --git a/opensm/man/opensm.8.in b/opensm/man/opensm.8.in
index 5322ab7..1d9c5b7 100644
--- a/opensm/man/opensm.8.in
+++ b/opensm/man/opensm.8.in
@@ -35,7 +35,8 @@ to initialize the InfiniBand hardware (at least one per each
 InfiniBand subnet).
 
 opensm also now contains an experimental version of a performance
-manager as well.
+manager and an experimental version of a QoS manager (in accordance with
+IBA QoS Annex).
 
 opensm defaults were designed to meet the common case usage on clusters with up to a few hundred nodes. Thus, in this default mode, opensm will scan the IB fabric,
 initialize it, and sweep occasionally for changes.
@@ -433,51 +434,463 @@ partition manager:
 
 Default=0x7fff,ipoib:ALL=full;
 
-.SH QOS CONFIGURATION
+.SH QUALITY OF SERVICE
 .PP
-There are a set of QoS related low-level configuration parameters.
-All these parameter names are prefixed by "qos_" string. Here is a full
-list of these parameters:
-
-    qos_max_vls    - The maximum number of VLs that will be on the subnet
-    qos_high_limit - The limit of High Priority component of VL
-                     Arbitration table (IBA 7.6.9)
-    qos_vlarb_low  - Low priority VL Arbitration table (IBA 7.6.9)
-                     template
-    qos_vlarb_high - High priority VL Arbitration table (IBA 7.6.9)
-                     template
-                     Both VL arbitration templates are pairs of
-                     VL and weight
-    qos_sl2vl      - SL2VL Mapping table (IBA 7.6.6) template. It is
-                     a list of VLs corresponding to SLs 0-15 (Note
-                     that VL15 used here means drop this SL)
-
-Typical default values (hard-coded in OpenSM initialization) are:
-
-    qos_max_vls=15
-    qos_high_limit=0
-    qos_vlarb_low=0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
-    qos_vlarb_high=0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
-    qos_sl2vl=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
-
-The syntax is compatible with rest of OpenSM configuration options and
-values may be stored in OpenSM config file (cached options file).
-
-In addition to the above, we may define separate QoS configuration
-parameters sets for various target types. As targets, we currently support
-CAs, routers, switch external ports, and switch's enhanced port 0. The
-names of such specialized parameters are prefixed by "qos_<type>_"
-string. Here is a full list of the currently supported sets:
-
-    qos_ca_  - QoS configuration parameters set for CAs.
-    qos_rtr_ - parameters set for routers.
-    qos_sw0_ - parameters set for switches' port 0.
-    qos_swe_ - parameters set for switches' external ports.
+OpenSM QoS support consists of two parts:
 
-Examples:
-    qos_sw0_max_vls=2
-    qos_ca_sl2vl=0,1,2,3,5,5,5,12,12,0,
-    qos_swe_high_limit=0
+ 1. \fBQoS manager in accordance with IBA QoS Annex\fP (experimental)
+.P
+ 2. \fBSL2VL and VL Arbitration tables configuration\fP
+.P
+.SS QoS Manager (experimental)
+.PP
+When Quality of Service in OpenSM is enabled (-Q or --qos), OpenSM looks
+for QoS Policy file. The default name of this file is
+[EMAIL PROTECTED]@/@[EMAIL PROTECTED] The default may be changed by using
+-Y or --qos_policy_file option with OpenSM.
+
+During fabric initialization and at every heavy sweep OpenSM parses the
+QoS policy file, applies its settings to the discovered fabric elements,
+and enforces the provided policy on client requests. The overall flow for
+such requests is as follows:
+ - The request is matched against the defined matching rules such that
+   the QoS Level definition is found.
+ - Given the QoS Level, path(s) search is performed with the given
+   restrictions imposed by that level.
+
+There are two ways to define QoS policy:
+ - \fBFull\fP: the full policy file syntax provides the administrator various
+   ways to match a PathRecord/MultiPathRecord (PR/MPR) request, and to
+   enforce various QoS constraints on the requested PR/MPR.
+ - \fBSimplified\fP: the simplified policy file syntax enables the administrator
+   to match PR/MPR requests by various ULPs and applications running on top of
+   these ULPs.
+
+While the full policy syntax is very flexible, in many cases the simplified
+policy definition would be sufficient.
+.PP
+.B Full QoS Policy File
+.PP
+QoS policy file has the following sections:
+
+.B I)
+Port Groups (denoted by port-groups).
+This section defines zero or more port groups that can be referred to later by
+matching rules (see below). A port group lists ports by:
+ - Port GUID
+ - Port name, which is a combination of NodeDescription and IB port number
+ - PKey, which means that all the ports in the subnet that belong to
+   partition with a given PKey belong to this port group
+ - Partition name, which means that all the ports in the subnet that belong
+   to partition with a given name belong to this port group
+ - Node type, where possible node types are: CA, SWITCH, ROUTER, ALL, and
+   SELF (SM's port).
+
+.B II)
+QoS Setup (denoted by qos-setup).
+This section describes how to set up SL2VL and VL Arbitration tables on
+various nodes in the fabric.
+However, this is not supported in OFED 1.3.
+SL2VL and VLArb tables should be configured in the OpenSM options file.
+
+.B III)
+QoS Levels (denoted by qos-levels).
+Each QoS Level defines Service Level (SL) and a few optional fields:
+ - MTU limit
+ - Rate limit
+ - PKey
+ - Packet lifetime
+
+When path(s) search is performed, it is done with regard to the restrictions
+that these QoS Level parameters impose.
+One QoS level that is mandatory to define is a DEFAULT QoS level. It is
+applied to a PR/MPR query that does not match any existing match rule.
+Similar to any other QoS Level, it can also be explicitly referred to by any
+match rule.
+
+.B IV)
+QoS Matching Rules (denoted by qos-match-rules).
+Each PathRecord/MultiPathRecord query that OpenSM receives is matched against
+the set of matching rules. Rules are scanned in order of appearance in the QoS
+policy file such that the first match takes precedence.
+Each rule has a name of QoS level that will be applied to the matching query.
+A default QoS level is applied to a query that did not match any rule.
+Queries can be matched by:
+ - Source port group (whether a source port is a member of a specified group)
+ - Destination port group (same as above, only for destination port)
+ - PKey
+ - QoS class
+ - Service ID
+
+To match a certain matching rule, PR/MPR query has to match ALL the rule's
+criteria. However, not all the fields of the PR/MPR query have to appear in
+the matching rule.
+For instance, if the rule has a single criterion - Service ID, it will match
+any query that has this Service ID, disregarding the rest of the query fields.
+However, if a certain query has only Service ID (which means that this is the
+only bit in the PR/MPR component mask that is on), it will not match any rule
+that has other matching criteria besides Service ID.
+.PP
+.B Simplified QoS Policy Definition
+.PP
+Simplified QoS policy definition consists of a single section denoted by
+qos-ulps. Similar to the full QoS policy, it has a list of match rules and
+their QoS Level, but in this case a match rule has only one criterion - its
+goal is to match a certain ULP (or a certain application on top of this ULP)
+PR/MPR request, and QoS Level has only one constraint - Service Level (SL).
+The simplified policy section may appear in the policy file in combination
+with the full policy, or as a stand-alone policy definition.
+See more details and list of match rule criteria below.
+.PP
+.B Policy File Syntax Guidelines
+.PP
+Empty lines are ignored.
+Leading and trailing blanks are also ignored, so the
+indentation in the example is just for better readability.
+Comments are started with the pound sign (#) and terminated by EOL.
+Any keyword should be the first non-blank in the line, unless it's a comment.
+Keywords that denote section/subsection start have matching closing keywords.
+Having a QoS Level named "DEFAULT" is a must - it is applied to PR/MPR
+requests that didn't match any of the matching rules.
+Any section/subsection of the policy file is optional.
+
+.PP
+.B Examples of Full Policy File
+.PP
+As mentioned earlier, any section of the policy file is optional, and
+the only mandatory part of the policy file is a default QoS Level.
+Here's an example of the shortest policy file:
+
+    qos-levels
+        qos-level
+            name: DEFAULT
+            sl: 0
+        end-qos-level
+    end-qos-levels
+
+Port groups section is missing because there are no match rules, which means
+that port groups are not referred to anywhere, and there is no need to define
+them. And since this policy file doesn't have any matching rules, PR/MPR query
+won't match any rule, and OpenSM will enforce default QoS level.
+Essentially, the above example is equivalent to not having QoS policy file
+at all.
+
+The following example shows all the possible options and keywords in the
+policy file and their syntax:
+
+    #
+    # See the comments in the following example.
+    # They explain different keywords and their meaning.
+    #
+    port-groups
+        port-group
+            name: Storage
+            # "use" is just a description that is used for logging
+            # Other than that, it is just a comment
+            use: SRP Targets
+            port-guid: 0x10000000000001, 0x10000000000005-0x1000000000FFFA
+            port-guid: 0x1000000000FFFF
+        end-port-group
+
+        port-group
+            name: Virtual Servers
+            # The syntax of the port name is as follows:
+            #   "node_description/Pnum".
+            # node_description is compared to the NodeDescription of the node,
+            # and "Pnum" is a port number on that node.
+            port-name: vs1 HCA-1/P1, vs2 HCA-1/P1
+        end-port-group
+
+        # using partitions defined in the partition policy
+        port-group
+            name: Partitions
+            partition: Part1
+            pkey: 0x1234
+        end-port-group
+
+        # using node types: CA, ROUTER, SWITCH, SELF (for node that runs SM)
+        # or ALL (for all the nodes in the subnet)
+        port-group
+            name: CAs and SM
+            node-type: CA, SELF
+        end-port-group
+
+    end-port-groups
+
+    qos-setup
+        # This section of the policy file describes how to set up SL2VL and VL
+        # Arbitration tables on various nodes in the fabric.
+        # However, this is not supported in OFED 1.3 - the section is parsed
+        # and ignored. SL2VL and VLArb tables should be configured in the
+        # OpenSM options file (by default - /var/cache/opensm/opensm.opts).
+    end-qos-setup
+
+    qos-levels
+
+        # Having a QoS Level named "DEFAULT" is a must - it is applied to
+        # PR/MPR requests that didn't match any of the matching rules.
+        qos-level
+            name: DEFAULT
+            use: default QoS Level
+            sl: 0
+        end-qos-level
+
+        # the whole set: SL, MTU-Limit, Rate-Limit, PKey, Packet Lifetime
+        qos-level
+            name: WholeSet
+            sl: 1
+            mtu-limit: 4
+            rate-limit: 5
+            pkey: 0x1234
+            packet-life: 8
+        end-qos-level
+
+    end-qos-levels
+
+    # Match rules are scanned in order of their appearance in the policy file.
+    # First matched rule takes precedence.
+    qos-match-rules
+
+        # matching by a single criterion: QoS class
+        qos-match-rule
+            use: by QoS class
+            qos-class: 7-9,11
+            # Name of qos-level to apply to the matching PR/MPR
+            qos-level-name: WholeSet
+        end-qos-match-rule
+
+        # show matching by destination group and service id
+        qos-match-rule
+            use: Storage targets
+            destination: Storage
+            service-id: 0x10000000000001, 0x10000000000008-0x10000000000FFF
+            qos-level-name: WholeSet
+        end-qos-match-rule
+
+        qos-match-rule
+            source: Storage
+            use: match by source group only
+            qos-level-name: DEFAULT
+        end-qos-match-rule
+
+        qos-match-rule
+            use: match by all parameters
+            qos-class: 7-9,11
+            source: Virtual Servers
+            destination: Storage
+            service-id: 0x0000000000010000-0x000000000001FFFF
+            pkey: 0x0F00-0x0FFF
+            qos-level-name: WholeSet
+        end-qos-match-rule
+
+    end-qos-match-rules
+
+.PP
+.B Simplified QoS Policy - Details and Examples
+.PP
+Simplified QoS policy match rules are tailored for matching ULPs (or
+some application on top of a ULP) PR/MPR requests. It has a list of
+per-ULP (or per-application) match rules and the SL that should be
+enforced on the matched PR/MPR query.
+
+Match rules include:
+ - Default match rule that is applied to PR/MPR query that didn't
+   match any of the other match rules
+ - SDP
+ - SDP application with a specific target TCP/IP port range
+ - SRP with a specific target IB port GUID
+ - RDS
+ - iSER
+ - iSER application with a specific target TCP/IP port range
+ - IPoIB with a default PKey
+ - IPoIB with a specific PKey
+ - any ULP/application with a specific Service ID in the PR/MPR query
+ - any ULP/application with a specific PKey in the PR/MPR query
+ - any ULP/application with a specific target IB port GUID in the PR/MPR query
+
+Since any section of the policy file is optional, as long as basic rules
+of the file are kept (such as not referring to a nonexistent port group,
+having default QoS Level, etc), the simplified policy section (qos-ulps)
+can serve as a complete QoS policy file.
+The shortest policy file in this case would be as follows:
+
+    qos-ulps
+        default  : 0 #default SL
+    end-qos-ulps
+
+It is equivalent to not having a policy file at all.
+
+Below is an example of simplified QoS policy with all the possible keywords:
+
+    qos-ulps
+        default                       : 0 # default SL
+        sdp, port-num 30000           : 0 # SL for application running on top
+                                          # of SDP when a destination
+                                          # TCP/IP port is 30000
+        sdp, port-num 10000-20000     : 0
+        sdp                           : 1 # default SL for any other
+                                          # application running on top of SDP
+        rds                           : 2 # SL for RDS traffic
+        iser, port-num 900            : 0 # SL for iSER with a specific target
+                                          # port
+        iser                          : 3 # default SL for iSER
+        ipoib, pkey 0x0001            : 0 # SL for IPoIB on partition with
+                                          # pkey 0x0001
+        ipoib                         : 4 # default IPoIB partition,
+                                          # pkey=0x7FFF
+        any, service-id 0x6234        : 6 # match any PR/MPR query with a
+                                          # specific Service ID
+        any, pkey 0x0ABC              : 6 # match any PR/MPR query with a
+                                          # specific PKey
+        srp, target-port-guid 0x1234  : 5 # SRP when SRP Target is located on
+                                          # a specified IB port GUID
+        any, target-port-guid 0x0ABC-0xFFFFF : 6 # match any PR/MPR query with
+                                          # a specific target port GUID
+    end-qos-ulps
+
+
+Similar to the full policy definition, matching of PR/MPR queries is done in
+order of appearance in the QoS policy file such that the first match takes
+precedence, except for the "default" rule, which is applied only if the query
+didn't match any other rule.
+
+All other sections of the QoS policy file take precedence over the qos-ulps
+section. That is, if a policy file has both qos-match-rules and qos-ulps
+sections, then any query is matched first against the rules in the
+qos-match-rules section, and only if there was no match, the query is matched
+against the rules in qos-ulps section.
+
+Note that some of these match rules may overlap, so in order to use the
+simplified QoS definition effectively, it is important to understand how each
+of the ULPs is matched:
+
+.B IPoIB:
+PR query is matched by PKey. Default PKey for IPoIB partition is 0x7fff, so
+the following three match rules are equivalent:
+
+    ipoib              : <SL>
+    ipoib, pkey 0x7fff : <SL>
+    any, pkey 0x7fff   : <SL>
+
+.I Note
+: For OFED 1.3, IPoIB partition SL configuration should be done through
+partition configuration file only.
+
+\fBSDP\fP: PR query is matched by Service ID. The Service-ID for SDP is
+0x000000000001PPPP, where PPPP are 4 hex digits holding the remote TCP/IP
+Port Number to connect to. The following two match rules are equivalent:
+
+    sdp                                                   : <SL>
+    any, service-id 0x0000000000010000-0x000000000001ffff : <SL>
+
+\fBRDS\fP: Similar to SDP, RDS PR query is matched by Service ID. The
+Service ID for RDS is 0x000000000106PPPP, where PPPP are 4 hex digits
+holding the remote TCP/IP Port Number to connect to. Default port number
+for RDS is 0x48CA, which makes a default Service-ID 0x00000000010648CA.
+The following two match rules are equivalent:
+
+    rds                                : <SL>
+    any, service-id 0x00000000010648CA : <SL>
+
+\fBiSER\fP: Similar to RDS, iSER query is matched by Service ID, where the
+Service ID is also 0x000000000106PPPP. Default port number for iSER is 0x035C,
+which makes a default Service-ID 0x000000000106035C.
+The following two match rules are equivalent:
+
+    iser                               : <SL>
+    any, service-id 0x000000000106035C : <SL>
+
+\fBSRP\fP: Service ID for SRP varies from storage vendor to vendor, thus SRP query is
+matched by the target IB port GUID. The following two match rules are
+equivalent:
+
+    srp, target-port-guid 0x1234 : <SL>
+    any, target-port-guid 0x1234 : <SL>
+
+Note that any of the above ULPs might contain target port GUID in the PR
+query, so in order for these queries not to be recognized by the QoS manager
+as SRP, the SRP match rule (or any match rule that refers to the target port
+guid only) should be placed at the end of the qos-ulps match rules.
+
+\fBMPI\fP: SL for MPI is manually configured by MPI admin. OpenSM is not
+forcing any SL on the MPI traffic, and that's why it is the only ULP that
+did not appear in the qos-ulps section.
+
+
+.SS SL2VL Mapping and VL Arbitration
+.PP
+
+OpenSM cached options file has a set of QoS related configuration
+parameters that are used to configure SL2VL mapping and VL arbitration
+on IB ports. These parameters are:
+ - Max VLs: the maximum number of VLs that will be on the subnet.
+ - High limit: the limit of High Priority component of VL Arbitration
+   table (IBA 7.6.9).
+ - VLArb low table: Low priority VL Arbitration table (IBA 7.6.9) template.
+ - VLArb high table: High priority VL Arbitration table (IBA 7.6.9) template.
+ - SL2VL: SL2VL Mapping table (IBA 7.6.6) template. It is a list of VLs
+   corresponding to SLs 0-15 (Note that VL15 used here means drop this SL).
+
+There are separate QoS configuration parameters sets for various target
+types: CAs, routers, switch external ports, and switch's enhanced port 0.
+The names of such parameters are prefixed by "qos_<type>_" string.
+Here is a full list of the currently supported sets:
+
+    qos_ca_  - QoS configuration parameters set for CAs.
+    qos_rtr_ - parameters set for routers.
+    qos_sw0_ - parameters set for switches' port 0.
+    qos_swe_ - parameters set for switches' external ports.
+
+Here's an example of typical default values for all the ports in the
+subnet (hard-coded in OpenSM initialization):
+
+    qos_max_vls=15
+    qos_high_limit=0
+    qos_vlarb_high=0:4,1:0,2:0,3:0,4:0,5:0,6:0,7:0,8:0,9:0,10:0,11:0,12:0,13:0,14:0
+    qos_vlarb_low=0:0,1:4,2:4,3:4,4:4,5:4,6:4,7:4,8:4,9:4,10:4,11:4,12:4,13:4,14:4
+    qos_sl2vl=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
+
+
+VL arbitration tables (both high and low) are lists of VL/Weight pairs.
+Each list entry contains a VL number (values from 0-14), and a weighting value
+(values 0-255), indicating the number of 64 byte units (credits) which may be
+transmitted from that VL when its turn in the arbitration occurs. A weight
+of 0 indicates that this entry should be skipped. If a list entry is
+programmed for VL15 or for a VL that is not supported or is not currently
+configured by the port, the port may either skip that entry or send from any
+supported VL for that entry.
+
+Note that the same VL may be listed multiple times in the High or Low
+priority arbitration tables and, further, may be listed in both tables.
+
+The limit of high-priority VLArb table (qos_<type>_high_limit) indicates the
+number of high-priority packets that can be transmitted without an opportunity
+to send a low-priority packet. Specifically, the number of bytes that can be
+sent is high_limit times 4K bytes.
+
+A high_limit value of 255 indicates that the byte limit is unbounded.
+Note: if the 255 value is used, the low priority VLs may be starved.
+A value of 0 indicates that only a single packet from the high-priority table
+may be sent before an opportunity is given to the low-priority table.
+
+Keep in mind that ports usually transmit packets of size equal to MTU.
+For instance, for 4KB MTU a single packet will require 64 credits, so in order
+to achieve effective VL arbitration for packets of 4KB MTU, the weighting
+values for each VL should be multiples of 64.
+
+Below is an example of SL2VL and VL Arbitration configuration on subnet:
+
+    qos_max_vls=15
+    qos_high_limit=6
+    qos_vlarb_high=0:4
+    qos_vlarb_low=0:0,1:64,2:128,3:192,4:0,5:64,6:64,7:64
+    qos_sl2vl=0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,7
+
+In this example, there are 8 VLs configured on subnet: VL0 to VL7. VL0 is
+defined as a high priority VL, and it is limited to 6 x 4KB = 24KB in a single
+transmission burst. Such a configuration would suit a VL that needs low latency
+and uses small MTU when transmitting packets. The rest of the VLs are defined
+as low priority VLs with different weights, while VL4 is effectively turned off.
 .SH PREFIX ROUTES
 .PP
--
1.5.1.4

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
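
The matching semantics described under "QoS Matching Rules" in the patch (a query must carry every criterion a rule names, rules are scanned in file order, the first match wins, and DEFAULT applies otherwise) can be illustrated with a minimal Python sketch. This is not OpenSM code: `MatchRule`, `find_qos_level`, the field names and the level name "ByServiceId" are all hypothetical, and a query is modelled as a dict holding only the fields whose component-mask bit is set.

```python
from dataclasses import dataclass, field


@dataclass
class MatchRule:
    """Hypothetical stand-in for one qos-match-rule entry (not an OpenSM type)."""
    qos_level_name: str
    # Maps a criterion name (e.g. "service_id") to the set of accepted values.
    criteria: dict = field(default_factory=dict)


def find_qos_level(query, rules, default="DEFAULT"):
    """Return the QoS level name of the first rule that matches the query.

    `query` holds only the fields whose component-mask bit is set.
    """
    for rule in rules:  # rules are scanned in order of appearance
        if all(name in query and query[name] in accepted
               for name, accepted in rule.criteria.items()):
            return rule.qos_level_name  # first match takes precedence
    return default  # no rule matched, so the DEFAULT level applies


rules = [
    MatchRule("WholeSet", {"service_id": {0x10000}, "pkey": {0x1234}}),
    MatchRule("ByServiceId", {"service_id": {0x10000}}),
]

only_sid = {"service_id": 0x10000}
sid_and_pkey = {"service_id": 0x10000, "pkey": 0x1234, "qos_class": 7}
only_pkey = {"pkey": 0x1234}

print(find_qos_level(only_sid, rules))      # skips the rule that also needs a PKey
print(find_qos_level(sid_and_pkey, rules))  # extra query fields are disregarded
print(find_qos_level(only_pkey, rules))     # falls through to the default level
```

The query that carries only a Service ID does not match the first rule, because that rule also requires a PKey; it is picked up by the second, Service-ID-only rule instead.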

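
The VL arbitration arithmetic from the "SL2VL Mapping and VL Arbitration" section of the patch (weights counted in 64-byte credits, high_limit counted in 4K-byte units, 255 meaning unbounded) can be checked with a small Python sketch. The helper names below are hypothetical and only mirror the numbers stated in the man page text; counting whole MTU-sized packets per turn is a simplification for illustration, not a description of how a port actually arbitrates.

```python
CREDIT_BYTES = 64             # one weight unit, per the man page text
HIGH_LIMIT_UNIT_BYTES = 4096  # high_limit is counted in 4K-byte units
HIGH_LIMIT_UNBOUNDED = 255    # special value: no byte limit


def parse_vlarb(table):
    """Parse 'VL:weight,VL:weight,...' into a list of (vl, weight) tuples."""
    pairs = []
    for entry in table.split(","):
        vl, weight = entry.split(":")
        pairs.append((int(vl), int(weight)))
    return pairs


def full_packets_per_turn(weight, mtu_bytes):
    """Number of full-MTU packets covered by one arbitration entry's weight."""
    return (weight * CREDIT_BYTES) // mtu_bytes


def high_limit_bytes(high_limit):
    """Byte budget of the high-priority table; None means unbounded.

    A result of 0 corresponds to the "single packet" case in the text.
    """
    if high_limit == HIGH_LIMIT_UNBOUNDED:
        return None
    return high_limit * HIGH_LIMIT_UNIT_BYTES


# Values taken from the example configuration in the patch.
low = parse_vlarb("0:0,1:64,2:128,3:192,4:0,5:64,6:64,7:64")
for vl, weight in low:
    if weight == 0:
        print(f"VL{vl}: weight 0, entry skipped")
    else:
        packets = full_packets_per_turn(weight, 4096)
        print(f"VL{vl}: {packets} packet(s) of 4KB per turn")

print(high_limit_bytes(6))  # 6 x 4KB, the burst limit quoted for VL0
```

With a 4KB MTU, a weight of 64 covers exactly one packet, which is why the text recommends weights that are multiples of 64.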