Hi Eitan, On 8/28/07, Eitan Zahavi <[EMAIL PROTECTED]> wrote: > Hi Hal, > > As you pointed out, coming up with a VLArb and SL2VL manager that is > supporting both routing and QoS is not trivial. > But even with a uniform and statically assigned tables (supported by the > current options config file) different QoS classes can be provided to > different clients by the new QoS policy code. > > I agree our OFED 1.3 is somewhat limited as it does not solve the VLArb > and SL2VL setting issue. > But with the time constrains of the development, review and QA this is > the most we could do for 1.3
How much of the QoS policy syntax is left "dormant" by this ? How much is actually used ? Can this be detailed somewhere ? Also, there was a previous question on the list on IPoIB CM relative to QoS which I either missed the answer to or was left unanswered: How is IPoIB CM dealt with in terms of QoS ? Is it like IPoIB UD ? What documentation will be provided (e.g. man page and perhaps other doc file for this as it is quite complex) ? It would be nice to see the complete picture but it seems like this will be a set of incremental changes for which some looking back over the entirety will be needed when things settle down. -- Hal > EZ > > Eitan Zahavi > Senior Engineering Director, Software Architect > Mellanox Technologies LTD > Tel:+972-4-9097208 > Fax:+972-4-9593245 > P.O. Box 586 Yokneam 20692 ISRAEL > > > > > -----Original Message----- > > From: [EMAIL PROTECTED] > > [mailto:[EMAIL PROTECTED] On Behalf Of > > Hal Rosenstock > > Sent: Monday, August 27, 2007 7:33 PM > > To: Yevgeny Kliteynik > > Cc: OpenIB > > Subject: Re: [ofa-general] Re: [PATCH 3/7 V2] osm: QoS policy > > C & H files > > > > On 8/23/07, Yevgeny Kliteynik <[EMAIL PROTECTED]> wrote: > > > Hi Sasha, > > > > > > Sasha Khapyorsky wrote: > > > > On 11:11 Thu 23 Aug , Yevgeny Kliteynik wrote: > > > >>>> +/***************************************************/ > > > >>>> + > > > >>>> +typedef struct osm_qos_port_group_t_ { > > > >>>> + char *name; /* single string (this port > > group name) */ > > > >>>> + char *use; /* single string (description) */ > > > >>>> + cl_list_t port_name_list; /* list of port > > names (.../.../...) */ > > > >>>> + uint64_t **guid_range_arr; /* array of guid > > ranges (pair of 64-bit > > > >>>> guids) */ > > > >>>> + unsigned guid_range_len; /* num of guid > > ranges in the array */ > > > >>>> + cl_list_t partition_list; /* list of > > partition names */ > > > >>>> + boolean_t node_type_ca; > > > >>>> + boolean_t node_type_switch; > > > >>>> + boolean_t node_type_router; > > > >>>> + boolean_t node_type_self; > > > >>>> +} osm_qos_port_group_t; > > > >>> I see you are using this in "run-time", not just during > > the parsing. > > > >>> Instead of having all this config features you can just resolve > > > >>> port guids in parse time and keep it here in cl_map() > > for fast searches. > > > >> By saying "config features", do you mean the four boolean flags? > > > > > > > > No, I mean everything except name and use. > > > > > > > >> It looks to me that checking the type of node is as fast as it > > > >> gets, and it won't hurt to leave these booleans instead of > > > >> resolving all the guids. > > > > > > > > It could be optimization when types are specified, which is not > > > > always the case and then you are going to do linear > > searches over all lists. > > > > > > Right, it would be linear scanning of list of partitions. > > > > > > > And how something like "for each guid in this group" (which is > > > > needed for QoS port parameters setup) should be resolved? By > > > > matching each guid in the subnet against those lists? > > > > > > Good point. > > > > > > >> Moreover, the guids here are stored in range array, > > which is IMO > > > >> better suited for the policy file syntax, because if a user > > > >> specifies something like this "0x0-0x0FFF" in guids, it will be > > > >> only one element of the array, which is efficient both > > in memory and in serch time. > > > > > > > > And what should be there if user specifies ports in the group as: > > > > guid1, guid2, guid3, etc. ? > > > > > > The range array is sorted and "shrinked". > > > That is, if a user specifies guids as "30,1,2,3-20,15", > > eventually you > > > would get two elements in the array: 1-20 and 30-30. > > > cl_map implemented as a binary tree, right? > > > And doing binary search in this kind of array is faster > > than searching > > > a guid in cl_map of all the guids. > > > Worst case - you will get the same performance as in case > > of cl_map if > > > *all* the guids are "discreet" and can't be groupped in ranges. > > > > > > Nevertheless, I agree that there's a problem if there are many > > > partitions in the list (although I'm not sure it's a practical case) > > > > > > I'll work on this. > > > > > > >> (I probably should mention here that the efficient > > search in the range > > > >> array is not implemented yet, but it would be a simple > > binary search - > > > >> there's a "todo" comment in the search function right now) > > > >> > > > >>>> + > > > >>>> +osm_qos_port_group_t *osm_qos_policy_port_group_create(); > > > >>>> +void osm_qos_policy_port_group_destroy(); > > > >>> Would be nice to have function prototypes in one place. > > > >>>> + > > > >>>> +/***************************************************/ > > > >>>> + > > > >>>> +typedef struct osm_qos_vlarb_scope_t_ { > > > >>>> + cl_list_t group_list; /* list of group names (strings) */ > > > >>>> + cl_list_t across_list; /* list of 'across' group > > names (strings) */ > > > >>>> + cl_list_t vlarb_high_list; /* list of num > > pairs (n:m,...), 32-bit values > > > >>>> */ > > > >>>> + cl_list_t vlarb_low_list; /* list of num > > pairs (n:m,...), 32-bit values > > > >>>> */ > > > >>> Why cl_list for VLArb? it should be short fixed length arrays? > > > >> Right. > > > >> Since the actual VLArb setup is not implemented yet, I > > didn't see > > > >> this obvious thing. > > > > > > > > But it should be implemented. Right? > > > > > > Sure, but not for OFED 1.3 - we have a feature freeze in 11 days. > > > > Without VLArb configuration, where does this come from ? What > > is the QoS support then: just SL ? > > > > -- Hal > > > > > > > > >>>> + uint32_t vl_high_limit; /* single integer */ > > > >>>> + boolean_t vl_high_limit_set; > > > >>>> +} osm_qos_vlarb_scope_t; > > > >>>> + > > > >>>> +osm_qos_vlarb_scope_t *osm_qos_policy_vlarb_scope_create(); > > > >>>> +void osm_qos_policy_vlarb_scope_destroy(); > > > >>>> + > > > >>>> +/***************************************************/ > > > >>>> + > > > >>>> +typedef struct osm_qos_sl2vl_scope_t_ { > > > >>>> + cl_list_t group_list; /* list of strings (port > > group names) */ > > > >>>> + boolean_t from[OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH]; > > > >>>> + boolean_t to[OSM_QOS_POLICY_MAX_PORTS_ON_SWITCH]; > > > >>>> + cl_list_t across_from_list; /* list of strings > > (port group names) */ > > > >>>> + cl_list_t across_to_list; /* list of strings > > (port group names) */ > > > >>>> + uint8_t sl2vl_table[16]; /* array of sl2vl values */ > > > >>>> + boolean_t sl2vl_table_set; > > > >>>> +} osm_qos_sl2vl_scope_t; > > > >>> This will be used for sl2vl setup? Same as above - why not to > > > >>> generate final port guid list just in parse time? > > > >>>> + > > > >>>> +osm_qos_sl2vl_scope_t *osm_qos_policy_sl2vl_scope_create(); > > > >>>> +void osm_qos_policy_sl2vl_scope_destroy(); > > > >>>> + > > > >>>> +/***************************************************/ > > > >>>> + > > > >>>> +typedef struct osm_qos_level_t_ { > > > >>>> + char *use; > > > >>>> + char *name; > > > >>>> + uint8_t sl; > > > >>>> + boolean_t sl_set; > > > >>>> + uint8_t mtu_limit; > > > >>>> + boolean_t mtu_limit_set; > > > >>>> + uint8_t rate_limit; > > > >>>> + boolean_t rate_limit_set; > > > >>>> + uint8_t pkt_life; > > > >>>> + boolean_t pkt_life_set; > > > >>>> + uint64_t **path_bits_range_arr; /* array of bit > > ranges (real > > > >>>> +values are > > > >>>> 32bits) */ > > > >>>> + unsigned path_bits_range_len; /* num of bit > > ranges in the array */ > > > >>>> + uint64_t **pkey_range_arr; /* array of PKey > > ranges (real values are > > > >>>> 16bits) */ > > > >>>> + unsigned pkey_range_len; > > > >>>> +} osm_qos_level_t; > > > >>>> + > > > >>>> +osm_qos_level_t *osm_qos_policy_qos_level_create(); > > > >>>> +void osm_qos_policy_qos_level_destroy(); > > > >>>> + > > > >>>> +boolean_t osm_qos_level_has_pkey(IN const > > osm_qos_level_t * p_qos_level, > > > >>>> + IN ib_net16_t pkey); > > > >>>> + > > > >>>> +ib_net16_t osm_qos_level_get_shared_pkey(IN const > > > >>>> +osm_qos_level_t * > > > >>>> p_qos_level, > > > >>>> + IN const > > osm_physp_t * p_src_physp, > > > >>>> + IN const osm_physp_t * > > > >>>> + p_dest_physp); > > > >>>> + > > > >>>> +/***************************************************/ > > > >>>> + > > > >>>> +typedef struct osm_qos_match_rule_t_ { > > > >>>> + char *use; > > > >>>> + cl_list_t source_list; /* list of strings */ > > > >>>> + cl_list_t source_group_list; /* list of pointers > > to relevant port-group > > > >>>> */ > > > >>>> + cl_list_t destination_list; /* list of strings */ > > > >>>> + cl_list_t destination_group_list; /* list of > > pointers to relevant > > > >>>> port-group */ > > > >>> I think you should only keep port guids there (mapped > > for fast searches). > > > >> Same as above. > > > >> I think that checking node type and then guid range array is > > > >> essentially faster than checking guid map. > > > > > > > > _Only_ for case when the group is specified by type, > > which is likely > > > > will not be typical. > > > > > > Again, in *worst* case you will get the same performance in > > searching > > > range array as in searching cl_map (which is a binary tree). > > > > > > >> You might say, of course, that there can be many port > > groups in > > > >> the same match rule, but I don't see this as a practical > > example. > > > > > > > > Of course it could (or you must disable multi groups > > here, which is > > > > not good idea I think and not what was presented in the RFC). > > > > > > Can you elaborate on this? > > > Do you think that having multiple groups is not useful at all? > > > > > > > Also each group still have this "linear searchable" lists: > > > > > > > > cl_list_t port_name_list; /* list of port > > names (.../.../...) */ > > > > > > Port names are not implemented yet :) > > > There's a "todo" comment in the code. > > > This is the only keyword in the parser that is not implemented - I > > > mean it's parsed, but other than creating this list the > > parser doesn't do anything with it. > > > But It definitely won't stay as a pure list. > > > > > > > uint64_t **guid_range_arr; /* array of guid > > ranges (pair of 64-bit > > > > > > This is not just a list, as I've explained before. > > > > > > > cl_list_t partition_list; /* list of > > partition names */ > > > > > > I can hardly believe than there will be more than one or two > > > partitions in this list. Do you think otherwise? > > > > > > Anyway, I can extract all the guids and prepare a map. This is > > > certainly the easiest implementation. And I still think that on > > > average, guid ranges and partition lists are better, but > > who knows - > > > perhaps someone would want to define a bunch of partitions > > in a single > > > port group and ruin my average... :) > > > > > > >>>> + char *qos_level_name; > > > >>>> + osm_qos_level_t *p_qos_level; > > > >>> Why do you need qos_level_name if you keep the pointer to this > > > >>> qos_level struct? > > > >> In policy file the match rule might appear before the > > QoS levels, > > > >> so matching qos level names to the actual qos levels is > > done when > > > >> the parsing is done. > > > >> > > > >>>> + uint64_t **service_id_range_arr; /* array of > > SID ranges (64-bit values) > > > >>>> */ > > > >>>> + unsigned service_id_range_len; uint64_t > > > >>>> + **qos_class_range_arr; /* array of QoS Class ranges (real > > > >>>> values are 16bits) */ > > > >>>> + unsigned qos_class_range_len; > > > >>>> + uint64_t **pkey_range_arr; /* array of PKey > > ranges (real values are > > > >>>> 16bits) */ > > > >>>> + unsigned pkey_range_len; > > > >>>> +} osm_qos_match_rule_t; > > > >>>> + > > > >>>> +osm_qos_match_rule_t *osm_qos_policy_match_rule_create(); > > > >>>> +void osm_qos_policy_match_rule_destroy(); > > > >>>> + > > > >>>> +/***************************************************/ > > > >>>> + > > > >>>> +typedef struct osm_qos_policy_t_ { > > > >>>> + cl_list_t port_groups; /* list of osm_qos_port_group_t */ > > > >>>> + cl_list_t sl2vl_tables; /* list of osm_qos_sl2vl_scope_t */ > > > >>>> + cl_list_t vlarb_tables; /* list of osm_qos_vlarb_scope_t */ > > > >>>> + cl_list_t qos_levels; /* list of osm_qos_level_t */ > > > >>>> + cl_list_t qos_match_rules; /* list of > > osm_qos_match_rule_t */ > > > >>>> + osm_qos_level_t *p_default_qos_level; /* default > > QoS level */ > > > >>>> +} osm_qos_policy_t; > > > >>>> + > > > >>>> +void osm_qos_policy_create(); > > > >>>> +void osm_qos_policy_destroy(); > > > >>>> +int osm_qos_policy_validate(); > > > >>>> + > > > >>>> +void osm_qos_policy_get_qos_level_by_pr(IN const > > osm_pr_rcv_t * p_rcv, > > > >>>> + IN const > > ib_path_rec_t * p_pr, > > > >>>> + IN const > > osm_physp_t * p_src_physp, > > > >>>> + IN const > > osm_physp_t * p_dest_physp, > > > >>>> + IN ib_net64_t comp_mask, > > > >>>> + OUT osm_qos_level_t ** > > > >>>> +pp_qos_level); > > > >>>> + > > > >>>> +/***************************************************/ > > > >>>> + > > > >>>> +int osm_qos_parse_policy_file(IN osm_log_t * p_log, IN const > > > >>>> +char > > > >>>> *policy_file); > > > >>>> + > > > >>>> +/***************************************************/ > > > >>>> + > > > >>>> +#endif /* ifndef > > OSM_QOS_POLICY_H */ > > > >>>> diff --git a/opensm/opensm/osm_qos_policy.c > > > >>>> b/opensm/opensm/osm_qos_policy.c new file mode 100644 index > > > >>>> 0000000..bc2aa68 > > > >>>> --- /dev/null > > > >>>> +++ b/opensm/opensm/osm_qos_policy.c > > > >>>> @@ -0,0 +1,901 @@ > > > >>>> +/* > > > >>>> + * Copyright (c) 2004-2006 Voltaire, Inc. All rights reserved. > > > >>>> + * Copyright (c) 2002-2005 Mellanox Technologies LTD. > > All rights > > > >>>> reserved. > > > >>>> + * Copyright (c) 1996-2003 Intel Corporation. All > > rights reserved. > > > >>>> + * > > > >>>> + * This software is available to you under a choice of one of > > > >>>> + two > > > >>>> + * licenses. You may choose to be licensed under the > > terms of > > > >>>> + the GNU > > > >>>> + * General Public License (GPL) Version 2, available from the > > > >>>> + file > > > >>>> + * COPYING in the main directory of this source tree, or the > > > >>>> + * OpenIB.org BSD license below: > > > >>>> + * > > > >>>> + * Redistribution and use in source and binary > > forms, with or > > > >>>> + * without modification, are permitted provided > > that the following > > > >>>> + * conditions are met: > > > >>>> + * > > > >>>> + * - Redistributions of source code must retain the above > > > >>>> + * copyright notice, this list of conditions > > and the following > > > >>>> + * disclaimer. > > > >>>> + * > > > >>>> + * - Redistributions in binary form must > > reproduce the above > > > >>>> + * copyright notice, this list of conditions > > and the following > > > >>>> + * disclaimer in the documentation and/or > > other materials > > > >>>> + * provided with the distribution. > > > >>>> + * > > > >>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY > > > >>>> + KIND, > > > >>>> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE > > > >>>> + WARRANTIES OF > > > >>>> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND > > > >>>> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR > > COPYRIGHT > > > >>>> + HOLDERS > > > >>>> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > > LIABILITY, WHETHER > > > >>>> + IN AN > > > >>>> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > > FROM, OUT OF > > > >>>> + OR IN > > > >>>> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > > DEALINGS IN > > > >>>> + THE > > > >>>> + * SOFTWARE. > > > >>>> + * > > > >>>> + */ > > > >>>> + > > > >>>> +/* > > > >>>> + * Abstract: > > > >>>> + * OSM QoS Policy functions. > > > >>>> + * > > > >>>> + * Environment: > > > >>>> + * Linux User Mode > > > >>>> + * > > > >>>> + * Author: > > > >>>> + * Yevgeny Kliteynik, Mellanox > > > >>>> + */ > > > >>>> + > > > >>>> +#include <opensm/osm_qos_policy.h> #include > > > >>>> +<opensm/osm_qos_parser_y.h> #include <opensm/osm_partition.h> > > > >>>> + > > > >>>> +extern void yyerror(char *s); > > > >>>> +osm_log_t *p_qos_parser_osm_log = NULL; osm_qos_policy_t > > > >>>> +*p_qos_policy = NULL; > > > >>> Please try to avoid globals - keep it as part of > > osm_opensm_t or > > > >>> osm_subn_t structures. > > > >> I thought about it, but didn't want to "condaminate" the > > > >> osm_opensm_t or osm_subn_t structures untill the QoS > > functionality is ready. > > > > > > > > How globals are better in this sense? > > > > > > They're not :) > > > The difference is that less files are modified. > > > But I agree that the QoS policy should be part of > > osm_opensm_t or osm_subn_t. > > > > > > -- Yevgeny > > > > > > > Sasha > > > > > > > > > > _______________________________________________ > > > general mailing list > > > [email protected] > > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > > > To unsubscribe, please visit > > > http://openib.org/mailman/listinfo/openib-general > > > > > _______________________________________________ > > general mailing list > > [email protected] > > http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general > > > > To unsubscribe, please visit > > http://openib.org/mailman/listinfo/openib-general > > > _______________________________________________ general mailing list [email protected] http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
