Acked-by: Jarno Rajahalme <[email protected]>

> On Jul 2, 2015, at 8:41 PM, Ben Pfaff <[email protected]> wrote:
>
> OpenFlow 1.4 introduces the ability to turn on flow table eviction with an
> OFPT_TABLE_MOD message specifying OFPTC_EVICTION. It also adds related
> machinery to other messages that mention OFPTC_* fields. This commit adds
> support for the new feature, implementing it as a second, parallel way to
> enable flow table eviction. It takes more work than it seems like it
> should because there is so much weirdness with the treatment of OFPTC_*
> flags over the evolution of OpenFlow; please refer to the explanation in
> DESIGN.md for more information.
>
> This commit also adds related support to ovs-ofctl, plus tests.
>
> Signed-off-by: Ben Pfaff <[email protected]>
> Co-authored-by: Saloni Jain <[email protected]>
> Signed-off-by: Saloni Jain <[email protected]>
> ---
> v1->v2: Miscellaneous style fixes. Comment improvements.
> Introduce OFPROTO_EVICTION_FLAGS. Fix documentation.
>
>  DESIGN.md                       |  85 +++++++++++++-
>  NEWS                            |   1 +
>  include/openflow/openflow-1.3.h |   9 +-
>  lib/ofp-parse.c                 |  52 ++++++---
>  lib/ofp-parse.h                 |   2 +-
>  lib/ofp-print.c                 |  90 ++++++++++----
>  lib/ofp-util.c                  | 220 ++++++++++++++++++++++++++---------
>  lib/ofp-util.h                  |  52 ++++++++-
>  ofproto/ofproto-provider.h      |   9 +-
>  ofproto/ofproto.c               | 173 +++++++++++++++++----------
>  ofproto/ofproto.h               |  18 +--
>  tests/ofp-print.at              |   2 +-
>  tests/ofproto.at                |  23 +---
>  utilities/ovs-ofctl.8.in        |  36 +++---
>  utilities/ovs-ofctl.c           |  34 +++---
>  vswitchd/bridge.c               |   7 +-
>  vswitchd/vswitch.xml            | 252 ++++++++++++++++++++++------------------
>  17 files changed, 721 insertions(+), 344 deletions(-)
>
> diff --git a/DESIGN.md b/DESIGN.md
> index e533b7c..38413d7 100644
> --- a/DESIGN.md
> +++ b/DESIGN.md
> @@ -277,13 +277,19 @@ The table for 1.3 is the same as the one shown above for 1.2.
>
>
>  OpenFlow 1.4
> -------------
> +-----------
> +
> +OpenFlow 1.4 makes these changes:
> +
> +  - Adds the "importance" field to flow_mods, but it does not
> +    explicitly specify which kinds of flow_mods set the importance.
> +    For consistency, Open vSwitch uses the same rule for importance
> +    as for idle_timeout and hard_timeout, that is, only an "ADD"
> +    flow_mod sets the importance. (This issue has been filed with
> +    the ONF as EXT-496.)
>
> -OpenFlow 1.4 adds the "importance" field to flow_mods, but it does not
> -explicitly specify which kinds of flow_mods set the importance. For
> -consistency, Open vSwitch uses the same rule for importance as for
> -idle_timeout and hard_timeout, that is, only an "ADD" flow_mod sets
> -the importance. (This issue has been filed with the ONF as EXT-496.)
> +  - Eviction Mechanism to automatically delete entries of lower
> +    importance to make space for newer entries.
>
>
>  OpenFlow 1.4 Bundles
> @@ -606,6 +612,73 @@ Tables 128 and above are reserved for use by the switch itself.
>  Controllers should use only tables 0 through 127.
>
>
> +OFPTC_* Table Configuration
> +===========================
> +
> +This section covers the history of the OFPTC_* table configuration
> +bits across OpenFlow versions.
> +
> +OpenFlow 1.0 flow tables had fixed configurations.
> +
> +OpenFlow 1.1 enabled controllers to configure behavior upon flow table
> +miss and added the OFPTC_MISS_* constants for that purpose. OFPTC_*
> +did not control anything else but it was nevertheless conceptualized
> +as a set of bit-fields instead of an enum. OF1.1 added the
> +OFPT_TABLE_MOD message to set OFPTC_MISS_* for a flow table and added
> +the 'config' field to the OFPST_TABLE reply to report the current
> +setting.
> +
> +OpenFlow 1.2 did not change anything in this regard.
> +
> +OpenFlow 1.3 switched to another means to changing flow table miss
> +behavior and deprecated OFPTC_MISS_* without adding any more OFPTC_*
> +constants. This meant that OFPT_TABLE_MOD now had no purpose at all,
> +but OF1.3 kept it around "for backward compatibility with older and
> +newer versions of the specification." At the same time, OF1.3
> +introduced a new message OFPMP_TABLE_FEATURES that included a field
> +'config' documented as reporting the OFPTC_* values set with
> +OFPT_TABLE_MOD; of course this served no real purpose because no
> +OFPTC_* values are defined. OF1.3 did remove the OFPTC_* field from
> +OFPMP_TABLE (previously named OFPST_TABLE).
> +
> +OpenFlow 1.4 defined two new OFPTC_* constants, OFPTC_EVICTION and
> +OFPTC_VACANCY_EVENTS, using bits that did not overlap with
> +OFPTC_MISS_* even though those bits had not been defined since OF1.2.
> +OFPT_TABLE_MOD still controlled these settings. The field for OFPTC_*
> +values in OFPMP_TABLE_FEATURES was renamed from 'config' to
> +'capabilities' and documented as reporting the flags that are
> +supported in a OFPT_TABLE_MOD message. The OFPMP_TABLE_DESC message
> +newly added in OF1.4 reported the OFPTC_* setting.
> +
> +OpenFlow 1.5 did not change anything in this regard.
> +
> +The following table summarizes. The columns say:
> +
> +  - OpenFlow version(s).
> +
> +  - The OFPTC_* flags defined in those versions.
> +
> +  - Whether OFPT_TABLE_MOD can modify OFPTC_* flags.
> +
> +  - Whether OFPST_TABLE/OFPMP_TABLE reports the OFPTC_* flags.
> +
> +  - What OFPMP_TABLE_FEATURES reports (if it exists): either the
> +    current configuration or the switch's capabilities.
> +
> +  - Whether OFPMP_TABLE_DESC reports the current configuration.
> +
> +OpenFlow   OFPTC_* flags            TABLE_MOD stats? TABLE_FEATURES TABLE_DESC
> +---------  -----------------------  --------- ------ -------------- ----------
> +OF1.0      none                     no[*][+]  no[*]  nothing[*][+]  no[*][+]
> +OF1.1/1.2  MISS_*                   yes       yes    nothing[+]     no[+]
> +OF1.3      none                     yes[*]    no[*]  config[*]      no[*][+]
> +OF1.4/1.5  EVICTION/VACANCY_EVENTS  yes       no     capabilities   yes
> +
> +   [*] Nothing to report/change anyway.
> + > + [+] No such message. > + > + > IPv6 > ==== > > diff --git a/NEWS b/NEWS > index 3da7cfd..57e4f89 100644 > --- a/NEWS > +++ b/NEWS > @@ -7,6 +7,7 @@ Post-v2.4.0 > * Group chaining (where one OpenFlow group triggers another) is > now supported. > * OpenFlow 1.4+ "importance" is now considered for flow eviction. > + * OpenFlow 1.4+ OFPTC_EVICTION is now implemented. > - Support for matching and generating options with Geneve tunnels. > - Support Multicast Listener Discovery (MLDv1 and MLDv2). > > diff --git a/include/openflow/openflow-1.3.h b/include/openflow/openflow-1.3.h > index 142d32c..cf93429 100644 > --- a/include/openflow/openflow-1.3.h > +++ b/include/openflow/openflow-1.3.h > @@ -232,7 +232,14 @@ struct ofp13_table_features { > char name[OFP_MAX_TABLE_NAME_LEN]; > ovs_be64 metadata_match; /* Bits of metadata table can match. */ > ovs_be64 metadata_write; /* Bits of metadata table can write. */ > - ovs_be32 config; /* Bitmap of OFPTC_* values */ > + > + /* In OF1.3 this field was named 'config' and it was useless because > OF1.3 > + * did not define any OFPTC_* bits. > + * > + * OF1.4 renamed this field to 'capabilities' and added OFPTC14_EVICTION > + * and OFPTC14_VACANCY_EVENTS. */ > + ovs_be32 capabilities; /* Bitmap of OFPTC_* values */ > + > ovs_be32 max_entries; /* Max number of entries supported. */ > > /* Table Feature Property list */ > diff --git a/lib/ofp-parse.c b/lib/ofp-parse.c > index df32819..9e88d6d 100644 > --- a/lib/ofp-parse.c > +++ b/lib/ofp-parse.c > @@ -869,20 +869,20 @@ parse_ofp_flow_mod_str(struct ofputil_flow_mod *fm, > const char *string, > return error; > } > > -/* Convert 'table_id' and 'flow_miss_handling' (as described for the > - * "mod-table" command in the ovs-ofctl man page) into 'tm' for sending the > - * specified table_mod 'command' to a switch. 
> +/* Convert 'table_id' and 'setting' (as described for the "mod-table" command > + * in the ovs-ofctl man page) into 'tm' for sending a table_mod command to a > + * switch. > + * > + * Stores a bitmap of the OpenFlow versions that are usable for 'tm' into > + * '*usable_versions'. > * > * Returns NULL if successful, otherwise a malloc()'d string describing the > * error. The caller is responsible for freeing the returned string. */ > char * OVS_WARN_UNUSED_RESULT > parse_ofp_table_mod(struct ofputil_table_mod *tm, const char *table_id, > - const char *flow_miss_handling, > - enum ofputil_protocol *usable_protocols) > + const char *setting, uint32_t *usable_versions) > { > - /* Table mod requires at least OF 1.1. */ > - *usable_protocols = OFPUTIL_P_OF11_UP; > - > + *usable_versions = 0; > if (!strcasecmp(table_id, "all")) { > tm->table_id = OFPTT_ALL; > } else { > @@ -892,18 +892,38 @@ parse_ofp_table_mod(struct ofputil_table_mod *tm, const > char *table_id, > } > } > > - if (strcmp(flow_miss_handling, "controller") == 0) { > - tm->miss_config = OFPUTIL_TABLE_MISS_CONTROLLER; > - } else if (strcmp(flow_miss_handling, "continue") == 0) { > - tm->miss_config = OFPUTIL_TABLE_MISS_CONTINUE; > - } else if (strcmp(flow_miss_handling, "drop") == 0) { > - tm->miss_config = OFPUTIL_TABLE_MISS_DROP; > + tm->miss = OFPUTIL_TABLE_MISS_DEFAULT; > + tm->eviction = OFPUTIL_TABLE_EVICTION_DEFAULT; > + tm->eviction_flags = UINT32_MAX; > + > + /* Only OpenFlow 1.1 and 1.2 can configure table-miss via table_mod. > + * Only OpenFlow 1.4+ can configure eviction via table_mod. > + * > + * (OpenFlow 1.4+ can also configure vacancy events via table_mod, but > OVS > + * doesn't support those yet and they're also logically a per-OpenFlow > + * session setting so it wouldn't make sense to support them here > anyway.) 
> + */ > + if (!strcmp(setting, "controller")) { > + tm->miss = OFPUTIL_TABLE_MISS_CONTROLLER; > + *usable_versions = (1u << OFP11_VERSION) | (1u << OFP12_VERSION); > + } else if (!strcmp(setting, "continue")) { > + tm->miss = OFPUTIL_TABLE_MISS_CONTINUE; > + *usable_versions = (1u << OFP11_VERSION) | (1u << OFP12_VERSION); > + } else if (!strcmp(setting, "drop")) { > + tm->miss = OFPUTIL_TABLE_MISS_DROP; > + *usable_versions = (1u << OFP11_VERSION) | (1u << OFP12_VERSION); > + } else if (!strcmp(setting, "evict")) { > + tm->eviction = OFPUTIL_TABLE_EVICTION_ON; > + *usable_versions = (1 << OFP14_VERSION) | (1u << OFP15_VERSION); > + } else if (!strcmp(setting, "noevict")) { > + tm->eviction = OFPUTIL_TABLE_EVICTION_OFF; > + *usable_versions = (1 << OFP14_VERSION) | (1u << OFP15_VERSION); > } else { > - return xasprintf("invalid flow_miss_handling %s", > flow_miss_handling); > + return xasprintf("invalid table_mod setting %s", setting); > } > > if (tm->table_id == 0xfe > - && tm->miss_config == OFPUTIL_TABLE_MISS_CONTINUE) { > + && tm->miss == OFPUTIL_TABLE_MISS_CONTINUE) { > return xstrdup("last table's flow miss handling can not be continue"); > } > > diff --git a/lib/ofp-parse.h b/lib/ofp-parse.h > index 0749d5b..930e3d9 100644 > --- a/lib/ofp-parse.h > +++ b/lib/ofp-parse.h > @@ -49,7 +49,7 @@ char *parse_ofp_flow_mod_str(struct ofputil_flow_mod *, > const char *string, > > char *parse_ofp_table_mod(struct ofputil_table_mod *, > const char *table_id, const char > *flow_miss_handling, > - enum ofputil_protocol *usable_protocols) > + uint32_t *usable_versions) > OVS_WARN_UNUSED_RESULT; > > char *parse_ofp_flow_mod_file(const char *file_name, int command, > diff --git a/lib/ofp-print.c b/lib/ofp-print.c > index 1b25180..d76134f 100644 > --- a/lib/ofp-print.c > +++ b/lib/ofp-print.c > @@ -942,23 +942,54 @@ ofp_print_port_mod(struct ds *string, const struct > ofp_header *oh) > } > } > > -static void > -ofp_print_table_miss_config(struct ds *string, enum 
ofputil_table_miss miss) > +static const char * > +ofputil_table_miss_to_string(enum ofputil_table_miss miss) > { > switch (miss) { > - case OFPUTIL_TABLE_MISS_CONTROLLER: > - ds_put_cstr(string, "controller\n"); > - break; > - case OFPUTIL_TABLE_MISS_CONTINUE: > - ds_put_cstr(string, "continue\n"); > - break; > - case OFPUTIL_TABLE_MISS_DROP: > - ds_put_cstr(string, "drop\n"); > - break; > - case OFPUTIL_TABLE_MISS_DEFAULT: > - default: > - ds_put_format(string, "Unknown (%d)\n", miss); > - break; > + case OFPUTIL_TABLE_MISS_DEFAULT: return "default"; > + case OFPUTIL_TABLE_MISS_CONTROLLER: return "controller"; > + case OFPUTIL_TABLE_MISS_CONTINUE: return "continue"; > + case OFPUTIL_TABLE_MISS_DROP: return "drop"; > + default: return "***error***"; > + } > +} > + > +static const char * > +ofputil_table_eviction_to_string(enum ofputil_table_eviction eviction) > +{ > + switch (eviction) { > + case OFPUTIL_TABLE_EVICTION_DEFAULT: return "default"; > + case OFPUTIL_TABLE_EVICTION_ON: return "on"; > + case OFPUTIL_TABLE_EVICTION_OFF: return "off"; > + default: return "***error***"; > + } > + > +} > + > +static const char * > +ofputil_eviction_flag_to_string(uint32_t bit) > +{ > + enum ofp14_table_mod_prop_eviction_flag eviction_flag = bit; > + > + switch (eviction_flag) { > + case OFPTMPEF14_OTHER: return "OTHER"; > + case OFPTMPEF14_IMPORTANCE: return "IMPORTANCE"; > + case OFPTMPEF14_LIFETIME: return "LIFETIME"; > + } > + > + return NULL; > +} > + > +/* Appends to 'string' a description of the bitmap of OFPTMPEF14_* values in > + * 'eviction_flags'. 
*/ > +static void > +ofputil_put_eviction_flags(struct ds *string, uint32_t eviction_flags) > +{ > + if (eviction_flags != UINT32_MAX) { > + ofp_print_bit_names(string, eviction_flags, > + ofputil_eviction_flag_to_string, '|'); > + } else { > + ds_put_cstr(string, "(default)"); > } > } > > @@ -980,9 +1011,17 @@ ofp_print_table_mod(struct ds *string, const struct > ofp_header *oh) > ds_put_format(string, " table_id=%"PRIu8, pm.table_id); > } > > - if (pm.miss_config != OFPUTIL_TABLE_MISS_DEFAULT) { > - ds_put_cstr(string, ", flow_miss_config="); > - ofp_print_table_miss_config(string, pm.miss_config); > + if (pm.miss != OFPUTIL_TABLE_MISS_DEFAULT) { > + ds_put_format(string, ", flow_miss_config=%s", > + ofputil_table_miss_to_string(pm.miss)); > + } > + if (pm.eviction != OFPUTIL_TABLE_EVICTION_DEFAULT) { > + ds_put_format(string, ", eviction=%s", > + ofputil_table_eviction_to_string(pm.eviction)); > + } > + if (pm.eviction_flags != UINT32_MAX) { > + ds_put_cstr(string, "eviction_flags="); > + ofputil_put_eviction_flags(string, pm.eviction_flags); > } > } > > @@ -2500,8 +2539,19 @@ ofp_print_table_features(struct ds *s, > } > > if (features->miss_config != OFPUTIL_TABLE_MISS_DEFAULT) { > - ds_put_cstr(s, " config="); > - ofp_print_table_miss_config(s, features->miss_config); > + ds_put_format(s, " config=%s\n", > + ofputil_table_miss_to_string(features->miss_config)); > + } > + > + if (features->supports_eviction >= 0) { > + ds_put_format(s, " eviction: %ssupported\n", > + features->supports_eviction ? "" : "not "); > + > + } > + if (features->supports_vacancy_events >= 0) { > + ds_put_format(s, " vacancy events: %ssupported\n", > + features->supports_vacancy_events ? 
"" : "not "); > + > } > > if (features->max_entries) { > diff --git a/lib/ofp-util.c b/lib/ofp-util.c > index 38dcb89..2e9ae47 100644 > --- a/lib/ofp-util.c > +++ b/lib/ofp-util.c > @@ -53,8 +53,11 @@ VLOG_DEFINE_THIS_MODULE(ofp_util); > * in the peer and so there's not much point in showing a lot of them. */ > static struct vlog_rate_limit bad_ofmsg_rl = VLOG_RATE_LIMIT_INIT(1, 5); > > -static enum ofputil_table_miss ofputil_table_miss_from_config( > - ovs_be32 config_, enum ofp_version); > +static enum ofputil_table_eviction ofputil_decode_table_eviction( > + ovs_be32 config, enum ofp_version); > +static ovs_be32 ofputil_encode_table_config(enum ofputil_table_miss, > + enum ofputil_table_eviction, > + enum ofp_version); > > struct ofp_prop_header { > ovs_be16 type; > @@ -4644,7 +4647,15 @@ ofputil_decode_table_features(struct ofpbuf *msg, > ovs_strlcpy(tf->name, otf->name, OFP_MAX_TABLE_NAME_LEN); > tf->metadata_match = otf->metadata_match; > tf->metadata_write = otf->metadata_write; > - tf->miss_config = ofputil_table_miss_from_config(otf->config, > oh->version); > + tf->miss_config = OFPUTIL_TABLE_MISS_DEFAULT; > + if (oh->version >= OFP14_VERSION) { > + uint32_t caps = ntohl(otf->capabilities); > + tf->supports_eviction = (caps & OFPTC14_EVICTION) != 0; > + tf->supports_vacancy_events = (caps & OFPTC14_VACANCY_EVENTS) != 0; > + } else { > + tf->supports_eviction = -1; > + tf->supports_vacancy_events = -1; > + } > tf->max_entries = ntohl(otf->max_entries); > > while (properties.size > 0) { > @@ -4852,7 +4863,14 @@ ofputil_append_table_features_reply(const struct > ofputil_table_features *tf, > ovs_strlcpy(otf->name, tf->name, sizeof otf->name); > otf->metadata_match = tf->metadata_match; > otf->metadata_write = tf->metadata_write; > - otf->config = ofputil_table_miss_to_config(tf->miss_config, version); > + if (version >= OFP14_VERSION) { > + if (tf->supports_eviction) { > + otf->capabilities |= htonl(OFPTC14_EVICTION); > + } > + if 
(tf->supports_vacancy_events) { > + otf->capabilities |= htonl(OFPTC14_VACANCY_EVENTS); > + } > + } > otf->max_entries = htonl(tf->max_entries); > > put_table_instruction_features(reply, &tf->nonmiss, 0, version); > @@ -4868,17 +4886,97 @@ ofputil_append_table_features_reply(const struct > ofputil_table_features *tf, > ofpmp_postappend(replies, start_ofs); > } > > -/* ofputil_table_mod */ > +static enum ofperr > +parse_table_mod_eviction_property(struct ofpbuf *property, > + struct ofputil_table_mod *tm) > +{ > + struct ofp14_table_mod_prop_eviction *ote = property->data; > + > + if (property->size != sizeof *ote) { > + return OFPERR_OFPBPC_BAD_LEN; > + } > + > + tm->eviction_flags = ntohl(ote->flags); > + return 0; > +} > + > +/* Given 'config', taken from an OpenFlow 'version' message that specifies > + * table configuration (a table mod, table stats, or table features message), > + * returns the table eviction configuration that it specifies. > + * > + * Only OpenFlow 1.4 and later specify table eviction configuration this way, > + * so for other 'version' values this function always returns > + * OFPUTIL_TABLE_EVICTION_DEFAULT. */ > +static enum ofputil_table_eviction > +ofputil_decode_table_eviction(ovs_be32 config, enum ofp_version version) > +{ > + return (version < OFP14_VERSION ? OFPUTIL_TABLE_EVICTION_DEFAULT > + : config & htonl(OFPTC14_EVICTION) ? OFPUTIL_TABLE_EVICTION_ON > + : OFPUTIL_TABLE_EVICTION_OFF); > +} > + > +/* Returns a bitmap of OFPTC* values suitable for 'config' fields in various > + * OpenFlow messages of the given 'version', based on the provided 'miss' and > + * 'eviction' values. */ > +static ovs_be32 > +ofputil_encode_table_config(enum ofputil_table_miss miss, > + enum ofputil_table_eviction eviction, > + enum ofp_version version) > +{ > + /* See the section "OFPTC_* Table Configuration" in DESIGN.md for more > + * information on the crazy evolution of this field. 
*/ > + switch (version) { > + case OFP10_VERSION: > + /* OpenFlow 1.0 didn't have such a field, any value ought to do. */ > + return htonl(0); > + > + case OFP11_VERSION: > + case OFP12_VERSION: > + /* OpenFlow 1.1 and 1.2 define only OFPTC11_TABLE_MISS_*. */ > + switch (miss) { > + case OFPUTIL_TABLE_MISS_DEFAULT: > + /* Really this shouldn't be used for encoding (the caller should > + * provide a specific value) but I can't imagine that defaulting > to > + * the fall-through case here will hurt. */ > + case OFPUTIL_TABLE_MISS_CONTROLLER: > + default: > + return htonl(OFPTC11_TABLE_MISS_CONTROLLER); > + case OFPUTIL_TABLE_MISS_CONTINUE: > + return htonl(OFPTC11_TABLE_MISS_CONTINUE); > + case OFPUTIL_TABLE_MISS_DROP: > + return htonl(OFPTC11_TABLE_MISS_DROP); > + } > + OVS_NOT_REACHED(); > + > + case OFP13_VERSION: > + /* OpenFlow 1.3 removed OFPTC11_TABLE_MISS_* and didn't define any > new > + * flags, so this is correct. */ > + return htonl(0); > + > + case OFP14_VERSION: > + case OFP15_VERSION: > + /* OpenFlow 1.4 introduced OFPTC14_EVICTION and > OFPTC14_VACANCY_EVENTS > + * and we don't support the latter yet. */ > + return htonl(eviction == OFPUTIL_TABLE_EVICTION_ON > + ? OFPTC14_EVICTION : 0); > + } > + > + OVS_NOT_REACHED(); > +} > > /* Given 'config', taken from an OpenFlow 'version' message that specifies > * table configuration (a table mod, table stats, or table features message), > - * returns the table miss configuration that it specifies. */ > + * returns the table miss configuration that it specifies. > + * > + * Only OpenFlow 1.1 and 1.2 specify table miss configurations this way, so > for > + * other 'version' values this function always returns > + * OFPUTIL_TABLE_MISS_DEFAULT. 
*/ > static enum ofputil_table_miss > -ofputil_table_miss_from_config(ovs_be32 config_, enum ofp_version version) > +ofputil_decode_table_miss(ovs_be32 config_, enum ofp_version version) > { > uint32_t config = ntohl(config_); > > - if (version < OFP13_VERSION) { > + if (version == OFP11_VERSION || version == OFP12_VERSION) { > switch (config & OFPTC11_TABLE_MISS_MASK) { > case OFPTC11_TABLE_MISS_CONTROLLER: > return OFPUTIL_TABLE_MISS_CONTROLLER; > @@ -4898,32 +4996,6 @@ ofputil_table_miss_from_config(ovs_be32 config_, enum > ofp_version version) > } > } > > -/* Given a table miss configuration, returns the corresponding OpenFlow table > - * configuration for use in an OpenFlow message of the given 'version'. */ > -ovs_be32 > -ofputil_table_miss_to_config(enum ofputil_table_miss miss, > - enum ofp_version version) > -{ > - if (version < OFP13_VERSION) { > - switch (miss) { > - case OFPUTIL_TABLE_MISS_CONTROLLER: > - case OFPUTIL_TABLE_MISS_DEFAULT: > - return htonl(OFPTC11_TABLE_MISS_CONTROLLER); > - > - case OFPUTIL_TABLE_MISS_CONTINUE: > - return htonl(OFPTC11_TABLE_MISS_CONTINUE); > - > - case OFPUTIL_TABLE_MISS_DROP: > - return htonl(OFPTC11_TABLE_MISS_DROP); > - > - default: > - OVS_NOT_REACHED(); > - } > - } else { > - return htonl(0); > - } > -} > - > /* Decodes the OpenFlow "table mod" message in '*oh' into an abstract form in > * '*pm'. Returns 0 if successful, otherwise an OFPERR_* value. 
*/ > enum ofperr > @@ -4933,6 +5005,10 @@ ofputil_decode_table_mod(const struct ofp_header *oh, > enum ofpraw raw; > struct ofpbuf b; > > + memset(pm, 0, sizeof *pm); > + pm->miss = OFPUTIL_TABLE_MISS_DEFAULT; > + pm->eviction = OFPUTIL_TABLE_EVICTION_DEFAULT; > + pm->eviction_flags = UINT32_MAX; > ofpbuf_use_const(&b, oh, ntohs(oh->length)); > raw = ofpraw_pull_assert(&b); > > @@ -4940,16 +5016,37 @@ ofputil_decode_table_mod(const struct ofp_header *oh, > const struct ofp11_table_mod *otm = b.data; > > pm->table_id = otm->table_id; > - pm->miss_config = ofputil_table_miss_from_config(otm->config, > - oh->version); > + pm->miss = ofputil_decode_table_miss(otm->config, oh->version); > } else if (raw == OFPRAW_OFPT14_TABLE_MOD) { > const struct ofp14_table_mod *otm = ofpbuf_pull(&b, sizeof *otm); > > pm->table_id = otm->table_id; > - pm->miss_config = ofputil_table_miss_from_config(otm->config, > - oh->version); > - /* We do not understand any properties yet, so we do not bother > - * parsing them. */ > + pm->miss = ofputil_decode_table_miss(otm->config, oh->version); > + pm->eviction = ofputil_decode_table_eviction(otm->config, > oh->version); > + while (b.size > 0) { > + struct ofpbuf property; > + enum ofperr error; > + uint16_t type; > + > + error = ofputil_pull_property(&b, &property, &type); > + if (error) { > + return error; > + } > + > + switch (type) { > + case OFPTMPT14_EVICTION: > + error = parse_table_mod_eviction_property(&property, pm); > + break; > + > + default: > + error = OFPERR_OFPBRC_BAD_TYPE; > + break; > + } > + > + if (error) { > + return error; > + } > + } > } else { > return OFPERR_OFPBRC_BAD_TYPE; > } > @@ -4957,11 +5054,11 @@ ofputil_decode_table_mod(const struct ofp_header *oh, > return 0; > } > > -/* Converts the abstract form of a "table mod" message in '*pm' into an > OpenFlow > - * message suitable for 'protocol', and returns that encoded form in a buffer > - * owned by the caller. 
*/ > +/* Converts the abstract form of a "table mod" message in '*tm' into an > + * OpenFlow message suitable for 'protocol', and returns that encoded form > in a > + * buffer owned by the caller. */ > struct ofpbuf * > -ofputil_encode_table_mod(const struct ofputil_table_mod *pm, > +ofputil_encode_table_mod(const struct ofputil_table_mod *tm, > enum ofputil_protocol protocol) > { > enum ofp_version ofp_version = ofputil_protocol_to_ofp_version(protocol); > @@ -4980,20 +5077,28 @@ ofputil_encode_table_mod(const struct > ofputil_table_mod *pm, > > b = ofpraw_alloc(OFPRAW_OFPT11_TABLE_MOD, ofp_version, 0); > otm = ofpbuf_put_zeros(b, sizeof *otm); > - otm->table_id = pm->table_id; > - otm->config = ofputil_table_miss_to_config(pm->miss_config, > - ofp_version); > + otm->table_id = tm->table_id; > + otm->config = ofputil_encode_table_config(tm->miss, tm->eviction, > + ofp_version); > break; > } > case OFP14_VERSION: > case OFP15_VERSION: { > struct ofp14_table_mod *otm; > + struct ofp14_table_mod_prop_eviction *ote; > > b = ofpraw_alloc(OFPRAW_OFPT14_TABLE_MOD, ofp_version, 0); > otm = ofpbuf_put_zeros(b, sizeof *otm); > - otm->table_id = pm->table_id; > - otm->config = ofputil_table_miss_to_config(pm->miss_config, > - ofp_version); > + otm->table_id = tm->table_id; > + otm->config = ofputil_encode_table_config(tm->miss, tm->eviction, > + ofp_version); > + > + if (tm->eviction_flags != UINT32_MAX) { > + ote = ofpbuf_put_zeros(b, sizeof *ote); > + ote->type = htons(OFPTMPT14_EVICTION); > + ote->length = htons(sizeof *ote); > + ote->flags = htonl(tm->eviction_flags); > + } > break; > } > default: > @@ -5339,8 +5444,9 @@ ofputil_put_ofp12_table_stats(const struct > ofputil_table_stats *stats, > out->metadata_write = features->metadata_write; > out->instructions = ovsinst_bitmap_to_openflow( > features->nonmiss.instructions, OFP12_VERSION); > - out->config = ofputil_table_miss_to_config(features->miss_config, > - OFP12_VERSION); > + out->config = 
ofputil_encode_table_config(features->miss_config, > + OFPUTIL_TABLE_EVICTION_DEFAULT, > + OFP12_VERSION); > out->max_entries = htonl(features->max_entries); > out->active_count = htonl(stats->active_count); > out->lookup_count = htonll(stats->lookup_count); > @@ -5446,8 +5552,8 @@ ofputil_decode_ofp11_table_stats(struct ofpbuf *msg, > features->nonmiss.apply.ofpacts = ofpact_bitmap_from_openflow( > ots->write_actions, OFP11_VERSION); > features->miss = features->nonmiss; > - features->miss_config = ofputil_table_miss_from_config(ots->config, > - OFP11_VERSION); > + features->miss_config = ofputil_decode_table_miss(ots->config, > + OFP11_VERSION); > features->match = mf_bitmap_from_of11(ots->match); > features->wildcard = mf_bitmap_from_of11(ots->wildcards); > bitmap_or(features->match.bm, features->wildcard.bm, MFF_N_IDS); > @@ -5476,8 +5582,8 @@ ofputil_decode_ofp12_table_stats(struct ofpbuf *msg, > ovs_strlcpy(features->name, ots->name, sizeof features->name); > features->metadata_match = ots->metadata_match; > features->metadata_write = ots->metadata_write; > - features->miss_config = ofputil_table_miss_from_config(ots->config, > - OFP12_VERSION); > + features->miss_config = ofputil_decode_table_miss(ots->config, > + OFP12_VERSION); > features->max_entries = ntohl(ots->max_entries); > > features->nonmiss.instructions = ovsinst_bitmap_from_openflow( > @@ -5545,6 +5651,8 @@ ofputil_decode_table_stats_reply(struct ofpbuf *msg, > > memset(stats, 0, sizeof *stats); > memset(features, 0, sizeof *features); > + features->supports_eviction = -1; > + features->supports_vacancy_events = -1; > > switch ((enum ofp_version) oh->version) { > case OFP10_VERSION: > diff --git a/lib/ofp-util.h b/lib/ofp-util.h > index f90ac0c..fbc8abf 100644 > --- a/lib/ofp-util.h > +++ b/lib/ofp-util.h > @@ -609,13 +609,33 @@ enum ofputil_table_miss { > OFPUTIL_TABLE_MISS_DROP, /* Drop the packet. 
*/ > }; > > -ovs_be32 ofputil_table_miss_to_config(enum ofputil_table_miss, > - enum ofp_version); > +/* Abstract version of OFPTC14_EVICTION. > + * > + * OpenFlow 1.0 through 1.3 don't know anything about eviction, so decoding a > + * message for one of these protocols always yields > + * OFPUTIL_TABLE_EVICTION_DEFAULT. */ > +enum ofputil_table_eviction { > + OFPUTIL_TABLE_EVICTION_DEFAULT, /* No value. */ > + OFPUTIL_TABLE_EVICTION_ON, /* Enable eviction. */ > + OFPUTIL_TABLE_EVICTION_OFF /* Disable eviction. */ > +}; > > /* Abstract ofp_table_mod. */ > struct ofputil_table_mod { > uint8_t table_id; /* ID of the table, 0xff indicates all tables. > */ > - enum ofputil_table_miss miss_config; > + > + /* OpenFlow 1.1 and 1.2 only. For other versions, ignored on encoding, > + * decoded to OFPUTIL_TABLE_MISS_DEFAULT. */ > + enum ofputil_table_miss miss; > + > + /* OpenFlow 1.4+ only. For other versions, ignored on encoding, decoded > to > + * OFPUTIL_TABLE_EVICTION_DEFAULT. */ > + enum ofputil_table_eviction eviction; > + > + /* OpenFlow 1.4+ only and optional even there; UINT32_MAX indicates > + * absence. For other versions, ignored on encoding, decoded to > + * UINT32_MAX.*/ > + uint32_t eviction_flags; /* OFPTMPEF14_*. */ > }; > > enum ofperr ofputil_decode_table_mod(const struct ofp_header *, > @@ -623,16 +643,38 @@ enum ofperr ofputil_decode_table_mod(const struct > ofp_header *, > struct ofpbuf *ofputil_encode_table_mod(const struct ofputil_table_mod *, > enum ofputil_protocol); > > -/* Abstract ofp_table_features. */ > +/* Abstract ofp_table_features. > + * > + * This is used for all versions of OpenFlow, even though ofp_table_features > + * was only introduced in OpenFlow 1.3, because earlier versions of OpenFlow > + * include support for a subset of ofp_table_features through OFPST_TABLE > (aka > + * OFPMP_TABLE). */ > struct ofputil_table_features { > uint8_t table_id; /* Identifier of table. Lower numbered tables > are consulted first. 
*/ > char name[OFP_MAX_TABLE_NAME_LEN]; > ovs_be64 metadata_match; /* Bits of metadata table can match. */ > ovs_be64 metadata_write; /* Bits of metadata table can write. */ > - enum ofputil_table_miss miss_config; > uint32_t max_entries; /* Max number of entries supported. */ > > + /* Flags. > + * > + * 'miss_config' is relevant for OpenFlow 1.1 and 1.2 only, because those > + * versions include OFPTC_MISS_* flags in OFPST_TABLE. For other > versions, > + * it is decoded to OFPUTIL_TABLE_MISS_DEFAULT and ignored for encoding. > + * > + * 'supports_eviction' and 'supports_vacancy_events' are relevant only > for > + * OpenFlow 1.4 and later only. For OF1.4, they are boolean: 1 if > + * supported, otherwise 0. For other versions, they are decoded as -1 > and > + * ignored for encoding. > + * > + * See the section "OFPTC_* Table Configuration" in DESIGN.md for more > + * details of how OpenFlow has changed in this area. > + */ > + enum ofputil_table_miss miss_config; /* OF1.1 and 1.2 only. */ > + int supports_eviction; /* OF1.4+ only. */ > + int supports_vacancy_events; /* OF1.4+ only. */ > + > /* Table features related to instructions. There are two instances: > * > * - 'miss' reports features available in the table miss flow. > diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h > index f7e8ead..95c1c64 100644 > --- a/ofproto/ofproto-provider.h > +++ b/ofproto/ofproto-provider.h > @@ -246,9 +246,16 @@ struct oftable { > struct hmap eviction_groups_by_id; > struct heap eviction_groups_by_size; > > - /* Table configuration. */ > + /* Flow table miss handling configuration. */ > ATOMIC(enum ofputil_table_miss) miss_config; > > + /* Eviction is enabled if either the client (vswitchd) enables it or an > + * OpenFlow controller enables it; thus, a nonzero value indicates that > + * eviction is enabled. */ > +#define EVICTION_CLIENT (1 << 0) /* Set to 1 if client enables eviction. 
*/ > +#define EVICTION_OPENFLOW (1 << 1) /* Set to 1 if OpenFlow enables eviction. > */ > + unsigned int eviction; > + > atomic_ulong n_matched; > atomic_ulong n_missed; > }; > diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c > index 4aa844e..970aca3 100644 > --- a/ofproto/ofproto.c > +++ b/ofproto/ofproto.c > @@ -84,13 +84,18 @@ static void oftable_set_name(struct oftable *, const char > *name); > > static enum ofperr evict_rules_from_table(struct oftable *) > OVS_REQUIRES(ofproto_mutex); > -static void oftable_disable_eviction(struct oftable *) > - OVS_REQUIRES(ofproto_mutex); > -static void oftable_enable_eviction(struct oftable *, > - const struct mf_subfield *fields, > - size_t n_fields) > +static void oftable_configure_eviction(struct oftable *, > + unsigned int eviction, > + const struct mf_subfield *fields, > + size_t n_fields) > OVS_REQUIRES(ofproto_mutex); > > +/* This is the only combination of OpenFlow eviction flags that OVS > supports: a > + * combination of OF1.4+ importance, the remaining lifetime of the flow, and > + * fairness based on user-specified fields. */ > +#define OFPROTO_EVICTION_FLAGS \ > + (OFPTMPEF14_OTHER | OFPTMPEF14_IMPORTANCE | OFPTMPEF14_LIFETIME) > + > /* A set of rules within a single OpenFlow table (oftable) that have the same > * values for the oftable's eviction_fields. A rule to be evicted, when one > is > * needed, is taken from the eviction group that contains the greatest number > @@ -1429,11 +1434,10 @@ ofproto_configure_table(struct ofproto *ofproto, int > table_id, > } > > ovs_mutex_lock(&ofproto_mutex); > - if (s->groups) { > - oftable_enable_eviction(table, s->groups, s->n_groups); > - } else { > - oftable_disable_eviction(table); > - } > + unsigned int new_eviction = (s->enable_eviction > + ? 
table->eviction | EVICTION_CLIENT > + : table->eviction & ~EVICTION_CLIENT); > + oftable_configure_eviction(table, new_eviction, s->groups, s->n_groups); > table->max_flows = s->max_flows; > evict_rules_from_table(table); > ovs_mutex_unlock(&ofproto_mutex); > @@ -1695,7 +1699,7 @@ ofproto_run(struct ofproto *p) > struct eviction_group *evg; > struct rule *rule; > > - if (!table->eviction_fields) { > + if (!table->eviction) { > continue; > } > > @@ -6560,10 +6564,38 @@ handle_group_mod(struct ofconn *ofconn, const struct > ofp_header *oh) > enum ofputil_table_miss > ofproto_table_get_miss_config(const struct ofproto *ofproto, uint8_t table_id) > { > - enum ofputil_table_miss value; > + enum ofputil_table_miss miss; > + > + atomic_read_relaxed(&ofproto->tables[table_id].miss_config, &miss); > + return miss; > +} > + > +static void > +table_mod__(struct oftable *oftable, > + enum ofputil_table_miss miss, enum ofputil_table_eviction > eviction) > +{ > + if (miss == OFPUTIL_TABLE_MISS_DEFAULT) { > + /* This is how an OFPT_TABLE_MOD decodes if it doesn't specify any > + * table-miss configuration (because the protocol used doesn't have > + * such a concept), so there's nothing to do. 
*/ > + } else { > + atomic_store_relaxed(&oftable->miss_config, miss); > + } > + > + unsigned int new_eviction = oftable->eviction; > + if (eviction == OFPUTIL_TABLE_EVICTION_ON) { > + new_eviction |= EVICTION_OPENFLOW; > + } else if (eviction == OFPUTIL_TABLE_EVICTION_OFF) { > + new_eviction &= ~EVICTION_OPENFLOW; > + } > > - atomic_read_relaxed(&ofproto->tables[table_id].miss_config, &value); > - return value; > + if (new_eviction != oftable->eviction) { > + ovs_mutex_lock(&ofproto_mutex); > + oftable_configure_eviction(oftable, new_eviction, > + oftable->eviction_fields, > + oftable->n_eviction_fields); > + ovs_mutex_unlock(&ofproto_mutex); > + } > } > > static enum ofperr > @@ -6571,18 +6603,33 @@ table_mod(struct ofproto *ofproto, const struct > ofputil_table_mod *tm) > { > if (!check_table_id(ofproto, tm->table_id)) { > return OFPERR_OFPTMFC_BAD_TABLE; > - } else if (tm->miss_config != OFPUTIL_TABLE_MISS_DEFAULT) { > - if (tm->table_id == OFPTT_ALL) { > - int i; > - for (i = 0; i < ofproto->n_tables; i++) { > - atomic_store_relaxed(&ofproto->tables[i].miss_config, > - tm->miss_config); > + } > + > + /* Don't allow the eviction flags to be changed (except to the only fixed > + * value that OVS supports). OF1.4 says this is normal: "The > + * OFPTMPT_EVICTION property usually cannot be modified using a > + * OFP_TABLE_MOD request, because the eviction mechanism is switch > + * defined". 
*/ > + if (tm->eviction_flags != UINT32_MAX > + && tm->eviction_flags != OFPROTO_EVICTION_FLAGS) { > + return OFPERR_OFPTMFC_BAD_CONFIG; > + } > + > + if (tm->table_id == OFPTT_ALL) { > + struct oftable *oftable; > + OFPROTO_FOR_EACH_TABLE (oftable, ofproto) { > + if (!(oftable->flags & (OFTABLE_HIDDEN | OFTABLE_READONLY))) { > + table_mod__(oftable, tm->miss, tm->eviction); > } > - } else { > - atomic_store_relaxed(&ofproto->tables[tm->table_id].miss_config, > - tm->miss_config); > } > + } else { > + struct oftable *oftable = &ofproto->tables[tm->table_id]; > + if (oftable->flags & OFTABLE_READONLY) { > + return OFPERR_OFPTMFC_EPERM; > + } > + table_mod__(oftable, tm->miss, tm->eviction); > } > + > return 0; > } > > @@ -7231,7 +7278,7 @@ choose_rule_to_evict(struct oftable *table, struct rule > **rulep) > struct eviction_group *evg; > > *rulep = NULL; > - if (!table->eviction_fields) { > + if (!table->eviction) { > return false; > } > > @@ -7452,7 +7499,7 @@ eviction_group_add_rule(struct rule *rule) > * so no additional protection is needed. 
*/ > has_timeout = rule->hard_timeout || rule->idle_timeout; > > - if (table->eviction_fields && has_timeout) { > + if (table->eviction && has_timeout) { > struct eviction_group *evg; > > evg = eviction_group_find(table, eviction_group_hash_rule(rule)); > @@ -7474,6 +7521,8 @@ oftable_init(struct oftable *table) > classifier_init(&table->cls, flow_segment_u64s); > table->max_flows = UINT_MAX; > table->n_flows = 0; > + hmap_init(&table->eviction_groups_by_id); > + heap_init(&table->eviction_groups_by_size); > atomic_init(&table->miss_config, OFPUTIL_TABLE_MISS_DEFAULT); > > classifier_set_prefix_fields(&table->cls, default_prefix_fields, > @@ -7490,9 +7539,13 @@ static void > oftable_destroy(struct oftable *table) > { > ovs_assert(classifier_is_empty(&table->cls)); > + > ovs_mutex_lock(&ofproto_mutex); > - oftable_disable_eviction(table); > + oftable_configure_eviction(table, 0, NULL, 0); > ovs_mutex_unlock(&ofproto_mutex); > + > + hmap_destroy(&table->eviction_groups_by_id); > + heap_destroy(&table->eviction_groups_by_size); > classifier_destroy(&table->cls); > free(table->name); > } > @@ -7520,60 +7573,56 @@ oftable_set_name(struct oftable *table, const char > *name) > /* oftables support a choice of two policies when adding a rule would cause > the > * number of flows in the table to exceed the configured maximum number: > either > * they can refuse to add the new flow or they can evict some existing flow. > - * This function configures the former policy on 'table'. 
*/ > -static void > -oftable_disable_eviction(struct oftable *table) > - OVS_REQUIRES(ofproto_mutex) > -{ > - if (table->eviction_fields) { > - struct eviction_group *evg, *next; > - > - HMAP_FOR_EACH_SAFE (evg, next, id_node, > - &table->eviction_groups_by_id) { > - eviction_group_destroy(table, evg); > - } > - hmap_destroy(&table->eviction_groups_by_id); > - heap_destroy(&table->eviction_groups_by_size); > - > - free(table->eviction_fields); > - table->eviction_fields = NULL; > - table->n_eviction_fields = 0; > - } > -} > - > -/* oftables support a choice of two policies when adding a rule would cause > the > - * number of flows in the table to exceed the configured maximum number: > either > - * they can refuse to add the new flow or they can evict some existing flow. > * This function configures the latter policy on 'table', with fairness based > * on the values of the 'n_fields' fields specified in 'fields'. (Specifying > * 'n_fields' as 0 disables fairness.) */ > static void > -oftable_enable_eviction(struct oftable *table, > - const struct mf_subfield *fields, size_t n_fields) > +oftable_configure_eviction(struct oftable *table, unsigned int eviction, > + const struct mf_subfield *fields, size_t n_fields) > OVS_REQUIRES(ofproto_mutex) > { > struct rule *rule; > > - if (table->eviction_fields > + if ((table->eviction != 0) == (eviction != 0) > && n_fields == table->n_eviction_fields > && (!n_fields > || !memcmp(fields, table->eviction_fields, > n_fields * sizeof *fields))) { > - /* No change. */ > + /* The set of eviction fields did not change. If 'eviction' changed, > + * it remains nonzero, so that we can just update table->eviction > + * without fussing with the eviction groups. 
*/ > + table->eviction = eviction; > return; > } > > - oftable_disable_eviction(table); > - > - table->n_eviction_fields = n_fields; > - table->eviction_fields = xmemdup(fields, n_fields * sizeof *fields); > - > - table->eviction_group_id_basis = random_uint32(); > + /* Destroy existing eviction groups, then destroy and recreate data > + * structures to recover memory. */ > + struct eviction_group *evg, *next; > + HMAP_FOR_EACH_SAFE (evg, next, id_node, &table->eviction_groups_by_id) { > + eviction_group_destroy(table, evg); > + } > + hmap_destroy(&table->eviction_groups_by_id); > hmap_init(&table->eviction_groups_by_id); > + heap_destroy(&table->eviction_groups_by_size); > heap_init(&table->eviction_groups_by_size); > > - CLS_FOR_EACH (rule, cr, &table->cls) { > - eviction_group_add_rule(rule); > + /* Replace eviction groups by the new ones, if there is a change. Free > the > + * old fields only after allocating the new ones, because 'fields == > + * table->eviction_fields' is possible. */ > + struct mf_subfield *old_fields = table->eviction_fields; > + table->n_eviction_fields = n_fields; > + table->eviction_fields = (fields > + ? xmemdup(fields, n_fields * sizeof *fields) > + : NULL); > + free(old_fields); > + > + /* Add the new eviction groups, if enabled. */ > + table->eviction = eviction; > + if (table->eviction) { > + table->eviction_group_id_basis = random_uint32(); > + CLS_FOR_EACH (rule, cr, &table->cls) { > + eviction_group_add_rule(rule); > + } > } > } > > diff --git a/ofproto/ofproto.h b/ofproto/ofproto.h > index 7dc1874..7504027 100644 > --- a/ofproto/ofproto.h > +++ b/ofproto/ofproto.h > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc. > + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc. > * > * Licensed under the Apache License, Version 2.0 (the "License"); > * you may not use this file except in compliance with the License. 
> @@ -459,14 +459,18 @@ struct ofproto_table_settings { > char *name; /* Name exported via OpenFlow or NULL. */ > unsigned int max_flows; /* Maximum number of flows or UINT_MAX. */ > > - /* These members determine the handling of an attempt to add a flow that > - * would cause the table to have more than 'max_flows' flows. > + /* These members, together with OpenFlow OFPT_TABLE_MOD, determine the > + * handling of an attempt to add a flow that would cause the table to > have > + * more than 'max_flows' flows: > * > - * If 'groups' is NULL, overflows will be rejected with an error. > + * - If 'enable_eviction' is false and OFPT_TABLE_MOD does not enable > + * eviction, overflows will be rejected with an error. > * > - * If 'groups' is nonnull, an overflow will cause a flow to be removed. > - * The flow to be removed is chosen to give fairness among groups > - * distinguished by different values for the subfields within 'groups'. > */ > + * - If 'enable_eviction' is true or OFPT_TABLE_MOD enables eviction, > an > + * overflow will cause a flow to be removed. The flow to be removed > + * is chosen to give fairness among groups distinguished by > different > + * values for the 'n_groups' subfields within 'groups'. */ > + bool enable_eviction; > struct mf_subfield *groups; > size_t n_groups; > > diff --git a/tests/ofp-print.at b/tests/ofp-print.at > index 39a5bbb..e08a201 100644 > --- a/tests/ofp-print.at > +++ b/tests/ofp-print.at > @@ -1132,7 +1132,7 @@ AT_KEYWORDS([ofp-print]) > AT_CHECK([ovs-ofctl ofp-print "\ > 05 11 00 10 00 00 00 02 02 00 00 00 00 00 00 00 \ > " 3], [0], [dnl > -OFPT_TABLE_MOD (OF1.4) (xid=0x2): table_id=2 > +OFPT_TABLE_MOD (OF1.4) (xid=0x2): table_id=2, eviction=off > ]) > AT_CLEANUP > > diff --git a/tests/ofproto.at b/tests/ofproto.at > index 3bd625d..54c077d 100644 > --- a/tests/ofproto.at > +++ b/tests/ofproto.at > @@ -1869,11 +1869,13 @@ OVS_VSWITCHD_START > # Configure a maximum of 4 flows. 
> AT_CHECK( > [ovs-vsctl \ > - -- --id=@t0 create Flow_Table name=evict flow-limit=4 > overflow-policy=evict \ > + -- --id=@t0 create Flow_Table name=evict flow-limit=4 \ > -- set bridge br0 flow_tables:0=@t0 \ > | ${PERL} $srcdir/uuidfilt.pl], > [0], [<0> > ]) > +# Use mod-table to turn on eviction just to demonstrate that it works. > +AT_CHECK([ovs-ofctl -O OpenFlow14 mod-table br0 0 evict]) > # Add 4 flows. > for in_port in 4 3 2 1; do > ovs-ofctl -O Openflow14 add-flow br0 importance=$((in_port + > 30)),priority=$((in_port + 5)),hard_timeout=$((in_port + 500)),actions=drop > @@ -1895,7 +1897,7 @@ AT_CHECK([ovs-ofctl -O Openflow14 dump-flows br0 | > ofctl_strip | sort], [0], [dn > OFPST_FLOW reply (OF1.4): > ]) > # Disable the Eviction configuration. > -AT_CHECK([ovs-vsctl set Flow_Table evict overflow-policy=refuse]) > +AT_CHECK([ovs-ofctl -O OpenFlow14 mod-table br0 0 noevict]) > # Adding another flow will cause the system to give error for FULL TABLE. > AT_CHECK([ovs-ofctl -O Openflow14 add-flow br0 > hard_timeout=506,importance=36,priority=11,actions=drop],[1], [], [stderr]) > AT_CHECK([head -n 1 stderr | ofctl_strip], [0], > @@ -1923,23 +1925,6 @@ AT_CHECK([ovs-ofctl mod-flows br0 > in_port=5,actions=drop], [1], [], [stderr]) > AT_CHECK([head -n 1 stderr | ofctl_strip], [0], > [OFPT_ERROR: OFPFMFC_TABLE_FULL > ]) > -# Now set the eviction on timeout basis. 
> -AT_CHECK( > - [ovs-vsctl \ > - -- --id=@t0 create Flow_Table flow-limit=4 overflow-policy=evict \ > - -- set bridge br0 flow_tables:0=@t0 \ > - | ${PERL} $srcdir/uuidfilt.pl], > - [0], [<0> > -]) > -#Now add a new flow > -AT_CHECK([ovs-ofctl -O Openflow14 add-flow br0 > importance=37,hard_timeout=507,priority=11,in_port=6,actions=drop]) > -AT_CHECK([ovs-ofctl -O Openflow14 dump-flows br0 | ofctl_strip | sort], [0], > [dnl > - hard_timeout=503, importance=33, priority=8 actions=drop > - hard_timeout=504, importance=34, priority=9 actions=drop > - hard_timeout=505, importance=35, priority=10,in_port=2 actions=NORMAL > - hard_timeout=507, importance=37, priority=11,in_port=6 actions=drop > -OFPST_FLOW reply (OF1.4): > -]) > OVS_VSWITCHD_STOP > AT_CLEANUP > > diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in > index 395d851..058d205 100644 > --- a/utilities/ovs-ofctl.8.in > +++ b/utilities/ovs-ofctl.8.in > @@ -62,20 +62,13 @@ Prints to the console statistics for each of the flow > tables used by > \fBdump\-table\-features \fIswitch\fR > Prints to the console features for each of the flow tables used by > \fIswitch\fR. > -. > -.IP "\fBmod\-table \fIswitch\fR \fItable_id\fR \fIflow_miss_handling\fR" > -An OpenFlow 1.0 switch looks up each packet that arrives at the switch > -in table 0, then in table 1 if there is no match in table 0, then in > -table 2, and so on until the packet finds a match in some table. > -Finally, if no match was found, the switch sends the packet to the > -controller > -.IP > -OpenFlow 1.1 and later offer more flexibility. This command > -configures the flow table miss handling configuration for table > -\fItable_id\fR in \fIswitch\fR. \fItable_id\fR may be an OpenFlow > -table number between 0 and 254, inclusive, or the keyword \fBALL\fR to > -modify all tables. 
\fIflow_miss_handling\fR may be any one of the > -following: > +.IP "\fBmod\-table \fIswitch\fR \fItable_id\fR \fIsetting\fR" > +This command configures flow table settings for OpenFlow table > +\fItable_id\fR within \fIswitch\fR. The available settings depend on > +the OpenFlow version in use. In OpenFlow 1.1 and 1.2 (which must be > +enabled with the \fB\-O\fR option) only, \fBmod\-table\fR configures > +behavior when no flow is found when a packet is looked up in a flow > +table. The following \fIsetting\fR values are available: > .RS > .IP \fBdrop\fR > Drop the packet. > @@ -87,6 +80,21 @@ tables other than the last one.) > Send to controller. (This is how an OpenFlow 1.0 switch always > handles packets that do not match any flow in the last table.) > .RE > +.IP > +In OpenFlow 1.4 and later (which must be enabled with the \fB\-O\fR > +option) only, \fBmod\-table\fR configures the behavior when a > +controller attempts to add a flow to a flow table that is full. The > +following \fIsetting\fR values are available: > +.RS > +.IP \fBevict\fR > +Delete some existing flow from the flow table, according to the > +algorithm described for the \fBFlow_Table\fR table in > +\fBovs-vswitchd.conf.db\fR(5). > +.IP \fBnoevict\fR > +Refuse to add the new flow. (Eviction might still be enabled through > +the \fBoverflow_policy\fR column in the \fBFlow_Table\fR table > +documented in \fBovs-vswitchd.conf.db\fR(5).) > +.RE > . 
> .TP > \fBdump\-ports \fIswitch\fR [\fInetdev\fR] > diff --git a/utilities/ovs-ofctl.c b/utilities/ovs-ofctl.c > index 6ef7070..f16cc25 100644 > --- a/utilities/ovs-ofctl.c > +++ b/utilities/ovs-ofctl.c > @@ -340,8 +340,11 @@ usage(void) > " dump-table-features SWITCH print table features\n" > " mod-port SWITCH IFACE ACT modify port behavior\n" > " mod-table SWITCH MOD modify flow table behavior\n" > + " OF1.1/1.2 MOD: controller, continue, drop\n" > + " OF1.4+ MOD: evict, noevict\n" > " get-frags SWITCH print fragment handling behavior\n" > " set-frags SWITCH FRAG_MODE set fragment handling behavior\n" > + " FRAG_MODE: normal, drop, reassemble, nx-match\n" > " dump-ports SWITCH [PORT] print port statistics\n" > " dump-ports-desc SWITCH [PORT] print port descriptions\n" > " dump-flows SWITCH print all flow entries\n" > @@ -1848,35 +1851,28 @@ found: > static void > ofctl_mod_table(struct ovs_cmdl_context *ctx) > { > - enum ofputil_protocol protocol, usable_protocols; > + uint32_t usable_versions; > struct ofputil_table_mod tm; > struct vconn *vconn; > char *error; > - int i; > > - error = parse_ofp_table_mod(&tm, ctx->argv[2], ctx->argv[3], > &usable_protocols); > + error = parse_ofp_table_mod(&tm, ctx->argv[2], ctx->argv[3], > + &usable_versions); > if (error) { > ovs_fatal(0, "%s", error); > } > > - protocol = open_vconn(ctx->argv[1], &vconn); > - if (!(protocol & usable_protocols)) { > - for (i = 0; i < sizeof(enum ofputil_protocol) * CHAR_BIT; i++) { > - enum ofputil_protocol f = 1 << i; > - if (f != protocol > - && f & usable_protocols > - && try_set_protocol(vconn, f, &protocol)) { > - protocol = f; > - break; > - } > - } > - } > - > - if (!(protocol & usable_protocols)) { > - char *usable_s = ofputil_protocols_to_string(usable_protocols); > - ovs_fatal(0, "Switch does not support table mod message(%s)", > usable_s); > + uint32_t allowed_versions = get_allowed_ofp_versions(); > + if (!(allowed_versions & usable_versions)) { > + struct ds versions = 
DS_EMPTY_INITIALIZER; > + ofputil_format_version_bitmap_names(&versions, usable_versions); > + ovs_fatal(0, "table_mod '%s' requires one of the OpenFlow " > + "versions %s but none is enabled (use -O)", > + ctx->argv[3], ds_cstr(&versions)); > } > + mask_allowed_ofp_versions(usable_versions); > > + enum ofputil_protocol protocol = open_vconn(ctx->argv[1], &vconn); > transact_noreply(vconn, ofputil_encode_table_mod(&tm, protocol)); > vconn_close(vconn); > } > diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c > index d48cf7f..dfb581e 100644 > --- a/vswitchd/bridge.c > +++ b/vswitchd/bridge.c > @@ -3603,6 +3603,7 @@ bridge_configure_tables(struct bridge *br) > s.name = NULL; > s.max_flows = UINT_MAX; > s.groups = NULL; > + s.enable_eviction = false; > s.n_groups = 0; > s.n_prefix_fields = 0; > memset(s.prefix_fields, ~0, sizeof(s.prefix_fields)); > @@ -3614,9 +3615,10 @@ bridge_configure_tables(struct bridge *br) > if (cfg->n_flow_limit && *cfg->flow_limit < UINT_MAX) { > s.max_flows = *cfg->flow_limit; > } > - if (cfg->overflow_policy > - && !strcmp(cfg->overflow_policy, "evict")) { > > + s.enable_eviction = (cfg->overflow_policy > + && !strcmp(cfg->overflow_policy, "evict")); > + if (cfg->n_groups) { > s.groups = xmalloc(cfg->n_groups * sizeof *s.groups); > for (k = 0; k < cfg->n_groups; k++) { > const char *string = cfg->groups[k]; > @@ -3636,6 +3638,7 @@ bridge_configure_tables(struct bridge *br) > } > } > } > + > /* Prefix lookup fields. */ > s.n_prefix_fields = 0; > for (k = 0; k < cfg->n_prefixes; k++) { > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml > index 9f108f0..483a9de 100644 > --- a/vswitchd/vswitch.xml > +++ b/vswitchd/vswitch.xml > @@ -3039,46 +3039,27 @@ > dump-tables</code>. The name does not affect switch behavior. > </column> > > - <column name="flow_limit"> > - If set, limits the number of flows that may be added to the table. > Open > - vSwitch may limit the number of flows in a table for other reasons, > - e.g. 
due to hardware limitations or for resource availability or > - performance reasons. > - </column> > - > - <column name="overflow_policy"> > + <group title="Eviction Policy"> > <p> > - Controls the switch's behavior when an OpenFlow flow table > modification > - request would add flows in excess of <ref column="flow_limit"/>. The > - supported values are: > + Open vSwitch supports limiting the number of flows that may be > + installed in a flow table, via the <ref column="flow_limit"/> column. > + When adding a flow would exceed this limit, by default Open vSwitch > + reports an error, but there are two ways to configure Open vSwitch to > + instead delete (``evict'') a flow to make room for the new one: > </p> > > - <dl> > - <dt><code>refuse</code></dt> > - <dd> > - Refuse to add the flow or flows. This is also the default policy > - when <ref column="overflow_policy"/> is unset. > - </dd> > - > - <dt><code>evict</code></dt> > - <dd> > - Delete the flow that will expire soonest. See <ref > column="groups"/> > - for details. > - </dd> > - </dl> > - </column> > + <ul> > + <li> > + Set the <ref column="overflow_policy"/> column to > <code>evict</code>. > + </li> > > - <column name="groups"> > - <p> > - When <ref column="overflow_policy"/> is <code>evict</code>, this > - controls how flows are chosen for eviction when the flow table would > - otherwise exceed <ref column="flow_limit"/> flows. Its value is a > set > - of NXM fields or sub-fields, each of which takes one of the forms > - <code><var>field</var>[]</code> or > - <code><var>field</var>[<var>start</var>..<var>end</var>]</code>, > - e.g. <code>NXM_OF_IN_PORT[]</code>. Please see > - <code>nicira-ext.h</code> for a complete list of NXM field names. > - </p> > + <li> > + Send an OpenFlow 1.4+ ``table mod request'' to enable eviction for > + the flow table (e.g. <code>ovs-ofctl -O OpenFlow14 mod-table br0 0 > + evict</code> to enable eviction on flow table 0 of bridge > + <code>br0</code>). 
> + </li> > + </ul> > > <p> > When a flow must be evicted due to overflow, the flow to evict is > @@ -3118,95 +3099,138 @@ > </ol> > > <p> > - The eviction process only considers flows that have an idle timeout > or > - a hard timeout. That is, eviction never deletes permanent flows. > + The eviction process only considers flows that have an idle timeout > + or a hard timeout. That is, eviction never deletes permanent flows. > (Permanent flows do count against <ref column="flow_limit"/>.) > </p> > > - <p> > - Open vSwitch ignores any invalid or unknown field specifications. > - </p> > + <column name="flow_limit"> > + If set, limits the number of flows that may be added to the table. > + Open vSwitch may limit the number of flows in a table for other > + reasons, e.g. due to hardware limitations or for resource > availability > + or performance reasons. > + </column> > > - <p> > - When <ref column="overflow_policy"/> is not <code>evict</code>, this > - column has no effect. > - </p> > - </column> > + <column name="overflow_policy"> > + <p> > + Controls the switch's behavior when an OpenFlow flow table > + modification request would add flows in excess of <ref > + column="flow_limit"/>. The supported values are: > + </p> > > - <column name="prefixes"> > - <p> > - This string set specifies which fields should be used for > - address prefix tracking. Prefix tracking allows the > - classifier to skip rules with longer than necessary prefixes, > - resulting in better wildcarding for datapath flows. > - </p> > - <p> > - Prefix tracking may be beneficial when a flow table contains > - matches on IP address fields with different prefix lengths. > - For example, when a flow table contains IP address matches on > - both full addresses and proper prefixes, the full address > - matches will typically cause the datapath flow to un-wildcard > - the whole address field (depending on flow entry priorities). 
> - In this case each packet with a different address gets handed > - to the userspace for flow processing and generates its own > - datapath flow. With prefix tracking enabled for the address > - field in question packets with addresses matching shorter > - prefixes would generate datapath flows where the irrelevant > - address bits are wildcarded, allowing the same datapath flow > - to handle all the packets within the prefix in question. In > - this case many userspace upcalls can be avoided and the > - overall performance can be better. > - </p> > - <p> > - This is a performance optimization only, so packets will > - receive the same treatment with or without prefix tracking. > - </p> > - <p> > - The supported fields are: <code>tun_id</code>, > - <code>tun_src</code>, <code>tun_dst</code>, > - <code>nw_src</code>, <code>nw_dst</code> (or aliases > - <code>ip_src</code> and <code>ip_dst</code>), > - <code>ipv6_src</code>, and <code>ipv6_dst</code>. (Using this > - feature for <code>tun_id</code> would only make sense if the > - tunnel IDs have prefix structure similar to IP addresses.) > - </p> > + <dl> > + <dt><code>refuse</code></dt> > + <dd> > + Refuse to add the flow or flows. This is also the default policy > + when <ref column="overflow_policy"/> is unset. > + </dd> > > - <p> > - By default, the <code>prefixes=ip_dst,ip_src</code> are used > - on each flow table. This instructs the flow classifier to > - track the IP destination and source addresses used by the > - rules in this specific flow table. > - </p> > + <dt><code>evict</code></dt> > + <dd> > + Delete a flow chosen according to the algorithm described above. > + </dd> > + </dl> > + </column> > > - <p> > - The keyword <code>none</code> is recognized as an explicit > - override of the default values, causing no prefix fields to be > - tracked. 
> - </p> > + <column name="groups"> > + <p> > + When <ref column="overflow_policy"/> is <code>evict</code>, this > + controls how flows are chosen for eviction when the flow table > would > + otherwise exceed <ref column="flow_limit"/> flows. Its value is a > + set of NXM fields or sub-fields, each of which takes one of the > forms > + <code><var>field</var>[]</code> or > + <code><var>field</var>[<var>start</var>..<var>end</var>]</code>, > + e.g. <code>NXM_OF_IN_PORT[]</code>. Please see > + <code>nicira-ext.h</code> for a complete list of NXM field names. > + </p> > > - <p> > - To set the prefix fields, the flow table record needs to > - exist: > - </p> > + <p> > + Open vSwitch ignores any invalid or unknown field specifications. > + </p> > > - <dl> > - <dt><code>ovs-vsctl set Bridge br0 flow_tables:0=@N1 -- --id=@N1 > create Flow_Table name=table0</code></dt> > - <dd> > - Creates a flow table record for the OpenFlow table number 0. > - </dd> > + <p> > + When eviction is not enabled, via <ref column="overflow_policy"/> > or > + an OpenFlow 1.4+ ``table mod,'' this column has no effect. > + </p> > + </column> > + </group> > > - <dt><code>ovs-vsctl set Flow_Table table0 > prefixes=ip_dst,ip_src</code></dt> > - <dd> > - Enables prefix tracking for IP source and destination > - address fields. > - </dd> > - </dl> > + <group title="Classifier Optimization"> > + <column name="prefixes"> > + <p> > + This string set specifies which fields should be used for > + address prefix tracking. Prefix tracking allows the > + classifier to skip rules with longer than necessary prefixes, > + resulting in better wildcarding for datapath flows. > + </p> > + <p> > + Prefix tracking may be beneficial when a flow table contains > + matches on IP address fields with different prefix lengths. 
> + For example, when a flow table contains IP address matches on > + both full addresses and proper prefixes, the full address > + matches will typically cause the datapath flow to un-wildcard > + the whole address field (depending on flow entry priorities). > + In this case each packet with a different address gets handed > + to the userspace for flow processing and generates its own > + datapath flow. With prefix tracking enabled for the address > + field in question packets with addresses matching shorter > + prefixes would generate datapath flows where the irrelevant > + address bits are wildcarded, allowing the same datapath flow > + to handle all the packets within the prefix in question. In > + this case many userspace upcalls can be avoided and the > + overall performance can be better. > + </p> > + <p> > + This is a performance optimization only, so packets will > + receive the same treatment with or without prefix tracking. > + </p> > + <p> > + The supported fields are: <code>tun_id</code>, > + <code>tun_src</code>, <code>tun_dst</code>, > + <code>nw_src</code>, <code>nw_dst</code> (or aliases > + <code>ip_src</code> and <code>ip_dst</code>), > + <code>ipv6_src</code>, and <code>ipv6_dst</code>. (Using this > + feature for <code>tun_id</code> would only make sense if the > + tunnel IDs have prefix structure similar to IP addresses.) > + </p> > > - <p> > - There is a maximum number of fields that can be enabled for any > - one flow table. Currently this limit is 3. > - </p> > - </column> > + <p> > + By default, the <code>prefixes=ip_dst,ip_src</code> are used > + on each flow table. This instructs the flow classifier to > + track the IP destination and source addresses used by the > + rules in this specific flow table. > + </p> > + > + <p> > + The keyword <code>none</code> is recognized as an explicit > + override of the default values, causing no prefix fields to be > + tracked. 
> + </p> > + > + <p> > + To set the prefix fields, the flow table record needs to > + exist: > + </p> > + > + <dl> > + <dt><code>ovs-vsctl set Bridge br0 flow_tables:0=@N1 -- --id=@N1 > create Flow_Table name=table0</code></dt> > + <dd> > + Creates a flow table record for the OpenFlow table number 0. > + </dd> > + > + <dt><code>ovs-vsctl set Flow_Table table0 > prefixes=ip_dst,ip_src</code></dt> > + <dd> > + Enables prefix tracking for IP source and destination > + address fields. > + </dd> > + </dl> > + > + <p> > + There is a maximum number of fields that can be enabled for any > + one flow table. Currently this limit is 3. > + </p> > + </column> > + </group> > > <group title="Common Columns"> > The overall purpose of these columns is described under <code>Common > -- > 2.1.3 > > _______________________________________________ > dev mailing list > [email protected] > http://openvswitch.org/mailman/listinfo/dev
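For readers following the ofproto changes quoted above, here is a minimal sketch of how the two eviction bits combine. It is not code from the patch: the EVICTION_CLIENT and EVICTION_OPENFLOW values mirror the #defines added to ofproto-provider.h, while eviction_update() and eviction_is_enabled() are hypothetical helper names used only for illustration, and the real code also rebuilds the eviction groups when the enabled state or the fields change.

```c
#include <assert.h>
#include <stdbool.h>

/* Same bit values as the patch's #defines in ofproto-provider.h. */
#define EVICTION_CLIENT   (1u << 0)  /* Requested by the client (vswitchd). */
#define EVICTION_OPENFLOW (1u << 1)  /* Requested by an OF1.4+ table mod. */

/* Hypothetical helper: returns 'eviction' with 'bit' set if 'enable' is
 * true, or cleared otherwise.  The other source's bit is left untouched,
 * so one source cannot switch off eviction requested by the other. */
static unsigned int
eviction_update(unsigned int eviction, unsigned int bit, bool enable)
{
    assert(bit == EVICTION_CLIENT || bit == EVICTION_OPENFLOW);
    return enable ? eviction | bit : eviction & ~bit;
}

/* Hypothetical helper: eviction is in effect if either source asked for
 * it, that is, if the combined value is nonzero. */
static bool
eviction_is_enabled(unsigned int eviction)
{
    return eviction != 0;
}
```

Starting from 0, setting both bits and then clearing only one of them leaves eviction enabled. This is the behavior the ovs-ofctl(8) text describes for "noevict": eviction might still be enabled through the overflow_policy column.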
