Acked-by: Jarno Rajahalme <[email protected]>
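
A note on the version bitmaps in the parse_ofp_table_mod() hunk quoted
below: each "mod-table" setting is usable only with the OpenFlow versions
whose table_mod can carry it.  The sketch below illustrates that mapping in
isolation.  It is a hedged illustration, not code from the patch:
table_mod_setting_versions() and the OF1x_WIRE names are hypothetical, and
the wire version numbers (0x02 for 1.1 through 0x06 for 1.5) are assumed to
match the OFP1x_VERSION values used in the tree.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed OpenFlow wire version numbers (hypothetical names). */
enum {
    OF11_WIRE = 0x02,
    OF12_WIRE = 0x03,
    OF14_WIRE = 0x05,
    OF15_WIRE = 0x06
};

/* Hypothetical helper: returns a bitmap with bit N set if the OpenFlow
 * version with wire number N can express 'setting' in a table_mod, or 0
 * if 'setting' is not recognized. */
static uint32_t
table_mod_setting_versions(const char *setting)
{
    assert(setting != NULL);

    /* Table-miss behavior is configurable via table_mod in 1.1/1.2 only. */
    if (!strcmp(setting, "controller")
        || !strcmp(setting, "continue")
        || !strcmp(setting, "drop")) {
        return (1u << OF11_WIRE) | (1u << OF12_WIRE);
    }

    /* Eviction is configurable via table_mod in 1.4 and later. */
    if (!strcmp(setting, "evict") || !strcmp(setting, "noevict")) {
        return (1u << OF14_WIRE) | (1u << OF15_WIRE);
    }

    return 0;
}
```

A caller would intersect the returned bitmap with the versions allowed on
the connection and report an error if the result is empty.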

> On Jul 2, 2015, at 8:41 PM, Ben Pfaff <[email protected]> wrote:
> 
> OpenFlow 1.4 introduces the ability to turn on flow table eviction with an
> OFPT_TABLE_MOD message specifying OFPTC_EVICTION.  It also adds related
> machinery to other messages that mention OFPTC_* fields.  This commit adds
> support for the new feature, implementing it as a second, parallel way to
> enable flow table eviction.  It takes more work than it seems like it
> should because there is so much weirdness with the treatment of OFPTC_*
> flags over the evolution of OpenFlow; please refer to the explanation in
> DESIGN.md for more information.
> 
> This commit also adds related support to ovs-ofctl, plus tests.
> 
> Signed-off-by: Ben Pfaff <[email protected]>
> Co-authored-by: Saloni Jain <[email protected]>
> Signed-off-by: Saloni Jain <[email protected]>
> ---
> v1->v2: Miscellaneous style fixes.  Comment improvements.
>  Introduce OFPROTO_EVICTION_FLAGS.  Fix documentation.
> 
> DESIGN.md                       |  85 +++++++++++++-
> NEWS                            |   1 +
> include/openflow/openflow-1.3.h |   9 +-
> lib/ofp-parse.c                 |  52 ++++++---
> lib/ofp-parse.h                 |   2 +-
> lib/ofp-print.c                 |  90 ++++++++++----
> lib/ofp-util.c                  | 220 ++++++++++++++++++++++++++---------
> lib/ofp-util.h                  |  52 ++++++++-
> ofproto/ofproto-provider.h      |   9 +-
> ofproto/ofproto.c               | 173 +++++++++++++++++----------
> ofproto/ofproto.h               |  18 +--
> tests/ofp-print.at              |   2 +-
> tests/ofproto.at                |  23 +---
> utilities/ovs-ofctl.8.in        |  36 +++---
> utilities/ovs-ofctl.c           |  34 +++---
> vswitchd/bridge.c               |   7 +-
> vswitchd/vswitch.xml            | 252 ++++++++++++++++++++++------------------
> 17 files changed, 721 insertions(+), 344 deletions(-)
> 
> diff --git a/DESIGN.md b/DESIGN.md
> index e533b7c..38413d7 100644
> --- a/DESIGN.md
> +++ b/DESIGN.md
> @@ -277,13 +277,19 @@ The table for 1.3 is the same as the one shown above for 1.2.
> 
> 
> OpenFlow 1.4
> -------------
> +-----------
> +
> +OpenFlow 1.4 makes these changes:
> +
> +  - Adds the "importance" field to flow_mods, but it does not
> +    explicitly specify which kinds of flow_mods set the importance.
> +    For consistency, Open vSwitch uses the same rule for importance
> +    as for idle_timeout and hard_timeout, that is, only an "ADD"
> +    flow_mod sets the importance.  (This issue has been filed with
> +    the ONF as EXT-496.)
> 
> -OpenFlow 1.4 adds the "importance" field to flow_mods, but it does not
> -explicitly specify which kinds of flow_mods set the importance.  For
> -consistency, Open vSwitch uses the same rule for importance as for
> -idle_timeout and hard_timeout, that is, only an "ADD" flow_mod sets
> -the importance.  (This issue has been filed with the ONF as EXT-496.)
> +  - Adds an eviction mechanism that automatically deletes entries of
> +    lower importance to make space for newer entries.
> 
> 
> OpenFlow 1.4 Bundles
> @@ -606,6 +612,73 @@ Tables 128 and above are reserved for use by the switch itself.
> Controllers should use only tables 0 through 127.
> 
> 
> +OFPTC_* Table Configuration
> +===========================
> +
> +This section covers the history of the OFPTC_* table configuration
> +bits across OpenFlow versions.
> +
> +OpenFlow 1.0 flow tables had fixed configurations.
> +
> +OpenFlow 1.1 enabled controllers to configure behavior upon flow table
> +miss and added the OFPTC_MISS_* constants for that purpose.  OFPTC_*
> +did not control anything else but it was nevertheless conceptualized
> +as a set of bit-fields instead of an enum.  OF1.1 added the
> +OFPT_TABLE_MOD message to set OFPTC_MISS_* for a flow table and added
> +the 'config' field to the OFPST_TABLE reply to report the current
> +setting.
> +
> +OpenFlow 1.2 did not change anything in this regard.
> +
> +OpenFlow 1.3 switched to another means of changing flow table miss
> +behavior and deprecated OFPTC_MISS_* without adding any more OFPTC_*
> +constants.  This meant that OFPT_TABLE_MOD now had no purpose at all,
> +but OF1.3 kept it around "for backward compatibility with older and
> +newer versions of the specification."  At the same time, OF1.3
> +introduced a new message OFPMP_TABLE_FEATURES that included a field
> +'config' documented as reporting the OFPTC_* values set with
> +OFPT_TABLE_MOD; of course this served no real purpose because no
> +OFPTC_* values are defined.  OF1.3 did remove the OFPTC_* field from
> +OFPMP_TABLE (previously named OFPST_TABLE).
> +
> +OpenFlow 1.4 defined two new OFPTC_* constants, OFPTC_EVICTION and
> +OFPTC_VACANCY_EVENTS, using bits that did not overlap with
> +OFPTC_MISS_* even though those bits had not been defined since OF1.2.
> +OFPT_TABLE_MOD still controlled these settings.  The field for OFPTC_*
> +values in OFPMP_TABLE_FEATURES was renamed from 'config' to
> +'capabilities' and documented as reporting the flags that are
> +supported in an OFPT_TABLE_MOD message.  The OFPMP_TABLE_DESC message
> +newly added in OF1.4 reported the OFPTC_* setting.
> +
> +OpenFlow 1.5 did not change anything in this regard.
> +
> +The following table summarizes.  The columns say:
> +
> +  - OpenFlow version(s).
> +
> +  - The OFPTC_* flags defined in those versions.
> +
> +  - Whether OFPT_TABLE_MOD can modify OFPTC_* flags.
> +
> +  - Whether OFPST_TABLE/OFPMP_TABLE reports the OFPTC_* flags.
> +
> +  - What OFPMP_TABLE_FEATURES reports (if it exists): either the
> +    current configuration or the switch's capabilities.
> +
> +  - Whether OFPMP_TABLE_DESC reports the current configuration.
> +
> +OpenFlow   OFPTC_* flags            TABLE_MOD stats? TABLE_FEATURES TABLE_DESC
> +---------  -----------------------  --------- ------ -------------- ----------
> +OF1.0      none                     no[*][+]  no[*]  nothing[*][+]  no[*][+]
> +OF1.1/1.2  MISS_*                   yes       yes    nothing[+]     no[+]
> +OF1.3      none                     yes[*]    no[*]  config[*]      no[*][+]
> +OF1.4/1.5  EVICTION/VACANCY_EVENTS  yes       no     capabilities   yes
> +
> +   [*] Nothing to report/change anyway.
> +
> +   [+] No such message.
> +
> +
> IPv6
> ====
> 
> diff --git a/NEWS b/NEWS
> index 3da7cfd..57e4f89 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -7,6 +7,7 @@ Post-v2.4.0
>      * Group chaining (where one OpenFlow group triggers another) is
>        now supported.
>      * OpenFlow 1.4+ "importance" is now considered for flow eviction.
> +     * OpenFlow 1.4+ OFPTC_EVICTION is now implemented.
>    - Support for matching and generating options with Geneve tunnels.
>    - Support Multicast Listener Discovery (MLDv1 and MLDv2).
> 
> diff --git a/include/openflow/openflow-1.3.h b/include/openflow/openflow-1.3.h
> index 142d32c..cf93429 100644
> --- a/include/openflow/openflow-1.3.h
> +++ b/include/openflow/openflow-1.3.h
> @@ -232,7 +232,14 @@ struct ofp13_table_features {
>     char name[OFP_MAX_TABLE_NAME_LEN];
>     ovs_be64 metadata_match;  /* Bits of metadata table can match. */
>     ovs_be64 metadata_write;  /* Bits of metadata table can write. */
> -    ovs_be32 config;          /* Bitmap of OFPTC_* values */
> +
> +    /* In OF1.3 this field was named 'config' and it was useless because OF1.3
> +     * did not define any OFPTC_* bits.
> +     *
> +     * OF1.4 renamed this field to 'capabilities' and added OFPTC14_EVICTION
> +     * and OFPTC14_VACANCY_EVENTS. */
> +    ovs_be32 capabilities;    /* Bitmap of OFPTC_* values */
> +
>     ovs_be32 max_entries;     /* Max number of entries supported. */
> 
>     /* Table Feature Property list */
> diff --git a/lib/ofp-parse.c b/lib/ofp-parse.c
> index df32819..9e88d6d 100644
> --- a/lib/ofp-parse.c
> +++ b/lib/ofp-parse.c
> @@ -869,20 +869,20 @@ parse_ofp_flow_mod_str(struct ofputil_flow_mod *fm, const char *string,
>     return error;
> }
> 
> -/* Convert 'table_id' and 'flow_miss_handling' (as described for the
> - * "mod-table" command in the ovs-ofctl man page) into 'tm' for sending the
> - * specified table_mod 'command' to a switch.
> +/* Convert 'table_id' and 'setting' (as described for the "mod-table" command
> + * in the ovs-ofctl man page) into 'tm' for sending a table_mod command to a
> + * switch.
> + *
> + * Stores a bitmap of the OpenFlow versions that are usable for 'tm' into
> + * '*usable_versions'.
>  *
>  * Returns NULL if successful, otherwise a malloc()'d string describing the
>  * error.  The caller is responsible for freeing the returned string. */
> char * OVS_WARN_UNUSED_RESULT
> parse_ofp_table_mod(struct ofputil_table_mod *tm, const char *table_id,
> -                    const char *flow_miss_handling,
> -                    enum ofputil_protocol *usable_protocols)
> +                    const char *setting, uint32_t *usable_versions)
> {
> -    /* Table mod requires at least OF 1.1. */
> -    *usable_protocols = OFPUTIL_P_OF11_UP;
> -
> +    *usable_versions = 0;
>     if (!strcasecmp(table_id, "all")) {
>         tm->table_id = OFPTT_ALL;
>     } else {
> @@ -892,18 +892,38 @@ parse_ofp_table_mod(struct ofputil_table_mod *tm, const char *table_id,
>         }
>     }
> 
> -    if (strcmp(flow_miss_handling, "controller") == 0) {
> -        tm->miss_config = OFPUTIL_TABLE_MISS_CONTROLLER;
> -    } else if (strcmp(flow_miss_handling, "continue") == 0) {
> -        tm->miss_config = OFPUTIL_TABLE_MISS_CONTINUE;
> -    } else if (strcmp(flow_miss_handling, "drop") == 0) {
> -        tm->miss_config = OFPUTIL_TABLE_MISS_DROP;
> +    tm->miss = OFPUTIL_TABLE_MISS_DEFAULT;
> +    tm->eviction = OFPUTIL_TABLE_EVICTION_DEFAULT;
> +    tm->eviction_flags = UINT32_MAX;
> +
> +    /* Only OpenFlow 1.1 and 1.2 can configure table-miss via table_mod.
> +     * Only OpenFlow 1.4+ can configure eviction via table_mod.
> +     *
> +     * (OpenFlow 1.4+ can also configure vacancy events via table_mod, but OVS
> +     * doesn't support those yet and they're also logically a per-OpenFlow
> +     * session setting so it wouldn't make sense to support them here anyway.)
> +     */
> +    if (!strcmp(setting, "controller")) {
> +        tm->miss = OFPUTIL_TABLE_MISS_CONTROLLER;
> +        *usable_versions = (1u << OFP11_VERSION) | (1u << OFP12_VERSION);
> +    } else if (!strcmp(setting, "continue")) {
> +        tm->miss = OFPUTIL_TABLE_MISS_CONTINUE;
> +        *usable_versions = (1u << OFP11_VERSION) | (1u << OFP12_VERSION);
> +    } else if (!strcmp(setting, "drop")) {
> +        tm->miss = OFPUTIL_TABLE_MISS_DROP;
> +        *usable_versions = (1u << OFP11_VERSION) | (1u << OFP12_VERSION);
> +    } else if (!strcmp(setting, "evict")) {
> +        tm->eviction = OFPUTIL_TABLE_EVICTION_ON;
> +        *usable_versions = (1u << OFP14_VERSION) | (1u << OFP15_VERSION);
> +    } else if (!strcmp(setting, "noevict")) {
> +        tm->eviction = OFPUTIL_TABLE_EVICTION_OFF;
> +        *usable_versions = (1u << OFP14_VERSION) | (1u << OFP15_VERSION);
>     } else {
> -        return xasprintf("invalid flow_miss_handling %s", flow_miss_handling);
> +        return xasprintf("invalid table_mod setting %s", setting);
>     }
> 
>     if (tm->table_id == 0xfe
> -        && tm->miss_config == OFPUTIL_TABLE_MISS_CONTINUE) {
> +        && tm->miss == OFPUTIL_TABLE_MISS_CONTINUE) {
>         return xstrdup("last table's flow miss handling can not be continue");
>     }
> 
> diff --git a/lib/ofp-parse.h b/lib/ofp-parse.h
> index 0749d5b..930e3d9 100644
> --- a/lib/ofp-parse.h
> +++ b/lib/ofp-parse.h
> @@ -49,7 +49,7 @@ char *parse_ofp_flow_mod_str(struct ofputil_flow_mod *, const char *string,
> 
> char *parse_ofp_table_mod(struct ofputil_table_mod *,
>                           const char *table_id, const char *flow_miss_handling,
> -                          enum ofputil_protocol *usable_protocols)
> +                          uint32_t *usable_versions)
>     OVS_WARN_UNUSED_RESULT;
> 
> char *parse_ofp_flow_mod_file(const char *file_name, int command,
> diff --git a/lib/ofp-print.c b/lib/ofp-print.c
> index 1b25180..d76134f 100644
> --- a/lib/ofp-print.c
> +++ b/lib/ofp-print.c
> @@ -942,23 +942,54 @@ ofp_print_port_mod(struct ds *string, const struct ofp_header *oh)
>     }
> }
> 
> -static void
> -ofp_print_table_miss_config(struct ds *string, enum ofputil_table_miss miss)
> +static const char *
> +ofputil_table_miss_to_string(enum ofputil_table_miss miss)
> {
>     switch (miss) {
> -    case OFPUTIL_TABLE_MISS_CONTROLLER:
> -        ds_put_cstr(string, "controller\n");
> -        break;
> -    case OFPUTIL_TABLE_MISS_CONTINUE:
> -        ds_put_cstr(string, "continue\n");
> -        break;
> -    case OFPUTIL_TABLE_MISS_DROP:
> -        ds_put_cstr(string, "drop\n");
> -        break;
> -    case OFPUTIL_TABLE_MISS_DEFAULT:
> -    default:
> -        ds_put_format(string, "Unknown (%d)\n", miss);
> -        break;
> +    case OFPUTIL_TABLE_MISS_DEFAULT: return "default";
> +    case OFPUTIL_TABLE_MISS_CONTROLLER: return "controller";
> +    case OFPUTIL_TABLE_MISS_CONTINUE: return "continue";
> +    case OFPUTIL_TABLE_MISS_DROP: return "drop";
> +    default: return "***error***";
> +    }
> +}
> +
> +static const char *
> +ofputil_table_eviction_to_string(enum ofputil_table_eviction eviction)
> +{
> +    switch (eviction) {
> +    case OFPUTIL_TABLE_EVICTION_DEFAULT: return "default";
> +    case OFPUTIL_TABLE_EVICTION_ON: return "on";
> +    case OFPUTIL_TABLE_EVICTION_OFF: return "off";
> +    default: return "***error***";
> +    }
> +
> +}
> +
> +static const char *
> +ofputil_eviction_flag_to_string(uint32_t bit)
> +{
> +    enum ofp14_table_mod_prop_eviction_flag eviction_flag = bit;
> +
> +    switch (eviction_flag) {
> +    case OFPTMPEF14_OTHER:      return "OTHER";
> +    case OFPTMPEF14_IMPORTANCE: return "IMPORTANCE";
> +    case OFPTMPEF14_LIFETIME:   return "LIFETIME";
> +    }
> +
> +    return NULL;
> +}
> +
> +/* Appends to 'string' a description of the bitmap of OFPTMPEF14_* values in
> + * 'eviction_flags'. */
> +static void
> +ofputil_put_eviction_flags(struct ds *string, uint32_t eviction_flags)
> +{
> +    if (eviction_flags != UINT32_MAX) {
> +        ofp_print_bit_names(string, eviction_flags,
> +                            ofputil_eviction_flag_to_string, '|');
> +    } else {
> +        ds_put_cstr(string, "(default)");
>     }
> }
> 
> @@ -980,9 +1011,17 @@ ofp_print_table_mod(struct ds *string, const struct ofp_header *oh)
>         ds_put_format(string, " table_id=%"PRIu8, pm.table_id);
>     }
> 
> -    if (pm.miss_config != OFPUTIL_TABLE_MISS_DEFAULT) {
> -        ds_put_cstr(string, ", flow_miss_config=");
> -        ofp_print_table_miss_config(string, pm.miss_config);
> +    if (pm.miss != OFPUTIL_TABLE_MISS_DEFAULT) {
> +        ds_put_format(string, ", flow_miss_config=%s",
> +                      ofputil_table_miss_to_string(pm.miss));
> +    }
> +    if (pm.eviction != OFPUTIL_TABLE_EVICTION_DEFAULT) {
> +        ds_put_format(string, ", eviction=%s",
> +                      ofputil_table_eviction_to_string(pm.eviction));
> +    }
> +    if (pm.eviction_flags != UINT32_MAX) {
> +        ds_put_cstr(string, ", eviction_flags=");
> +        ofputil_put_eviction_flags(string, pm.eviction_flags);
>     }
> }
> 
> @@ -2500,8 +2539,19 @@ ofp_print_table_features(struct ds *s,
>     }
> 
>     if (features->miss_config != OFPUTIL_TABLE_MISS_DEFAULT) {
> -        ds_put_cstr(s, "    config=");
> -        ofp_print_table_miss_config(s, features->miss_config);
> +        ds_put_format(s, "    config=%s\n",
> +                      ofputil_table_miss_to_string(features->miss_config));
> +    }
> +
> +    if (features->supports_eviction >= 0) {
> +        ds_put_format(s, "    eviction: %ssupported\n",
> +                      features->supports_eviction ? "" : "not ");
> +
> +    }
> +    if (features->supports_vacancy_events >= 0) {
> +        ds_put_format(s, "    vacancy events: %ssupported\n",
> +                      features->supports_vacancy_events ? "" : "not ");
> +
>     }
> 
>     if (features->max_entries) {
> diff --git a/lib/ofp-util.c b/lib/ofp-util.c
> index 38dcb89..2e9ae47 100644
> --- a/lib/ofp-util.c
> +++ b/lib/ofp-util.c
> @@ -53,8 +53,11 @@ VLOG_DEFINE_THIS_MODULE(ofp_util);
>  * in the peer and so there's not much point in showing a lot of them. */
> static struct vlog_rate_limit bad_ofmsg_rl = VLOG_RATE_LIMIT_INIT(1, 5);
> 
> -static enum ofputil_table_miss ofputil_table_miss_from_config(
> -    ovs_be32 config_, enum ofp_version);
> +static enum ofputil_table_eviction ofputil_decode_table_eviction(
> +    ovs_be32 config, enum ofp_version);
> +static ovs_be32 ofputil_encode_table_config(enum ofputil_table_miss,
> +                                            enum ofputil_table_eviction,
> +                                            enum ofp_version);
> 
> struct ofp_prop_header {
>     ovs_be16 type;
> @@ -4644,7 +4647,15 @@ ofputil_decode_table_features(struct ofpbuf *msg,
>     ovs_strlcpy(tf->name, otf->name, OFP_MAX_TABLE_NAME_LEN);
>     tf->metadata_match = otf->metadata_match;
>     tf->metadata_write = otf->metadata_write;
> -    tf->miss_config = ofputil_table_miss_from_config(otf->config, oh->version);
> +    tf->miss_config = OFPUTIL_TABLE_MISS_DEFAULT;
> +    if (oh->version >= OFP14_VERSION) {
> +        uint32_t caps = ntohl(otf->capabilities);
> +        tf->supports_eviction = (caps & OFPTC14_EVICTION) != 0;
> +        tf->supports_vacancy_events = (caps & OFPTC14_VACANCY_EVENTS) != 0;
> +    } else {
> +        tf->supports_eviction = -1;
> +        tf->supports_vacancy_events = -1;
> +    }
>     tf->max_entries = ntohl(otf->max_entries);
> 
>     while (properties.size > 0) {
> @@ -4852,7 +4863,14 @@ ofputil_append_table_features_reply(const struct ofputil_table_features *tf,
>     ovs_strlcpy(otf->name, tf->name, sizeof otf->name);
>     otf->metadata_match = tf->metadata_match;
>     otf->metadata_write = tf->metadata_write;
> -    otf->config = ofputil_table_miss_to_config(tf->miss_config, version);
> +    if (version >= OFP14_VERSION) {
> +        if (tf->supports_eviction) {
> +            otf->capabilities |= htonl(OFPTC14_EVICTION);
> +        }
> +        if (tf->supports_vacancy_events) {
> +            otf->capabilities |= htonl(OFPTC14_VACANCY_EVENTS);
> +        }
> +    }
>     otf->max_entries = htonl(tf->max_entries);
> 
>     put_table_instruction_features(reply, &tf->nonmiss, 0, version);
> @@ -4868,17 +4886,97 @@ ofputil_append_table_features_reply(const struct ofputil_table_features *tf,
>     ofpmp_postappend(replies, start_ofs);
> }
> 
> -/* ofputil_table_mod */
> +static enum ofperr
> +parse_table_mod_eviction_property(struct ofpbuf *property,
> +                                  struct ofputil_table_mod *tm)
> +{
> +    struct ofp14_table_mod_prop_eviction *ote = property->data;
> +
> +    if (property->size != sizeof *ote) {
> +        return OFPERR_OFPBPC_BAD_LEN;
> +    }
> +
> +    tm->eviction_flags = ntohl(ote->flags);
> +    return 0;
> +}
> +
> +/* Given 'config', taken from an OpenFlow 'version' message that specifies
> + * table configuration (a table mod, table stats, or table features message),
> + * returns the table eviction configuration that it specifies.
> + *
> + * Only OpenFlow 1.4 and later specify table eviction configuration this way,
> + * so for other 'version' values this function always returns
> + * OFPUTIL_TABLE_EVICTION_DEFAULT. */
> +static enum ofputil_table_eviction
> +ofputil_decode_table_eviction(ovs_be32 config, enum ofp_version version)
> +{
> +    return (version < OFP14_VERSION ? OFPUTIL_TABLE_EVICTION_DEFAULT
> +            : config & htonl(OFPTC14_EVICTION) ? OFPUTIL_TABLE_EVICTION_ON
> +            : OFPUTIL_TABLE_EVICTION_OFF);
> +}
> +
> +/* Returns a bitmap of OFPTC* values suitable for 'config' fields in various
> + * OpenFlow messages of the given 'version', based on the provided 'miss' and
> + * 'eviction' values. */
> +static ovs_be32
> +ofputil_encode_table_config(enum ofputil_table_miss miss,
> +                            enum ofputil_table_eviction eviction,
> +                            enum ofp_version version)
> +{
> +    /* See the section "OFPTC_* Table Configuration" in DESIGN.md for more
> +     * information on the crazy evolution of this field. */
> +    switch (version) {
> +    case OFP10_VERSION:
> +        /* OpenFlow 1.0 didn't have such a field, any value ought to do. */
> +        return htonl(0);
> +
> +    case OFP11_VERSION:
> +    case OFP12_VERSION:
> +        /* OpenFlow 1.1 and 1.2 define only OFPTC11_TABLE_MISS_*. */
> +        switch (miss) {
> +        case OFPUTIL_TABLE_MISS_DEFAULT:
> +            /* Really this shouldn't be used for encoding (the caller should
> +             * provide a specific value) but I can't imagine that defaulting to
> +             * the fall-through case here will hurt. */
> +        case OFPUTIL_TABLE_MISS_CONTROLLER:
> +        default:
> +            return htonl(OFPTC11_TABLE_MISS_CONTROLLER);
> +        case OFPUTIL_TABLE_MISS_CONTINUE:
> +            return htonl(OFPTC11_TABLE_MISS_CONTINUE);
> +        case OFPUTIL_TABLE_MISS_DROP:
> +            return htonl(OFPTC11_TABLE_MISS_DROP);
> +        }
> +        OVS_NOT_REACHED();
> +
> +    case OFP13_VERSION:
> +        /* OpenFlow 1.3 removed OFPTC11_TABLE_MISS_* and didn't define any new
> +         * flags, so this is correct. */
> +        return htonl(0);
> +
> +    case OFP14_VERSION:
> +    case OFP15_VERSION:
> +        /* OpenFlow 1.4 introduced OFPTC14_EVICTION and OFPTC14_VACANCY_EVENTS
> +         * and we don't support the latter yet. */
> +        return htonl(eviction == OFPUTIL_TABLE_EVICTION_ON
> +                     ? OFPTC14_EVICTION : 0);
> +    }
> +
> +    OVS_NOT_REACHED();
> +}
> 
> /* Given 'config', taken from an OpenFlow 'version' message that specifies
>  * table configuration (a table mod, table stats, or table features message),
> - * returns the table miss configuration that it specifies.  */
> + * returns the table miss configuration that it specifies.
> + *
> + * Only OpenFlow 1.1 and 1.2 specify table miss configurations this way, so for
> + * other 'version' values this function always returns
> + * OFPUTIL_TABLE_MISS_DEFAULT. */
> static enum ofputil_table_miss
> -ofputil_table_miss_from_config(ovs_be32 config_, enum ofp_version version)
> +ofputil_decode_table_miss(ovs_be32 config_, enum ofp_version version)
> {
>     uint32_t config = ntohl(config_);
> 
> -    if (version < OFP13_VERSION) {
> +    if (version == OFP11_VERSION || version == OFP12_VERSION) {
>         switch (config & OFPTC11_TABLE_MISS_MASK) {
>         case OFPTC11_TABLE_MISS_CONTROLLER:
>             return OFPUTIL_TABLE_MISS_CONTROLLER;
> @@ -4898,32 +4996,6 @@ ofputil_table_miss_from_config(ovs_be32 config_, enum ofp_version version)
>     }
> }
> 
> -/* Given a table miss configuration, returns the corresponding OpenFlow table
> - * configuration for use in an OpenFlow message of the given 'version'. */
> -ovs_be32
> -ofputil_table_miss_to_config(enum ofputil_table_miss miss,
> -                             enum ofp_version version)
> -{
> -    if (version < OFP13_VERSION) {
> -        switch (miss) {
> -        case OFPUTIL_TABLE_MISS_CONTROLLER:
> -        case OFPUTIL_TABLE_MISS_DEFAULT:
> -            return htonl(OFPTC11_TABLE_MISS_CONTROLLER);
> -
> -        case OFPUTIL_TABLE_MISS_CONTINUE:
> -            return htonl(OFPTC11_TABLE_MISS_CONTINUE);
> -
> -        case OFPUTIL_TABLE_MISS_DROP:
> -            return htonl(OFPTC11_TABLE_MISS_DROP);
> -
> -        default:
> -            OVS_NOT_REACHED();
> -        }
> -    } else {
> -        return htonl(0);
> -    }
> -}
> -
> /* Decodes the OpenFlow "table mod" message in '*oh' into an abstract form in
>  * '*pm'.  Returns 0 if successful, otherwise an OFPERR_* value. */
> enum ofperr
> @@ -4933,6 +5005,10 @@ ofputil_decode_table_mod(const struct ofp_header *oh,
>     enum ofpraw raw;
>     struct ofpbuf b;
> 
> +    memset(pm, 0, sizeof *pm);
> +    pm->miss = OFPUTIL_TABLE_MISS_DEFAULT;
> +    pm->eviction = OFPUTIL_TABLE_EVICTION_DEFAULT;
> +    pm->eviction_flags = UINT32_MAX;
>     ofpbuf_use_const(&b, oh, ntohs(oh->length));
>     raw = ofpraw_pull_assert(&b);
> 
> @@ -4940,16 +5016,37 @@ ofputil_decode_table_mod(const struct ofp_header *oh,
>         const struct ofp11_table_mod *otm = b.data;
> 
>         pm->table_id = otm->table_id;
> -        pm->miss_config = ofputil_table_miss_from_config(otm->config,
> -                                                         oh->version);
> +        pm->miss = ofputil_decode_table_miss(otm->config, oh->version);
>     } else if (raw == OFPRAW_OFPT14_TABLE_MOD) {
>         const struct ofp14_table_mod *otm = ofpbuf_pull(&b, sizeof *otm);
> 
>         pm->table_id = otm->table_id;
> -        pm->miss_config = ofputil_table_miss_from_config(otm->config,
> -                                                         oh->version);
> -        /* We do not understand any properties yet, so we do not bother
> -         * parsing them. */
> +        pm->miss = ofputil_decode_table_miss(otm->config, oh->version);
> +        pm->eviction = ofputil_decode_table_eviction(otm->config, oh->version);
> +        while (b.size > 0) {
> +            struct ofpbuf property;
> +            enum ofperr error;
> +            uint16_t type;
> +
> +            error = ofputil_pull_property(&b, &property, &type);
> +            if (error) {
> +                return error;
> +            }
> +
> +            switch (type) {
> +            case OFPTMPT14_EVICTION:
> +                error = parse_table_mod_eviction_property(&property, pm);
> +                break;
> +
> +            default:
> +                error = OFPERR_OFPBRC_BAD_TYPE;
> +                break;
> +            }
> +
> +            if (error) {
> +                return error;
> +            }
> +        }
>     } else {
>         return OFPERR_OFPBRC_BAD_TYPE;
>     }
> @@ -4957,11 +5054,11 @@ ofputil_decode_table_mod(const struct ofp_header *oh,
>     return 0;
> }
> 
> -/* Converts the abstract form of a "table mod" message in '*pm' into an OpenFlow
> - * message suitable for 'protocol', and returns that encoded form in a buffer
> - * owned by the caller. */
> +/* Converts the abstract form of a "table mod" message in '*tm' into an
> + * OpenFlow message suitable for 'protocol', and returns that encoded form in a
> + * buffer owned by the caller. */
> struct ofpbuf *
> -ofputil_encode_table_mod(const struct ofputil_table_mod *pm,
> +ofputil_encode_table_mod(const struct ofputil_table_mod *tm,
>                         enum ofputil_protocol protocol)
> {
>     enum ofp_version ofp_version = ofputil_protocol_to_ofp_version(protocol);
> @@ -4980,20 +5077,28 @@ ofputil_encode_table_mod(const struct ofputil_table_mod *pm,
> 
>         b = ofpraw_alloc(OFPRAW_OFPT11_TABLE_MOD, ofp_version, 0);
>         otm = ofpbuf_put_zeros(b, sizeof *otm);
> -        otm->table_id = pm->table_id;
> -        otm->config = ofputil_table_miss_to_config(pm->miss_config,
> -                                                   ofp_version);
> +        otm->table_id = tm->table_id;
> +        otm->config = ofputil_encode_table_config(tm->miss, tm->eviction,
> +                                                  ofp_version);
>         break;
>     }
>     case OFP14_VERSION:
>     case OFP15_VERSION: {
>         struct ofp14_table_mod *otm;
> +        struct ofp14_table_mod_prop_eviction *ote;
> 
>         b = ofpraw_alloc(OFPRAW_OFPT14_TABLE_MOD, ofp_version, 0);
>         otm = ofpbuf_put_zeros(b, sizeof *otm);
> -        otm->table_id = pm->table_id;
> -        otm->config = ofputil_table_miss_to_config(pm->miss_config,
> -                                                   ofp_version);
> +        otm->table_id = tm->table_id;
> +        otm->config = ofputil_encode_table_config(tm->miss, tm->eviction,
> +                                                  ofp_version);
> +
> +        if (tm->eviction_flags != UINT32_MAX) {
> +            ote = ofpbuf_put_zeros(b, sizeof *ote);
> +            ote->type = htons(OFPTMPT14_EVICTION);
> +            ote->length = htons(sizeof *ote);
> +            ote->flags = htonl(tm->eviction_flags);
> +        }
>         break;
>     }
>     default:
> @@ -5339,8 +5444,9 @@ ofputil_put_ofp12_table_stats(const struct ofputil_table_stats *stats,
>     out->metadata_write = features->metadata_write;
>     out->instructions = ovsinst_bitmap_to_openflow(
>         features->nonmiss.instructions, OFP12_VERSION);
> -    out->config = ofputil_table_miss_to_config(features->miss_config,
> -                                               OFP12_VERSION);
> +    out->config = ofputil_encode_table_config(features->miss_config,
> +                                              OFPUTIL_TABLE_EVICTION_DEFAULT,
> +                                              OFP12_VERSION);
>     out->max_entries = htonl(features->max_entries);
>     out->active_count = htonl(stats->active_count);
>     out->lookup_count = htonll(stats->lookup_count);
> @@ -5446,8 +5552,8 @@ ofputil_decode_ofp11_table_stats(struct ofpbuf *msg,
>     features->nonmiss.apply.ofpacts = ofpact_bitmap_from_openflow(
>         ots->write_actions, OFP11_VERSION);
>     features->miss = features->nonmiss;
> -    features->miss_config = ofputil_table_miss_from_config(ots->config,
> -                                                           OFP11_VERSION);
> +    features->miss_config = ofputil_decode_table_miss(ots->config,
> +                                                      OFP11_VERSION);
>     features->match = mf_bitmap_from_of11(ots->match);
>     features->wildcard = mf_bitmap_from_of11(ots->wildcards);
>     bitmap_or(features->match.bm, features->wildcard.bm, MFF_N_IDS);
> @@ -5476,8 +5582,8 @@ ofputil_decode_ofp12_table_stats(struct ofpbuf *msg,
>     ovs_strlcpy(features->name, ots->name, sizeof features->name);
>     features->metadata_match = ots->metadata_match;
>     features->metadata_write = ots->metadata_write;
> -    features->miss_config = ofputil_table_miss_from_config(ots->config,
> -                                                           OFP12_VERSION);
> +    features->miss_config = ofputil_decode_table_miss(ots->config,
> +                                                      OFP12_VERSION);
>     features->max_entries = ntohl(ots->max_entries);
> 
>     features->nonmiss.instructions = ovsinst_bitmap_from_openflow(
> @@ -5545,6 +5651,8 @@ ofputil_decode_table_stats_reply(struct ofpbuf *msg,
> 
>     memset(stats, 0, sizeof *stats);
>     memset(features, 0, sizeof *features);
> +    features->supports_eviction = -1;
> +    features->supports_vacancy_events = -1;
> 
>     switch ((enum ofp_version) oh->version) {
>     case OFP10_VERSION:
> diff --git a/lib/ofp-util.h b/lib/ofp-util.h
> index f90ac0c..fbc8abf 100644
> --- a/lib/ofp-util.h
> +++ b/lib/ofp-util.h
> @@ -609,13 +609,33 @@ enum ofputil_table_miss {
>     OFPUTIL_TABLE_MISS_DROP,       /* Drop the packet. */
> };
> 
> -ovs_be32 ofputil_table_miss_to_config(enum ofputil_table_miss,
> -                                      enum ofp_version);
> +/* Abstract version of OFPTC14_EVICTION.
> + *
> + * OpenFlow 1.0 through 1.3 don't know anything about eviction, so decoding a
> + * message for one of these protocols always yields
> + * OFPUTIL_TABLE_EVICTION_DEFAULT. */
> +enum ofputil_table_eviction {
> +    OFPUTIL_TABLE_EVICTION_DEFAULT, /* No value. */
> +    OFPUTIL_TABLE_EVICTION_ON,      /* Enable eviction. */
> +    OFPUTIL_TABLE_EVICTION_OFF      /* Disable eviction. */
> +};
> 
> /* Abstract ofp_table_mod. */
> struct ofputil_table_mod {
>     uint8_t table_id;         /* ID of the table, 0xff indicates all tables. */
> -    enum ofputil_table_miss miss_config;
> +
> +    /* OpenFlow 1.1 and 1.2 only.  For other versions, ignored on encoding,
> +     * decoded to OFPUTIL_TABLE_MISS_DEFAULT. */
> +    enum ofputil_table_miss miss;
> +
> +    /* OpenFlow 1.4+ only.  For other versions, ignored on encoding, decoded to
> +     * OFPUTIL_TABLE_EVICTION_DEFAULT. */
> +    enum ofputil_table_eviction eviction;
> +
> +    /* OpenFlow 1.4+ only and optional even there; UINT32_MAX indicates
> +     * absence.  For other versions, ignored on encoding, decoded to
> +     * UINT32_MAX. */
> +    uint32_t eviction_flags;    /* OFPTMPEF14_*. */
> };
> 
> enum ofperr ofputil_decode_table_mod(const struct ofp_header *,
> @@ -623,16 +643,38 @@ enum ofperr ofputil_decode_table_mod(const struct ofp_header *,
> struct ofpbuf *ofputil_encode_table_mod(const struct ofputil_table_mod *,
>                                        enum ofputil_protocol);
> 
> -/* Abstract ofp_table_features. */
> +/* Abstract ofp_table_features.
> + *
> + * This is used for all versions of OpenFlow, even though ofp_table_features
> + * was only introduced in OpenFlow 1.3, because earlier versions of OpenFlow
> + * include support for a subset of ofp_table_features through OFPST_TABLE (aka
> + * OFPMP_TABLE). */
> struct ofputil_table_features {
>     uint8_t table_id;         /* Identifier of table. Lower numbered tables
>                                  are consulted first. */
>     char name[OFP_MAX_TABLE_NAME_LEN];
>     ovs_be64 metadata_match;  /* Bits of metadata table can match. */
>     ovs_be64 metadata_write;  /* Bits of metadata table can write. */
> -    enum ofputil_table_miss miss_config;
>     uint32_t max_entries;     /* Max number of entries supported. */
> 
> +    /* Flags.
> +     *
> +     * 'miss_config' is relevant for OpenFlow 1.1 and 1.2 only, because those
> +     * versions include OFPTC_MISS_* flags in OFPST_TABLE.  For other versions,
> +     * it is decoded to OFPUTIL_TABLE_MISS_DEFAULT and ignored for encoding.
> +     *
> +     * 'supports_eviction' and 'supports_vacancy_events' are relevant for
> +     * OpenFlow 1.4 and later only.  For OF1.4+, they are boolean: 1 if
> +     * supported, otherwise 0.  For other versions, they are decoded as -1 and
> +     * ignored for encoding.
> +     *
> +     * See the section "OFPTC_* Table Configuration" in DESIGN.md for more
> +     * details of how OpenFlow has changed in this area.
> +     */
> +    enum ofputil_table_miss miss_config; /* OF1.1 and 1.2 only. */
> +    int supports_eviction;               /* OF1.4+ only. */
> +    int supports_vacancy_events;         /* OF1.4+ only. */
> +
>     /* Table features related to instructions.  There are two instances:
>      *
>      *   - 'miss' reports features available in the table miss flow.
> diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
> index f7e8ead..95c1c64 100644
> --- a/ofproto/ofproto-provider.h
> +++ b/ofproto/ofproto-provider.h
> @@ -246,9 +246,16 @@ struct oftable {
>     struct hmap eviction_groups_by_id;
>     struct heap eviction_groups_by_size;
> 
> -    /* Table configuration. */
> +    /* Flow table miss handling configuration. */
>     ATOMIC(enum ofputil_table_miss) miss_config;
> 
> +    /* Eviction is enabled if either the client (vswitchd) enables it or an
> +     * OpenFlow controller enables it; thus, a nonzero value indicates that
> +     * eviction is enabled.  */
> +#define EVICTION_CLIENT  (1 << 0)  /* Set to 1 if client enables eviction. */
> +#define EVICTION_OPENFLOW (1 << 1) /* Set to 1 if OpenFlow enables eviction. */
> +    unsigned int eviction;
> +
>     atomic_ulong n_matched;
>     atomic_ulong n_missed;
> };
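
The two-bit 'eviction' field above can be read as "eviction is on while at
least one requester wants it".  A minimal sketch of that idea, assuming
helper names (eviction_set_bit, eviction_is_enabled) that are hypothetical
and not part of the patch:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative copies of the two bits from the hunk above. */
#define EVICTION_CLIENT   (1u << 0)  /* Client (vswitchd) enables eviction. */
#define EVICTION_OPENFLOW (1u << 1)  /* OpenFlow controller enables eviction. */

/* Hypothetical helper: returns 'eviction' with 'bit' set or cleared,
 * leaving the other requester's bit untouched. */
static unsigned int
eviction_set_bit(unsigned int eviction, unsigned int bit, bool enable)
{
    assert(bit == EVICTION_CLIENT || bit == EVICTION_OPENFLOW);
    return enable ? eviction | bit : eviction & ~bit;
}

/* Hypothetical helper: eviction is in effect if any requester asked for it. */
static bool
eviction_is_enabled(unsigned int eviction)
{
    return eviction != 0;
}
```

With this shape, clearing the client bit while the OpenFlow bit is still set
leaves eviction enabled, which is the behavior the comment in the hunk
describes.
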
> diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
> index 4aa844e..970aca3 100644
> --- a/ofproto/ofproto.c
> +++ b/ofproto/ofproto.c
> @@ -84,13 +84,18 @@ static void oftable_set_name(struct oftable *, const char *name);
> 
> static enum ofperr evict_rules_from_table(struct oftable *)
>     OVS_REQUIRES(ofproto_mutex);
> -static void oftable_disable_eviction(struct oftable *)
> -    OVS_REQUIRES(ofproto_mutex);
> -static void oftable_enable_eviction(struct oftable *,
> -                                    const struct mf_subfield *fields,
> -                                    size_t n_fields)
> +static void oftable_configure_eviction(struct oftable *,
> +                                       unsigned int eviction,
> +                                       const struct mf_subfield *fields,
> +                                       size_t n_fields)
>     OVS_REQUIRES(ofproto_mutex);
> 
> +/* This is the only combination of OpenFlow eviction flags that OVS supports: a
> + * combination of OF1.4+ importance, the remaining lifetime of the flow, and
> + * fairness based on user-specified fields. */
> +#define OFPROTO_EVICTION_FLAGS \
> +    (OFPTMPEF14_OTHER | OFPTMPEF14_IMPORTANCE | OFPTMPEF14_LIFETIME)
> +
> /* A set of rules within a single OpenFlow table (oftable) that have the same
>  * values for the oftable's eviction_fields.  A rule to be evicted, when one is
>  * needed, is taken from the eviction group that contains the greatest number
> @@ -1429,11 +1434,10 @@ ofproto_configure_table(struct ofproto *ofproto, int table_id,
>     }
> 
>     ovs_mutex_lock(&ofproto_mutex);
> -    if (s->groups) {
> -        oftable_enable_eviction(table, s->groups, s->n_groups);
> -    } else {
> -        oftable_disable_eviction(table);
> -    }
> +    unsigned int new_eviction = (s->enable_eviction
> +                                 ? table->eviction | EVICTION_CLIENT
> +                                 : table->eviction & ~EVICTION_CLIENT);
> +    oftable_configure_eviction(table, new_eviction, s->groups, s->n_groups);
>     table->max_flows = s->max_flows;
>     evict_rules_from_table(table);
>     ovs_mutex_unlock(&ofproto_mutex);
> @@ -1695,7 +1699,7 @@ ofproto_run(struct ofproto *p)
>             struct eviction_group *evg;
>             struct rule *rule;
> 
> -            if (!table->eviction_fields) {
> +            if (!table->eviction) {
>                 continue;
>             }
> 
> @@ -6560,10 +6564,38 @@ handle_group_mod(struct ofconn *ofconn, const struct ofp_header *oh)
> enum ofputil_table_miss
> ofproto_table_get_miss_config(const struct ofproto *ofproto, uint8_t table_id)
> {
> -    enum ofputil_table_miss value;
> +    enum ofputil_table_miss miss;
> +
> +    atomic_read_relaxed(&ofproto->tables[table_id].miss_config, &miss);
> +    return miss;
> +}
> +
> +static void
> +table_mod__(struct oftable *oftable,
> +            enum ofputil_table_miss miss, enum ofputil_table_eviction eviction)
> +{
> +    if (miss == OFPUTIL_TABLE_MISS_DEFAULT) {
> +        /* This is how an OFPT_TABLE_MOD decodes if it doesn't specify any
> +         * table-miss configuration (because the protocol used doesn't have
> +         * such a concept), so there's nothing to do. */
> +    } else {
> +        atomic_store_relaxed(&oftable->miss_config, miss);
> +    }
> +
> +    unsigned int new_eviction = oftable->eviction;
> +    if (eviction == OFPUTIL_TABLE_EVICTION_ON) {
> +        new_eviction |= EVICTION_OPENFLOW;
> +    } else if (eviction == OFPUTIL_TABLE_EVICTION_OFF) {
> +        new_eviction &= ~EVICTION_OPENFLOW;
> +    }
> 
> -    atomic_read_relaxed(&ofproto->tables[table_id].miss_config, &value);
> -    return value;
> +    if (new_eviction != oftable->eviction) {
> +        ovs_mutex_lock(&ofproto_mutex);
> +        oftable_configure_eviction(oftable, new_eviction,
> +                                   oftable->eviction_fields,
> +                                   oftable->n_eviction_fields);
> +        ovs_mutex_unlock(&ofproto_mutex);
> +    }
> }
> 
> static enum ofperr
> @@ -6571,18 +6603,33 @@ table_mod(struct ofproto *ofproto, const struct ofputil_table_mod *tm)
> {
>     if (!check_table_id(ofproto, tm->table_id)) {
>         return OFPERR_OFPTMFC_BAD_TABLE;
> -    } else if (tm->miss_config != OFPUTIL_TABLE_MISS_DEFAULT) {
> -        if (tm->table_id == OFPTT_ALL) {
> -            int i;
> -            for (i = 0; i < ofproto->n_tables; i++) {
> -                atomic_store_relaxed(&ofproto->tables[i].miss_config,
> -                                     tm->miss_config);
> +    }
> +
> +    /* Don't allow the eviction flags to be changed (except to the only fixed
> +     * value that OVS supports).  OF1.4 says this is normal: "The
> +     * OFPTMPT_EVICTION property usually cannot be modified using a
> +     * OFP_TABLE_MOD request, because the eviction mechanism is switch
> +     * defined". */
> +    if (tm->eviction_flags != UINT32_MAX
> +        && tm->eviction_flags != OFPROTO_EVICTION_FLAGS) {
> +        return OFPERR_OFPTMFC_BAD_CONFIG;
> +    }
> +
> +    if (tm->table_id == OFPTT_ALL) {
> +        struct oftable *oftable;
> +        OFPROTO_FOR_EACH_TABLE (oftable, ofproto) {
> +            if (!(oftable->flags & (OFTABLE_HIDDEN | OFTABLE_READONLY))) {
> +                table_mod__(oftable, tm->miss, tm->eviction);
>             }
> -        } else {
> -            atomic_store_relaxed(&ofproto->tables[tm->table_id].miss_config,
> -                                 tm->miss_config);
>         }
> +    } else {
> +        struct oftable *oftable = &ofproto->tables[tm->table_id];
> +        if (oftable->flags & OFTABLE_READONLY) {
> +            return OFPERR_OFPTMFC_EPERM;
> +        }
> +        table_mod__(oftable, tm->miss, tm->eviction);
>     }
> +
>     return 0;
> }
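
The eviction_flags check in table_mod() boils down to "either the property
was absent, or it names exactly the one flag set the switch implements".  A
sketch of that predicate follows; the SKETCH_* names are mine, and the bit
values are assumed to match the OpenFlow 1.4 eviction flag bits rather than
taken from this patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit values for the OF1.4 eviction flags; names are local to this
 * sketch and stand in for OFPTMPEF14_*. */
#define SKETCH_EF_OTHER      (UINT32_C(1) << 0)
#define SKETCH_EF_IMPORTANCE (UINT32_C(1) << 1)
#define SKETCH_EF_LIFETIME   (UINT32_C(1) << 2)

/* Stand-in for OFPROTO_EVICTION_FLAGS. */
#define SKETCH_SUPPORTED_FLAGS \
    (SKETCH_EF_OTHER | SKETCH_EF_IMPORTANCE | SKETCH_EF_LIFETIME)

/* UINT32_MAX is the "property absent" sentinel, so it must never be a
 * legitimate flag combination. */
static_assert(SKETCH_SUPPORTED_FLAGS != UINT32_MAX,
              "sentinel collides with the supported flag set");

/* Returns true if a table mod carrying 'flags' should be accepted. */
static bool
sketch_eviction_flags_ok(uint32_t flags)
{
    return flags == UINT32_MAX || flags == SKETCH_SUPPORTED_FLAGS;
}
```

Any other value, including 0 or a strict subset of the supported bits, would
be rejected, which corresponds to the OFPERR_OFPTMFC_BAD_CONFIG return in the
hunk.
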
> 
> @@ -7231,7 +7278,7 @@ choose_rule_to_evict(struct oftable *table, struct rule **rulep)
>     struct eviction_group *evg;
> 
>     *rulep = NULL;
> -    if (!table->eviction_fields) {
> +    if (!table->eviction) {
>         return false;
>     }
> 
> @@ -7452,7 +7499,7 @@ eviction_group_add_rule(struct rule *rule)
>      * so no additional protection is needed. */
>     has_timeout = rule->hard_timeout || rule->idle_timeout;
> 
> -    if (table->eviction_fields && has_timeout) {
> +    if (table->eviction && has_timeout) {
>         struct eviction_group *evg;
> 
>         evg = eviction_group_find(table, eviction_group_hash_rule(rule));
> @@ -7474,6 +7521,8 @@ oftable_init(struct oftable *table)
>     classifier_init(&table->cls, flow_segment_u64s);
>     table->max_flows = UINT_MAX;
>     table->n_flows = 0;
> +    hmap_init(&table->eviction_groups_by_id);
> +    heap_init(&table->eviction_groups_by_size);
>     atomic_init(&table->miss_config, OFPUTIL_TABLE_MISS_DEFAULT);
> 
>     classifier_set_prefix_fields(&table->cls, default_prefix_fields,
> @@ -7490,9 +7539,13 @@ static void
> oftable_destroy(struct oftable *table)
> {
>     ovs_assert(classifier_is_empty(&table->cls));
> +
>     ovs_mutex_lock(&ofproto_mutex);
> -    oftable_disable_eviction(table);
> +    oftable_configure_eviction(table, 0, NULL, 0);
>     ovs_mutex_unlock(&ofproto_mutex);
> +
> +    hmap_destroy(&table->eviction_groups_by_id);
> +    heap_destroy(&table->eviction_groups_by_size);
>     classifier_destroy(&table->cls);
>     free(table->name);
> }
> @@ -7520,60 +7573,56 @@ oftable_set_name(struct oftable *table, const char *name)
> /* oftables support a choice of two policies when adding a rule would cause the
>  * number of flows in the table to exceed the configured maximum number: either
>  * they can refuse to add the new flow or they can evict some existing flow.
> - * This function configures the former policy on 'table'. */
> -static void
> -oftable_disable_eviction(struct oftable *table)
> -    OVS_REQUIRES(ofproto_mutex)
> -{
> -    if (table->eviction_fields) {
> -        struct eviction_group *evg, *next;
> -
> -        HMAP_FOR_EACH_SAFE (evg, next, id_node,
> -                            &table->eviction_groups_by_id) {
> -            eviction_group_destroy(table, evg);
> -        }
> -        hmap_destroy(&table->eviction_groups_by_id);
> -        heap_destroy(&table->eviction_groups_by_size);
> -
> -        free(table->eviction_fields);
> -        table->eviction_fields = NULL;
> -        table->n_eviction_fields = 0;
> -    }
> -}
> -
> -/* oftables support a choice of two policies when adding a rule would cause the
> - * number of flows in the table to exceed the configured maximum number: either
> - * they can refuse to add the new flow or they can evict some existing flow.
>  * This function configures the latter policy on 'table', with fairness based
>  * on the values of the 'n_fields' fields specified in 'fields'.  (Specifying
>  * 'n_fields' as 0 disables fairness.) */
> static void
> -oftable_enable_eviction(struct oftable *table,
> -                        const struct mf_subfield *fields, size_t n_fields)
> +oftable_configure_eviction(struct oftable *table, unsigned int eviction,
> +                           const struct mf_subfield *fields, size_t n_fields)
>     OVS_REQUIRES(ofproto_mutex)
> {
>     struct rule *rule;
> 
> -    if (table->eviction_fields
> +    if ((table->eviction != 0) == (eviction != 0)
>         && n_fields == table->n_eviction_fields
>         && (!n_fields
>             || !memcmp(fields, table->eviction_fields,
>                        n_fields * sizeof *fields))) {
> -        /* No change. */
> +        /* The set of eviction fields did not change.  If 'eviction' changed,
> +         * it remains nonzero, so that we can just update table->eviction
> +         * without fussing with the eviction groups. */
> +        table->eviction = eviction;
>         return;
>     }
> 
> -    oftable_disable_eviction(table);
> -
> -    table->n_eviction_fields = n_fields;
> -    table->eviction_fields = xmemdup(fields, n_fields * sizeof *fields);
> -
> -    table->eviction_group_id_basis = random_uint32();
> +    /* Destroy existing eviction groups, then destroy and recreate data
> +     * structures to recover memory. */
> +    struct eviction_group *evg, *next;
> +    HMAP_FOR_EACH_SAFE (evg, next, id_node, &table->eviction_groups_by_id) {
> +        eviction_group_destroy(table, evg);
> +    }
> +    hmap_destroy(&table->eviction_groups_by_id);
>     hmap_init(&table->eviction_groups_by_id);
> +    heap_destroy(&table->eviction_groups_by_size);
>     heap_init(&table->eviction_groups_by_size);
> 
> -    CLS_FOR_EACH (rule, cr, &table->cls) {
> -        eviction_group_add_rule(rule);
> +    /* Replace eviction groups by the new ones, if there is a change.  Free the
> +     * old fields only after allocating the new ones, because 'fields ==
> +     * table->eviction_fields' is possible. */
> +    struct mf_subfield *old_fields = table->eviction_fields;
> +    table->n_eviction_fields = n_fields;
> +    table->eviction_fields = (fields
> +                              ? xmemdup(fields, n_fields * sizeof *fields)
> +                              : NULL);
> +    free(old_fields);
> +
> +    /* Add the new eviction groups, if enabled. */
> +    table->eviction = eviction;
> +    if (table->eviction) {
> +        table->eviction_group_id_basis = random_uint32();
> +        CLS_FOR_EACH (rule, cr, &table->cls) {
> +            eviction_group_add_rule(rule);
> +        }
>     }
> }
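
The "free the old fields only after allocating the new ones" ordering matters
because table_mod__() passes table->eviction_fields back in as 'fields'.  A
stand-alone sketch of the same copy-then-free pattern, using a hypothetical
struct and plain malloc() instead of xmemdup():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the relevant part of struct oftable. */
struct sketch_table {
    int *fields;        /* Plays the role of table->eviction_fields. */
    size_t n_fields;    /* Plays the role of table->n_eviction_fields. */
};

/* Replaces t->fields with a private copy of 'fields'.  This is safe even
 * when 'fields' aliases t->fields, because the copy is made before the old
 * array is freed.  (Unlike the patch, this sketch records a count of 0 when
 * no array is stored.) */
static void
sketch_set_fields(struct sketch_table *t, const int *fields, size_t n_fields)
{
    int *old_fields = t->fields;
    int *new_fields = NULL;

    if (fields && n_fields) {
        new_fields = malloc(n_fields * sizeof *new_fields);
        assert(new_fields != NULL);  /* Sketch only; OVS would use xmemdup(). */
        memcpy(new_fields, fields, n_fields * sizeof *new_fields);
    }
    t->fields = new_fields;
    t->n_fields = new_fields ? n_fields : 0;
    free(old_fields);
}
```

Freeing first and copying second would read freed memory in the aliasing
case, so the ordering in the hunk looks right to me.
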
> 
> diff --git a/ofproto/ofproto.h b/ofproto/ofproto.h
> index 7dc1874..7504027 100644
> --- a/ofproto/ofproto.h
> +++ b/ofproto/ofproto.h
> @@ -1,5 +1,5 @@
> /*
> - * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014 Nicira, Inc.
> + * Copyright (c) 2009, 2010, 2011, 2012, 2013, 2014, 2015 Nicira, Inc.
>  *
>  * Licensed under the Apache License, Version 2.0 (the "License");
>  * you may not use this file except in compliance with the License.
> @@ -459,14 +459,18 @@ struct ofproto_table_settings {
>     char *name;                 /* Name exported via OpenFlow or NULL. */
>     unsigned int max_flows;     /* Maximum number of flows or UINT_MAX. */
> 
> -    /* These members determine the handling of an attempt to add a flow that
> -     * would cause the table to have more than 'max_flows' flows.
> +    /* These members, together with OpenFlow OFPT_TABLE_MOD, determine the
> +     * handling of an attempt to add a flow that would cause the table to have
> +     * more than 'max_flows' flows:
>      *
> -     * If 'groups' is NULL, overflows will be rejected with an error.
> +     *    - If 'enable_eviction' is false and OFPT_TABLE_MOD does not enable
> +     *      eviction, overflows will be rejected with an error.
>      *
> -     * If 'groups' is nonnull, an overflow will cause a flow to be removed.
> -     * The flow to be removed is chosen to give fairness among groups
> -     * distinguished by different values for the subfields within 'groups'. */
> +     *    - If 'enable_eviction' is true or OFPT_TABLE_MOD enables eviction, an
> +     *      overflow will cause a flow to be removed.  The flow to be removed
> +     *      is chosen to give fairness among groups distinguished by different
> +     *      values for the 'n_groups' subfields within 'groups'. */
> +    bool enable_eviction;
>     struct mf_subfield *groups;
>     size_t n_groups;
> 
> diff --git a/tests/ofp-print.at b/tests/ofp-print.at
> index 39a5bbb..e08a201 100644
> --- a/tests/ofp-print.at
> +++ b/tests/ofp-print.at
> @@ -1132,7 +1132,7 @@ AT_KEYWORDS([ofp-print])
> AT_CHECK([ovs-ofctl ofp-print "\
> 05 11 00 10 00 00 00 02 02 00 00 00 00 00 00 00 \
> " 3], [0], [dnl
> -OFPT_TABLE_MOD (OF1.4) (xid=0x2): table_id=2
> +OFPT_TABLE_MOD (OF1.4) (xid=0x2): table_id=2, eviction=off
> ])
> AT_CLEANUP
> 
> diff --git a/tests/ofproto.at b/tests/ofproto.at
> index 3bd625d..54c077d 100644
> --- a/tests/ofproto.at
> +++ b/tests/ofproto.at
> @@ -1869,11 +1869,13 @@ OVS_VSWITCHD_START
> # Configure a maximum of 4 flows.
> AT_CHECK(
>   [ovs-vsctl \
> -     -- --id=@t0 create Flow_Table name=evict flow-limit=4 overflow-policy=evict \
> +     -- --id=@t0 create Flow_Table name=evict flow-limit=4 \
>      -- set bridge br0 flow_tables:0=@t0 \
>    | ${PERL} $srcdir/uuidfilt.pl],
>   [0], [<0>
> ])
> +# Use mod-table to turn on eviction just to demonstrate that it works.
> +AT_CHECK([ovs-ofctl -O OpenFlow14 mod-table br0 0 evict])
> # Add 4 flows.
> for in_port in 4 3 2 1; do
>     ovs-ofctl -O Openflow14 add-flow br0 importance=$((in_port + 30)),priority=$((in_port + 5)),hard_timeout=$((in_port + 500)),actions=drop
> @@ -1895,7 +1897,7 @@ AT_CHECK([ovs-ofctl -O Openflow14 dump-flows br0 | ofctl_strip | sort], [0], [dn
> OFPST_FLOW reply (OF1.4):
> ])
> # Disable the Eviction configuration.
> -AT_CHECK([ovs-vsctl set Flow_Table evict overflow-policy=refuse])
> +AT_CHECK([ovs-ofctl -O OpenFlow14 mod-table br0 0 noevict])
> # Adding another flow will cause the system to give error for FULL TABLE.
> AT_CHECK([ovs-ofctl -O Openflow14 add-flow br0 hard_timeout=506,importance=36,priority=11,actions=drop],[1], [], [stderr])
> AT_CHECK([head -n 1 stderr | ofctl_strip], [0],
> @@ -1923,23 +1925,6 @@ AT_CHECK([ovs-ofctl mod-flows br0 in_port=5,actions=drop], [1], [], [stderr])
> AT_CHECK([head -n 1 stderr | ofctl_strip], [0],
>   [OFPT_ERROR: OFPFMFC_TABLE_FULL
> ])
> -# Now set the eviction on timeout basis.
> -AT_CHECK(
> -  [ovs-vsctl \
> -     -- --id=@t0 create Flow_Table flow-limit=4 overflow-policy=evict \
> -     -- set bridge br0 flow_tables:0=@t0 \
> -   | ${PERL} $srcdir/uuidfilt.pl],
> -  [0], [<0>
> -])
> -#Now add a new flow
> -AT_CHECK([ovs-ofctl -O Openflow14 add-flow br0 importance=37,hard_timeout=507,priority=11,in_port=6,actions=drop])
> -AT_CHECK([ovs-ofctl -O Openflow14 dump-flows br0 | ofctl_strip | sort], [0], [dnl
> - hard_timeout=503, importance=33, priority=8 actions=drop
> - hard_timeout=504, importance=34, priority=9 actions=drop
> - hard_timeout=505, importance=35, priority=10,in_port=2 actions=NORMAL
> - hard_timeout=507, importance=37, priority=11,in_port=6 actions=drop
> -OFPST_FLOW reply (OF1.4):
> -])
> OVS_VSWITCHD_STOP
> AT_CLEANUP
> 
> diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in
> index 395d851..058d205 100644
> --- a/utilities/ovs-ofctl.8.in
> +++ b/utilities/ovs-ofctl.8.in
> @@ -62,20 +62,13 @@ Prints to the console statistics for each of the flow tables used by
> \fBdump\-table\-features \fIswitch\fR
> Prints to the console features for each of the flow tables used by
> \fIswitch\fR.
> -.
> -.IP "\fBmod\-table \fIswitch\fR \fItable_id\fR  \fIflow_miss_handling\fR"
> -An OpenFlow 1.0 switch looks up each packet that arrives at the switch
> -in table 0, then in table 1 if there is no match in table 0, then in
> -table 2, and so on until the packet finds a match in some table.
> -Finally, if no match was found, the switch sends the packet to the
> -controller
> -.IP
> -OpenFlow 1.1 and later offer more flexibility.  This command
> -configures the flow table miss handling configuration for table
> -\fItable_id\fR in \fIswitch\fR.  \fItable_id\fR may be an OpenFlow
> -table number between 0 and 254, inclusive, or the keyword \fBALL\fR to
> -modify all tables.  \fIflow_miss_handling\fR may be any one of the
> -following:
> +.IP "\fBmod\-table \fIswitch\fR \fItable_id\fR \fIsetting\fR"
> +This command configures flow table settings for OpenFlow table
> +\fItable_id\fR within \fIswitch\fR.  The available settings depend on
> +the OpenFlow version in use.  In OpenFlow 1.1 and 1.2 (which must be
> +enabled with the \fB\-O\fR option) only, \fBmod\-table\fR configures
> +behavior when no flow is found when a packet is looked up in a flow
> +table.  The following \fIsetting\fR values are available:
> .RS
> .IP \fBdrop\fR
> Drop the packet.
> @@ -87,6 +80,21 @@ tables other than the last one.)
> Send to controller.  (This is how an OpenFlow 1.0 switch always
> handles packets that do not match any flow in the last table.)
> .RE
> +.IP
> +In OpenFlow 1.4 and later (which must be enabled with the \fB\-O\fR
> +option) only, \fBmod\-table\fR configures the behavior when a
> +controller attempts to add a flow to a flow table that is full.  The
> +following \fIsetting\fR values are available:
> +.RS
> +.IP \fBevict\fR
> +Delete some existing flow from the flow table, according to the
> +algorithm described for the \fBFlow_Table\fR table in
> +\fBovs-vswitchd.conf.db\fR(5).
> +.IP \fBnoevict\fR
> +Refuse to add the new flow.  (Eviction might still be enabled through
> +the \fBoverflow_policy\fR column in the \fBFlow_Table\fR table
> +documented in \fBovs-vswitchd.conf.db\fR(5).)
> +.RE
> .
> .TP
> \fBdump\-ports \fIswitch\fR [\fInetdev\fR]
> diff --git a/utilities/ovs-ofctl.c b/utilities/ovs-ofctl.c
> index 6ef7070..f16cc25 100644
> --- a/utilities/ovs-ofctl.c
> +++ b/utilities/ovs-ofctl.c
> @@ -340,8 +340,11 @@ usage(void)
>            "  dump-table-features SWITCH  print table features\n"
>            "  mod-port SWITCH IFACE ACT   modify port behavior\n"
>            "  mod-table SWITCH MOD        modify flow table behavior\n"
> +           "      OF1.1/1.2 MOD: controller, continue, drop\n"
> +           "      OF1.4+ MOD: evict, noevict\n"
>            "  get-frags SWITCH            print fragment handling behavior\n"
>            "  set-frags SWITCH FRAG_MODE  set fragment handling behavior\n"
> +           "      FRAG_MODE: normal, drop, reassemble, nx-match\n"
>            "  dump-ports SWITCH [PORT]    print port statistics\n"
>            "  dump-ports-desc SWITCH [PORT]  print port descriptions\n"
>            "  dump-flows SWITCH           print all flow entries\n"
> @@ -1848,35 +1851,28 @@ found:
> static void
> ofctl_mod_table(struct ovs_cmdl_context *ctx)
> {
> -    enum ofputil_protocol protocol, usable_protocols;
> +    uint32_t usable_versions;
>     struct ofputil_table_mod tm;
>     struct vconn *vconn;
>     char *error;
> -    int i;
> 
> -    error = parse_ofp_table_mod(&tm, ctx->argv[2], ctx->argv[3], &usable_protocols);
> +    error = parse_ofp_table_mod(&tm, ctx->argv[2], ctx->argv[3],
> +                                &usable_versions);
>     if (error) {
>         ovs_fatal(0, "%s", error);
>     }
> 
> -    protocol = open_vconn(ctx->argv[1], &vconn);
> -    if (!(protocol & usable_protocols)) {
> -        for (i = 0; i < sizeof(enum ofputil_protocol) * CHAR_BIT; i++) {
> -            enum ofputil_protocol f = 1 << i;
> -            if (f != protocol
> -                && f & usable_protocols
> -                && try_set_protocol(vconn, f, &protocol)) {
> -                protocol = f;
> -                break;
> -            }
> -        }
> -    }
> -
> -    if (!(protocol & usable_protocols)) {
> -        char *usable_s = ofputil_protocols_to_string(usable_protocols);
> -        ovs_fatal(0, "Switch does not support table mod message(%s)", usable_s);
> +    uint32_t allowed_versions = get_allowed_ofp_versions();
> +    if (!(allowed_versions & usable_versions)) {
> +        struct ds versions = DS_EMPTY_INITIALIZER;
> +        ofputil_format_version_bitmap_names(&versions, usable_versions);
> +        ovs_fatal(0, "table_mod '%s' requires one of the OpenFlow "
> +                  "versions %s but none is enabled (use -O)",
> +                  ctx->argv[3], ds_cstr(&versions));
>     }
> +    mask_allowed_ofp_versions(usable_versions);
> 
> +    enum ofputil_protocol protocol = open_vconn(ctx->argv[1], &vconn);
>     transact_noreply(vconn, ofputil_encode_table_mod(&tm, protocol));
>     vconn_close(vconn);
> }
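
The new version handling in ofctl_mod_table() is an intersection test between
the versions the user allowed with -O and the versions in which the requested
setting exists.  A sketch under the assumption that bit N of a bitmap stands
for OpenFlow wire version N (0x02 and 0x03 for OF1.1 and OF1.2, 0x05 for
OF1.4); all sketch_* and SKETCH_* names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed encoding for this sketch: bit 'wire' is set when that OpenFlow
 * wire version is in the set. */
#define SKETCH_VERSION_BIT(wire) (UINT32_C(1) << (wire))

#define SKETCH_OF10 SKETCH_VERSION_BIT(0x01)
#define SKETCH_OF11 SKETCH_VERSION_BIT(0x02)
#define SKETCH_OF12 SKETCH_VERSION_BIT(0x03)
#define SKETCH_OF13 SKETCH_VERSION_BIT(0x04)
#define SKETCH_OF14 SKETCH_VERSION_BIT(0x05)

/* True if at least one allowed version can express the setting. */
static bool
sketch_versions_compatible(uint32_t allowed, uint32_t usable)
{
    return (allowed & usable) != 0;
}

/* Narrows the allowed set to versions that can express the setting, in the
 * spirit of mask_allowed_ofp_versions(). */
static uint32_t
sketch_mask_allowed(uint32_t allowed, uint32_t usable)
{
    return allowed & usable;
}
```

In this model "evict" is usable only from OF1.4 on, so a client that allows
just OF1.0 and OF1.3 fails the check, and the error message is most useful
when it lists the usable versions rather than the allowed ones.
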
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index d48cf7f..dfb581e 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -3603,6 +3603,7 @@ bridge_configure_tables(struct bridge *br)
>         s.name = NULL;
>         s.max_flows = UINT_MAX;
>         s.groups = NULL;
> +        s.enable_eviction = false;
>         s.n_groups = 0;
>         s.n_prefix_fields = 0;
>         memset(s.prefix_fields, ~0, sizeof(s.prefix_fields));
> @@ -3614,9 +3615,10 @@ bridge_configure_tables(struct bridge *br)
>             if (cfg->n_flow_limit && *cfg->flow_limit < UINT_MAX) {
>                 s.max_flows = *cfg->flow_limit;
>             }
> -            if (cfg->overflow_policy
> -                && !strcmp(cfg->overflow_policy, "evict")) {
> 
> +            s.enable_eviction = (cfg->overflow_policy
> +                                 && !strcmp(cfg->overflow_policy, "evict"));
> +            if (cfg->n_groups) {
>                 s.groups = xmalloc(cfg->n_groups * sizeof *s.groups);
>                 for (k = 0; k < cfg->n_groups; k++) {
>                     const char *string = cfg->groups[k];
> @@ -3636,6 +3638,7 @@ bridge_configure_tables(struct bridge *br)
>                     }
>                 }
>             }
> +
>             /* Prefix lookup fields. */
>             s.n_prefix_fields = 0;
>             for (k = 0; k < cfg->n_prefixes; k++) {
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index 9f108f0..483a9de 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -3039,46 +3039,27 @@
>       dump-tables</code>.  The name does not affect switch behavior.
>     </column>
> 
> -    <column name="flow_limit">
> -      If set, limits the number of flows that may be added to the table.  Open
> -      vSwitch may limit the number of flows in a table for other reasons,
> -      e.g. due to hardware limitations or for resource availability or
> -      performance reasons.
> -    </column>
> -
> -    <column name="overflow_policy">
> +    <group title="Eviction Policy">
>       <p>
> -        Controls the switch's behavior when an OpenFlow flow table modification
> -        request would add flows in excess of <ref column="flow_limit"/>.  The
> -        supported values are:
> +        Open vSwitch supports limiting the number of flows that may be
> +        installed in a flow table, via the <ref column="flow_limit"/> column.
> +        When adding a flow would exceed this limit, by default Open vSwitch
> +        reports an error, but there are two ways to configure Open vSwitch to
> +        instead delete (``evict'') a flow to make room for the new one:
>       </p>
> 
> -      <dl>
> -        <dt><code>refuse</code></dt>
> -        <dd>
> -          Refuse to add the flow or flows.  This is also the default policy
> -          when <ref column="overflow_policy"/> is unset.
> -        </dd>
> -
> -        <dt><code>evict</code></dt>
> -        <dd>
> -          Delete the flow that will expire soonest.  See <ref column="groups"/>
> -          for details.
> -        </dd>
> -      </dl>
> -    </column>
> +      <ul>
> +        <li>
> +          Set the <ref column="overflow_policy"/> column to <code>evict</code>.
> +        </li>
> 
> -    <column name="groups">
> -      <p>
> -        When <ref column="overflow_policy"/> is <code>evict</code>, this
> -        controls how flows are chosen for eviction when the flow table would
> -        otherwise exceed <ref column="flow_limit"/> flows.  Its value is a set
> -        of NXM fields or sub-fields, each of which takes one of the forms
> -        <code><var>field</var>[]</code> or
> -        <code><var>field</var>[<var>start</var>..<var>end</var>]</code>,
> -        e.g. <code>NXM_OF_IN_PORT[]</code>.  Please see
> -        <code>nicira-ext.h</code> for a complete list of NXM field names.
> -      </p>
> +        <li>
> +          Send an OpenFlow 1.4+ ``table mod request'' to enable eviction for
> +          the flow table (e.g. <code>ovs-ofctl -O OpenFlow14 mod-table br0 0
> +          evict</code> to enable eviction on flow table 0 of bridge
> +          <code>br0</code>).
> +        </li>
> +      </ul>
> 
>       <p>
>         When a flow must be evicted due to overflow, the flow to evict is
> @@ -3118,95 +3099,138 @@
>       </ol>
> 
>       <p>
> -        The eviction process only considers flows that have an idle timeout or
> -        a hard timeout.  That is, eviction never deletes permanent flows.
> +        The eviction process only considers flows that have an idle timeout
> +        or a hard timeout.  That is, eviction never deletes permanent flows.
>         (Permanent flows do count against <ref column="flow_limit"/>.)
>       </p>
> 
> -      <p>
> -        Open vSwitch ignores any invalid or unknown field specifications.
> -      </p>
> +      <column name="flow_limit">
> +        If set, limits the number of flows that may be added to the table.
> +        Open vSwitch may limit the number of flows in a table for other
> +        reasons, e.g. due to hardware limitations or for resource availability
> +        or performance reasons.
> +      </column>
> 
> -      <p>
> -        When <ref column="overflow_policy"/> is not <code>evict</code>, this
> -        column has no effect.
> -      </p>
> -    </column>
> +      <column name="overflow_policy">
> +        <p>
> +          Controls the switch's behavior when an OpenFlow flow table
> +          modification request would add flows in excess of <ref
> +          column="flow_limit"/>.  The supported values are:
> +        </p>
> 
> -    <column name="prefixes">
> -      <p>
> -        This string set specifies which fields should be used for
> -        address prefix tracking.  Prefix tracking allows the
> -        classifier to skip rules with longer than necessary prefixes,
> -        resulting in better wildcarding for datapath flows.
> -      </p>
> -      <p>
> -        Prefix tracking may be beneficial when a flow table contains
> -        matches on IP address fields with different prefix lengths.
> -        For example, when a flow table contains IP address matches on
> -        both full addresses and proper prefixes, the full address
> -        matches will typically cause the datapath flow to un-wildcard
> -        the whole address field (depending on flow entry priorities).
> -        In this case each packet with a different address gets handed
> -        to the userspace for flow processing and generates its own
> -        datapath flow.  With prefix tracking enabled for the address
> -        field in question packets with addresses matching shorter
> -        prefixes would generate datapath flows where the irrelevant
> -        address bits are wildcarded, allowing the same datapath flow
> -        to handle all the packets within the prefix in question.  In
> -        this case many userspace upcalls can be avoided and the
> -        overall performance can be better.
> -      </p>
> -      <p>
> -        This is a performance optimization only, so packets will
> -        receive the same treatment with or without prefix tracking.
> -      </p>
> -      <p>
> -        The supported fields are: <code>tun_id</code>,
> -        <code>tun_src</code>, <code>tun_dst</code>,
> -        <code>nw_src</code>, <code>nw_dst</code> (or aliases
> -        <code>ip_src</code> and <code>ip_dst</code>),
> -        <code>ipv6_src</code>, and <code>ipv6_dst</code>.  (Using this
> -        feature for <code>tun_id</code> would only make sense if the
> -        tunnel IDs have prefix structure similar to IP addresses.)
> -      </p>
> +        <dl>
> +          <dt><code>refuse</code></dt>
> +          <dd>
> +            Refuse to add the flow or flows.  This is also the default policy
> +            when <ref column="overflow_policy"/> is unset.
> +          </dd>
> 
> -      <p>
> -        By default, the <code>prefixes=ip_dst,ip_src</code> are used
> -        on each flow table.  This instructs the flow classifier to
> -        track the IP destination and source addresses used by the
> -        rules in this specific flow table.
> -      </p>
> +          <dt><code>evict</code></dt>
> +          <dd>
> +            Delete a flow chosen according to the algorithm described above.
> +          </dd>
> +        </dl>
> +      </column>
> 
> -      <p>
> -        The keyword <code>none</code> is recognized as an explicit
> -        override of the default values, causing no prefix fields to be
> -        tracked.
> -      </p>
> +      <column name="groups">
> +        <p>
> +          When <ref column="overflow_policy"/> is <code>evict</code>, this
> +          controls how flows are chosen for eviction when the flow table would
> +          otherwise exceed <ref column="flow_limit"/> flows.  Its value is a
> +          set of NXM fields or sub-fields, each of which takes one of the forms
> +          <code><var>field</var>[]</code> or
> +          <code><var>field</var>[<var>start</var>..<var>end</var>]</code>,
> +          e.g. <code>NXM_OF_IN_PORT[]</code>.  Please see
> +          <code>nicira-ext.h</code> for a complete list of NXM field names.
> +        </p>
> 
> -      <p>
> -        To set the prefix fields, the flow table record needs to
> -        exist:
> -      </p>
> +        <p>
> +          Open vSwitch ignores any invalid or unknown field specifications.
> +        </p>
> 
> -      <dl>
> -        <dt><code>ovs-vsctl set Bridge br0 flow_tables:0=@N1 -- --id=@N1 create Flow_Table name=table0</code></dt>
> -        <dd>
> -          Creates a flow table record for the OpenFlow table number 0.
> -        </dd>
> +        <p>
> +          When eviction is not enabled, via <ref column="overflow_policy"/> or
> +          an OpenFlow 1.4+ ``table mod,'' this column has no effect.
> +        </p>
> +      </column>
> +    </group>
> 
> -        <dt><code>ovs-vsctl set Flow_Table table0 prefixes=ip_dst,ip_src</code></dt>
> -        <dd>
> -          Enables prefix tracking for IP source and destination
> -          address fields.
> -        </dd>
> -      </dl>
> +    <group title="Classifier Optimization">
> +      <column name="prefixes">
> +        <p>
> +          This string set specifies which fields should be used for
> +          address prefix tracking.  Prefix tracking allows the
> +          classifier to skip rules with longer than necessary prefixes,
> +          resulting in better wildcarding for datapath flows.
> +        </p>
> +        <p>
> +          Prefix tracking may be beneficial when a flow table contains
> +          matches on IP address fields with different prefix lengths.
> +          For example, when a flow table contains IP address matches on
> +          both full addresses and proper prefixes, the full address
> +          matches will typically cause the datapath flow to un-wildcard
> +          the whole address field (depending on flow entry priorities).
> +          In this case each packet with a different address gets handed
> +          to the userspace for flow processing and generates its own
> +          datapath flow.  With prefix tracking enabled for the address
> +          field in question packets with addresses matching shorter
> +          prefixes would generate datapath flows where the irrelevant
> +          address bits are wildcarded, allowing the same datapath flow
> +          to handle all the packets within the prefix in question.  In
> +          this case many userspace upcalls can be avoided and the
> +          overall performance can be better.
> +        </p>
> +        <p>
> +          This is a performance optimization only, so packets will
> +          receive the same treatment with or without prefix tracking.
> +        </p>
> +        <p>
> +          The supported fields are: <code>tun_id</code>,
> +          <code>tun_src</code>, <code>tun_dst</code>,
> +          <code>nw_src</code>, <code>nw_dst</code> (or aliases
> +          <code>ip_src</code> and <code>ip_dst</code>),
> +          <code>ipv6_src</code>, and <code>ipv6_dst</code>.  (Using this
> +          feature for <code>tun_id</code> would only make sense if the
> +          tunnel IDs have prefix structure similar to IP addresses.)
> +        </p>
> 
> -      <p>
> -        There is a maximum number of fields that can be enabled for any
> -        one flow table.  Currently this limit is 3.
> -      </p>
> -    </column>
> +        <p>
> +          By default, the <code>prefixes=ip_dst,ip_src</code> are used
> +          on each flow table.  This instructs the flow classifier to
> +          track the IP destination and source addresses used by the
> +          rules in this specific flow table.
> +        </p>
> +
> +        <p>
> +          The keyword <code>none</code> is recognized as an explicit
> +          override of the default values, causing no prefix fields to be
> +          tracked.
> +        </p>
> +
> +        <p>
> +          To set the prefix fields, the flow table record needs to
> +          exist:
> +        </p>
> +
> +        <dl>
> +          <dt><code>ovs-vsctl set Bridge br0 flow_tables:0=@N1 -- --id=@N1 create Flow_Table name=table0</code></dt>
> +          <dd>
> +            Creates a flow table record for the OpenFlow table number 0.
> +          </dd>
> +
> +          <dt><code>ovs-vsctl set Flow_Table table0 prefixes=ip_dst,ip_src</code></dt>
> +          <dd>
> +            Enables prefix tracking for IP source and destination
> +            address fields.
> +          </dd>
> +        </dl>
> +
> +        <p>
> +          There is a maximum number of fields that can be enabled for any
> +          one flow table.  Currently this limit is 3.
> +        </p>
> +      </column>
> +    </group>
> 
>     <group title="Common Columns">
>       The overall purpose of these columns is described under <code>Common
> -- 
> 2.1.3
> 
> _______________________________________________
> dev mailing list
> [email protected]
> http://openvswitch.org/mailman/listinfo/dev

