>
> On 02/10/2017 10:57 AM, Ciara Loftus wrote:
> > Unconditional insertion of EMC entries results in EMC thrashing at high
> > numbers of parallel flows. When this occurs, the performance of the EMC
> > often falls below that of the dpcls classifier, rendering the EMC
> > practically useless.
> >
> > Instead of unconditionally inserting entries into the EMC when a miss
> > occurs, use a 1% probability of insertion. This ensures that the most
> > frequent flows have the highest chance of creating an entry in the EMC,
> > and the probability of thrashing the EMC is also greatly reduced.
> >
> > The probability of insertion is configurable, via the
> > other_config:emc-insert-prob option. For example the following command
> > increases the insertion probability to 10%.
> >
> > ovs-vsctl set Open_vSwitch . other_config:emc-insert-prob=10
> >
> > Signed-off-by: Ciara Loftus <[email protected]>
> > Signed-off-by: Georg Schmuecking <[email protected]>
> > Co-authored-by: Georg Schmuecking <[email protected]>
> > ---
> > v8:
> > - Floating point precision percentiles
> > - Moved NEWS entry to post-2.7 section and is no longer in the DPDK
> > specific section.
> > v7:
> > - Remove further code duplication
> > v6:
> > - Refactor the code to remove duplication around calls to
> > emc_probabilistic_insert()
> > v5:
> > - Use percentiles for emc-insert-prob (0-100%)
> > - Update docs to reflect the option not exclusive to the DPDK datapath.
> > v4:
> > - Added Georg to Authors file
> > - Set emc-insert-prob=1 for 'PMD - stats' unit test
> > - Use read_relaxed on atomic var
> > - Correctly allow for 0 and 100% probababilites
> > - Cache align new element in dp_netdev struct
> > - Revert to default probability if invalid emc-insert-prob set
> > - Allow configurability for non-DPDK case
> > v3:
> > - Use new dpif other_config infrastructure to tidy up how the
> > emc-insert-prob value is passed to dpif-netdev.
> > v2:
> > - Enable probability configurability via other_config:emc-insert-prob
> > option.
> >
> > AUTHORS.rst | 1 +
> > Documentation/howto/dpdk.rst | 19 +++++++++++++++++
> > NEWS | 2 ++
> > lib/dpif-netdev.c | 50
> ++++++++++++++++++++++++++++++++++++++++----
> > lib/smap.c | 10 +++++++++
> > lib/smap.h | 1 +
> > tests/pmd.at | 1 +
> > vswitchd/vswitch.xml | 13 ++++++++++++
> > 8 files changed, 93 insertions(+), 4 deletions(-)
> >
> > diff --git a/AUTHORS.rst b/AUTHORS.rst
> > index b567fcc..3382ad8 100644
> > --- a/AUTHORS.rst
> > +++ b/AUTHORS.rst
> > @@ -382,6 +382,7 @@ Eric Lopez [email protected]
> > Frido Roose [email protected]
> > Gaetano Catalli [email protected]
> > Gavin Remaley [email protected]
> > +Georg Schmuecking [email protected]
> > George Shuklin [email protected]
> > Gerald Rogers [email protected]
> > Ghanem Bahri [email protected]
> > diff --git a/Documentation/howto/dpdk.rst
> b/Documentation/howto/dpdk.rst
> > index d1e6e89..9e0dbeb 100644
> > --- a/Documentation/howto/dpdk.rst
> > +++ b/Documentation/howto/dpdk.rst
> > @@ -354,6 +354,25 @@ the `DPDK documentation
> >
> > Note: Not all DPDK virtual PMD drivers have been tested and verified to
> work.
> >
> > +EMC Insertion Probability
> > +-------------------------
> > +By default 1 in every 100 flows are inserted into the Exact Match Cache
> (EMC).
> > +It is possible to change this insertion probability by setting the
> > +``emc-insert-prob`` option::
> > +
> > + $ ovs-vsctl --no-wait set Open_vSwitch . other_config:emc-insert-
> prob=N
> > +
> > +where:
> > +
> > +``N``
> > + is a positive floating point number between 0 and 100 representing the
> > + percentage probability of EMC insertion.
> > +
> > +If ``N`` is set to 100, an insertion will be performed for every flow. If
> > set
> > +to 0, no insertions will be performed and the EMC will effectively be
> disabled.
> > +
> > +For more information on the EMC refer to :doc:`/intro/install/dpdk` .
> > +
> > .. _dpdk-ovs-in-guest:
> >
> > OVS with DPDK Inside VMs
> > diff --git a/NEWS b/NEWS
> > index aebd99c..5c9e628 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -3,6 +3,8 @@ Post-v2.7.0
> > - Tunnels:
> > * Added support to set packet mark for tunnel endpoint using
> > `egress_pkt_mark` OVSDB option.
> > + - EMC insertion probability is reduced to 1% and is configurable via
> > + the new 'other_config:emc-insert-prob' option.
> >
> > v2.7.0 - xx xxx xxxx
> > ---------------------
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > index 0be5db5..c8cb9ee 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -144,6 +144,11 @@ struct netdev_flow_key {
> > #define EM_FLOW_HASH_MASK (EM_FLOW_HASH_ENTRIES - 1)
> > #define EM_FLOW_HASH_SEGS 2
> >
> > +/* Percentage probability of calling emc_insert */
> > +#define DEFAULT_EM_FLOW_INSERT_PROB 1
> > +#define DEFAULT_EM_FLOW_INSERT_MIN (UINT32_MAX / \
> > + (100 / DEFAULT_EM_FLOW_INSERT_PROB))
> > +
> > struct emc_entry {
> > struct dp_netdev_flow *flow;
> > struct netdev_flow_key key; /* key.hash used for emc hash value. */
> > @@ -254,6 +259,9 @@ struct dp_netdev {
> > uint64_t last_tnl_conf_seq;
> >
> > struct conntrack conntrack;
> > +
> > + /* Probability of EMC insertions is a factor of 'emc_insert_min'.*/
> > + OVS_ALIGNED_VAR(CACHE_LINE_SIZE)atomic_uint32_t
> emc_insert_min;
> > };
> >
> > static struct dp_netdev_port *dp_netdev_lookup_port(const struct
> dp_netdev *dp,
> > @@ -1066,6 +1074,8 @@ create_dp_netdev(const char *name, const
> struct dpif_class *class,
> >
> > conntrack_init(&dp->conntrack);
> >
> > + atomic_init(&dp->emc_insert_min,
> DEFAULT_EM_FLOW_INSERT_MIN);
> > +
> > cmap_init(&dp->poll_threads);
> > ovs_mutex_init_recursive(&dp->non_pmd_mutex);
> > ovsthread_key_create(&dp->per_pmd_key, NULL);
> > @@ -1943,6 +1953,27 @@ emc_insert(struct emc_cache *cache, const
> struct netdev_flow_key *key,
> > emc_change_entry(to_be_replaced, flow, key);
> > }
> >
> > +static inline void
> > +emc_probabilistic_insert(struct dp_netdev_pmd_thread *pmd,
> > + const struct netdev_flow_key *key,
> > + struct dp_netdev_flow *flow)
> > +{
> > + /* Insert an entry into the EMC based on probability value 'min'. By
> > + * default the value is UINT32_MAX / 100/1 which yields an insertion
> > + * probability of 1/100. */
> > +
> > + uint32_t min;
> > + atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
> > +
> > +#ifdef DPDK_NETDEV
> > + if (min && (key->hash ^ (uint32_t)pmd->last_cycles) <= min) {
> > +#else
> > + if (min && (key->hash ^ random_uint32()) <= min) {
> > +#endif
> > + emc_insert(&pmd->flow_cache, key, flow);
> > + }
> > +}
> > +
> > static inline struct dp_netdev_flow *
> > emc_lookup(struct emc_cache *cache, const struct netdev_flow_key
> *key)
> > {
> > @@ -2731,6 +2762,7 @@ dpif_netdev_set_config(struct dpif *dpif, const
> struct smap *other_config)
> > {
> > struct dp_netdev *dp = get_dp_netdev(dpif);
> > const char *cmask = smap_get(other_config, "pmd-cpu-mask");
> > + float insert_prob = smap_get_float(other_config, "emc-insert-prob", -
> 1);
> >
> > if (!nullable_string_is_equal(dp->pmd_cmask, cmask)) {
> > free(dp->pmd_cmask);
> > @@ -2738,6 +2770,18 @@ dpif_netdev_set_config(struct dpif *dpif, const
> struct smap *other_config)
> > dp_netdev_request_reconfigure(dp);
> > }
> >
> > + if (insert_prob >= 0 && insert_prob <= 100) {
> > + uint32_t insert_min = insert_prob == 0 ? 0 :
> > + UINT32_MAX / (uint32_t)(100 /
> > insert_prob);
>
> The calculation here is not correct - every emc-insert-prob from 51-100
> will give the same insert_min, 50-34 etc.
>
> Aside from this, I don't think it's user friendly to need floating point
> numbers. If someone wants to put an entry in the emc (on avg.) every
> 2000th packet, then let them set 2000 in the db, not 0.05. We'd just
> need to document it 1/N as I think it was in the original patches and
> maybe update some of the wording around 'probability' etc.
Cast was in the wrong place. Apologies.
I've respun this again with your suggestions & renamed the option to
'emc-insert-inv-prob' which hopefully makes things clearer.
Thanks,
Ciara
>
> > + uint32_t cur_min;
> > + atomic_read_relaxed(&dp->emc_insert_min, &cur_min);
> > + if (insert_min != cur_min) {
> > + atomic_store_relaxed(&dp->emc_insert_min, insert_min);
> > + }
> > + } else {
> > + atomic_store_relaxed(&dp->emc_insert_min,
> DEFAULT_EM_FLOW_INSERT_MIN);
> > + }
> > +
> > return 0;
> > }
> >
> > @@ -4193,8 +4237,7 @@ handle_packet_upcall(struct
> dp_netdev_pmd_thread *pmd, struct dp_packet *packet,
> > add_actions->size);
> > }
> > ovs_mutex_unlock(&pmd->flow_mutex);
> > -
> > - emc_insert(&pmd->flow_cache, key, netdev_flow);
> > + emc_probabilistic_insert(pmd, key, netdev_flow);
> > }
> > }
> >
> > @@ -4217,7 +4260,6 @@ fast_path_processing(struct
> dp_netdev_pmd_thread *pmd,
> > struct dpcls *cls;
> > struct dpcls_rule *rules[PKT_ARRAY_SIZE];
> > struct dp_netdev *dp = pmd->dp;
> > - struct emc_cache *flow_cache = &pmd->flow_cache;
> > int miss_cnt = 0, lost_cnt = 0;
> > int lookup_cnt = 0, add_lookup_cnt;
> > bool any_miss;
> > @@ -4288,7 +4330,7 @@ fast_path_processing(struct
> dp_netdev_pmd_thread *pmd,
> >
> > flow = dp_netdev_flow_cast(rules[i]);
> >
> > - emc_insert(flow_cache, &keys[i], flow);
> > + emc_probabilistic_insert(pmd, &keys[i], flow);
> > dp_netdev_queue_batches(packet, flow, &keys[i].mf, batches,
> n_batches);
> > }
> >
> > diff --git a/lib/smap.c b/lib/smap.c
> > index 0a5e54a..406a1eb 100644
> > --- a/lib/smap.c
> > +++ b/lib/smap.c
> > @@ -231,6 +231,16 @@ smap_get_int(const struct smap *smap, const char
> *key, int def)
> > return value ? atoi(value) : def;
> > }
> >
> > +/* Gets the value associated with 'key' in 'smap' and converts it to a
> > float
> > + * using atof(). If 'key' is not in 'smap', returns 'def'. */
> > +float
> > +smap_get_float(const struct smap *smap, const char *key, float def)
> > +{
> > + const char *value = smap_get(smap, key);
> > +
> > + return value ? atof(value) : def;
> > +}
> > +
> > /* Gets the value associated with 'key' in 'smap' and converts it to an int
> > * using strtoull(). If 'key' is not in 'smap', returns 'def'. */
> > unsigned long long int
> > diff --git a/lib/smap.h b/lib/smap.h
> > index edf591c..77afa7e 100644
> > --- a/lib/smap.h
> > +++ b/lib/smap.h
> > @@ -98,6 +98,7 @@ const char *smap_get_def(const struct smap *, const
> char *key,
> > struct smap_node *smap_get_node(const struct smap *, const char *);
> > bool smap_get_bool(const struct smap *smap, const char *key, bool def);
> > int smap_get_int(const struct smap *smap, const char *key, int def);
> > +float smap_get_float(const struct smap *smap, const char *key, float
> def);
> > unsigned long long int smap_get_ullong(const struct smap *, const char
> *key,
> > unsigned long long def);
> > bool smap_get_uuid(const struct smap *, const char *key, struct uuid *);
> > diff --git a/tests/pmd.at b/tests/pmd.at
> > index 35dbcfe..b3132ff 100644
> > --- a/tests/pmd.at
> > +++ b/tests/pmd.at
> > @@ -158,6 +158,7 @@ CHECK_PMD_THREADS_CREATED()
> >
> > AT_CHECK([ovs-appctl vlog/set dpif_netdev:dbg])
> > AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> > +AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:emc-insert-
> prob=100])
> >
> > sleep 1
> >
> > diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> > index 146a816..0865a6d 100644
> > --- a/vswitchd/vswitch.xml
> > +++ b/vswitchd/vswitch.xml
> > @@ -335,6 +335,19 @@
> > datapaths.
> > </p>
> > </column>
> > +
> > + <column name="other_config" key="emc-insert-prob"
> > + type='{"type": "real", "minReal": 0, "maxReal": 100}'>
> > + <p>
> > + Specifies the percentage probability (0-100%) of a flow being
> > + inserted into the Exact Match Cache (EMC). A value of 100 will
> > result
> > + in an insertion for every flow whereas a value of zero will
> > result in
> > + no insertions and essentially disable the EMC.
> > + </p>
> > + <p>
> > + Defaults to 1 ie. there is 1% chance of EMC insertion.
> > + </p>
> > + </column>
> > </group>
> >
> > <group title="Status">
> >
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev