On 01/16/2019 01:05 PM, Ilya Maximets wrote:
> Conditional EMC insert helps a lot in scenarios with high numbers
> of parallel flows, but in the current implementation this option
> affects all the threads and ports at once. There are scenarios where
> we have different numbers of flows on different ports. For example,
> if one of the VMs encapsulates traffic using additional headers, it
> will receive a large number of flows, but only a few flows will come
> out of this VM. In this scenario it's much faster to use EMC instead
> of the classifier for traffic from the VM, but it's better to disable
> EMC for the traffic which flows to the VM.
>
> To handle the above issue, this patch introduces the 'emc-enable'
> option to enable/disable EMC on a per-port basis. Ex.:
>
>     ovs-vsctl set interface dpdk0 other_config:emc-enable=false
>
> The EMC insertion probability is kept as is and applies to all ports
> with 'emc-enable=true'.
>
Hi Ilya, just a couple of minor comments below.

> Signed-off-by: Ilya Maximets <i.maxim...@samsung.com>
> ---
>
> Version 3:
>     * Minor rebase on current master.
>
> Version 2:
>     * The main concern was about backward compatibility. Also, there
>       is no real profit having the per-port probability value.
>       So, per-port probability switched to per-port 'emc-enable'
>       configurable.
>       Global probability kept back and can be used without any changes.
>
> It's been a while since the first version. It's available here:
> https://patchwork.ozlabs.org/patch/800277/
>
>  Documentation/topics/dpdk/bridge.rst |  4 ++
>  NEWS                                 |  5 +-
>  lib/dpif-netdev.c                    | 79 +++++++++++++++++++++++++---
>  vswitchd/vswitch.xml                 | 19 +++++++
>  4 files changed, 98 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst
> index 82bdad840..2fcd86607 100644
> --- a/Documentation/topics/dpdk/bridge.rst
> +++ b/Documentation/topics/dpdk/bridge.rst
> @@ -101,6 +101,10 @@ observed with pmd stats::
>  For certain traffic profiles with many parallel flows, it's recommended to set
>  ``N`` to '0' to achieve higher forwarding performance.
>
> +It is also possible to enable/disable EMC on a per-port basis using::
> +
> +    $ ovs-vsctl set interface <iface> other_config:emc-enable={true,false}
> +
>  For more information on the EMC refer to :doc:`/intro/install/dpdk` .
>
>
> diff --git a/NEWS b/NEWS
> index 4618cc0a0..7826578d3 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -24,10 +24,13 @@ Post-v2.10.0
>         allocated dynamically using the following syntax:
>         ovn-nbctl lsp-set-addresses <port> "dynamic <IP>"
>     - DPDK:
> +     * Add support for DPDK 18.11
> +   - Userspace datapath:
>       * Add option for simple round-robin based Rxq to PMD assignment.
>         It can be set with pmd-rxq-assign.
> -     * Add support for DPDK 18.11
>       * Add support for Auto load balancing of PMDs (experimental)
> +     * Added new per-port configurable option to manage EMC:
> +       'other_config:emc-enable'.
>    - Add 'symmetric_l3' hash function.
>    - OVS now honors 'updelay' and 'downdelay' for bonds with LACP configured.
>    - ovs-vswitchd:
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index be529b6b0..6704be400 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -474,6 +474,7 @@ struct dp_netdev_port {
>      unsigned n_rxq;             /* Number of elements in 'rxqs' */
>      unsigned *txq_used;         /* Number of threads that use each tx queue. */
>      struct ovs_mutex txq_used_mutex;
> +    bool emc_enabled;           /* If true EMC will be used. */
>      char *type;                 /* Port type as requested by user. */
>      char *rxq_affinity_list;    /* Requested affinity of rx queues. */
>  };
> @@ -588,6 +589,7 @@ static void dp_netdev_actions_free(struct dp_netdev_actions *);
>  struct polled_queue {
>      struct dp_netdev_rxq *rxq;
>      odp_port_t port_no;
> +    bool emc_enabled;
>  };
>
>  /* Contained by struct dp_netdev_pmd_thread's 'poll_list' member. */
> @@ -617,6 +619,8 @@ struct dp_netdev_pmd_thread_ctx {
>      long long now;
>      /* RX queue from which last packet was received. */
>      struct dp_netdev_rxq *last_rxq;
> +    /* EMC insertion probability context for the current processing cycle. */
> +    uint32_t emc_insert_min;
>  };
>
>  /* PMD: Poll modes drivers. PMD accesses devices via polling to eliminate
> @@ -1798,6 +1802,7 @@ port_create(const char *devname, const char *type,
>      port->netdev = netdev;
>      port->type = xstrdup(type);
>      port->sf = sf;
> +    port->emc_enabled = true;
>      port->need_reconfigure = true;
>      ovs_mutex_init(&port->txq_used_mutex);
>
> @@ -2830,9 +2835,7 @@ emc_probabilistic_insert(struct dp_netdev_pmd_thread *pmd,
>       * default the value is UINT32_MAX / 100 which yields an insertion
>       * probability of 1/100 ie. 1% */
>
> -    uint32_t min;
> -
> -    atomic_read_relaxed(&pmd->dp->emc_insert_min, &min);
> +    uint32_t min = pmd->ctx.emc_insert_min;
>
>      if (min && random_uint32() <= min) {
>          emc_insert(&(pmd->flow_cache).emc_cache, key, flow);
> @@ -3698,7 +3701,8 @@ dpif_netdev_execute(struct dpif *dpif, struct dpif_execute *execute)
>          ovs_mutex_lock(&dp->non_pmd_mutex);
>      }
>
> -    /* Update current time in PMD context. */
> +    /* Update current time in PMD context. We don't care about EMC insertion
> +     * probability, because we are on a slow path. */
>      pmd_thread_ctx_time_update(pmd);
>
>      /* The action processing expects the RSS hash to be valid, because
> @@ -3842,7 +3846,7 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
>      if (insert_min != cur_min) {
>          atomic_store_relaxed(&dp->emc_insert_min, insert_min);
>          if (insert_min == 0) {
> -            VLOG_INFO("EMC has been disabled");
> +            VLOG_INFO("EMC insertion probability changed to zero");
>          } else {
>              VLOG_INFO("EMC insertion probability changed to 1/%llu (~%.2f%%)",
>                        insert_prob, (100 / (float)insert_prob));
> @@ -3965,6 +3969,27 @@ exit:
>      return error;
>  }
>
> +/* Returns 'true' if one of the 'port's RX queues exists in 'poll_list'
> + * of given PMD thread. */
> +static bool
> +dpif_netdev_pmd_polls_port(struct dp_netdev_pmd_thread *pmd,
> +                           struct dp_netdev_port *port)
> +    OVS_EXCLUDED(pmd->port_mutex)
> +{
> +    struct rxq_poll *poll;
> +    bool found = false;
> +
> +    ovs_mutex_lock(&pmd->port_mutex);
> +    HMAP_FOR_EACH (poll, node, &pmd->poll_list) {
> +        if (port == poll->rxq->port) {
> +            found = true;
> +            break;
> +        }
> +    }
> +    ovs_mutex_unlock(&pmd->port_mutex);
> +    return found;
> +}
> +
>  /* Changes the affinity of port's rx queues. The changes are actually applied
>   * in dpif_netdev_run(). */

The comment is a bit stale now.
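As a side note for readers following along: the threshold scheme that emc_probabilistic_insert() relies on can be sketched standalone as below. This is illustrative only, not OVS code; the helper names are mine.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* How 'emc-insert-inv-prob' maps to the threshold compared against a random
 * 32-bit value on insertion.  With the default inv_prob of 100 the threshold
 * is UINT32_MAX / 100, i.e. an insertion probability of ~1%. */
static uint32_t
emc_insert_min_from_inv_prob(uint32_t inv_prob)
{
    /* inv_prob == 0 disables EMC insertion entirely (threshold 0). */
    return inv_prob ? UINT32_MAX / inv_prob : 0;
}

/* Mirrors the check: if (min && random_uint32() <= min) { emc_insert(...); } */
static bool
emc_should_insert(uint32_t min, uint32_t rnd)
{
    return min && rnd <= min;
}
```

With per-port 'emc-enable=false', the patch simply forces the per-thread context threshold (pmd->ctx.emc_insert_min) to 0, so the very same check also covers the disabled case without an extra branch on the fast path.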
>  static int
> @@ -3975,10 +4000,33 @@ dpif_netdev_port_set_config(struct dpif *dpif, odp_port_t port_no,
>      struct dp_netdev_port *port;
>      int error = 0;
>      const char *affinity_list = smap_get(cfg, "pmd-rxq-affinity");
> +    bool emc_enabled = smap_get_bool(cfg, "emc-enable", true);
>
>      ovs_mutex_lock(&dp->port_mutex);
>      error = get_port_by_number(dp, port_no, &port);
> -    if (error || !netdev_is_pmd(port->netdev)
> +    if (error) {
> +        goto unlock;
> +    }
> +
> +    if (emc_enabled != port->emc_enabled) {
> +        struct dp_netdev_pmd_thread *pmd;
> +
> +        port->emc_enabled = emc_enabled;
> +        /* Mark for reload all the threads that polls this port and request
> +         * for reconfiguration for the actual reloading of threads. */
> +        CMAP_FOR_EACH (pmd, node, &dp->poll_threads) {
> +            if (dpif_netdev_pmd_polls_port(pmd, port)) {
> +                pmd->need_reload = true;
> +            }
> +        }
> +        dp_netdev_request_reconfigure(dp);
> +
> +        VLOG_INFO("%s: EMC has been %s", netdev_get_name(port->netdev),
> +                  (emc_enabled) ? "enabled" : "disabled");

I think it would be a good idea to add a log similar to the one in
dpif_netdev_set_config() for 'emc-insert-inv-prob' when emc_enabled gets
enabled here, as a reminder for the user. Otherwise, a user will see
'EMC has been enabled', which is not incorrect, but perhaps a long time
ago 'emc-insert-inv-prob' was set to 0, and there is confusion about
whether the EMC will actually be used now or not.

> +    }
> +
> +    /* Checking for RXq affinity changes. */
> +    if (!netdev_is_pmd(port->netdev)
> +        || nullable_string_is_equal(affinity_list, port->rxq_affinity_list)) {
>          goto unlock;
>      }
> @@ -5123,6 +5171,13 @@ dpif_netdev_run(struct dpif *dpif)
>          if (!netdev_is_pmd(port->netdev)) {
>              int i;
>
> +            if (port->emc_enabled) {
> +                atomic_read_relaxed(&dp->emc_insert_min,
> +                                    &non_pmd->ctx.emc_insert_min);
> +            } else {
> +                non_pmd->ctx.emc_insert_min = 0;
> +            }
> +
>              for (i = 0; i < port->n_rxq; i++) {
>                  if (dp_netdev_process_rxq_port(non_pmd,
>                                                 &port->rxqs[i],
> @@ -5296,6 +5351,7 @@ pmd_load_queues_and_ports(struct dp_netdev_pmd_thread *pmd,
>      HMAP_FOR_EACH (poll, node, &pmd->poll_list) {
>          poll_list[i].rxq = poll->rxq;
>          poll_list[i].port_no = poll->rxq->port->port_no;
> +        poll_list[i].emc_enabled = poll->rxq->port->emc_enabled;
>          i++;
>      }
>
> @@ -5360,6 +5416,14 @@ reload:
>          pmd_perf_start_iteration(s);
>
>          for (i = 0; i < poll_cnt; i++) {
> +
> +            if (poll_list[i].emc_enabled) {
> +                atomic_read_relaxed(&pmd->dp->emc_insert_min,
> +                                    &pmd->ctx.emc_insert_min);
> +            } else {
> +                pmd->ctx.emc_insert_min = 0;
> +            }
> +
>              process_packets =
>                  dp_netdev_process_rxq_port(pmd, poll_list[i].rxq,
>                                             poll_list[i].port_no);
> @@ -6301,7 +6365,7 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
>      struct dfc_cache *cache = &pmd->flow_cache;
>      struct dp_packet *packet;
>      const size_t cnt = dp_packet_batch_size(packets_);
> -    uint32_t cur_min;
> +    uint32_t cur_min = pmd->ctx.emc_insert_min;
>      int i;
>      uint16_t tcp_flags;
>      bool smc_enable_db;
> @@ -6309,7 +6373,6 @@ dfc_processing(struct dp_netdev_pmd_thread *pmd,
>      bool batch_enable = true;
>
>      atomic_read_relaxed(&pmd->dp->smc_enable_db, &smc_enable_db);
> -    atomic_read_relaxed(&pmd->dp->emc_insert_min, &cur_min);
>      pmd_perf_update_counter(&pmd->perf_stats,
>                              md_is_valid ? PMD_STAT_RECIRC : PMD_STAT_RECV,
>                              cnt);
> diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
> index d58f63228..88edb5d35 100644
> --- a/vswitchd/vswitch.xml
> +++ b/vswitchd/vswitch.xml
> @@ -3101,6 +3101,25 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
>        </column>
>      </group>
>
> +    <group title="EMC (Exact Match Cache) Configuration">
> +      <p>
> +        These settings control the behaviour of EMC lookups/insertions for
> +        packets received from the interface.
> +      </p>
> +
> +      <column name="other_config" key="emc-enable" type='{"type": "boolean"}'>
> +        <p>
> +          Specifies if the Exact Match Cache (EMC) should be used while
> +          processing packets received from this interface.
> +          If true, <ref table="Open_vSwitch" column="other_config"
> +          key="emc-insert-inv-prob"/> will have effect on this interface.
> +        </p>
> +        <p>
> +          Defaults to true.
> +        </p>
> +      </column>
> +    </group>
> +
>      <group title="MTU">
>        <p>
>          The MTU (maximum transmission unit) is the largest amount of data

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
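To make the reviewer's logging suggestion concrete: a rough standalone sketch of the extra reminder when EMC is enabled on a port while the global insertion threshold is zero. The helper names are hypothetical and the logic is simplified; the real code would read dp->emc_insert_min with atomic_read_relaxed() and use VLOG_INFO instead of printf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* True when enabling EMC on a port deserves a reminder that the global
 * 'emc-insert-inv-prob' threshold is currently zero, i.e. no flows will
 * actually be inserted into the cache despite EMC being "enabled". */
static bool
emc_enable_needs_zero_prob_note(bool emc_enabled, uint32_t emc_insert_min)
{
    return emc_enabled && emc_insert_min == 0;
}

static void
log_emc_state(const char *port_name, bool emc_enabled, uint32_t emc_insert_min)
{
    printf("%s: EMC has been %s\n",
           port_name, emc_enabled ? "enabled" : "disabled");
    if (emc_enable_needs_zero_prob_note(emc_enabled, emc_insert_min)) {
        printf("%s: note: emc-insert-inv-prob is currently 0, so no flows "
               "will be inserted into EMC\n", port_name);
    }
}
```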