[ovs-dev] [PATCH] ofproto-dpif-xlate: Remove repeated function for judge garp.

2022-11-30 Thread Han Ding


Function is_gratuitous_arp() and function is_garp() are all used to judge
whether the flow is gratuitous arp. It is not necessary to use two functions
to do the same thing and just keep one.

Signed-off-by: Han Ding 
---
 ofproto/ofproto-dpif-xlate.c | 32 ++--
 1 file changed, 2 insertions(+), 30 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index a9cf3cbee..b3c13f6bf 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -2543,34 +2543,6 @@ output_normal(struct xlate_ctx *ctx, const struct 
xbundle *out_xbundle,
 memcpy(>xin->flow.vlans, _vlans, sizeof(old_vlans));
 }

-/* A VM broadcasts a gratuitous ARP to indicate that it has resumed after
- * migration.  Older Citrix-patched Linux DomU used gratuitous ARP replies to
- * indicate this; newer upstream kernels use gratuitous ARP requests. */
-static bool
-is_gratuitous_arp(const struct flow *flow, struct flow_wildcards *wc)
-{
-if (flow->dl_type != htons(ETH_TYPE_ARP)) {
-return false;
-}
-
-memset(>masks.dl_dst, 0xff, sizeof wc->masks.dl_dst);
-if (!eth_addr_is_broadcast(flow->dl_dst)) {
-return false;
-}
-
-memset(>masks.nw_proto, 0xff, sizeof wc->masks.nw_proto);
-if (flow->nw_proto == ARP_OP_REPLY) {
-return true;
-} else if (flow->nw_proto == ARP_OP_REQUEST) {
-memset(>masks.nw_src, 0xff, sizeof wc->masks.nw_src);
-memset(>masks.nw_dst, 0xff, sizeof wc->masks.nw_dst);
-
-return flow->nw_src == flow->nw_dst;
-} else {
-return false;
-}
-}
-
 /* Determines whether packets in 'flow' within 'xbridge' should be forwarded or
  * dropped.  Returns true if they may be forwarded, false if they should be
  * dropped.
@@ -2619,7 +2591,7 @@ is_admissible(struct xlate_ctx *ctx, struct xport 
*in_port,
 mac = mac_learning_lookup(xbridge->ml, flow->dl_src, vlan);
 if (mac
 && mac_entry_get_port(xbridge->ml, mac) != in_xbundle->ofbundle
-&& (!is_gratuitous_arp(flow, ctx->wc)
+&& (!is_garp(flow, ctx->wc)
 || mac_entry_is_grat_arp_locked(mac))) {
 ovs_rwlock_unlock(>ml->rwlock);
 xlate_report(ctx, OFT_DETAIL,
@@ -3062,7 +3034,7 @@ xlate_normal(struct xlate_ctx *ctx)
 }

 /* Learn source MAC. */
-bool is_grat_arp = is_gratuitous_arp(flow, wc);
+bool is_grat_arp = is_garp(flow, wc);
 if (ctx->xin->allow_side_effects
 && flow->packet_type == htonl(PT_ETH)
 && in_port && in_port->pt_mode != NETDEV_PT_LEGACY_L3
--
2.27.0




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] [PATCH v6 net-next] net: openvswitch: Add support to count upcall packets

2022-11-30 Thread Jakub Kicinski
On Wed, 30 Nov 2022 04:15:59 -0500 wangchuanlei wrote:
> +/**
> + *   ovs_vport_get_upcall_stats - retrieve upcall stats
> + *
> + * @vport: vport from which to retrieve the stats
> + * @ovs_vport_upcall_stats: location to store stats

s/ovs_vport_upcall_//

> + *
> + * Retrieves upcall stats for the given device.
> + *
> + * Must be called with ovs_mutex or rcu_read_lock.
> + */
> +void ovs_vport_get_upcall_stats(struct vport *vport, struct 
> ovs_vport_upcall_stats *stats)
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH/RFC net-next] tc: allow drivers to accept gact with PIPE when offloading

2022-11-30 Thread Tianyu Yuan


On Thus, Dec 1 , 2022, at 2:05 AM, Marcelo Leitner wrote:
> 
> On Wed, Nov 30, 2022 at 03:36:57AM +, Tianyu Yuan wrote:
> >
> > On Mon, Nov 29, 2022 at 8:35 PM , Eelco Chaudron wrote:
> > >
> > > On 28 Nov 2022, at 14:33, Marcelo Leitner wrote:
> > >
> > > > On Mon, Nov 28, 2022 at 02:17:40PM +0100, Eelco Chaudron wrote:
> > > >>
> > > >>
> > > >> On 28 Nov 2022, at 14:11, Marcelo Leitner wrote:
> > > >>
> > > >>> On Mon, Nov 28, 2022 at 07:11:05AM +, Tianyu Yuan wrote:
> > > > ...
> > > 
> > >  Furthermore, I think the current stats for each action
> > >  mentioned in
> > >  2) cannot represent the real hw stats and this is why [ RFC
> > >  net-next v2 0/2] (net: flow_offload: add support for per action
> > >  hw stats)
> > > will come up.
> > > >>>
> > > >>> Exactly. Then, when this patchset (or similar) come up, it won't
> > > >>> update all actions with the same stats anymore. It will require
> > > >>> a set of stats from hw for the gact with PIPE action here. But
> > > >>> if drivers are ignoring this action, they can't have specific
> > > >>> stats for it. Or am I missing something?
> > > >>>
> > > >>> So it is better for the drivers to reject the whole flow instead
> > > >>> of simply ignoring it, and let vswitchd probe if it should or
> > > >>> should not use this action.
> > > >>
> > > >> Please note that OVS does not probe features per interface, but
> > > >> does it
> > > per datapath. So if it’s supported in pipe in tc software, we will
> > > use it. If the driver rejects it, we will probably end up with the tc 
> > > software
> rule only.
> > > >
> > > > Ah right. I remember it will pick 1 interface for testing and use
> > > > those results everywhere, which then I don't know if it may or may
> > > > not be a representor port or not. Anyhow, then it should use
> > > > skip_sw, to try to probe for the offloading part. Otherwise I'm
> > > > afraid tc sw will always accept this flow and trick the probing, yes.
> > >
> > > Well, it depends on how you look at it. In theory, we should be
> > > hardware agnostic, meaning what if you have different hardware in
> > > your system? OVS only supports global offload enablement.
> > >
> > > Tianyu how are you planning to support this from the OVS side? How
> > > would you probe kernel and/or hardware support for this change?
> >
> > Currently in the test demo, I just extend gact with PIPE (previously
> > only SHOT as default and GOTO_CHAIN when chain exists), and then put
> > such a gact with PIPE at the first place of each filter which will be 
> > transacted
> with kernel tc.
> >
> > About the tc sw datapath mentioned, we don't have to make changes
> > because gact with PIPE has already been supported in current tc
> > implementation and it could act like a 'counter' And for the hardware
> > we just need to ignore this PIPE and the stats of this action will still be
> > updated in kernel side and sent to userspace.
> 
> I can't see how the action would have stats from hw if the driver is ignoring
> the action.

The stats for each actions in a filter is updated here in pkt_cls.h:
static inline void
tcf_exts_hw_stats_update(const struct tcf_exts *exts,
 u64 bytes, u64 packets, u64 drops, u64 lastuse,
 u8 used_hw_stats, bool used_hw_stats_valid)
{
#ifdef CONFIG_NET_CLS_ACT
int i;

for (i = 0; i < exts->nr_actions; i++) {
struct tc_action *a = exts->actions[i];

/* if stats from hw, just skip */
if (tcf_action_update_hw_stats(a)) {
preempt_disable();
tcf_action_stats_update(a, bytes, packets, drops,
lastuse, true);
preempt_enable();

a->used_hw_stats = used_hw_stats;
a->used_hw_stats_valid = used_hw_stats_valid;
}
}
#endif
}
In which bytes, packets, drops, lastuse are dumped from the driver, the stats 
of gact PIPE is updated here is kernel
TC, rather than in driver directly.

> 
> But maybe there was a misunderstanding here. I was reading more the
> cxgb4 driver here and AFAICT this patch will skip PIPE on the action 
> validation,
> but not actually skip the action entirely. Then it will hit
> cxgb4_process_flow_actions() and maybe the driver will the right thing with
> a dummy action out of the blue. Was this your expectation, to just ignore it 
> in
> the validation step, and let it fall through through the driver? If yes, the
> comments are misleading, as the NICs will have to process the packets.
> 

Actually we want to ignore it not only in validation step but also the process 
step.
We don't want to HW process this action and just want to let the driver treat 
this flow
with PIPE at the first place as a offloadable one.
> >
> > I agree with that the unsupported actions should be rejected by
> > drivers, so may another 

Re: [ovs-dev] [PATCH ovn v8 7/7] Document experimental support for co-hosted controllers

2022-11-30 Thread 0-day Robot
Bleep bloop.  Greetings Ihar Hrachyshka, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Document experimental support for co-hosted controllers
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v8 6/7] Add connectivity test for 2 controllers on the same host

2022-11-30 Thread Ihar Hrachyshka
Signed-off-by: Ihar Hrachyshka 
---
 tests/ovn-macros.at |  30 ++
 tests/ovn.at| 137 ++--
 2 files changed, 137 insertions(+), 30 deletions(-)

diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
index ef7191c6a..1f57293d1 100644
--- a/tests/ovn-macros.at
+++ b/tests/ovn-macros.at
@@ -345,6 +345,36 @@ ovn_attach() {
 ovn_az_attach NONE $@
 }
 
+# This function is similar to ovn_attach but makes sure it doesn't
+# mess with another controller settings
+start_virtual_controller() {
+local net=$1 bridge=$2 int_bridge=$3 ip=$4 masklen=${5-24} 
encap=${6-geneve,vxlan} systemid=${7-$sandbox} cli_args=${@:8}
+net_attach $net $bridge || return 1
+
+mac=`ovs-vsctl get Interface $bridge mac_in_use | sed s/\"//g`
+arp_table="$arp_table $sandbox,$bridge,$ip,$mac"
+ovs-appctl netdev-dummy/ip4addr $bridge $ip/$masklen >/dev/null || return 1
+ovs-appctl ovs/route/add $ip/$masklen $bridge >/dev/null || return 1
+
+local ovn_remote
+if test X$HAVE_OPENSSL = Xyes; then
+ovn_remote=$SSL_OVN_SB_DB
+else
+ovn_remote=unix:$ovs_base/ovn-sb/ovn-sb.sock
+fi
+ovs-vsctl \
+-- set Open_vSwitch . external-ids:ovn-remote-$systemid=$ovn_remote \
+-- set Open_vSwitch . external-ids:ovn-encap-type-$systemid=$encap \
+-- set Open_vSwitch . external-ids:ovn-encap-ip-$systemid=$ip \
+-- set Open_vSwitch . external-ids:ovn-bridge-$systemid=$int_bridge \
+-- --may-exist add-br $int_bridge \
+-- set bridge $int_bridge fail-mode=secure 
other-config:disable-in-band=true \
+|| return 1
+
+echo IHAR ${cli_args}
+ovn-controller --enable-dummy-vif-plug ${cli_args} -vconsole:off --detach 
--no-chdir
+}
+
 # ovn_setenv AZ
 ovn_setenv () {
 local d=$ovs_base/$1
diff --git a/tests/ovn.at b/tests/ovn.at
index 4adcb2f58..00cefd58f 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -33591,40 +33591,11 @@ ovn-nbctl --wait=hv sync
 echo hv2 > ${OVN_SYSCONFDIR}/system-id-override
 ovs-vsctl add-br br-phys2
 
-# This function is similar to ovn_attach but makes sure it doesn't
-# mess with another controller settings
-start_virtual_controller() {
-local net=$1 bridge=$2 ip=$3 masklen=${4-24} encap=${5-geneve,vxlan} 
systemid=${6-$sandbox} cli_args=${@:7}
-net_attach $net $bridge || return 1
-
-mac=`ovs-vsctl get Interface $bridge mac_in_use | sed s/\"//g`
-arp_table="$arp_table $sandbox,$bridge,$ip,$mac"
-ovs-appctl netdev-dummy/ip4addr $bridge $ip/$masklen >/dev/null || return 1
-ovs-appctl ovs/route/add $ip/$masklen $bridge >/dev/null || return 1
-
-local ovn_remote
-if test X$HAVE_OPENSSL = Xyes; then
-ovn_remote=$SSL_OVN_SB_DB
-else
-ovn_remote=unix:$ovs_base/ovn-sb/ovn-sb.sock
-fi
-ovs-vsctl \
--- set Open_vSwitch . external-ids:ovn-remote-$systemid=$ovn_remote \
--- set Open_vSwitch . external-ids:ovn-encap-type-$systemid=$encap \
--- set Open_vSwitch . external-ids:ovn-encap-ip-$systemid=$ip \
--- set Open_vSwitch . external-ids:ovn-bridge-$systemid=br-int-2 \
--- --may-exist add-br br-int-2 \
--- set bridge br-int-2 fail-mode=secure 
other-config:disable-in-band=true \
-|| return 1
-
-ovn-controller --enable-dummy-vif-plug ${cli_args} -vconsole:off --detach 
--no-chdir
-}
-
 # for some reason SSL ovsdb configuration overrides CLI, so
 # delete ssl config from ovsdb to give CLI arguments priority
 ovs-vsctl del-ssl
 
-start_virtual_controller n1 br-phys2 192.168.0.2 24 geneve,vxlan hv2 \
+start_virtual_controller n1 br-phys2 br-int-2 192.168.0.2 24 geneve,vxlan hv2 \
 --pidfile=${OVS_RUNDIR}/ovn-controller-2.pid \
 --log-file=${OVS_RUNDIR}/ovn-controller-2.log \
 -p $PKIDIR/testpki-hv2-privkey.pem \
@@ -33645,3 +33616,109 @@ OVS_WAIT_UNTIL([ovs-vsctl --columns _uuid --bare find 
Port \
 OVN_CLEANUP([hv1])
 AT_CLEANUP
 ])
+
+# NOTE: This test case runs two ovn-controllers inside the same sandbox (hv1).
+# Each controller uses a unique chassis name - hv1 and hv2 - and manage
+# different bridges with different ports. This is why all 'as' commands below
+# are executed from the same - hv1 - sandbox, regardless of whether they
+# logically belong to ports of chassis named hv1 or hv2.
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([multiple controllers on the same host can talk to each other])
+ovn_start
+net_add n1
+
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys-1
+ovs-vsctl set open . external-ids:ovn-bridge-mappings-hv1=phys:br-phys-1
+
+ovn_attach n1 br-phys-1 192.168.1.1 24
+
+# Disable local ARP responder to pass ARP requests through tunnels
+check ovn-nbctl \
+ls-add ls \
+-- add Logical_Switch ls other_config vlan-passthru=true
+ovn-nbctl lsp-add ls lp1
+ovn-nbctl lsp-set-addresses lp1 "00:00:00:00:00:01 10.0.0.1"
+
+ovn-nbctl lsp-add ls ln_port
+ovn-nbctl lsp-set-addresses ln_port unknown
+ovn-nbctl lsp-set-type ln_port localnet
+ovn-nbctl 

[ovs-dev] [PATCH ovn v8 7/7] Document experimental support for co-hosted controllers

2022-11-30 Thread Ihar Hrachyshka
Acked-by: Mark Michelson 
Signed-off-by: Ihar Hrachyshka 
---
 NEWS|  2 ++
 controller/ovn-controller.8.xml | 12 
 2 files changed, 14 insertions(+)

diff --git a/NEWS b/NEWS
index 672efb749..c6a5d6264 100644
--- a/NEWS
+++ b/NEWS
@@ -5,6 +5,8 @@ Post v22.09.0
 period of time.
   - ovn-northd: Add configuration knobs to enable drop sampling using OVS's
 per-flow IPFIX sampling.
+  - ovn-controller: Experimental support for co-hosting multiple controller
+instances on the same host.
 
 OVN v22.09.0 - 16 Sep 2022
 --
diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml
index 73a3bbf9b..0d2f72d64 100644
--- a/controller/ovn-controller.8.xml
+++ b/controller/ovn-controller.8.xml
@@ -375,6 +375,18 @@
 set in the database.
 
 
+
+Chassis-specific configuration options in the database plus the ability
+to configure the chassis name to use via the
+system-id-override file or command line allows to run
+multiple ovn-controller instances with unique chassis
+names on the same host using the same vswitchd instance.
+This may be useful when running a hybrid setup with more than one CMS
+managing ports on the host, or to use different datapath types on the
+same host. Note that this ability is highly experimental and has known
+limitations. Use at your own risk.
+
+
 
   ovn-controller reads the following values from the
   Open_vSwitch database of the local OVS instance:
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v8 2/7] Support ovn-...- specific global ovsdb options

2022-11-30 Thread Ihar Hrachyshka
Before the patch, all controller instances were reading configuration
from the same external-ids:ovn-* options. This patch adds support for
distinct config otions for different chassis names stored in the same
ovsdb global config object.

To configure an option for a distinct chassis name, an admin may add a
suffix with the desired chassis name to a config option. For example, if
the following is configured in ovsdb, only a controller with the
corresponding chassis name (either 'hv1' or 'hv2') would read just one
of the following options:

ovs-vsctl set open . external-ids:ovn-bridge-mappings-hv1=phys:br-phys-1
ovs-vsctl set open . external-ids:ovn-bridge-mappings-hv2=phys:br-phys-2

Chassis specific config options override any global settings, so for
example if the following configuration is used, then controller 'hv1'
will use the first setting but not the latter. Any other controllers
will use the global setting, which is the second setting..

ovs-vsctl set open . external-ids:ovn-bridge-mappings-hv1=phys:br-phys-1
ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys-2

This is supported for other options too.

This is in preparation to support running multiple controller instances
using the same vswitchd instance.

Signed-off-by: Ihar Hrachyshka 
---
 controller/chassis.c| 136 +++-
 controller/chassis.h|   3 +-
 controller/encaps.c |  19 +++--
 controller/ovn-controller.8.xml |  16 
 controller/ovn-controller.c | 107 -
 controller/patch.c  |   6 +-
 controller/physical.c   |   2 +-
 lib/ovn-util.c  |  79 +++
 lib/ovn-util.h  |  26 ++
 ovn-nb.xml  |   8 ++
 tests/ovn-macros.at |   4 +-
 tests/ovn.at|  35 
 12 files changed, 339 insertions(+), 102 deletions(-)

diff --git a/controller/chassis.c b/controller/chassis.c
index 49c3af832..6c88c96d6 100644
--- a/controller/chassis.c
+++ b/controller/chassis.c
@@ -93,9 +93,10 @@ chassis_register_ovs_idl(struct ovsdb_idl *ovs_idl)
 }
 
 static const char *
-get_hostname(const struct smap *ext_ids)
+get_hostname(const struct smap *ext_ids, const char *chassis_id)
 {
-const char *hostname = smap_get_def(ext_ids, "hostname", "");
+const char *hostname = get_chassis_external_id_value(ext_ids, chassis_id,
+ "hostname", "");
 
 if (strlen(hostname) == 0) {
 static char hostname_[HOST_NAME_MAX + 1];
@@ -111,69 +112,81 @@ get_hostname(const struct smap *ext_ids)
 }
 
 static const char *
-get_bridge_mappings(const struct smap *ext_ids)
+get_bridge_mappings(const struct smap *ext_ids, const char *chassis_id)
 {
-return smap_get_def(ext_ids, "ovn-bridge-mappings", "");
+return get_chassis_external_id_value(ext_ids, chassis_id,
+ "ovn-bridge-mappings", "");
 }
 
 const char *
-get_chassis_mac_mappings(const struct smap *ext_ids)
+get_chassis_mac_mappings(const struct smap *ext_ids, const char *chassis_id)
 {
-return smap_get_def(ext_ids, "ovn-chassis-mac-mappings", "");
+return get_chassis_external_id_value(ext_ids, chassis_id,
+ "ovn-chassis-mac-mappings", "");
 }
 
 static const char *
-get_cms_options(const struct smap *ext_ids)
+get_cms_options(const struct smap *ext_ids, const char *chassis_id)
 {
-return smap_get_def(ext_ids, "ovn-cms-options", "");
+return get_chassis_external_id_value(ext_ids, chassis_id,
+ "ovn-cms-options", "");
 }
 
 static const char *
-get_monitor_all(const struct smap *ext_ids)
+get_monitor_all(const struct smap *ext_ids, const char *chassis_id)
 {
-return smap_get_def(ext_ids, "ovn-monitor-all", "false");
+return get_chassis_external_id_value(ext_ids, chassis_id,
+ "ovn-monitor-all", "false");
 }
 
 static const char *
-get_enable_lflow_cache(const struct smap *ext_ids)
+get_enable_lflow_cache(const struct smap *ext_ids, const char *chassis_id)
 {
-return smap_get_def(ext_ids, "ovn-enable-lflow-cache", "true");
+return get_chassis_external_id_value(ext_ids, chassis_id,
+ "ovn-enable-lflow-cache", "true");
 }
 
 static const char *
-get_limit_lflow_cache(const struct smap *ext_ids)
+get_limit_lflow_cache(const struct smap *ext_ids, const char *chassis_id)
 {
-return smap_get_def(ext_ids, "ovn-limit-lflow-cache", "");
+return get_chassis_external_id_value(ext_ids, chassis_id,
+ "ovn-limit-lflow-cache", "");
 }
 
 static const char *
-get_memlimit_lflow_cache(const struct smap *ext_ids)
+get_memlimit_lflow_cache(const struct smap *ext_ids, const char *chassis_id)
 {
-return smap_get_def(ext_ids, "ovn-memlimit-lflow-cache-kb", "");
+return 

[ovs-dev] [PATCH ovn v8 5/7] Don't touch tunnel ports from a different br-int

2022-11-30 Thread Ihar Hrachyshka
When multiple controllers are running using the same vswitchd,
controllers should delete only those tunnel ports that belong to the
integration bridge that is managed by the controller instance.

This makes sure multiple controllers don't step on each other when
running using the same vswitchd instance.

Signed-off-by: Ihar Hrachyshka 
Acked-by: Mark Michelson 
---
 controller/encaps.c | 42 +---
 controller/encaps.h |  1 -
 controller/ovn-controller.c |  3 +-
 tests/ovn.at| 79 +
 4 files changed, 99 insertions(+), 26 deletions(-)

diff --git a/controller/encaps.c b/controller/encaps.c
index a381a8d17..5d383401d 100644
--- a/controller/encaps.c
+++ b/controller/encaps.c
@@ -388,7 +388,6 @@ chassis_tzones_overlap(const struct sset *transport_zones,
 
 void
 encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
-   const struct ovsrec_bridge_table *bridge_table,
const struct ovsrec_bridge *br_int,
const struct sbrec_chassis_table *chassis_table,
const struct sbrec_chassis *this_chassis,
@@ -401,7 +400,6 @@ encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
 }
 
 const struct sbrec_chassis *chassis_rec;
-const struct ovsrec_bridge *br;
 
 struct tunnel_ctx tc = {
 .chassis = SHASH_INITIALIZER(),
@@ -419,27 +417,25 @@ encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
 /* Collect all port names into tc.port_names.
  *
  * Collect all the OVN-created tunnels into tc.tunnel_hmap. */
-OVSREC_BRIDGE_TABLE_FOR_EACH (br, bridge_table) {
-for (size_t i = 0; i < br->n_ports; i++) {
-const struct ovsrec_port *port = br->ports[i];
-sset_add(_names, port->name);
-
-/*
- * note that the id here is not just the chassis name, but the
- * combination of 
- */
-const char *id = smap_get(>external_ids, "ovn-chassis-id");
-if (id) {
-if (!shash_find(, id)) {
-struct chassis_node *chassis = xzalloc(sizeof *chassis);
-chassis->bridge = br;
-chassis->port = port;
-shash_add_assert(, id, chassis);
-} else {
-/* Duplicate port for ovn-chassis-id.  Arbitrarily choose
- * to delete this one. */
-ovsrec_bridge_update_ports_delvalue(br, port);
-}
+for (size_t i = 0; i < br_int->n_ports; i++) {
+const struct ovsrec_port *port = br_int->ports[i];
+sset_add(_names, port->name);
+
+/*
+ * note that the id here is not just the chassis name, but the
+ * combination of 
+ */
+const char *id = smap_get(>external_ids, "ovn-chassis-id");
+if (id) {
+if (!shash_find(, id)) {
+struct chassis_node *chassis = xzalloc(sizeof *chassis);
+chassis->bridge = br_int;
+chassis->port = port;
+shash_add_assert(, id, chassis);
+} else {
+/* Duplicate port for ovn-chassis-id.  Arbitrarily choose
+ * to delete this one. */
+ovsrec_bridge_update_ports_delvalue(br_int, port);
 }
 }
 }
diff --git a/controller/encaps.h b/controller/encaps.h
index 25d44b034..867c6f28c 100644
--- a/controller/encaps.h
+++ b/controller/encaps.h
@@ -30,7 +30,6 @@ struct sset;
 
 void encaps_register_ovs_idl(struct ovsdb_idl *);
 void encaps_run(struct ovsdb_idl_txn *ovs_idl_txn,
-const struct ovsrec_bridge_table *,
 const struct ovsrec_bridge *br_int,
 const struct sbrec_chassis_table *,
 const struct sbrec_chassis *,
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 6842bca8f..373a299ce 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -4244,8 +4244,7 @@ main(int argc, char *argv[])
 }
 
 if (chassis) {
-encaps_run(ovs_idl_txn,
-   bridge_table, br_int,
+encaps_run(ovs_idl_txn, br_int,
sbrec_chassis_table_get(ovnsb_idl_loop.idl),
chassis,
sbrec_sb_global_first(ovnsb_idl_loop.idl),
diff --git a/tests/ovn.at b/tests/ovn.at
index 6929e5d11..4adcb2f58 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -33566,3 +33566,82 @@ check_column "" Encap ip chassis_name=hv1 type=vxlan
 OVN_CLEANUP([hv1],[hv2])
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([controllers don't touch tunnels that are not on br-int])
+ovn_start
+net_add n1
+
+sim_add hv1
+as hv1
+ovs-vsctl add-br br-phys1
+ovn_attach n1 br-phys1 192.168.0.1
+
+# use a port as a canary in the mine to wait until the controller is up
+# (meaning, ssl configuration was read 

[ovs-dev] [PATCH ovn v8 3/7] Allow to override system-id via file

2022-11-30 Thread Ihar Hrachyshka
Before the patch, system-id could be configured via a global config
option in ovsdb. This patch adds another option - configure system-id
via a file. This is achieved by writing the desired system-id into the
following file location: ${OVN_SYSCONFDIR}/system-id-override.

The file is read on controller startup. The file setting overrides
configuration stored in ovsdb, if any.

This may be useful when running multiple containerized controller
instances using the same vswitchd.

Signed-off-by: Ihar Hrachyshka 
---
 controller/chassis.c|  6 +
 controller/chassis.h|  2 ++
 controller/ovn-controller.8.xml |  5 -
 controller/ovn-controller.c | 39 +
 tests/ovn.at| 36 ++
 tests/ovs-macros.at |  2 ++
 6 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/controller/chassis.c b/controller/chassis.c
index 6c88c96d6..8566f5e40 100644
--- a/controller/chassis.c
+++ b/controller/chassis.c
@@ -37,6 +37,8 @@ VLOG_DEFINE_THIS_MODULE(chassis);
 #define HOST_NAME_MAX 255
 #endif /* HOST_NAME_MAX */
 
+char *file_system_id = NULL;
+
 /*
  * Structure for storing the chassis config parsed from the ovs table.
  */
@@ -827,6 +829,10 @@ chassis_get_mac(const struct sbrec_chassis *chassis_rec,
 const char *
 get_ovs_chassis_id(const struct ovsrec_open_vswitch_table *ovs_table)
 {
+if (file_system_id) {
+return file_system_id;
+}
+
 const struct ovsrec_open_vswitch *cfg
 = ovsrec_open_vswitch_table_first(ovs_table);
 const char *chassis_id = cfg ? smap_get(>external_ids, "system-id")
diff --git a/controller/chassis.h b/controller/chassis.h
index 6bf5d7095..ea085ed47 100644
--- a/controller/chassis.h
+++ b/controller/chassis.h
@@ -33,6 +33,8 @@ struct sset;
 struct eth_addr;
 struct smap;
 
+extern char *file_system_id;
+
 void chassis_register_ovs_idl(struct ovsdb_idl *);
 const struct sbrec_chassis *chassis_run(
 struct ovsdb_idl_txn *ovnsb_idl_txn,
diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml
index 14fb15752..1783209c1 100644
--- a/controller/ovn-controller.8.xml
+++ b/controller/ovn-controller.8.xml
@@ -73,7 +73,10 @@
   not directly supported.  Users have two options: either first
   gracefully stop ovn-controller or manually delete the
   stale Chassis and Chassis_Private records
-  after changing the system-id.
+  after changing the system-id. Note that the chassis name can
+  also be provided via the system-id-override file in the
+  local OVN "etc" directory. The file configuration overrides the one from
+  the database, if both are present.
 
   external_ids:hostname
   The hostname to use in the Chassis table.
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 680ad1d9e..d648f221b 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -55,6 +56,7 @@
 #include "lib/ip-mcast-index.h"
 #include "lib/mac-binding-index.h"
 #include "lib/mcast-group-index.h"
+#include "lib/ovn-dirs.h"
 #include "lib/ovn-sb-idl.h"
 #include "lib/ovn-util.h"
 #include "patch.h"
@@ -152,6 +154,37 @@ struct pending_pkt {
 /* Registered ofctrl seqno type for nb_cfg propagation. */
 static size_t ofctrl_seq_type_nb_cfg;
 
+static void
+remove_newline(char *s)
+{
+char *last = [strlen(s) - 1];
+switch (*last) {
+case '\n':
+case '\r':
+*last = '\0';
+default:
+return;
+}
+}
+
+static char *get_file_system_id(void)
+{
+char *ret = NULL;
+char *filename = xasprintf("%s/system-id-override", ovn_sysconfdir());
+errno = 0;
+FILE *f = fopen(filename, "r");
+if (f) {
+char system_id[64];
+if (fgets(system_id, sizeof system_id, f)) {
+remove_newline(system_id);
+ret = xstrdup(system_id);
+}
+fclose(f);
+}
+free(filename);
+return ret;
+}
+
 static unsigned int
 update_sb_monitors(struct ovsdb_idl *ovnsb_idl,
const struct sbrec_chassis *chassis,
@@ -3611,6 +3644,9 @@ main(int argc, char *argv[])
 struct ovn_controller_exit_args exit_args = {, };
 int retval;
 
+/* Read from system-id-override file once on startup. */
+file_system_id = get_file_system_id();
+
 ovs_cmdl_proctitle_init(argc, argv);
 ovn_set_program_name(argv[0]);
 service_start(, );
@@ -4588,6 +4624,9 @@ loop_done:
 
 ovs_feature_support_destroy();
 free(ovs_remote);
+if (file_system_id) {
+free(file_system_id);
+}
 service_stop();
 
 exit(retval);
diff --git a/tests/ovn.at b/tests/ovn.at
index b9bbd6e49..af7bba137 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -33491,3 +33491,39 @@ check_column "" Encap ip chassis_name=hv1 type=vxlan
 OVN_CLEANUP([hv1],[hv2])
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD([

[ovs-dev] [PATCH ovn v8 1/7] Include "chassis index" into tunnel port name

2022-11-30 Thread Ihar Hrachyshka
This is in preparation to support multiple separate controller instances
with distinct chassis names operating on the same vswitchd instance.

To avoid conflicts, this patch introduces a unique "index" (from 0-9a-z
range) into the port name. Each chassis allocates a separate index for
itself on startup. The index is then stored in
Open_vSwitch:other_config:ovn-chassis-idx- key.

An alternative would be including source chassis name into the port
name, but the length is limited by IFNAMSIZ defined in kernel, which is
15.

Signed-off-by: Ihar Hrachyshka 
---
 controller/chassis.c| 133 ++--
 controller/chassis.h|   9 ++-
 controller/encaps.c |  15 ++--
 controller/ovn-controller.c |  24 ++-
 tests/automake.mk   |   1 +
 tests/ovn.at|  33 +
 6 files changed, 187 insertions(+), 28 deletions(-)

diff --git a/controller/chassis.c b/controller/chassis.c
index 685d9b2ae..49c3af832 100644
--- a/controller/chassis.c
+++ b/controller/chassis.c
@@ -794,21 +794,146 @@ chassis_get_mac(const struct sbrec_chassis *chassis_rec,
 return ret;
 }
 
+const char *
+get_ovs_chassis_id(const struct ovsrec_open_vswitch_table *ovs_table)
+{
+const struct ovsrec_open_vswitch *cfg
+= ovsrec_open_vswitch_table_first(ovs_table);
+const char *chassis_id = cfg ? smap_get(>external_ids, "system-id")
+ : NULL;
+
+if (!chassis_id) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+VLOG_WARN_RL(, "'system-id' in Open_vSwitch database is missing.");
+}
+
+return chassis_id;
+}
+
+static bool
+is_chassis_idx_stored(const struct ovsrec_open_vswitch_table *ovs_table)
+{
+const struct ovsrec_open_vswitch *cfg =
+ovsrec_open_vswitch_table_first(ovs_table);
+const char *chassis_id = get_ovs_chassis_id(ovs_table);
+if (!chassis_id) {
+return false;
+}
+char *idx_key = xasprintf(CHASSIS_IDX_PREFIX "%s", chassis_id);
+const char *idx = smap_get(>other_config, idx_key);
+free(idx_key);
+return !!idx;
+}
+
+const char *get_chassis_idx(const struct ovsrec_open_vswitch_table *ovs_table)
+{
+const struct ovsrec_open_vswitch *cfg =
+ovsrec_open_vswitch_table_first(ovs_table);
+const char *chassis_id = get_ovs_chassis_id(ovs_table);
+if (!chassis_id) {
+return "";
+}
+char *idx_key = xasprintf(CHASSIS_IDX_PREFIX "%s", chassis_id);
+const char *idx = smap_get_def(>other_config, idx_key, "");
+free(idx_key);
+return idx;
+}
+
+void
+store_chassis_index_if_needed(
+const struct ovsrec_open_vswitch_table *ovs_table)
+{
+const struct ovsrec_open_vswitch *cfg =
+ovsrec_open_vswitch_table_first(ovs_table);
+const char *chassis_id = get_ovs_chassis_id(ovs_table);
+
+char *idx_key = xasprintf(CHASSIS_IDX_PREFIX "%s", chassis_id);
+const char *chassis_idx = smap_get(>other_config, idx_key);
+if (!chassis_idx) {
+/* Collect all indices so far consumed by other chassis. */
+struct sset used_indices = SSET_INITIALIZER(_indices);
+struct smap_node *node;
+SMAP_FOR_EACH (node, >other_config) {
+if (!strncmp(node->key, CHASSIS_IDX_PREFIX,
+sizeof(CHASSIS_IDX_PREFIX) - 1)) {
+sset_add(_indices, node->value);
+}
+}
+/* First chassis on the host: use an empty string to avoid adding an
+ * unnecessary index character to tunnel port names when a single
+ * controller is running on the host (the most common case). */
+if (!sset_contains(_indices, "")) {
+ovsrec_open_vswitch_update_other_config_setkey(
+cfg, idx_key, "");
+goto out;
+}
+/* Next chassis gets an alphanum index allocated. */
+char idx[] = "0";
+for (char i = '0'; i <= '9'; i++) {
+idx[0] = i;
+if (!sset_contains(_indices, idx)) {
+ovsrec_open_vswitch_update_other_config_setkey(
+cfg, idx_key, idx);
+goto out;
+}
+}
+for (char i = 'a'; i <= 'z'; i++) {
+idx[0] = i;
+if (!sset_contains(_indices, idx)) {
+ovsrec_open_vswitch_update_other_config_setkey(
+cfg, idx_key, idx);
+goto out;
+}
+}
+/* All indices consumed: it's safer to just abort. */
+VLOG_ERR("All unique controller indices consumed. Exiting.");
+exit(EXIT_FAILURE);
+}
+out:
+free(idx_key);
+}
+
+static void
+clear_chassis_index_if_needed(
+const struct ovsrec_open_vswitch_table *ovs_table)
+{
+const struct ovsrec_open_vswitch *cfg =
+ovsrec_open_vswitch_table_first(ovs_table);
+const char *chassis_id = get_ovs_chassis_id(ovs_table);
+char *idx_key = xasprintf(CHASSIS_IDX_PREFIX "%s", 

[ovs-dev] [PATCH ovn v8 4/7] Support passing chassis name via CLI

2022-11-30 Thread Ihar Hrachyshka
This patch adds support for the desired system-id (chassis name) to be
passed via CLI:

$ ovn-controller -n 

If passed, CLI overrides any settings stored in ovsdb or in
system-id-override file.

This may be useful when running multiple controller instances using the
same vswitchd instance.

Signed-off-by: Ihar Hrachyshka 
---
 controller/chassis.c|  5 +
 controller/chassis.h|  1 +
 controller/ovn-controller.8.xml |  6 +++--
 controller/ovn-controller.c |  9 
 tests/ovn-macros.at |  4 ++--
 tests/ovn.at| 39 +
 6 files changed, 60 insertions(+), 4 deletions(-)

diff --git a/controller/chassis.c b/controller/chassis.c
index 8566f5e40..77e892bee 100644
--- a/controller/chassis.c
+++ b/controller/chassis.c
@@ -37,6 +37,7 @@ VLOG_DEFINE_THIS_MODULE(chassis);
 #define HOST_NAME_MAX 255
 #endif /* HOST_NAME_MAX */
 
+char *cli_system_id = NULL;
 char *file_system_id = NULL;
 
 /*
@@ -829,6 +830,10 @@ chassis_get_mac(const struct sbrec_chassis *chassis_rec,
 const char *
 get_ovs_chassis_id(const struct ovsrec_open_vswitch_table *ovs_table)
 {
+if (cli_system_id) {
+return cli_system_id;
+}
+
 if (file_system_id) {
 return file_system_id;
 }
diff --git a/controller/chassis.h b/controller/chassis.h
index ea085ed47..03cc2f906 100644
--- a/controller/chassis.h
+++ b/controller/chassis.h
@@ -33,6 +33,7 @@ struct sset;
 struct eth_addr;
 struct smap;
 
+extern char *cli_system_id;
 extern char *file_system_id;
 
 void chassis_register_ovs_idl(struct ovsdb_idl *);
diff --git a/controller/ovn-controller.8.xml b/controller/ovn-controller.8.xml
index 1783209c1..73a3bbf9b 100644
--- a/controller/ovn-controller.8.xml
+++ b/controller/ovn-controller.8.xml
@@ -75,8 +75,10 @@
   stale Chassis and Chassis_Private records
   after changing the system-id. Note that the chassis name can
   also be provided via the system-id-override file in the
-  local OVN "etc" directory. The file configuration overrides the one from
-  the database, if both are present.
+  local OVN "etc" directory or via the -n command-line option.
+  The following precedence is used: first, the command-line option is read;
+  if not present, the system-id-override file is read; if not
+  present, then the name configured in the database is used.
 
   external_ids:hostname
   The hostname to use in the Chassis table.
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index d648f221b..6842bca8f 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -4627,6 +4627,9 @@ loop_done:
 if (file_system_id) {
 free(file_system_id);
 }
+if (cli_system_id) {
+free(cli_system_id);
+}
 service_stop();
 
 exit(retval);
@@ -4652,6 +4655,7 @@ parse_options(int argc, char *argv[])
 STREAM_SSL_LONG_OPTIONS,
 {"peer-ca-cert", required_argument, NULL, OPT_PEER_CA_CERT},
 {"bootstrap-ca-cert", required_argument, NULL, OPT_BOOTSTRAP_CA_CERT},
+{"chassis", required_argument, NULL, 'n'},
 {"enable-dummy-vif-plug", no_argument, NULL,
  OPT_ENABLE_DUMMY_VIF_PLUG},
 {NULL, 0, NULL, 0}
@@ -4703,6 +4707,10 @@ parse_options(int argc, char *argv[])
 vif_plug_dummy_enable();
 break;
 
+case 'n':
+cli_system_id = xstrdup(optarg);
+break;
+
 case '?':
 exit(EXIT_FAILURE);
 
@@ -4738,6 +4746,7 @@ usage(void)
 daemon_usage();
 vlog_usage();
 printf("\nOther options:\n"
+   "  -n  custom chassis name\n"
"  -h, --help  display this help message\n"
"  -V, --version   display version information\n");
 exit(EXIT_SUCCESS);
diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
index 8266ae526..ef7191c6a 100644
--- a/tests/ovn-macros.at
+++ b/tests/ovn-macros.at
@@ -290,7 +290,7 @@ net_attach () {
 
 # ovn_az_attach AZ NETWORK BRIDGE IP [MASKLEN] [ENCAP]
 ovn_az_attach() {
-local az=$1 net=$2 bridge=$3 ip=$4 masklen=${5-24} encap=${6-geneve,vxlan} 
systemid=${7-$sandbox}
+local az=$1 net=$2 bridge=$3 ip=$4 masklen=${5-24} encap=${6-geneve,vxlan} 
systemid=${7-$sandbox} cli_args=${@:8}
 net_attach $net $bridge || return 1
 
 mac=`ovs-vsctl get Interface $bridge mac_in_use | sed s/\"//g`
@@ -331,7 +331,7 @@ ovn_az_attach() {
 ovs-vsctl set open . external_ids:ovn-monitor-all=true
 fi
 
-start_daemon ovn-controller --enable-dummy-vif-plug || return 1
+start_daemon ovn-controller --enable-dummy-vif-plug ${cli_args} || return 1
 }
 
 # ovn_attach NETWORK BRIDGE IP [MASKLEN] [ENCAP]
diff --git a/tests/ovn.at b/tests/ovn.at
index af7bba137..6929e5d11 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -33527,3 +33527,42 @@ check_column "" Encap ip chassis_name=hv1 type=vxlan
 OVN_CLEANUP([hv1],[hv2])
 

[ovs-dev] [PATCH ovn v8 0/7] Support 2+ controllers on the same vswitchd

2022-11-30 Thread Ihar Hrachyshka
This series adds support to run multiple ovn-controller instances using
the same vswitchd instance. This may be used to reuse a single host
level vswitchd installation to run multiple CMS (e.g. k8s and
openstack), each having its own OVN stack running on a separate
integration bridge.

This setup may, in some instances, simplify administration of the
system, since the admin no longer needs to maintain separate vswitchd
installations (e.g. in separate containers). This is also helpful when
running different datapath types for the mixed setup.

v1: initial series
v2: change tunnel port naming scheme: include "chassis index" instead of
its name for source chassis.
v2: formatting adjustments.
v3: fixed build due to ovs_abort missing arguments.
v3: added documentation to CLI and system-id-override file.
v3: added documentation for chassis specific db config options.
v3: documented the ability to run multiple controllers on the same host,
while mentioning that this support is highly experimental.
v3: updated NEWS file to include the note about the new experimental
issue.
v3: rebased.
v4: fixed a memory leak in get_chassis_idx.
v5: actually fix the leak...
v6: fix race condition in new test cases where ssl db configuration was
removed before ovn-controller has a chance to read it from db,
making it fail to start and process ports
v7: addresses Mark's comments from v6, specifically:
- 1/7: Clean up allocated index on exit
- 1/7: Remove hardcoded 16 with sizeof(CHASSIS_IDX_PREFIX) - 1
- 1/7: Explain in comments why the first chassis uses an empty
  string index and not “0”
- 2/7: Document that requested-chassis should use unique chassis
  names, not hostnames
- 2/7: Document that unique own-bridges should be used for multiple
  co-hosted controllers
- 2/7: Reworked logic for get_chassis_external_value functions to
  avoid duplication with smap_get_ functions
- 2/7: Use wait_column in tests
- 3/7: Add information about the directory where system-id-override
  file is located
- 3/7: Use FILE * functions from stdio.h instead of Unix lower level
  equivalents
- 3/7: Use wait_column in tests
- 4/7: Also validate that CLI takes precedence over override file,
  not only db
- 4/7: Use wait_column in tests
- 5/7: - (no changes)
- 6/7: Move start_virtual_controller to ovn-macros.at to avoid code
  duplication
- 6/7: Added comments explaining why test cases running two
  controllers use the same sandbox name for both
- 7/7: - (no changes)
v8:
- 1/7: fixed test case for ovn-chassis-idx population
- 1/7: fixed SIGSEGV errors in chassis_cleanup for chassis index

Ihar Hrachyshka (7):
  Include "chassis index" into tunnel port name
  Support ovn-...- specific global ovsdb options
  Allow to override system-id via file
  Support passing chassis name via CLI
  Don't touch tunnel ports from a different br-int
  Add connectivity test for 2 controllers on the same host
  Document experimental support for co-hosted controllers

 NEWS|   2 +
 controller/chassis.c| 280 --
 controller/chassis.h|  15 +-
 controller/encaps.c |  76 
 controller/encaps.h |   1 -
 controller/ovn-controller.8.xml |  35 +++-
 controller/ovn-controller.c | 182 +--
 controller/patch.c  |   6 +-
 controller/physical.c   |   2 +-
 lib/ovn-util.c  |  79 +
 lib/ovn-util.h  |  26 +++
 ovn-nb.xml  |   8 +
 tests/automake.mk   |   1 +
 tests/ovn-macros.at |  36 +++-
 tests/ovn.at| 299 
 tests/ovs-macros.at |   2 +
 16 files changed, 892 insertions(+), 158 deletions(-)

-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1] ovs-ctl: Allow inclusion of hugepages in coredumps

2022-11-30 Thread Ilya Maximets
On 11/30/22 20:52, David Marchand wrote:
> On Wed, Nov 30, 2022 at 8:46 PM Ilya Maximets  wrote:
>>
>> On 11/30/22 20:11, Mike Pattrick wrote:
>>> On Wed, Nov 30, 2022 at 7:27 AM Ilya Maximets  wrote:

 On 11/25/22 18:19, Adrian Moreno wrote:
> Hi Mike,
>
> Sorry it took that long to review this patch.
>
> On 3/25/22 23:17, Mike Pattrick wrote:
>> Add new option --dump-hugepages option in ovs-ctl to enable the addition
>> of hugepages in the core dump filter.
>>
>> Signed-off-by: Mike Pattrick 
>> ---
>>   NEWS |  4 
>>   utilities/ovs-ctl.in | 15 +++
>>   2 files changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/NEWS b/NEWS
>> index 8fa57836a..7af60dce3 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -3,6 +3,10 @@ Post-v2.17.0
>>  - OVSDB:
>>* 'relay' service model now supports transaction history, i.e. 
>> honors the
>>  'last-txn-id' field in 'monitor_cond_since' requests from 
>> clients.
>> +   - ovs-ctl:
>> + * New option '--dump-hugepages' to include hugepages in core 
>> dumps. This
>> +   can assist with postmortem analysis involving DPDK, but may also 
>> produce
>> +   significantly larger core dump files.
>>
>
> I'm afraid this part needs rebasing.
>
>> v2.17.0 - 17 Feb 2022
>> diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
>> index e6e07f476..8f900314b 100644
>> --- a/utilities/ovs-ctl.in
>> +++ b/utilities/ovs-ctl.in
>> @@ -103,8 +103,13 @@ set_system_ids () {
>>   action "Configuring Open vSwitch system IDs" "$@" $extra_ids
>>   }
>>   -check_force_cores () {
>> -if test X"$FORCE_COREFILES" = Xyes; then
>> +check_core_config () {
>> +if test X"$DUMP_HUGEPAGES" = Xyes; then
>> +echo 0x3f > /proc/self/coredump_filter

 Shouldn't this be 0x7f instead?
 0x3f doesn't enable bit #6, which is responsible for dumping
 shared huge pages.  Or am I missing something?
>>>
>>> That's a good point, the hugepage may or may not be private. I'll send
>>> in a new one.
>>
>> OK.  One thing to think about though is that we'll grab
>> VM memory, I guess, in case we have vhost-user ports.
>> So, the core dump size can become insanely huge.
>>
>> The downside of not having them is inability to inspect
>> virtqueues and stuff in the dump.
> 
> Did you consider madvise()?
> 
>MADV_DONTDUMP (since Linux 3.4)
>   Exclude from a core dump those pages in the range
> specified by addr and length.  This is useful in applications that
> have large areas of memory that are known not to be useful in a core
> dump.  The effect of  MADV_DONT‐
>   DUMP takes precedence over the bit mask that is set via
> the /proc/[pid]/coredump_filter file (see core(5)).
> 
>MADV_DODUMP (since Linux 3.4)
>   Undo the effect of an earlier MADV_DONTDUMP.

I don't think OVS actually knows location of particular VM memory
pages that we do not need.  And dumping virtqueues and stuff is,
probably, the point of this patch (?).

vhost-user library might have a better idea on which particular parts
of the memory guest may use for virtqueues and buffers, but I'm not
100% sure.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v2] dpif-netdev: Put idle PMD threads to sleep.

2022-11-30 Thread Thilak Raj Surendra Babu
Hi David,
For my understanding, is part being woken up by a RX interrupt not desirable?
I am under the assumption that all NICs do support rx-interrupts through the fd.
Please correct me if I am wrong.

Will take a look at Kevin's patch and try it out as well.

Thanks
Thilak Raj S

-Original Message-
From: David Marchand  
Sent: 29 November 2022 08:05
To: Thilak Raj Surendra Babu 
Cc: d...@openvswitch.org; Karthik Chandrashekar 
Subject: Re: [ovs-dev] [RFC PATCH v2] dpif-netdev: Put idle PMD threads to 
sleep.

Hello,

On Fri, Sep 9, 2022 at 11:48 PM Thilak Raj Surendra Babu 
 wrote:
>
> Hi David,
> We had a similar thought process and interested in this patch.
> Wondering if we could help you out on making progress on this patch ?

We have been working on the power saving topic for a while and Kevin had 
proposed an alternative that does not require hw/driver assistance.
For now, my idea on using dpdk interrupts is put on hold unless some needs 
arise.

It would be great if you could look at and test Kevin patch:
https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.ozlabs.org_project_openvswitch_patch_20221129140131.361338-2D1-2Dktraynor-40redhat.com_=DwIBaQ=s883GpUCOChKOHiocYtGcg=1sd3waKor_ps6hs2j0tfqmW6ts2tlVvmmMySlXCPN6w=myKkErCB3zuTvahDVr9Kn2XxyqgStbO5DorUIZyXluMTdl2XjeDPlYGCl9lIFhnw=m1Z3EvUIizkTYbgOe7RtLXoraM-4Caw8Aq2nICx3uFY=
 

Thanks!


--
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1] ovs-ctl: Allow inclusion of hugepages in coredumps

2022-11-30 Thread David Marchand
On Wed, Nov 30, 2022 at 8:46 PM Ilya Maximets  wrote:
>
> On 11/30/22 20:11, Mike Pattrick wrote:
> > On Wed, Nov 30, 2022 at 7:27 AM Ilya Maximets  wrote:
> >>
> >> On 11/25/22 18:19, Adrian Moreno wrote:
> >>> Hi Mike,
> >>>
> >>> Sorry it took that long to review this patch.
> >>>
> >>> On 3/25/22 23:17, Mike Pattrick wrote:
>  Add new option --dump-hugepages option in ovs-ctl to enable the addition
>  of hugepages in the core dump filter.
> 
>  Signed-off-by: Mike Pattrick 
>  ---
>    NEWS |  4 
>    utilities/ovs-ctl.in | 15 +++
>    2 files changed, 15 insertions(+), 4 deletions(-)
> 
>  diff --git a/NEWS b/NEWS
>  index 8fa57836a..7af60dce3 100644
>  --- a/NEWS
>  +++ b/NEWS
>  @@ -3,6 +3,10 @@ Post-v2.17.0
>   - OVSDB:
> * 'relay' service model now supports transaction history, i.e. 
>  honors the
>   'last-txn-id' field in 'monitor_cond_since' requests from 
>  clients.
>  +   - ovs-ctl:
>  + * New option '--dump-hugepages' to include hugepages in core 
>  dumps. This
>  +   can assist with postmortem analysis involving DPDK, but may also 
>  produce
>  +   significantly larger core dump files.
> 
> >>>
> >>> I'm afraid this part needs rebasing.
> >>>
>  v2.17.0 - 17 Feb 2022
>  diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
>  index e6e07f476..8f900314b 100644
>  --- a/utilities/ovs-ctl.in
>  +++ b/utilities/ovs-ctl.in
>  @@ -103,8 +103,13 @@ set_system_ids () {
>    action "Configuring Open vSwitch system IDs" "$@" $extra_ids
>    }
>    -check_force_cores () {
>  -if test X"$FORCE_COREFILES" = Xyes; then
>  +check_core_config () {
>  +if test X"$DUMP_HUGEPAGES" = Xyes; then
>  +echo 0x3f > /proc/self/coredump_filter
> >>
> >> Shouldn't this be 0x7f instead?
> >> 0x3f doesn't enable bit #6, which is responsible for dumping
> >> shared huge pages.  Or am I missing something?
> >
> > That's a good point, the hugepage may or may not be private. I'll send
> > in a new one.
>
> OK.  One thing to think about though is that we'll grab
> VM memory, I guess, in case we have vhost-user ports.
> So, the core dump size can become insanely huge.
>
> The downside of not having them is inability to inspect
> virtqueues and stuff in the dump.

Did you consider madvise()?

   MADV_DONTDUMP (since Linux 3.4)
  Exclude from a core dump those pages in the range
specified by addr and length.  This is useful in applications that
have large areas of memory that are known not to be useful in a core
dump.  The effect of  MADV_DONT‐
  DUMP takes precedence over the bit mask that is set via
the /proc/[pid]/coredump_filter file (see core(5)).

   MADV_DODUMP (since Linux 3.4)
  Undo the effect of an earlier MADV_DONTDUMP.



-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v5] odp-execute: Add ISA implementation of set_masked IPv6 action

2022-11-30 Thread Aaron Conole
"Phelan, Michael"  writes:

>> -Original Message-
>> From: Finn, Emma 
>> Sent: Wednesday 30 November 2022 14:15
>> To: Eelco Chaudron ; Phelan, Michael
>> 
>> Cc: Ilya Maximets ; d...@openvswitch.org; Van
>> Haaren, Harry ; Stokes, Ian
>> 
>> Subject: RE: [ovs-dev] [v5] odp-execute: Add ISA implementation of
>> set_masked IPv6 action
>  
>> > >> I'm also wondering why CI didn't catch that...
>> > >>
>> > >> There might be 2 reasons:
>> > >>
>> > >> 1. Actions autovalidator is not enabled in CI, or 2. CI system
>> > >> doesn't have avx512vbmi.
>> > >>
>> > >> Michael, could you check that?
>> > >
>> > > Hi Ilya,
>> > > The CI system does have avx512vbmi, however, the actions
>> > > autovalidator is
>> > never enabled for any of the tests.
>> > >
>> > > I could add a test to configure with the actions autovalidator if
>> > > you think
>> > this would be a good value add for the CI?
>> >
>> > I would suggest doing a run with and without all the avx512 auto
>> > validators enabled at compile time.
>> >
> Hi Eelco,
> I believe make check-local is run through the GitHub Build and Test
> job, Aaron you might correct me if I'm wrong on that.

That job does run 'make check' and I think it is the same thing.

> If this is the case then is there a need to do a check without AVX512 enabled 
> on the Intel CI?

I am not sure what the case is that isn't covered.  Maybe Eelco has a
thought?

> Kind Regards,
> Michael.
>> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1] ovs-ctl: Allow inclusion of hugepages in coredumps

2022-11-30 Thread Ilya Maximets
On 11/30/22 20:11, Mike Pattrick wrote:
> On Wed, Nov 30, 2022 at 7:27 AM Ilya Maximets  wrote:
>>
>> On 11/25/22 18:19, Adrian Moreno wrote:
>>> Hi Mike,
>>>
>>> Sorry it took that long to review this patch.
>>>
>>> On 3/25/22 23:17, Mike Pattrick wrote:
 Add new option --dump-hugepages option in ovs-ctl to enable the addition
 of hugepages in the core dump filter.

 Signed-off-by: Mike Pattrick 
 ---
   NEWS |  4 
   utilities/ovs-ctl.in | 15 +++
   2 files changed, 15 insertions(+), 4 deletions(-)

 diff --git a/NEWS b/NEWS
 index 8fa57836a..7af60dce3 100644
 --- a/NEWS
 +++ b/NEWS
 @@ -3,6 +3,10 @@ Post-v2.17.0
  - OVSDB:
* 'relay' service model now supports transaction history, i.e. 
 honors the
  'last-txn-id' field in 'monitor_cond_since' requests from clients.
 +   - ovs-ctl:
 + * New option '--dump-hugepages' to include hugepages in core dumps. 
 This
 +   can assist with postmortem analysis involving DPDK, but may also 
 produce
 +   significantly larger core dump files.

>>>
>>> I'm afraid this part needs rebasing.
>>>
 v2.17.0 - 17 Feb 2022
 diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
 index e6e07f476..8f900314b 100644
 --- a/utilities/ovs-ctl.in
 +++ b/utilities/ovs-ctl.in
 @@ -103,8 +103,13 @@ set_system_ids () {
   action "Configuring Open vSwitch system IDs" "$@" $extra_ids
   }
   -check_force_cores () {
 -if test X"$FORCE_COREFILES" = Xyes; then
 +check_core_config () {
 +if test X"$DUMP_HUGEPAGES" = Xyes; then
 +echo 0x3f > /proc/self/coredump_filter
>>
>> Shouldn't this be 0x7f instead?
>> 0x3f doesn't enable bit #6, which is responsible for dumping
>> shared huge pages.  Or am I missing something?
> 
> That's a good point, the hugepage may or may not be private. I'll send
> in a new one.

OK.  One thing to think about though is that we'll grab
VM memory, I guess, in case we have vhost-user ports.
So, the core dump size can become insanely huge.

The downside of not having them is inability to inspect
virtqueues and stuff in the dump.

> 
> Cheers,
> M
> 
>>
>> Best regards, Ilya Maximets.
>>
 +if test X"$FORCE_COREFILES" = Xyes; then
 +ulimit -c unlimited
 +fi
 +elif test X"$FORCE_COREFILES" = Xyes; then
   ulimit -c 67108864
   fi
   }
 @@ -116,7 +121,7 @@ del_transient_ports () {
   }
 do_start_ovsdb () {
 -check_force_cores
 +check_core_config
 if daemon_is_running ovsdb-server; then
   log_success_msg "ovsdb-server is already running"
 @@ -193,7 +198,7 @@ add_managers () {
   }
 do_start_forwarding () {
 -check_force_cores
 +check_core_config
 insert_mod_if_required || return 1
   @@ -330,6 +335,7 @@ set_defaults () {
 DAEMON_CWD=/
   FORCE_COREFILES=yes
 +DUMP_HUGEPAGES=no
   MLOCKALL=yes
   SELF_CONFINEMENT=yes
   MONITOR=yes
 @@ -419,6 +425,7 @@ Other important options for "start", "restart" and 
 "force-reload-kmod":
   Less important options for "start", "restart" and "force-reload-kmod":
 --daemon-cwd=DIR   set working dir for OVS daemons 
 (default: $DAEMON_CWD)
 --no-force-corefiles   do not force on core dumps for OVS 
 daemons
 +  --dump-hugepages   include hugepages in coredumps
 --no-mlockall  do not lock all of ovs-vswitchd into 
 memory
 --ovsdb-server-priority=NICE   set ovsdb-server's niceness (default: 
 $OVSDB_SERVER_PRIORITY)
 --ovsdb-server-options=OPTIONS additional options for ovsdb-server 
 (example: '-vconsole:dbg -vfile:dbg')

>>>
>>> Tested locally and verified that with the option hugepages appear in 
>>> coredumps.
>>> Apart from the need to rebase the NEWS, the patch looks good to me.
>>>
>>> Acked-by: Adrian Moreno 
>>>
>>> --
>>> Adrián Moreno
>>
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1] ovs-ctl: Allow inclusion of hugepages in coredumps

2022-11-30 Thread Mike Pattrick
On Wed, Nov 30, 2022 at 7:27 AM Ilya Maximets  wrote:
>
> On 11/25/22 18:19, Adrian Moreno wrote:
> > Hi Mike,
> >
> > Sorry it took that long to review this patch.
> >
> > On 3/25/22 23:17, Mike Pattrick wrote:
> >> Add new option --dump-hugepages option in ovs-ctl to enable the addition
> >> of hugepages in the core dump filter.
> >>
> >> Signed-off-by: Mike Pattrick 
> >> ---
> >>   NEWS |  4 
> >>   utilities/ovs-ctl.in | 15 +++
> >>   2 files changed, 15 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/NEWS b/NEWS
> >> index 8fa57836a..7af60dce3 100644
> >> --- a/NEWS
> >> +++ b/NEWS
> >> @@ -3,6 +3,10 @@ Post-v2.17.0
> >>  - OVSDB:
> >>* 'relay' service model now supports transaction history, i.e. 
> >> honors the
> >>  'last-txn-id' field in 'monitor_cond_since' requests from clients.
> >> +   - ovs-ctl:
> >> + * New option '--dump-hugepages' to include hugepages in core dumps. 
> >> This
> >> +   can assist with postmortem analysis involving DPDK, but may also 
> >> produce
> >> +   significantly larger core dump files.
> >>
> >
> > I'm afraid this part needs rebasing.
> >
> >> v2.17.0 - 17 Feb 2022
> >> diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
> >> index e6e07f476..8f900314b 100644
> >> --- a/utilities/ovs-ctl.in
> >> +++ b/utilities/ovs-ctl.in
> >> @@ -103,8 +103,13 @@ set_system_ids () {
> >>   action "Configuring Open vSwitch system IDs" "$@" $extra_ids
> >>   }
> >>   -check_force_cores () {
> >> -if test X"$FORCE_COREFILES" = Xyes; then
> >> +check_core_config () {
> >> +if test X"$DUMP_HUGEPAGES" = Xyes; then
> >> +echo 0x3f > /proc/self/coredump_filter
>
> Shouldn't this be 0x7f instead?
> 0x3f doesn't enable bit #6, which is responsible for dumping
> shared huge pages.  Or am I missing something?

That's a good point, the hugepage may or may not be private. I'll send
in a new one.

Cheers,
M

>
> Best regards, Ilya Maximets.
>
> >> +if test X"$FORCE_COREFILES" = Xyes; then
> >> +ulimit -c unlimited
> >> +fi
> >> +elif test X"$FORCE_COREFILES" = Xyes; then
> >>   ulimit -c 67108864
> >>   fi
> >>   }
> >> @@ -116,7 +121,7 @@ del_transient_ports () {
> >>   }
> >> do_start_ovsdb () {
> >> -check_force_cores
> >> +check_core_config
> >> if daemon_is_running ovsdb-server; then
> >>   log_success_msg "ovsdb-server is already running"
> >> @@ -193,7 +198,7 @@ add_managers () {
> >>   }
> >> do_start_forwarding () {
> >> -check_force_cores
> >> +check_core_config
> >> insert_mod_if_required || return 1
> >>   @@ -330,6 +335,7 @@ set_defaults () {
> >> DAEMON_CWD=/
> >>   FORCE_COREFILES=yes
> >> +DUMP_HUGEPAGES=no
> >>   MLOCKALL=yes
> >>   SELF_CONFINEMENT=yes
> >>   MONITOR=yes
> >> @@ -419,6 +425,7 @@ Other important options for "start", "restart" and 
> >> "force-reload-kmod":
> >>   Less important options for "start", "restart" and "force-reload-kmod":
> >> --daemon-cwd=DIR   set working dir for OVS daemons 
> >> (default: $DAEMON_CWD)
> >> --no-force-corefiles   do not force on core dumps for OVS 
> >> daemons
> >> +  --dump-hugepages   include hugepages in coredumps
> >> --no-mlockall  do not lock all of ovs-vswitchd into 
> >> memory
> >> --ovsdb-server-priority=NICE   set ovsdb-server's niceness (default: 
> >> $OVSDB_SERVER_PRIORITY)
> >> --ovsdb-server-options=OPTIONS additional options for ovsdb-server 
> >> (example: '-vconsole:dbg -vfile:dbg')
> >>
> >
> > Tested locally and verified that with the option hugepages appear in 
> > coredumps.
> > Apart from the need to rebase the NEWS, the patch looks good to me.
> >
> > Acked-by: Adrian Moreno 
> >
> > --
> > Adrián Moreno
>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH/RFC net-next] tc: allow drivers to accept gact with PIPE when offloading

2022-11-30 Thread Marcelo Leitner
On Wed, Nov 30, 2022 at 03:36:57AM +, Tianyu Yuan wrote:
>
> On Mon, Nov 29, 2022 at 8:35 PM , Eelco Chaudron wrote:
> >
> > On 28 Nov 2022, at 14:33, Marcelo Leitner wrote:
> >
> > > On Mon, Nov 28, 2022 at 02:17:40PM +0100, Eelco Chaudron wrote:
> > >>
> > >>
> > >> On 28 Nov 2022, at 14:11, Marcelo Leitner wrote:
> > >>
> > >>> On Mon, Nov 28, 2022 at 07:11:05AM +, Tianyu Yuan wrote:
> > > ...
> > 
> >  Furthermore, I think the current stats for each action mentioned in
> >  2) cannot represent the real hw stats and this is why [ RFC
> >  net-next v2 0/2] (net: flow_offload: add support for per action hw 
> >  stats)
> > will come up.
> > >>>
> > >>> Exactly. Then, when this patchset (or similar) come up, it won't
> > >>> update all actions with the same stats anymore. It will require a
> > >>> set of stats from hw for the gact with PIPE action here. But if
> > >>> drivers are ignoring this action, they can't have specific stats for
> > >>> it. Or am I missing something?
> > >>>
> > >>> So it is better for the drivers to reject the whole flow instead of
> > >>> simply ignoring it, and let vswitchd probe if it should or should
> > >>> not use this action.
> > >>
> > >> Please note that OVS does not probe features per interface, but does it
> > per datapath. So if it’s supported in pipe in tc software, we will use it. 
> > If the
> > driver rejects it, we will probably end up with the tc software rule only.
> > >
> > > Ah right. I remember it will pick 1 interface for testing and use
> > > those results everywhere, which then I don't know if it may or may not
> > > be a representor port or not. Anyhow, then it should use skip_sw, to
> > > try to probe for the offloading part. Otherwise I'm afraid tc sw will
> > > always accept this flow and trick the probing, yes.
> >
> > Well, it depends on how you look at it. In theory, we should be hardware
> > agnostic, meaning what if you have different hardware in your system? OVS
> > only supports global offload enablement.
> >
> > Tianyu how are you planning to support this from the OVS side? How would
> > you probe kernel and/or hardware support for this change?
>
> Currently in the test demo, I just extend gact with PIPE (previously only 
> SHOT as default and
> GOTO_CHAIN when chain exists), and then put such a gact with PIPE at the 
> first place of each
> filter which will be transacted with kernel tc.
>
> About the tc sw datapath mentioned, we don't have to make changes because 
> gact with PIPE
> has already been supported in current tc implementation and it could act like 
> a 'counter' And
> for the hardware we just need to ignore this PIPE and the stats of this 
> action will still be updated
> in kernel side and sent to userspace.

I can't see how the action would have stats from hw if the driver is
ignoring the action.

But maybe there was a misunderstanding here. I was reading more the
cxgb4 driver here and AFAICT this patch will skip PIPE on the action
validation, but not actually skip the action entirely. Then it will
hit cxgb4_process_flow_actions() and maybe the driver will the right
thing with a dummy action out of the blue. Was this your expectation,
to just ignore it in the validation step, and let it fall through
through the driver? If yes, the comments are misleading, as the NICs
will have to process the packets.

>
> I agree with that the unsupported actions should be rejected by drivers, so 
> may another approach
> could work without ignoring PIPE in all the related drivers, that we directly 
> make put the flower stats
> from driver into the socket which is used to transact with userspace and 
> userspace(e.g. OVS) update
> the flow stats using this stats instead of the parsing the action stats. How 
> do you think of this?

I don't understand this approach. Can you please rephrase?

Thanks,
Marcelo

>
> Cheers,
> Tianyu
> >
> > //Eelco
>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 3/3] dpif-netdev: Rename pmd_info_show_rxq variables.

2022-11-30 Thread Kevin Traynor
There are some similar readings taken for pmds and Rx queues
in this function and a few of the variable names are ambiguous.

Improve the readability of the code by updating some variables
names to indicate that they are readings related to the pmd.

Signed-off-by: Kevin Traynor 
---
 lib/dpif-netdev.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 74d265a0b..cb3eb02e5 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -881,6 +881,6 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd,
 struct rxq_poll *list;
 size_t n_rxq;
-uint64_t total_cycles = 0;
-uint64_t busy_cycles = 0;
+uint64_t total_pmd_cycles = 0;
+uint64_t busy_pmd_cycles = 0;
 uint64_t total_rxq_proc_cycles = 0;
 unsigned int intervals;
@@ -895,15 +895,15 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd,
 
 /* Get the total pmd cycles for an interval. */
-atomic_read_relaxed(>intrvl_cycles, _cycles);
+atomic_read_relaxed(>intrvl_cycles, _pmd_cycles);
 /* Calculate how many intervals are to be used. */
 intervals = DIV_ROUND_UP(secs,
  PMD_INTERVAL_LEN / INTERVAL_USEC_TO_SEC);
 /* Estimate the cycles to cover all intervals. */
-total_cycles *= intervals;
-busy_cycles = get_interval_values(pmd->busy_cycles_intrvl,
-  >intrvl_idx,
-  intervals);
-if (busy_cycles > total_cycles) {
-busy_cycles = total_cycles;
+total_pmd_cycles *= intervals;
+busy_pmd_cycles = get_interval_values(pmd->busy_cycles_intrvl,
+  >intrvl_idx,
+  intervals);
+if (busy_pmd_cycles > total_pmd_cycles) {
+busy_pmd_cycles = total_pmd_cycles;
 }
 
@@ -922,7 +922,7 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd,
 ? "(enabled) " : "(disabled)");
 ds_put_format(reply, "  pmd usage: ");
-if (total_cycles) {
+if (total_pmd_cycles) {
 ds_put_format(reply, "%2"PRIu64"",
-  rxq_proc_cycles * 100 / total_cycles);
+  rxq_proc_cycles * 100 / total_pmd_cycles);
 ds_put_cstr(reply, " %");
 } else {
@@ -934,12 +934,12 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd,
 if (n_rxq > 0) {
 ds_put_cstr(reply, "  overhead: ");
-if (total_cycles) {
+if (total_pmd_cycles) {
 uint64_t overhead_cycles = 0;
 
-if (total_rxq_proc_cycles < busy_cycles) {
-overhead_cycles = busy_cycles - total_rxq_proc_cycles;
+if (total_rxq_proc_cycles < busy_pmd_cycles) {
+overhead_cycles = busy_pmd_cycles - total_rxq_proc_cycles;
 }
 ds_put_format(reply, "%2"PRIu64" %%",
-  overhead_cycles * 100 / total_cycles);
+  overhead_cycles * 100 / total_pmd_cycles);
 } else {
 ds_put_cstr(reply, "NOT AVAIL");
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 1/3] dpif-netdev: Make pmd-rxq-show time configurable.

2022-11-30 Thread Kevin Traynor
pmd-rxq-show shows the Rx queue to pmd assignments as well as the
pmd usage of each Rx queue.

Up until now a tail length of 60 seconds pmd usage was shown
for each Rx queue, as this is the value used during rebalance
to avoid any spike effects.

When debugging or tuning, it is also convenient to display the
pmd usage of an Rx queue over a shorter time frame, so any changes
config or traffic that impact pmd usage can be evaluated more quickly.

A parameter is added that allows pmd-rxq-show stats pmd usage to
be shown for a shorter time frame. Values are rounded up to the
nearest 5 seconds as that is the measurement granularity and the value
used is displayed. e.g.

$ ovs-appctl dpif-netdev/pmd-rxq-show -secs 5
 Displaying last 5 seconds pmd usage %
 pmd thread numa_id 0 core_id 4:
   isolated : false
   port: dpdk0queue-id:  0 (enabled)   pmd usage: 95 %
   overhead:  4 %

The default time frame has not changed and the maximum value
is limited to the maximum stored tail length (60 seconds).

Signed-off-by: Kevin Traynor 
---
v2:
- fixed comments from David's review
- Squashed new unit tests into this patch
- docs can be squashed later
---
 lib/dpif-netdev-private-thread.h |  2 +-
 lib/dpif-netdev.c| 98 
 tests/pmd.at | 62 
 3 files changed, 138 insertions(+), 24 deletions(-)

diff --git a/lib/dpif-netdev-private-thread.h b/lib/dpif-netdev-private-thread.h
index 4472b199d..1ec3cd794 100644
--- a/lib/dpif-netdev-private-thread.h
+++ b/lib/dpif-netdev-private-thread.h
@@ -115,5 +115,5 @@ struct dp_netdev_pmd_thread {
 
 /* Write index for 'busy_cycles_intrvl'. */
-unsigned int intrvl_idx;
+atomic_count intrvl_idx;
 /* Busy cycles in last PMD_INTERVAL_MAX intervals. */
 atomic_ullong *busy_cycles_intrvl;
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 2c08a71c8..74d265a0b 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -161,9 +161,11 @@ static struct odp_support dp_netdev_support = {
 /* Time in microseconds of the interval in which rxq processing cycles used
  * in rxq to pmd assignments is measured and stored. */
-#define PMD_INTERVAL_LEN 1000LL
+#define PMD_INTERVAL_LEN 500LL
+/* For converting PMD_INTERVAL_LEN to secs. */
+#define INTERVAL_USEC_TO_SEC 100LL
 
 /* Number of intervals for which cycles are stored
  * and used during rxq to pmd assignment. */
-#define PMD_INTERVAL_MAX 6
+#define PMD_INTERVAL_MAX 12
 
 /* Time in microseconds to try RCU quiescing. */
@@ -429,5 +431,5 @@ struct dp_netdev_rxq {
   queue doesn't need to be pinned to a
   particular core. */
-unsigned intrvl_idx;   /* Write index for 'cycles_intrvl'. */
+atomic_count intrvl_idx;   /* Write index for 'cycles_intrvl'. */
 struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
 bool is_vhost; /* Is rxq of a vhost port. */
@@ -616,4 +618,7 @@ dp_netdev_rxq_set_intrvl_cycles(struct dp_netdev_rxq *rx,
 static uint64_t
 dp_netdev_rxq_get_intrvl_cycles(struct dp_netdev_rxq *rx, unsigned idx);
+static uint64_t
+get_interval_values(atomic_ullong *source, atomic_count *cur_idx,
+int num_to_read);
 static void
 dpif_netdev_xps_revalidate_pmd(const struct dp_netdev_pmd_thread *pmd,
@@ -870,5 +875,6 @@ sorted_poll_list(struct dp_netdev_pmd_thread *pmd, struct 
rxq_poll **list,
 
 static void
-pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
+pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd,
+  int secs)
 {
 if (pmd->core_id != NON_PMD_CORE_ID) {
@@ -878,4 +884,5 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd)
 uint64_t busy_cycles = 0;
 uint64_t total_rxq_proc_cycles = 0;
+unsigned int intervals;
 
 ds_put_format(reply,
@@ -889,13 +896,12 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd)
 /* Get the total pmd cycles for an interval. */
 atomic_read_relaxed(>intrvl_cycles, _cycles);
+/* Calculate how many intervals are to be used. */
+intervals = DIV_ROUND_UP(secs,
+ PMD_INTERVAL_LEN / INTERVAL_USEC_TO_SEC);
 /* Estimate the cycles to cover all intervals. */
-total_cycles *= PMD_INTERVAL_MAX;
-
-for (int j = 0; j < PMD_INTERVAL_MAX; j++) {
-uint64_t cycles;
-
-atomic_read_relaxed(>busy_cycles_intrvl[j], );
-busy_cycles += cycles;
-}
+total_cycles *= intervals;
+busy_cycles = get_interval_values(pmd->busy_cycles_intrvl,
+  >intrvl_idx,
+  intervals);
 if (busy_cycles > total_cycles) {
 busy_cycles = total_cycles;
@@ -907,7 +913,7 @@ 

[ovs-dev] [PATCH v3 2/3] docs: Add documention for pmd-rxq-show secs parameter.

2022-11-30 Thread Kevin Traynor
Add description of new '-secs' parameter in docs. Also, add to NEWS as
it is a user facing change.

Signed-off-by: Kevin Traynor 
---
v3:
- My prediction that NEWS would cause a conflict was correct but
  I did not think it would be when submitting the patch :s
- Rebased NEWS entry
---
 Documentation/topics/dpdk/pmd.rst | 23 ++-
 NEWS  |  3 +++
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/Documentation/topics/dpdk/pmd.rst 
b/Documentation/topics/dpdk/pmd.rst
index b259cc8b3..88457f366 100644
--- a/Documentation/topics/dpdk/pmd.rst
+++ b/Documentation/topics/dpdk/pmd.rst
@@ -102,10 +102,18 @@ core cycles for each Rx queue::
 .. note::
 
-   A history of one minute is recorded and shown for each Rx queue to allow for
-   traffic pattern spikes. Any changes in the Rx queue's PMD core cycles usage,
-   due to traffic pattern or reconfig changes, will take one minute to be fully
-   reflected in the stats.
+   By default a history of one minute is recorded and shown for each Rx queue
+   to allow for traffic pattern spikes. Any changes in the Rx queue's PMD core
+   cycles usage, due to traffic pattern or reconfig changes, will take one
+   minute to be fully reflected in the stats by default.
 
-   .. versionchanged:: 2.6.0
+PMD thread usage of an Rx queue can be displayed for a shorter period of time,
+from the last 5 seconds up to the default 60 seconds in 5 second steps.
+
+To see the port/Rx queue assignment and the last 5 secs of measured usage
+history of PMD core cycles for each Rx queue::
+
+$ ovs-appctl dpif-netdev/pmd-rxq-show -secs 5
+
+.. versionchanged:: 2.6.0
 
   The ``pmd-rxq-show`` command was added in OVS 2.6.0.
@@ -116,4 +124,9 @@ core cycles for each Rx queue::
cycles inherently consumed by the OVS PMD processing loop.
 
+.. versionchanged:: 3.1.0
+
+  The ``-secs`` parameter was added to the dpif-netdev/pmd-rxq-show
+  command.
+
 Rx queue to PMD assignment takes place whenever there are configuration changes
 or can be triggered by using::
diff --git a/NEWS b/NEWS
index f6caf1ca7..a2ff388b0 100644
--- a/NEWS
+++ b/NEWS
@@ -31,4 +31,7 @@ Post-v3.0.0
  determined.  Previously it was 10 Mbps.  Values can still be overridden
  by specifying 'max-rate' or '[r]stp-path-cost' accordingly.
+   - Userspace datapath:
+ * Add '-secs' argument to appctl 'dpif-netdev/pmd-rxq-show' to show
+   the pmd usage of an Rx queue over a configurable time period.
 
 
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 ovn] binding: add the capability to apply QoS for lsp

2022-11-30 Thread Lorenzo Bianconi
Introduce the capability to apply QoS rules for logical switch ports
claimed by ovn-controller. Rely on shash instead of sset for
egress_ifaces.

Acked-by: Mark Michelson 
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2129742
Signed-off-by: Lorenzo Bianconi 
---
Changes since v2:
- fix qos configuration restarting ovn-controller
Changes since v1:
- improve ovs interface lookup
- improve system-tests
---
 controller/binding.c| 155 ++--
 controller/binding.h|   5 +-
 controller/ovn-controller.c |  15 ++--
 tests/system-ovn.at |  48 +++
 4 files changed, 156 insertions(+), 67 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 5df62baef..53520263c 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -115,6 +115,7 @@ struct qos_queue {
 uint32_t min_rate;
 uint32_t max_rate;
 uint32_t burst;
+char *port_name;
 };
 
 void
@@ -147,25 +148,50 @@ static void update_lport_tracking(const struct 
sbrec_port_binding *pb,
   struct hmap *tracked_dp_bindings,
   bool claimed);
 
+static bool is_lport_vif(const struct sbrec_port_binding *pb);
+
+static struct qos_queue *
+get_qos_map_entry(struct hmap *queue_map, const char *name)
+{
+struct qos_queue *qos_node;
+HMAP_FOR_EACH (qos_node, node, queue_map) {
+if (!strcmp(qos_node->port_name, name)) {
+return qos_node;
+}
+}
+
+return NULL;
+}
+
 static void
-get_qos_params(const struct sbrec_port_binding *pb, struct hmap *queue_map)
+update_qos_params(const struct sbrec_port_binding *pb, struct hmap *queue_map)
 {
 uint32_t min_rate = smap_get_int(>options, "qos_min_rate", 0);
 uint32_t max_rate = smap_get_int(>options, "qos_max_rate", 0);
 uint32_t burst = smap_get_int(>options, "qos_burst", 0);
 uint32_t queue_id = smap_get_int(>options, "qdisc_queue_id", 0);
 
+struct qos_queue *node = get_qos_map_entry(queue_map, pb->logical_port);
+
 if ((!min_rate && !max_rate && !burst) || !queue_id) {
 /* Qos is not configured for this port. */
+if (node) {
+ hmap_remove(queue_map, >node);
+ free(node->port_name);
+ free(node);
+}
 return;
 }
 
-struct qos_queue *node = xzalloc(sizeof *node);
-hmap_insert(queue_map, >node, hash_int(queue_id, 0));
+if (!node) {
+node = xzalloc(sizeof *node);
+hmap_insert(queue_map, >node, hash_int(queue_id, 0));
+node->port_name = xstrdup(pb->logical_port);
+}
+node->queue_id = queue_id;
 node->min_rate = min_rate;
 node->max_rate = max_rate;
 node->burst = burst;
-node->queue_id = queue_id;
 }
 
 static const struct ovsrec_qos *
@@ -191,7 +217,7 @@ static bool
 set_noop_qos(struct ovsdb_idl_txn *ovs_idl_txn,
  const struct ovsrec_port_table *port_table,
  const struct ovsrec_qos_table *qos_table,
- struct sset *egress_ifaces)
+ struct shash *egress_ifaces)
 {
 if (!ovs_idl_txn) {
 return false;
@@ -206,11 +232,11 @@ set_noop_qos(struct ovsdb_idl_txn *ovs_idl_txn,
 size_t count = 0;
 
 OVSREC_PORT_TABLE_FOR_EACH (port, port_table) {
-if (sset_contains(egress_ifaces, port->name)) {
+if (shash_find(egress_ifaces, port->name)) {
 ovsrec_port_set_qos(port, noop_qos);
 count++;
 }
-if (sset_count(egress_ifaces) == count) {
+if (shash_count(egress_ifaces) == count) {
 break;
 }
 }
@@ -236,7 +262,8 @@ set_qos_type(struct netdev *netdev, const char *type)
 }
 
 static void
-setup_qos(const char *egress_iface, struct hmap *queue_map)
+setup_qos(const char *egress_iface,  const char *logical_port,
+  struct hmap *queue_map)
 {
 static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
 struct netdev *netdev_phy;
@@ -281,7 +308,7 @@ setup_qos(const char *egress_iface, struct hmap *queue_map)
  *   a configuration setting.
  *
  * - Otherwise leave the qdisc alone. */
-if (hmap_is_empty(queue_map)) {
+if (!get_qos_map_entry(queue_map, logical_port)) {
 if (!strcmp(qdisc_type, OVN_QOS_TYPE)) {
 set_qos_type(netdev_phy, "");
 }
@@ -338,6 +365,10 @@ setup_qos(const char *egress_iface, struct hmap *queue_map)
 continue;
 }
 
+if (strcmp(sb_info->port_name, logical_port)) {
+continue;
+}
+
 smap_clear(_details);
 smap_add_format(_details, "min-rate", "%d", sb_info->min_rate);
 smap_add_format(_details, "max-rate", "%d", sb_info->max_rate);
@@ -354,11 +385,12 @@ setup_qos(const char *egress_iface, struct hmap 
*queue_map)
 netdev_close(netdev_phy);
 }
 
-static void
+void
 destroy_qos_map(struct hmap *qos_map)
 {
 struct qos_queue *qos_queue;
 HMAP_FOR_EACH_POP 

[ovs-dev] mod_vlan_vid always pushes new vlan header

2022-11-30 Thread Thomas Lee
Hi, ovs
 
   I used 2.17.2 and I found mod_vlan_vid pushed new vlan header no matter if 
there was vlan_tci=0x1000/0x1000 match in the flow which indicated in 
https://mail.openvswitch.org/pipermail/ovs-dev/2018-August/351375.html
 
My flow is:
cookie=0x186ae, duration=118.947s, table=6, n_packets=116, n_bytes=11832, 
priority=10032,ip,reg2=0/0xfff,vlan_tci=0x1000/0x1000,nw_dst=30.1.2.31 
actions=mod_dl_dst:02:06:1a:33:06:81,mod_dl_src:02:06:1a:33:04:81,mod_vlan_vid:200,output:"1.x2"
 
trace shows:
 6. ip,reg2=0/0xfff,vlan_tci=0x1000/0x1000,nw_dst=30.1.2.31, priority 10032, 
cookie 0x186ae
set_field:02:06:1a:33:06:81->eth_dst
set_field:02:06:1a:33:04:81->eth_src
push_vlan:0x8100
set_field:4296->vlan_vid
output:3
Final flow: 
icmp,reg0=0x1,reg1=0x1,in_port=2,dl_vlan=200,dl_vlan_pcp=0,dl_vlan1=100,dl_vlan_pcp1=0,vlan_tci2=0x,dl_src=02:06:1a:33:04:81,dl_dst=02:06:1a:33:06:81,nw_src=30.1.1.10,nw_dst=30.1.2.31,nw_tos=0,nw_ecn=0,nw_ttl=64,icmp_type=8,icmp_code=0
Megaflow: 
recirc_id=0,eth,ip,in_port=2,dl_vlan=100,dl_vlan_pcp=0,dl_src=02:06:1a:33:05:81,dl_dst=02:06:1a:33:04:81,nw_dst=30.1.2.31,nw_frag=no
Datapath actions: 
set(eth(src=02:06:1a:33:04:81,dst=02:06:1a:33:06:81)),push_vlan(vid=200,pcp=0),4
 
Please help. Thanks!
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 ovn] binding: add the capability to apply QoS for lsp

2022-11-30 Thread Lorenzo Bianconi
> On Thu, Nov 24, 2022 at 4:33 PM Lorenzo Bianconi
>  wrote:
> >
> > Introduce the capability to apply QoS rules for logical switch ports
> > claimed by ovn-controller. Rely on shash instead of sset for
> > egress_ifaces.
> >
> > Acked-by: Mark Michelson 
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2129742
> > Signed-off-by: Lorenzo Bianconi 
> 
> Hi Lorenzo,

Hi Numan,

thx for the review

> 
> I did some testing with your patch.  I found a few issues and I've few
> comments/questions.
> 
> 1.  I configured qos on a logical port - sw0-port1
> I ran these commands
> ovn-nbctl set Logical_Switch_Port sw0-port1 options:qos_burst=500
> ovn-nbctl set Logical_Switch_Port sw0-port1 options:qos_max_rate=80
> ovn-nbctl set Logical_Switch_Port sw0-port1 options:qos_min_rate=40
> ovs-vsctl set interface sw0p1-p external_ids:ovn-egress-iface="true"
> 
> After this I see qdisc configured
> # tc qdisc show
> qdisc htb 1: dev eth1 root refcnt 13 r2q 10 default 0x1
> direct_packets_stat 0 direct_qlen 1000
> qdisc htb 1: dev sw0p1-p root refcnt 13 r2q 10 default 0x1
> direct_packets_stat 0 direct_qlen 1000
> 
> I see that qdisc is also configured on eth1 which is the geneve
> tunnel interface.  Is this expected ?
> 
> Since Qos is configured only on the logical port,  why does
> ovn-controller create qdiscs for the tunnel interface  ?

we always add tunnel interfaces to egress_ifaces in build_local_bindings().
I do not know exactly why it is done this way but this is not introduced by
this patch.

> 
>  2.  After (1), if I restart ovn-controller,  the qdiscs created
> earlier are deleted by ovn-controller
>  and not created again even though qos is configured on the logical port 
> and
>  external_ids:ovn-egress-iface=true is set on the ovs interface.
> 
>  Can you please cover this scenario in the system test.  Stop and
> start ovn-controller
>  and make sure that the qdiscs are configured properly as expected.

ack, I will fix it in v3.

Regards,
Lorenzo

> 
> > ---
> > Changes since v1:
> > - improve ovs interface lookup
> > - improve system-tests
> > ---
> >  controller/binding.c| 72 -
> >  controller/binding.h|  2 +-
> >  controller/ovn-controller.c |  9 +++--
> >  tests/system-ovn.at | 31 
> >  4 files changed, 84 insertions(+), 30 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index 5df62baef..054fa3c18 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -115,6 +115,7 @@ struct qos_queue {
> >  uint32_t min_rate;
> >  uint32_t max_rate;
> >  uint32_t burst;
> > +char *port_name;
> >  };
> >
> >  void
> > @@ -147,6 +148,8 @@ static void update_lport_tracking(const struct 
> > sbrec_port_binding *pb,
> >struct hmap *tracked_dp_bindings,
> >bool claimed);
> >
> > +static bool is_lport_vif(const struct sbrec_port_binding *pb);
> > +
> >  static void
> >  get_qos_params(const struct sbrec_port_binding *pb, struct hmap *queue_map)
> >  {
> > @@ -166,6 +169,7 @@ get_qos_params(const struct sbrec_port_binding *pb, 
> > struct hmap *queue_map)
> >  node->max_rate = max_rate;
> >  node->burst = burst;
> >  node->queue_id = queue_id;
> > +node->port_name = xstrdup(pb->logical_port);
> >  }
> >
> >  static const struct ovsrec_qos *
> > @@ -191,7 +195,7 @@ static bool
> >  set_noop_qos(struct ovsdb_idl_txn *ovs_idl_txn,
> >   const struct ovsrec_port_table *port_table,
> >   const struct ovsrec_qos_table *qos_table,
> > - struct sset *egress_ifaces)
> > + struct shash *egress_ifaces)
> >  {
> >  if (!ovs_idl_txn) {
> >  return false;
> > @@ -206,11 +210,11 @@ set_noop_qos(struct ovsdb_idl_txn *ovs_idl_txn,
> >  size_t count = 0;
> >
> >  OVSREC_PORT_TABLE_FOR_EACH (port, port_table) {
> > -if (sset_contains(egress_ifaces, port->name)) {
> > +if (shash_find(egress_ifaces, port->name)) {
> >  ovsrec_port_set_qos(port, noop_qos);
> >  count++;
> >  }
> > -if (sset_count(egress_ifaces) == count) {
> > +if (shash_count(egress_ifaces) == count) {
> >  break;
> >  }
> >  }
> > @@ -236,7 +240,8 @@ set_qos_type(struct netdev *netdev, const char *type)
> >  }
> >
> >  static void
> > -setup_qos(const char *egress_iface, struct hmap *queue_map)
> > +setup_qos(const char *egress_iface,  const char *logical_port,
> > +  struct hmap *queue_map)
> >  {
> >  static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5);
> >  struct netdev *netdev_phy;
> > @@ -338,6 +343,10 @@ setup_qos(const char *egress_iface, struct hmap 
> > *queue_map)
> >  continue;
> >  }
> >
> > +if (strcmp(sb_info->port_name, logical_port)) {
> > +   

Re: [ovs-dev] [RFC PATCH v2] dpdk: Update to use v22.11.

2022-11-30 Thread Thomas Monjalon
30/11/2022 16:30, Stokes, Ian:
> > Hi Ian,
> > 
> > Cc: Thomas and John
> > 
> > On Wed, Nov 23, 2022 at 12:52 PM Ian Stokes  wrote:
> > > diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
> > > index 23c8bbb7a..0aa90e55e 100755
> > > --- a/.ci/linux-build.sh
> > > +++ b/.ci/linux-build.sh
> > > @@ -142,7 +142,7 @@ function install_dpdk()
> > >  fi
> > >  # No cache or version mismatch.
> > >  rm -rf dpdk-dir
> > > -wget https://fast.dpdk.org/rel/dpdk-$1.tar.xz
> > > +wget https://git.dpdk.org/dpdk/snapshot/dpdk-$1.tar.xz
> > >  tar xvf dpdk-$1.tar.xz > /dev/null
> > >  DIR_NAME=$(tar -tf dpdk-$1.tar.xz | head -1 | cut -f1 -d"/")
> > >  mv ${DIR_NAME} dpdk-dir
> > 
> > dpdk.org server experienced a quite heavy load this morning, with many
> > clients requesting git related operations through http.
> > I am not sure what or who triggered this, but I just remembered this patch 
> > here.
> > 
> > Downloading a rc4 tarball through dpdk.org cgit is to be avoided.
> > dpdk.org server does not cache this kind of operations, it is heavy on
> > cpu, memory and bandwidth.
> > 
> > 
> > I think OVS should switch to the github mirror, this would work for
> > both rc and final release tarballs.
> > IOW: wget https://github.com/DPDK/dpdk/archive/refs/tags/v$1.tar.gz
> > 
> > One drawback is that github does not seem to provide xz compressed
> > tarballs, so you would need to update the script further.
> > 
> 
> Thanks for the input David, I've posted a v3 but didn’t include this change 
> as I wanted to discuss a bit further. 
> 
> So we use DPDK.org in our ci but we also make reference to its use throughout 
> the OVS documentation when users wish to download DPDK, should this also be 
> replaced with github?
> 
> I guess to my mind if github is the preferred method to retrieve the repo 
> then does it raise the question of dpdk org in general providing the 
> releases? Is there a plan to transition completely to github for DPDK in the 
> future?
> 
> If the preferred method is to go ahead with github then thats OK on my side 
> but I think it would be better to keep that change in a separate patch along 
> with an explanation as to why github should be used rather than dpdk.org as 
> we've been using the current approach for a few years without issue.

fast.dpdk.org is fine and preferred.
git.dpdk.org should not be used for tarball download in general.
github.com/DPDK is OK for tarball download, including release candidates.



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 3/3] dpif-netdev: Rename pmd_info_show_rxq variables.

2022-11-30 Thread 0-day Robot
Bleep bloop.  Greetings Kevin Traynor, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


Patch skipped due to previous failure.

Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 2/3] docs: Add documention for pmd-rxq-show secs parameter.

2022-11-30 Thread 0-day Robot
Bleep bloop.  Greetings Kevin Traynor, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


git-am:
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 docs: Add documention for pmd-rxq-show secs parameter.
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".


Patch skipped due to previous failure.

Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 3/3] dpif-netdev: Rename pmd_info_show_rxq variables.

2022-11-30 Thread Kevin Traynor
There are some similar readings taken for pmds and Rx queues
in this function and a few of the variable names are ambiguous.

Improve the readability of the code by updating some variables
names to indicate that they are readings related to the pmd.

Signed-off-by: Kevin Traynor 
---
 lib/dpif-netdev.c | 30 +++---
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index a9da88728..bfb74e90e 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -882,6 +882,6 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd,
 struct rxq_poll *list;
 size_t n_rxq;
-uint64_t total_cycles = 0;
-uint64_t busy_cycles = 0;
+uint64_t total_pmd_cycles = 0;
+uint64_t busy_pmd_cycles = 0;
 uint64_t total_rxq_proc_cycles = 0;
 unsigned int intervals;
@@ -896,15 +896,15 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd,
 
 /* Get the total pmd cycles for an interval. */
-atomic_read_relaxed(>intrvl_cycles, _cycles);
+atomic_read_relaxed(>intrvl_cycles, _pmd_cycles);
 /* Calculate how many intervals are to be used. */
 intervals = DIV_ROUND_UP(secs,
  PMD_INTERVAL_LEN / INTERVAL_USEC_TO_SEC);
 /* Estimate the cycles to cover all intervals. */
-total_cycles *= intervals;
-busy_cycles = get_interval_values(pmd->busy_cycles_intrvl,
-  >intrvl_idx,
-  intervals);
-if (busy_cycles > total_cycles) {
-busy_cycles = total_cycles;
+total_pmd_cycles *= intervals;
+busy_pmd_cycles = get_interval_values(pmd->busy_cycles_intrvl,
+  >intrvl_idx,
+  intervals);
+if (busy_pmd_cycles > total_pmd_cycles) {
+busy_pmd_cycles = total_pmd_cycles;
 }
 
@@ -923,7 +923,7 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd,
 ? "(enabled) " : "(disabled)");
 ds_put_format(reply, "  pmd usage: ");
-if (total_cycles) {
+if (total_pmd_cycles) {
 ds_put_format(reply, "%2"PRIu64"",
-  rxq_proc_cycles * 100 / total_cycles);
+  rxq_proc_cycles * 100 / total_pmd_cycles);
 ds_put_cstr(reply, " %");
 } else {
@@ -935,12 +935,12 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd,
 if (n_rxq > 0) {
 ds_put_cstr(reply, "  overhead: ");
-if (total_cycles) {
+if (total_pmd_cycles) {
 uint64_t overhead_cycles = 0;
 
-if (total_rxq_proc_cycles < busy_cycles) {
-overhead_cycles = busy_cycles - total_rxq_proc_cycles;
+if (total_rxq_proc_cycles < busy_pmd_cycles) {
+overhead_cycles = busy_pmd_cycles - total_rxq_proc_cycles;
 }
 ds_put_format(reply, "%2"PRIu64" %%",
-  overhead_cycles * 100 / total_cycles);
+  overhead_cycles * 100 / total_pmd_cycles);
 } else {
 ds_put_cstr(reply, "NOT AVAIL");
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 2/3] docs: Add documention for pmd-rxq-show secs parameter.

2022-11-30 Thread Kevin Traynor
Add description of new '-secs' parameter in docs. Also, add to NEWS as
it is a user facing change.

Signed-off-by: Kevin Traynor 
---
 Documentation/topics/dpdk/pmd.rst | 23 ++-
 NEWS  |  3 +++
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/Documentation/topics/dpdk/pmd.rst 
b/Documentation/topics/dpdk/pmd.rst
index b259cc8b3..88457f366 100644
--- a/Documentation/topics/dpdk/pmd.rst
+++ b/Documentation/topics/dpdk/pmd.rst
@@ -102,10 +102,18 @@ core cycles for each Rx queue::
 .. note::
 
-   A history of one minute is recorded and shown for each Rx queue to allow for
-   traffic pattern spikes. Any changes in the Rx queue's PMD core cycles usage,
-   due to traffic pattern or reconfig changes, will take one minute to be fully
-   reflected in the stats.
+   By default a history of one minute is recorded and shown for each Rx queue
+   to allow for traffic pattern spikes. Any changes in the Rx queue's PMD core
+   cycles usage, due to traffic pattern or reconfig changes, will take one
+   minute to be fully reflected in the stats by default.
 
-   .. versionchanged:: 2.6.0
+PMD thread usage of an Rx queue can be displayed for a shorter period of time,
+from the last 5 seconds up to the default 60 seconds in 5 second steps.
+
+To see the port/Rx queue assignment and the last 5 secs of measured usage
+history of PMD core cycles for each Rx queue::
+
+$ ovs-appctl dpif-netdev/pmd-rxq-show -secs 5
+
+.. versionchanged:: 2.6.0
 
   The ``pmd-rxq-show`` command was added in OVS 2.6.0.
@@ -116,4 +124,9 @@ core cycles for each Rx queue::
cycles inherently consumed by the OVS PMD processing loop.
 
+.. versionchanged:: 3.1.0
+
+  The ``-secs`` parameter was added to the dpif-netdev/pmd-rxq-show
+  command.
+
 Rx queue to PMD assignment takes place whenever there are configuration changes
 or can be triggered by using::
diff --git a/NEWS b/NEWS
index ff77ee404..ab6834e24 100644
--- a/NEWS
+++ b/NEWS
@@ -24,4 +24,7 @@ Post-v3.0.0
If a user wishes to benefit from these fixes it is recommended to use
DPDK 21.11.2.
+   - Userspace datapath:
+ * Add '-secs' argument to appctl 'dpif-netdev/pmd-rxq-show' to show
+   the pmd usage of an Rx queue over a configurable time period.
 
 
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 1/3] dpif-netdev: Make pmd-rxq-show time configurable.

2022-11-30 Thread Kevin Traynor
pmd-rxq-show shows the Rx queue to pmd assignments as well as the
pmd usage of each Rx queue.

Up until now a tail length of 60 seconds pmd usage was shown
for each Rx queue, as this is the value used during rebalance
to avoid any spike effects.

When debugging or tuning, it is also convenient to display the
pmd usage of an Rx queue over a shorter time frame, so any changes
config or traffic that impact pmd usage can be evaluated more quickly.

A parameter is added that allows pmd-rxq-show stats pmd usage to
be shown for a shorter time frame. Values are rounded up to the
nearest 5 seconds as that is the measurement granularity and the value
used is displayed. e.g.

$ ovs-appctl dpif-netdev/pmd-rxq-show -secs 5
 Displaying last 5 seconds pmd usage %
 pmd thread numa_id 0 core_id 4:
   isolated : false
   port: dpdk0queue-id:  0 (enabled)   pmd usage: 95 %
   overhead:  4 %

The default time frame has not changed and the maximum value
is limited to the maximum stored tail length (60 seconds).

Signed-off-by: Kevin Traynor 

---
v2:
- fixed comments from David's review
- Squashed new unit tests into this patch
- docs can be squashed later
---
 lib/dpif-netdev-private-thread.h |  2 +-
 lib/dpif-netdev.c| 98 
 tests/pmd.at | 62 
 3 files changed, 138 insertions(+), 24 deletions(-)

diff --git a/lib/dpif-netdev-private-thread.h b/lib/dpif-netdev-private-thread.h
index 4472b199d..1ec3cd794 100644
--- a/lib/dpif-netdev-private-thread.h
+++ b/lib/dpif-netdev-private-thread.h
@@ -115,5 +115,5 @@ struct dp_netdev_pmd_thread {
 
 /* Write index for 'busy_cycles_intrvl'. */
-unsigned int intrvl_idx;
+atomic_count intrvl_idx;
 /* Busy cycles in last PMD_INTERVAL_MAX intervals. */
 atomic_ullong *busy_cycles_intrvl;
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index a45b46014..a9da88728 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -161,9 +161,11 @@ static struct odp_support dp_netdev_support = {
 /* Time in microseconds of the interval in which rxq processing cycles used
  * in rxq to pmd assignments is measured and stored. */
-#define PMD_INTERVAL_LEN 1000LL
+#define PMD_INTERVAL_LEN 500LL
+/* For converting PMD_INTERVAL_LEN to secs. */
+#define INTERVAL_USEC_TO_SEC 100LL
 
 /* Number of intervals for which cycles are stored
  * and used during rxq to pmd assignment. */
-#define PMD_INTERVAL_MAX 6
+#define PMD_INTERVAL_MAX 12
 
 /* Time in microseconds to try RCU quiescing. */
@@ -429,5 +431,5 @@ struct dp_netdev_rxq {
   queue doesn't need to be pinned to a
   particular core. */
-unsigned intrvl_idx;   /* Write index for 'cycles_intrvl'. */
+atomic_count intrvl_idx;   /* Write index for 'cycles_intrvl'. */
 struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
 bool is_vhost; /* Is rxq of a vhost port. */
@@ -617,4 +619,7 @@ dp_netdev_rxq_set_intrvl_cycles(struct dp_netdev_rxq *rx,
 static uint64_t
 dp_netdev_rxq_get_intrvl_cycles(struct dp_netdev_rxq *rx, unsigned idx);
+static uint64_t
+get_interval_values(atomic_ullong *source, atomic_count *cur_idx,
+int num_to_read);
 static void
 dpif_netdev_xps_revalidate_pmd(const struct dp_netdev_pmd_thread *pmd,
@@ -871,5 +876,6 @@ sorted_poll_list(struct dp_netdev_pmd_thread *pmd, struct 
rxq_poll **list,
 
 static void
-pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
+pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd,
+  int secs)
 {
 if (pmd->core_id != NON_PMD_CORE_ID) {
@@ -879,4 +885,5 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd)
 uint64_t busy_cycles = 0;
 uint64_t total_rxq_proc_cycles = 0;
+unsigned int intervals;
 
 ds_put_format(reply,
@@ -890,13 +897,12 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd)
 /* Get the total pmd cycles for an interval. */
 atomic_read_relaxed(>intrvl_cycles, _cycles);
+/* Calculate how many intervals are to be used. */
+intervals = DIV_ROUND_UP(secs,
+ PMD_INTERVAL_LEN / INTERVAL_USEC_TO_SEC);
 /* Estimate the cycles to cover all intervals. */
-total_cycles *= PMD_INTERVAL_MAX;
-
-for (int j = 0; j < PMD_INTERVAL_MAX; j++) {
-uint64_t cycles;
-
-atomic_read_relaxed(>busy_cycles_intrvl[j], );
-busy_cycles += cycles;
-}
+total_cycles *= intervals;
+busy_cycles = get_interval_values(pmd->busy_cycles_intrvl,
+  >intrvl_idx,
+  intervals);
 if (busy_cycles > total_cycles) {
 busy_cycles = total_cycles;
@@ -908,7 +914,7 @@ 

Re: [ovs-dev] [OVN v17 2/3] OVN Remote Port Mirroring: northd changes to sync NB and SB

2022-11-30 Thread Abhiram R N
Hi Mark,


On Tue, Nov 29, 2022 at 10:00 PM Mark Michelson  wrote:

> On 11/29/22 08:27, Abhiram R N wrote:
> > Hi Mark,
> >
> > Thanks for your review.
> > Please see replies inline below.
> >
> > On Tue, Nov 29, 2022 at 3:23 AM Mark Michelson  > > wrote:
> >
> > On 11/27/22 15:14, Abhiram R N wrote:
> >  > Changes which syncs the NB port mirrors with SB port mirrors.
> >  > Also test added to check the NB and SB sync
> >  >
> >  > Co-authored-by: Veda Barrenkala  > >
> >  > Signed-off-by: Veda Barrenkala  > >
> >  > Signed-off-by: Abhiram R N  > >
> >  > ---
> >  > v16-->v17: No changes
> >  >
> >  >   northd/en-northd.c   |   4 +
> >  >   northd/inc-proc-northd.c |   4 +
> >  >   northd/northd.c  | 172
> > +++
> >  >   northd/northd.h  |   2 +
> >  >   tests/ovn-northd.at   | 102
> > +++
> >  >   5 files changed, 284 insertions(+)
> >  >
> >  > diff --git a/northd/en-northd.c b/northd/en-northd.c
> >  > index 93891b0b7..66ecc6573 100644
> >  > --- a/northd/en-northd.c
> >  > +++ b/northd/en-northd.c
> >  > @@ -78,6 +78,8 @@ void en_northd_run(struct engine_node *node,
> > void *data)
> >  >   EN_OVSDB_GET(engine_get_input("NB_acl", node));
> >  >   input_data.nbrec_static_mac_binding_table =
> >  >   EN_OVSDB_GET(engine_get_input("NB_static_mac_binding",
> > node));
> >  > +input_data.nbrec_mirror_table =
> >  > +EN_OVSDB_GET(engine_get_input("NB_mirror", node));
> >  >
> >  >   input_data.sbrec_sb_global_table =
> >  >   EN_OVSDB_GET(engine_get_input("SB_sb_global", node));
> >  > @@ -109,6 +111,8 @@ void en_northd_run(struct engine_node *node,
> > void *data)
> >  >   EN_OVSDB_GET(engine_get_input("SB_chassis_private",
> node));
> >  >   input_data.sbrec_static_mac_binding_table =
> >  >   EN_OVSDB_GET(engine_get_input("SB_static_mac_binding",
> > node));
> >  > +input_data.sbrec_mirror_table =
> >  > +EN_OVSDB_GET(engine_get_input("SB_mirror", node));
> >  >
> >  >   northd_run(_data, data,
> >  >  eng_ctx->ovnnb_idl_txn,
> >  > diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
> >  > index 73f230b2c..7b7b250f3 100644
> >  > --- a/northd/inc-proc-northd.c
> >  > +++ b/northd/inc-proc-northd.c
> >  > @@ -52,6 +52,7 @@ VLOG_DEFINE_THIS_MODULE(inc_proc_northd);
> >  >   NB_NODE(acl, "acl") \
> >  >   NB_NODE(logical_router, "logical_router") \
> >  >   NB_NODE(qos, "qos") \
> >  > +NB_NODE(mirror, "mirror") \
> >  >   NB_NODE(meter, "meter") \
> >  >   NB_NODE(meter_band, "meter_band") \
> >  >   NB_NODE(logical_router_port, "logical_router_port") \
> >  > @@ -94,6 +95,7 @@ VLOG_DEFINE_THIS_MODULE(inc_proc_northd);
> >  >   SB_NODE(logical_flow, "logical_flow") \
> >  >   SB_NODE(logical_dp_group, "logical_DP_group") \
> >  >   SB_NODE(multicast_group, "multicast_group") \
> >  > +SB_NODE(mirror, "mirror") \
> >  >   SB_NODE(meter, "meter") \
> >  >   SB_NODE(meter_band, "meter_band") \
> >  >   SB_NODE(datapath_binding, "datapath_binding") \
> >  > @@ -176,6 +178,7 @@ void inc_proc_northd_init(struct
> > ovsdb_idl_loop *nb,
> >  >   engine_add_input(_northd, _nb_acl, NULL);
> >  >   engine_add_input(_northd, _nb_logical_router, NULL);
> >  >   engine_add_input(_northd, _nb_qos, NULL);
> >  > +engine_add_input(_northd, _nb_mirror, NULL);
> >  >   engine_add_input(_northd, _nb_meter, NULL);
> >  >   engine_add_input(_northd, _nb_meter_band, NULL);
> >  >   engine_add_input(_northd, _nb_logical_router_port,
> NULL);
> >  > @@ -197,6 +200,7 @@ void inc_proc_northd_init(struct
> > ovsdb_idl_loop *nb,
> >  >   engine_add_input(_northd, _sb_encap, NULL);
> >  >   engine_add_input(_northd, _sb_port_group, NULL);
> >  >   engine_add_input(_northd, _sb_logical_dp_group, NULL);
> >  > +engine_add_input(_northd, _sb_mirror, NULL);
> >  >   engine_add_input(_northd, _sb_meter, NULL);
> >  >   engine_add_input(_northd, _sb_meter_band, NULL);
> >  >   engine_add_input(_northd, _sb_datapath_binding, NULL);
> >  > diff --git a/northd/northd.c b/northd/northd.c
> >  > index 040f46e1a..16739983c 100644
> >  > --- a/northd/northd.c
> >  > +++ b/northd/northd.c
> >  > @@ -3239,6 +3239,89 @@ ovn_port_update_sbrec_chassis(
> >  >   free(requested_chassis_sb);
> >  >   }
> >  >
> >  > +static void
> >  > 

Re: [ovs-dev] [PATCH v8] ovsdb-idl: Add the support to specify the uuid for row insert.

2022-11-30 Thread Ilya Maximets
On 11/29/22 01:32, Terry Wilson wrote:
> On Sun, Nov 27, 2022 at 9:56 PM mailto:num...@ovn.org>> 
> wrote:
> 
> From: Numan Siddique mailto:num...@ovn.org>>
> 
> ovsdb-server allows the OVSDB clients to specify the uuid for
> the row inserts [1].  Both the C IDL client library and Python
> IDL are missing this feature.  This patch adds this support.
> 
> In C IDL, for each schema table, a new function is generated -
> insert_persistent_uuid(txn, uuid) which can
> be used the clients to persist the uuid.
> 
> ovs-vsctl and other derivatives of ctl now supports the same
> in the generic 'create' command with the option "--id=".
> 
> In Python IDL, the uuid to persist can be specified in
> the Transaction.insert() function.
> 
> [1] - a529e3cd1f("ovsdb-server: Allow OVSDB clients to specify the UUID 
> for inserted rows.:)
> 
> Signed-off-by: Numan Siddique mailto:num...@ovn.org>>
> Acked-by: Adrian Moreno mailto:amore...@redhat.com>>
> Acked-by: Han Zhou mailto:hz...@ovn.org>>
> CC: twil...@redhat.com 
> CC: i.maxim...@ovn.org 
>
> Looks good to me, I'll work on adding some code to ovsdbapp that uses this as 
> well. Thanks!
> 
> Acked-by: Terry Wilson mailto:twil...@redhat.com>> 

Applied.  Thanks!

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [v6] odp-execute: Add ISA implementation of set_masked IPv6 action

2022-11-30 Thread Emma Finn
This commit adds support for the AVX512 implementation of the
ipv6_set_addrs action as well as an AVX512 implementation of
updating the L4 checksums.

Signed-off-by: Emma Finn 

---
v6:
 - Added check for ipv6 extension headers.
v5:
  - Fixed load for ip6 src and dst mask for checksum check.
v4:
  - Reworked and moved check for checksum outside loop.
  - Code cleanup based on review from Eelco.
v3:
  - Added a runtime check for AVX512 vbmi.
v2:
  - Added check for availbility of s6_addr32 field of struct in6_addr.
  - Fixed network headers for freebsd builds.
---
---
 lib/odp-execute-avx512.c  | 217 ++
 lib/odp-execute-private.c |  14 +++
 lib/odp-execute-private.h |   1 +
 lib/packets.c |   2 +-
 lib/packets.h |   2 +
 5 files changed, 235 insertions(+), 1 deletion(-)

diff --git a/lib/odp-execute-avx512.c b/lib/odp-execute-avx512.c
index 6c7713251..87dae6d05 100644
--- a/lib/odp-execute-avx512.c
+++ b/lib/odp-execute-avx512.c
@@ -20,6 +20,9 @@
 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 #include "csum.h"
 #include "dp-packet.h"
@@ -28,6 +31,7 @@
 #include "odp-execute-private.h"
 #include "odp-netlink.h"
 #include "openvswitch/vlog.h"
+#include "packets.h"
 
 VLOG_DEFINE_THIS_MODULE(odp_execute_avx512);
 
@@ -75,6 +79,26 @@ BUILD_ASSERT_DECL(offsetof(struct ovs_key_ipv4, ipv4_tos) +
   MEMBER_SIZEOF(struct ovs_key_ipv4, ipv4_tos) ==
   offsetof(struct ovs_key_ipv4, ipv4_ttl));
 
+BUILD_ASSERT_DECL(offsetof(struct ovs_key_ipv6, ipv6_src) +
+  MEMBER_SIZEOF(struct ovs_key_ipv6, ipv6_src) ==
+  offsetof(struct ovs_key_ipv6, ipv6_dst));
+
+BUILD_ASSERT_DECL(offsetof(struct ovs_key_ipv6, ipv6_dst) +
+  MEMBER_SIZEOF(struct ovs_key_ipv6, ipv6_dst) ==
+  offsetof(struct ovs_key_ipv6, ipv6_label));
+
+BUILD_ASSERT_DECL(offsetof(struct ovs_key_ipv6, ipv6_label) +
+  MEMBER_SIZEOF(struct ovs_key_ipv6, ipv6_label) ==
+  offsetof(struct ovs_key_ipv6, ipv6_proto));
+
+BUILD_ASSERT_DECL(offsetof(struct ovs_key_ipv6, ipv6_proto) +
+  MEMBER_SIZEOF(struct ovs_key_ipv6, ipv6_proto) ==
+  offsetof(struct ovs_key_ipv6, ipv6_tclass));
+
+BUILD_ASSERT_DECL(offsetof(struct ovs_key_ipv6, ipv6_tclass) +
+  MEMBER_SIZEOF(struct ovs_key_ipv6, ipv6_tclass) ==
+  offsetof(struct ovs_key_ipv6, ipv6_hlimit));
+
 /* Array of callback functions, one for each masked operation. */
 odp_execute_action_cb impl_set_masked_funcs[__OVS_KEY_ATTR_MAX];
 
@@ -483,6 +507,193 @@ action_avx512_ipv4_set_addrs(struct dp_packet_batch 
*batch,
 }
 }
 
+#if HAVE_AVX512VBMI
+static inline uint16_t ALWAYS_INLINE
+__attribute__((__target__("avx512vbmi")))
+avx512_ipv6_sum_header(__m512i ip6_header)
+{
+__m256i v_zeros = _mm256_setzero_si256();
+__m512i v_shuf_src_dst = _mm512_setr_epi64(0x01, 0x02, 0x03, 0x04,
+   0xFF, 0xFF, 0xFF, 0xFF);
+
+/* Shuffle ip6 src and dst to beginning of register. */
+__m512i v_ip6_hdr_shuf = _mm512_permutexvar_epi64(v_shuf_src_dst,
+  ip6_header);
+
+/* Extract ip6 src and dst into smaller 256-bit wide register. */
+__m256i v_ip6_src_dst = _mm512_extracti64x4_epi64(v_ip6_hdr_shuf, 0);
+
+/* These two shuffle masks, v_swap16a and v_swap16b, are to shuffle the
+ * src and dst fields and add padding after each 16-bit value for the
+ * following carry over addition. */
+__m256i v_swap16a = _mm256_setr_epi16(0x0100, 0x, 0x0302, 0x,
+  0x0504, 0x, 0x0706, 0x,
+  0x0100, 0x, 0x0302, 0x,
+  0x0504, 0x, 0x0706, 0x);
+__m256i v_swap16b = _mm256_setr_epi16(0x0908, 0x, 0x0B0A, 0x,
+  0x0D0C, 0x, 0x0F0E, 0x,
+  0x0908, 0x, 0x0B0A, 0x,
+  0x0D0C, 0x, 0x0F0E, 0x);
+__m256i v_shuf_old1 = _mm256_shuffle_epi8(v_ip6_src_dst, v_swap16a);
+__m256i v_shuf_old2 = _mm256_shuffle_epi8(v_ip6_src_dst, v_swap16b);
+
+/* Add each part of the old and new headers together. */
+__m256i v_delta = _mm256_add_epi32(v_shuf_old1, v_shuf_old2);
+
+/* Perform horizontal add to go from 8x32-bits to 2x32-bits. */
+v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
+v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
+
+/* Shuffle 32-bit value from 3rd lane into first lane for final
+ * horizontal add. */
+__m256i v_swap32a = _mm256_setr_epi32(0x0, 0x4, 0xF, 0xF,
+  0xF, 0xF, 0xF, 0xF);
+
+v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
+v_delta = _mm256_hadd_epi32(v_delta, 

Re: [ovs-dev] [PATCH] ovs-dpctl-top: fix ovs-dpctl-top via pipe

2022-11-30 Thread Ilya Maximets
On 11/30/22 16:55, Ilya Maximets wrote:
> On 5/12/21 19:44, Timothy Redaelli wrote:
>> Currently it's not possible to use ovs-dpctl-top via pipe (eg:
>> ovs-dpctl dump-flows | ovs-dpctl-top --script --verbose) since Python3
>> doesn't allow to open a file (stdin in our case) in binary mode without
>> buffering enabled.
>>
>> This commit changes the behaviour in order to directly pass stdin to
>> flows_read instead of re-opening it without buffering.
>>
>> Signed-off-by: Timothy Redaelli 
>> ---
>>  utilities/ovs-dpctl-top.in | 6 +-
>>  1 file changed, 1 insertion(+), 5 deletions(-)
> 
> Applied.  Thanks!

Also, backported down to 2.17.

> 
> Best regards, Ilya Maximets.
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] odp-util: Fix reporting unknown keys as keys with bad length.

2022-11-30 Thread Ilya Maximets
On 11/29/22 15:33, Aaron Conole wrote:
> Ilya Maximets  writes:
> 
>> check_attr_len() currently reports all unknown keys as keys with bad
>> length.  For example, IPv6 extension headers are printed out like this
>> in flow dumps:
>>
>>   eth_type(0x86dd),ipv6(...)
>>   (bad key length 2, expected -1)(00 00/(bad mask length 2, expected -1)(00 
>> 00),
>>   icmpv6(type=0/0,code=0/0)
>>
>> However, since the key is unknown, the length check on it makes no
>> sense and should be ignored.  This will allow the unknown key to be
>> caught later by the format_unknown_key() function and printed in a
>> more user-friendly way:
>>
>>   eth_type(0x86dd),ipv6(...),key32(00 00/00 00),icmpv6(type=0/0,code=0/0)
>>
>> '32' here is the actual index of the key attribute, so we know
>> that it is unknown attribute #32 with the value/mask pair printed
>> out inside the parenthesis.
>>
>> Signed-off-by: Ilya Maximets 
>> ---
> 
> LGTM.
> 
> Acked-by: Aaron Conole 

Thanks!  Applied and backported down to 2.17.

Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovs-dpctl-top: fix ovs-dpctl-top via pipe

2022-11-30 Thread Ilya Maximets
On 5/12/21 19:44, Timothy Redaelli wrote:
> Currently it's not possible to use ovs-dpctl-top via pipe (eg:
> ovs-dpctl dump-flows | ovs-dpctl-top --script --verbose) since Python3
> doesn't allow to open a file (stdin in our case) in binary mode without
> buffering enabled.
> 
> This commit changes the behaviour in order to directly pass stdin to
> flows_read instead of re-opening it without buffering.
> 
> Signed-off-by: Timothy Redaelli 
> ---
>  utilities/ovs-dpctl-top.in | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)

Applied.  Thanks!

Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] rhel: move conf.db to /var/lib/openvswitch, using symlinks

2022-11-30 Thread Ilya Maximets
On 11/14/22 20:41, Timothy Redaelli wrote:
> conf.db is by default at /etc/openvswitch, but it should be at
> /var/lib/openvswitch like on Debian or like ovnnb_db.db and ovnsb_db.db.
> 
> If conf.db already exists in /etc/openvswitch then it's moved to
> /var/lib/openvswitch.
> Symlinks are created for conf.db and .conf.db.~lock~ into /etc/openvswitch
> for backward compatibility.
> 
> Reported-at: https://bugzilla.redhat.com/1830857
> Reported-by: Yedidyah Bar David 
> Signed-off-by: Timothy Redaelli 
> ---
> v1 -> v2:
> - Use hugetlbfs group instead of openvswitch when the package is built
>   with dpdk (as reported by Flavio)
> ---
>  rhel/openvswitch-fedora.spec.in | 27 +++
>  1 file changed, 23 insertions(+), 4 deletions(-)

If that works for Fedora, then LGTM.  Applied.

Thanks!
Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev: Assume default link speed to be 10 Gbps instead of 100 Mbps.

2022-11-30 Thread Ilya Maximets
On 11/29/22 15:37, Mike Pattrick wrote:
> On Tue, Oct 25, 2022 at 12:38 PM Ilya Maximets  wrote:
>>
>> 100 Mbps was a fair assumption 13 years ago.  Modern days 10 Gbps seems
>> like a good value in case no information is available otherwise.
>>
>> The change mainly affects QoS which is currently limited to 100 Mbps if
>> the user didn't specify 'max-rate' and the card doesn't report the
>> speed or OVS doesn't have a predefined enumeration for the speed
>> reported by the NIC.
>>
>> Calculation of the path cost for STP/RSTP is also affected if OVS is
>> unable to determine the link speed.
>>
>> Lower link speed adapters are typically good at reporting their speed,
>> so chances for overshoot should be low.  But newer high-speed adapters,
>> for which there is no speed enumeration or if there are some other
>> issues, will not suffer that much.
>>
>> Signed-off-by: Ilya Maximets 
> 
> I think this is a reasonable change, 100MB has mostly been relegated
> to embedded devices for a long time.
> 
> Acked-by: Mike Pattrick 

Applied.  Thanks!

> 
> Slightly related, I noticed that we're missing support for 20 and
> 56Gbps, we should probably add those as well.

I've never see 56Gbps Ethernet NIC in practice, they tend to all be
Infiniband ones, but maybe I'm just wrong.   In any case, maybe you,
Adrian, can fix that as well while working on support for 25/50/etc
link speeds detection?  Should be simple enough.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/3] DPDK netdev code cleanup

2022-11-30 Thread Ilya Maximets
On 8/25/22 12:25, David Marchand wrote:
> Nothing earth-shattering in this series.
> 
> This is a followup after reviewing patches from Kevin: we can move
> netdev-dpdk related configuration in this code and remove unneeded stubs
> in the "dpdk" api in OVS (less code is always better :-)).


Thanks, David and Sunil!

Looks like this patch fell through the cracks at some
point.  Applied now.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1] lacp: modify the comment misspelling

2022-11-30 Thread Ilya Maximets
On 7/27/22 15:35, Mike Pattrick wrote:
> On Thu, Jun 23, 2022 at 6:32 AM yangchang  wrote:
>>
>> change 'negotations' to 'negotiations'
>>
>> Signed-off-by: yangchang 
> 
> Good catch!
> 
> Acked-by: Mike Pattrick 


Applied.  Thanks!

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofp-msgs: Fix comment typo.

2022-11-30 Thread Ilya Maximets
On 11/25/22 18:26, Adrian Moreno wrote:
> 
> 
> On 5/24/22 15:04, mit...@outlook.com wrote:
>> From: Lin Huang 
>>
>> Fix comment typo.
>>
>> Signed-off-by: Lin Huang 
>> ---
>>   lib/ofp-msgs.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/lib/ofp-msgs.c b/lib/ofp-msgs.c
>> index 93aa81297..fdb898064 100644
>> --- a/lib/ofp-msgs.c
>> +++ b/lib/ofp-msgs.c
>> @@ -148,7 +148,7 @@ struct raw_instance {
>>   /* Information about a particular 'enum ofpraw'. */
>>   struct raw_info {
>>   /* All possible instantiations of this OFPRAW_* into OpenFlow headers. 
>> */
>> -    struct raw_instance *instances; /* min_version - max_version + 1 elems. 
>> */
>> +    struct raw_instance *instances; /* max_version - min_version + 1 elems. 
>> */
>>   uint8_t min_version;
>>   uint8_t max_version;
>>  
> 
> Good catch!
> Acked-by: Adrian Moreno 

Applied.  Thanks!

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v2] dpdk: Update to use v22.11.

2022-11-30 Thread Stokes, Ian
> Hi Ian,
> 
> Cc: Thomas and John
> 
> On Wed, Nov 23, 2022 at 12:52 PM Ian Stokes  wrote:
> > diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
> > index 23c8bbb7a..0aa90e55e 100755
> > --- a/.ci/linux-build.sh
> > +++ b/.ci/linux-build.sh
> > @@ -142,7 +142,7 @@ function install_dpdk()
> >  fi
> >  # No cache or version mismatch.
> >  rm -rf dpdk-dir
> > -wget https://fast.dpdk.org/rel/dpdk-$1.tar.xz
> > +wget https://git.dpdk.org/dpdk/snapshot/dpdk-$1.tar.xz
> >  tar xvf dpdk-$1.tar.xz > /dev/null
> >  DIR_NAME=$(tar -tf dpdk-$1.tar.xz | head -1 | cut -f1 -d"/")
> >  mv ${DIR_NAME} dpdk-dir
> 
> dpdk.org server experienced a quite heavy load this morning, with many
> clients requesting git related operations through http.
> I am not sure what or who triggered this, but I just remembered this patch 
> here.
> 
> Downloading a rc4 tarball through dpdk.org cgit is to be avoided.
> dpdk.org server does not cache this kind of operations, it is heavy on
> cpu, memory and bandwidth.
> 
> 
> I think OVS should switch to the github mirror, this would work for
> both rc and final release tarballs.
> IOW: wget https://github.com/DPDK/dpdk/archive/refs/tags/v$1.tar.gz
> 
> One drawback is that github does not seem to provide xz compressed
> tarballs, so you would need to update the script further.
> 

Thanks for the input David, I've posted a v3 but didn’t include this change as 
I wanted to discuss a bit further. 

So we use DPDK.org in our ci but we also make reference to its use throughout 
the OVS documentation when users wish to download DPDK, should this also be 
replaced with github?

I guess to my mind if github is the preferred method to retrieve the repo then 
does it raise the question of dpdk org in general providing the releases? Is 
there a plan to transition completely to github for DPDK in the future?

If the preferred method is to go ahead with github then thats OK on my side but 
I think it would be better to keep that change in a separate patch along with 
an explanation as to why github should be used rather than dpdk.org as we've 
been using the current approach for a few years without issue.

Thanks
Ian
> 
> --
> David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] dpdk: Update to use v22.11.

2022-11-30 Thread Ian Stokes
This commit add support to for DPDK v22.11, it includes the following
changes.

1. ci: Reduce DPDK compilation time.
2. system-dpdk: Update vhost tests to be compatible with DPDK 22.07.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=316528

3. system-dpdk: Update vhost tests to be compatible with DPDK 22.07.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=311332

4. netdev-dpdk: Report device bus specific information.
5. netdev-dpdk: Drop reference to Rx header split.

   http://patchwork.ozlabs.org/project/openvswitch/list/?series=321808

In addition documentation was also updated in this commit for use with
DPDK v22.11.

For credit all authors of the original commits to 'dpdk-latest' with the
above changes have been added as co-authors for this commit

Signed-off-by: David Marchand 
Co-authored-by: David Marchand 
Signed-off-by: Sunil Pai G 
Co-authored-by: Sunil Pai G 
Signed-off-by: Ian Stokes 

---
v2 -> v3
* Remove RFC status.
* Update debian control to use 22.11.

v1 -> v2
* Updated to use DPDK 22.11 rc4.

* Please Note: Although DPDK documentation has been updated in this patch
the resource has not been updated on the DPDK site as of yet, this will
be expected as part of DPDK 22.11 final release.

* The GitHub actions 'linux deb shared dpdk' is expected to fail with this
patch as DPDK 22.11 is not part of the package structure yet.
---
 .ci/linux-build.sh   |  7 ++-
 Documentation/faq/releases.rst   |  2 +-
 Documentation/intro/install/dpdk.rst | 16 +++---
 Documentation/topics/dpdk/phy.rst|  8 +--
 NEWS | 18 +--
 debian/control.in|  2 +-
 lib/netdev-dpdk.c| 24 +++--
 rhel/openvswitch-fedora.spec.in  |  2 +-
 tests/system-dpdk.at | 78 ++--
 9 files changed, 69 insertions(+), 88 deletions(-)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 23c8bbb7a..90eac5146 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -160,6 +160,11 @@ function install_dpdk()
 # meson verbose outputs.
 DPDK_OPTS="$DPDK_OPTS -Ddeveloper_mode=disabled"
 
+# OVS compilation and "normal" unit tests (run in the CI) do not depend on
+# any DPDK driver being present.
+# We can disable all drivers to save compilation time.
+DPDK_OPTS="$DPDK_OPTS -Ddisable_drivers=*/*"
+
 # Install DPDK using prefix.
 DPDK_OPTS="$DPDK_OPTS --prefix=$(pwd)/build"
 
@@ -228,7 +233,7 @@ fi
 
 if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then
 if [ -z "$DPDK_VER" ]; then
-DPDK_VER="21.11.2"
+DPDK_VER="22.11"
 fi
 install_dpdk $DPDK_VER
 fi
diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index ac0001cd5..e19f54c8f 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -233,7 +233,7 @@ Q: Are all the DPDK releases that OVS versions work with 
maintained?
 The latest information about DPDK stable and LTS releases can be found
 at `DPDK stable`_.
 
-.. _DPDK stable: http://doc.dpdk.org/guides-21.11/contributing/stable.html
+.. _DPDK stable: http://doc.dpdk.org/guides-22.11/contributing/stable.html
 
 Q: I get an error like this when I configure Open vSwitch:
 
diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index a284e6851..2193efddc 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -42,7 +42,7 @@ Build requirements
 In addition to the requirements described in :doc:`general`, building Open
 vSwitch with DPDK will require the following:
 
-- DPDK 21.11.2
+- DPDK 22.11
 
 - A `DPDK supported NIC`_
 
@@ -59,8 +59,8 @@ vSwitch with DPDK will require the following:
 
 Detailed system requirements can be found at `DPDK requirements`_.
 
-.. _DPDK supported NIC: https://doc.dpdk.org/guides-21.11/nics/index.html
-.. _DPDK requirements: 
https://doc.dpdk.org/guides-21.11/linux_gsg/sys_reqs.html
+.. _DPDK supported NIC: https://doc.dpdk.org/guides-22.11/nics/index.html
+.. _DPDK requirements: 
https://doc.dpdk.org/guides-22.11/linux_gsg/sys_reqs.html
 
 .. _dpdk-install:
 
@@ -73,9 +73,9 @@ Install DPDK
 #. Download the `DPDK sources`_, extract the file and set ``DPDK_DIR``::
 
$ cd /usr/src/
-   $ wget https://fast.dpdk.org/rel/dpdk-21.11.2.tar.xz
-   $ tar xf dpdk-21.11.2.tar.xz
-   $ export DPDK_DIR=/usr/src/dpdk-stable-21.11.2
+   $ wget https://fast.dpdk.org/rel/dpdk-22.11.tar.xz
+   $ tar xf dpdk-22.11.tar.xz
+   $ export DPDK_DIR=/usr/src/dpdk-22.11
$ cd $DPDK_DIR
 
 #. Configure and install DPDK using Meson
@@ -121,7 +121,7 @@ Install DPDK
 
 .. _DPDK sources: http://dpdk.org/rel
 .. _DPDK documentation:
-   https://doc.dpdk.org/guides-21.11/linux_gsg/build_dpdk.html
+   https://doc.dpdk.org/guides-22.11/linux_gsg/build_dpdk.html
 
 Install OVS
 ~~~
@@ -722,7 +722,7 @@ Limitations
   release notes`_.
 
 .. 

Re: [ovs-dev] [v5] odp-execute: Add ISA implementation of set_masked IPv6 action

2022-11-30 Thread Phelan, Michael


> -Original Message-
> From: Finn, Emma 
> Sent: Wednesday 30 November 2022 14:15
> To: Eelco Chaudron ; Phelan, Michael
> 
> Cc: Ilya Maximets ; d...@openvswitch.org; Van
> Haaren, Harry ; Stokes, Ian
> 
> Subject: RE: [ovs-dev] [v5] odp-execute: Add ISA implementation of
> set_masked IPv6 action
 
> > >> I'm also wondering why CI didn't catch that...
> > >>
> > >> There might be 2 reasons:
> > >>
> > >> 1. Actions autovalidator is not enabled in CI, or 2. CI system
> > >> doesn't have avx512vbmi.
> > >>
> > >> Michael, could you check that?
> > >
> > > Hi Ilya,
> > > The CI system does have avx512vbmi, however, the actions
> > > autovalidator is
> > never enabled for any of the tests.
> > >
> > > I could add a test to configure with the actions autovalidator if
> > > you think
> > this would be a good value add for the CI?
> >
> > I would suggest doing a run with and without all the avx512 auto
> > validators enabled at compile time.
> >
Hi Eelco,
I believe make check-local is run through the GitHub Build and Test job, Aaron 
you might correct me if I'm wrong on that. 
If this is the case then is there a need to do a check without AVX512 enabled 
on the Intel CI?

Kind Regards,
Michael.
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH v2] dpdk: Update to use v22.11.

2022-11-30 Thread Stokes, Ian
: Re: [ovs-dev] [RFC PATCH v2] dpdk: Update to use v22.11.
> 
> On 11/23/22 12:52, Ian Stokes wrote:
> > This commit add support to for DPDK v22.11, it includes the following
> > changes.
> >
> > 1. ci: Reduce DPDK compilation time.
> > 2. system-dpdk: Update vhost tests to be compatible with DPDK 22.07.
> >
> >http://patchwork.ozlabs.org/project/openvswitch/list/?series=316528
> >
> > 3. system-dpdk: Update vhost tests to be compatible with DPDK 22.07.
> >
> >http://patchwork.ozlabs.org/project/openvswitch/list/?series=311332
> >
> > 4. netdev-dpdk: Report device bus specific information.
> > 5. netdev-dpdk: Drop reference to Rx header split.
> >
> >http://patchwork.ozlabs.org/project/openvswitch/list/?series=321808
> >
> > In addition documentation was also updated in this commit for use with
> > DPDK v22.11.
> >
> > For credit all authors of the original commits to 'dpdk-latest' with the
> > above changes have been added as co-authors for this commit
> >
> > Signed-off-by: David Marchand 
> > Co-authored-by: David Marchand 
> > Signed-off-by: Sunil Pai G 
> > Co-authored-by: Sunil Pai G 
> > Signed-off-by: Ian Stokes 
> >
> > ---
> > v1 -> v2
> > * Updated to use DPDK 22.11 rc4.
> >
> > * Please Note: Although DPDK documentation has been updated in this patch
> > the resource has not been updated on the DPDK site as of yet, this will
> > be expected as part of DPDK 22.11 final release.
> >
> > * The GitHub actions 'linux deb shared dpdk' is expected to fail with this
> > patch as DPDK 22.11 is not part of the package structure yet.
> 
> This patch is missing the update in debian/control.in for the version
> of libdpdk-dev.  The line is "commented out" because the same file
> is used to generate control file for both DPDK and non-DPDK builds.
> 
> This won't fix the build issue, but the failure will be more obvious,
> i.e. dependency installation failure vs linkage issues.
> 
> For the actual path forward for the debian build, we discussed it in
> the past with Frode here:
>   https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/396124.html
> 
> So, what we need is Ubuntu to start packaging DPDK 22.11 in the dev
> branch for Ubuntu 23.04.  Then we can modify our CI scripts to use
> pre-release container images to test.
> 
> FWIW, Debian seems to already package DPDK 22.11 in the experimental
> branch:
>   https://packages.debian.org/experimental/libdpdk-dev
> 
> Frode, do you know the approximate timeline on when we could expect
> development container images of Ubuntu with DPDK 22.11 to be available?
> Or how to get one?
> 
> Alternative solution is to temporarily disable DPDK-enabled build for
> deb packages in GHA, until Ubuntu 23.04 is available.
> 
> There is no rush, AFAIK, we have a bit of time before the soft freeze,
> but it would be nice to have DPDK 22.11 support merged by the end of a
> year.

Thanks Ilya, I've updated the control file in v3 but left it commented for the 
moment until we have a decision here.

Thanks
Ian
> 
> Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/4] dpif-netdev: Make pmd-rxq-show time configurable.

2022-11-30 Thread David Marchand
On Wed, Nov 30, 2022 at 1:55 PM Kevin Traynor  wrote:
> >> @@ -1463,5 +1479,16 @@ dpif_netdev_pmd_info(struct unixctl_conn *conn, int 
> >> argc, const char *argv[],
> >>   }
> >>   if (type == PMD_INFO_SHOW_RXQ) {
> >> -pmd_info_show_rxq(, pmd);
> >> +if (first_show_rxq) {
> >> +if (!secs || secs > max_secs) {
> >> +secs = max_secs;
> >> +} else {
> >> +secs = ROUND_UP(secs,
> >> +PMD_INTERVAL_LEN / 
> >> INTERVAL_USEC_TO_SEC);
> >> +}
> >> +ds_put_format(, "Displaying last %u seconds "
> >> +  "pmd usage %%\n", secs);
> >> +first_show_rxq = false;
> >
> > Always displaying this banner regardless of a pmd matching would make
> > this code smaller.
>
> If I got you right, wouldn't that lead to this?
>
> # ovs-appctl dpif-netdev/pmd-rxq-show -pmd 8
> Displaying last 60 seconds pmd usage %
> pmd thread numa_id 0 core_id 8:
>isolated : false
>port: myportqueue-id:  0 (enabled)   pmd usage:  0 %
>overhead:  0 %
>
> # ovs-appctl dpif-netdev/pmd-rxq-show -pmd 5
> Displaying last 60 seconds pmd usage %
> #
>
> I think it looks a bit off to give info about the stats for the case
> where there are then no pmds matching.

I have no strong opinion, just that I preferred simpler code.


>
> > Plus, secs can be computed once, before this per pmd loop.
> > Wdyt?
> >
>
> Secs are only calculated once as is. It could be moved outside the loop
> but then it might be calculated and not used for other commands or no
> pmd being displayed. So not sure it's worth it considering i think i'd
> still want to the keep the 'if (first_show_rxq)' for the banner based on
> above.
>
> Does it sound ok?

Yes, so keep it as is.


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v5] odp-execute: Add ISA implementation of set_masked IPv6 action

2022-11-30 Thread Finn, Emma

> 
>  On 11/25/22 17:23, Emma Finn wrote:
> > This commit adds support for the AVX512 implementation of the
> > ipv6_set_addrs action as well as an AVX512 implementation of
> > updating the L4 checksums.
> >
> > Signed-off-by: Emma Finn 
> >
> > ---
> > v5:
> >   - Fixed load for ip6 src and dst mask for checksum check.
> > v4:
> >   - Reworked and moved check for checksum outside loop.
> >   - Code cleanup based on review from Eelco.
> > v3:
> >   - Added a runtime check for AVX512 vbmi.
> > v2:
> >   - Added check for availbility of s6_addr32 field of struct in6_addr.
> >   - Fixed network headers for freebsd builds.
> > ---
> > ---
> >  lib/odp-execute-avx512.c  | 204
> > ++
> >  lib/odp-execute-private.c |  17 
> >  lib/odp-execute-private.h |   1 +
> >  3 files changed, 222 insertions(+)
> 
>  Hi, Emma.  Thanks for the patch!
>  I didn't review the actual AVX512 code, but I have a couple of
>  questions and nits inline.
> 
> >>>
> >>> Thanks Ilya.
> >>> My replies are inline below.
> >>>
> >>> 
> > +
> > +/* This function performs the same operation on each packet in
> > +the batch as
> > + * the scalar odp_set_ipv6() function. */
> 
>  I'm not sure if that statement is correct.  If you'll look at the
>  odp_set_ipv6() implementation and precisely at the
>  packet_set_ipv6() implementation, there is a check for the routing
>  extension header combined with the check for the fragmentation
>  header
>  (packet_rh_present) to prevent writing into L4 fields that do not
>  exist or, in case of routing header being present, checksum should
>  not be updated for the destination address.
> 
>  Could you point me to the AVX512 code that is responsible for that
> check?
> 
> >>> I think the AVX code is handling this case the same as scalar and
> >>> also I
> >> cannot reproduce a failure with the autovalidator.
> >>> If I am following the scalar code correctly, you're right. If there
> >>> is a routing
> >> extension header present, for the dst address no checksum will happen.
> >>> But similarly for src address, a checksum won't happen.
> >>> As packet_update_csum128() will only do a checksum if ip6_nxt is
> >>> UPD,TCP
> >> or ICMPv6. Which won't be the case if any extension header is present.
> >>
> >> Not really, the 'proto' argument in this function is one of the
> >> results of
> >> packet_rh_present() that iterates over all the extension headers and
> >> takes the protocol number from the last one.  So, extension headers
> >> are jumped over this way.
> >>
> >>> Similarly in the AVX code, l4 checksum will only happen if ip6_nxt
> >>> is
> >> UPD,TCP or ICMPv6, i.e no extension header is present.
> >>> So I think this case is covered if I'm not missing any corner cases?
> >>
> >> I didn't read the AVX code carefully enough to confirm that, but it
> >> is not really a correct behavior as extension headers should
> >> generally be just ignored except for fragmentation header and the
> routing header.  So, the logic is:
> >>
> >> - If the fragmentation header is present and it is a 'later'
> >>   fragment - skip the checksum as there is no L4 header in
> >>   the packet.  For the 'first' fragment the checksum should
> >>   be re-calculated.
> >>
> >> - If the routing header with non-zero segments_left is present
> >>   then update of the destination address should not be reflected
> >>   in the checksum.  Update of the source address should still
> >>   trigger the checksum update.  This is because the original
> >>   packet checksum is calculated with the destination address
> >>   taken from the last segment of the routing header.
> >>
> >> - In all other cases, extension headers should be just ignored
> >>   and the checksum should be updated.
> >>
> >> I'm not sure if that logic is covering all the cases, but that is
> >> what scalar code is doing.
> >>
> >>> Have you been able to see a failure with autovalidator ?
> >>
> >> Yes, there is a failure on a system test:
> >>
> >> 9. system-traffic.at:229: testing datapath - ping6 between two ports
> >> with header modify ...
> >>
> >> 2022-11-
> 28T17:27:13.067Z|00107|dpif_lookup_avx512_gather|INFO|Using
> >> non-specialized AVX512 lookup for subtable (4,5) and possibly oth ers.
> >> 2022-11-28T17:27:13.389Z|00108|odp_execute_impl|ERR|Autovalidation
> of
> >> avx512 failed. Details:
> >> Packet: 0
> >> Action : set(ipv6(dst=fc00::2))
> >> Good hex:
> >>   e4 11 22 33 44 54 e4 11-22 33 44 55 86 dd 60 06
> >> 0010  01 5d 05 b0 2c 40 fc 00-00 00 00 00 00 00 00 00
> >> 0020  00 00 00 00 00 01 fc 00-00 00 00 00 00 00 00 00
> >> 0030  00 00 00 00 00 02 3a 00-00 01 81 05 2f 0d 80 00
> >> 0040  0b bc 39 39 00 01 71 ef-84 63 00 00 00 00 13 ed
> >> 0050  05 00 00 00 00 00 10 11-12 13 14 15 16 17 18 19 <...> Test 

Re: [ovs-dev] [PATCH 1/4] dpif-netdev: Make pmd-rxq-show time configurable.

2022-11-30 Thread Kevin Traynor

On 22/11/2022 14:32, David Marchand wrote:

On Wed, Oct 5, 2022 at 2:53 PM Kevin Traynor  wrote:


pmd-rxq-show shows the Rx queue to pmd assignments as well as the
pmd usage of each Rx queue.

Up until now a tail length of 60 seconds pmd usage was shown
for each Rx queue, as this is the value used during rebalance
to avoid any spike affects.


effects*



When debugging or tuning, it is also convienent to display the


convenient*


pmd usage of an Rx queue over a shorter time frame, so any changes
config or traffic that impact pmd usage can be evaulated more quickly.


evaluated*



fixed spellings :/



A parameter is added that allows pmd-rxq-show stats pmd usage to
be shown for a shorter time frame. Values are rounded up to the
nearest 5 seconds as that is the measurement granularity and the value
used is displayed. e.g.

$ ovs-appctl dpif-netdev/pmd-rxq-show -secs 5
  Displaying last 5 seconds pmd usage %
  pmd thread numa_id 0 core_id 4:
isolated : false
port: dpdk0queue-id:  0 (enabled)   pmd usage: 95 %
overhead:  4 %

The default time frame has not changed and the maximum value
is limited to the maximum stored tail length (60 seconds).

Signed-off-by: Kevin Traynor 


I was expecting the doc and test update as part of this patch.
Not a big deal if you prefer it separate in patch 2/3.



I usually put update to NEWS in different patch for review, because it 
keeps getting updates and then the patch won't apply cleanly for reviewers.


I put the tests separately to distinguish old tests updated and tests 
for the new functionality.


I'm fine with squashing all of these, but will leave the docs at least 
separate for now to avoid conflicts for reviewers in a few weeks time.



Overall, the series lgtm, I have some comments on this first patch, see below.



Thanks for reviewing.




---
  lib/dpif-netdev-private-thread.h |  2 +-
  lib/dpif-netdev.c| 91 
  tests/pmd.at |  9 
  3 files changed, 80 insertions(+), 22 deletions(-)

diff --git a/lib/dpif-netdev-private-thread.h b/lib/dpif-netdev-private-thread.h
index 4472b199d..1ec3cd794 100644
--- a/lib/dpif-netdev-private-thread.h
+++ b/lib/dpif-netdev-private-thread.h
@@ -115,5 +115,5 @@ struct dp_netdev_pmd_thread {

  /* Write index for 'busy_cycles_intrvl'. */
-unsigned int intrvl_idx;
+atomic_count intrvl_idx;
  /* Busy cycles in last PMD_INTERVAL_MAX intervals. */
  atomic_ullong *busy_cycles_intrvl;
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index a45b46014..a4e44b657 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -161,9 +161,11 @@ static struct odp_support dp_netdev_support = {
  /* Time in microseconds of the interval in which rxq processing cycles used
   * in rxq to pmd assignments is measured and stored. */
-#define PMD_INTERVAL_LEN 1000LL
+#define PMD_INTERVAL_LEN 500LL
+/* For converting PMD_INTERVAL_LEN to secs. */
+#define INTERVAL_USEC_TO_SEC 100LL

  /* Number of intervals for which cycles are stored
   * and used during rxq to pmd assignment. */
-#define PMD_INTERVAL_MAX 6
+#define PMD_INTERVAL_MAX 12

  /* Time in microseconds to try RCU quiescing. */
@@ -429,5 +431,5 @@ struct dp_netdev_rxq {
queue doesn't need to be pinned to a
particular core. */
-unsigned intrvl_idx;   /* Write index for 'cycles_intrvl'. */
+atomic_count intrvl_idx;   /* Write index for 'cycles_intrvl'. */
  struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
  bool is_vhost; /* Is rxq of a vhost port. */
@@ -617,4 +619,7 @@ dp_netdev_rxq_set_intrvl_cycles(struct dp_netdev_rxq *rx,
  static uint64_t
  dp_netdev_rxq_get_intrvl_cycles(struct dp_netdev_rxq *rx, unsigned idx);
+static uint64_t
+get_interval_values(atomic_ullong *source, atomic_count *cur_idx,
+int num_to_read);
  static void
  dpif_netdev_xps_revalidate_pmd(const struct dp_netdev_pmd_thread *pmd,
@@ -871,5 +876,6 @@ sorted_poll_list(struct dp_netdev_pmd_thread *pmd, struct 
rxq_poll **list,

  static void
-pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
+pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd,
+  int secs)
  {
  if (pmd->core_id != NON_PMD_CORE_ID) {
@@ -879,4 +885,5 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd)
  uint64_t busy_cycles = 0;
  uint64_t total_rxq_proc_cycles = 0;
+unsigned int intervals;

  ds_put_format(reply,
@@ -890,13 +897,12 @@ pmd_info_show_rxq(struct ds *reply, struct 
dp_netdev_pmd_thread *pmd)
  /* Get the total pmd cycles for an interval. */
  atomic_read_relaxed(>intrvl_cycles, _cycles);
+/* Calculate how many intervals are to be used. */
+intervals = DIV_ROUND_UP(secs,
+ 

Re: [ovs-dev] [PATCH v1] ovs-ctl: Allow inclusion of hugepages in coredumps

2022-11-30 Thread Ilya Maximets
On 11/25/22 18:19, Adrian Moreno wrote:
> Hi Mike,
> 
> Sorry it took that long to review this patch.
> 
> On 3/25/22 23:17, Mike Pattrick wrote:
>> Add new option --dump-hugepages option in ovs-ctl to enable the addition
>> of hugepages in the core dump filter.
>>
>> Signed-off-by: Mike Pattrick 
>> ---
>>   NEWS |  4 
>>   utilities/ovs-ctl.in | 15 +++
>>   2 files changed, 15 insertions(+), 4 deletions(-)
>>
>> diff --git a/NEWS b/NEWS
>> index 8fa57836a..7af60dce3 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -3,6 +3,10 @@ Post-v2.17.0
>>  - OVSDB:
>>    * 'relay' service model now supports transaction history, i.e. honors 
>> the
>>  'last-txn-id' field in 'monitor_cond_since' requests from clients.
>> +   - ovs-ctl:
>> + * New option '--dump-hugepages' to include hugepages in core dumps. 
>> This
>> +   can assist with postmortem analysis involving DPDK, but may also 
>> produce
>> +   significantly larger core dump files.
>>   
> 
> I'm afraid this part needs rebasing.
> 
>>     v2.17.0 - 17 Feb 2022
>> diff --git a/utilities/ovs-ctl.in b/utilities/ovs-ctl.in
>> index e6e07f476..8f900314b 100644
>> --- a/utilities/ovs-ctl.in
>> +++ b/utilities/ovs-ctl.in
>> @@ -103,8 +103,13 @@ set_system_ids () {
>>   action "Configuring Open vSwitch system IDs" "$@" $extra_ids
>>   }
>>   -check_force_cores () {
>> -    if test X"$FORCE_COREFILES" = Xyes; then
>> +check_core_config () {
>> +    if test X"$DUMP_HUGEPAGES" = Xyes; then
>> +    echo 0x3f > /proc/self/coredump_filter

Shouldn't this be 0x7f instead?
0x3f doesn't enable bit #6, which is responsible for dumping
shared huge pages.  Or am I missing something?

Best regards, Ilya Maximets.

>> +    if test X"$FORCE_COREFILES" = Xyes; then
>> +    ulimit -c unlimited
>> +    fi
>> +    elif test X"$FORCE_COREFILES" = Xyes; then
>>   ulimit -c 67108864
>>   fi
>>   }
>> @@ -116,7 +121,7 @@ del_transient_ports () {
>>   }
>>     do_start_ovsdb () {
>> -    check_force_cores
>> +    check_core_config
>>     if daemon_is_running ovsdb-server; then
>>   log_success_msg "ovsdb-server is already running"
>> @@ -193,7 +198,7 @@ add_managers () {
>>   }
>>     do_start_forwarding () {
>> -    check_force_cores
>> +    check_core_config
>>     insert_mod_if_required || return 1
>>   @@ -330,6 +335,7 @@ set_defaults () {
>>     DAEMON_CWD=/
>>   FORCE_COREFILES=yes
>> +    DUMP_HUGEPAGES=no
>>   MLOCKALL=yes
>>   SELF_CONFINEMENT=yes
>>   MONITOR=yes
>> @@ -419,6 +425,7 @@ Other important options for "start", "restart" and 
>> "force-reload-kmod":
>>   Less important options for "start", "restart" and "force-reload-kmod":
>>     --daemon-cwd=DIR   set working dir for OVS daemons (default: 
>> $DAEMON_CWD)
>>     --no-force-corefiles   do not force on core dumps for OVS daemons
>> +  --dump-hugepages   include hugepages in coredumps
>>     --no-mlockall  do not lock all of ovs-vswitchd into 
>> memory
>>     --ovsdb-server-priority=NICE   set ovsdb-server's niceness (default: 
>> $OVSDB_SERVER_PRIORITY)
>>     --ovsdb-server-options=OPTIONS additional options for ovsdb-server 
>> (example: '-vconsole:dbg -vfile:dbg')
>>
> 
> Tested locally and verified that with the option hugepages appear in 
> coredumps.
> Apart from the need to rebase the NEWS, the patch looks good to me.
> 
> Acked-by: Adrian Moreno 
> 
> -- 
> Adrián Moreno

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v5] odp-execute: Add ISA implementation of set_masked IPv6 action

2022-11-30 Thread Eelco Chaudron



On 29 Nov 2022, at 17:35, Phelan, Michael wrote:

>> -Original Message-
>> From: Ilya Maximets 
>> Sent: Tuesday 29 November 2022 15:07
>> To: Finn, Emma ; d...@openvswitch.org; Phelan,
>> Michael 
>> Cc: i.maxim...@ovn.org; Eelco Chaudron ; Van
>> Haaren, Harry ; Stokes, Ian
>> 
>> Subject: Re: [ovs-dev] [v5] odp-execute: Add ISA implementation of
>> set_masked IPv6 action
>>
>> On 11/29/22 15:09, Finn, Emma wrote:
>>>
>>>
 -Original Message-
 From: Ilya Maximets 
 Sent: Friday 25 November 2022 17:22
 To: Finn, Emma ; d...@openvswitch.org
 Cc: i.maxim...@ovn.org; Eelco Chaudron 
 Subject: Re: [ovs-dev] [v5] odp-execute: Add ISA implementation of
 set_masked IPv6 action

 On 11/25/22 17:23, Emma Finn wrote:
> This commit adds support for the AVX512 implementation of the
> ipv6_set_addrs action as well as an AVX512 implementation of
> updating the L4 checksums.
>
> Signed-off-by: Emma Finn 
>
> ---
> v5:
>   - Fixed load for ip6 src and dst mask for checksum check.
> v4:
>   - Reworked and moved check for checksum outside loop.
>   - Code cleanup based on review from Eelco.
> v3:
>   - Added a runtime check for AVX512 vbmi.
> v2:
>   - Added check for availbility of s6_addr32 field of struct in6_addr.
>   - Fixed network headers for freebsd builds.
> ---
> ---
>  lib/odp-execute-avx512.c  | 204
> ++
>  lib/odp-execute-private.c |  17 
>  lib/odp-execute-private.h |   1 +
>  3 files changed, 222 insertions(+)

 Hi, Emma.  Thanks for the patch!
 I didn't review the actual AVX512 code, but I have a couple of
 questions and nits inline.

>>>
>>> Thanks Ilya.
>>> My replies are inline below.
>>>
>>> 
> +
> +/* This function performs the same operation on each packet in the
> +batch as
> + * the scalar odp_set_ipv6() function. */

 I'm not sure if that statement is correct.  If you'll look at the
 odp_set_ipv6() implementation and precisely at the
 packet_set_ipv6() implementation, there is a check for the routing
 extension header combined with the check for the fragmentation header
 (packet_rh_present) to prevent writing into L4 fields that do not
 exist or, in case of routing header being present, checksum should
 not be updated for the destination address.

 Could you point me to the AVX512 code that is responsible for that check?

>>> I think the AVX code is handling this case the same as scalar and also I
>> cannot reproduce a failure with the autovalidator.
>>> If I am following the scalar code correctly, you're right. If there is a 
>>> routing
>> extension header present, for the dst address no checksum will happen.
>>> But similarly for src address, a checksum won't happen.
>>> As packet_update_csum128() will only do a checksum if ip6_nxt is UPD,TCP
>> or ICMPv6. Which won't be the case if any extension header is present.
>>
>> Not really, the 'proto' argument in this function is one of the results of
>> packet_rh_present() that iterates over all the extension headers and takes
>> the protocol number from the last one.  So, extension headers are jumped
>> over this way.
>>
>>> Similarly in the AVX code, l4 checksum will only happen if ip6_nxt is
>> UPD,TCP or ICMPv6, i.e no extension header is present.
>>> So I think this case is covered if I'm not missing any corner cases?
>>
>> I didn't read the AVX code carefully enough to confirm that, but it is not 
>> really
>> a correct behavior as extension headers should generally be just ignored
>> except for fragmentation header and the routing header.  So, the logic is:
>>
>> - If the fragmentation header is present and it is a 'later'
>>   fragment - skip the checksum as there is no L4 header in
>>   the packet.  For the 'first' fragment the checksum should
>>   be re-calculated.
>>
>> - If the routing header with non-zero segments_left is present
>>   then update of the destination address should not be reflected
>>   in the checksum.  Update of the source address should still
>>   trigger the checksum update.  This is because the original
>>   packet checksum is calculated with the destination address
>>   taken from the last segment of the routing header.
>>
>> - In all other cases, extension headers should be just ignored
>>   and the checksum should be updated.
>>
>> I'm not sure if that logic is covering all the cases, but that is what 
>> scalar code is
>> doing.
>>
>>> Have you been able to see a failure with autovalidator ?
>>
>> Yes, there is a failure on a system test:
>>
>> 9. system-traffic.at:229: testing datapath - ping6 between two ports with
>> header modify ...
>>
>> 2022-11-28T17:27:13.067Z|00107|dpif_lookup_avx512_gather|INFO|Using
>> non-specialized AVX512 lookup for subtable (4,5) and possibly oth ers.
>> 

[ovs-dev] [PATCH ovn] actions: Clarify the NAT type for ovnact_ct_nat

2022-11-30 Thread Ales Musil
The encode_ct_nat took a bool specifying if the NAT
type is source or destination. However that doesn't work
with ct_commit_nat which has the NAT type unspecified.
Add enum that allows to differentiate between those
to make it clearer which type of NAT should be applied.

Signed-off-by: Ales Musil 
---
 include/ovn/actions.h | 10 
 lib/actions.c | 58 ---
 2 files changed, 42 insertions(+), 26 deletions(-)

diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index a56351081..de9647f1b 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -264,6 +264,14 @@ struct ovnact_ct_commit_v1 {
 ovs_be128 ct_label, ct_label_mask;
 };
 
+/* Type of NAT used for the particular action.
+ * UNSPEC translates to applying NAT that works for both directions. */
+enum ovnact_ct_nat_type {
+OVNACT_CT_NAT_SRC,
+OVNACT_CT_NAT_DEST,
+OVNACT_CT_NAT_UNSPEC,
+};
+
 /* OVNACT_CT_DNAT, OVNACT_CT_SNAT, OVNACT_CT_COMMIT_NAT. */
 struct ovnact_ct_nat {
 struct ovnact ovnact;
@@ -279,6 +287,8 @@ struct ovnact_ct_nat {
uint16_t port_hi;
 } port_range;
 
+enum ovnact_ct_nat_type type;
+
 bool commit;/* Explicit commit action. */
 
 uint8_t ltable; /* Logical table ID of next table. */
diff --git a/lib/actions.c b/lib/actions.c
index 47ec654e1..f3a5f3bf3 100644
--- a/lib/actions.c
+++ b/lib/actions.c
@@ -910,7 +910,7 @@ encode_CT_COMMIT_V2(const struct ovnact_nest *on,
 
 static void
 parse_ct_nat(struct action_context *ctx, const char *name,
- struct ovnact_ct_nat *cn)
+ enum ovnact_ct_nat_type type, struct ovnact_ct_nat *cn)
 {
 add_prerequisite(ctx, "ip");
 
@@ -921,6 +921,8 @@ parse_ct_nat(struct action_context *ctx, const char *name,
 }
 cn->ltable = ctx->pp->cur_ltable + 1;
 cn->commit = false;
+cn->family = AF_UNSPEC;
+cn->type = OVNACT_CT_NAT_UNSPEC;
 
 if (lexer_match(ctx->lexer, LEX_T_LPAREN)) {
 if (ctx->lexer->token.type != LEX_T_INTEGER
@@ -932,10 +934,12 @@ parse_ct_nat(struct action_context *ctx, const char *name,
 if (ctx->lexer->token.format == LEX_F_IPV4) {
 cn->commit = true;
 cn->family = AF_INET;
+cn->type = type;
 cn->ipv4 = ctx->lexer->token.value.ipv4;
 } else if (ctx->lexer->token.format == LEX_F_IPV6) {
 cn->commit = true;
 cn->family = AF_INET6;
+cn->type = type;
 cn->ipv6 = ctx->lexer->token.value.ipv6;
 }
 lexer_get(ctx->lexer);
@@ -984,26 +988,28 @@ parse_ct_nat(struct action_context *ctx, const char *name,
 static void
 parse_CT_DNAT(struct action_context *ctx)
 {
-parse_ct_nat(ctx, "ct_dnat", ovnact_put_CT_DNAT(ctx->ovnacts));
+parse_ct_nat(ctx, "ct_dnat", OVNACT_CT_NAT_DEST,
+ ovnact_put_CT_DNAT(ctx->ovnacts));
 }
 
 static void
 parse_CT_SNAT(struct action_context *ctx)
 {
-parse_ct_nat(ctx, "ct_snat", ovnact_put_CT_SNAT(ctx->ovnacts));
+parse_ct_nat(ctx, "ct_snat", OVNACT_CT_NAT_SRC,
+ ovnact_put_CT_SNAT(ctx->ovnacts));
 }
 
 static void
 parse_CT_DNAT_IN_CZONE(struct action_context *ctx)
 {
-parse_ct_nat(ctx, "ct_dnat_in_czone",
+parse_ct_nat(ctx, "ct_dnat_in_czone", OVNACT_CT_NAT_DEST,
  ovnact_put_CT_DNAT_IN_CZONE(ctx->ovnacts));
 }
 
 static void
 parse_CT_SNAT_IN_CZONE(struct action_context *ctx)
 {
-parse_ct_nat(ctx, "ct_snat_in_czone",
+parse_ct_nat(ctx, "ct_snat_in_czone", OVNACT_CT_NAT_SRC,
  ovnact_put_CT_SNAT_IN_CZONE(ctx->ovnacts));
 }
 
@@ -1022,6 +1028,7 @@ parse_CT_COMMIT_NAT(struct action_context *ctx)
 cn->commit = true;
 cn->ltable = ctx->pp->cur_ltable + 1;
 cn->family = AF_UNSPEC;
+cn->type = OVNACT_CT_NAT_UNSPEC;
 cn->port_range.exists = false;
 }
 
@@ -1083,8 +1090,7 @@ format_CT_COMMIT_NAT(const struct ovnact_ct_nat *cn 
OVS_UNUSED, struct ds *s)
 static void
 encode_ct_nat(const struct ovnact_ct_nat *cn,
   const struct ovnact_encode_params *ep,
-  bool snat, enum mf_field_id zone_src,
-  struct ofpbuf *ofpacts)
+  enum mf_field_id zone_src, struct ofpbuf *ofpacts)
 {
 const size_t ct_offset = ofpacts->size;
 ofpbuf_pull(ofpacts, ct_offset);
@@ -1103,25 +1109,25 @@ encode_ct_nat(const struct ovnact_ct_nat *cn,
 ofpbuf_pull(ofpacts, nat_offset);
 
 nat = ofpact_put_NAT(ofpacts);
-nat->flags = 0;
-nat->range_af = AF_UNSPEC;
+nat->range_af = cn->family;
 
-if (cn->family == AF_INET) {
-nat->range_af = AF_INET;
-nat->range.addr.ipv4.min = cn->ipv4;
-if (snat) {
+switch (cn->type) {
+case OVNACT_CT_NAT_SRC:
 nat->flags |= NX_NAT_F_SRC;
-} else {
+break;
+case OVNACT_CT_NAT_DEST:
 nat->flags |= NX_NAT_F_DST;
-}
+break;
+case OVNACT_CT_NAT_UNSPEC:
+  

Re: [ovs-dev] [PATCH v3 ovn 0/5] Add OVN component templates.

2022-11-30 Thread Dumitru Ceara
On 11/28/22 16:34, Dumitru Ceara wrote:
> On 11/23/22 09:50, Dumitru Ceara wrote:
>> On 11/22/22 21:11, Mark Michelson wrote:
>>> For the series:
>>>
>>> Acked-by: Mark Michelson 
>>
>> Thanks, Mark, for the review!
>>
>> Han, I know you were testing v2, I'll wait for your confirmation before
>> merging v3.
>>
> 
> It seems we have two ways forward with this series:
> 
> a. merge v3 and use the time between branch-22.12 creation and release
> to address any new concerns about it.
> b. give it a couple more days for though and, if nothing critical needs
> to change, merge it after the branch is created (pushing it to the 22.12
> branch as well).
> 
> In a private discussion earlier Mark and Numan agreed with "b" so I'll
> be waiting a few more days for feedback on this series.
> 

I updated patches 4 and 5 with the incremental changes suggested by Han;
I also added a small NEWS entry for this feature and a reported-at tag
to patch 4.

I added Han's and Mark's acks to the patches and then I pushed this
series to the main branch and to branch-22.12.

Thanks everyone for the reviews and for trying it out!

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] [PATCH v6 net-next] net: openvswitch: Add support to count upcall packets

2022-11-30 Thread wangchuanlei
Add support to count upall packets, when kmod of openvswitch
upcall to userspace , here count the number of packets for
upcall succeed and failed, which is a better way to see how
many packets upcalled to userspace(ovs-vswitchd) on every
interfaces.

Here modify format of code used by comments of v6.

Changes since v4 & v5:
- optimize the function used by comments

Changes since v3:
- use nested NLA_NESTED attribute in netlink message

Changes since v2:
- add count of upcall failed packets

Changes since v1:
- add count of upcall succeed packets

Signed-off-by: wangchuanlei 
---
 include/uapi/linux/openvswitch.h | 14 +
 net/openvswitch/datapath.c   | 50 
 net/openvswitch/vport.c  | 44 
 net/openvswitch/vport.h  | 24 +++
 4 files changed, 132 insertions(+)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 94066f87e9ee..8422ebf6885b 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -277,11 +277,25 @@ enum ovs_vport_attr {
OVS_VPORT_ATTR_PAD,
OVS_VPORT_ATTR_IFINDEX,
OVS_VPORT_ATTR_NETNSID,
+   OVS_VPORT_ATTR_UPCALL_STATS,
__OVS_VPORT_ATTR_MAX
 };
 
 #define OVS_VPORT_ATTR_MAX (__OVS_VPORT_ATTR_MAX - 1)
 
+/**
+ * enum ovs_vport_upcall_attr - attributes for %OVS_VPORT_UPCALL* commands
+ * @OVS_VPORT_UPCALL_SUCCESS: 64-bit upcall success packets.
+ * @OVS_VPORT_UPCALL_FAIL: 64-bit upcall fail packets.
+ */
+enum ovs_vport_upcall_attr {
+   OVS_VPORT_UPCALL_SUCCESS,
+   OVS_VPORT_UPCALL_FAIL,
+   __OVS_VPORT_UPCALL_MAX
+};
+
+#define OVS_VPORT_UPCALL_MAX (__OVS_VPORT_UPCALL_MAX - 1)
+
 enum {
OVS_VXLAN_EXT_UNSPEC,
OVS_VXLAN_EXT_GBP,  /* Flag or __u32 */
diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index c8a9075ddd0a..f9279aee2adb 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -209,6 +209,26 @@ static struct vport *new_vport(const struct vport_parms 
*parms)
return vport;
 }
 
+static void ovs_vport_upcalls(struct sk_buff *skb,
+ const struct dp_upcall_info *upcall_info,
+ bool upcall_result)
+{
+   struct vport *p = OVS_CB(skb)->input_vport;
+   struct vport_upcall_stats_percpu *vport_stats;
+
+   if (upcall_info->cmd != OVS_PACKET_CMD_MISS &&
+   upcall_info->cmd != OVS_PACKET_CMD_ACTION)
+   return;
+
+   vport_stats = this_cpu_ptr(p->upcall_stats);
+   u64_stats_update_begin(_stats->syncp);
+   if (upcall_result)
+   u64_stats_inc(_stats->n_success);
+   else
+   u64_stats_inc(_stats->n_fail);
+   u64_stats_update_end(_stats->syncp);
+}
+
 void ovs_dp_detach_port(struct vport *p)
 {
ASSERT_OVSL();
@@ -216,6 +236,9 @@ void ovs_dp_detach_port(struct vport *p)
/* First drop references to device. */
hlist_del_rcu(>dp_hash_node);
 
+   /* Free percpu memory */
+   free_percpu(p->upcall_stats);
+
/* Then destroy it. */
ovs_vport_del(p);
 }
@@ -305,6 +328,8 @@ int ovs_dp_upcall(struct datapath *dp, struct sk_buff *skb,
err = queue_userspace_packet(dp, skb, key, upcall_info, cutlen);
else
err = queue_gso_packets(dp, skb, key, upcall_info, cutlen);
+
+   ovs_vport_upcalls(skb, upcall_info, !err);
if (err)
goto err;
 
@@ -1825,6 +1850,12 @@ static int ovs_dp_cmd_new(struct sk_buff *skb, struct 
genl_info *info)
goto err_destroy_portids;
}
 
+   vport->upcall_stats = netdev_alloc_pcpu_stats(struct 
vport_upcall_stats_percpu);
+   if (!vport->upcall_stats) {
+   err = -ENOMEM;
+   goto err_destroy_portids;
+   }
+
err = ovs_dp_cmd_fill_info(dp, reply, info->snd_portid,
   info->snd_seq, 0, OVS_DP_CMD_NEW);
BUG_ON(err < 0);
@@ -2068,6 +2099,8 @@ static int ovs_vport_cmd_fill_info(struct vport *vport, 
struct sk_buff *skb,
 {
struct ovs_header *ovs_header;
struct ovs_vport_stats vport_stats;
+   struct ovs_vport_upcall_stats stat;
+   struct nlattr *nla;
int err;
 
ovs_header = genlmsg_put(skb, portid, seq, _vport_genl_family,
@@ -2097,6 +2130,15 @@ static int ovs_vport_cmd_fill_info(struct vport *vport, 
struct sk_buff *skb,
  OVS_VPORT_ATTR_PAD))
goto nla_put_failure;
 
+   nla = nla_nest_start_noflag(skb, OVS_VPORT_ATTR_UPCALL_STATS);
+   if (!nla)
+   goto nla_put_failure;
+
+   ovs_vport_get_upcall_stats(vport, );
+   if (ovs_vport_put_upcall_stats(skb, ))
+   goto nla_put_failure;
+   nla_nest_end(skb, nla);
+
if (ovs_vport_get_upcall_portids(vport, skb))
goto nla_put_failure;
 
@@ -2278,6 +2320,13 @@ static int 

Re: [ovs-dev] [PATCH] [openvswitch v4] openvswitch: Add support to count upcall packets

2022-11-30 Thread wangchuanlei
Hi,Eelco Chaudron

> +struct ovs_vport_upcall_stats {
> + __u64   tx_success; /* total packets upcalls succeed */
> + __u64   tx_fail;/* total packets upcalls failed  */
> +};
> +

=> This is no longer a user API data structure, so it should be removed from 
this include.
=>  --This structure will used in userspace, ovs-vswitchd will use it.
=>  -- and that will be another patch of ovs-vswitchd. so it keep it here ?

The above was your response to v4. However, as this structure is not part of 
the UAPI
 from the Linux side, it should not be exposed. If you need a similar structure 
in OVS one should be defined there.

--Yes, i modified here , v6 will pushed several minutes ,thanks for review!
Best regards!
wangchuanlei

On 30 Nov 2022, at 8:25, wangchuanlei wrote:

> Add support to count upall packets, when kmod of openvswitch upcall to 
> userspace , here count the number of packets for upcall succeed and 
> failed, which is a better way to see how many packets upcalled to 
> userspace(ovs-vswitchd) on every interfaces.
>
> Here modify format of code used by comments of v4.
>
> Changes since v4:
> - optimize the function used by comments
>
> Changes since v3:
> - use nested NLA_NESTED attribute in netlink message
>
> Changes since v2:
> - add count of upcall failed packets
>
> Changes since v1:
> - add count of upcall succeed packets
>
> Signed-off-by: wangchuanlei 
> ---
>  include/uapi/linux/openvswitch.h | 19 
>  net/openvswitch/datapath.c   | 50 
>  net/openvswitch/vport.c  | 44 
>  net/openvswitch/vport.h  | 19 
>  4 files changed, 132 insertions(+)
>
> diff --git a/include/uapi/linux/openvswitch.h 
> b/include/uapi/linux/openvswitch.h
> index 94066f87e9ee..ad7cea9827cc 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -126,6 +126,11 @@ struct ovs_vport_stats {
>   __u64   tx_dropped; /* no space available in linux  */
>  };
>
> +struct ovs_vport_upcall_stats {
> + __u64   tx_success; /* total packets upcalls succeed */
> + __u64   tx_fail;/* total packets upcalls failed  */
> +};
> +

=> This is no longer a user API data structure, so it should be removed from 
this include.
=>  --This structure will used in userspace, ovs-vswitchd will use it.
=>  -- and that will be another patch of ovs-vswitchd. so it keep it here ?

The above was your response to v4. However, as this structure is not part of 
the UAPI from the Linux side, it should not be exposed. If you need a similar 
structure in OVS one should be defined there.


>  /* Allow last Netlink attribute to be unaligned */
>  #define OVS_DP_F_UNALIGNED   (1 << 0)
>
> @@ -277,11 +282,25 @@ enum ovs_vport_attr {
>   OVS_VPORT_ATTR_PAD,
>   OVS_VPORT_ATTR_IFINDEX,
>   OVS_VPORT_ATTR_NETNSID,
> + OVS_VPORT_ATTR_UPCALL_STATS, /* struct ovs_vport_upcall_stats */
>   __OVS_VPORT_ATTR_MAX
>  };
>
>  #define OVS_VPORT_ATTR_MAX (__OVS_VPORT_ATTR_MAX - 1)
>
> +/**
> + * enum ovs_vport_upcall_attr - attributes for %OVS_VPORT_UPCALL* 
> +commands
> + * @OVS_VPORT_UPCALL_SUCCESS: 64-bit upcall success packets.
> + * @OVS_VPORT_UPCALL_FAIL: 64-bit upcall fail packets.
> + */
> +enum ovs_vport_upcall_attr {
> + OVS_VPORT_UPCALL_SUCCESS,
> + OVS_VPORT_UPCALL_FAIL,
> + __OVS_VPORT_UPCALL_MAX
> +};
> +
> +#define OVS_VPORT_UPCALL_MAX (__OVS_VPORT_UPCALL_MAX - 1)
> +
>  enum {
>   OVS_VXLAN_EXT_UNSPEC,
>   OVS_VXLAN_EXT_GBP,  /* Flag or __u32 */
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c 
> index c8a9075ddd0a..f9279aee2adb 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -209,6 +209,26 @@ static struct vport *new_vport(const struct vport_parms 
> *parms)
>   return vport;
>  }
>
> +static void ovs_vport_upcalls(struct sk_buff *skb,
> +   const struct dp_upcall_info *upcall_info,
> +   bool upcall_result)
> +{
> + struct vport *p = OVS_CB(skb)->input_vport;
> + struct vport_upcall_stats_percpu *vport_stats;
> +
> + if (upcall_info->cmd != OVS_PACKET_CMD_MISS &&
> + upcall_info->cmd != OVS_PACKET_CMD_ACTION)
> + return;
> +
> + vport_stats = this_cpu_ptr(p->upcall_stats);
> + u64_stats_update_begin(_stats->syncp);
> + if (upcall_result)
> + u64_stats_inc(_stats->n_success);
> + else
> + u64_stats_inc(_stats->n_fail);
> + u64_stats_update_end(_stats->syncp);
> +}
> +
>  void ovs_dp_detach_port(struct vport *p)  {
>   ASSERT_OVSL();
> @@ -216,6 +236,9 @@ void ovs_dp_detach_port(struct vport *p)
>   /* First drop references to device. */
>   hlist_del_rcu(>dp_hash_node);
>
> + /* Free percpu memory */
> + free_percpu(p->upcall_stats);
> +
>   /* Then destroy it. */
>