Re: [ovs-dev] [PATCH ovn v7 2/3] Support LSP:options:requested-chassis as a list

2022-05-18 Thread 0-day Robot
Bleep bloop.  Greetings Ihar Hrachyshka, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


build:
depbase=`echo utilities/ovn-appctl.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I.   -I ./include  -I ./include -I ./ovn -I 
./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I /var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs
-Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat 
-Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast 
-Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror  -g 
-O2 -MT utilities/ovn-appctl.o -MD -MP -MF $depbase.Tpo -c -o 
utilities/ovn-appctl.o utilities/ovn-appctl.c &&\
mv -f $depbase.Tpo $depbase.Po
/bin/sh ./libtool  --tag=CC   --mode=link gcc -std=gnu99 -Wstrict-prototypes 
-Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat -Wformat-security 
-Wswitch-enum -Wunused-parameter -Wbad-function-cast -Wcast-align 
-Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror  -g 
-O2 -o utilities/ovn-appctl utilities/ovn-appctl.o lib/libovn.la 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/ovsdb/libovsdb.la
 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib/libopenvswitch.la
 -lpthread -lrt -lm  -lunbound
libtool: link: gcc -std=gnu99 -Wstrict-prototypes -Wall -Wextra 
-Wno-sign-compare -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum 
-Wunused-parameter -Wbad-function-cast -Wcast-align -Wstrict-prototypes 
-Wold-style-definition -Wmissing-prototypes -Wmissing-field-initializers 
-fno-strict-aliasing -Wshadow -Werror -Werror -g -O2 -o utilities/ovn-appctl 
utilities/ovn-appctl.o  lib/.libs/libovn.a 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/ovsdb/.libs/libovsdb.a
 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib/.libs/libopenvswitch.a
 -lssl -lcrypto -lcap-ng -lpthread -lrt -lm -lunbound
depbase=`echo controller/bfd.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I.   -I ./include  -I ./include -I ./ovn -I 
./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I /var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs
-Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat 
-Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast 
-Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror  -g 
-O2 -MT controller/bfd.o -MD -MP -MF $depbase.Tpo -c -o controller/bfd.o 
controller/bfd.c &&\
mv -f $depbase.Tpo $depbase.Po
depbase=`echo controller/binding.o | sed 's|[^/]*$|.deps/&|;s|\.o$||'`;\
gcc -std=gnu99 -DHAVE_CONFIG_H -I.   -I ./include  -I ./include -I ./ovn -I 
./include -I ./lib -I ./lib -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/include
 -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs/lib 
-I /var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs -I 
/var/lib/jenkins/jobs/0day_robot_upstream_build_ovn_from_pw/workspace/ovs
-Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat 
-Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast 
-Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes 
-Wmissing-field-initializers -fno-strict-aliasing -Wshadow -Werror -Werror  -g 
-O2 -MT controller/binding.o -MD -MP -MF $depbase.Tpo -c -o 
controller/binding.o controller/binding.c &&\
mv -f $depbase.Tpo $depbase.Po
controller/binding.c: In function ‘claim_lport’:
controller/binding.c:1117:10: error: label at end of compound 

[ovs-dev] [PATCH ovn v7 3/3] Clone packets to both port chassis

2022-05-18 Thread Ihar Hrachyshka
When multiple chassis are set in requested-chassis, port binding is
configured in multiple cluster locations. In case of live migration
scenario, only one of the locations run a workload at a particular
point in time. Yet, it's expected that the workload may switch to
running at an additional chassis at any moment during live migration
(depends on libvirt / qemu migration progress). To speed up the switch
to near instant, do the following:

When a port located sends a packet to another port that has multiple
chassis then, in addition to sending the packet to the main chassis,
also send it to additional chassis. When the sending port is bound on
either the main or additional chassis, then handle the packet locally
plus send it to all other chassis.

This is achieved with additional flows in tables 37 and 38.

Signed-off-by: Ihar Hrachyshka 
---
 controller/binding.c  |   2 +-
 controller/binding.h  |   3 +
 controller/physical.c | 376 ++-
 ovn-nb.xml|   9 +
 ovn-sb.xml|   9 +
 tests/ovn.at  | 693 ++
 6 files changed, 948 insertions(+), 144 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 53a9120c2..4a563b721 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -1007,7 +1007,7 @@ update_port_additional_encap_if_needed(
 return true;
 }
 
-static bool
+bool
 is_additional_chassis(const struct sbrec_port_binding *pb,
   const struct sbrec_chassis *chassis_rec)
 {
diff --git a/controller/binding.h b/controller/binding.h
index 6815d1fd6..c60af8402 100644
--- a/controller/binding.h
+++ b/controller/binding.h
@@ -195,4 +195,7 @@ enum en_lport_type {
 
 enum en_lport_type get_lport_type(const struct sbrec_port_binding *);
 
+bool is_additional_chassis(const struct sbrec_port_binding *pb,
+   const struct sbrec_chassis *chassis_rec);
+
 #endif /* controller/binding.h */
diff --git a/controller/physical.c b/controller/physical.c
index 36f265a8c..af1f41c04 100644
--- a/controller/physical.c
+++ b/controller/physical.c
@@ -60,6 +60,11 @@ struct zone_ids {
 int snat;   /* MFF_LOG_SNAT_ZONE. */
 };
 
+struct tunnel {
+struct ovs_list list_node;
+const struct chassis_tunnel *tun;
+};
+
 static void
 load_logical_ingress_metadata(const struct sbrec_port_binding *binding,
   const struct zone_ids *zone_ids,
@@ -286,25 +291,83 @@ match_outport_dp_and_port_keys(struct match *match,
 match_set_reg(match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
 }
 
+static struct sbrec_encap *
+find_additional_encap_for_chassis(const struct sbrec_port_binding *pb,
+  const struct sbrec_chassis *chassis_rec)
+{
+for (int i = 0; i < pb->n_additional_encap; i++) {
+if (!strcmp(pb->additional_encap[i]->chassis_name,
+chassis_rec->name)) {
+return pb->additional_encap[i];
+}
+}
+return NULL;
+}
+
+static struct ovs_list *
+get_remote_tunnels(const struct sbrec_port_binding *binding,
+   const struct sbrec_chassis *chassis,
+   const struct hmap *chassis_tunnels)
+{
+const struct chassis_tunnel *tun;
+
+struct ovs_list *tunnels = xmalloc(sizeof *tunnels);
+ovs_list_init(tunnels);
+
+if (binding->chassis && binding->chassis != chassis) {
+tun = get_port_binding_tun(binding->encap, binding->chassis,
+   chassis_tunnels);
+if (!tun) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+VLOG_WARN_RL(
+, "Failed to locate tunnel to reach main chassis %s "
+ "for port %s. Cloning packets disabled for the chassis.",
+binding->chassis->name, binding->logical_port);
+} else {
+struct tunnel *tun_elem = xmalloc(sizeof *tun_elem);
+tun_elem->tun = tun;
+ovs_list_push_back(tunnels, _elem->list_node);
+}
+}
+
+for (int i = 0; i < binding->n_additional_chassis; i++) {
+if (binding->additional_chassis[i] == chassis) {
+continue;
+}
+const struct sbrec_encap *additional_encap;
+additional_encap = find_additional_encap_for_chassis(binding, chassis);
+tun = get_port_binding_tun(additional_encap,
+   binding->additional_chassis[i],
+   chassis_tunnels);
+if (!tun) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1);
+VLOG_WARN_RL(
+, "Failed to locate tunnel to reach additional chassis %s "
+ "for port %s. Cloning packets disabled for the chassis.",
+binding->additional_chassis[i]->name, binding->logical_port);
+continue;
+}
+struct tunnel *tun_elem = xmalloc(sizeof 

[ovs-dev] [PATCH ovn v7 2/3] Support LSP:options:requested-chassis as a list

2022-05-18 Thread Ihar Hrachyshka
When the option is set to a comma separated list of chassis names, OVN
will attempt to bind the port at any number of other locations in
addition to the main chassis.

This is useful in live migration scenarios where it's important to
prepare the environment for workloads to move to, avoiding costly flow
configuration at the moment of the final port binding location change.

This may also be useful in port mirroring scenarios.

Corresponding database fields (pb->additional_chassis,
pb->requested_additional_chassis, pb->additional_encap) are introduced
as part of the patch. These are similar to fields designed to track the
main chassis (->chassis, ->requested_chassis, ->encap). But because we
support any number of additional chassis, these fields are lists.

Signed-off-by: Ihar Hrachyshka 
---
 NEWS |   1 +
 controller/binding.c | 298 ---
 controller/lport.c   |  46 ---
 controller/lport.h   |  11 +-
 northd/northd.c  |  62 ++---
 northd/ovn-northd.c  |   4 +-
 ovn-nb.xml   |  20 ++-
 ovn-sb.ovsschema |  17 ++-
 ovn-sb.xml   |  63 +++--
 tests/ovn.at | 218 +++
 10 files changed, 638 insertions(+), 102 deletions(-)

diff --git a/NEWS b/NEWS
index f18a7d766..856d683f3 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,7 @@ Post v22.03.0
 explicitly request routers to reply to any ARP/ND request for a VIP
 (when set to "all") and only for reachable VIPs (when set to "reachable"
 or by default).
+  - Support list of chassis for Logical_Switch_Port:options:requested-chassis.
 
 OVN v22.03.0 - 11 Mar 2022
 --
diff --git a/controller/binding.c b/controller/binding.c
index 6bdbe9912..53a9120c2 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -930,6 +930,130 @@ claimed_lport_set_up(const struct sbrec_port_binding *pb,
 }
 }
 
+typedef void (*set_func)(const struct sbrec_port_binding *pb,
+ const struct sbrec_encap *);
+
+static bool
+update_port_encap_if_needed(const struct sbrec_port_binding *pb,
+const struct sbrec_chassis *chassis_rec,
+const struct ovsrec_interface *iface_rec,
+bool sb_readonly)
+{
+const struct sbrec_encap *encap_rec =
+sbrec_get_port_encap(chassis_rec, iface_rec);
+if ((encap_rec && pb->encap != encap_rec) ||
+(!encap_rec && pb->encap)) {
+if (sb_readonly) {
+return false;
+}
+sbrec_port_binding_set_encap(pb, encap_rec);
+}
+return true;
+}
+
+static void
+append_additional_encap(const struct sbrec_port_binding *pb,
+const struct sbrec_encap *encap)
+{
+struct sbrec_encap **additional_encap = xmalloc(
+(pb->n_additional_encap + 1) * (sizeof *additional_encap));
+memcpy(additional_encap, pb->additional_encap,
+   pb->n_additional_encap * (sizeof *additional_encap));
+additional_encap[pb->n_additional_encap] = (struct sbrec_encap *) encap;
+sbrec_port_binding_set_additional_encap(
+pb, additional_encap, pb->n_additional_encap + 1);
+free(additional_encap);
+}
+
+static void
+remove_additional_encap_for_chassis(const struct sbrec_port_binding *pb,
+const struct sbrec_chassis *chassis_rec)
+{
+struct sbrec_encap **additional_encap = xmalloc(
+pb->n_additional_encap * (sizeof *additional_encap));
+
+int idx = 0;
+for (int i = 0; i < pb->n_additional_encap; i++) {
+if (!strcmp(pb->additional_encap[i]->chassis_name,
+chassis_rec->name)) {
+continue;
+}
+additional_encap[idx++] = pb->additional_encap[i];
+}
+sbrec_port_binding_set_additional_encap(pb, additional_encap, idx);
+free(additional_encap);
+}
+
+static bool
+update_port_additional_encap_if_needed(
+const struct sbrec_port_binding *pb,
+const struct sbrec_chassis *chassis_rec,
+const struct ovsrec_interface *iface_rec,
+bool sb_readonly)
+{
+const struct sbrec_encap *encap_rec =
+sbrec_get_port_encap(chassis_rec, iface_rec);
+if (encap_rec) {
+for (int i = 0; i < pb->n_additional_encap; i++) {
+if (pb->additional_encap[i] == encap_rec) {
+return true;
+}
+}
+if (sb_readonly) {
+return false;
+}
+append_additional_encap(pb, encap_rec);
+}
+return true;
+}
+
+static bool
+is_additional_chassis(const struct sbrec_port_binding *pb,
+  const struct sbrec_chassis *chassis_rec)
+{
+for (int i = 0; i < pb->n_additional_chassis; i++) {
+if (pb->additional_chassis[i] == chassis_rec) {
+return true;
+}
+}
+return false;
+}
+
+static void
+append_additional_chassis(const struct sbrec_port_binding *pb,
+  const 

[ovs-dev] [PATCH ovn v7 1/3] Update port-up on main chassis only

2022-05-18 Thread Ihar Hrachyshka
In a future patch, there will be a scenario where the same port has
attachments at multiple (specifically, 2) chassis, so make sure that
'up' property is updated by the main chassis only.

Signed-off-by: Ihar Hrachyshka 
---
 controller/binding.c|  9 ++---
 controller/binding.h|  2 ++
 controller/if-status.c  | 15 ++-
 controller/if-status.h  |  1 +
 controller/ovn-controller.c |  4 ++--
 5 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 0b6e83a74..6bdbe9912 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -698,6 +698,7 @@ local_binding_is_down(struct shash *local_bindings, const 
char *pb_name)
 
 void
 local_binding_set_up(struct shash *local_bindings, const char *pb_name,
+ const struct sbrec_chassis *chassis_rec,
  const char *ts_now_str, bool sb_readonly,
  bool ovs_readonly)
 {
@@ -717,8 +718,8 @@ local_binding_set_up(struct shash *local_bindings, const 
char *pb_name,
 ts_now_str);
 }
 
-if (!sb_readonly && lbinding && b_lport && b_lport->pb->n_up
-&& !b_lport->pb->up[0]) {
+if (!sb_readonly && lbinding && b_lport && b_lport->pb->n_up &&
+!b_lport->pb->up[0] && b_lport->pb->chassis == chassis_rec) {
 VLOG_INFO("Setting lport %s up in Southbound", pb_name);
 binding_lport_set_up(b_lport, sb_readonly);
 LIST_FOR_EACH (b_lport, list_node, >binding_lports) {
@@ -729,6 +730,7 @@ local_binding_set_up(struct shash *local_bindings, const 
char *pb_name,
 
 void
 local_binding_set_down(struct shash *local_bindings, const char *pb_name,
+   const struct sbrec_chassis *chassis_rec,
bool sb_readonly, bool ovs_readonly)
 {
 struct local_binding *lbinding =
@@ -743,7 +745,8 @@ local_binding_set_down(struct shash *local_bindings, const 
char *pb_name,
 OVN_INSTALLED_EXT_ID);
 }
 
-if (!sb_readonly && b_lport && b_lport->pb->n_up && b_lport->pb->up[0]) {
+if (!sb_readonly && b_lport && b_lport->pb->n_up && b_lport->pb->up[0] &&
+(!b_lport->pb->chassis || b_lport->pb->chassis == chassis_rec)) {
 VLOG_INFO("Setting lport %s down in Southbound", pb_name);
 binding_lport_set_down(b_lport, sb_readonly);
 LIST_FOR_EACH (b_lport, list_node, >binding_lports) {
diff --git a/controller/binding.h b/controller/binding.h
index e49e1ebb6..6815d1fd6 100644
--- a/controller/binding.h
+++ b/controller/binding.h
@@ -155,9 +155,11 @@ ofp_port_t local_binding_get_lport_ofport(const struct 
shash *local_bindings,
 bool local_binding_is_up(struct shash *local_bindings, const char *pb_name);
 bool local_binding_is_down(struct shash *local_bindings, const char *pb_name);
 void local_binding_set_up(struct shash *local_bindings, const char *pb_name,
+  const struct sbrec_chassis *chassis_rec,
   const char *ts_now_str, bool sb_readonly,
   bool ovs_readonly);
 void local_binding_set_down(struct shash *local_bindings, const char *pb_name,
+const struct sbrec_chassis *chassis_rec,
 bool sb_readonly, bool ovs_readonly);
 
 void binding_register_ovs_idl(struct ovsdb_idl *);
diff --git a/controller/if-status.c b/controller/if-status.c
index dbdf8b66f..ad61844d8 100644
--- a/controller/if-status.c
+++ b/controller/if-status.c
@@ -115,6 +115,7 @@ static void ovs_iface_set_state(struct if_status_mgr *, 
struct ovs_iface *,
 
 static void if_status_mgr_update_bindings(
 struct if_status_mgr *mgr, struct local_binding_data *binding_data,
+const struct sbrec_chassis *,
 bool sb_readonly, bool ovs_readonly);
 
 struct if_status_mgr *
@@ -306,6 +307,7 @@ if_status_mgr_update(struct if_status_mgr *mgr,
 void
 if_status_mgr_run(struct if_status_mgr *mgr,
   struct local_binding_data *binding_data,
+  const struct sbrec_chassis *chassis_rec,
   bool sb_readonly, bool ovs_readonly)
 {
 struct ofctrl_acked_seqnos *acked_seqnos =
@@ -328,8 +330,8 @@ if_status_mgr_run(struct if_status_mgr *mgr,
 ofctrl_acked_seqnos_destroy(acked_seqnos);
 
 /* Update binding states. */
-if_status_mgr_update_bindings(mgr, binding_data, sb_readonly,
-  ovs_readonly);
+if_status_mgr_update_bindings(mgr, binding_data, chassis_rec,
+  sb_readonly, ovs_readonly);
 }
 
 static void
@@ -390,6 +392,7 @@ ovs_iface_set_state(struct if_status_mgr *mgr, struct 
ovs_iface *iface,
 static void
 if_status_mgr_update_bindings(struct if_status_mgr *mgr,
   struct local_binding_data *binding_data,
+  const struct sbrec_chassis *chassis_rec,
 

[ovs-dev] [PATCH ovn v7 0/3] Support multiple requested-chassis

2022-05-18 Thread Ihar Hrachyshka
This version excludes the implementation for rarp activation strategy.
This last piece needs some more work, sending an updated series without
it to speed up reviews and merges.

This iteration of the patch series includes the following changes:

v7: dropped the patch that tags all traffic from tunnels as LOCAL_ONLY
(unneeded now that this series doesn't handle local traffic in
remote table).
v7: local_binding_set_up: don't set up when pb->chassis is NULL.
v7: added more test scenarios: 3 chassis, flipping chassis roles (main
to additional and vice versa), check behavior when one of chassis
doesn't claim a port.
v7: don't update_lport_tracking when port is not newly claimed.
v7: release localports from additional chassis too.
v7: properly handle binding when ->chassis not set but chassis name can
be found in the requested-chassis option.
v7: refactored consider_port_binding to simplify the logic, remove
redundant code paths.
v7: remove redundant flows that were left from prior versions of the
series.
v6: rebased, solved git conflicts.
v5: moved activation flows from table=8 to table=0.
v5: removed pause=true from rarp activation flow since we don't rely on
continuations.
v5: make rarp handle resubmit() admitted RARP packet to table=8. This
allows to avoid holding pending packets / waiting for flows deleted
etc.
v5: when cloning packets destined to a local binding to additional
chassis, clone them to tunnels in table=37, not table=38, to stay
consistent with tables' intent.
v5: dropped patch that added chassis-mirroring-enabled option. The
option doesn't resolve the ARP flipping issue. Instead, just
document the behavior of localnet attached switches when ports are
multi-chassis.
v5: (minor) set match's port and dp key inside
put_remote_port_redirect_overlay.
v4: redesign to reuse requested-chassis option
v4: support >2 chassis per port
v4: allow to disable tunneling enforcement when n_chassis >= 2
v3: re-sent as a single series
v2: added ddlog implementation
v2: re-inject RARP packet after vswitch updates flows
v1: split into pieces
v1: renamed options: migration-destination ->
 requested-additional-chassis,
 migration-unblocked ->
 additional-chassis-activated
v1: introduced options:activation-strategy=rarp to allow for other
strategies / having default no-op strategy
v1: implement in-memory port-activated tracking to avoid races
v1: numerous code cleanup / bug fixes
v1: special handling for localnet attached switches
v0: initial draft (single patch)

Ihar Hrachyshka (3):
  Update port-up on main chassis only
  Support LSP:options:requested-chassis as a list
  Clone packets to both port chassis

 NEWS|   1 +
 controller/binding.c| 307 ++--
 controller/binding.h|   5 +
 controller/if-status.c  |  15 +-
 controller/if-status.h  |   1 +
 controller/lport.c  |  46 +-
 controller/lport.h  |  11 +-
 controller/ovn-controller.c |   4 +-
 controller/physical.c   | 376 +--
 northd/northd.c |  62 ++-
 northd/ovn-northd.c |   4 +-
 ovn-nb.xml  |  29 +-
 ovn-sb.ovsschema|  17 +-
 ovn-sb.xml  |  72 ++-
 tests/ovn.at| 911 
 15 files changed, 1606 insertions(+), 255 deletions(-)

-- 
2.34.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v6 3/5] Support LSP:options:requested-chassis as a list

2022-05-18 Thread Ihar Hrachyshka
Thanks Han for your comments.

I've handled what you spotted + added more scenarios, for 3 chassis,
switching roles of chassis from main to additional and back, to have
some chassis not claiming the binding. I hope this resolves your
concerns.

Ihar

On Wed, May 11, 2022 at 2:22 AM Han Zhou  wrote:
>
>
>
> On Thu, May 5, 2022 at 6:38 AM Ihar Hrachyshka  wrote:
> >
> > When the option is set to a comma separated list of chassis names, OVN
> > will attempt to bind the port at any number of other locations in
> > addition to the main chassis.
> >
> > This is useful in live migration scenarios where it's important to
> > prepare the environment for workloads to move to, avoiding costly flow
> > configuration at the moment of the final port binding location change.
> >
> > This may also be useful in port mirroring scenarios.
> >
> > Corresponding database fields (pb->additional_chassis,
> > pb->requested_additional_chassis, pb->additional_encap) are introduced
> > as part of the patch. These are similar to fields designed to track the
> > main chassis (->chassis, ->requested_chassis, ->encap). But because we
> > support any number of additional chassis, these fields are lists.
> >
> > Signed-off-by: Ihar Hrachyshka 
>
> Thanks Ihar. Please check my comments below.
> (Please allow me some more time to review the rest of the patches in this 
> series.)
>
> > ---
> >  NEWS |   1 +
> >  controller/binding.c | 276 ---
> >  controller/lport.c   |  42 ---
> >  northd/northd.c  |  62 ++
> >  northd/ovn-northd.c  |   4 +-
> >  ovn-nb.xml   |  20 +++-
> >  ovn-sb.ovsschema |  17 ++-
> >  ovn-sb.xml   |  63 --
> >  tests/ovn.at |  89 ++
> >  9 files changed, 476 insertions(+), 98 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index dbe89e9cf..3fedcfeed 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -7,6 +7,7 @@ Post v22.03.0
> >- Support NAT for logical routers with multiple distributed gateway 
> > ports.
> >- Add global option (NB_Global.options:default_acl_drop) to enable
> >  implicit drop behavior on logical switches with ACLs applied.
> > +  - Support list of chassis for 
> > Logical_Switch_Port:options:requested-chassis.
> >
> >  OVN v22.03.0 - 11 Mar 2022
> >  --
> > diff --git a/controller/binding.c b/controller/binding.c
> > index e8c96a64a..20f0fd11b 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -922,6 +922,142 @@ claimed_lport_set_up(const struct sbrec_port_binding 
> > *pb,
> >  }
> >  }
> >
> > +typedef void (*set_func)(const struct sbrec_port_binding *pb,
> > + const struct sbrec_encap *);
> > +
> > +static bool
> > +update_port_encap_if_needed(const struct sbrec_port_binding *pb,
> > +const struct sbrec_chassis *chassis_rec,
> > +const struct ovsrec_interface *iface_rec,
> > +bool sb_readonly)
> > +{
> > +const struct sbrec_encap *encap_rec =
> > +sbrec_get_port_encap(chassis_rec, iface_rec);
> > +if ((encap_rec && pb->encap != encap_rec) ||
> > +(!encap_rec && pb->encap)) {
> > +if (sb_readonly) {
> > +return false;
> > +}
> > +sbrec_port_binding_set_encap(pb, encap_rec);
> > +}
> > +return true;
> > +}
> > +
> > +static void
> > +append_additional_encap(const struct sbrec_port_binding *pb,
> > +const struct sbrec_encap *encap)
> > +{
> > +struct sbrec_encap **additional_encap = xmalloc(
> > +(pb->n_additional_encap + 1) * (sizeof *additional_encap));
> > +memcpy(additional_encap, pb->additional_encap,
> > +   pb->n_additional_encap * (sizeof *additional_encap));
> > +additional_encap[pb->n_additional_encap] = (struct sbrec_encap *) 
> > encap;
> > +sbrec_port_binding_set_additional_encap(
> > +pb, additional_encap, pb->n_additional_encap + 1);
> > +free(additional_encap);
> > +}
> > +
> > +static void
> > +remove_additional_encap_for_chassis(const struct sbrec_port_binding *pb,
> > +const struct sbrec_chassis 
> > *chassis_rec)
> > +{
> > +struct sbrec_encap **additional_encap = xmalloc(
> > +pb->n_additional_encap * (sizeof *additional_encap));
> > +
> > +int idx = 0;
> > +for (int i = 0; i < pb->n_additional_encap; i++) {
> > +if (!strcmp(pb->additional_encap[i]->chassis_name,
> > +chassis_rec->name)) {
> > +continue;
> > +}
> > +additional_encap[idx++] = pb->additional_encap[i];
> > +}
> > +sbrec_port_binding_set_additional_encap(pb, additional_encap, idx);
> > +free(additional_encap);
> > +}
> > +
> > +static bool
> > +update_port_additional_encap_if_needed(
> > +const struct sbrec_port_binding *pb,
> > +const struct sbrec_chassis *chassis_rec,

Re: [ovs-dev] [PATCH ovn v6 1/5] Tag all packets that arrived from a tunnel as LOCAL_ONLY

2022-05-18 Thread Ihar Hrachyshka
This patch won't be needed after the next iteration of 'Clone
packets...' patch in the set drops. I'm dropping it in the next set.

On Wed, May 11, 2022 at 2:20 AM Han Zhou  wrote:
>
>
>
> On Thu, May 5, 2022 at 6:38 AM Ihar Hrachyshka  wrote:
> >
> > A next patch may need to clone packets to another chassis as part of
> > "port migration" procedure, but only if they haven't originated at
> > another chassis. This LOCAL_ONLY tag will enforce the requirement.
> >
> > Signed-off-by: Ihar Hrachyshka 
> > ---
> >  controller/physical.c | 1 +
> >  controller/pinctrl.c  | 2 ++
> >  2 files changed, 3 insertions(+)
> >
> > diff --git a/controller/physical.c b/controller/physical.c
> > index 73320..527df5df8 100644
> > --- a/controller/physical.c
> > +++ b/controller/physical.c
> > @@ -1737,6 +1737,7 @@ physical_run(struct physical_ctx *p_ctx,
> >  OVS_NOT_REACHED();
> >  }
> >
> > +put_load(1, MFF_LOG_FLAGS, MLF_LOCAL_ONLY_BIT, 1, );
> >  put_resubmit(OFTABLE_LOCAL_OUTPUT, );
> >
> >  ofctrl_add_flow(flow_table, OFTABLE_PHY_TO_LOG, 100, 0, ,
> > diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> > index ae3da332c..df9284e50 100644
> > --- a/controller/pinctrl.c
> > +++ b/controller/pinctrl.c
> > @@ -586,6 +586,8 @@ set_actions_and_enqueue_msg(struct rconn *swconn,
> >  enum ofp_version version = rconn_get_version(swconn);
> >
> >  reload_metadata(, md);
> > +/* Allow packet to leave the node. */
> > +put_load(0, MFF_LOG_FLAGS, MLF_LOCAL_ONLY_BIT, 1, );
> >  enum ofperr error = ofpacts_pull_openflow_actions(userdata, 
> > userdata->size,
> >version, NULL, NULL,
> >);
> > --
> > 2.34.1
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
> Acked-by: Han Zhou 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v6 2/5] Update port-up on main chassis only

2022-05-18 Thread Ihar Hrachyshka
Correct, I'm handling this in the next iteration.

On Wed, May 11, 2022 at 2:23 AM Han Zhou  wrote:
>
>
>
> On Thu, May 5, 2022 at 6:38 AM Ihar Hrachyshka  wrote:
> >
> > In a future patch, there will be a scenario where the same port has
> > attachments at multiple (specifically, 2) chassis, so make sure that
> > 'up' property is updated by the main chassis only.
> >
> > Signed-off-by: Ihar Hrachyshka 
> > ---
> >  controller/binding.c| 10 +++---
> >  controller/binding.h|  2 ++
> >  controller/if-status.c  | 15 ++-
> >  controller/if-status.h  |  1 +
> >  controller/ovn-controller.c |  4 ++--
> >  5 files changed, 22 insertions(+), 10 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index e284704d8..e8c96a64a 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -691,6 +691,7 @@ local_binding_is_down(struct shash *local_bindings, 
> > const char *pb_name)
> >
> >  void
> >  local_binding_set_up(struct shash *local_bindings, const char *pb_name,
> > + const struct sbrec_chassis *chassis_rec,
> >   const char *ts_now_str, bool sb_readonly,
> >   bool ovs_readonly)
> >  {
> > @@ -710,8 +711,9 @@ local_binding_set_up(struct shash *local_bindings, 
> > const char *pb_name,
> >  ts_now_str);
> >  }
> >
> > -if (!sb_readonly && lbinding && b_lport && b_lport->pb->n_up
> > -&& !b_lport->pb->up[0]) {
> > +if (!sb_readonly && lbinding && b_lport && b_lport->pb->n_up &&
> > +!b_lport->pb->up[0] &&
> > +(!b_lport->pb->chassis || b_lport->pb->chassis == 
> > chassis_rec)) {
>
> Why not just add "&& b_lport->pb->chassis == chassis_rec"? It doesn't seem to 
> be reasonable to set "up" when pb->chassis is NULL.
>
> Thanks,
> Han
>
> >  VLOG_INFO("Setting lport %s up in Southbound", pb_name);
> >  binding_lport_set_up(b_lport, sb_readonly);
> >  LIST_FOR_EACH (b_lport, list_node, >binding_lports) {
> > @@ -722,6 +724,7 @@ local_binding_set_up(struct shash *local_bindings, 
> > const char *pb_name,
> >
> >  void
> >  local_binding_set_down(struct shash *local_bindings, const char *pb_name,
> > +   const struct sbrec_chassis *chassis_rec,
> > bool sb_readonly, bool ovs_readonly)
> >  {
> >  struct local_binding *lbinding =
> > @@ -736,7 +739,8 @@ local_binding_set_down(struct shash *local_bindings, 
> > const char *pb_name,
> >  OVN_INSTALLED_EXT_ID);
> >  }
> >
> > -if (!sb_readonly && b_lport && b_lport->pb->n_up && 
> > b_lport->pb->up[0]) {
> > +if (!sb_readonly && b_lport && b_lport->pb->n_up && b_lport->pb->up[0] 
> > &&
> > +(!b_lport->pb->chassis || b_lport->pb->chassis == 
> > chassis_rec)) {
> >  VLOG_INFO("Setting lport %s down in Southbound", pb_name);
> >  binding_lport_set_down(b_lport, sb_readonly);
> >  LIST_FOR_EACH (b_lport, list_node, >binding_lports) {
> > diff --git a/controller/binding.h b/controller/binding.h
> > index 430a8d9b1..46f88aff7 100644
> > --- a/controller/binding.h
> > +++ b/controller/binding.h
> > @@ -154,9 +154,11 @@ ofp_port_t local_binding_get_lport_ofport(const struct 
> > shash *local_bindings,
> >  bool local_binding_is_up(struct shash *local_bindings, const char 
> > *pb_name);
> >  bool local_binding_is_down(struct shash *local_bindings, const char 
> > *pb_name);
> >  void local_binding_set_up(struct shash *local_bindings, const char 
> > *pb_name,
> > +  const struct sbrec_chassis *chassis_rec,
> >const char *ts_now_str, bool sb_readonly,
> >bool ovs_readonly);
> >  void local_binding_set_down(struct shash *local_bindings, const char 
> > *pb_name,
> > +const struct sbrec_chassis *chassis_rec,
> >  bool sb_readonly, bool ovs_readonly);
> >
> >  void binding_register_ovs_idl(struct ovsdb_idl *);
> > diff --git a/controller/if-status.c b/controller/if-status.c
> > index dbdf8b66f..ad61844d8 100644
> > --- a/controller/if-status.c
> > +++ b/controller/if-status.c
> > @@ -115,6 +115,7 @@ static void ovs_iface_set_state(struct if_status_mgr *, 
> > struct ovs_iface *,
> >
> >  static void if_status_mgr_update_bindings(
> >  struct if_status_mgr *mgr, struct local_binding_data *binding_data,
> > +const struct sbrec_chassis *,
> >  bool sb_readonly, bool ovs_readonly);
> >
> >  struct if_status_mgr *
> > @@ -306,6 +307,7 @@ if_status_mgr_update(struct if_status_mgr *mgr,
> >  void
> >  if_status_mgr_run(struct if_status_mgr *mgr,
> >struct local_binding_data *binding_data,
> > +  const struct sbrec_chassis *chassis_rec,
> >bool sb_readonly, bool ovs_readonly)
> >  

Re: [ovs-dev] [PATCH ovn v6 4/5] Clone packets to both port chassis

2022-05-18 Thread Ihar Hrachyshka
Thanks for spotting unneeded flows. These are artifacts from swapping
the order of local/remote handling. (I've also noticed that now we
don't need to tag packets coming from tunnels as LOCAL_ONLY to make it
work, so I'll also drop the patch from the set.)

As for convoluted logic in consider_port_binding, I agree and I'm
sending a new version of the patch that has this mostly tackled.
There's still more work that can be done in the function to make it
less convoluted / long, but the rest of improvement don't belong to
this patch.

Thanks again,
Ihar

On Thu, May 12, 2022 at 9:10 PM Han Zhou  wrote:
>
>
>
> On Thu, May 5, 2022 at 6:38 AM Ihar Hrachyshka  wrote:
> >
> > When multiple chassis are set in requested-chassis, port binding is
> > configured in multiple cluster locations. In case of live migration
> > scenario, only one of the locations run a workload at a particular
> > point in time. Yet, it's expected that the workload may switch to
> > running at an additional chassis at any moment during live migration
> > (depends on libvirt / qemu migration progress). To speed up the switch
> > to near instant, do the following:
> >
> > When a port located sends a packet to another port that has multiple
> > chassis then, in addition to sending the packet to the main chassis,
> > also send it to additional chassis. When the sending port is bound on
> > either the main or additional chassis, then handle the packet locally
> > plus send it to all other chassis.
> >
> > This is achieved with additional flows in tables 37 and 38.
> >
> > Signed-off-by: Ihar Hrachyshka 
> > ---
> >  controller/binding.c  |   2 +-
> >  controller/binding.h  |   3 +
> >  controller/physical.c | 247 +--
> >  ovn-nb.xml|   9 +
> >  ovn-sb.xml|   9 +
> >  tests/ovn.at  | 693 ++
> >  6 files changed, 929 insertions(+), 34 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index 20f0fd11b..b185e1603 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -999,7 +999,7 @@ update_port_additional_encap_if_needed(
> >  return true;
> >  }
> >
> > -static bool
> > +bool
> >  is_additional_chassis(const struct sbrec_port_binding *pb,
> >const struct sbrec_chassis *chassis_rec)
> >  {
> > diff --git a/controller/binding.h b/controller/binding.h
> > index 46f88aff7..80f1f66a1 100644
> > --- a/controller/binding.h
> > +++ b/controller/binding.h
> > @@ -194,4 +194,7 @@ enum en_lport_type {
> >
> >  enum en_lport_type get_lport_type(const struct sbrec_port_binding *);
> >
> > +bool is_additional_chassis(const struct sbrec_port_binding *pb,
> > +   const struct sbrec_chassis *chassis_rec);
> > +
> >  #endif /* controller/binding.h */
> > diff --git a/controller/physical.c b/controller/physical.c
> > index 527df5df8..6399d1fce 100644
> > --- a/controller/physical.c
> > +++ b/controller/physical.c
> > @@ -60,6 +60,11 @@ struct zone_ids {
> >  int snat;   /* MFF_LOG_SNAT_ZONE. */
> >  };
> >
> > +struct additional_tunnel {
> > +struct ovs_list list_node;
> > +const struct chassis_tunnel *tun;
> > +};
> > +
> >  static void
> >  load_logical_ingress_metadata(const struct sbrec_port_binding *binding,
> >const struct zone_ids *zone_ids,
> > @@ -287,28 +292,63 @@ match_outport_dp_and_port_keys(struct match *match,
> >  }
> >
> >  static void
> > -put_remote_port_redirect_overlay(const struct
> > - sbrec_port_binding *binding,
> > +put_remote_port_redirect_overlay(const struct sbrec_port_binding *binding,
> >   bool is_ha_remote,
> >   struct ha_chassis_ordered *ha_ch_ordered,
> >   enum mf_field_id mff_ovn_geneve,
> >   const struct chassis_tunnel *tun,
> > + const struct ovs_list *additional_tuns,
> > + uint32_t dp_key,
> >   uint32_t port_key,
> >   struct match *match,
> >   struct ofpbuf *ofpacts_p,
> >   const struct hmap *chassis_tunnels,
> >   struct ovn_desired_flow_table *flow_table)
> >  {
> > +match_outport_dp_and_port_keys(match, dp_key, port_key);
> >  if (!is_ha_remote) {
> >  /* Setup encapsulation */
> > -if (!tun) {
> > -return;
> > +bool is_vtep = !strcmp(binding->type, "vtep");
> > +if (!additional_tuns) {
> > +/* Output to main chassis tunnel. */
> > +put_encapsulation(mff_ovn_geneve, tun, binding->datapath, 
> > port_key,
> > +  is_vtep, ofpacts_p);
> > +ofpact_put_OUTPUT(ofpacts_p)->port = 

Re: [ovs-dev] [PATCH ovn] Allow for setting the Next server IP in the DHCP header

2022-05-18 Thread Numan Siddique
On Wed, May 11, 2022 at 11:34 AM  wrote:
>
> From: Lucas Alvares Gomes 
>
> In order to PXE boot a baremetal server using the OVN DHCP server we
> need to allow users to set the "next-server" (siaddr) [0] field in the
> DHCP header.
>
> While investigating this issue by comparing the DHCPOFFER and DHCPACK
> packets sent my dnsmasq and OVN we saw that the "next-server" field
> was the problem for OVN, without it PXE booting was timing out while
> fetching the iPXE image from the TFTP server (see the bugzilla ticket
> below for reference).
>
> To confirm this problem we created a bogus patch hardcoding the TFTP
> address in the siaddr of the DHCP header (see the discussion in the
> maillist below) and with this in place we were able to deploy a
> baremetal node using the OVN DHCP end-to-end.
>
> This patch is a proper implementation that creates a new DHCP
> configuration option called "next_server" to allow users to set this
> field dynamically. This patch uses the DHCP code 253 which is a unsed
> code for DHCP specification as this is not a normal DHCP option but a
> special use case in OVN.
>
> [0]
> https://github.com/openvswitch/ovs/blob/9dd3031d2e0e9597449e95428320ccaaff7d8b3d/lib/dhcp.h#L42
>
> Reported-by: https://bugzilla.redhat.com/show_bug.cgi?id=2083629
> Reported-by:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2022-May/051821.html
> Signed-off-by: Lucas Alvares Gomes 
> ---
>  controller/pinctrl.c | 69 ++--
>  lib/actions.c| 13 -
>  lib/ovn-l7.h |  8 +
>  northd/ovn-northd.c  |  1 +
>  ovn-nb.xml   |  7 +
>  tests/ovn.at |  6 ++--
>  tests/test-ovn.c |  1 +
>  7 files changed, 80 insertions(+), 25 deletions(-)
>
> diff --git a/controller/pinctrl.c b/controller/pinctrl.c
> index ae3da332c..428863293 100644
> --- a/controller/pinctrl.c
> +++ b/controller/pinctrl.c
> @@ -2203,30 +2203,56 @@ pinctrl_handle_put_dhcp_opts(
>   *| 4 Bytes padding | 1 Byte (option end 0xFF ) | 4 Bytes padding|
>   * --
>   */
> -struct dhcp_opt_header *in_dhcp_opt =
> -(struct dhcp_opt_header *)reply_dhcp_opts_ptr->data;
> -if (in_dhcp_opt->code == DHCP_OPT_BOOTFILE_CODE) {
> -unsigned char *ptr = (unsigned char *)in_dhcp_opt;
> -int len = sizeof *in_dhcp_opt + in_dhcp_opt->len;
> -struct dhcp_opt_header *next_dhcp_opt =
> -(struct dhcp_opt_header *)(ptr + len);
> -
> -if (next_dhcp_opt->code == DHCP_OPT_BOOTFILE_ALT_CODE) {
> -if (!ipxe_req) {
> -ofpbuf_pull(reply_dhcp_opts_ptr, len);
> -next_dhcp_opt->code = DHCP_OPT_BOOTFILE_CODE;
> -} else {
> -char *buf = xmalloc(len);
> +ovs_be32 next_server = in_dhcp_data->siaddr;
> +bool bootfile_name_set = false;
> +in_dhcp_ptr = reply_dhcp_opts_ptr->data;
> +end = (const char *)reply_dhcp_opts_ptr->data + 
> reply_dhcp_opts_ptr->size;
> +

Hi Lucas,

Thanks for adding this support.

It seems to me this while loop can be avoided since lib/actions.c
always encodes offer_ip, DHCP_OPT_BOOTFILE_CODE,
DHCP_OPT_BOOTFILE_ALT_CODE
and now DHCP_OPT_NEXT_SERVER_CODE  in the userdata buffer before
encoding other dhcp options.

Since it is deterministic,  can't we do something like below instead
of the while loop ?

struct dhcp_opt_header *in_dhcp_opt =
(struct dhcp_opt_header *)reply_dhcp_opts_ptr->data;
if (in_dhcp_opt->code == DHCP_OPT_BOOTFILE_CODE) {
   
   advance in_dhcp_opt to the next option
} else if (in_dhcp_opt->code == DHCP_OPT_BOOTFILE_ALT_CODE) {
  ..
 advance in_dhcp_opt to the next option
}

if (in_dhcp_opt->code == DHCP_OPT_NEXT_SERVER_CODE) {
next_server = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
}

Let me know if this gets complicated because of my above suggestion.
In that case,I'm fine to run the below while loop.


Also it would be great if you can enhance the test "dhcpv4 : 1 HV, 2
LS, 2 LSPs/LS" in ovn.at with the newly added option.

>From what I understand, the dhcp response  will also include this new
option - 253 along with setting the dhcp_header->siaddr if
CMS has defined next_server in the dhcp_options right ?  If so, the
dhcp clients would ignore this option gracefully right ?

Thanks
Numan


> +while (in_dhcp_ptr < end) {
> +struct dhcp_opt_header *in_dhcp_opt =
> +(struct dhcp_opt_header *)in_dhcp_ptr;
> +
> +switch (in_dhcp_opt->code) {
> +case DHCP_OPT_NEXT_SERVER_CODE:
> +next_server = get_unaligned_be32(DHCP_OPT_PAYLOAD(in_dhcp_opt));
> +break;
> +case DHCP_OPT_BOOTFILE_CODE: ;
> +unsigned char *ptr = (unsigned char *)in_dhcp_opt;
> +int len = sizeof *in_dhcp_opt + in_dhcp_opt->len;
> +struct dhcp_opt_header *next_dhcp_opt =
> +(struct dhcp_opt_header *)(ptr + len);
> +
> +if 

Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Consider zone 0 as a valid zone when restoring.

2022-05-18 Thread Numan Siddique
On Wed, May 18, 2022 at 4:41 PM Mark Michelson  wrote:
>
> Looks good to me Dumitru.
>
> Acked-by: Mark Michelson 

Thanks Dumiru and Mark.

I applied this patch to the main branch and backported to branch-22.03
and branch-21.12.

Numan

>
> On 5/18/22 13:47, Dumitru Ceara wrote:
> > 0 is a valid zone ID and some CMSs might actually use it.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2087194
> > Signed-off-by: Dumitru Ceara 
> > ---
> > v2: Use str_to_uint() as suggested by Ilya.
> > ---
> >   controller/ovn-controller.c | 15 ++-
> >   1 file changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 0efe5c5ce..dfe30d1d1 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -825,13 +825,18 @@ restore_ct_zones(const struct ovsrec_bridge_table 
> > *bridge_table,
> >   }
> >
> >   const char *user = node->key + 8;
> > -int zone = atoi(node->value);
> > +if (!user[0]) {
> > +continue;
> > +}
> >
> > -if (user[0] && zone) {
> > -VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
> > -bitmap_set1(ct_zone_bitmap, zone);
> > -simap_put(ct_zones, user, zone);
> > +unsigned int zone;
> > +if (!str_to_uint(node->value, 10, )) {
> > +continue;
> >   }
> > +
> > +VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
> > +bitmap_set1(ct_zone_bitmap, zone);
> > +simap_put(ct_zones, user, zone);
> >   }
> >   }
> >
> >
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] ovn-controller: Consider zone 0 as a valid zone when restoring.

2022-05-18 Thread Mark Michelson

Looks good to me Dumitru.

Acked-by: Mark Michelson 

On 5/18/22 13:47, Dumitru Ceara wrote:

0 is a valid zone ID and some CMSs might actually use it.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2087194
Signed-off-by: Dumitru Ceara 
---
v2: Use str_to_uint() as suggested by Ilya.
---
  controller/ovn-controller.c | 15 ++-
  1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 0efe5c5ce..dfe30d1d1 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -825,13 +825,18 @@ restore_ct_zones(const struct ovsrec_bridge_table 
*bridge_table,
  }
  
  const char *user = node->key + 8;

-int zone = atoi(node->value);
+if (!user[0]) {
+continue;
+}
  
-if (user[0] && zone) {

-VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
-bitmap_set1(ct_zone_bitmap, zone);
-simap_put(ct_zones, user, zone);
+unsigned int zone;
+if (!str_to_uint(node->value, 10, )) {
+continue;
  }
+
+VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
+bitmap_set1(ct_zone_bitmap, zone);
+simap_put(ct_zones, user, zone);
  }
  }
  



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3] ovn-controller: Set Port_Binding.up field only if the Southbound DB is aware of this field

2022-05-18 Thread Mark Michelson
Thank you for the update, Mary. I have pushed the change to main, 
branch-22.03, and branch-21.12.


On 5/18/22 14:21, mary.mano...@nutanix.com wrote:

From: Mary Manohar 

This patch is fixing a backward compatibility issue when using an older
ovn-northd (20.09).
The 20.09 Southbound schema is not aware of the 'up'
field in the Port_Binding table. ovn-controller should set
Port_Binding.up field only if the Southbound DB
is aware of this field (or transaction would fail).

Signed-off-by: Mary Manohar 
---
  controller/binding.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/controller/binding.c b/controller/binding.c
index e284704..e7dc537 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -908,7 +908,9 @@ claimed_lport_set_up(const struct sbrec_port_binding *pb,
  if (!notify_up) {
  bool up = true;
  if (!parent_pb || (parent_pb->n_up && parent_pb->up[0])) {
-sbrec_port_binding_set_up(pb, , 1);
+if (pb->n_up) {
+sbrec_port_binding_set_up(pb, , 1);
+}
  }
  return;
  }



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 0/3] Adding generic port security flows.

2022-05-18 Thread Mark Michelson

On 5/18/22 14:17, Numan Siddique wrote:



On Mon, May 16, 2022 at 5:42 PM Numan Siddique > wrote:




On Mon, May 16, 2022 at 4:27 PM Mark Michelson mailto:mmich...@redhat.com>> wrote:

Hi Numan,

I've taken a close look at the patches in this series, and they
seem
really good. They're very well commented and well tested as
well. It's
quite easy to follow the change, and I couldn't find any flaws
in my review.

However, I do want to double-check that this does not put
unnecessary
load on ovn-controller. I suspect it won't be much of a problem
since

1) The port security flows are calculated incrementally.
2) The reduced SB DB size likely lessens the overall load on
ovn-controller .
3) The removed port security logical flows means there is less
parsing
of logical flows per iteration of ovn-controller.

However, this does add new OF flow creation in ovn-controller,
so it's
worth checking to make sure ovn-controller does not see any
noticeable
performance decrease.

If we can get confirmation, then I'll ack the series.


Thanks Mark for the reviews.

Sure.  I'll do some tests and share the results.


I did some testing and these are the findings.

1.  I started 2 separate ovn-central components, one on central-1 with 
OVN main commit (as of today's)
     and the other on central-2 node with OVN main + these port security 
patches.

      Same OVN databases on both the nodes.

     central-1 SB DB has 208962 logical flows
     central-2 SB DB (with port sec patches) has 84294 logical flows.

2.  Started 2 separate compute nodes - compute-1 (with ovn main 
ovn-controller) and compute-2 (with ovn main + port sec ovn-controller)
      Created around 1000 ovs ports on compute-1 and copied the conf.db 
to compute-2 - so that both the ovn-controllers claim the same logical 
ports.


3.  After both the ovn-controllers settle down.  I triggered recompute 
multiple times.   This recompute will generate all the openflows (but 
ofcourse will not program ovs-vswitchd)
    compute-1 ovn-controller takes around 3100 ms to complete the loop.  
I see unreasonable long poll interval message and compute-2 
ovn-controller takes around 1500ms to

    complete the loop.


I think these patches also help ovn-controller as it has to do less 
logical flow processing.


Below is the stopwatch/show for compute-1 ovn-controller

[root@ovn-chassis-1 data]# ovn-appctl -t ovn-controller stopwatch/show 
flow-generation

Statistics for 'flow-generation'
   Total samples: 679
   Maximum: 3435 msec
   Minimum: 0 msec
   95th percentile: 32.512116 msec
   Short term average: 15.137543 msec
   Long term average: 109.403161 msec

And below is the same for compute-2 ovn-controller

[root@ovn-chassis-2 /]#  ovn-appctl -t ovn-controller stopwatch/show 
flow-generation

Statistics for 'flow-generation'
   Total samples: 700
   Maximum: 1341 msec
   Minimum: 0 msec
   95th percentile: 2.987580 msec
   Short term average: 0.00 msec
   Long term average: 24.980349 msec


Thanks
Numan


Excellent work Numan. I suspected that ovn-controller might run the same 
or slightly slower with this change, but the result is that it's 
actually quicker!


Acked-by: Mark Michelson 





Numan


On 5/12/22 20:42, num...@ovn.org  wrote:
 > From: Numan Siddique mailto:num...@ovn.org>>
 >
 > This patch series adds generic logical flows for port security in
 > the logical switch pipeline and pushes the actual port security
 > implementation logic to ovn-controller from ovn-northd.
 >
 > ovn-northd will now add logical flows like:
 >
 > table=0 (ls_in_check_port_sec), priority=50   , match=(1),
action=(reg0[14] = check_in_port_sec(); next;)
 > table=1 (ls_in_apply_port_sec), priority=50   ,
match=(reg0[14] == 1), action=(drop;)
 > table=1 (ls_in_apply_port_sec), priority=0    , match=(1),
action=(next;)
 >
 > OVN action check_in_port_sec() resubmits the packet to
openflow table
 > 73.  ovn-controller will add port security flows in table
73,74 and 75
 > for all the logical ports it has claimed.  The port security
information
 > is passed down the Port_Binding table in Southbound database.
 >
 > The main motivation for the patch is to address scale concerns.
 > This patch series reduces the number of logical flows and
ovn-northd
 > CPU utilization time.
 >
 > Did some scale testing and below are the results:
 >
 > Used a Northbound database from a deployment of 120 node cluster.
 > Number of logical switch ports with port security configured:
13711
 >

Re: [ovs-dev] [PATCH ovn] northd: Honor ct-snat-zone option for common case.

2022-05-18 Thread Mark Michelson

On 5/17/22 16:48, Numan Siddique wrote:



On Fri, May 6, 2022 at 9:47 AM Mark Michelson > wrote:


Commit 4deac4509 introduced the new "czone" variants for ct_dnat and
ct_snat. This makes it so that a single conntrack zone is used for both
DNAT and SNAT on the datapath. The common CT zone chosen in this commit
was the DNAT zone.

The ct-snat-zone option for logical routers makes it so that an SNAT
zone can be explicitly configured instead of having a zone selected at
random by OVN. The problem is that since the DNAT zone is used as the
common zone ID, it means that when SNAT is performed, the SNAT is not
performed in the configured ct-snat-zone.

The fix for this is to change to using the SNAT zone as the common zone
if ct-snat-zone is configured on the logical router. This way, SNAT will
use the configured zone as expected. DNAT will also use this zone when
possible.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2061619


Signed-off-by: Mark Michelson mailto:mmich...@redhat.com>>


Thanks for fixing this.

Acked-by: Numan Siddique mailto:num...@ovn.org>>

Numan


Thanks for the review Numan. I pushed the change to main, branch-22.03 
and branch-21.12.




---
  controller/lflow.c           | 18 +
  include/ovn/actions.h        |  2 +
  include/ovn/logical-fields.h |  2 -
  lib/actions.c                |  4 +-
  tests/ovn.at                  | 72

  tests/test-ovn.c             |  1 +
  6 files changed, 95 insertions(+), 4 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index b7ddeb1c0..3c953e18b 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -1104,6 +1104,23 @@ lflow_parse_ctrl_meter(const struct
sbrec_logical_flow *lflow,
      }
  }

+static int
+get_common_nat_zone(const struct local_datapath *ldp)
+{
+    /* Normally, the common NAT zone defaults to the DNAT zone.
However,
+     * if the "snat-ct-zone" is set on the datapath, the user is
+     * expecting an explicit CT zone to be used for SNAT. If we default
+     * to the DNAT zone, then it means SNAT will not use the configured
+     * value. The way we get around this is to use the SNAT zone as the
+     * common zone if "snat-ct-zone" is set.
+     */
+    if (smap_get(>datapath->external_ids, "snat-ct-zone")) {
+        return MFF_LOG_SNAT_ZONE;
+    } else {
+        return MFF_LOG_DNAT_ZONE;
+    }
+}
+
  static void
  add_matches_to_flow_table(const struct sbrec_logical_flow *lflow,
                            const struct local_datapath *ldp,
@@ -1153,6 +1170,7 @@ add_matches_to_flow_table(const struct
sbrec_logical_flow *lflow,
          .fdb_ptable = OFTABLE_GET_FDB,
          .fdb_lookup_ptable = OFTABLE_LOOKUP_FDB,
          .ctrl_meter_id = ctrl_meter_id,
+        .common_nat_ct_zone = get_common_nat_zone(ldp),
      };
      ovnacts_encode(ovnacts->data, ovnacts->size, , );

diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index f55d77d47..547797584 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -808,6 +808,8 @@ struct ovnact_encode_params {
                                  * 'lookup_fdb' to resubmit. */
      uint32_t ctrl_meter_id;     /* Meter to be used if the
resulting flow
                                     sends packets to controller. */
+    uint32_t common_nat_ct_zone; /* When performing NAT in a common
CT zone,
+                                    this determines which CT zone
to use */
  };

  void ovnacts_encode(const struct ovnact[], size_t ovnacts_len,
diff --git a/include/ovn/logical-fields.h b/include/ovn/logical-fields.h
index 25b5a62a3..18516634e 100644
--- a/include/ovn/logical-fields.h
+++ b/include/ovn/logical-fields.h
@@ -36,8 +36,6 @@ enum ovn_controller_event {
                                         * (32 bits). */
  #define MFF_LOG_SNAT_ZONE  MFF_REG12  /* conntrack snat zone for
gateway router
                                         * (32 bits). */
-#define MFF_LOG_NAT_ZONE   MFF_LOG_DNAT_ZONE /* conntrack zone for
both snat
-                                              * and dnat. */
  #define MFF_LOG_CT_ZONE    MFF_REG13  /* Logical conntrack zone
for lports
                                         * (32 bits). */
  #define MFF_LOG_INPORT     MFF_REG14  /* Logical input port (32
bits). */
diff --git a/lib/actions.c b/lib/actions.c
index a3cf747db..a9c27600c 100644
--- a/lib/actions.c
+++ b/lib/actions.c
@@ -1131,7 +1131,7 @@ encode_CT_DNAT_IN_CZONE(const struct
ovnact_ct_nat *cn,
              

Re: [ovs-dev] [PATCH v2] sha1: Use implementation from openssl if available.

2022-05-18 Thread Dumitru Ceara
On 5/18/22 20:00, Ilya Maximets wrote:
> On 5/16/22 09:40, Dumitru Ceara wrote:
>> On 5/7/22 00:55, Ilya Maximets wrote:
>>> Implementation of SHA1 in OpenSSL library is much faster and optimized
>>> for all available CPU architectures and instruction sets.  OVS should
>>> use it instead of internal implementation if possible.
>>>
>>> Depending on compiler options OpenSSL's version finishes our sha1
>>> unit tests from 3 to 12 times faster.  Performance of OpenSSL's
>>> version is constant, but OVS's implementation highly depends on
>>> compiler.  Interestingly, default build with  '-g -O2' works faster
>>> than optimized '-march=native -Ofast'.
>>>
>>> Tests with ovsdb-server on big databases shows ~5-10% improvement of
>>> the time needed for database compaction (sha1 is only a part of this
>>> operation), depending on compiler options.
>>>
>>> We still need internal implementation, because OpenSSL can be not
>>> available on some platforms.  Tests enhanced to check both versions
>>> of API.
>>>
>>> Signed-off-by: Ilya Maximets 
>>> ---
>>
>> Hi Ilya,
>>
>> I'm no openssl expert so this is a rather shallow review.  In any case,
>> the change looks good to me overall.  I left a question below and a
>> minor nit comment.
>>
>> Regardless:
>>
>> Reviewed-by: Dumitru Ceara 
>>
>> Thanks,
>> Dumitru
>>
>>>
>>> Version 2:
>>>   - Deprecated low level SHA1_* API replaced with EVP_Digest* API.
>>>   - Added more accurate logging of OpenSSL errors.
>>>   - Stack-based allocation of the digest context is no longer possible
>>> in modern versions of OpenSSL, so converted to supported dynamic
>>> allocation with EVP_MD_CTX_create().  Technically, this function
>>> was also renamed to EVP_MD_CTX_new() in OpenSSL 3.0, but the old
>>> name still works.
>>>
>>>  lib/sha1.c| 166 +++---
>>>  lib/sha1.h|  27 ++--
>>>  ovsdb/log.c   |   1 +
>>>  tests/library.at  |   2 +-
>>>  tests/test-sha1.c |  60 +++--
>>>  5 files changed, 193 insertions(+), 63 deletions(-)
>>>
>>> diff --git a/lib/sha1.c b/lib/sha1.c
>>> index 87360d9cd..3ecc8fb9f 100644
>>> --- a/lib/sha1.c
>>> +++ b/lib/sha1.c
>>> @@ -31,12 +31,131 @@
>>>  
>>>  #include 
>>>  #include "sha1.h"
>>> +
>>> +#ifdef HAVE_OPENSSL
>>> +#include 
>>> +#include 
>>> +#endif
>>> +
>>>  #include 
>>>  #include 
>>>  #include "compiler.h"
>>> +#include "openvswitch/vlog.h"
>>>  #include "util.h"
>>>  
>>> -/* a bit faster & bigger, if defined */
>>> +VLOG_DEFINE_THIS_MODULE(sha1);
>>> +
>>> +#ifdef HAVE_OPENSSL
>>> +static void
>>> +log_openssl_err(const char *func)
>>> +{
>>> +char buf[1024];
>>> +
>>> +ERR_error_string_n(ERR_get_error(), buf, 1024);
>>> +VLOG_FATAL("%s failed: %s", func, buf);
>>> +}
>>> +#endif
>>> +
>>> +/*
>>> + * Initialize the SHA digest.
>>> + * context: The SHA context to initialize
>>> + */
>>> +void
>>> +sha1_init(struct sha1_ctx *sha_info)
>>> +{
>>> +#ifdef HAVE_OPENSSL
>>> +sha_info->ctx = EVP_MD_CTX_create();
>>> +if (!EVP_DigestInit_ex(sha_info->ctx, EVP_sha1(), NULL)) {
>>> +log_openssl_err("EVP_DigestInit_ex");
>>> +}
>>> +#else
>>> +ovs_sha1_init(sha_info);
>>> +#endif
>>> +}
>>> +
>>> +/*
>>> + * Update the SHA digest.
>>> + * context: The SHA1 context to update.
>>> + * input: The buffer to add to the SHA digest.
>>> + * inputLen: The length of the input buffer.
>>> + */
>>> +void
>>> +sha1_update(struct sha1_ctx *ctx, const void *buffer_, uint32_t count)
>>> +{
>>> +#ifdef HAVE_OPENSSL
>>> +if (!EVP_DigestUpdate(ctx->ctx, buffer_, count)) {
>>> +log_openssl_err("EVP_DigestUpdate");
>>> +}
>>> +#else
>>> +ovs_sha1_update(ctx, buffer_, count);
>>> +#endif
>>> +}
>>> +
>>> +/*
>>> + * Finish computing the SHA digest.
>>> + * digest: the output buffer in which to store the digest.
>>> + * context: The context to finalize.
>>> + */
>>> +void
>>> +sha1_final(struct sha1_ctx *ctx, uint8_t digest[SHA1_DIGEST_SIZE])
>>> +{
>>> +#ifdef HAVE_OPENSSL
>>> +unsigned int len;
>>> +
>>> +if (!EVP_DigestFinal_ex(ctx->ctx, digest, )) {
>>
>> According to the openssl man page, "at most EVP_MAX_MD_SIZE bytes will
>> be written".  Will there be ever be a case when less than
>> SHA1_DIGEST_SIZE bytes are written causing the assert below to fail?
> 
> I don't think so.  The openssl documentation is notoriously bad,
> but IIUC, digest sizes are constant for a specific algorithm, as
> they should be.  SHA1 hash is always 160 bits long.
> If openssl doesn't fail but writes less than the hash size, it's
> a bug in openssl and we can't actually proceed here.  As if we
> can't calculate the hash, we can't perform any operations that
> requires hashing.
> 
> So, this should never happen, but if that happens, that's a fatal
> problem.
> 

Ack.

>>
>> Also, will there ever be a case when more than SHA1_DIGEST_SIZE are
>> written?  Should we ensure that all callers of sha1_final pass a digest
>> that's at least 

Re: [ovs-dev] [PATCH ovn v2] Do not set 'up' attribute in the port-binding table if ovn-northd is at an older version.

2022-05-18 Thread Mary Manohar
Hi Numan

Sure, I have sent out a v3 with the updated commit message.

Thanks
Mary

From: Numan Siddique 
Date: Wednesday, May 18, 2022 at 7:37 AM
To: Mark Michelson 
Cc: Mary Manohar , Xavier Simonart 
, ovs-dev@openvswitch.org 
Subject: Re: [ovs-dev] [PATCH ovn v2] Do not set 'up' attribute in the 
port-binding table if ovn-northd is at an older version.


On Tue, May 17, 2022 at 9:55 AM Mark Michelson 
mailto:mmich...@redhat.com>> wrote:
OK thanks for the explanations, Xavier and Mary. This is pretty
important since we expect that ovn-controller should be upgraded before
ovn-northd.

I'm acking the patch because the content is correct. I suggest that when
this is merged, the person that merges should expand on the commit
message a bit to explain why the commit is necessary. If nothing else,
make it clear that 20.09 is the cutoff between an "older" and "newer"
version of ovn-northd in this case.

Acked-by: Mark Michelson

Thanks Mark for the review.

@Mary - would you mind updating just the commit message here ?

Thanks
Numan

On 5/12/22 11:59, Mary Manohar wrote:
> Thanks Xavier for clarifying.
>
> Hi Mark,
>
> Xavier’s explanation is bang on. The ‘up’ field is introduced in the
> Port Binding table post 20.09.
>
> So, when the southbound is at 20.09 version, the set operation fails in
> ovn-controller.
>
> Thanks
>
> Mary
>
> *From: *Xavier Simonart mailto:xsimo...@redhat.com>>
> *Date: *Wednesday, May 11, 2022 at 8:10 AM
> *To: *Mark Michelson mailto:mmich...@redhat.com>>
> *Cc: *Mary Manohar 
> mailto:mary.mano...@nutanix.com>>, 
> ovs-dev@openvswitch.org
> mailto:ovs-dev@openvswitch.org>>
> *Subject: *Re: [ovs-dev] [PATCH ovn v2] Do not set 'up' attribute in the
> port-binding table if ovn-northd is at an older version.
>
> Hi Mark
>
> This is fixing a backward compatibility issue when using an older
> ovn-northd.
>
> ovn-controller should only set Port_Binding.up field if the Southbound
> DB is aware of this field (or transaction would fail).
>
> We already do this when setting pb up through notification (if-status
> module).
>
> More explanations on commit 8b45fc9b2 from Dumitru.
>
> Thanks
>
> Xavier
>
> On Tue, May 10, 2022 at 8:12 PM Mark Michelson 
> mailto:mmich...@redhat.com>
> >> wrote:
>
> Hi Mary,
>
> I'm confused by this change. The summary mentions ovn-northd being
> at an
> older version, but there's no version check being performed in the
> patch. Also, what constitutes an "older" version of ovn-northd? Why
> shouldn't we set the port up in this case? What problem is being solved?
>
> On 5/9/22 15:28, mary.mano...@nutanix.com
> > wrote:
> > From: Mary Manohar 
> mailto:mary.mano...@nutanix.com> 
> >>
> >
> > Signed-off-by: Mary Manohar 
> mailto:mary.mano...@nutanix.com> 
> >>
> > ---
> >   controller/binding.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index e284704..e7dc537 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -908,7 +908,9 @@ claimed_lport_set_up(const struct 
> sbrec_port_binding *pb,
> >   if (!notify_up) {
> >   bool up = true;
> >   if (!parent_pb || (parent_pb->n_up && parent_pb->up[0])) {
> > -sbrec_port_binding_set_up(pb, , 1);
> > +if (pb->n_up) {
> > +sbrec_port_binding_set_up(pb, , 1);
> > +}
> >   }
> >   return;
> >   }
> >
>
> ___
> dev mailing list
> d...@openvswitch.org 
> >
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev 
> [mail.openvswitch.org]
> [mail.openvswitch.org 
> [mail.openvswitch.org]]
> 
> 

[ovs-dev] [PATCH ovn v3] ovn-controller: Set Port_Binding.up field only if the Southbound DB is aware of this field

2022-05-18 Thread mary . manohar
From: Mary Manohar 

This patch is fixing a backward compatibility issue when using an older
ovn-northd (20.09).
The 20.09 Southbound schema is not aware of the 'up'
field in the Port_Binding table. ovn-controller should set
Port_Binding.up field only if the Southbound DB
is aware of this field (or transaction would fail).

Signed-off-by: Mary Manohar 
---
 controller/binding.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/controller/binding.c b/controller/binding.c
index e284704..e7dc537 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -908,7 +908,9 @@ claimed_lport_set_up(const struct sbrec_port_binding *pb,
 if (!notify_up) {
 bool up = true;
 if (!parent_pb || (parent_pb->n_up && parent_pb->up[0])) {
-sbrec_port_binding_set_up(pb, , 1);
+if (pb->n_up) {
+sbrec_port_binding_set_up(pb, , 1);
+}
 }
 return;
 }
-- 
2.9.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 0/3] Adding generic port security flows.

2022-05-18 Thread Numan Siddique
On Mon, May 16, 2022 at 5:42 PM Numan Siddique  wrote:

>
>
> On Mon, May 16, 2022 at 4:27 PM Mark Michelson 
> wrote:
>
>> Hi Numan,
>>
>> I've taken a close look at the patches in this series, and they seem
>> really good. They're very well commented and well tested as well. It's
>> quite easy to follow the change, and I couldn't find any flaws in my
>> review.
>>
>> However, I do want to double-check that this does not put unnecessary
>> load on ovn-controller. I suspect it won't be much of a problem since
>>
>> 1) The port security flows are calculated incrementally.
>> 2) The reduced SB DB size likely lessens the overall load on
>> ovn-controller .
>> 3) The removed port security logical flows means there is less parsing
>> of logical flows per iteration of ovn-controller.
>>
>> However, this does add new OF flow creation in ovn-controller, so it's
>> worth checking to make sure ovn-controller does not see any noticeable
>> performance decrease.
>>
>> If we can get confirmation, then I'll ack the series.
>>
>
> Thanks Mark for the reviews.
>
> Sure.  I'll do some tests and share the results.
>

I did some testing and these are the findings.

1.  I started 2 separate ovn-central components, one on central-1 with OVN
main commit (as of today's)
and the other on central-2 node with OVN main + these port security
patches.
 Same OVN databases on both the nodes.

central-1 SB DB has 208962 logical flows
central-2 SB DB (with port sec patches) has 84294 logical flows.

2.  Started 2 separate compute nodes - compute-1 (with ovn main
ovn-controller) and compute-2 (with ovn main + port sec ovn-controller)
 Created around 1000 ovs ports on compute-1 and copied the conf.db to
compute-2 - so that both the ovn-controllers claim the same logical ports.

3.  After both the ovn-controllers settle down.  I triggered recompute
multiple times.   This recompute will generate all the openflows (but
ofcourse will not program ovs-vswitchd)
   compute-1 ovn-controller takes around 3100 ms to complete the loop.  I
see unreasonable long poll interval message and compute-2 ovn-controller
takes around 1500ms to
   complete the loop.


I think these patches also help ovn-controller as it has to do less logical
flow processing.

Below is the stopwatch/show for compute-1 ovn-controller

[root@ovn-chassis-1 data]# ovn-appctl -t ovn-controller stopwatch/show
flow-generation
Statistics for 'flow-generation'
  Total samples: 679
  Maximum: 3435 msec
  Minimum: 0 msec
  95th percentile: 32.512116 msec
  Short term average: 15.137543 msec
  Long term average: 109.403161 msec

And below is the same for compute-2 ovn-controller

[root@ovn-chassis-2 /]#  ovn-appctl -t ovn-controller stopwatch/show
flow-generation
Statistics for 'flow-generation'
  Total samples: 700
  Maximum: 1341 msec
  Minimum: 0 msec
  95th percentile: 2.987580 msec
  Short term average: 0.00 msec
  Long term average: 24.980349 msec


Thanks
Numan



> Numan
>
>>
>> On 5/12/22 20:42, num...@ovn.org wrote:
>> > From: Numan Siddique 
>> >
>> > This patch series adds generic logical flows for port security in
>> > the logical switch pipeline and pushes the actual port security
>> > implementation logic to ovn-controller from ovn-northd.
>> >
>> > ovn-northd will now add logical flows like:
>> >
>> > table=0 (ls_in_check_port_sec), priority=50   , match=(1),
>> action=(reg0[14] = check_in_port_sec(); next;)
>> > table=1 (ls_in_apply_port_sec), priority=50   , match=(reg0[14] == 1),
>> action=(drop;)
>> > table=1 (ls_in_apply_port_sec), priority=0, match=(1),
>> action=(next;)
>> >
>> > OVN action check_in_port_sec() resubmits the packet to openflow table
>> > 73.  ovn-controller will add port security flows in table 73,74 and 75
>> > for all the logical ports it has claimed.  The port security information
>> > is passed down the Port_Binding table in Southbound database.
>> >
>> > The main motivation for the patch is to address scale concerns.
>> > This patch series reduces the number of logical flows and ovn-northd
>> > CPU utilization time.
>> >
>> > Did some scale testing and below are the results:
>> >
>> > Used a Northbound database from a deployment of 120 node cluster.
>> > Number of logical switch ports with port security configured: 13711
>> >
>> > With vanilla ovn-northd
>> > ---
>> > Number of logical flows : 208061
>> > Avg time taken to run build_lflows() : 1301 msec
>> > Size of Southbound database after compaction: 104M
>> >
>> > With ovn-northd using this feature
>> > -
>> > Number of logical flows : 83396
>> > Avg time taken to run build_lflows() : 560  msec
>> > Size of Southbound database after compaction: 45M
>> >
>> >
>> >
>> > Numan Siddique (3):
>> >ovn-controller: Add OF rules for port security.
>> >actions: Add new actions check_in_port_sec and check_out_port_sec.
>> >northd: Add generic port security logical flows.
>> >
>> >   controller/binding.c |  

Re: [ovs-dev] [PATCH v2] sha1: Use implementation from openssl if available.

2022-05-18 Thread Ilya Maximets
On 5/16/22 09:40, Dumitru Ceara wrote:
> On 5/7/22 00:55, Ilya Maximets wrote:
>> Implementation of SHA1 in OpenSSL library is much faster and optimized
>> for all available CPU architectures and instruction sets.  OVS should
>> use it instead of internal implementation if possible.
>>
>> Depending on compiler options OpenSSL's version finishes our sha1
>> unit tests from 3 to 12 times faster.  Performance of OpenSSL's
>> version is constant, but OVS's implementation highly depends on
>> compiler.  Interestingly, default build with  '-g -O2' works faster
>> than optimized '-march=native -Ofast'.
>>
>> Tests with ovsdb-server on big databases shows ~5-10% improvement of
>> the time needed for database compaction (sha1 is only a part of this
>> operation), depending on compiler options.
>>
>> We still need internal implementation, because OpenSSL can be not
>> available on some platforms.  Tests enhanced to check both versions
>> of API.
>>
>> Signed-off-by: Ilya Maximets 
>> ---
> 
> Hi Ilya,
> 
> I'm no openssl expert so this is a rather shallow review.  In any case,
> the change looks good to me overall.  I left a question below and a
> minor nit comment.
> 
> Regardless:
> 
> Reviewed-by: Dumitru Ceara 
> 
> Thanks,
> Dumitru
> 
>>
>> Version 2:
>>   - Deprecated low level SHA1_* API replaced with EVP_Digest* API.
>>   - Added more accurate logging of OpenSSL errors.
>>   - Stack-based allocation of the digest context is no longer possible
>> in modern versions of OpenSSL, so converted to supported dynamic
>> allocation with EVP_MD_CTX_create().  Technically, this function
>> was also renamed to EVP_MD_CTX_new() in OpenSSL 3.0, but the old
>> name still works.
>>
>>  lib/sha1.c| 166 +++---
>>  lib/sha1.h|  27 ++--
>>  ovsdb/log.c   |   1 +
>>  tests/library.at  |   2 +-
>>  tests/test-sha1.c |  60 +++--
>>  5 files changed, 193 insertions(+), 63 deletions(-)
>>
>> diff --git a/lib/sha1.c b/lib/sha1.c
>> index 87360d9cd..3ecc8fb9f 100644
>> --- a/lib/sha1.c
>> +++ b/lib/sha1.c
>> @@ -31,12 +31,131 @@
>>  
>>  #include 
>>  #include "sha1.h"
>> +
>> +#ifdef HAVE_OPENSSL
>> +#include 
>> +#include 
>> +#endif
>> +
>>  #include 
>>  #include 
>>  #include "compiler.h"
>> +#include "openvswitch/vlog.h"
>>  #include "util.h"
>>  
>> -/* a bit faster & bigger, if defined */
>> +VLOG_DEFINE_THIS_MODULE(sha1);
>> +
>> +#ifdef HAVE_OPENSSL
>> +static void
>> +log_openssl_err(const char *func)
>> +{
>> +char buf[1024];
>> +
>> +ERR_error_string_n(ERR_get_error(), buf, 1024);
>> +VLOG_FATAL("%s failed: %s", func, buf);
>> +}
>> +#endif
>> +
>> +/*
>> + * Initialize the SHA digest.
>> + * context: The SHA context to initialize
>> + */
>> +void
>> +sha1_init(struct sha1_ctx *sha_info)
>> +{
>> +#ifdef HAVE_OPENSSL
>> +sha_info->ctx = EVP_MD_CTX_create();
>> +if (!EVP_DigestInit_ex(sha_info->ctx, EVP_sha1(), NULL)) {
>> +log_openssl_err("EVP_DigestInit_ex");
>> +}
>> +#else
>> +ovs_sha1_init(sha_info);
>> +#endif
>> +}
>> +
>> +/*
>> + * Update the SHA digest.
>> + * context: The SHA1 context to update.
>> + * input: The buffer to add to the SHA digest.
>> + * inputLen: The length of the input buffer.
>> + */
>> +void
>> +sha1_update(struct sha1_ctx *ctx, const void *buffer_, uint32_t count)
>> +{
>> +#ifdef HAVE_OPENSSL
>> +if (!EVP_DigestUpdate(ctx->ctx, buffer_, count)) {
>> +log_openssl_err("EVP_DigestUpdate");
>> +}
>> +#else
>> +ovs_sha1_update(ctx, buffer_, count);
>> +#endif
>> +}
>> +
>> +/*
>> + * Finish computing the SHA digest.
>> + * digest: the output buffer in which to store the digest.
>> + * context: The context to finalize.
>> + */
>> +void
>> +sha1_final(struct sha1_ctx *ctx, uint8_t digest[SHA1_DIGEST_SIZE])
>> +{
>> +#ifdef HAVE_OPENSSL
>> +unsigned int len;
>> +
>> +if (!EVP_DigestFinal_ex(ctx->ctx, digest, )) {
> 
> According to the openssl man page, "at most EVP_MAX_MD_SIZE bytes will
> be written".  Will there be ever be a case when less than
> SHA1_DIGEST_SIZE bytes are written causing the assert below to fail?

I don't think so.  The openssl documentation is notoriously bad,
but IIUC, digest sizes are constant for a specific algorithm, as
they should be.  SHA1 hash is always 160 bits long.
If openssl doesn't fail but writes less than the hash size, it's
a bug in openssl and we can't actually proceed here.  As if we
can't calculate the hash, we can't perform any operations that
requires hashing.

So, this should never happen, but if that happens, that's a fatal
problem.

> 
> Also, will there ever be a case when more than SHA1_DIGEST_SIZE are
> written?  Should we ensure that all callers of sha1_final pass a digest
> that's at least EVP_MAX_MD_SIZE bytes long?

If we will want to literally follow the documentation, then yes,
we need to use EVP_MAX_MD_SIZE long buffer.  But callers of the
sha1_*() API in OVS doesn't know that.  And, 

Re: [ovs-dev] [PATCH ovn] nb: Add Load_Balancer.options:neighbor_responder knob.

2022-05-18 Thread Mark Michelson

I applied this to main.

On 5/11/22 14:48, Mark Michelson wrote:

Thanks for this Dumitru.

Acked-by: Mark Michelson 

On 4/28/22 11:33, Dumitru Ceara wrote:

This allows CMS to tweak the way logical routers reply to ARP/ND packets
targeting load balancer VIPs.  By default a router only replies for VIPs
that are reachable locally (they're part of a subnet configured on the
router).  There are cases though when it's desirable for routers to
reply for all VIPs.

Reported-at: https://github.com/ovn-org/ovn/issues/124
Reported-by: Tom Parrott 
Signed-off-by: Dumitru Ceara 
---
  NEWS    |  4 
  northd/northd.c | 20 
  ovn-nb.xml  |  9 +
  tests/ovn-northd.at | 10 --
  4 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index dbe89e9cf..ed735c32c 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,10 @@ Post v22.03.0
    - Support NAT for logical routers with multiple distributed 
gateway ports.

    - Add global option (NB_Global.options:default_acl_drop) to enable
  implicit drop behavior on logical switches with ACLs applied.
+  - Add NB.Load_Balancer.options:neighbor_responder to allow the CMS to
+    explicitly request routers to reply to any ARP/ND request for a VIP
+    (when set to "all") and only for reachable VIPs (when set to 
"reachable"

+    or by default).
  OVN v22.03.0 - 11 Mar 2022
  --
diff --git a/northd/northd.c b/northd/northd.c
index a5297..f01bd2cf7 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -3920,6 +3920,26 @@ static void
  build_lrouter_lb_reachable_ips(struct ovn_datapath *od,
 const struct ovn_northd_lb *lb)
  {
+    const char *neighbor_responder_mode =
+    smap_get_def(>nlb->options, "neighbor_responder", 
"reachable");

+
+    /* If configured to reply to neighbor requests for all VIPs force 
them

+ * all to be considered "reachable".
+ */
+    if (!strcmp(neighbor_responder_mode, "all")) {
+    for (size_t i = 0; i < lb->n_vips; i++) {
+    if (IN6_IS_ADDR_V4MAPPED(>vips[i].vip)) {
+    sset_add(>lb_ips_v4_reachable, lb->vips[i].vip_str);
+    } else {
+    sset_add(>lb_ips_v6_reachable, lb->vips[i].vip_str);
+    }
+    }
+    return;
+    }
+
+    /* Otherwise, a VIP is reachable if there's at least one router
+ * subnet that includes it.
+ */
  for (size_t i = 0; i < lb->n_vips; i++) {
  if (IN6_IS_ADDR_V4MAPPED(>vips[i].vip)) {
  ovs_be32 vip_ip4 = 
in6_addr_get_mapped_ipv4(>vips[i].vip);

diff --git a/ovn-nb.xml b/ovn-nb.xml
index 9010240a8..756c2a378 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -1856,6 +1856,15 @@
  more information about what flows are added for IP routes, 
please

  see the ovn-northd manpage section on IP Routing.
    
+
+  
+    If set to all, then routers on which the load 
balancer

+    is applied reply to ARP/neighbor discovery requests for all VIPs
+    of the load balancer.  If set to reachable, then 
routers
+    on which the load balancer is applied reply to ARP/neighbor 
discovery
+    requests only for VIPs that are part of a router's subnet.  
The default
+    value of this option, if not specified, is 
reachable.

+  
  
    
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 69ad85533..8e256de2e 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1708,6 +1708,10 @@ ovn-nbctl lb-add lb5 
"[[fe80::200:ff:fe00:101]]:8080" "[[fe02::200:ff:fe00:101]]
  ovn-nbctl lb-add lb5 "[[fe80::200:ff:fe00:102]]:8080" 
"[[fe02::200:ff:fe00:102]]:8080"

  ovn-nbctl lb-add lb6 "43.43.43.43:8080" "10.0.0.8:8080" udp
  ovn-nbctl lb-add lb7 "[[4343::4343]]:8080" "[[10::10]]:8080" udp
+ovn-nbctl lb-add lb8 "44.44.44.44:8080" "10.0.0.8:8080" udp
+ovn-nbctl set Load_Balancer lb8 options:neighbor_responder=all
+ovn-nbctl lb-add lb9 "[[::]]:8080" "[[10::10]]:8080" udp
+ovn-nbctl set Load_Balancer lb9 options:neighbor_responder=all
  ovn-nbctl lr-lb-add lr lb1
  ovn-nbctl lr-lb-add lr lb2
@@ -1716,6 +1720,8 @@ ovn-nbctl lr-lb-add lr lb4
  ovn-nbctl lr-lb-add lr lb5
  ovn-nbctl lr-lb-add lr lb6
  ovn-nbctl lr-lb-add lr lb7
+ovn-nbctl lr-lb-add lr lb8
+ovn-nbctl lr-lb-add lr lb9
  ovn-nbctl --wait=sb sync
  lr_key=$(fetch_column sb:datapath_binding tunnel_key 
external_ids:name=lr)

@@ -1723,8 +1729,8 @@ lb_as_v4="_rtr_lb_${lr_key}_ip4"
  lb_as_v6="_rtr_lb_${lr_key}_ip6"
  # Check generated VIP address sets (only reachable IPs).
-check_column '43.43.43.43' Address_Set addresses name=${lb_as_v4}
-check_column '4343::4343 fe80::200:ff:fe00:101 fe80::200:ff:fe00:102' 
Address_Set addresses name=${lb_as_v6}
+check_column '43.43.43.43 44.44.44.44' Address_Set addresses 
name=${lb_as_v4}
+check_column '4343::4343 :: fe80::200:ff:fe00:101 
fe80::200:ff:fe00:102' Address_Set addresses name=${lb_as_v6}

  # Ingress router port ETH 

Re: [ovs-dev] [PATCH ovn] ovn-controller: Consider zone 0 as a valid zone when restoring.

2022-05-18 Thread Dumitru Ceara
On 5/18/22 18:41, Ilya Maximets wrote:
> On 5/18/22 18:09, Dumitru Ceara wrote:
>> 0 is a valid zone ID and some CMSs might actually use it.
>>
>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2087194
>> Signed-off-by: Dumitru Ceara 
>> ---
>>  controller/ovn-controller.c | 15 ++-
>>  1 file changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>> index 0efe5c5ce..c2949fa30 100644
>> --- a/controller/ovn-controller.c
>> +++ b/controller/ovn-controller.c
>> @@ -825,13 +825,18 @@ restore_ct_zones(const struct ovsrec_bridge_table 
>> *bridge_table,
>>  }
>>  
>>  const char *user = node->key + 8;
>> -int zone = atoi(node->value);
>> +if (!user[0]) {
>> +continue;
>> +}
>>  
>> -if (user[0] && zone) {
>> -VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
>> -bitmap_set1(ct_zone_bitmap, zone);
>> -simap_put(ct_zones, user, zone);
>> +int zone;
>> +if (!str_to_int(node->value, 10, )) {
> 
> Maybe uint?  Negative values doesn't seem to be valid.
> 

Fair point.

I posted v2:
https://patchwork.ozlabs.org/project/ovn/patch/20220518174728.3366-1-dce...@redhat.com/

Thanks!

>> +continue;
>>  }
>> +
>> +VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
>> +bitmap_set1(ct_zone_bitmap, zone);
>> +simap_put(ct_zones, user, zone);
>>  }
>>  }
>>  
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] ovn-controller: Consider zone 0 as a valid zone when restoring.

2022-05-18 Thread Dumitru Ceara
0 is a valid zone ID and some CMSs might actually use it.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2087194
Signed-off-by: Dumitru Ceara 
---
v2: Use str_to_uint() as suggested by Ilya.
---
 controller/ovn-controller.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 0efe5c5ce..dfe30d1d1 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -825,13 +825,18 @@ restore_ct_zones(const struct ovsrec_bridge_table 
*bridge_table,
 }
 
 const char *user = node->key + 8;
-int zone = atoi(node->value);
+if (!user[0]) {
+continue;
+}
 
-if (user[0] && zone) {
-VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
-bitmap_set1(ct_zone_bitmap, zone);
-simap_put(ct_zones, user, zone);
+unsigned int zone;
+if (!str_to_uint(node->value, 10, )) {
+continue;
 }
+
+VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
+bitmap_set1(ct_zone_bitmap, zone);
+simap_put(ct_zones, user, zone);
 }
 }
 
-- 
2.27.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-controller: Consider zone 0 as a valid zone when restoring.

2022-05-18 Thread Ilya Maximets
On 5/18/22 18:09, Dumitru Ceara wrote:
> 0 is a valid zone ID and some CMSs might actually use it.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2087194
> Signed-off-by: Dumitru Ceara 
> ---
>  controller/ovn-controller.c | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 0efe5c5ce..c2949fa30 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -825,13 +825,18 @@ restore_ct_zones(const struct ovsrec_bridge_table 
> *bridge_table,
>  }
>  
>  const char *user = node->key + 8;
> -int zone = atoi(node->value);
> +if (!user[0]) {
> +continue;
> +}
>  
> -if (user[0] && zone) {
> -VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
> -bitmap_set1(ct_zone_bitmap, zone);
> -simap_put(ct_zones, user, zone);
> +int zone;
> +if (!str_to_int(node->value, 10, )) {

Maybe uint?  Negative values doesn't seem to be valid.

> +continue;
>  }
> +
> +VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
> +bitmap_set1(ct_zone_bitmap, zone);
> +simap_put(ct_zones, user, zone);
>  }
>  }
>  

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] OVN - flushing conntrack entries for newly allocated zones

2022-05-18 Thread Dumitru Ceara
On 5/17/22 16:10, Dumitru Ceara wrote:
> Hi Justin,
> 
> I know it's a while ago since this patch [0][1] made it in OVN but I was
> wondering if you remembered some of the reasons behind doing this.
> 
> More recently, OVN got a new feature to allow the CMS to explicitly
> specify a zone ID to be used for conntrack for gateway routers [2].
> 
> The reason behind that was that, in some cases, ovn-kubernetes shares
> the gateway router conntrack zone with the host [3].
> 
> The fact that an ovn-controller restart flushes the shared zone becomes
> an issue sometimes.  We're trying to see if there's a way to avoid that,
> potentially via an ovn-controller cmdline argument or configuration.
> 
> But before that, I wonder if you know of any CMS that relies on the
> flush-on-use behavior.

Sorry for the noise.  I think we figured out the context and we also
found the bug that was strictly related to conntrack zone ID 0.

I posted a fix for that:

https://patchwork.ozlabs.org/project/ovn/patch/20220518160952.9998-1-dce...@redhat.com/

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] ovn-controller: Consider zone 0 as a valid zone when restoring.

2022-05-18 Thread Dumitru Ceara
0 is a valid zone ID and some CMSs might actually use it.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2087194
Signed-off-by: Dumitru Ceara 
---
 controller/ovn-controller.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 0efe5c5ce..c2949fa30 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -825,13 +825,18 @@ restore_ct_zones(const struct ovsrec_bridge_table 
*bridge_table,
 }
 
 const char *user = node->key + 8;
-int zone = atoi(node->value);
+if (!user[0]) {
+continue;
+}
 
-if (user[0] && zone) {
-VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
-bitmap_set1(ct_zone_bitmap, zone);
-simap_put(ct_zones, user, zone);
+int zone;
+if (!str_to_int(node->value, 10, )) {
+continue;
 }
+
+VLOG_DBG("restoring ct zone %"PRId32" for '%s'", zone, user);
+bitmap_set1(ct_zone_bitmap, zone);
+simap_put(ct_zones, user, zone);
 }
 }
 
-- 
2.27.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V2 3/3] netdev-offload-dpdk: Add geneve header pattern match

2022-05-18 Thread Eli Britstein via dev
  1.  The focus was VXLAN. Need to enhance DPDK to support it. If this is not 
important, let’s abandon this patch-set until DPDK is enhanced.
  2.  There is no need. DPDK has only specific encaps for VXLAN/NVGRE. Other 
encaps are done only by “RAW”. In OVS VXLAN is used if applicable, and fallback 
to RAW. Geneve is under this category.

From: Hemal Shah 
Sent: Wednesday, May 18, 2022 6:02 PM
To: Eli Britstein 
Cc: Ilya Maximets ; d...@openvswitch.org
Subject: Re: [ovs-dev] [PATCH V2 3/3] netdev-offload-dpdk: Add geneve header 
pattern match

Eli,

I'm trying to understand options handling during Geneve encap/decap offload.

  1.  This patchset will allow decap offload for Geneve w/o options only. I do 
not think that covers important use cases for Geneve. Geneve was meant to be 
used with options. Why is the limitation of not having options support in 
struct rte_flow_restore_info gating the offload design?
  2.  I have not seen companion patch support for encap offload for Geneve. Is 
similar restriction of not offloading Geneve w/ options apply on the encap 
offload?
Hemal

On Sat, May 7, 2022 at 10:07 PM Eli Britstein via dev 
mailto:ovs-dev@openvswitch.org>> wrote:


>-Original Message-
>From: Ilya Maximets mailto:i.maxim...@ovn.org>>
>Sent: Wednesday, May 4, 2022 2:44 PM
>To: Eli Britstein mailto:el...@nvidia.com>>; 
>d...@openvswitch.org
>Cc: i.maxim...@ovn.org; Gaetan Rivet 
>mailto:gaet...@nvidia.com>>;
>msant...@redhat.com; Nir Anteby 
>mailto:nant...@nvidia.com>>
>Subject: Re: [PATCH V2 3/3] netdev-offload-dpdk: Add geneve header pattern
>match
>
>External email: Use caution opening links or attachments
>
>
>On 2/7/22 18:24, Eli Britstein wrote:
>> Add support for matching on geneve header.
>>
>> Signed-off-by: Eli Britstein mailto:el...@nvidia.com>>
>> Reviewed-by: Nir Anteby mailto:nant...@nvidia.com>>
>> Acked-by: Michael Santana mailto:msant...@redhat.com>>
>> ---
>>  NEWS  |  2 ++
>>  lib/netdev-offload-dpdk.c | 58
>> +++
>>  2 files changed, 60 insertions(+)
>>
>> diff --git a/NEWS b/NEWS
>> index e1c48f3a1..41a80d127 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -29,6 +29,8 @@ v2.17.0 - xx xxx 
>>   * Add support for DPDK 21.11.
>>   * Forbid use of DPDK multiprocess feature.
>>   * Add support for running threads on cores >= RTE_MAX_LCORE.
>> + * Add hardware offload support for GENEVE flows (experimental).
>> +   Available only if DPDK experimantal APIs enabled during the build.
>> - Python:
>>   * For SSL support, the use of the pyOpenSSL library has been replaced
>> with the native 'ssl' module.
>> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
>> index edd4e009d..0303bd2df 100644
>> --- a/lib/netdev-offload-dpdk.c
>> +++ b/lib/netdev-offload-dpdk.c
>> @@ -638,6 +638,24 @@ dump_flow_pattern(struct ds *s,
>>ntohl(*key_spec), ntohl(*key_mask), 0);
>>  }
>>  ds_put_cstr(s, "/ ");
>> +} else if (item->type == RTE_FLOW_ITEM_TYPE_GENEVE) {
>> +const struct rte_flow_item_geneve *geneve_spec = item->spec;
>> +const struct rte_flow_item_geneve *geneve_mask = item->mask;
>> +ovs_be32 spec_vni, mask_vni;
>> +
>> +ds_put_cstr(s, "geneve ");
>> +if (geneve_spec) {
>> +if (!geneve_mask) {
>> +geneve_mask = _flow_item_geneve_mask;
>> +}
>> +spec_vni = get_unaligned_be32(ALIGNED_CAST(ovs_be32 *,
>> +   geneve_spec->vni));
>> +mask_vni = get_unaligned_be32(ALIGNED_CAST(ovs_be32 *,
>> +   geneve_mask->vni));
>> +DUMP_PATTERN_ITEM(geneve_mask->vni, false, "vni", "%"PRIu32,
>> +  ntohl(spec_vni) >> 8, ntohl(mask_vni) >> 8, 
>> 0);
>> +}
>> +ds_put_cstr(s, "/ ");
>>  } else {
>>  ds_put_format(s, "unknown rte flow pattern (%d)\n", item->type);
>>  }
>> @@ -1351,6 +1369,44 @@ parse_gre_match(struct flow_patterns
>*patterns,
>>  return 0;
>>  }
>>
>> +static int
>> +parse_geneve_match(struct flow_patterns *patterns,
>> +   struct match *match) {
>> +struct rte_flow_item_geneve *geneve_spec, *geneve_mask;
>> +struct flow *consumed_masks;
>> +int ret;
>> +
>> +ret = parse_tnl_ip_match(patterns, match, IPPROTO_UDP);
>> +if (ret) {
>> +return -1;
>> +}
>> +parse_tnl_udp_match(patterns, match);
>> +
>> +consumed_masks = >wc.masks;
>> +/* GENEVE */
>> +geneve_spec = xzalloc(sizeof *geneve_spec);
>> +geneve_mask = xzalloc(sizeof *geneve_mask);
>> +
>> +put_unaligned_be32(ALIGNED_CAST(ovs_be32 *, geneve_spec->vni),
>> +   htonl(ntohll(match->flow.tunnel.tun_id) << 8));
>> +

Re: [ovs-dev] [PATCH V2 3/3] netdev-offload-dpdk: Add geneve header pattern match

2022-05-18 Thread Hemal Shah via dev
Eli,

I'm trying to understand options handling during Geneve encap/decap offload.

   1. This patchset will allow decap offload for Geneve w/o options only. I
   do not think that covers important use cases for Geneve. Geneve was meant
   to be used with options. Why is the limitation of not having options
   support in struct rte_flow_restore_info gating the offload design?
   2. I have not seen companion patch support for encap offload for Geneve.
   Is similar restriction of not offloading Geneve w/ options apply on the
   encap offload?

Hemal

On Sat, May 7, 2022 at 10:07 PM Eli Britstein via dev <
ovs-dev@openvswitch.org> wrote:

>
>
> >-Original Message-
> >From: Ilya Maximets 
> >Sent: Wednesday, May 4, 2022 2:44 PM
> >To: Eli Britstein ; d...@openvswitch.org
> >Cc: i.maxim...@ovn.org; Gaetan Rivet ;
> >msant...@redhat.com; Nir Anteby 
> >Subject: Re: [PATCH V2 3/3] netdev-offload-dpdk: Add geneve header pattern
> >match
> >
> >External email: Use caution opening links or attachments
> >
> >
> >On 2/7/22 18:24, Eli Britstein wrote:
> >> Add support for matching on geneve header.
> >>
> >> Signed-off-by: Eli Britstein 
> >> Reviewed-by: Nir Anteby 
> >> Acked-by: Michael Santana 
> >> ---
> >>  NEWS  |  2 ++
> >>  lib/netdev-offload-dpdk.c | 58
> >> +++
> >>  2 files changed, 60 insertions(+)
> >>
> >> diff --git a/NEWS b/NEWS
> >> index e1c48f3a1..41a80d127 100644
> >> --- a/NEWS
> >> +++ b/NEWS
> >> @@ -29,6 +29,8 @@ v2.17.0 - xx xxx 
> >>   * Add support for DPDK 21.11.
> >>   * Forbid use of DPDK multiprocess feature.
> >>   * Add support for running threads on cores >= RTE_MAX_LCORE.
> >> + * Add hardware offload support for GENEVE flows (experimental).
> >> +   Available only if DPDK experimantal APIs enabled during the
> build.
> >> - Python:
> >>   * For SSL support, the use of the pyOpenSSL library has been
> replaced
> >> with the native 'ssl' module.
> >> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
> >> index edd4e009d..0303bd2df 100644
> >> --- a/lib/netdev-offload-dpdk.c
> >> +++ b/lib/netdev-offload-dpdk.c
> >> @@ -638,6 +638,24 @@ dump_flow_pattern(struct ds *s,
> >>ntohl(*key_spec), ntohl(*key_mask), 0);
> >>  }
> >>  ds_put_cstr(s, "/ ");
> >> +} else if (item->type == RTE_FLOW_ITEM_TYPE_GENEVE) {
> >> +const struct rte_flow_item_geneve *geneve_spec = item->spec;
> >> +const struct rte_flow_item_geneve *geneve_mask = item->mask;
> >> +ovs_be32 spec_vni, mask_vni;
> >> +
> >> +ds_put_cstr(s, "geneve ");
> >> +if (geneve_spec) {
> >> +if (!geneve_mask) {
> >> +geneve_mask = _flow_item_geneve_mask;
> >> +}
> >> +spec_vni = get_unaligned_be32(ALIGNED_CAST(ovs_be32 *,
> >> +
>  geneve_spec->vni));
> >> +mask_vni = get_unaligned_be32(ALIGNED_CAST(ovs_be32 *,
> >> +
>  geneve_mask->vni));
> >> +DUMP_PATTERN_ITEM(geneve_mask->vni, false, "vni",
> "%"PRIu32,
> >> +  ntohl(spec_vni) >> 8, ntohl(mask_vni) >>
> 8, 0);
> >> +}
> >> +ds_put_cstr(s, "/ ");
> >>  } else {
> >>  ds_put_format(s, "unknown rte flow pattern (%d)\n",
> item->type);
> >>  }
> >> @@ -1351,6 +1369,44 @@ parse_gre_match(struct flow_patterns
> >*patterns,
> >>  return 0;
> >>  }
> >>
> >> +static int
> >> +parse_geneve_match(struct flow_patterns *patterns,
> >> +   struct match *match) {
> >> +struct rte_flow_item_geneve *geneve_spec, *geneve_mask;
> >> +struct flow *consumed_masks;
> >> +int ret;
> >> +
> >> +ret = parse_tnl_ip_match(patterns, match, IPPROTO_UDP);
> >> +if (ret) {
> >> +return -1;
> >> +}
> >> +parse_tnl_udp_match(patterns, match);
> >> +
> >> +consumed_masks = >wc.masks;
> >> +/* GENEVE */
> >> +geneve_spec = xzalloc(sizeof *geneve_spec);
> >> +geneve_mask = xzalloc(sizeof *geneve_mask);
> >> +
> >> +put_unaligned_be32(ALIGNED_CAST(ovs_be32 *, geneve_spec->vni),
> >> +   htonl(ntohll(match->flow.tunnel.tun_id) << 8));
> >> +put_unaligned_be32(ALIGNED_CAST(ovs_be32 *, geneve_mask->vni),
> >> +   htonl(ntohll(match->wc.masks.tunnel.tun_id) <<
> >> + 8));
> >> +
> >> +consumed_masks->tunnel.tun_id = 0;
> >> +consumed_masks->tunnel.flags = 0;
> >> +/* tunnel.metadata.present.len value indicates the number of
> >> + * options, it's mask does not indicate any match on the packet,
> >> + * thus masked.
> >
> >I'm not sure I get that.  Options are part of the geneve header, so if
> the match
> >is requested, we have to match on them.  And there is a special item for
> them
> >- RTE_FLOW_ITEM_TYPE_GENEVE_OPT, which, I think, should be used in this
> >patch.
> It is correct dpdk supports this flow item. However, it doesn't 

Re: [ovs-dev] [PATCH ovn v2] Do not set 'up' attribute in the port-binding table if ovn-northd is at an older version.

2022-05-18 Thread Numan Siddique
On Tue, May 17, 2022 at 9:55 AM Mark Michelson  wrote:

> OK thanks for the explanations, Xavier and Mary. This is pretty
> important since we expect that ovn-controller should be upgraded before
> ovn-northd.
>
> I'm acking the patch because the content is correct. I suggest that when
> this is merged, the person that merges should expand on the commit
> message a bit to explain why the commit is necessary. If nothing else,
> make it clear that 20.09 is the cutoff between an "older" and "newer"
> version of ovn-northd in this case.
>
> Acked-by: Mark Michelson
>

Thanks Mark for the review.

@Mary - would you mind updating just the commit message here ?

Thanks
Numan

>
> On 5/12/22 11:59, Mary Manohar wrote:
> > Thanks Xavier for clarifying.
> >
> > Hi Mark,
> >
> > Xavier’s explanation is bang on. The ‘up’ field is introduced in the
> > Port Binding table post 20.09.
> >
> > So, when the southbound is at 20.09 version, the set operation fails in
> > ovn-controller.
> >
> > Thanks
> >
> > Mary
> >
> > *From: *Xavier Simonart 
> > *Date: *Wednesday, May 11, 2022 at 8:10 AM
> > *To: *Mark Michelson 
> > *Cc: *Mary Manohar , ovs-dev@openvswitch.org
> > 
> > *Subject: *Re: [ovs-dev] [PATCH ovn v2] Do not set 'up' attribute in the
> > port-binding table if ovn-northd is at an older version.
> >
> > Hi Mark
> >
> > This is fixing a backward compatibility issue when using an older
> > ovn-northd.
> >
> > ovn-controller should only set Port_Binding.up field if the Southbound
> > DB is aware of this field (or transaction would fail).
> >
> > We already do this when setting pb up through notification (if-status
> > module).
> >
> > More explanations on commit 8b45fc9b2 from Dumitru.
> >
> > Thanks
> >
> > Xavier
> >
> > On Tue, May 10, 2022 at 8:12 PM Mark Michelson  > > wrote:
> >
> > Hi Mary,
> >
> > I'm confused by this change. The summary mentions ovn-northd being
> > at an
> > older version, but there's no version check being performed in the
> > patch. Also, what constitutes an "older" version of ovn-northd? Why
> > shouldn't we set the port up in this case? What problem is being
> solved?
> >
> > On 5/9/22 15:28, mary.mano...@nutanix.com
> >  wrote:
> > > From: Mary Manohar  mary.mano...@nutanix.com>>
> > >
> > > Signed-off-by: Mary Manohar  mary.mano...@nutanix.com>>
> > > ---
> > >   controller/binding.c | 4 +++-
> > >   1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/controller/binding.c b/controller/binding.c
> > > index e284704..e7dc537 100644
> > > --- a/controller/binding.c
> > > +++ b/controller/binding.c
> > > @@ -908,7 +908,9 @@ claimed_lport_set_up(const struct
> sbrec_port_binding *pb,
> > >   if (!notify_up) {
> > >   bool up = true;
> > >   if (!parent_pb || (parent_pb->n_up && parent_pb->up[0]))
> {
> > > -sbrec_port_binding_set_up(pb, , 1);
> > > +if (pb->n_up) {
> > > +sbrec_port_binding_set_up(pb, , 1);
> > > +}
> > >   }
> > >   return;
> > >   }
> > >
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org 
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > [mail.openvswitch.org]
> > <
> https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev=DwMFaQ=s883GpUCOChKOHiocYtGcg=PrJpP14VwAwIfuISqmhQn4UbXr0029-Ifb7y0YpWR_A=4M1akZ2SPpiYOkwcXOUTtZYo6SeEU7e0T0dg9b9E3E1ypfMSyt_QPOvkYzq20Pkr=go8izusrxnEFuOBTOkGitblsby6P-0arnWEq0ZkNJ2w=
> >
> >
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] utilities: nbctl: dump lbs in load_balancer group running {ls, lr}-lb-list

2022-05-18 Thread Numan Siddique
On Tue, May 10, 2022 at 4:22 PM Mark Michelson  wrote:

> Thanks for this, Lorenzo.
>
> Acked-by: Mark Michelson 
>

Thanks.  I applied this patch to the main branch.

Numan

>
> On 4/27/22 06:47, Lorenzo Bianconi wrote:
> > Dump load balancers belonging to the load_balancer groups attached to the
> > specified logical switch/logical router running ls-lb-list/lr-lb-list.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2059261
> > Signed-off-by: Lorenzo Bianconi 
> > ---
> >   tests/ovn-nbctl.at| 70 +--
> >   utilities/ovn-nbctl.c | 54 +
> >   2 files changed, 122 insertions(+), 2 deletions(-)
> >
> > diff --git a/tests/ovn-nbctl.at b/tests/ovn-nbctl.at
> > index f9b9defd0..2388eba2e 100644
> > --- a/tests/ovn-nbctl.at
> > +++ b/tests/ovn-nbctl.at
> > @@ -997,6 +997,30 @@ AT_CHECK([ovn-nbctl ls-lb-del ls0 lb3])
> >   AT_CHECK([ovn-nbctl ls-lb-list ls0 | uuidfilt], [0], [])
> >   AT_CHECK([ovn-nbctl --if-exists ls-lb-del ls0 lb1])
> >
> > +AT_CHECK([ovn-nbctl lb-add lb4 40.0.0.10 162.168.10.10,162.168.10.20])
> > +AT_CHECK([ovn-nbctl lb-add lb5 50.0.0.10 172.168.10.10,172.168.10.20])
> > +AT_CHECK([ovn-nbctl lb-add lb6 60.0.0.10 182.168.10.10,182.168.10.20])
> > +
> > +lb4=$(fetch_column nb:load_balancer _uuid name=lb4)
> > +lb5=$(fetch_column nb:load_balancer _uuid name=lb5)
> > +lb6=$(fetch_column nb:load_balancer _uuid name=lb6)
> > +
> > +lbg=$(ovn-nbctl create load_balancer_group name=lbg -- \
> > +  add load_balancer_group lbg load_balancer $lb4 -- \
> > +  add load_balancer_group lbg load_balancer $lb5 -- \
> > +  add load_balancer_group lbg load_balancer $lb6)
> > +
> > +AT_CHECK([ovn-nbctl add logical_switch ls0 load_balancer_group $lbg])
> > +
> > +AT_CHECK([ovn-nbctl ls-lb-list ls0 | uuidfilt], [0], [dnl
> > +UUIDLB  PROTO
> VIP  IPs
> > +<0>lb4 tcp40.0.0.10
> 162.168.10.10,162.168.10.20
> > +<1>lb5 tcp50.0.0.10
> 172.168.10.10,172.168.10.20
> > +<2>lb6 tcp60.0.0.10
> 182.168.10.10,182.168.10.20
> > +])
> > +
> > +AT_CHECK([ovn-nbctl remove logical_switch ls0 load_balancer_group $lbg])
> > +
> >   dnl Remove all load balancers from logical switch.
> >   AT_CHECK([ovn-nbctl ls-lb-add ls0 lb0])
> >   AT_CHECK([ovn-nbctl ls-lb-add ls0 lb1])
> > @@ -1046,7 +1070,16 @@ AT_CHECK([ovn-nbctl ls-lb-add ls0 lb1])
> >   AT_CHECK([ovn-nbctl lb-del lb0])
> >   AT_CHECK([ovn-nbctl lb-del lb1])
> >   AT_CHECK([ovn-nbctl lr-lb-list lr0 | uuidfilt], [0], [])
> > -AT_CHECK([ovn-nbctl ls-lb-list ls0 | uuidfilt], [0], [])])
> > +AT_CHECK([ovn-nbctl ls-lb-list ls0 | uuidfilt], [0], [])
> > +
> > +AT_CHECK([ovn-nbctl add logical_router lr0 load_balancer_group $lbg])
> > +
> > +AT_CHECK([ovn-nbctl lr-lb-list lr0 | uuidfilt], [0], [dnl
> > +UUIDLB  PROTO
> VIP  IPs
> > +<0>lb4 tcp40.0.0.10
> 162.168.10.10,162.168.10.20
> > +<1>lb5 tcp50.0.0.10
> 172.168.10.10,172.168.10.20
> > +<2>lb6 tcp60.0.0.10
> 182.168.10.10,182.168.10.20
> > +])])
> >
> >   dnl
> -
> >
> > @@ -1271,6 +1304,30 @@ AT_CHECK([ovn-nbctl ls-lb-add ls0 lb3])
> >   AT_CHECK([ovn-nbctl ls-lb-del ls0])
> >   AT_CHECK([ovn-nbctl ls-lb-list ls0 | uuidfilt], [0], [])
> >
> > +AT_CHECK([ovn-nbctl lb-add lb4 [[ae07::10]]:80
> [[fd0f::10]]:80,[[fd0f::20]]:80])
> > +AT_CHECK([ovn-nbctl lb-add lb5 [[ae08::10]]:80
> [[fd0f::10]]:80,[[fd0f::20]]:80 udp])
> > +AT_CHECK([ovn-nbctl lb-add lb6 ae09::10 fd0f::10,fd0f::20])
> > +
> > +lb4=$(fetch_column nb:load_balancer _uuid name=lb4)
> > +lb5=$(fetch_column nb:load_balancer _uuid name=lb5)
> > +lb6=$(fetch_column nb:load_balancer _uuid name=lb6)
> > +
> > +lbg=$(ovn-nbctl create load_balancer_group name=lbg -- \
> > +  add load_balancer_group lbg load_balancer $lb4 -- \
> > +  add load_balancer_group lbg load_balancer $lb5 -- \
> > +  add load_balancer_group lbg load_balancer $lb6)
> > +
> > +AT_CHECK([ovn-nbctl add logical_switch ls0 load_balancer_group $lbg])
> > +
> > +AT_CHECK([ovn-nbctl ls-lb-list ls0 | uuidfilt], [0], [dnl
> > +UUIDLB  PROTO
> VIP  IPs
> > +<0>lb4 tcp[[ae07::10]]:80
> [[fd0f::10]]:80,[[fd0f::20]]:80
> > +<1>lb5 udp[[ae08::10]]:80
> [[fd0f::10]]:80,[[fd0f::20]]:80
> > +<2>lb6 tcpae09::10 fd0f::10,fd0f::20
> > +])
> > +
> > +AT_CHECK([ovn-nbctl remove logical_switch ls0 load_balancer_group $lbg])
> > +
> >   dnl Add load balancer to logical router.
> >   AT_CHECK([ovn-nbctl lr-add lr0])
> >   AT_CHECK([ovn-nbctl lr-lb-add lr0 lb0])
> > @@ -1305,7 +1362,16 @@ AT_CHECK([ovn-nbctl lr-lb-add lr0 lb0])
> >   

[ovs-dev] [PATCH v4 1/2] dpif-netdev : Fix ALB parameters type mismatch.

2022-05-18 Thread miterv
From: Lin Huang 

The ALB parameters should never be negative.
So it's to use smap_get_ulonglong() or smap_get_uint() to get it properly.

Fixes: 5bf84282482a ("Adding support for PMD auto load balancing")
Signed-off-by: Lin Huang 
---
 lib/dpif-netdev.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 21277b236..3597d7e40 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -4880,8 +4880,9 @@ dpif_netdev_set_config(struct dpif *dpif, const struct 
smap *other_config)
 
 struct pmd_auto_lb *pmd_alb = >pmd_alb;
 
-rebalance_intvl = smap_get_int(other_config, "pmd-auto-lb-rebal-interval",
-   ALB_REBALANCE_INTERVAL);
+rebalance_intvl = smap_get_ullong(other_config,
+  "pmd-auto-lb-rebal-interval",
+  ALB_REBALANCE_INTERVAL);
 
 /* Input is in min, convert it to msec. */
 rebalance_intvl =
@@ -4894,9 +4895,9 @@ dpif_netdev_set_config(struct dpif *dpif, const struct 
smap *other_config)
 log_autolb = true;
 }
 
-rebalance_improve = smap_get_int(other_config,
- "pmd-auto-lb-improvement-threshold",
- ALB_IMPROVEMENT_THRESHOLD);
+rebalance_improve = smap_get_uint(other_config,
+  "pmd-auto-lb-improvement-threshold",
+  ALB_IMPROVEMENT_THRESHOLD);
 if (rebalance_improve > 100) {
 rebalance_improve = ALB_IMPROVEMENT_THRESHOLD;
 }
@@ -4907,8 +4908,8 @@ dpif_netdev_set_config(struct dpif *dpif, const struct 
smap *other_config)
 log_autolb = true;
 }
 
-rebalance_load = smap_get_int(other_config, "pmd-auto-lb-load-threshold",
-  ALB_LOAD_THRESHOLD);
+rebalance_load = smap_get_uint(other_config, "pmd-auto-lb-load-threshold",
+   ALB_LOAD_THRESHOLD);
 if (rebalance_load > 100) {
 rebalance_load = ALB_LOAD_THRESHOLD;
 }
-- 
2.36.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 2/2] dpif-netdev : Fix ALB 'rebalance_intvl' max hard limit.

2022-05-18 Thread miterv
From: Lin Huang 

Currently the pmd-auto-lb-rebal-interval's value was not been
checked properly.
It maybe a negative, or too big value (>2 weeks between rebalances),
which will be lead to a big unsigned value. So reset it to default
if the value exceeds the max permitted as described in vswitchd.xml.

Fixes: 5bf84282482a ("Adding support for PMD auto load balancing")
Signed-off-by: Lin Huang 
---
 lib/dpif-netdev.c |  6 +-
 tests/alb.at  | 20 +++-
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 3597d7e40..33fb8ad81 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -93,7 +93,8 @@ VLOG_DEFINE_THIS_MODULE(dpif_netdev);
 /* Auto Load Balancing Defaults */
 #define ALB_IMPROVEMENT_THRESHOLD25
 #define ALB_LOAD_THRESHOLD   95
-#define ALB_REBALANCE_INTERVAL   1 /* 1 Min */
+#define ALB_REBALANCE_INTERVAL   1 /* 1 Min */
+#define MAX_ALB_REBALANCE_INTERVAL   2 /* 2 Min */
 #define MIN_TO_MSEC  6
 
 #define FLOW_DUMP_MAX_BATCH 50
@@ -4883,6 +4884,9 @@ dpif_netdev_set_config(struct dpif *dpif, const struct 
smap *other_config)
 rebalance_intvl = smap_get_ullong(other_config,
   "pmd-auto-lb-rebal-interval",
   ALB_REBALANCE_INTERVAL);
+if (rebalance_intvl > MAX_ALB_REBALANCE_INTERVAL) {
+rebalance_intvl = ALB_REBALANCE_INTERVAL;
+}
 
 /* Input is in min, convert it to msec. */
 rebalance_intvl =
diff --git a/tests/alb.at b/tests/alb.at
index 0036bd1f2..922185d61 100644
--- a/tests/alb.at
+++ b/tests/alb.at
@@ -243,7 +243,25 @@ get_log_next_line_num
 AT_CHECK([ovs-vsctl set open_vswitch . 
other_config:pmd-auto-lb-rebal-interval="0"])
 CHECK_ALB_PARAM([interval], [1 mins], [+$LINENUM])
 
-# No check for above max as it is only a documented max value and not a hard 
limit
+# Set new value
+get_log_next_line_num
+AT_CHECK([ovs-vsctl set open_vswitch . 
other_config:pmd-auto-lb-rebal-interval="100"])
+CHECK_ALB_PARAM([interval], [100 mins], [+$LINENUM])
+
+# Set above max value
+get_log_next_line_num
+AT_CHECK([ovs-vsctl set open_vswitch . 
other_config:pmd-auto-lb-rebal-interval="20001"])
+CHECK_ALB_PARAM([interval], [1 mins], [+$LINENUM])
+
+# Set new value
+get_log_next_line_num
+AT_CHECK([ovs-vsctl set open_vswitch . 
other_config:pmd-auto-lb-rebal-interval="1000"])
+CHECK_ALB_PARAM([interval], [1000 mins], [+$LINENUM])
+
+# Set Negative value
+get_log_next_line_num
+AT_CHECK([ovs-vsctl set open_vswitch . 
other_config:pmd-auto-lb-rebal-interval="-1"])
+CHECK_ALB_PARAM([interval], [1 mins], [+$LINENUM])
 
 OVS_VSWITCHD_STOP
 AT_CLEANUP
-- 
2.36.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 0/2] Fix ALB parameters type and value mismatch.

2022-05-18 Thread miterv
From: Lin Huang 

This series fix ALB parameters type mismatch and set a hard limit
to rebalance_intvl.

It also includes some self-tests.

Changes since v3:
1. Fix CI warnings

Changes since v2:
   1. Fix type mismatch int dpif-netdev.c.

Changes since v1:
1. Fix the commit message and formatting.
2. Fix typo in unit tests.

Lin Huang (2):
  dpif-netdev : Fix ALB parameters type mismatch.
  dpif-netdev : Fix ALB 'rebalance_intvl' max hard limit.

 lib/dpif-netdev.c | 21 +
 tests/alb.at  | 20 +++-
 2 files changed, 32 insertions(+), 9 deletions(-)

-- 
2.36.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 0/2] Fix ALB parameters type and value mismatch.

2022-05-18 Thread lin huang
Hi Kevin,

Thanks for the review.

It's very helpful to me.

> -Original Message-
> From: Kevin Traynor 
> Sent: Tuesday, May 17, 2022 11:34 PM
> To: mit...@outlook.com; ovs-dev@openvswitch.org
> Cc: Lin Huang 
> Subject: Re: [PATCH v3 0/2] Fix ALB parameters type and value mismatch.
> 
> On 15/05/2022 08:09, mit...@outlook.com wrote:
> > From: Lin Huang 
> >
> > This series fix ALB parameters type mismatch and set a hard limit
> > to rebalance_intvl.
> >
> > It also includes some self-tests.
> >
> > Changes since v2:
> >1. Fix type mismatch int dpif-netdev.c.
> >
> > Changes since v1:
> >1. Fix the commit message and formatting.
> >2. Fix typo in unit tests.
> >
> > Lin Huang (2):
> >dpif-netdev: Fix ALB parameters type mismatch.
> >dpif-netdev : Fix ALB 'rebalance_intvl' max hard limit.
> >
> >   lib/dpif-netdev.c | 28 
> >   tests/alb.at  | 20 +++-
> >   2 files changed, 35 insertions(+), 13 deletions(-)
> >
> 
> The changes lgtm, just need the checkpatch CI warnings fixed and I can Ack.
> 
> fyi - you can check these before submission with the checkpatch script e.g.
> $ git format-patch -2
> $ ./utilities/checkpatch.py *.patch
> 
> thanks,
> Kevin.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Only hash port number when necessary.

2022-05-18 Thread Ferriter, Cian



> -Original Message-
> From: Ilya Maximets 
> Sent: Wednesday 18 May 2022 11:35
> To: Mike Pattrick ; Ferriter, Cian 
> Cc: ovs-dev@openvswitch.org; i.maxim...@ovn.org
> Subject: Re: [ovs-dev] [PATCH] dpif-netdev: Only hash port number when 
> necessary.
> 
> On 5/10/22 17:03, Mike Pattrick wrote:
> > On Fri, Apr 29, 2022 at 11:44 AM Cian Ferriter  
> > wrote:
> >>
> >> The hash of the port number is only needed when a DPCLS needs to be
> >> created. Move the hash calculation inside the if to accomplish this.
> >>
> >> Signed-off-by: Cian Ferriter 
> >> ---
> >
> > Good catch!
> >
> > Acked-by: Mike Pattrick 
> 
> I'd guess that compiler will produce identical code in most cases,
> but the change makes the code a bit cleaner.
> 
> Applied.  Thanks!
> 
> Best regards, Ilya Maximets.

Thanks for the review Mike and for applying Ilya.

Agreed, the compiler should clean this up for us. I was thinking of the change 
as a code readability improvement rather than an optimization.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 0/7] Fix UndefinedBehaviorSanitizer reported issues and enable it in CI.

2022-05-18 Thread Dumitru Ceara
On 5/18/22 12:31, Ilya Maximets wrote:
> On 4/11/22 13:37, Dumitru Ceara wrote:
>> As privately reported by Aaron Conole, and by Jeffrey Walton [0]
>> there's currently a number of undefined behavior instances in
>> the OVS code base.  Running the OVS (and OVN) tests with UBSan [1]
>> enabled uncovers these.
>>
>> This series fixes the issues reported by UBSan and, through the last
>> patch, enables UBSan tests in GitHub Actions CI.
>>
>> [0] https://mail.openvswitch.org/pipermail/ovs-dev/2022-January/390894.html
>> [1] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
>> [2] 
>> https://patchwork.ozlabs.org/project/openvswitch/list/?series=286494=*
>>
>> Changes in v6:
>> - Rebased.
>> - Added acks from Aaron and/or Adrian to patches 3, 4, 5.
>> - Added a new patch, 6/7, addressing another UB instance spotted when
>>   running system tests.
>> - Addresses Aaron's comments on patch 2.
>>
>> Changes in v5:
>> - Rebased, dropped patch 'util: Avoid UB when iterating collections.',
>>   the proper fix from Adrian was accepted.
>> - Addressed Adrian's comments.
>>
>> Changes in v4:
>> - Rebased, dropping patches that were already applied.
>> - Addressed Ilya's comments.
>> - Added acks from Paolo from v3 as nothing major changed in this revision.
>> - Rephrased the commit message for "util: Avoid UB when iterating 
>> collections."
>>
>> Changes in v3:
>> - Added acks to the patches acked by Aaron.
>> - Addressed Aaron's comments.
>> - Split previous patch 07/11 "ofp-actions: ofp-errors: Use aligned
>>   structures when decoding ofp actions." into three separate patches
>>   addressing independent issues.
>> - Reordered patches such that the ones that might need follow up are
>>   towards the end of the series.
>> - Added a new patch, patch 13/14, fixing a typo in the CFLAGS_ASAN
>>   variables in the script used for building OVS in CI.  This typo was
>>   originally copy/pasted in the CFLAGS_UBSAN flags in v1 and v2 of
>>   this series.
>>
>> Changes in v2:
>> - Patch 3/11:
>>   - Remove cache line size aligment markers instead, as suggested by
>> Ilya.
>>
>> Dumitru Ceara (7):
>>   treewide: Fix invalid bit shift operations.
>>   treewide: Avoid offsetting NULL pointers.
>>   ofp-actions: Ensure aligned accesses to masked fields.
>>   ofp-errors: Ensure parsed OFPT_ERROR messages are properly aligned.
>>   ofp-actions: Use aligned structures when decoding ofp actions.
>>   netdev-linux: Properly access 32-bit aligned rtnl_link_stats64 structs.
>>   ci: Add UB Sanitizer.
>>
>>
>>  .ci/linux-build.sh   |6 ++
>>  .github/workflows/build-and-test.yml |5 ++
>>  configure.ac |1 
>>  include/openvswitch/ofp-actions.h|   15 +++--
>>  include/openvswitch/ofpbuf.h |   24 ++-
>>  include/openvswitch/util.h   |3 +
>>  lib/dpif-netlink.c   |2 -
>>  lib/dynamic-string.c |8 ++
>>  lib/meta-flow.c  |4 +
>>  lib/netdev-linux.c   |9 ++-
>>  lib/ofp-actions.c|  111 
>> ++
>>  lib/ofp-errors.c |2 +
>>  lib/ofpbuf.c |   43 +
>>  lib/ovsdb-data.c |   37 ++-
>>  lib/ovsdb-data.h |4 +
>>  lib/sset.c   |4 +
>>  lib/tnl-ports.c  |2 -
>>  ofproto/ofproto-dpif-xlate.c |3 +
>>  ofproto/ofproto.c|2 -
>>  tests/atlocal.in |   16 +
>>  tests/automake.mk|1 
>>  tests/daemon.at  |8 ++
>>  tests/ovs-macros.at  |5 ++
>>  tests/ovs-ofctl.at   |2 -
>>  tests/ovsdb-server.at|   16 +
>>  tests/test-hash.c|3 +
>>  tests/test-util.c|   13 ++--
>>  27 files changed, 279 insertions(+), 70 deletions(-)
>>
>>
> 
> Applied.  Thanks!
> 
> Unfortunately, at least one of ofp-* patches breaks ABI, so we can't
> backport them to stable branches.  For example, they will almost
> certainly break OVN if it will be linked with libopenvswitch dynamically.
> I backported treewide and netdev-linux patches though.
> 
> Best regards, Ilya Maximets.
> 

Thanks!

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] northd: add configuration option for enabling parallelization

2022-05-18 Thread Dumitru Ceara
On 5/18/22 14:59, Dumitru Ceara wrote:
> On 5/18/22 12:07, Xavier Simonart wrote:
>> This patch is intended to change the way to enable northd lflow build
>> parallelization, as well as enable runtime change of number of threads.
>> Before this patch, the following was needed to use parallelization:
>> - enable parallelization through use_parallel_build in NBDB
>> - use --dummy-numa to select number of threads.
>> This second part was needed as otherwise as many threads as cores in the 
>> system
>> were used, while parallelization showed some performance improvement only 
>> until
>> using around 4 (or maybe 8) threads.
>>
>> With this patch, the number of threads used for lflow parallel build can be
>> specified either:
>> - at startup, using --n-threads= as ovn-northd command line option
>> - using unixctl
>> If the number of threads specified is > 1, then parallelization is enabled.
>> If the number is 1, parallelization is disabled.
>> If the number is < 1, parallelization is disabled at startup and a warning
>> is logged.
>> If the number is > 256, parallelization is enabled (with 256 threads) and
>> a warning is logged.
>>
>> The following unixctl have been added:
>> - set-n-threads : set the number of treads used.
>> - get-n-threads: returns the number of threads used
>> If the number of threads is within <2-256> bounds, parallelization is 
>> enabled.
>> If the number of thread is 1, parallelization is disabled.
>> Otherwise an error is thrown.
>>
>> Note that, if set-n-threads failed for any reason (e.g. failure to setup some
>> semaphore), parallelization is disabled, and get-n-thread will return 1.
>>
>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2078552
>> Signed-off-by: Xavier Simonart 
>>
>> ---
>> v2:  - handled Dumitru's comments
>>  - added missing mutex_destroy
>>  - fixed issue when use_logical_dp_group is enabled after northd startup
>>  - rebased on top of main
>> v3:
>>  - fix mutex_destroy issue
>> v4:
>>  - handled Mark's comments
>>  - rebased on top of main
>> ---
>>  NEWS  |   7 +
>>  lib/ovn-parallel-hmap.c   | 291 +-
>>  lib/ovn-parallel-hmap.h   |  30 ++--
>>  northd/northd.c   | 176 ---
>>  northd/northd.h   |   1 +
>>  northd/ovn-northd-ddlog.c |   6 -
>>  northd/ovn-northd.8.xml   |  70 +
>>  northd/ovn-northd.c   |  68 -
>>  tests/ovn-macros.at   |  59 ++--
>>  tests/ovn-northd.at   | 109 ++
>>  10 files changed, 495 insertions(+), 322 deletions(-)
>>
> 
> Hi Xavier,
> 
> Sorry, I should've mentioned this earlier but I forgot.  Can you please
> also add an option to ovn-ctl to allow specifiying the number of threads?
> 
> Something along the lines of:
> 
> ovn-ctl start_northd .. --ovn-northd-n-threads=42
> 
> This would allow CMSs that use ovn-ctl [0] to enable parallelization.
> For ovn-org/ovn-kubernetes in particular, we could even make it such
> that in the ovn-org/ovn repo the ovn-kubernetes CI run we do runs both
> with and without northd parallelization.
> 
> Overall the code looks good but given that we probably also want the
> ovn-ctl change I also left a few very small style comments below.
> 
> Thanks,
> Dumitru

[0]
https://github.com/ovn-org/ovn-kubernetes/blob/4d79168271dcb22e8f2a580f5e75e5e38232224b/dist/images/ovnkube.sh#L857

> 
>> diff --git a/NEWS b/NEWS
>> index 244824e3f..6e489df32 100644
>> --- a/NEWS
>> +++ b/NEWS
>> @@ -9,6 +9,13 @@ Post v22.03.0
>>  implicit drop behavior on logical switches with ACLs applied.
>>- Support (LSP.options:qos_min_rate) to guarantee minimal bandwidth 
>> available
>>  for a logical port.
>> +  - Changed the way to enable northd parallelization.
>> +Removed support for:
>> +- use_parallel_build in NBDB.
>> +- --dummy-numa in northd cmdline.
>> +Added support for:
>> +-  --n-threads= in northd cmdline.
>> +- set-n-threads/get-n-threads unixctls.
>>  
>>  OVN v22.03.0 - 11 Mar 2022
>>  --
>> diff --git a/lib/ovn-parallel-hmap.c b/lib/ovn-parallel-hmap.c
>> index 7edc4c0b6..c4d8cee16 100644
>> --- a/lib/ovn-parallel-hmap.c
>> +++ b/lib/ovn-parallel-hmap.c
>> @@ -38,14 +38,10 @@ VLOG_DEFINE_THIS_MODULE(ovn_parallel_hmap);
>>  
>>  #ifndef OVS_HAS_PARALLEL_HMAP
>>  
>> -#define WORKER_SEM_NAME "%x-%p-%x"
>> +#define WORKER_SEM_NAME "%x-%p-%"PRIxSIZE
>>  #define MAIN_SEM_NAME "%x-%p-main"
>>  
>> -/* These are accessed under mutex inside add_worker_pool().
>> - * They do not need to be atomic.
>> - */
>>  static atomic_bool initial_pool_setup = ATOMIC_VAR_INIT(false);
>> -static bool can_parallelize = false;
>>  
>>  /* This is set only in the process of exit and the set is
>>   * accompanied by a fence. It does not need to be atomic or be
>> @@ -57,18 +53,18 @@ static struct ovs_list worker_pools = 
>> OVS_LIST_INITIALIZER(_pools);
>>  
>>  static struct ovs_mutex init_mutex = 

Re: [ovs-dev] [PATCH ovn v4] northd: add configuration option for enabling parallelization

2022-05-18 Thread Dumitru Ceara
On 5/18/22 12:07, Xavier Simonart wrote:
> This patch is intended to change the way to enable northd lflow build
> parallelization, as well as enable runtime change of number of threads.
> Before this patch, the following was needed to use parallelization:
> - enable parallelization through use_parallel_build in NBDB
> - use --dummy-numa to select number of threads.
> This second part was needed as otherwise as many threads as cores in the 
> system
> were used, while parallelization showed some performance improvement only 
> until
> using around 4 (or maybe 8) threads.
> 
> With this patch, the number of threads used for lflow parallel build can be
> specified either:
> - at startup, using --n-threads= as ovn-northd command line option
> - using unixctl
> If the number of threads specified is > 1, then parallelization is enabled.
> If the number is 1, parallelization is disabled.
> If the number is < 1, parallelization is disabled at startup and a warning
> is logged.
> If the number is > 256, parallelization is enabled (with 256 threads) and
> a warning is logged.
> 
> The following unixctl have been added:
> - set-n-threads : set the number of treads used.
> - get-n-threads: returns the number of threads used
> If the number of threads is within <2-256> bounds, parallelization is enabled.
> If the number of thread is 1, parallelization is disabled.
> Otherwise an error is thrown.
> 
> Note that, if set-n-threads failed for any reason (e.g. failure to setup some
> semaphore), parallelization is disabled, and get-n-thread will return 1.
> 
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2078552
> Signed-off-by: Xavier Simonart 
> 
> ---
> v2:  - handled Dumitru's comments
>  - added missing mutex_destroy
>  - fixed issue when use_logical_dp_group is enabled after northd startup
>  - rebased on top of main
> v3:
>  - fix mutex_destroy issue
> v4:
>  - handled Mark's comments
>  - rebased on top of main
> ---
>  NEWS  |   7 +
>  lib/ovn-parallel-hmap.c   | 291 +-
>  lib/ovn-parallel-hmap.h   |  30 ++--
>  northd/northd.c   | 176 ---
>  northd/northd.h   |   1 +
>  northd/ovn-northd-ddlog.c |   6 -
>  northd/ovn-northd.8.xml   |  70 +
>  northd/ovn-northd.c   |  68 -
>  tests/ovn-macros.at   |  59 ++--
>  tests/ovn-northd.at   | 109 ++
>  10 files changed, 495 insertions(+), 322 deletions(-)
> 

Hi Xavier,

Sorry, I should've mentioned this earlier but I forgot.  Can you please
also add an option to ovn-ctl to allow specifiying the number of threads?

Something along the lines of:

ovn-ctl start_northd .. --ovn-northd-n-threads=42

This would allow CMSs that use ovn-ctl [0] to enable parallelization.
For ovn-org/ovn-kubernetes in particular, we could even make it such
that in the ovn-org/ovn repo the ovn-kubernetes CI run we do runs both
with and without northd parallelization.

Overall the code looks good but given that we probably also want the
ovn-ctl change I also left a few very small style comments below.

Thanks,
Dumitru

> diff --git a/NEWS b/NEWS
> index 244824e3f..6e489df32 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -9,6 +9,13 @@ Post v22.03.0
>  implicit drop behavior on logical switches with ACLs applied.
>- Support (LSP.options:qos_min_rate) to guarantee minimal bandwidth 
> available
>  for a logical port.
> +  - Changed the way to enable northd parallelization.
> +Removed support for:
> +- use_parallel_build in NBDB.
> +- --dummy-numa in northd cmdline.
> +Added support for:
> +-  --n-threads= in northd cmdline.
> +- set-n-threads/get-n-threads unixctls.
>  
>  OVN v22.03.0 - 11 Mar 2022
>  --
> diff --git a/lib/ovn-parallel-hmap.c b/lib/ovn-parallel-hmap.c
> index 7edc4c0b6..c4d8cee16 100644
> --- a/lib/ovn-parallel-hmap.c
> +++ b/lib/ovn-parallel-hmap.c
> @@ -38,14 +38,10 @@ VLOG_DEFINE_THIS_MODULE(ovn_parallel_hmap);
>  
>  #ifndef OVS_HAS_PARALLEL_HMAP
>  
> -#define WORKER_SEM_NAME "%x-%p-%x"
> +#define WORKER_SEM_NAME "%x-%p-%"PRIxSIZE
>  #define MAIN_SEM_NAME "%x-%p-main"
>  
> -/* These are accessed under mutex inside add_worker_pool().
> - * They do not need to be atomic.
> - */
>  static atomic_bool initial_pool_setup = ATOMIC_VAR_INIT(false);
> -static bool can_parallelize = false;
>  
>  /* This is set only in the process of exit and the set is
>   * accompanied by a fence. It does not need to be atomic or be
> @@ -57,18 +53,18 @@ static struct ovs_list worker_pools = 
> OVS_LIST_INITIALIZER(_pools);
>  
>  static struct ovs_mutex init_mutex = OVS_MUTEX_INITIALIZER;
>  
> -static int pool_size;
> +static size_t pool_size = 1;
>  
>  static int sembase;
>  
>  static void worker_pool_hook(void *aux OVS_UNUSED);
> -static void setup_worker_pools(bool force);
> +static void setup_worker_pools(void);
>  static void merge_list_results(struct worker_pool *pool 

Re: [ovs-dev] [PATCH v1] Pmd.at: fix dpcls and dpif configuration test cases.

2022-05-18 Thread Phelan, Michael


> -Original Message-
> From: Amber, Kumar 
> Sent: Tuesday 17 May 2022 18:35
> To: ovs-dev@openvswitch.org
> Cc: i.maxim...@ovn.org; Ferriter, Cian ; Stokes, Ian
> ; Phelan, Michael ; Van
> Haaren, Harry ; Amber, Kumar
> 
> Subject: [PATCH v1] Pmd.at: fix dpcls and dpif configuration test cases.
> 
> Without running set command first the string matching fails on get command
> beacuse DPCLS prio value is different for different default builds like with 
> --
> enable-autovalidator build auto-validator prio is set to 255 and if the build 
> is a
> scalar than generic value is default 255.
> 
> The same problem is seen with dpif where re-arranging the get command
> after set makes it consistent across any builds.
> 
> Fixes: cc0a87b11c (pmd.at: Add test-cases for DPCLS and DPIF commands.)
> Signed-off-by: Kumar Amber 
> ---
>  tests/pmd.at | 15 ---
>  1 file changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/tests/pmd.at b/tests/pmd.at index 0a451f33c..3962dd2bd 100644
> --- a/tests/pmd.at
> +++ b/tests/pmd.at
> @@ -1113,15 +1113,15 @@ AT_SETUP([PMD - dpif configuration])
> OVS_VSWITCHD_START([], [], [], [--dummy-numa 0,0])  AT_CHECK([ovs-vsctl
> add-port br0 p1 -- set Interface p1 type=dummy-pmd])
> 
> +AT_CHECK([ovs-appctl dpif-netdev/dpif-impl-set dpif_scalar], [0], [dnl
> +DPIF implementation set to dpif_scalar.
> +])
> +
>  AT_CHECK([ovs-vsctl show], [], [stdout])  AT_CHECK([ovs-appctl dpif-
> netdev/dpif-impl-get | grep "dpif_scalar"], [], [dnl
>dpif_scalar (pmds: 0)
>  ])
> 
> -AT_CHECK([ovs-appctl dpif-netdev/dpif-impl-set dpif_scalar], [0], [dnl -DPIF
> implementation set to dpif_scalar.
> -])
> -
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> 
> @@ -1130,13 +1130,6 @@ OVS_VSWITCHD_START([], [], [], [--dummy-numa
> 0,0])  AT_CHECK([ovs-vsctl add-port br0 p1 -- set Interface p1 type=dummy-
> pmd])
> 
>  AT_CHECK([ovs-vsctl show], [], [stdout]) -AT_CHECK([ovs-appctl dpif-
> netdev/subtable-lookup-prio-get | grep generic], [], [dnl
> -  1 : generic
> -])
> -
> -AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-get | grep
> autovalidator], [], [dnl
> -  0 : autovalidator
> -])
> 
>  AT_CHECK([ovs-appctl dpif-netdev/subtable-lookup-prio-set autovalidator
> 3], [0], [dnl  Lookup priority change affected 0 dpcls ports and 0 subtables.
> --
> 2.25.1

Acked-by: Michael Phelan 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dynamic-string: Fix undefined behavior due to offsetting null pointer.

2022-05-18 Thread Dumitru Ceara
On 5/17/22 22:52, Ilya Maximets wrote:
> On 4/26/22 17:16, Dumitru Ceara wrote:
>> On 4/26/22 16:29, Adrian Moreno wrote:
>>>
>>>
>>> On 4/12/22 15:29, Dumitru Ceara wrote:
 When compiled with '-fsanitize=undefined' running 'ovsdb-client
 --timestamp monitor Open_vSwitch' in a sandbox triggers the following
 undefined behavior (flagged by UBSan):

    lib/dynamic-string.c:207:38: runtime error: applying zero offset to
 null pointer
    #0 0x196c05b in ds_put_strftime_msec lib/dynamic-string.c:207:38
    #1 0x196c2aa in xastrftime_msec lib/dynamic-string.c:225:5
    #2 0x1935f33 in pmd_info_show_perf lib/dpif-netdev.c:799:17
    #3 0x1935f33 in dpif_netdev_pmd_info lib/dpif-netdev.c:1487:13
    #4 0x1938155 in pmd_perf_show_cmd lib/dpif-netdev.c:1537:5
    #5 0x1d7634b in process_command lib/unixctl.c:310:13
    #6 0x1d7634b in run_connection lib/unixctl.c:344:17
    #7 0x1d7634b in unixctl_server_run lib/unixctl.c:395:21
    #8 0x168b5bd in main vswitchd/ovs-vswitchd.c:130:9
    #9 0x7f6fa2fd8492 in __libc_start_main (/lib64/libc.so.6+0x23492)
    #10 0xe6911d in _start (vswitchd/ovs-vswitchd+0xe6911d)

 Reported-by: Ilya Maximets 
 Signed-off-by: Dumitru Ceara 
 ---
   lib/dynamic-string.c | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/lib/dynamic-string.c b/lib/dynamic-string.c
 index fd0127ed1740..3d415af4f4e0 100644
 --- a/lib/dynamic-string.c
 +++ b/lib/dynamic-string.c
 @@ -200,8 +200,8 @@ ds_put_strftime_msec(struct ds *ds, const char
 *template, long long int when,
     for (;;) {
   size_t avail = ds->string ? ds->allocated - ds->length + 1 : 0;
 -    size_t used = strftime_msec(>string[ds->length], avail,
 template,
 -    );
 +    char *dest = ds->string ? >string[ds->length] : NULL;
 +    size_t used = strftime_msec(dest, avail, template, );
   if (used) {
   ds->length += used;
   return;
>>>
>>> The code looks OK to me. However, calling strftime_msec(NULL, 0,...)
>>> once if the dynamic string is empty seems to me a bit obfuscated.
>>>
>>> I would personally just add:
>>>
>>> if (!ds->string) {
>>>  ds_reserve(ds, 64);
>>> }
>>>
>>> before the loop to mentally get rid of that corner case.
>>
>> Sure, that's fine too, I have no preference either way.  Ilya, what do
>> you think, does this need a v2?
> 
> Yeah, the ds_reserve() seems cleaner.  We don't really need to
> check for !ds->string before calling it though, IIUC.
> 
> With that we may also remove the check while calculating 'avail'.

Done, thanks for the suggestions.  I posted v2:

dynamic-string: Fix undefined behavior due to offsetting null pointer.

Thanks,
Dumitru

> 
> Best regards, Ilya Maximets.
> 
>>
>>>
>>> In any case:
>>>
>>> Acked-by: Adrian Moreno 
>>>
>>
>> Thanks for the review!
>>
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] dynamic-string: Fix undefined behavior due to offsetting null pointer.

2022-05-18 Thread Dumitru Ceara
When compiled with clang and '-fsanitize=undefined' set, running
'ovsdb-client --timestamp monitor Open_vSwitch' in a sandbox triggers
the following undefined behavior (flagged by UBSan):

  lib/dynamic-string.c:207:38: runtime error: applying zero offset to null 
pointer
  #0 0x4ebc18 in ds_put_strftime_msec lib/dynamic-string.c:207:38
  #1 0x4ebd04 in xastrftime_msec lib/dynamic-string.c:225:5
  #2 0x552e6a in table_format_timestamp__ lib/table.c:226:12
  #3 0x552852 in table_print_timestamp__ lib/table.c:233:27
  #4 0x5506f3 in table_print_table__ lib/table.c:254:5
  #5 0x550633 in table_format lib/table.c:601:9
  #6 0x5524f3 in table_print lib/table.c:633:5
  #7 0x44dc5e in monitor_print_table ovsdb/ovsdb-client.c:1019:5
  #8 0x44c650 in monitor_print ovsdb/ovsdb-client.c:1040:13
  #9 0x44ac56 in do_monitor__ ovsdb/ovsdb-client.c:1500:21
  #10 0x44636e in do_monitor ovsdb/ovsdb-client.c:1575:5
  #11 0x442c41 in main ovsdb/ovsdb-client.c:283:5

Reported-by: Ilya Maximets 
Signed-off-by: Dumitru Ceara 
---
v2:
- Use Adrian's suggestion.
- Fixed commit log to use the backtrace matching the ovsdb-client
  reproducer.
---
 lib/dynamic-string.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/dynamic-string.c b/lib/dynamic-string.c
index 6940e1fd63bd..3b4520f87c35 100644
--- a/lib/dynamic-string.c
+++ b/lib/dynamic-string.c
@@ -202,10 +202,11 @@ ds_put_strftime_msec(struct ds *ds, const char *template, 
long long int when,
 localtime_msec(when, );
 }
 
+ds_reserve(ds, 64);
 for (;;) {
-size_t avail = ds->string ? ds->allocated - ds->length + 1 : 0;
-size_t used = strftime_msec(>string[ds->length], avail, template,
-);
+size_t avail = ds->allocated - ds->length + 1;
+char *dest = >string[ds->length];
+size_t used = strftime_msec(dest, avail, template, );
 if (used) {
 ds->length += used;
 return;
-- 
2.27.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] tests: Properly kill ovsdb test processes

2022-05-18 Thread Ilya Maximets
On 5/10/22 20:33, Rosemarie O'Riorden wrote:
> The FreeBSD CI builds keep failing because processes of tests are not
> properly killed. This leaves the build hanging until it times out, and
> ultimately fails.
> 
> Changes name of pidfile pid2 to 2.pid so that the
> on_exit 'kill `cat *.pid`' will capture all pidfiles.
> 
> Fixes pidfile name logic in test that uses OVSDB_SERVER_SHUTDOWN_N, so
> that all pidfile names match the form *.pid.
> 
> Replaces unnecessary --pidfile="`pwd`"/pid with just --pidfile, because
> by default this argument creates a pidfile named .pid.
> 
> Removes extra [test ! -e pid || kill `cat pid`] that run upon AT_CHECK
> failure, because those processes are killed with on_exit. Also adds
> on_exit in tests where it was missing.
> 
> Fixes: 561205007e ("tests: Get rid of overly specific --pidfile and --unixctl 
> options.")
> Fixes: 0be15ad76f ("ovsdb-server.at: Add unit test for record/replay.")
> Fixes: 7964ffe7d2 ("ovsdb: relay: Add support for transaction forwarding.")
> Fixes: e879d33e83 ("ovsdb/jsonrpc-server: ovsdb-server closes accepted 
> connections immediately.")
> Fixes: 7ed9240836 ("ovsdb-client: Move ovsdb-client specific tests to new .at 
> file.")
> Signed-off-by: Rosemarie O'Riorden 
> ---
>  - v1 was an outdated work-in-progress
>  - Fixes OVSDB_SERVER_SHUTDOWN_N, pidfile names, removes unnecessary
>process kills.
> 
>  tests/ovsdb-client.at |   3 ++
>  tests/ovsdb-server.at | 110 ++
>  2 files changed, 51 insertions(+), 62 deletions(-)

This literally saves hours when applying patches.
Thanks for tracking all these processes down!

Applied and backported down to 2.13.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpif-netdev: Only hash port number when necessary.

2022-05-18 Thread Ilya Maximets
On 5/10/22 17:03, Mike Pattrick wrote:
> On Fri, Apr 29, 2022 at 11:44 AM Cian Ferriter  
> wrote:
>>
>> The hash of the port number is only needed when a DPCLS needs to be
>> created. Move the hash calculation inside the if to accomplish this.
>>
>> Signed-off-by: Cian Ferriter 
>> ---
> 
> Good catch!
> 
> Acked-by: Mike Pattrick 

I'd guess that compiler will produce identical code in most cases,
but the change makes the code a bit cleaner.

Applied.  Thanks!

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] ovs-save: Get highest ofp version error.

2022-05-18 Thread Ilya Maximets
On 5/11/22 10:53, Han Ding wrote:
> When setting just one ofp version to protocols of bridge, The function
> get_highest_ofp_version in ovs-save parse it error.
> 
> For example:
> $ ovs-vsctl get bridge br-int protocols
> [OpenFlow15]
> 
> $ ovs-vsctl get bridge br-int protocols |
>   sed 's/[][]//g' | sed 's/\ //g' | awk -F ',' '{ print (NF>1)? $(NF) : 
> "OpenFlow14" }'
> OpenFlow14
> 
> Signed-off-by: Han Ding 
> Acked-by: Adrian Moreno 
> ---
> 
> Notes:
> Change the author name to full name.
> 
>  utilities/ovs-save | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Applied.  Thanks!
Backported down to 2.16.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 0/7] Fix UndefinedBehaviorSanitizer reported issues and enable it in CI.

2022-05-18 Thread Ilya Maximets
On 4/11/22 13:37, Dumitru Ceara wrote:
> As privately reported by Aaron Conole, and by Jeffrey Walton [0]
> there's currently a number of undefined behavior instances in
> the OVS code base.  Running the OVS (and OVN) tests with UBSan [1]
> enabled uncovers these.
> 
> This series fixes the issues reported by UBSan and, through the last
> patch, enables UBSan tests in GitHub Actions CI.
> 
> [0] https://mail.openvswitch.org/pipermail/ovs-dev/2022-January/390894.html
> [1] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
> [2] 
> https://patchwork.ozlabs.org/project/openvswitch/list/?series=286494=*
> 
> Changes in v6:
> - Rebased.
> - Added acks from Aaron and/or Adrian to patches 3, 4, 5.
> - Added a new patch, 6/7, addressing another UB instance spotted when
>   running system tests.
> - Addresses Aaron's comments on patch 2.
> 
> Changes in v5:
> - Rebased, dropped patch 'util: Avoid UB when iterating collections.',
>   the proper fix from Adrian was accepted.
> - Addressed Adrian's comments.
> 
> Changes in v4:
> - Rebased, dropping patches that were already applied.
> - Addressed Ilya's comments.
> - Added acks from Paolo from v3 as nothing major changed in this revision.
> - Rephrased the commit message for "util: Avoid UB when iterating 
> collections."
> 
> Changes in v3:
> - Added acks to the patches acked by Aaron.
> - Addressed Aaron's comments.
> - Split previous patch 07/11 "ofp-actions: ofp-errors: Use aligned
>   structures when decoding ofp actions." into three separate patches
>   addressing independent issues.
> - Reordered patches such that the ones that might need follow up are
>   towards the end of the series.
> - Added a new patch, patch 13/14, fixing a typo in the CFLAGS_ASAN
>   variables in the script used for building OVS in CI.  This typo was
>   originally copy/pasted in the CFLAGS_UBSAN flags in v1 and v2 of
>   this series.
> 
> Changes in v2:
> - Patch 3/11:
>   - Remove cache line size aligment markers instead, as suggested by
> Ilya.
> 
> Dumitru Ceara (7):
>   treewide: Fix invalid bit shift operations.
>   treewide: Avoid offsetting NULL pointers.
>   ofp-actions: Ensure aligned accesses to masked fields.
>   ofp-errors: Ensure parsed OFPT_ERROR messages are properly aligned.
>   ofp-actions: Use aligned structures when decoding ofp actions.
>   netdev-linux: Properly access 32-bit aligned rtnl_link_stats64 structs.
>   ci: Add UB Sanitizer.
> 
> 
>  .ci/linux-build.sh   |6 ++
>  .github/workflows/build-and-test.yml |5 ++
>  configure.ac |1 
>  include/openvswitch/ofp-actions.h|   15 +++--
>  include/openvswitch/ofpbuf.h |   24 ++-
>  include/openvswitch/util.h   |3 +
>  lib/dpif-netlink.c   |2 -
>  lib/dynamic-string.c |8 ++
>  lib/meta-flow.c  |4 +
>  lib/netdev-linux.c   |9 ++-
>  lib/ofp-actions.c|  111 
> ++
>  lib/ofp-errors.c |2 +
>  lib/ofpbuf.c |   43 +
>  lib/ovsdb-data.c |   37 ++-
>  lib/ovsdb-data.h |4 +
>  lib/sset.c   |4 +
>  lib/tnl-ports.c  |2 -
>  ofproto/ofproto-dpif-xlate.c |3 +
>  ofproto/ofproto.c|2 -
>  tests/atlocal.in |   16 +
>  tests/automake.mk|1 
>  tests/daemon.at  |8 ++
>  tests/ovs-macros.at  |5 ++
>  tests/ovs-ofctl.at   |2 -
>  tests/ovsdb-server.at|   16 +
>  tests/test-hash.c|3 +
>  tests/test-util.c|   13 ++--
>  27 files changed, 279 insertions(+), 70 deletions(-)
> 
> 

Applied.  Thanks!

Unfortunately, at least one of ofp-* patches breaks ABI, so we can't
backport them to stable branches.  For example, they will almost
certainly break OVN if it will be linked with libopenvswitch dynamically.
I backported treewide and netdev-linux patches though.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v4] northd: add configuration option for enabling parallelization

2022-05-18 Thread 0-day Robot
Bleep bloop.  Greetings Xavier Simonart, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line lacks whitespace around operator
WARNING: Line lacks whitespace around operator
#1073 FILE: northd/ovn-northd.c:533:
  --n-threads=N specify number of threads\n\

Lines checked: 1406, Warnings: 2, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v4] northd: add configuration option for enabling parallelization

2022-05-18 Thread Xavier Simonart
This patch is intended to change the way to enable northd lflow build
parallelization, as well as enable runtime change of number of threads.
Before this patch, the following was needed to use parallelization:
- enable parallelization through use_parallel_build in NBDB
- use --dummy-numa to select number of threads.
This second part was needed as otherwise as many threads as cores in the system
were used, while parallelization showed some performance improvement only until
using around 4 (or maybe 8) threads.

With this patch, the number of threads used for lflow parallel build can be
specified either:
- at startup, using --n-threads= as ovn-northd command line option
- using unixctl
If the number of threads specified is > 1, then parallelization is enabled.
If the number is 1, parallelization is disabled.
If the number is < 1, parallelization is disabled at startup and a warning
is logged.
If the number is > 256, parallelization is enabled (with 256 threads) and
a warning is logged.

The following unixctl have been added:
- set-n-threads : set the number of treads used.
- get-n-threads: returns the number of threads used
If the number of threads is within <2-256> bounds, parallelization is enabled.
If the number of thread is 1, parallelization is disabled.
Otherwise an error is thrown.

Note that, if set-n-threads failed for any reason (e.g. failure to setup some
semaphore), parallelization is disabled, and get-n-thread will return 1.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2078552
Signed-off-by: Xavier Simonart 

---
v2:  - handled Dumitru's comments
 - added missing mutex_destroy
 - fixed issue when use_logical_dp_group is enabled after northd startup
 - rebased on top of main
v3:
 - fix mutex_destroy issue
v4:
 - handled Mark's comments
 - rebased on top of main
---
 NEWS  |   7 +
 lib/ovn-parallel-hmap.c   | 291 +-
 lib/ovn-parallel-hmap.h   |  30 ++--
 northd/northd.c   | 176 ---
 northd/northd.h   |   1 +
 northd/ovn-northd-ddlog.c |   6 -
 northd/ovn-northd.8.xml   |  70 +
 northd/ovn-northd.c   |  68 -
 tests/ovn-macros.at   |  59 ++--
 tests/ovn-northd.at   | 109 ++
 10 files changed, 495 insertions(+), 322 deletions(-)

diff --git a/NEWS b/NEWS
index 244824e3f..6e489df32 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,13 @@ Post v22.03.0
 implicit drop behavior on logical switches with ACLs applied.
   - Support (LSP.options:qos_min_rate) to guarantee minimal bandwidth available
 for a logical port.
+  - Changed the way to enable northd parallelization.
+Removed support for:
+- use_parallel_build in NBDB.
+- --dummy-numa in northd cmdline.
+Added support for:
+-  --n-threads= in northd cmdline.
+- set-n-threads/get-n-threads unixctls.
 
 OVN v22.03.0 - 11 Mar 2022
 --
diff --git a/lib/ovn-parallel-hmap.c b/lib/ovn-parallel-hmap.c
index 7edc4c0b6..c4d8cee16 100644
--- a/lib/ovn-parallel-hmap.c
+++ b/lib/ovn-parallel-hmap.c
@@ -38,14 +38,10 @@ VLOG_DEFINE_THIS_MODULE(ovn_parallel_hmap);
 
 #ifndef OVS_HAS_PARALLEL_HMAP
 
-#define WORKER_SEM_NAME "%x-%p-%x"
+#define WORKER_SEM_NAME "%x-%p-%"PRIxSIZE
 #define MAIN_SEM_NAME "%x-%p-main"
 
-/* These are accessed under mutex inside add_worker_pool().
- * They do not need to be atomic.
- */
 static atomic_bool initial_pool_setup = ATOMIC_VAR_INIT(false);
-static bool can_parallelize = false;
 
 /* This is set only in the process of exit and the set is
  * accompanied by a fence. It does not need to be atomic or be
@@ -57,18 +53,18 @@ static struct ovs_list worker_pools = 
OVS_LIST_INITIALIZER(_pools);
 
 static struct ovs_mutex init_mutex = OVS_MUTEX_INITIALIZER;
 
-static int pool_size;
+static size_t pool_size = 1;
 
 static int sembase;
 
 static void worker_pool_hook(void *aux OVS_UNUSED);
-static void setup_worker_pools(bool force);
+static void setup_worker_pools(void);
 static void merge_list_results(struct worker_pool *pool OVS_UNUSED,
void *fin_result, void *result_frags,
-   int index);
+   size_t index);
 static void merge_hash_results(struct worker_pool *pool OVS_UNUSED,
void *fin_result, void *result_frags,
-   int index);
+   size_t index);
 
 bool
 ovn_stop_parallel_processing(void)
@@ -76,107 +72,184 @@ ovn_stop_parallel_processing(void)
 return workers_must_exit;
 }
 
-bool
-ovn_can_parallelize_hashes(bool force_parallel)
+size_t
+ovn_get_worker_pool_size(void)
 {
-bool test = false;
+return pool_size;
+}
 
-if (atomic_compare_exchange_strong(
-_pool_setup,
-,
-true)) {
-ovs_mutex_lock(_mutex);
-setup_worker_pools(force_parallel);
-ovs_mutex_unlock(_mutex);
+static 

Re: [ovs-dev] [PATCH v2] tests/mfex: Improve pcap script for mfex tests.

2022-05-18 Thread Amber, Kumar
Hi Eelco,

Please find my replies Inline.

> -Original Message-
> From: Eelco Chaudron 
> Sent: Thursday, May 12, 2022 2:53 PM
> To: Amber, Kumar 
> Cc: ovs-dev@openvswitch.org; Ferriter, Cian ;
> Stokes, Ian ; Van Haaren, Harry
> ; Ilya Maximets 
> Subject: Re: [PATCH v2] tests/mfex: Improve pcap script for mfex tests.
> 
> 
> 
> On 11 May 2022, at 18:28, Kumar Amber wrote:
> 
> > The mfex pcap generation script is improved for varied length traffic
> > and also removes the hard coded mfex_pcap and instead uses the script
> > itself to generate complex traffic patterns for testing.
> 
> I did not follow the discussion around the fuzzy testing being part of the 
> unit
> tests, so I’ll only express my concerns below.
> 
 The discussion on this led to a patch-set which introduces a new way to test 
in an integrated way to include flow
based testing of AVX512 mfex as it happens for the scalar mfex in the testing 
setup, the new method has its own merits and these .at based tests have their 
own merits and we feel having both will strengthen the AVX512 data-path testing 
both at functional level and in an integrated level as well.

The new Testing patch is under review and can be found here: 
http://patchwork.ozlabs.org/project/openvswitch/list/?series=297158

On the Sidelines I feel its good to have the infrastructure ready to enable the 
test in CI, even if we choose
Not to enable it and considering its just adding some warnings which can be 
ignored.

> > Signed-off-by: Kumar Amber 
> > Acked-by: Cian Ferriter 
> >
> > ---
> > v2:
> > - Add huge page test-skip.
> > - Change core id to 3 to 0 to allow the mfex config test-case
> >   to run on any system.
> > ---
> > ---
> >  tests/automake.mk |   1 -
> >  tests/mfex_fuzzy.py   |  66 +++---
> >  tests/pcap/mfex_test.pcap | Bin 416 -> 0 bytes
> >  tests/system-dpdk.at  |  44 +
> >  4 files changed, 77 insertions(+), 34 deletions(-)  delete mode
> > 100644 tests/pcap/mfex_test.pcap
> >
> > diff --git a/tests/automake.mk b/tests/automake.mk index
> > 34ddda6aa..204e86fac 100644
> > --- a/tests/automake.mk
> > +++ b/tests/automake.mk
> > @@ -146,7 +146,6 @@ $(srcdir)/tests/fuzz-regression-list.at:
> > tests/automake.mk
> >
> >  EXTRA_DIST += $(MFEX_AUTOVALIDATOR_TESTS)
> MFEX_AUTOVALIDATOR_TESTS =
> > \
> > -   tests/pcap/mfex_test.pcap \
> > tests/mfex_fuzzy.py
> >
> >  OVSDB_CLUSTER_TESTSUITE_AT = \
> > diff --git a/tests/mfex_fuzzy.py b/tests/mfex_fuzzy.py index
> > 3efe1152d..dbde5fe1b 100755
> > --- a/tests/mfex_fuzzy.py
> > +++ b/tests/mfex_fuzzy.py
> > @@ -3,30 +3,58 @@
> >  import sys
> >
> >  from scapy.all import RandMAC, RandIP, PcapWriter, RandIP6,
> > RandShort, fuzz -from scapy.all import IPv6, Dot1Q, IP, Ether, UDP,
> > TCP
> > +from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP, random
> >
> > +# Relative path for the pcap file location.
> >  path = str(sys.argv[1]) + "/pcap/fuzzy.pcap"
> > +# The number of packets generated will be size * 8.
> > +size = int(sys.argv[2])
> > +# Traffic option is used to choose between fuzzy or simple packet type.
> > +traffic_opt = str(sys.argv[3])
> 
> Think here we should check for the existence of sys.argv[3], and if not just
> initialize traffic_opt as “” so we do not have to supply a dummy “0” option.
> 

Done in V3.

> > +
> >  pktdump = PcapWriter(path, append=False, sync=True)
> >
> > -for i in range(0, 2000):
> > +pkt = []
> > +
> > +for i in range(0, size):
> >
> > -# Generate random protocol bases, use a fuzz() over the combined
> packet
> > -# for full fuzzing.
> >  eth = Ether(src=RandMAC(), dst=RandMAC())
> >  vlan = Dot1Q()
> > -ipv4 = IP(src=RandIP(), dst=RandIP())
> > -ipv6 = IPv6(src=RandIP6(), dst=RandIP6())
> >  udp = UDP(dport=RandShort(), sport=RandShort())
> > -tcp = TCP(dport=RandShort(), sport=RandShort())
> > -
> > -# IPv4 packets with fuzzing
> > -pktdump.write(fuzz(eth / ipv4 / udp))
> > -pktdump.write(fuzz(eth / ipv4 / tcp))
> > -pktdump.write(fuzz(eth / vlan / ipv4 / udp))
> > -pktdump.write(fuzz(eth / vlan / ipv4 / tcp))
> > -
> > -# IPv6 packets with fuzzing
> > -pktdump.write(fuzz(eth / ipv6 / udp))
> > -pktdump.write(fuzz(eth / ipv6 / tcp))
> > -pktdump.write(fuzz(eth / vlan / ipv6 / udp))
> > -pktdump.write(fuzz(eth / vlan / ipv6 / tcp))
> > +
> > +if traffic_opt == "fuzzy":
> > +
> > +ipv4 = IP(src=RandIP(), dst=RandIP(), len=random.randint(0, 100))
> > +ipv6 = IPv6(src=RandIP6(), dst=RandIP6(), plen=random.randint(0,
> 100))
> > +tcp = TCP(dport=RandShort(), sport=RandShort(), flags='S',
> > +  dataofs=random.randint(0, 20))
> 
> dataof is 4 bits, why go to 20? Do we want more of the lower values?
> 

Reduced the value to 0-15.

> > +# IPv4 packets with fuzzing
> > +pkt.append(fuzz(eth / ipv4 / udp))
> > +pkt.append(fuzz(eth / ipv4 / tcp))
> > +

[ovs-dev] [PATCH v3] tests/mfex: Improve pcap script for mfex tests.

2022-05-18 Thread Kumar Amber
The mfex pcap generation script is improved for varied length
traffic and also removes the hard coded mfex_pcap and instead uses
the script itself to generate complex traffic patterns for testing.

Signed-off-by: Kumar Amber 
Acked-by: Cian Ferriter 

---
v3:
- Fix comments(Eelco).
- Script generates mac/ip/l4_ports in a fixed range.
v2:
- Add huge page test-skip.
- Change core id to 3 to 0 to allow the mfex config test-case
  to run on any system.
---
---
 tests/automake.mk |   1 -
 tests/mfex_fuzzy.py   |  80 +++---
 tests/pcap/mfex_test.pcap | Bin 416 -> 0 bytes
 tests/system-dpdk.at  |  62 ++---
 4 files changed, 96 insertions(+), 47 deletions(-)
 delete mode 100644 tests/pcap/mfex_test.pcap

diff --git a/tests/automake.mk b/tests/automake.mk
index 34ddda6aa..204e86fac 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -146,7 +146,6 @@ $(srcdir)/tests/fuzz-regression-list.at: tests/automake.mk
 
 EXTRA_DIST += $(MFEX_AUTOVALIDATOR_TESTS)
 MFEX_AUTOVALIDATOR_TESTS = \
-   tests/pcap/mfex_test.pcap \
tests/mfex_fuzzy.py
 
 OVSDB_CLUSTER_TESTSUITE_AT = \
diff --git a/tests/mfex_fuzzy.py b/tests/mfex_fuzzy.py
index 3efe1152d..5a15c49e1 100755
--- a/tests/mfex_fuzzy.py
+++ b/tests/mfex_fuzzy.py
@@ -3,30 +3,64 @@
 import sys
 
 from scapy.all import RandMAC, RandIP, PcapWriter, RandIP6, RandShort, fuzz
-from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP
+from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP, random
 
+# Relative path for the pcap file location.
 path = str(sys.argv[1]) + "/pcap/fuzzy.pcap"
+# The number of packets generated will be size * 8.
+size = int(sys.argv[2])
+# Traffic option is used to choose between fuzzy or simple packet type.
+if (len(sys.argv) > 3):
+traffic_opt = str(sys.argv[3])
+else:
+traffic_opt = ""
+
 pktdump = PcapWriter(path, append=False, sync=True)
 
-for i in range(0, 2000):
-
-# Generate random protocol bases, use a fuzz() over the combined packet
-# for full fuzzing.
-eth = Ether(src=RandMAC(), dst=RandMAC())
-vlan = Dot1Q()
-ipv4 = IP(src=RandIP(), dst=RandIP())
-ipv6 = IPv6(src=RandIP6(), dst=RandIP6())
-udp = UDP(dport=RandShort(), sport=RandShort())
-tcp = TCP(dport=RandShort(), sport=RandShort())
-
-# IPv4 packets with fuzzing
-pktdump.write(fuzz(eth / ipv4 / udp))
-pktdump.write(fuzz(eth / ipv4 / tcp))
-pktdump.write(fuzz(eth / vlan / ipv4 / udp))
-pktdump.write(fuzz(eth / vlan / ipv4 / tcp))
-
-# IPv6 packets with fuzzing
-pktdump.write(fuzz(eth / ipv6 / udp))
-pktdump.write(fuzz(eth / ipv6 / tcp))
-pktdump.write(fuzz(eth / vlan / ipv6 / udp))
-pktdump.write(fuzz(eth / vlan / ipv6 / tcp))
+pkt = []
+
+for i in range(0, size):
+if traffic_opt == "fuzzy":
+
+eth = Ether(src=RandMAC(), dst=RandMAC())
+vlan = Dot1Q()
+udp = UDP(dport=RandShort(), sport=RandShort())
+ipv4 = IP(src=RandIP(), dst=RandIP(), len=random.randint(0, 100))
+ipv6 = IPv6(src=RandIP6(), dst=RandIP6(), plen=random.randint(0, 100))
+tcp = TCP(dport=RandShort(), sport=RandShort(), flags='S',
+  dataofs=random.randint(0, 15))
+# IPv4 packets with fuzzing
+pkt.append(fuzz(eth / ipv4 / udp))
+pkt.append(fuzz(eth / ipv4 / tcp))
+pkt.append(fuzz(eth / vlan / ipv4 / udp))
+pkt.append(fuzz(eth / vlan / ipv4 / tcp))
+
+# IPv6 packets with fuzzing
+pkt.append(fuzz(eth / ipv6 / udp))
+pkt.append(fuzz(eth / ipv6 / tcp))
+pkt.append(fuzz(eth / vlan / ipv6 / udp))
+pkt.append(fuzz(eth / vlan / ipv6 / tcp))
+
+else:
+mac_addr = "52:54:00:FF:FF:%02x" % (random.randint(0, 255),)
+src_port = random.randrange(600, 800)
+dst_port = random.randrange(800, 1000)
+eth = Ether(src=mac_addr, dst=mac_addr)
+vlan = Dot1Q(vlan=random.randrange(1, 20))
+udp = UDP(dport=src_port, sport=dst_port)
+ipv4 = IP(src=RandIP()._fix(), dst=RandIP()._fix())
+ipv6 = IPv6(src=RandIP6()._fix(), dst=RandIP6()._fix())
+tcp = TCP(dport=src_port, sport=dst_port, flags='S')
+# IPv4 packets
+pkt.append(eth / ipv4 / udp)
+pkt.append(eth / ipv4 / tcp)
+pkt.append(eth / vlan / ipv4 / udp)
+pkt.append(eth / vlan / ipv4 / tcp)
+
+# IPv6 packets
+pkt.append(eth / ipv6 / udp)
+pkt.append(eth / ipv6 / tcp)
+pkt.append(eth / vlan / ipv6 / udp)
+pkt.append(eth / vlan / ipv6 / tcp)
+
+pktdump.write(pkt)
diff --git a/tests/pcap/mfex_test.pcap b/tests/pcap/mfex_test.pcap
deleted file mode 100644
index 
1aac67b8d643ecb016c758cba4cc32212a80f52a..
GIT binary patch
literal 0
HcmV?d1

literal 416
zcmca|c+)~A1{MYw`2U}Qff2}QK`M68ITRa|G@yFii5$Gfk6YL%z>@uY&}o|

Re: [ovs-dev] [PATCH v5 5/5] acinclude: Add seperate checks for AVX512 ISA.

2022-05-18 Thread Ferriter, Cian



> -Original Message-
> From: Pai G, Sunil 
> Sent: Tuesday 17 May 2022 15:52
> To: Ferriter, Cian ; ovs-dev@openvswitch.org
> Cc: Van Haaren, Harry 
> Subject: RE: [PATCH v5 5/5] acinclude: Add seperate checks for AVX512 ISA.
> 
> > Checking for each of the required AVX512 ISA separately will allow the
> > compiler to generate some AVX512 code where there is some support in the
> > compiler rather than only generating all AVX512 code when all of it is
> > supported or no AVX512 code at all.
> >
> > For example, in GCC 4.9 where there is just support for AVX512F, this
> > patch will allow building the AVX512 DPIF.
> >
> > Another example, in GCC 5 and 6, most AVX512 code can be generated, just
> > without AVX512VPOPCNTDQ support.
> >
> > Signed-off-by: Cian Ferriter 
> >
> > ---
> > v5:
> > * Create a selector function for the permutexvar implementations based
> >   on Sunil's feedback on the v4.  This hides the complexity of compile
> >   time and run time selection of permutexvar implementations.
> > * Add a comment explaining why VPOPCNTDQ_TARGET is defined and used.
> >
> > v4:
> > * Combine the 3 commits which added checks for AVX512 ISA into this
> >   single commit since the first 2 commits were only useful and active
> >   when the 3rd commit was applied. This also takes care of Sunil's
> >   comment about explaining that the first 2 commits are precursors.
> > * Don't check for AVX512DQ availability in the compiler. This ISA isn't
> >   used in OVS.
> > * Put all AVX512 ISA checks in the OVS_CHECK_AVX512 macro as per Sunil's
> >   feedback.
> > * Define a function in acinclude.m4, (OVS_CONDITIONAL_CC_OPTION_DEFINE),
> >   to help with checking for AVX512 ISA support in the compiler.
> > * Remove the '__AVX512VPOPCNTDQ__' check. Use the HAVE_AVX512* pattern
> >   consistently with all AVX512 ISA checks instead. Fixup the comment
> >   explaining the _mm512_popcnt_epi64_wrapper() function to reflect this.
> 
> 
> Based on the comments and responses to v4 and changes implemented in v5 ,
> 
> Acked-by: Sunil Pai G 

Thanks for the reviews on this and other patches in the series Sunil!
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v4 7/8] netdev-offload-tc: Offloading rules with police actions

2022-05-18 Thread Eelco Chaudron


On 18 May 2022, at 9:46, Jianbo Liu wrote:

> On Wed, 2022-05-18 at 09:19 +0200, Eelco Chaudron wrote:
>>
>>
>> On 18 May 2022, at 8:54, Jianbo Liu wrote:
>>
>>> On Wed, 2022-05-18 at 08:25 +0200, Eelco Chaudron wrote:


 On 18 May 2022, at 5:27, Jianbo Liu wrote:

> On Mon, 2022-05-16 at 13:34 +0200, Eelco Chaudron wrote:
>>
>>
>> On 3 May 2022, at 5:08, Jianbo Liu via dev wrote:
>>
>>> When offloading rule, tc should be filled with police
>>> index,
>>> instead
>>> of meter id. As meter is mapped to police action, and the
>>> mapping is
>>> inserted into meter_id_to_police_idx hmap, this hmap is
>>> used to
>>> find
>>> the police index. Besides, an action cookie is added to
>>> save
>>> meter id
>>> on rule creation, so meter id can be retrieved from the
>>> cookie,
>>> and
>>> pass to dpif while dumping rules.
>>>
>>> Signed-off-by: Jianbo Liu 
>>> ---
>>>  lib/netdev-offload-tc.c | 16 +++-
>>>  lib/tc.c    | 25 -
>>>  2 files changed, 39 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-
>>> tc.c
>>> index bded4bc8c..fa3e8113b 100644
>>> --- a/lib/netdev-offload-tc.c
>>> +++ b/lib/netdev-offload-tc.c
>>> @@ -169,6 +169,8 @@ static struct netlink_field
>>> set_flower_map[][4] =
>>> {
>>>  },
>>>  };
>>>
>>> +static int meter_id_lookup(uint32_t meter_id, uint32_t
>>> *police_idx);
>>> +
>>
>> Maybe move this up to the place where you also define the
>> meter
>> mutex/ids?
>>
>>>  static struct ovs_mutex ufid_lock = OVS_MUTEX_INITIALIZER;
>>>
>>>  /**
>>> @@ -1038,7 +1040,8 @@ parse_tc_flower_to_match(struct
>>> tc_flower
>>> *flower,
>>>  }
>>>  break;
>>>  case TC_ACT_POLICE: {
>>> -    /* Not supported yet */
>>> +    nl_msg_put_u32(buf, OVS_ACTION_ATTR_METER,
>>> +   action->police.meter_id);
>>>  }
>>>  break;
>>>  }
>>> @@ -1961,6 +1964,17 @@ netdev_tc_flow_put(struct netdev
>>> *netdev,
>>> struct match *match,
>>>  action->type = TC_ACT_GOTO;
>>>  action->chain = 0;  /* 0 is reserved and not
>>> used
>>> by
>>> recirc. */
>>>  flower.action_count++;
>>> +    } else if (nl_attr_type(nla) ==
>>> OVS_ACTION_ATTR_METER)
>>> {
>>> +    uint32_t police_index;
>>> +
>>> +    action->type = TC_ACT_POLICE;
>>> +    action->police.meter_id =
>>> nl_attr_get_u32(nla);
>>> +    if (meter_id_lookup(action->police.meter_id,
>>> _index)) {
>>> +    return EOPNOTSUPP;
>>> +    }
>>> +
>>> +    action->police.index = police_index;
>>> +    flower.action_count++;
>>>  } else {
>>>  VLOG_DBG_RL(, "unsupported put action type:
>>> %d",
>>>  nl_attr_type(nla));
>>> diff --git a/lib/tc.c b/lib/tc.c
>>> index 16d4ae3da..ecdcb676d 100644
>>> --- a/lib/tc.c
>>> +++ b/lib/tc.c
>>> @@ -2421,6 +2421,22 @@ nl_msg_put_act_gact(struct ofpbuf
>>> *request,
>>> uint32_t chain)
>>>  nl_msg_end_nested(request, offset);
>>>  }
>>>
>>> +static void
>>> +nl_msg_put_act_police_index(struct ofpbuf *request,
>>> uint32_t
>>> police_idx)
>>> +{
>>> +    struct tc_police police;
>>> +    size_t offset;
>>> +
>>> +    memset(, 0, sizeof police);
>>> +    police.index = police_idx;
>>> +
>>> +    nl_msg_put_string(request, TCA_ACT_KIND, "police");
>>> +    offset = nl_msg_start_nested(request,
>>> TCA_ACT_OPTIONS);
>>> +    nl_msg_put_unspec(request, TCA_POLICE_TBF, ,
>>> sizeof
>>> police);
>>> +    nl_msg_put_u32(request, TCA_POLICE_RESULT,
>>> TC_ACT_PIPE);
>>> +    nl_msg_end_nested(request, offset);
>>> +}
>>> +
>>>  static void
>>>  nl_msg_put_act_ct(struct ofpbuf *request, struct tc_action
>>> *action)
>>>  {
>>> @@ -2911,7 +2927,14 @@ nl_msg_put_flower_acts(struct ofpbuf
>>> *request,
>>> struct tc_flower *flower)
>>>  }
>>>  break;
>>>  case TC_ACT_POLICE: {
>>> -    /* Not supported yet */
>>> +    struct tc_cookie act_cookie;
>>> +
>>> +    act_offset = nl_msg_start_nested(request,
>>> act_index++);
>>> +    nl_msg_put_act_police_index(request,
>>> action-
 police.index);
>>> +    act_cookie.data = 
 police.meter_id;
>>> +    act_cookie.len = sizeof(action-
 

Re: [ovs-dev] [v4 7/8] netdev-offload-tc: Offloading rules with police actions

2022-05-18 Thread Jianbo Liu via dev
On Wed, 2022-05-18 at 09:19 +0200, Eelco Chaudron wrote:
> 
> 
> On 18 May 2022, at 8:54, Jianbo Liu wrote:
> 
> > On Wed, 2022-05-18 at 08:25 +0200, Eelco Chaudron wrote:
> > > 
> > > 
> > > On 18 May 2022, at 5:27, Jianbo Liu wrote:
> > > 
> > > > On Mon, 2022-05-16 at 13:34 +0200, Eelco Chaudron wrote:
> > > > > 
> > > > > 
> > > > > On 3 May 2022, at 5:08, Jianbo Liu via dev wrote:
> > > > > 
> > > > > > When offloading rule, tc should be filled with police
> > > > > > index,
> > > > > > instead
> > > > > > of meter id. As meter is mapped to police action, and the
> > > > > > mapping is
> > > > > > inserted into meter_id_to_police_idx hmap, this hmap is
> > > > > > used to
> > > > > > find
> > > > > > the police index. Besides, an action cookie is added to
> > > > > > save
> > > > > > meter id
> > > > > > on rule creation, so meter id can be retrieved from the
> > > > > > cookie,
> > > > > > and
> > > > > > pass to dpif while dumping rules.
> > > > > > 
> > > > > > Signed-off-by: Jianbo Liu 
> > > > > > ---
> > > > > >  lib/netdev-offload-tc.c | 16 +++-
> > > > > >  lib/tc.c    | 25 -
> > > > > >  2 files changed, 39 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-
> > > > > > tc.c
> > > > > > index bded4bc8c..fa3e8113b 100644
> > > > > > --- a/lib/netdev-offload-tc.c
> > > > > > +++ b/lib/netdev-offload-tc.c
> > > > > > @@ -169,6 +169,8 @@ static struct netlink_field
> > > > > > set_flower_map[][4] =
> > > > > > {
> > > > > >  },
> > > > > >  };
> > > > > > 
> > > > > > +static int meter_id_lookup(uint32_t meter_id, uint32_t
> > > > > > *police_idx);
> > > > > > +
> > > > > 
> > > > > Maybe move this up to the place where you also define the
> > > > > meter
> > > > > mutex/ids?
> > > > > 
> > > > > >  static struct ovs_mutex ufid_lock = OVS_MUTEX_INITIALIZER;
> > > > > > 
> > > > > >  /**
> > > > > > @@ -1038,7 +1040,8 @@ parse_tc_flower_to_match(struct
> > > > > > tc_flower
> > > > > > *flower,
> > > > > >  }
> > > > > >  break;
> > > > > >  case TC_ACT_POLICE: {
> > > > > > -    /* Not supported yet */
> > > > > > +    nl_msg_put_u32(buf, OVS_ACTION_ATTR_METER,
> > > > > > +   action->police.meter_id);
> > > > > >  }
> > > > > >  break;
> > > > > >  }
> > > > > > @@ -1961,6 +1964,17 @@ netdev_tc_flow_put(struct netdev
> > > > > > *netdev,
> > > > > > struct match *match,
> > > > > >  action->type = TC_ACT_GOTO;
> > > > > >  action->chain = 0;  /* 0 is reserved and not
> > > > > > used
> > > > > > by
> > > > > > recirc. */
> > > > > >  flower.action_count++;
> > > > > > +    } else if (nl_attr_type(nla) ==
> > > > > > OVS_ACTION_ATTR_METER)
> > > > > > {
> > > > > > +    uint32_t police_index;
> > > > > > +
> > > > > > +    action->type = TC_ACT_POLICE;
> > > > > > +    action->police.meter_id =
> > > > > > nl_attr_get_u32(nla);
> > > > > > +    if (meter_id_lookup(action->police.meter_id,
> > > > > > _index)) {
> > > > > > +    return EOPNOTSUPP;
> > > > > > +    }
> > > > > > +
> > > > > > +    action->police.index = police_index;
> > > > > > +    flower.action_count++;
> > > > > >  } else {
> > > > > >  VLOG_DBG_RL(, "unsupported put action type:
> > > > > > %d",
> > > > > >  nl_attr_type(nla));
> > > > > > diff --git a/lib/tc.c b/lib/tc.c
> > > > > > index 16d4ae3da..ecdcb676d 100644
> > > > > > --- a/lib/tc.c
> > > > > > +++ b/lib/tc.c
> > > > > > @@ -2421,6 +2421,22 @@ nl_msg_put_act_gact(struct ofpbuf
> > > > > > *request,
> > > > > > uint32_t chain)
> > > > > >  nl_msg_end_nested(request, offset);
> > > > > >  }
> > > > > > 
> > > > > > +static void
> > > > > > +nl_msg_put_act_police_index(struct ofpbuf *request,
> > > > > > uint32_t
> > > > > > police_idx)
> > > > > > +{
> > > > > > +    struct tc_police police;
> > > > > > +    size_t offset;
> > > > > > +
> > > > > > +    memset(, 0, sizeof police);
> > > > > > +    police.index = police_idx;
> > > > > > +
> > > > > > +    nl_msg_put_string(request, TCA_ACT_KIND, "police");
> > > > > > +    offset = nl_msg_start_nested(request,
> > > > > > TCA_ACT_OPTIONS);
> > > > > > +    nl_msg_put_unspec(request, TCA_POLICE_TBF, ,
> > > > > > sizeof
> > > > > > police);
> > > > > > +    nl_msg_put_u32(request, TCA_POLICE_RESULT,
> > > > > > TC_ACT_PIPE);
> > > > > > +    nl_msg_end_nested(request, offset);
> > > > > > +}
> > > > > > +
> > > > > >  static void
> > > > > >  nl_msg_put_act_ct(struct ofpbuf *request, struct tc_action
> > > > > > *action)
> > > > > >  {
> > > > > > @@ -2911,7 +2927,14 @@ nl_msg_put_flower_acts(struct ofpbuf
> > > > > > *request,
> > > > > > struct tc_flower *flower)
> > > > > >  }
> > > > > >  break;
> > > > > 

Re: [ovs-dev] [v4 7/8] netdev-offload-tc: Offloading rules with police actions

2022-05-18 Thread Eelco Chaudron


On 18 May 2022, at 8:54, Jianbo Liu wrote:

> On Wed, 2022-05-18 at 08:25 +0200, Eelco Chaudron wrote:
>>
>>
>> On 18 May 2022, at 5:27, Jianbo Liu wrote:
>>
>>> On Mon, 2022-05-16 at 13:34 +0200, Eelco Chaudron wrote:


 On 3 May 2022, at 5:08, Jianbo Liu via dev wrote:

> When offloading rule, tc should be filled with police index,
> instead
> of meter id. As meter is mapped to police action, and the
> mapping is
> inserted into meter_id_to_police_idx hmap, this hmap is used to
> find
> the police index. Besides, an action cookie is added to save
> meter id
> on rule creation, so meter id can be retrieved from the cookie,
> and
> pass to dpif while dumping rules.
>
> Signed-off-by: Jianbo Liu 
> ---
>  lib/netdev-offload-tc.c | 16 +++-
>  lib/tc.c    | 25 -
>  2 files changed, 39 insertions(+), 2 deletions(-)
>
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index bded4bc8c..fa3e8113b 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -169,6 +169,8 @@ static struct netlink_field
> set_flower_map[][4] =
> {
>  },
>  };
>
> +static int meter_id_lookup(uint32_t meter_id, uint32_t
> *police_idx);
> +

 Maybe move this up to the place where you also define the meter
 mutex/ids?

>  static struct ovs_mutex ufid_lock = OVS_MUTEX_INITIALIZER;
>
>  /**
> @@ -1038,7 +1040,8 @@ parse_tc_flower_to_match(struct tc_flower
> *flower,
>  }
>  break;
>  case TC_ACT_POLICE: {
> -    /* Not supported yet */
> +    nl_msg_put_u32(buf, OVS_ACTION_ATTR_METER,
> +   action->police.meter_id);
>  }
>  break;
>  }
> @@ -1961,6 +1964,17 @@ netdev_tc_flow_put(struct netdev
> *netdev,
> struct match *match,
>  action->type = TC_ACT_GOTO;
>  action->chain = 0;  /* 0 is reserved and not used
> by
> recirc. */
>  flower.action_count++;
> +    } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_METER)
> {
> +    uint32_t police_index;
> +
> +    action->type = TC_ACT_POLICE;
> +    action->police.meter_id = nl_attr_get_u32(nla);
> +    if (meter_id_lookup(action->police.meter_id,
> _index)) {
> +    return EOPNOTSUPP;
> +    }
> +
> +    action->police.index = police_index;
> +    flower.action_count++;
>  } else {
>  VLOG_DBG_RL(, "unsupported put action type:
> %d",
>  nl_attr_type(nla));
> diff --git a/lib/tc.c b/lib/tc.c
> index 16d4ae3da..ecdcb676d 100644
> --- a/lib/tc.c
> +++ b/lib/tc.c
> @@ -2421,6 +2421,22 @@ nl_msg_put_act_gact(struct ofpbuf
> *request,
> uint32_t chain)
>  nl_msg_end_nested(request, offset);
>  }
>
> +static void
> +nl_msg_put_act_police_index(struct ofpbuf *request, uint32_t
> police_idx)
> +{
> +    struct tc_police police;
> +    size_t offset;
> +
> +    memset(, 0, sizeof police);
> +    police.index = police_idx;
> +
> +    nl_msg_put_string(request, TCA_ACT_KIND, "police");
> +    offset = nl_msg_start_nested(request, TCA_ACT_OPTIONS);
> +    nl_msg_put_unspec(request, TCA_POLICE_TBF, , sizeof
> police);
> +    nl_msg_put_u32(request, TCA_POLICE_RESULT, TC_ACT_PIPE);
> +    nl_msg_end_nested(request, offset);
> +}
> +
>  static void
>  nl_msg_put_act_ct(struct ofpbuf *request, struct tc_action
> *action)
>  {
> @@ -2911,7 +2927,14 @@ nl_msg_put_flower_acts(struct ofpbuf
> *request,
> struct tc_flower *flower)
>  }
>  break;
>  case TC_ACT_POLICE: {
> -    /* Not supported yet */
> +    struct tc_cookie act_cookie;
> +
> +    act_offset = nl_msg_start_nested(request,
> act_index++);
> +    nl_msg_put_act_police_index(request, action-
>> police.index);
> +    act_cookie.data = >police.meter_id;
> +    act_cookie.len = sizeof(action-
>> police.meter_id);
> +    nl_msg_put_act_cookie(request, _cookie);

 I think we should only set the action cookie to flower-
> act_cookie as
 the entire infrastructure relies on this being the ufid. If you
 need to
 store some meter_id, you might need to do it in tcf_id somehow.
 See
 your colleague's sflow patch he has the same problem as he needs
 to
 store the sflow id. Guess this also requires changes in
 nl_parse_act_police() as you can not 

Re: [ovs-dev] [v4 7/8] netdev-offload-tc: Offloading rules with police actions

2022-05-18 Thread Jianbo Liu via dev
On Wed, 2022-05-18 at 08:25 +0200, Eelco Chaudron wrote:
> 
> 
> On 18 May 2022, at 5:27, Jianbo Liu wrote:
> 
> > On Mon, 2022-05-16 at 13:34 +0200, Eelco Chaudron wrote:
> > > 
> > > 
> > > On 3 May 2022, at 5:08, Jianbo Liu via dev wrote:
> > > 
> > > > When offloading rule, tc should be filled with police index,
> > > > instead
> > > > of meter id. As meter is mapped to police action, and the
> > > > mapping is
> > > > inserted into meter_id_to_police_idx hmap, this hmap is used to
> > > > find
> > > > the police index. Besides, an action cookie is added to save
> > > > meter id
> > > > on rule creation, so meter id can be retrieved from the cookie,
> > > > and
> > > > pass to dpif while dumping rules.
> > > > 
> > > > Signed-off-by: Jianbo Liu 
> > > > ---
> > > >  lib/netdev-offload-tc.c | 16 +++-
> > > >  lib/tc.c    | 25 -
> > > >  2 files changed, 39 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> > > > index bded4bc8c..fa3e8113b 100644
> > > > --- a/lib/netdev-offload-tc.c
> > > > +++ b/lib/netdev-offload-tc.c
> > > > @@ -169,6 +169,8 @@ static struct netlink_field
> > > > set_flower_map[][4] =
> > > > {
> > > >  },
> > > >  };
> > > > 
> > > > +static int meter_id_lookup(uint32_t meter_id, uint32_t
> > > > *police_idx);
> > > > +
> > > 
> > > Maybe move this up to the place where you also define the meter
> > > mutex/ids?
> > > 
> > > >  static struct ovs_mutex ufid_lock = OVS_MUTEX_INITIALIZER;
> > > > 
> > > >  /**
> > > > @@ -1038,7 +1040,8 @@ parse_tc_flower_to_match(struct tc_flower
> > > > *flower,
> > > >  }
> > > >  break;
> > > >  case TC_ACT_POLICE: {
> > > > -    /* Not supported yet */
> > > > +    nl_msg_put_u32(buf, OVS_ACTION_ATTR_METER,
> > > > +   action->police.meter_id);
> > > >  }
> > > >  break;
> > > >  }
> > > > @@ -1961,6 +1964,17 @@ netdev_tc_flow_put(struct netdev
> > > > *netdev,
> > > > struct match *match,
> > > >  action->type = TC_ACT_GOTO;
> > > >  action->chain = 0;  /* 0 is reserved and not used
> > > > by
> > > > recirc. */
> > > >  flower.action_count++;
> > > > +    } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_METER)
> > > > {
> > > > +    uint32_t police_index;
> > > > +
> > > > +    action->type = TC_ACT_POLICE;
> > > > +    action->police.meter_id = nl_attr_get_u32(nla);
> > > > +    if (meter_id_lookup(action->police.meter_id,
> > > > _index)) {
> > > > +    return EOPNOTSUPP;
> > > > +    }
> > > > +
> > > > +    action->police.index = police_index;
> > > > +    flower.action_count++;
> > > >  } else {
> > > >  VLOG_DBG_RL(, "unsupported put action type:
> > > > %d",
> > > >  nl_attr_type(nla));
> > > > diff --git a/lib/tc.c b/lib/tc.c
> > > > index 16d4ae3da..ecdcb676d 100644
> > > > --- a/lib/tc.c
> > > > +++ b/lib/tc.c
> > > > @@ -2421,6 +2421,22 @@ nl_msg_put_act_gact(struct ofpbuf
> > > > *request,
> > > > uint32_t chain)
> > > >  nl_msg_end_nested(request, offset);
> > > >  }
> > > > 
> > > > +static void
> > > > +nl_msg_put_act_police_index(struct ofpbuf *request, uint32_t
> > > > police_idx)
> > > > +{
> > > > +    struct tc_police police;
> > > > +    size_t offset;
> > > > +
> > > > +    memset(, 0, sizeof police);
> > > > +    police.index = police_idx;
> > > > +
> > > > +    nl_msg_put_string(request, TCA_ACT_KIND, "police");
> > > > +    offset = nl_msg_start_nested(request, TCA_ACT_OPTIONS);
> > > > +    nl_msg_put_unspec(request, TCA_POLICE_TBF, , sizeof
> > > > police);
> > > > +    nl_msg_put_u32(request, TCA_POLICE_RESULT, TC_ACT_PIPE);
> > > > +    nl_msg_end_nested(request, offset);
> > > > +}
> > > > +
> > > >  static void
> > > >  nl_msg_put_act_ct(struct ofpbuf *request, struct tc_action
> > > > *action)
> > > >  {
> > > > @@ -2911,7 +2927,14 @@ nl_msg_put_flower_acts(struct ofpbuf
> > > > *request,
> > > > struct tc_flower *flower)
> > > >  }
> > > >  break;
> > > >  case TC_ACT_POLICE: {
> > > > -    /* Not supported yet */
> > > > +    struct tc_cookie act_cookie;
> > > > +
> > > > +    act_offset = nl_msg_start_nested(request,
> > > > act_index++);
> > > > +    nl_msg_put_act_police_index(request, action-
> > > > > police.index);
> > > > +    act_cookie.data = >police.meter_id;
> > > > +    act_cookie.len = sizeof(action-
> > > > >police.meter_id);
> > > > +    nl_msg_put_act_cookie(request, _cookie);
> > > 
> > > I think we should only set the action cookie to flower-
> > > >act_cookie as
> > > the entire infrastructure relies on this being the ufid. If you
> > > need to
> > > store some meter_id, you might need to 

Re: [ovs-dev] [v4 7/8] netdev-offload-tc: Offloading rules with police actions

2022-05-18 Thread Eelco Chaudron


On 18 May 2022, at 5:27, Jianbo Liu wrote:

> On Mon, 2022-05-16 at 13:34 +0200, Eelco Chaudron wrote:
>>
>>
>> On 3 May 2022, at 5:08, Jianbo Liu via dev wrote:
>>
>>> When offloading rule, tc should be filled with police index, instead
>>> of meter id. As meter is mapped to police action, and the mapping is
>>> inserted into meter_id_to_police_idx hmap, this hmap is used to find
>>> the police index. Besides, an action cookie is added to save meter id
>>> on rule creation, so meter id can be retrieved from the cookie, and
>>> pass to dpif while dumping rules.
>>>
>>> Signed-off-by: Jianbo Liu 
>>> ---
>>>  lib/netdev-offload-tc.c | 16 +++-
>>>  lib/tc.c    | 25 -
>>>  2 files changed, 39 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>>> index bded4bc8c..fa3e8113b 100644
>>> --- a/lib/netdev-offload-tc.c
>>> +++ b/lib/netdev-offload-tc.c
>>> @@ -169,6 +169,8 @@ static struct netlink_field set_flower_map[][4] =
>>> {
>>>  },
>>>  };
>>>
>>> +static int meter_id_lookup(uint32_t meter_id, uint32_t *police_idx);
>>> +
>>
>> Maybe move this up to the place where you also define the meter
>> mutex/ids?
>>
>>>  static struct ovs_mutex ufid_lock = OVS_MUTEX_INITIALIZER;
>>>
>>>  /**
>>> @@ -1038,7 +1040,8 @@ parse_tc_flower_to_match(struct tc_flower
>>> *flower,
>>>  }
>>>  break;
>>>  case TC_ACT_POLICE: {
>>> -    /* Not supported yet */
>>> +    nl_msg_put_u32(buf, OVS_ACTION_ATTR_METER,
>>> +   action->police.meter_id);
>>>  }
>>>  break;
>>>  }
>>> @@ -1961,6 +1964,17 @@ netdev_tc_flow_put(struct netdev *netdev,
>>> struct match *match,
>>>  action->type = TC_ACT_GOTO;
>>>  action->chain = 0;  /* 0 is reserved and not used by
>>> recirc. */
>>>  flower.action_count++;
>>> +    } else if (nl_attr_type(nla) == OVS_ACTION_ATTR_METER) {
>>> +    uint32_t police_index;
>>> +
>>> +    action->type = TC_ACT_POLICE;
>>> +    action->police.meter_id = nl_attr_get_u32(nla);
>>> +    if (meter_id_lookup(action->police.meter_id,
>>> _index)) {
>>> +    return EOPNOTSUPP;
>>> +    }
>>> +
>>> +    action->police.index = police_index;
>>> +    flower.action_count++;
>>>  } else {
>>>  VLOG_DBG_RL(, "unsupported put action type: %d",
>>>  nl_attr_type(nla));
>>> diff --git a/lib/tc.c b/lib/tc.c
>>> index 16d4ae3da..ecdcb676d 100644
>>> --- a/lib/tc.c
>>> +++ b/lib/tc.c
>>> @@ -2421,6 +2421,22 @@ nl_msg_put_act_gact(struct ofpbuf *request,
>>> uint32_t chain)
>>>  nl_msg_end_nested(request, offset);
>>>  }
>>>
>>> +static void
>>> +nl_msg_put_act_police_index(struct ofpbuf *request, uint32_t
>>> police_idx)
>>> +{
>>> +    struct tc_police police;
>>> +    size_t offset;
>>> +
>>> +    memset(, 0, sizeof police);
>>> +    police.index = police_idx;
>>> +
>>> +    nl_msg_put_string(request, TCA_ACT_KIND, "police");
>>> +    offset = nl_msg_start_nested(request, TCA_ACT_OPTIONS);
>>> +    nl_msg_put_unspec(request, TCA_POLICE_TBF, , sizeof
>>> police);
>>> +    nl_msg_put_u32(request, TCA_POLICE_RESULT, TC_ACT_PIPE);
>>> +    nl_msg_end_nested(request, offset);
>>> +}
>>> +
>>>  static void
>>>  nl_msg_put_act_ct(struct ofpbuf *request, struct tc_action *action)
>>>  {
>>> @@ -2911,7 +2927,14 @@ nl_msg_put_flower_acts(struct ofpbuf *request,
>>> struct tc_flower *flower)
>>>  }
>>>  break;
>>>  case TC_ACT_POLICE: {
>>> -    /* Not supported yet */
>>> +    struct tc_cookie act_cookie;
>>> +
>>> +    act_offset = nl_msg_start_nested(request,
>>> act_index++);
>>> +    nl_msg_put_act_police_index(request, action-
 police.index);
>>> +    act_cookie.data = >police.meter_id;
>>> +    act_cookie.len = sizeof(action->police.meter_id);
>>> +    nl_msg_put_act_cookie(request, _cookie);
>>
>> I think we should only set the action cookie to flower->act_cookie as
>> the entire infrastructure relies on this being the ufid. If you need to
>> store some meter_id, you might need to do it in tcf_id somehow. See
>> your colleague's sflow patch he has the same problem as he needs to
>> store the sflow id. Guess this also requires changes in
>> nl_parse_act_police() as you can not use the action_cookie.
>>
>
> tcf_id is used to find ufid. But I don't think it's proper to put
> meter_id in tcf_id, because not all rules have meter. From this logic,
> meter_id can't be a part of the id or key for a rule.

So by design, the TC rules were supposed to be stateless, but as with sflow 
offload and this patch, this is no longer true so we need some way to store 
this data. But the action cookie in the current design is exclusively used to 
store