Re: [ovs-dev] [External] Re: [PATCH v4] utilities/ofctl: add-meters for save and restore

2023-03-03 Thread Wan Junjie
Hi Simon,

Thanks for the review.

On Fri, Mar 3, 2023 at 11:06 PM Simon Horman  wrote:
>
> On Wed, Mar 01, 2023 at 04:05:17PM +0800, Wan Junjie wrote:
> > put dump-meters' result in one line so add-meters can handle.
> > save and restore meters when restart ovs.
> > bundle functions are not implemented in this patch.
> >
> > Signed-off-by: Wan Junjie 
> >
> > ---
> > v4:
> > code refactor according to comments
> >
> > v3:
> > add '--oneline' option for dump-meter(s) command
> >
> > v2:
> > fix failed testcases, as dump-meters format changes
> > ---
> >  include/openvswitch/ofp-meter.h |   8 +-
> >  lib/ofp-meter.c | 103 +++-
> >  lib/ofp-print.c |   3 +-
> >  tests/dpif-netdev.at|   9 +
> >  utilities/ovs-ofctl.8.in|  25 ++-
> >  utilities/ovs-ofctl.c   | 286 
> >  utilities/ovs-save  |   8 +
> >  7 files changed, 397 insertions(+), 45 deletions(-)
>
> Hi Wan Junjie,
>
> thanks for your patch.
>
> It is a longish patch.
> It might be nice to break it up a bit.
>

> > diff --git a/include/openvswitch/ofp-meter.h 
> > b/include/openvswitch/ofp-meter.h
> > index 6776eae87..1bc91464e 100644
> > --- a/include/openvswitch/ofp-meter.h
> > +++ b/include/openvswitch/ofp-meter.h
> > @@ -61,7 +61,8 @@ int ofputil_decode_meter_config(struct ofpbuf *,
> >  struct ofputil_meter_config *,
> >  struct ofpbuf *bands);
> >  void ofputil_format_meter_config(struct ds *,
> > - const struct ofputil_meter_config *);
> > + const struct ofputil_meter_config *,
> > + int);
>
> I think that:
>
> 1. Function declarations should include parameter names.
> 2. bool would be a better type for the new oneline parameter
>
> void ofputil_format_meter_config(struct ds *,
>  const struct ofputil_meter_config *,
>  bool oneline);
>
OK.

>
> >  struct ofputil_meter_mod {
> >  uint16_t command;
> > @@ -79,6 +80,11 @@ char *parse_ofp_meter_mod_str(struct ofputil_meter_mod 
> > *, const char *string,
> >  OVS_WARN_UNUSED_RESULT;
> >  void ofputil_format_meter_mod(struct ds *, const struct ofputil_meter_mod 
> > *);
> >
> > +char *parse_ofp_meter_mod_file(const char *file_name, int command,
> > +   struct ofputil_meter_mod **mms, size_t 
> > *n_mms,
> > +   enum ofputil_protocol *usable_protocols)
> > +OVS_WARN_UNUSED_RESULT;
> > +
> >  struct ofputil_meter_stats {
> >  uint32_t meter_id;
> >  uint32_t flow_count;
> > diff --git a/lib/ofp-meter.c b/lib/ofp-meter.c
> > index 9ea40a0bf..b94cb6a05 100644
> > --- a/lib/ofp-meter.c
> > +++ b/lib/ofp-meter.c
> > @@ -15,6 +15,7 @@
> >   */
> >
> >  #include 
> > +#include 
> >  #include "openvswitch/ofp-meter.h"
> >  #include "byte-order.h"
> >  #include "nx-match.h"
> > @@ -57,7 +58,7 @@ void
> >  ofputil_format_meter_band(struct ds *s, enum ofp13_meter_flags flags,
> >const struct ofputil_meter_band *mb)
> >  {
> > -ds_put_cstr(s, "\ntype=");
> > +ds_put_cstr(s, "type=");
> >  switch (mb->type) {
> >  case OFPMBT13_DROP:
> >  ds_put_cstr(s, "drop");
> > @@ -343,7 +344,7 @@ ofp_print_meter_flags(struct ds *s, enum 
> > ofp13_meter_flags flags)
> >
> >  void
> >  ofputil_format_meter_config(struct ds *s,
> > -const struct ofputil_meter_config *mc)
> > +const struct ofputil_meter_config *mc, int 
> > oneline)
> >  {
> >  uint16_t i;
> >
> > @@ -354,9 +355,12 @@ ofputil_format_meter_config(struct ds *s,
> >
> >  ds_put_cstr(s, "bands=");
> >  for (i = 0; i < mc->n_bands; ++i) {
> > +ds_put_cstr(s, oneline > 0 ? " ": "\n");
>
> I think that as oneline is a boolean this may be clearer:
>
> ds_put_cstr(s, oneline ? " ": "\n");
>
OK

> >  ofputil_format_meter_band(s, mc->flags, >bands[i]);
> >  }
> > -ds_put_char(s, '\n');
>
> Likewise,
>
> if (!oneline) {
>
> > +if (oneline == 0) {
> > +ds_put_char(s, '\n');
> > +}
> >  }
> >
> >  static enum ofperr
> > @@ -578,6 +582,24 @@ parse_ofp_meter_mod_str__(struct ofputil_meter_mod 
> > *mm, char *string,
> >
> >  /* Meters require at least OF 1.3. */
> >  *usable_protocols = OFPUTIL_P_OF13_UP;
> > +if (command == -2) {
> > +size_t len;
> > +
> > +string += strspn(string, " \t\r\n");   /* Skip white space. */
> > +len = strcspn(string, ", \t\r\n"); /* Get length of the first 
> > token. */
> > +
> > +if (!strncmp(string, "add", len)) {
> > +command = OFPMC13_ADD;
> > +} else if (!strncmp(string, "delete", len)) {
> > +command = OFPMC13_DELETE;
> > +} else if (!strncmp(string, "modify", len)) {
> > +

[ovs-dev] [PATCH ovn] controller: Use ofctrl_add_flow for CT SNAT hairpin flows.

2023-03-03 Thread Ilya Maximets
We no longer use conjunctions for these flows, so no need to
try to append them.

Fixes: c2f968235241 ("controller: Fix hairpin SNAT flow explosion if 
hairpin_snat_ip is set.")
Signed-off-by: Ilya Maximets 
---
 controller/lflow.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/controller/lflow.c b/controller/lflow.c
index 6a98b19e1..003195ae4 100644
--- a/controller/lflow.c
+++ b/controller/lflow.c
@@ -1746,11 +1746,9 @@ add_lb_ct_snat_hairpin_for_dp(const struct 
ovn_controller_lb *lb,
  * datapath match, but it will also match on the less restrictive
  * general case.  Therefore, we set the priority in the
  * "hairpin_snat_ip" case to be higher than the general case. */
-ofctrl_add_or_append_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN,
-  datapath ? 200 : 100,
-  lb->slb->header_.uuid.parts[0],
-  dp_match, dp_acts, >slb->header_.uuid,
-  NX_CTLR_NO_METER, NULL);
+ofctrl_add_flow(flow_table, OFTABLE_CT_SNAT_HAIRPIN,
+datapath ? 200 : 100, lb->slb->header_.uuid.parts[0],
+dp_match, dp_acts, >slb->header_.uuid);
 }
 
 /* Add a ct_snat flow for each VIP of the LB.  If this LB does not use
-- 
2.39.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5 1/2] ofproto-dpif-upcall: Reset ukey's last stats value if the datapath changed.

2023-03-03 Thread Ilya Maximets
On 2/27/23 16:29, Eelco Chaudron wrote:
> When the ukey's action set changes, it could cause the flow to use a
> different datapath, for example, when it moves from tc to kernel.
> This will cause the the cached previous datapath statistics to be used.
> 
> This change will reset the cached statistics when a change in
> datapath is discovered.
> 
> Signed-off-by: Eelco Chaudron 
> ---
> v2: Updated commit message to explain behavior.
> v3: Added a test case.
> Added dpif API to query the dpif for datapath layers synchronization.
> Added coverage counter to identify potential problem.
> v4: Fixed some spelling, moved common code to a macro, and moved
> the synced_dp_layers function pointer to a bool.
> 
>  lib/dpif-netdev.c|1 +
>  lib/dpif-netlink.c   |1 +
>  lib/dpif-provider.h  |8 +
>  lib/dpif.c   |5 +++
>  lib/dpif.h   |1 +
>  ofproto/ofproto-dpif-upcall.c|   41 +-
>  tests/system-offloads-traffic.at |   60 
> ++
>  7 files changed, 115 insertions(+), 2 deletions(-)
> 



> diff --git a/lib/dpif.c b/lib/dpif.c
> index fe4db83fb..5ed8aef17 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -2109,3 +2109,8 @@ dpif_cache_set_size(struct dpif *dpif, uint32_t level, 
> uint32_t size)
> ? dpif->dpif_class->cache_set_size(dpif, level, size)
> : EOPNOTSUPP;
>  }
> +
> +bool dpif_synced_dp_layers(struct dpif *dpif)

Function name in the implementation should start form a new line.

I fixed that and applied the set.  Also backported down to 2.17,
since it's actually a bug fix.

Thanks, Eelco and Simon!

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5] ofproto-dpif-upcall: Wait for valid hw flow stats before applying min-revalidate-pps

2023-03-03 Thread Ilya Maximets
On 2/27/23 15:58, Eelco Chaudron wrote:
> Depending on the driver implementation, it can take from 0.2 seconds
> up to 2 seconds before offloaded flow statistics are updated. This is
> true for both TC and rte_flow-based offloading. This is causing a
> problem with min-revalidate-pps, as old statistic values are used
> during this period.
> 
> This fix will wait for at least 2 seconds, by default, before assuming no
> packets where received during this period.
> 
> Reviewed-by: Simon Horman 
> Signed-off-by: Eelco Chaudron 
> ---
> 
> Changes:
> - v2: Use existing ukey->offloaded so the revalidate_missed_dp_flow case is
>   also covered.
> - v3: Added OVS_REQUIRES(ukey->mutex) to should_revalidate() to keep
>   thread-safety-analysis happy.
> - v4: Add a configurable option.
>   After looking at multiple vendor implementation for both TC and
>   rte_flow I came to the conclusion that the delay is roughly between
>   0.2 and 2 seconds. Updated commit message.
> - v5: Rebased to latest upstream master.
>   Made the key parameter const in should_revalidate().
> 
>  ofproto/ofproto-dpif-upcall.c |   26 --
>  ofproto/ofproto-provider.h|4 
>  ofproto/ofproto.c |9 +
>  ofproto/ofproto.h |2 ++
>  vswitchd/bridge.c |3 +++
>  vswitchd/vswitch.xml  |   12 
>  6 files changed, 46 insertions(+), 10 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index fc94078cb..60c273a1d 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -2100,10 +2100,12 @@ ukey_delete(struct umap *umap, struct udpif_key *ukey)
>  }
>  
>  static bool
> -should_revalidate(const struct udpif *udpif, uint64_t packets,
> -  long long int used)
> +should_revalidate(const struct udpif *udpif, const struct udpif_key *ukey,
> +  uint64_t packets)
> +OVS_REQUIRES(ukey->mutex)
>  {
>  long long int metric, now, duration;
> +long long int used = ukey->stats.used;
>  
>  if (!ofproto_min_revalidate_pps) {
>  return true;
> @@ -2134,8 +2136,12 @@ should_revalidate(const struct udpif *udpif, uint64_t 
> packets,
>  duration = now - used;
>  metric = duration / packets;
>  
> -if (metric < 1000 / ofproto_min_revalidate_pps) {
> -/* The flow is receiving more than min-revalidate-pps, so keep it. */
> +if (metric < 1000 / ofproto_min_revalidate_pps ||
> +(ukey->offloaded && duration < ofproto_offloaded_stats_delay)) {
> +/* The flow is receiving more than min-revalidate-pps, so keep it.
> + * Or it's a hardware offloaded flow that might take up to X seconds
> + * to update its statistics. Until we are sure the statistics had a
> + * chance to be updated, also keep it. */
>  return true;
>  }
>  return false;
> @@ -2339,7 +2345,7 @@ static enum reval_result
>  revalidate_ukey(struct udpif *udpif, struct udpif_key *ukey,
>  const struct dpif_flow_stats *stats,
>  struct ofpbuf *odp_actions, uint64_t reval_seq,
> -struct recirc_refs *recircs, bool offloaded)
> +struct recirc_refs *recircs)
>  OVS_REQUIRES(ukey->mutex)
>  {
>  bool need_revalidate = ukey->reval_seq != reval_seq;
> @@ -2358,7 +2364,7 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
> *ukey,
>  : 0);
>  
>  if (need_revalidate) {
> -if (should_revalidate(udpif, push.n_packets, ukey->stats.used)) {
> +if (should_revalidate(udpif, ukey, push.n_packets)) {
>  if (!ukey->xcache) {
>  ukey->xcache = xlate_cache_new();
>  } else {
> @@ -2374,7 +2380,7 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
> *ukey,
>  
>  /* Stats for deleted flows will be attributed upon flow deletion. Skip. 
> */
>  if (result != UKEY_DELETE) {
> -xlate_push_stats(ukey->xcache, , offloaded);
> +xlate_push_stats(ukey->xcache, , ukey->offloaded);
>  ukey->stats = *stats;
>  ukey->reval_seq = reval_seq;
>  }
> @@ -2774,6 +2780,7 @@ revalidate(struct revalidator *revalidator)
>  continue;
>  }
>  
> +ukey->offloaded = f->attrs.offloaded;
>  already_dumped = ukey->dump_seq == dump_seq;
>  if (already_dumped) {
>  /* The flow has already been handled during this flow dump
> @@ -2805,8 +2812,7 @@ revalidate(struct revalidator *revalidator)
>  result = UKEY_DELETE;
>  } else {
>  result = revalidate_ukey(udpif, ukey, , _actions,
> - reval_seq, ,
> - f->attrs.offloaded);
> + reval_seq, );
>  }
>  ukey->dump_seq = 

Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Reduce stack usage in recursive xlate functions

2023-03-03 Thread Mike Pattrick
On Fri, Mar 3, 2023 at 12:31 PM Adrian Moreno  wrote:
>
> Hi Mike,
>
> I've gone though this patch and have some minor style comments and nits. I've
> also played a bit with it and run it through valgrind and looks solid.
>
> On 12/8/22 17:22, Mike Pattrick wrote:
> > Several xlate actions used in recursive translation currently store a
> > large amount of information on the stack. This can result in handler
> > threads quickly running out of stack space despite before
> > xlate_resubmit_resource_check() is able to terminate translation. This
> > patch reduces stack usage by over 3kb from several translation actions.
> >
> > This patch also moves some trace function from do_xlate_actions into its
> > own function to reduce stack usage.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2104779
> > Signed-off-by: Mike Pattrick 
> > ---
> >   ofproto/ofproto-dpif-xlate.c | 168 +--
> >   1 file changed, 99 insertions(+), 69 deletions(-)
> >
> > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > index a9cf3cbee..8ed264d0b 100644
> > --- a/ofproto/ofproto-dpif-xlate.c
> > +++ b/ofproto/ofproto-dpif-xlate.c
> > @@ -411,6 +411,18 @@ struct xlate_ctx {
> >   enum xlate_error error; /* Translation failed. */
> >   };
> >
> > +/* This structure is used to save stack space in actions that need to
> > + * retain a large amount of xlate_ctx state. */
> > +struct xsaved_state {
> > +union mf_subvalue new_stack[1024 / sizeof(union mf_subvalue)];
> > +uint64_t actset_stub[1024 / 8];
> > +struct ofpbuf old_stack;
> > +struct ofpbuf old_action_set;
> > +struct flow old_flow;
> > +struct flow old_base;
> > +struct flow_tnl flow_tnl;
> > +};
> > +
>
> Nit: not that I have a better alternative but the name of this struct is
> sligthly confusing. The name suggets it's a part of xlate_ctx state so one 
> would
> expect a simple function that creates this struct from an existing xlate_ctx 
> and
> one that copies its content back. However, this has a mix of old and new data.
>
> >   /* Structure to track VLAN manipulation */
> >   struct xvlan_single {
> >   uint16_t tpid;
> > @@ -3906,20 +3918,21 @@ static void
> >   patch_port_output(struct xlate_ctx *ctx, const struct xport *in_dev,
> > struct xport *out_dev, bool is_last_action)
> >   {
> > -struct flow *flow = >xin->flow;
> > -struct flow old_flow = ctx->xin->flow;
> > -struct flow_tnl old_flow_tnl_wc = ctx->wc->masks.tunnel;
> > +struct xsaved_state * old_state = xmalloc(sizeof * old_state);
>
> Nits:
> s/xsaved_state * old_state/xsaved_state *old_state/
> s/sizeof * old_state/sizeof *old_state/
>
>
> > +struct ovs_list *old_trace = ctx->xin->trace;
> >   bool old_conntrack = ctx->conntracked;
> > +struct flow *flow = >xin->flow;
> >   bool old_was_mpls = ctx->was_mpls;
> > -ovs_version_t old_version = ctx->xin->tables_version;
> > -struct ofpbuf old_stack = ctx->stack;
> > -uint8_t new_stack[1024];
> > -struct ofpbuf old_action_set = ctx->action_set;
> > -struct ovs_list *old_trace = ctx->xin->trace;
> > -uint64_t actset_stub[1024 / 8];
> >
> > -ofpbuf_use_stub(>stack, new_stack, sizeof new_stack);
> > -ofpbuf_use_stub(>action_set, actset_stub, sizeof actset_stub);
> > +old_state->old_flow = ctx->xin->flow;
> > +old_state->flow_tnl = ctx->wc->masks.tunnel;
> > +ovs_version_t old_version = ctx->xin->tables_version;
>
> Any reason why we would not want to put this into xsaved_state as well?

I had only moved very large types to xsaved_state. But I can see how
the case could be made, for readability and simplicity, to move all
these variables into it.

>
> > +old_state->old_stack = ctx->stack;
> > +old_state->old_action_set = ctx->action_set;
> > +ofpbuf_use_stub(>stack, old_state->new_stack,
> > +sizeof old_state->new_stack);
> > +ofpbuf_use_stub(>action_set, old_state->actset_stub,
> > +sizeof old_state->actset_stub);
>
> I think something that would help with the naming nit above would be to, 
> instead
> of using the xsaved_state to store the new stack's data plus the old stack's
> ofpbuf, just use ofpbuf_clone to save the old stack entirely (the ofpbuf 
> header
> plus its data). Same for actset_stub.
>
> That way, when we go down the recursion we would reuse the current ctx->stack
> (which we might want to zero before depending on the case).
>
> It doesn't make a huge difference technically speaking but I think would make
> the code cleaner because we would now be able to move the state-saving and
> state-restoring to helper functions if we wanted, or just make this one easier
> to read. Plus we would not need to prefix all of xsaved_state's members with
> "old" or "new".
>
> Again, not that it really matters in terms of logic but I think it might yield
> some simpler code.
> What do you think?

I see what 

Re: [ovs-dev] [PATCH ovn branch-23.03 2/2] Prepare for 23.03.1.

2023-03-03 Thread Mark Michelson

Thanks for the ACKs Dumitru. I pushed the changes and tagged v23.03.0 .

On 3/3/23 11:06, Dumitru Ceara wrote:

On 3/3/23 16:42, Mark Michelson wrote:

Signed-off-by: Mark Michelson 
---


Acked-by: Dumitru Ceara 

Regards,
Dumitru



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ofproto-dpif-xlate: Reduce stack usage in recursive xlate functions

2023-03-03 Thread Adrian Moreno

Hi Mike,

I've gone though this patch and have some minor style comments and nits. I've 
also played a bit with it and run it through valgrind and looks solid.


On 12/8/22 17:22, Mike Pattrick wrote:

Several xlate actions used in recursive translation currently store a
large amount of information on the stack. This can result in handler
threads quickly running out of stack space despite before
xlate_resubmit_resource_check() is able to terminate translation. This
patch reduces stack usage by over 3kb from several translation actions.

This patch also moves some trace function from do_xlate_actions into its
own function to reduce stack usage.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2104779
Signed-off-by: Mike Pattrick 
---
  ofproto/ofproto-dpif-xlate.c | 168 +--
  1 file changed, 99 insertions(+), 69 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index a9cf3cbee..8ed264d0b 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -411,6 +411,18 @@ struct xlate_ctx {
  enum xlate_error error; /* Translation failed. */
  };
  
+/* This structure is used to save stack space in actions that need to

+ * retain a large amount of xlate_ctx state. */
+struct xsaved_state {
+union mf_subvalue new_stack[1024 / sizeof(union mf_subvalue)];
+uint64_t actset_stub[1024 / 8];
+struct ofpbuf old_stack;
+struct ofpbuf old_action_set;
+struct flow old_flow;
+struct flow old_base;
+struct flow_tnl flow_tnl;
+};
+


Nit: not that I have a better alternative but the name of this struct is 
sligthly confusing. The name suggets it's a part of xlate_ctx state so one would 
expect a simple function that creates this struct from an existing xlate_ctx and 
one that copies its content back. However, this has a mix of old and new data.



  /* Structure to track VLAN manipulation */
  struct xvlan_single {
  uint16_t tpid;
@@ -3906,20 +3918,21 @@ static void
  patch_port_output(struct xlate_ctx *ctx, const struct xport *in_dev,
struct xport *out_dev, bool is_last_action)
  {
-struct flow *flow = >xin->flow;
-struct flow old_flow = ctx->xin->flow;
-struct flow_tnl old_flow_tnl_wc = ctx->wc->masks.tunnel;
+struct xsaved_state * old_state = xmalloc(sizeof * old_state);


Nits:
s/xsaved_state * old_state/xsaved_state *old_state/
s/sizeof * old_state/sizeof *old_state/



+struct ovs_list *old_trace = ctx->xin->trace;
  bool old_conntrack = ctx->conntracked;
+struct flow *flow = >xin->flow;
  bool old_was_mpls = ctx->was_mpls;
-ovs_version_t old_version = ctx->xin->tables_version;
-struct ofpbuf old_stack = ctx->stack;
-uint8_t new_stack[1024];
-struct ofpbuf old_action_set = ctx->action_set;
-struct ovs_list *old_trace = ctx->xin->trace;
-uint64_t actset_stub[1024 / 8];
  
-ofpbuf_use_stub(>stack, new_stack, sizeof new_stack);

-ofpbuf_use_stub(>action_set, actset_stub, sizeof actset_stub);
+old_state->old_flow = ctx->xin->flow;
+old_state->flow_tnl = ctx->wc->masks.tunnel;
+ovs_version_t old_version = ctx->xin->tables_version;


Any reason why we would not want to put this into xsaved_state as well?


+old_state->old_stack = ctx->stack;
+old_state->old_action_set = ctx->action_set;
+ofpbuf_use_stub(>stack, old_state->new_stack,
+sizeof old_state->new_stack);
+ofpbuf_use_stub(>action_set, old_state->actset_stub,
+sizeof old_state->actset_stub);


I think something that would help with the naming nit above would be to, instead 
of using the xsaved_state to store the new stack's data plus the old stack's 
ofpbuf, just use ofpbuf_clone to save the old stack entirely (the ofpbuf header 
plus its data). Same for actset_stub.


That way, when we go down the recursion we would reuse the current ctx->stack 
(which we might want to zero before depending on the case).


It doesn't make a huge difference technically speaking but I think would make 
the code cleaner because we would now be able to move the state-saving and 
state-restoring to helper functions if we wanted, or just make this one easier 
to read. Plus we would not need to prefix all of xsaved_state's members with 
"old" or "new".


Again, not that it really matters in terms of logic but I think it might yield 
some simpler code.

What do you think?



  flow->in_port.ofp_port = out_dev->ofp_port;
  flow->metadata = htonll(0);
  memset(>tunnel, 0, sizeof flow->tunnel);
@@ -3958,14 +3971,14 @@ patch_port_output(struct xlate_ctx *ctx, const struct 
xport *in_dev,
  } else {
  /* Forwarding is disabled by STP and RSTP.  Let OFPP_NORMAL and
   * the learning action look at the packet, then drop it. */
-struct flow old_base_flow = ctx->base_flow;
  size_t old_size = ctx->odp_actions->size;
+old_state->old_base 

Re: [ovs-dev] [PATCH ovn branch-23.03 2/2] Prepare for 23.03.1.

2023-03-03 Thread Dumitru Ceara
On 3/3/23 16:42, Mark Michelson wrote:
> Signed-off-by: Mark Michelson 
> ---

Acked-by: Dumitru Ceara 

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn branch-23.03 1/2] Set release date for 23.03.0.

2023-03-03 Thread Dumitru Ceara
On 3/3/23 16:42, Mark Michelson wrote:
> Signed-off-by: Mark Michelson 
> ---

Acked-by: Dumitru Ceara 

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn branch-23.03 1/2] Set release date for 23.03.0.

2023-03-03 Thread Mark Michelson
Signed-off-by: Mark Michelson 
---
 NEWS | 2 +-
 debian/changelog | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index bb0e27bd5..5e8aed06d 100644
--- a/NEWS
+++ b/NEWS
@@ -1,4 +1,4 @@
-OVN v23.03.0 - xx xxx 
+OVN v23.03.0 - 03 Mar 2023
 --
   - ovn-controller: Experimental support for co-hosting multiple controller
 instances on the same host.
diff --git a/debian/changelog b/debian/changelog
index a70fdfe25..11a07dd38 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,7 +2,7 @@ ovn (23.03.0-1) unstable; urgency=low
 
* New upstream version
 
- -- OVN team   Fri, 17 Feb 2023 08:29:57 -0500
+ -- OVN team   Fri, 03 Mar 2023 10:40:37 -0500
 
 ovn (22.12.0-1) unstable; urgency=low
 
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn branch-23.03 0/2] Release patches for v23.03.0.

2023-03-03 Thread Mark Michelson


Mark Michelson (2):
  Set release date for 23.03.0.
  Prepare for 23.03.1.

 NEWS | 5 -
 configure.ac | 2 +-
 debian/changelog | 8 +++-
 3 files changed, 12 insertions(+), 3 deletions(-)

-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn branch-23.03 2/2] Prepare for 23.03.1.

2023-03-03 Thread Mark Michelson
Signed-off-by: Mark Michelson 
---
 NEWS | 3 +++
 configure.ac | 2 +-
 debian/changelog | 6 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index 5e8aed06d..360d4d63e 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,6 @@
+OVN v23.03.1 - xx xxx 
+--
+
 OVN v23.03.0 - 03 Mar 2023
 --
   - ovn-controller: Experimental support for co-hosting multiple controller
diff --git a/configure.ac b/configure.ac
index b51d0f01e..0ba9e8d7e 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(ovn, 23.03.0, b...@openvswitch.org)
+AC_INIT(ovn, 23.03.1, b...@openvswitch.org)
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
 AC_CONFIG_HEADERS([config.h])
diff --git a/debian/changelog b/debian/changelog
index 11a07dd38..02a9953ba 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+OVN (23.03.1-1) unstable; urgency=low
+   [ OVN team ]
+   * New upstream version
+
+ -- OVN team   Fri, 03 Mar 2023 10:40:37 -0500
+
 ovn (23.03.0-1) unstable; urgency=low
 
* New upstream version
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] dpdk: Allow retaining CAP_SYS_RAWIO privileges

2023-03-03 Thread Aaron Conole
Open vSwitch generally tries to let the underlying operating system
managed the low level details of hardware, for example DMA mapping,
bus arbitration, etc.  However, when using DPDK, the underlying
operating system yields control of many of these details to userspace
for management.

In the case of some DPDK port drivers, configuring rte_flow or even
allocating resources may require access to iopl/ioperm calls, which
are guarded by the CAP_SYS_RAWIO privilege on linux systems.  These
calls are dangerous, and can allow a process to completely compromise
a system.  However, they are needed in the case of some userspace
driver code which manages the hardware (for example, the mlx
implementation of backend support for rte_flow).

Here, we create an opt-in flag passed to the command line to allow
this access.  We need to do this before ever accessing the database,
because we want to drop all privileges asap, and cannot wait for
a connection to the database to be established and functional before
dropping.  There may be distribution specific ways to do capability
management as well (using for example, systemd), but they are not
as universal to the vswitchd as a flag.

Signed-off-by: Aaron Conole 
---
 NEWS   |  4 
 lib/daemon-unix.c  | 30 ++
 lib/daemon-windows.c   |  3 ++-
 lib/daemon.c   |  2 +-
 lib/daemon.h   |  4 ++--
 ovsdb/ovsdb-client.c   |  6 +++---
 ovsdb/ovsdb-server.c   |  4 ++--
 tests/test-netflow.c   |  2 +-
 tests/test-sflow.c |  2 +-
 tests/test-unixctl.c   |  2 +-
 utilities/ovs-ofctl.c  |  4 ++--
 utilities/ovs-testcontroller.c |  4 ++--
 vswitchd/ovs-vswitchd.8.in |  8 
 vswitchd/ovs-vswitchd.c| 11 ++-
 14 files changed, 61 insertions(+), 25 deletions(-)

diff --git a/NEWS b/NEWS
index 85b3496214..65f35dcdd5 100644
--- a/NEWS
+++ b/NEWS
@@ -10,6 +10,10 @@ Post-v3.1.0
in order to create OVSDB sockets with access mode of 0770.
- QoS:
  * Added new configuration option 'jitter' for a linux-netem QoS type.
+   - DPDK:
+ * ovs-vswitchd will keep the CAP_SYS_RAWIO capability when started
+   with the --hw-rawio-access command line option.  This allows the
+   process extra privileges when mapping physical interconnect memory.
 
 
 v3.1.0 - 16 Feb 2023
diff --git a/lib/daemon-unix.c b/lib/daemon-unix.c
index 1a7ba427d7..5a63ac55ce 100644
--- a/lib/daemon-unix.c
+++ b/lib/daemon-unix.c
@@ -88,7 +88,8 @@ static bool switch_user = false;
 static uid_t uid;
 static gid_t gid;
 static char *user = NULL;
-static void daemon_become_new_user__(bool access_datapath);
+static void daemon_become_new_user__(bool access_datapath,
+ bool access_hardware_ports);
 
 static void check_already_running(void);
 static int lock_pidfile(FILE *, int command);
@@ -443,13 +444,13 @@ monitor_daemon(pid_t daemon_pid)
  * daemonize_complete()) or that it failed to start up (by exiting with a
  * nonzero exit code). */
 void
-daemonize_start(bool access_datapath)
+daemonize_start(bool access_datapath, bool access_hardware_ports)
 {
 assert_single_threaded();
 daemonize_fd = -1;
 
 if (switch_user) {
-daemon_become_new_user__(access_datapath);
+daemon_become_new_user__(access_datapath, access_hardware_ports);
 switch_user = false;
 }
 
@@ -807,7 +808,8 @@ daemon_become_new_user_unix(void)
 /* Linux specific implementation of daemon_become_new_user()
  * using libcap-ng.   */
 static void
-daemon_become_new_user_linux(bool access_datapath OVS_UNUSED)
+daemon_become_new_user_linux(bool access_datapath OVS_UNUSED,
+ bool access_hardware_ports OVS_UNUSED)
 {
 #if defined __linux__ &&  HAVE_LIBCAPNG
 int ret;
@@ -827,6 +829,17 @@ daemon_become_new_user_linux(bool access_datapath 
OVS_UNUSED)
 ret = capng_update(CAPNG_ADD, cap_sets, CAP_NET_ADMIN)
   || capng_update(CAPNG_ADD, cap_sets, CAP_NET_RAW)
   || capng_update(CAPNG_ADD, cap_sets, CAP_NET_BROADCAST);
+#ifdef DPDK_NETDEV
+if (access_hardware_ports && !ret) {
+ret = capng_update(CAPNG_ADD, cap_sets, CAP_SYS_RAWIO);
+VLOG_INFO("CAP_SYS_RAWIO enabled.");
+}
+#else
+;
+if (access_hardware_ports) {
+VLOG_WARN("Dropped CAP_SYS_RAWIO request (no drivers).");
+}
+#endif
 }
 } else {
 ret = -1;
@@ -854,7 +867,7 @@ daemon_become_new_user_linux(bool access_datapath 
OVS_UNUSED)
 }
 
 static void
-daemon_become_new_user__(bool access_datapath)
+daemon_become_new_user__(bool access_datapath, bool access_hardware_ports)
 {
 /* If vlog file has been created, change its owner to the non-root user
  * as specifed by the --user option.  */

Re: [ovs-dev] [PATCH v2 2/2] ci: Run tc offload tests in GitHub Actions.

2023-03-03 Thread Simon Horman
On Tue, Feb 28, 2023 at 04:52:15PM +0100, Eelco Chaudron wrote:
> Run "make check-offloads" as part of the GitHub actions tests.
> 
> This test was run 25 times using GitHub actions, and the
> failing rerun test cases where excluded. There are quite some
> first-run failures, but unfortunately, there is no other
> more stable kernel available as a GitHub-hosted runner.
> 
> Did not yet include sanitizers in the run, as it's causing
> the test to run too long >30min and there seems to be (timing)
> issues with some of the tests.
> 
> Signed-off-by: Eelco Chaudron 

Hi Eelco,

perhaps I am missing something obvious,
but I seem to be getting failures with;

9: offloads - check_pkt_len action - offloads disabled FAILED 
(system-offloads-traffic.at:342)

Also, there does not seem to be enough in the logs to determine why.

https://github.com/horms/ovs/actions/runs/4323873948/jobs/7549439777#step:12:2847
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4] utilities/ofctl: add-meters for save and restore

2023-03-03 Thread Simon Horman
On Wed, Mar 01, 2023 at 04:05:17PM +0800, Wan Junjie wrote:
> put dump-meters' result in one line so add-meters can handle.
> save and restore meters when restart ovs.
> bundle functions are not implemented in this patch.
> 
> Signed-off-by: Wan Junjie 
> 
> ---
> v4:
> code refactor according to comments
> 
> v3:
> add '--oneline' option for dump-meter(s) command
> 
> v2:
> fix failed testcases, as dump-meters format changes
> ---
>  include/openvswitch/ofp-meter.h |   8 +-
>  lib/ofp-meter.c | 103 +++-
>  lib/ofp-print.c |   3 +-
>  tests/dpif-netdev.at|   9 +
>  utilities/ovs-ofctl.8.in|  25 ++-
>  utilities/ovs-ofctl.c   | 286 
>  utilities/ovs-save  |   8 +
>  7 files changed, 397 insertions(+), 45 deletions(-)

Hi Wan Junjie,

thanks for your patch.

It is a longish patch.
It might be nice to break it up a bit.

> diff --git a/include/openvswitch/ofp-meter.h b/include/openvswitch/ofp-meter.h
> index 6776eae87..1bc91464e 100644
> --- a/include/openvswitch/ofp-meter.h
> +++ b/include/openvswitch/ofp-meter.h
> @@ -61,7 +61,8 @@ int ofputil_decode_meter_config(struct ofpbuf *,
>  struct ofputil_meter_config *,
>  struct ofpbuf *bands);
>  void ofputil_format_meter_config(struct ds *,
> - const struct ofputil_meter_config *);
> + const struct ofputil_meter_config *,
> + int);

I think that:

1. Function declarations should include parameter names.
2. bool would be a better type for the new oneline parameter

void ofputil_format_meter_config(struct ds *,
 const struct ofputil_meter_config *,
 bool oneline);


>  struct ofputil_meter_mod {
>  uint16_t command;
> @@ -79,6 +80,11 @@ char *parse_ofp_meter_mod_str(struct ofputil_meter_mod *, 
> const char *string,
>  OVS_WARN_UNUSED_RESULT;
>  void ofputil_format_meter_mod(struct ds *, const struct ofputil_meter_mod *);
>  
> +char *parse_ofp_meter_mod_file(const char *file_name, int command,
> +   struct ofputil_meter_mod **mms, size_t *n_mms,
> +   enum ofputil_protocol *usable_protocols)
> +OVS_WARN_UNUSED_RESULT;
> +
>  struct ofputil_meter_stats {
>  uint32_t meter_id;
>  uint32_t flow_count;
> diff --git a/lib/ofp-meter.c b/lib/ofp-meter.c
> index 9ea40a0bf..b94cb6a05 100644
> --- a/lib/ofp-meter.c
> +++ b/lib/ofp-meter.c
> @@ -15,6 +15,7 @@
>   */
>  
>  #include 
> +#include 
>  #include "openvswitch/ofp-meter.h"
>  #include "byte-order.h"
>  #include "nx-match.h"
> @@ -57,7 +58,7 @@ void
>  ofputil_format_meter_band(struct ds *s, enum ofp13_meter_flags flags,
>const struct ofputil_meter_band *mb)
>  {
> -ds_put_cstr(s, "\ntype=");
> +ds_put_cstr(s, "type=");
>  switch (mb->type) {
>  case OFPMBT13_DROP:
>  ds_put_cstr(s, "drop");
> @@ -343,7 +344,7 @@ ofp_print_meter_flags(struct ds *s, enum 
> ofp13_meter_flags flags)
>  
>  void
>  ofputil_format_meter_config(struct ds *s,
> -const struct ofputil_meter_config *mc)
> +const struct ofputil_meter_config *mc, int 
> oneline)
>  {
>  uint16_t i;
>  
> @@ -354,9 +355,12 @@ ofputil_format_meter_config(struct ds *s,
>  
>  ds_put_cstr(s, "bands=");
>  for (i = 0; i < mc->n_bands; ++i) {
> +ds_put_cstr(s, oneline > 0 ? " ": "\n");

I think that as oneline is a boolean this may be clearer:

ds_put_cstr(s, oneline ? " ": "\n");

>  ofputil_format_meter_band(s, mc->flags, >bands[i]);
>  }
> -ds_put_char(s, '\n');

Likewise,

if (!oneline) {

> +if (oneline == 0) {
> +ds_put_char(s, '\n');
> +}
>  }
>  
>  static enum ofperr
> @@ -578,6 +582,24 @@ parse_ofp_meter_mod_str__(struct ofputil_meter_mod *mm, 
> char *string,
>  
>  /* Meters require at least OF 1.3. */
>  *usable_protocols = OFPUTIL_P_OF13_UP;
> +if (command == -2) {
> +size_t len;
> +
> +string += strspn(string, " \t\r\n");   /* Skip white space. */
> +len = strcspn(string, ", \t\r\n"); /* Get length of the first token. 
> */
> +
> +if (!strncmp(string, "add", len)) {
> +command = OFPMC13_ADD;
> +} else if (!strncmp(string, "delete", len)) {
> +command = OFPMC13_DELETE;
> +} else if (!strncmp(string, "modify", len)) {
> +command = OFPMC13_MODIFY;
> +} else {
> +len = 0;
> +command = OFPMC13_ADD;
> +}
> +string += len;
> +}
>  
>  switch (command) {
>  case -1:
> @@ -606,6 +628,11 @@ parse_ofp_meter_mod_str__(struct ofputil_meter_mod *mm, 
> char *string,
>  mm->meter.n_bands = 0;
>  mm->meter.bands = NULL;
>  
> +

Re: [ovs-dev] [PATCH v7 1/2] ofp, dpif: Allow CT flush based on partial match

2023-03-03 Thread Roi Dayan via dev


On 01/03/2023 14:01, Ales Musil wrote:
> On Wed, Mar 1, 2023 at 12:34 PM Roi Dayan  wrote:
> 
>>
>>
>> On 01/03/2023 12:53, Roi Dayan via dev wrote:
>>>
>>>
>>> On 16/01/2023 13:45, Ales Musil wrote:
 Currently, the CT can be flushed by dpctl only by specifying
 the whole 5-tuple. This is not very convenient when there are
 only some fields known to the user of CT flush. Add new struct
 ofputil_ct_match which represents the generic filtering that can
 be done for CT flush. The match is done only on fields that are
 non-zero with exception to the icmp fields.

 This allows the filtering just within dpctl, however
 it is a preparation for OpenFlow extension.

 Reported-at:
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.redhat.com%2F2120546=05%7C01%7Croid%40nvidia.com%7Cbc4c95a2472a4dfbb0a408db1a434035%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C638132648384230198%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=wBA5yCdijnN%2FkcwxncDe1xxHzsqzGQgCRqNRHIMQUHQ%3D=0
 Signed-off-by: Ales Musil 
 ---
>>
>> ...
>>
 @@ -1707,37 +1708,55 @@ dpctl_flush_conntrack(int argc, const char
>> *argv[],
struct dpctl_params *dpctl_p)
  {
  struct dpif *dpif = NULL;
 -struct ct_dpif_tuple tuple, *ptuple = NULL;
 +struct ofp_ct_match match = {0};
  struct ds ds = DS_EMPTY_INITIALIZER;
  uint16_t zone, *pzone = NULL;
  int error;
  int args = argc - 1;

 -/* Parse ct tuple */
 -if (args && ct_dpif_parse_tuple(, argv[args], )) {
 -ptuple = 
 +/* Parse zone. */
 +if (args && !strncmp(argv[1], "zone=", 5)) {
 +if (!ovs_scan(argv[1], "zone=%"SCNu16, )) {
 +ds_put_cstr(, "failed to parse zone");
 +error = EINVAL;
 +goto error;
 +}
 +pzone = 
  args--;
  }

 -/* Parse zone */
 -if (args && ovs_scan(argv[args], "zone=%"SCNu16, )) {
 -pzone = 
 +/* Parse ct tuples. */
 +for (int i = 0; i < 2; i++) {
 +if (!args) {
 +break;
 +}
 +
 +struct ofp_ct_tuple *tuple =
 +i ? _reply : _orig;
 +const char *arg = argv[argc - args];
 +
 +if (arg[0] && !ofp_ct_tuple_parse(tuple, arg, ,
>> _proto,
 +  _type)) {
 +error = EINVAL;
 +goto error;
 +}
  args--;
  }
>>>
>>> Hi,
>>>
>>> There is a problem with the change here that you cannot specify now dp.
>>>
>>> # ovs-appctl dpctl/flush-conntrack system@ovs-system
>>> ovs-dpctl: field system@ovs-system missing value (Invalid argument)
>>>
>>> As I understand it the old logic was parsing from the end and if we left
>>> with 1 arg it's considered the dp.
>>>
>>> The new logic starts from first argument and fails on first failure
>>> so we actually never reach the following check if args > 1.
>>>
>>> A quick fix I tried is to goto error only if args > 1.
>>> but this leaves an opening that specifying wrong key fails on the dp arg.
>>>
>>> # ovs-dpctl flush-conntrack  system@ovs-system key=val
>>> ovs-dpctl: field system@ovs-system missing value (Invalid argument)
>>>
>>> I ended up with parsing backwards and checking if the error is on last
>>> arg or not.
>>>
>>> I'll send a fixup for review.
>>>
>>> Thanks,
>>> Roi
>>>
>>
>> Well it worked for manual run I tested but I fail the ct flush testsuite.
>>
>>  53: conntrack - ct flushFAILED (
>> system-traffic.at:2364)
>>
>> I'm having some trouble following the log maybe you can help and do the
>> correct fix.
>> i'll also try to look but reporting this for now.
>>
>> The change I tested is the following:
>>
>> --- a/lib/dpctl.c
>> +++ b/lib/dpctl.c
>> @@ -1737,12 +1737,15 @@ dpctl_flush_conntrack(int argc, const char *argv[],
>>
>>  struct ofp_ct_tuple *tuple =
>>  i ? _reply : _orig;
>> -const char *arg = argv[argc - args];
>> +const char *arg = argv[args];
>>
>>  if (arg[0] && !ofp_ct_tuple_parse(tuple, arg, ,
>> _proto,
>>_type)) {
>> -error = EINVAL;
>> -goto error;
>> +if (args > 1) {
>> +error = EINVAL;
>> +goto error;
>> +}
>> +break;
>>  }
>>  args--;
>>  }
>>
> 
> 
> Hi Roi,
> 
> thank you for looking into this.
> I'm not sure if going from the back would work (it might).
> But we can still go from beginning with something like the diff below. (I
> haven't tested that).
> WDYT?
> 
> diff --git a/lib/dpctl.c b/lib/dpctl.c
> index c501a0cd7..1174844d3 100644
> --- a/lib/dpctl.c
> +++ b/lib/dpctl.c
> 

Re: [ovs-dev] [PATCH ovn v2] northd, controller: Commit flows dropped by ACLs to conntrack

2023-03-03 Thread Abhiram Sangana



> On 13 Feb 2023, at 16:35, Abhiram Sangana  wrote:
> 
> This patch adds support to commit connections dropped/rejected by
> ACLs to the connection tracking table. Dropped connections are
> committed to conntrack only if NB_Global options:ct_commit_acl_drop
> is set to true (false by default) and ACL dropping/rejecting the
> connection has label configured. The dropped connections are
> committed in a separate conntrack zone so that they can be managed
> independently and do not interact with the connection tracking state
> of allowed connections.
> 
> This provides a new approach to identify connections dropped by ACLs
> besides the existing ACL logging and drop sampling approaches.
> 
> Each logical switch is assigned a new conntrack zone for committing
> dropped flows. The zone is loaded into register MFF_LOG_ACL_DROP_ZONE.
> A new lflow action "ct_commit_drop" is introduced that commits flows
> to connection tracking table in a zone identified by
> MFF_LOG_ACL_DROP_ZONE register. An ACL with "drop" or "reject" action
> and non-empty label translates to include "ct_commit_drop" in its
> actions instead of simply dropping/rejecting the packet.
> 
> Signed-off-by: Abhiram Sangana 
> ---
> controller/ovn-controller.c  |  14 +++-
> controller/physical.c|  32 -
> include/ovn/actions.h|   1 +
> include/ovn/logical-fields.h |   1 +
> lib/actions.c|  65 +
> lib/ovn-util.c   |   4 +-
> lib/ovn-util.h   |   2 +-
> northd/northd.c  |  25 ++-
> northd/ovn-northd.8.xml  |  30 +++-
> ovn-nb.xml   |  17 +++--
> ovn-sb.xml   |  22 ++
> tests/ovn-nbctl.at   |  10 ++-
> tests/ovn-northd.at  | 133 ---
> tests/ovn.at |  90 +++-
> utilities/ovn-nbctl.c|   7 --
> utilities/ovn-trace.c|   2 +
> 16 files changed, 383 insertions(+), 72 deletions(-)
> 

Can someone please review this patch?

Thank you,
Abhiram Sangana
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v8] netdev-offload-tc: del ufid mapping if device not exist

2023-03-03 Thread Eelco Chaudron



On 2 Mar 2023, at 8:57, Faicker Mo wrote:

> The device may be deleted and added with ifindex changed.
> The tc rules on the device will be deleted if the device is deleted.
> The func tc_del_filter will fail when flow del. The mapping of
> ufid to tc will not be deleted.
> The traffic will trigger the same flow(with same ufid) to put to tc
> on the new device. Duplicated ufid mapping will be added.
> If the hashmap is expanded, the old mapping entry will be the first entry,
> and now the dp flow can't be deleted.
>
>
> Signed-off-by: Faicker Mo 
> ---
> v2:
> - Add tc offload test case
> v3:
> - No change
> v4:
> - No change
> v5:
> - No change
> v6:
> - No change
> v7:
> - Minor fix for test case and rebased
> v8:
> - Shorten the time of the test case
> ---
>  lib/netdev-offload-tc.c  |  3 +-
>  tests/system-offloads-traffic.at | 54 
>  2 files changed, 56 insertions(+), 1 deletion(-)
>
>
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 4fb9d9f21..dd2020cad 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -276,8 +276,9 @@ del_filter_and_ufid_mapping(struct tcf_id *id, const 
> ovs_u128 *ufid,
>  }
>
>  err = tc_del_flower_filter(id);
> -if (!err) {
> +if (!err || err == ENODEV) {
>  del_ufid_tc_mapping(ufid);
> +return 0;
>  }
>  return err;
>  }
> diff --git a/tests/system-offloads-traffic.at 
> b/tests/system-offloads-traffic.at
> index 7558812eb..ba18711cb 100644
> --- a/tests/system-offloads-traffic.at
> +++ b/tests/system-offloads-traffic.at
> @@ -690,3 +690,57 @@ 
> OVS_CHECK_ACTIONS([check_pkt_len(size=200,gt(5),le(check_pkt_len(size=100,gt(5),
>
>  OVS_TRAFFIC_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([offloads - delete ufid mapping if device not exist - offloads 
> enabled])
> +OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . 
> other_config:hw-offload=true])
> +
> +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
> +
> +ADD_NAMESPACES(at_ns0, at_ns1, at_ns2)
> +
> +dnl Disable IPv6 to skip unexpected flow
> +AT_CHECK([sysctl -w net.ipv6.conf.br0.disable_ipv6=1], [0], [ignore])
> +NS_CHECK_EXEC([at_ns0], [sysctl -w net.ipv6.conf.all.disable_ipv6=1], [0], 
> [ignore])
> +NS_CHECK_EXEC([at_ns1], [sysctl -w net.ipv6.conf.all.disable_ipv6=1], [0], 
> [ignore])
> +NS_CHECK_EXEC([at_ns2], [sysctl -w net.ipv6.conf.all.disable_ipv6=1], [0], 
> [ignore])
> +
> +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24", "aa:1a:54:e9:c5:56")
> +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
> +
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 2 -i 0.2 10.1.1.2 | FORMAT_PING], [0], 
> [dnl
> +2 packets transmitted, 2 received, 0% packet loss, time 0ms
> +])
> +sleep 1
> +
> +dnl Delete and add interface ovs-p0/p0
> +AT_CHECK([ip link del dev ovs-p0])
> +AT_CHECK([ip link add p0 type veth peer name ovs-p0 || return 77])
> +AT_CHECK([ip link set p0 netns at_ns0])
> +AT_CHECK([ip link set dev ovs-p0 up])
> +NS_CHECK_EXEC([at_ns0], [ip addr add dev p0 "10.1.1.1/24"])
> +NS_CHECK_EXEC([at_ns0], [ip link set dev p0 up])
> +NS_CHECK_EXEC([at_ns0], [ip link set dev p0 address "aa:1a:54:e9:c5:56"])
> +
> +sleep 1
> +AT_CHECK([ovs-appctl revalidator/purge])
> +
> +dnl Generate flows to trigger the hmap expand once
> +ADD_VETH(p2, at_ns2, br0, "10.1.1.3/24")
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 2 -i 0.2 10.1.1.2 | FORMAT_PING], [0], 
> [dnl
> +2 packets transmitted, 2 received, 0% packet loss, time 0ms
> +])
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 2 -i 0.2 10.1.1.3 | FORMAT_PING], [0], 
> [dnl
> +2 packets transmitted, 2 received, 0% packet loss, time 0ms
> +])
> +
> +sleep 1
> +AT_CHECK([ovs-appctl revalidator/purge])
> +
> +AT_CHECK([test $(ovs-appctl dpctl/dump-flows | grep -c "eth_type(0x0800)") 
> -eq 0], [0], [ignore])
> +
> +OVS_TRAFFIC_VSWITCHD_STOP(["/could not open network device ovs-p0/d
> +/on nonexistent port/d
> +/failed to flow_get/d
> +/Failed to acquire udpif_key/d
> +"])
> +AT_CLEANUP

The changes look fine, however, you still have a 3-second delay in your tests. 
Any specific reason for this?

I tried removing them and I found that if I did I needed to add another log 
messages exclusion. With this I was able to still make it fail without your 
change, and get it to pass for 400 runs.

Thought?

//Eelco

diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at
index ba18711cb..b2a402dec 100644
--- a/tests/system-offloads-traffic.at
+++ b/tests/system-offloads-traffic.at
@@ -710,7 +710,7 @@ ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
 NS_CHECK_EXEC([at_ns0], [ping -q -c 2 -i 0.2 10.1.1.2 | FORMAT_PING], [0], [dnl
 2 packets transmitted, 2 received, 0% packet loss, time 0ms
 ])
-sleep 1
+#sleep 1

 dnl Delete and add interface ovs-p0/p0
 AT_CHECK([ip link del dev ovs-p0])
@@ -721,7 +721,7 @@ NS_CHECK_EXEC([at_ns0], [ip addr add dev p0 "10.1.1.1/24"])
 NS_CHECK_EXEC([at_ns0], [ip link set dev p0 up])
 NS_CHECK_EXEC([at_ns0], [ip link set dev p0 address 

Re: [ovs-dev] [PATCH ovn v2 0/3] northd: Optimize parsing of LSP addresses.

2023-03-03 Thread Dumitru Ceara
On 3/2/23 15:50, Mark Michelson wrote:
> On 3/2/23 09:45, Simon Horman wrote:
>> On Thu, Mar 02, 2023 at 02:14:24PM +0100, Dumitru Ceara wrote:
>>> On 3/2/23 14:11, Ilya Maximets wrote:
 This patch set optimizes usage of string operations and avoids parsing
 same LSP addresses multiple times.

 Performance tests with ovn-heater show 5-10% improvement in high-scale
 density-light scenarios.

 Version 2:
    - Reduced code duplication in is_dynamic_lsp_address(). [Mark]

>>>
>>> Thanks for this revision, Ilya!
>>>
>>> I'm thinking of applying this to the main branch and branch-23.03 before
>>> the release tomorrow.  Mark, Simon, do you guys agree with that?  In
>>> theory this is not a bug fix but it's safe enough IMO and the benefits
>>> are significant.
>>
>> Sounds reasonable to me, though I'm not clear how
>> strict the backporting rules are.
>>

committer-responsibilities.rst says:

  Feature development should be done only on master. Occasionally it makes sense
  to add a feature to the most recent release branch, before the first actual
  release of that branch. These should be handled in the same way as bug fixes,
  that is, first implemented on master and then backported.

We're kind of in that situation now.  Although this series is not a
feature but at the same time not really a bugfix either.

> 
> I'm also fine with this since this is a pretty minor changeset and its
> probability of causing problems is extremely low.
> 

Thanks for your inputs on this and for the patch and reviews!

I pushed this to main and branch-23.03.

Regards,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] usdt-scripts: use ebpf to track upcall_cost

2023-03-03 Thread Eelco Chaudron


On 3 Mar 2023, at 11:45, Adrian Moreno wrote:

> On 3/3/23 11:33, Eelco Chaudron wrote:
>>
>>
>> On 27 Feb 2023, at 15:29, Ilya Maximets wrote:
>>
>>> On 2/24/23 11:41, Adrián Moreno wrote:
 From: Adrian Moreno 

 This is basically a rewrite of the upcall_cost.py script moving some of
 it's functionality to the ebpf programs. These are the main changes:

 * Correlation of kernel upcall <-> userspace recv upcall.
 The correlation was being done in python by comparing the packet
 contents.
 We still don't have a better way to do it but, instead of doing it
 at post-processing time, compute a hash in the ebpf program so
 correlation is little more than an indirection.
 For the hash library we use jhash.h from the linux-headers package.

 * GSO packets
 We where attaching to the upcall tracepoint which is hit before
 fragmentation. Use a kprobe on kprobe__queue_userspace_packet to
 correlate upcalls to upcall queue events.

 * Userspace operation batching.
 Also move the batc tracking implementation to the ebpf programs so each
 operation (e.g: PUT, EXEC) event  contains the id (hash) of the
 correspondent RECV.

 As a result of this, it's no longer mandatory to send the packet or key
 bytes to userspace, leading to smaller events and less event drops.
 Also, accuracy is improved since we're now considering fragmentation.
 Additionally, there is no need for a time-consuming post-processing
 phase so the script is now much faster.

 Signed-off-by: Adrian Moreno 
>>>
>>> Hi, Adrian.  I'll leave the actual review to Eelco, but FYI this change
>>> is failing builds due to PEP8 issues.
>>
>> In addition, it looks like you also run the entire script through the black 
>> auto formatter, which makes reviewing a pain, due to all the style changes.
>>
>> If you really want to run this through black can you split this in a 
>> separate patch?
>>
>
> I actually only applied black formatting to the pieces I modified but since 
> I've gone back and forth several times with this script there are some 
> gratuitous formatting changes that should be undone. I'll send another 
> version.

Thanks, I’ll wait for the next rev! I think we should try to keep the style of 
the (python) file you changing, and if we want to go black, we should do it in 
one go.

//Eelco


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] usdt-scripts: use ebpf to track upcall_cost

2023-03-03 Thread Adrian Moreno



On 3/3/23 11:33, Eelco Chaudron wrote:



On 27 Feb 2023, at 15:29, Ilya Maximets wrote:


On 2/24/23 11:41, Adrián Moreno wrote:

From: Adrian Moreno 

This is basically a rewrite of the upcall_cost.py script moving some of
it's functionality to the ebpf programs. These are the main changes:

* Correlation of kernel upcall <-> userspace recv upcall.
The correlation was being done in python by comparing the packet
contents.
We still don't have a better way to do it but, instead of doing it
at post-processing time, compute a hash in the ebpf program so
correlation is little more than an indirection.
For the hash library we use jhash.h from the linux-headers package.

* GSO packets
We where attaching to the upcall tracepoint which is hit before
fragmentation. Use a kprobe on kprobe__queue_userspace_packet to
correlate upcalls to upcall queue events.

* Userspace operation batching.
Also move the batc tracking implementation to the ebpf programs so each
operation (e.g: PUT, EXEC) event  contains the id (hash) of the
correspondent RECV.

As a result of this, it's no longer mandatory to send the packet or key
bytes to userspace, leading to smaller events and less event drops.
Also, accuracy is improved since we're now considering fragmentation.
Additionally, there is no need for a time-consuming post-processing
phase so the script is now much faster.

Signed-off-by: Adrian Moreno 


Hi, Adrian.  I'll leave the actual review to Eelco, but FYI this change
is failing builds due to PEP8 issues.


In addition, it looks like you also run the entire script through the black 
auto formatter, which makes reviewing a pain, due to all the style changes.

If you really want to run this through black can you split this in a separate 
patch?



I actually only applied black formatting to the pieces I modified but since I've 
gone back and forth several times with this script there are some gratuitous 
formatting changes that should be undone. I'll send another version.



//Eelco



--
Adrián Moreno

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] usdt-scripts: use ebpf to track upcall_cost

2023-03-03 Thread Adrian Moreno



On 2/27/23 15:29, Ilya Maximets wrote:

On 2/24/23 11:41, Adrián Moreno wrote:

From: Adrian Moreno 

This is basically a rewrite of the upcall_cost.py script moving some of
it's functionality to the ebpf programs. These are the main changes:

* Correlation of kernel upcall <-> userspace recv upcall.
The correlation was being done in python by comparing the packet
contents.
We still don't have a better way to do it but, instead of doing it
at post-processing time, compute a hash in the ebpf program so
correlation is little more than an indirection.
For the hash library we use jhash.h from the linux-headers package.

* GSO packets
We where attaching to the upcall tracepoint which is hit before
fragmentation. Use a kprobe on kprobe__queue_userspace_packet to
correlate upcalls to upcall queue events.

* Userspace operation batching.
Also move the batc tracking implementation to the ebpf programs so each
operation (e.g: PUT, EXEC) event  contains the id (hash) of the
correspondent RECV.

As a result of this, it's no longer mandatory to send the packet or key
bytes to userspace, leading to smaller events and less event drops.
Also, accuracy is improved since we're now considering fragmentation.
Additionally, there is no need for a time-consuming post-processing
phase so the script is now much faster.

Signed-off-by: Adrian Moreno 


Hi, Adrian.  I'll leave the actual review to Eelco, but FYI this change
is failing builds due to PEP8 issues.



I saw that! I fixed the code before sending it but I did not update the patch!
Thanks.


Best regards, Ilya Maximets.



--
Adrián Moreno

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/1] usdt-scripts: use ebpf to track upcall_cost

2023-03-03 Thread Eelco Chaudron


On 27 Feb 2023, at 15:29, Ilya Maximets wrote:

> On 2/24/23 11:41, Adrián Moreno wrote:
>> From: Adrian Moreno 
>>
>> This is basically a rewrite of the upcall_cost.py script moving some of
>> it's functionality to the ebpf programs. These are the main changes:
>>
>> * Correlation of kernel upcall <-> userspace recv upcall.
>> The correlation was being done in python by comparing the packet
>> contents.
>> We still don't have a better way to do it but, instead of doing it
>> at post-processing time, compute a hash in the ebpf program so
>> correlation is little more than an indirection.
>> For the hash library we use jhash.h from the linux-headers package.
>>
>> * GSO packets
>> We where attaching to the upcall tracepoint which is hit before
>> fragmentation. Use a kprobe on kprobe__queue_userspace_packet to
>> correlate upcalls to upcall queue events.
>>
>> * Userspace operation batching.
>> Also move the batc tracking implementation to the ebpf programs so each
>> operation (e.g: PUT, EXEC) event  contains the id (hash) of the
>> correspondent RECV.
>>
>> As a result of this, it's no longer mandatory to send the packet or key
>> bytes to userspace, leading to smaller events and less event drops.
>> Also, accuracy is improved since we're now considering fragmentation.
>> Additionally, there is no need for a time-consuming post-processing
>> phase so the script is now much faster.
>>
>> Signed-off-by: Adrian Moreno 
>
> Hi, Adrian.  I'll leave the actual review to Eelco, but FYI this change
> is failing builds due to PEP8 issues.

In addition, it looks like you also run the entire script through the black 
auto formatter, which makes reviewing a pain, due to all the style changes.

If you really want to run this through black can you split this in a separate 
patch?

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] ovn: Handling of arp/nd learning on logical switches

2023-03-03 Thread Olaf Seibert via dev

Tangentially related to this: I was trying to detect when the problem with the "too 
many resubmits" occurs, for alerting purposes.
Some light examination of the source of ovs suggested that the coverage counter 
"drop_action_too_many_resubmit" corresponds to the error 
"XLATE_TOO_MANY_RESUBMITS".

So I saw this very recent log message in /var/log/openvswitch/ovs-vswitchd.log 
(I censored some addresses):

2023-03-03T10:06:51.453Z|00518|ofproto_dpif_xlate(handler34)|WARN|over 4096 
resubmit actions on bridge br-int while processing 
arp,in_port=1,vlan_tci=0x,dl_src=zz:zz:zz:zz:zz:zz,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=xxx.xxx.xxx.xxx,arp_tpa=yyy.yyy.yyy.yyy,arp_op=1,arp_sha=vv:vv:vv:vv:vv:vv,arp_tha=00:00:00:00:00:00

but the coverage counter seems to ignore this:

# ovs-appctl coverage/read-counter drop_action_too_many_resubmit
0

Is this a bug or am I looking at the wrong thing(s)?

Cheers,
-Olaf.

--
Olaf Seibert
Site Reliability Engineer

SysEleven GmbH
Boxhagener Straße 80
10245 Berlin

T +49 30 233 2012 0
F +49 30 616 7555 0


https://www.syseleven.de
https://www.linkedin.com/company/syseleven-gmbh/
https://www.twitter.com/SysEleven
https://www.syseleven.de/events/

Aktueller System-Status immer unter:
https://www.syseleven-status.net/

Firmensitz: Berlin
Registergericht: AG Berlin Charlottenburg, HRB 108571 Berlin
Geschäftsführer: Marc Korthaus, Jens Ihlenfeld, Andreas Hermann, Norbert Müller
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 5/5] route-table: Retrieving the preferred source address from Netlink.

2023-03-03 Thread Eelco Chaudron



On 2 Mar 2023, at 7:34, Nobuhiro MIKI wrote:

> We can use the "ip route add ... src ..." command to set the preferred
> source address for each entry in the kernel FIB. OVS has a mechanism to
> cache the FIB, but the preferred source address is ignored and
> calculated with its own logic. This patch resolves the difference
> between kernel FIB and OVS route table cache by retrieving the
> RTA_PREFSRC attribute of Netlink messages.
>
> Acked-by: Eelco Chaudron 
> Reviewed-by: Simon Horman 
> Signed-off-by: Nobuhiro MIKI 

No changes, so just adding my ack for completeness.

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 4/5] ovs-router: Introduce src option in ovs/route/add command.

2023-03-03 Thread Eelco Chaudron


On 2 Mar 2023, at 7:34, Nobuhiro MIKI wrote:

> When adding a route with ovs/route/add command, the source address
> in "ovs_router_entry" structure is always the FIRST address that the
> interface has. See "ovs_router_get_netdev_source_address"
> function for more information.
>
> If an interface has multiple ipv4 and/or ipv6 addresses, there are use
> cases where the user wants to control the source address. This patch
> therefore addresses this issue by adding a src parameter.
>
> Note that same constraints also exist when caching routes from
> Kernel FIB with Netlink, but are not dealt with in this patch.
>
> Acked-by: Eelco Chaudron 
> Reviewed-by: Simon Horman 
> Signed-off-by: Nobuhiro MIKI 

Small nit below, rest looks good.

> ---
>  NEWS|  3 ++
>  lib/ovs-router.c| 86 +
>  ofproto/ofproto-tnl-unixctl.man |  5 +-
>  tests/ovs-router.at | 80 +-
>  4 files changed, 161 insertions(+), 13 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index ad84898ce80f..2bae48d60829 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -12,6 +12,9 @@ Post-v3.1.0
> in order to create OVSDB sockets with access mode of 0770.
> - QoS:
>   * Added new configuration option 'jitter' for a linux-netem QoS type.
> +   - ovs-appctl:
> + * Add support for selecting the source address with the
> +   “ovs-appctl ovs/route/add" command.
>
>
>  v3.1.0 - 16 Feb 2023
> diff --git a/lib/ovs-router.c b/lib/ovs-router.c
> index a47c2b9bd3fd..5255049c3a2f 100644
> --- a/lib/ovs-router.c
> +++ b/lib/ovs-router.c
> @@ -164,6 +164,46 @@ static void rt_init_match(struct match *match, uint32_t 
> mark,
>  match->flow.pkt_mark = mark;
>  }
>
> +static int
> +verify_prefsrc(const struct in6_addr *ip6_dst,
> +   const char output_bridge[],
> +   struct in6_addr *prefsrc)
> +{
> +struct in6_addr *mask, *addr6;
> +struct netdev *dev;
> +int err, n_in6, i;
> +
> +err = netdev_open(output_bridge, NULL, );
> +if (err) {
> +return err;
> +}
> +
> +err = netdev_get_addr_list(dev, , , _in6);
> +if (err) {
> +goto out;
> +}
> +
> +for (i = 0; i < n_in6; i++) {
> +struct in6_addr a1, a2;
> +a1 = ipv6_addr_bitand(ip6_dst, [i]);
> +a2 = ipv6_addr_bitand(prefsrc, [i]);
> +
> +/* Check that the intarface has "prefsrc" and
> + * it is same broadcast domain with "ip6_dst". */
> +if (IN6_ARE_ADDR_EQUAL(prefsrc, [i]) &&
> +IN6_ARE_ADDR_EQUAL(, )) {
> +goto out;
> +}
> +}
> +err = ENOENT;
> +
> +out:
> +free(addr6);
> +free(mask);
> +netdev_close(dev);
> +return err;
> +}
> +
>  int
>  ovs_router_get_netdev_source_address(const struct in6_addr *ip6_dst,
>   const char output_bridge[],
> @@ -217,8 +257,12 @@ static int
>  ovs_router_insert__(uint32_t mark, uint8_t priority, bool local,
>  const struct in6_addr *ip6_dst,
>  uint8_t plen, const char output_bridge[],
> -const struct in6_addr *gw)
> +const struct in6_addr *gw,
> +const struct in6_addr *ip6_src)
>  {
> +int (*get_src_addr)(const struct in6_addr *ip6_dst,
> +const char output_bridge[],
> +struct in6_addr *prefsrc);
>  const struct cls_rule *cr;
>  struct ovs_router_entry *p;
>  struct match match;
> @@ -236,11 +280,17 @@ ovs_router_insert__(uint32_t mark, uint8_t priority, 
> bool local,
>  p->plen = plen;
>  p->local = local;
>  p->priority = priority;
> -err = ovs_router_get_netdev_source_address(ip6_dst, output_bridge,
> -   >src_addr);
> +
> +if (ipv6_addr_is_set(ip6_src)) {
> +p->src_addr = *ip6_src;
> +get_src_addr = verify_prefsrc;
> +} else {
> +get_src_addr = ovs_router_get_netdev_source_address;
> +}
> +
> +err = get_src_addr(ip6_dst, output_bridge, >src_addr);
>  if (err && ipv6_addr_is_set(gw)) {
> -err = ovs_router_get_netdev_source_address(gw, output_bridge,
> -   >src_addr);
> +err = get_src_addr(gw, output_bridge, >src_addr);
>  }
>  if (err) {
>  struct ds ds = DS_EMPTY_INITIALIZER;
> @@ -274,7 +324,8 @@ ovs_router_insert(uint32_t mark, const struct in6_addr 
> *ip_dst, uint8_t plen,
>  {
>  if (use_system_routing_table) {
>  uint8_t priority = local ? plen + 64 : plen;
> -ovs_router_insert__(mark, priority, local, ip_dst, plen, 
> output_bridge, gw);
> +ovs_router_insert__(mark, priority, local, ip_dst, plen,
> +output_bridge, gw, _any);
>  }
>  }
>
> @@ -341,10 +392,13 @@ static void
>  ovs_router_add(struct unixctl_conn *conn, int argc,
>  

Re: [ovs-dev] [PATCH v6 3/5] ofproto: Fix mam page for tunnel related commands.

2023-03-03 Thread Eelco Chaudron



On 2 Mar 2023, at 7:34, Nobuhiro MIKI wrote:

> Fixed the manual page to indicate that both IPv4/IPv6
> are supported. Also added missing pkt_mark on one side
> and fixed the "gw" and "bridge" notation quirks.
>
> Reviewed-by: Simon Horman 
> Signed-off-by: Nobuhiro MIKI 
> ---
>  lib/ovs-router.c| 4 ++--
>  ofproto/ofproto-tnl-unixctl.man | 8 
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/lib/ovs-router.c b/lib/ovs-router.c
> index b5ac1edb6c65..a47c2b9bd3fd 100644
> --- a/lib/ovs-router.c
> +++ b/lib/ovs-router.c
> @@ -537,12 +537,12 @@ ovs_router_init(void)
>  fatal_signal_add_hook(ovs_router_flush_handler, NULL, NULL, true);
>  classifier_init(, NULL);
>  unixctl_command_register("ovs/route/add",
> - "ip_addr/prefix_len out_br_name [gw] "
> + "ip/plen output_bridge [gw] "
>   "[pkt_mark=mark]",
>   2, 4, ovs_router_add, NULL);
>  unixctl_command_register("ovs/route/show", "", 0, 0,
>   ovs_router_show, NULL);
> -unixctl_command_register("ovs/route/del", "ip_addr/prefix_len "
> +unixctl_command_register("ovs/route/del", "ip/plen "

You are changing the text here, but the error messages you updated in the 
previous patch still have the old names.

>   "[pkt_mark=mark]", 1, 2, ovs_router_del,
>   NULL);
>  unixctl_command_register("ovs/route/lookup", "ip_addr "
> diff --git a/ofproto/ofproto-tnl-unixctl.man b/ofproto/ofproto-tnl-unixctl.man
> index 13a465119a90..6ed7e7fcea9b 100644
> --- a/ofproto/ofproto-tnl-unixctl.man
> +++ b/ofproto/ofproto-tnl-unixctl.man
> @@ -1,8 +1,8 @@
>  .SS "OPENVSWITCH TUNNELING COMMANDS"
>  These commands query and modify OVS tunnel components.
>  .
> -.IP "\fBovs/route/add ipv4_address/plen output_bridge [GW]\fR"
> -Adds ipv4_address/plen route to vswitchd routing table. output_bridge
> +.IP "\fBovs/route/add ip/plen output_bridge [gw] [pkt_mark=mark]\fR"
> +Adds ip/plen route to vswitchd routing table. output_bridge
>  needs to be OVS bridge name.  This command is useful if OVS cached
>  routes does not look right.
>  .
> @@ -10,8 +10,8 @@ routes does not look right.
>  Print all routes in OVS routing table, This includes routes cached
>  from system routing table and user configured routes.
>  .
> -.IP "\fBovs/route/del ipv4_address/plen\fR"
> -Delete ipv4_address/plen route from OVS routing table.
> +.IP "\fBovs/route/del ip/plen [pkt_mark=mark]\fR"
> +Delete ip/plen route from OVS routing table.
>  .
>  .IP "\fBtnl/neigh/show\fR"
>  .IP "\fBtnl/arp/show\fR"
> -- 
> 2.31.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 2/5] ovs-router: Cleanup parser for ovs/route/add command.

2023-03-03 Thread Eelco Chaudron



On 2 Mar 2023, at 7:34, Nobuhiro MIKI wrote:

> This patch cleans up the parser to accept pkt_mark and gw in any order.
>
> pkt_mark and gw are normally expected to be specified exactly once.
> However, as with other tools, if specified multiple times, the last
> specification is used. Also, pkt_mark and gw have separate prefix
> strings so they can be parsed in any order.
>
> Acked-by: Eelco Chaudron 
> Reviewed-by: Simon Horman 
> Signed-off-by: Nobuhiro MIKI 

Changes look good.

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 1/5] netdev-dummy: Support multiple IP addresses.

2023-03-03 Thread Eelco Chaudron



On 2 Mar 2023, at 7:34, Nobuhiro MIKI wrote:

> This is useful in test cases where multiple IPv4/IPv6 addresses
> are assigned together.
>
> Acked-by: Eelco Chaudron 
> Reviewed-by: Simon Horman 
> Signed-off-by: Nobuhiro MIKI 
> ---

This patch still looks good with the additional changes!

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] ovn: Handling of arp/nd learning on logical switches

2023-03-03 Thread Felix Hüttner via dev
Hello everyone,

We had a discussion on ovs-discuss that we wanted to bring here [1]:

Assume a physical network connected to a OVN Logical_Switch and then multiple 
Logical_Routers like so:

++
| Logical Router |
| lr001  +-+
++ |
   |
++ |
| Logical Router | | ++ +--+
| lr002  +-+-+ Logical Switch +-+ Phyiscal Network |
++ | | ls-ext | |  |
   | ++ +--+
  ...  |
   |
++ |
| Logical Router | |
| lr300  +-+
++

If now a multicast packet comes in to ls-ext from the physical network (e.g. 
for our case a arp request or a neighbor discovery) it is flooded to all 
attached lsp's.
The logical routers then try to lookup the source of the arp request/nd in 
their arp/neighbor table and insert it if it is not found.
For the picture above that means that a single arp packet can trigger 300 
lookup_arp and put_arp actions and each put_arp would result in a controller 
action.
The outcome would be the same for all 300 logical routers: Each of them would 
insert the mac/ip binding in the MAC_Binding table and for each of them flows 
would be added.
This leads to load on the ovn-controllers and the southbound database. Also 
some of the put_arp action can easily be lost due to the queue limit to the 
ovn-controller.

In the mailinglist thread mentioned above ([1]) there was the discussion to 
move this learning process to the Logical Switch.
This would mean that:
1. we only need to handle the learning in one location (and therefor only do it 
once)
2. the MAC_Binding table in the southbound database does not contain the 
(roughly) same entry 300 times

However we where unsure if there is anything speaking against such an approach.
>From my naive understanding I would propose the following changes:

1. move the whole lr_in_lookup_neighbor (table 1) and lr_in_learn_neighbor 
(table 2) from the logical router pipeline to the logical switch pipeline 
between ls_in_hairpin (table 18) and ls_in_arp_rsp (table 19)

2a. Leave the arp resolve stage in the logical routers: teach 
add_neighbor_flows to not only learn from the Mac_Binding table based on 
logical_ports but also based on the datapaths these ports are connected to

2b. Or move the arp resolve stage to the logical switches:
   1) move lr_in_arp_resolve (table 17) and lr_in_arp_request (table 21) from 
the logical router to the logical switch pipeline
   2) clarify what we would then use as a source address from arp requests that 
would then originate from the logical switch pipeline
   3) clarify how we signal to the logical switch that it should actually do 
the arp lookup (as static mac_bindings for individual routers probably still 
need to work).

I would for now tend to just move the learning stage to the logical switches 
while keeping the arp resolution stage on the logical router side.

I guess I have overlooked something important in here as well, so it would be 
great if I could get feedback on your views on this.

Thanks

[1]: https://mail.openvswitch.org/pipermail/ovs-discuss/2023-March/052268.html

--
Felix Huettner

Diese E Mail enthält möglicherweise vertrauliche Inhalte und ist nur für die 
Verwertung durch den vorgesehenen Empfänger bestimmt. Sollten Sie nicht der 
vorgesehene Empfänger sein, setzen Sie den Absender bitte unverzüglich in 
Kenntnis und löschen diese E Mail. Hinweise zum Datenschutz finden Sie 
hier.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev