Re: [ovs-dev] [PATCH ovn] ovn-northd.at: Fix occasional LSP I-P test failure due to initializtion phase.

2023-07-07 Thread Han Zhou
On Fri, Jul 7, 2023 at 8:31 PM Ales Musil  wrote:
>
>
>
> On Fri, Jul 7, 2023 at 8:30 AM Han Zhou  wrote:
>>
>> After the commit 0c1bde1c4a the recompute counters are more predictable,
>> so we changed the LSP incremental processing test to not tolerate any
>> failures (instead of 50% successful rate). But the test would then fail
>> occasionally at the first check, because sometimes the update of the
>> initial configurations from ovn-controller such as for tunnel interface
>> creation come too late, after we cleaned the stats counters and start
>> the LSP tests.
>>
>> This patch fixes it by creating a pilot port and wait for it to be up,
>> so that we know the initial ovn-controller configurations are done, and
>> will not interfere with our I-P tests.
>>
>> Fixes: 0c1bde1c4a47 ("ovn-northd: Avoid recompute caused by in-flight
transactions.")
>> Signed-off-by: Han Zhou 
>> ---
>>  tests/ovn-northd.at | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index e79d33b2aec5..3e06f14c9437 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -9532,6 +9532,13 @@ check_recompute_counter() {
>>
>>  check ovn-nbctl --wait=hv ls-add ls0
>>
>> +# Create a pilot port and wait it up to make sure we are ready for the
real
>> +# tests, so that the counters measured are accurate.
>> +check ovn-nbctl --wait=hv lsp-add ls0 lsp-pilot -- lsp-set-addresses
lsp-pilot "unknown"
>> +ovs-vsctl add-port br-int lsp-pilot -- set interface lsp-pilot
external_ids:iface-id=lsp-pilot
>> +wait_for_ports_up
>> +check ovn-nbctl --wait=hv sync
>> +
>>  check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
>>  check ovn-nbctl --wait=hv lsp-add ls0 lsp0-0 -- lsp-set-addresses
lsp0-0 "unknown"
>>  ovs-vsctl add-port br-int lsp0-0 -- set interface lsp0-0
external_ids:iface-id=lsp0-0
>> --
>> 2.31.1
>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
>
> Looks good to me, thanks.
>
> Acked-by: Ales Musil 
>

Thanks Ales. I applied to main.

Han

> --
>
> Ales Musil
>
> Senior Software Engineer - OVN Core
>
> Red Hat EMEA
>
> amu...@redhat.comIM: amusil
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next 2/2] net: openvswitch: add drop action

2023-07-07 Thread Jakub Kicinski
On Fri, 7 Jul 2023 18:04:36 +0200 Ilya Maximets wrote:
> >> That already exists, right? Johannes added it in the last release for 
> >> WiFi.  
> > 
> > I'm not sure.  The SKB_DROP_REASON_SUBSYS_MAC80211_UNUSABLE behaves 
> > similarly
> > to that on a surface.  However, looking closer, any value that can be passed
> > into ieee80211_rx_handlers_result() and ends up in the kfree_skb_reason() is
> > kind of defined in net/mac80211/drop.h, unless I'm missing something (very
> > possible, because I don't really know wifi code).
> > 
> > The difference, I guess, is that for openvswitch values will be provided by
> > the userpsace application via netlink interface.  It'll be just a number not
> > defined anywhere in the kernel.  Only the subsystem itself will be defined
> > in order to occupy the range.  Garbage in, same garbage out, from the 
> > kernel's
> > perspective.  
> 
> To be clear, I think, not defining them in this particular case is better.
> Definition of every reason that userspace can come up with will add extra
> uAPI maintenance cost/issues with no practical benefits.  Values are not
> going to be used for anything outside reporting a drop reason and subsystem
> offset is not part of uAPI anyway.

Ah, I see. No, please don't stuff user space defined values into 
the drop reason. The reasons are for debugging the kernel stack 
itself. IOW it'd be abuse not reuse.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 3/3] checkpatch: print subject field if misspelled or missing

2023-07-07 Thread 0-day Robot
Bleep bloop.  Greetings Chandan Somani, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line has trailing whitespace
#27 FILE: utilities/checkpatch.py:1030:
print("Subject missing! Your provisional subject is", 

Lines checked: 38, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 3/3] checkpatch: print subject field if misspelled or missing

2023-07-07 Thread Chandan Somani
This narrows down spelling errors that are in the commit
subject. In v2, it also provides a subject if the subject
line is missing. The provisional subject is the name of the
patch file, which should provide some context about the patch.

Signed-off-by: Chandan Somani 
---
 utilities/checkpatch.py | 8 
 1 file changed, 8 insertions(+)

diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
index e5d5029f2..f34e4bf2a 100755
--- a/utilities/checkpatch.py
+++ b/utilities/checkpatch.py
@@ -1024,6 +1024,14 @@ def ovs_checkpatch_file(filename):
 result = ovs_checkpatch_parse(part.get_payload(decode=False), filename,
   mail.get('Author', mail['From']),
   mail['Commit'])
+if spellcheck:
+if not mail['Subject'].strip():
+mail.replace_header('Subject', sys.argv[-1])
+print("Subject missing! Your provisional subject is", 
+  mail['Subject'])
+if check_spelling(mail['Subject'], False):
+print("Subject: %s" % mail['Subject'])
+
 ovs_checkpatch_print_result()
 return result
 
-- 
2.26.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 2/3] checkpatch: add suggestions to the spell checker

2023-07-07 Thread Chandan Somani
This will be useful for correcting possible spelling
mistakes with ease. Suggestions limited to 3 at first,
but configurable in the future.

Signed-off-by: Chandan Somani 
Acked-by: Aaron Conole 
---
 utilities/checkpatch.py | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
index acf9a0102..e5d5029f2 100755
--- a/utilities/checkpatch.py
+++ b/utilities/checkpatch.py
@@ -412,6 +412,7 @@ def check_spelling(line, comment):
 words = words.replace(':', ' ').split(' ')
 
 flagged_words = []
+num_suggestions = 3
 
 for word in words:
 skip = False
@@ -442,6 +443,8 @@ def check_spelling(line, comment):
 if len(flagged_words) > 0:
 for mistake in flagged_words:
 print_warning("Possible misspelled word: \"%s\"" % mistake)
+print("Did you mean: ",
+  spell_check_dict.suggest(mistake)[:num_suggestions])
 
 return True
 
-- 
2.26.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 1/3] checkpatch: reorganize flagged words using a list

2023-07-07 Thread Chandan Somani
Single out flagged words and allow for more useful
details, like spelling suggestions. Fixed syntax
error from v1

Signed-off-by: Chandan Somani 
---
 utilities/checkpatch.py | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/utilities/checkpatch.py b/utilities/checkpatch.py
index 64f0efeb4..acf9a0102 100755
--- a/utilities/checkpatch.py
+++ b/utilities/checkpatch.py
@@ -411,6 +411,8 @@ def check_spelling(line, comment):
 words = filter_comments(line, True) if comment else line
 words = words.replace(':', ' ').split(' ')
 
+flagged_words = []
+
 for word in words:
 skip = False
 strword = re.subn(r'\W+', '', word)[0].replace(',', '')
@@ -435,9 +437,13 @@ def check_spelling(line, comment):
 skip = True
 
 if not skip:
-print_warning("Check for spelling mistakes (e.g. \"%s\")"
-  % strword)
-return True
+flagged_words.append(strword)
+
+if len(flagged_words) > 0:
+for mistake in flagged_words:
+print_warning("Possible misspelled word: \"%s\"" % mistake)
+
+return True
 
 return False
 
-- 
2.26.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 0/8] northd: I-P for load balancer and lb groups

2023-07-07 Thread Mark Michelson

Hi Numan,

I gave the series a look. I've looked at the code but haven't yet run 
any tests with it. The main reason for this is that the series does not 
apply cleanly to OVN main.


Overall, I only have small notes. I've replied to patches 2 and 3 with 
the specifics.


On 7/7/23 01:51, num...@ovn.org wrote:

From: Numan Siddique 

This patch series adds the support to handle load balancer and
load balancer group changes incrementally in the "northd" engine
node.  "flow" engine node doesn't support I-P yet and falls back
to full recompute.

Note:  I still need to do some scale testing and measure the
improvements.  I'll first attempt adding I-P in the "flow"
engine node before attempting some scale tests.  Neverthless
this patch series is good enough for reviews.


v1 -> v2

   * Resolved the conflicts and rebased
   * Fixed a bug in p6.

Numan Siddique (8):
   northd I-P: Sync SB load balancers in a separate engine node.
   northd: Add a new engine node - northd_lb_data.
   northd: Add initial I-P for load balancer and load balancer groups
   northd: Refactor the 'northd' node code which handles logical switch
 changes.
   northd: Handle load balancer changes for a logical switch.
   northd: Handle load balancer group changes for a logical switch.
   northd: Sync SB Port bindings NAT column in a separate engine node.
   northd: Handle load balancer changes for a logical router.

  lib/lb.c   |  356 ++-
  lib/lb.h   |  113 ++-
  northd/automake.mk |2 +
  northd/en-lflow.c  |8 +-
  northd/en-northd-lb-data.c |  308 ++
  northd/en-northd-lb-data.h |   56 ++
  northd/en-northd.c |   57 +-
  northd/en-northd.h |2 +
  northd/en-sync-sb.c|   60 ++
  northd/en-sync-sb.h|9 +
  northd/inc-proc-northd.c   |   27 +-
  northd/northd.c| 1858 +---
  northd/northd.h|   44 +-
  tests/ovn-northd.at|  131 +++
  14 files changed, 2354 insertions(+), 677 deletions(-)
  create mode 100644 northd/en-northd-lb-data.c
  create mode 100644 northd/en-northd-lb-data.h



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 2/8] northd: Add a new engine node - northd_lb_data.

2023-07-07 Thread Mark Michelson

Hi Numan,

I have one small nit below.

On 7/7/23 01:53, num...@ovn.org wrote:

From: Numan Siddique 

This patch separates out the 'lbs' and 'lb_groups' from the 'northd' engine
node data into a new engine node  'northd_lb_data'. This new node becomes
an input to the 'northd' node.

This makes handling the NB load balancer and load balancer group changes
easier.

Signed-off-by: Numan Siddique 
---
  lib/lb.c   | 201 +--
  lib/lb.h   |  86 +++--
  northd/automake.mk |   2 +
  northd/en-lflow.c  |   3 +-
  northd/en-northd-lb-data.c | 126 +++
  northd/en-northd-lb-data.h |  19 ++
  northd/en-northd.c |  11 +-
  northd/en-sync-sb.c|   2 +-
  northd/inc-proc-northd.c   |   8 +-
  northd/northd.c| 673 +
  northd/northd.h|  15 +-
  11 files changed, 780 insertions(+), 366 deletions(-)
  create mode 100644 northd/en-northd-lb-data.c
  create mode 100644 northd/en-northd-lb-data.h

diff --git a/lib/lb.c b/lib/lb.c
index 7afdaed65b..429dbf15af 100644
--- a/lib/lb.c
+++ b/lib/lb.c
@@ -26,6 +26,7 @@
  #include "openvswitch/vlog.h"
  #include "lib/bitmap.h"
  #include "lib/smap.h"
+#include "socket-util.h"
  
  VLOG_DEFINE_THIS_MODULE(lb);
  
@@ -431,11 +432,62 @@ void ovn_northd_lb_vip_init(struct ovn_northd_lb_vip *lb_vip_nb,

  ovn_lb_get_health_check(nbrec_lb, vip_port_str, template);
  }
  
+static void

+ovn_lb_vip_backends_health_check_init(const struct ovn_northd_lb *lb,
+  const struct ovn_lb_vip *lb_vip,
+  struct ovn_northd_lb_vip *lb_vip_nb)
+{
+struct ds key = DS_EMPTY_INITIALIZER;
+
+for (size_t j = 0; j < lb_vip->n_backends; j++) {
+struct ovn_lb_backend *backend = _vip->backends[j];
+ds_clear();
+ds_put_format(, IN6_IS_ADDR_V4MAPPED(_vip->vip)
+  ? "%s" : "[%s]", backend->ip_str);
+
+const char *s = smap_get(>nlb->ip_port_mappings, ds_cstr());
+if (!s) {
+continue;
+}
+
+char *svc_mon_src_ip = NULL;
+char *port_name = xstrdup(s);
+char *p = strstr(port_name, ":");
+if (p) {
+*p = 0;
+p++;
+struct sockaddr_storage svc_mon_src_addr;
+if (!inet_parse_address(p, _mon_src_addr)) {
+static struct vlog_rate_limit rl =
+VLOG_RATE_LIMIT_INIT(5, 1);
+VLOG_WARN_RL(, "Invalid svc mon src IP %s", p);
+} else {
+struct ds src_ip_s = DS_EMPTY_INITIALIZER;
+ss_format_address_nobracks(_mon_src_addr,
+_ip_s);
+svc_mon_src_ip = ds_steal_cstr(_ip_s);
+}
+}
+
+if (svc_mon_src_ip) {
+struct ovn_northd_lb_backend *backend_nb =
+_vip_nb->backends_nb[j];
+backend_nb->health_check = true;
+backend_nb->logical_port = xstrdup(port_name);
+backend_nb->svc_mon_src_ip = svc_mon_src_ip;
+}
+free(port_name);
+}
+
+ds_destroy();
+}
+
  static
  void ovn_northd_lb_vip_destroy(struct ovn_northd_lb_vip *vip)
  {
  free(vip->backend_ips);
  for (size_t i = 0; i < vip->n_backends; i++) {
+free(vip->backends_nb[i].logical_port);
  free(vip->backends_nb[i].svc_mon_src_ip);
  }
  free(vip->backends_nb);
@@ -555,8 +607,7 @@ ovn_lb_get_health_check(const struct nbrec_load_balancer 
*nbrec_lb,
  }
  
  struct ovn_northd_lb *

-ovn_northd_lb_create(const struct nbrec_load_balancer *nbrec_lb,
- size_t n_ls_datapaths, size_t n_lr_datapaths)
+ovn_northd_lb_create(const struct nbrec_load_balancer *nbrec_lb)
  {
  bool template = smap_get_bool(_lb->options, "template", false);
  bool is_udp = nullable_string_is_equal(nbrec_lb->protocol, "udp");
@@ -595,9 +646,6 @@ ovn_northd_lb_create(const struct nbrec_load_balancer 
*nbrec_lb,
  }
  lb->affinity_timeout = affinity_timeout;
  
-lb->nb_ls_map = bitmap_allocate(n_ls_datapaths);

-lb->nb_lr_map = bitmap_allocate(n_lr_datapaths);
-
  sset_init(>ips_v4);
  sset_init(>ips_v6);
  struct smap_node *node;
@@ -631,7 +679,12 @@ ovn_northd_lb_create(const struct nbrec_load_balancer 
*nbrec_lb,
  ovn_lb_vip6_template_format_internal(lb_vip),
  xstrdup(node->value));
  }
+
  n_vips++;
+
+if (lb_vip_nb->lb_health_check) {
+ovn_lb_vip_backends_health_check_init(lb, lb_vip, lb_vip_nb);
+}
  }
  
  /* It's possible that parsing VIPs fails.  Update the lb->n_vips to the

@@ -639,6 +692,7 @@ ovn_northd_lb_create(const struct nbrec_load_balancer 
*nbrec_lb,
   */
  lb->n_vips = n_vips;
  
+

  if (nbrec_lb->n_selection_fields) {
  char *proto = NULL;
  if 

Re: [ovs-dev] [PATCH ovn v2 3/8] northd: Add initial I-P for load balancer and load balancer groups

2023-07-07 Thread Mark Michelson

Hi Numan,

I have one small nit below.

On 7/7/23 01:53, num...@ovn.org wrote:

From: Numan Siddique 

Any changes to load balancers and load balancer groups
are handled incrementally in the newly added 'northd_lb_data'
engine node.  'northd_lb_data' is input to 'northd' node
and the handler - northd_northd_lb_data_handler in 'northd'
node handles the changes.

If a load balancer or load balancer group is associated to
a logical switch or router then 'northd' node falls back
to full recompute.

Signed-off-by: Numan Siddique 
---
  lib/lb.c   |  85 
  lib/lb.h   |   9 ++
  northd/en-northd-lb-data.c | 202 +++--
  northd/en-northd-lb-data.h |  37 +++
  northd/en-northd.c |  24 +
  northd/en-northd.h |   1 +
  northd/inc-proc-northd.c   |  13 ++-
  northd/northd.c|  77 ++
  northd/northd.h|   7 ++
  tests/ovn-northd.at| 123 ++
  10 files changed, 545 insertions(+), 33 deletions(-)

diff --git a/lib/lb.c b/lib/lb.c
index 429dbf15af..a51fe225fa 100644
--- a/lib/lb.c
+++ b/lib/lb.c
@@ -606,13 +606,13 @@ ovn_lb_get_health_check(const struct nbrec_load_balancer 
*nbrec_lb,
  return NULL;
  }
  
-struct ovn_northd_lb *

-ovn_northd_lb_create(const struct nbrec_load_balancer *nbrec_lb)
+static void
+ovn_northd_lb_init(struct ovn_northd_lb *lb,
+   const struct nbrec_load_balancer *nbrec_lb)
  {
  bool template = smap_get_bool(_lb->options, "template", false);
  bool is_udp = nullable_string_is_equal(nbrec_lb->protocol, "udp");
  bool is_sctp = nullable_string_is_equal(nbrec_lb->protocol, "sctp");
-struct ovn_northd_lb *lb = xzalloc(sizeof *lb);
  int address_family = !strcmp(smap_get_def(_lb->options,
"address-family", "ipv4"),
   "ipv4")
@@ -668,6 +668,10 @@ ovn_northd_lb_create(const struct nbrec_load_balancer 
*nbrec_lb)
"reject", false);
  ovn_northd_lb_vip_init(lb_vip_nb, lb_vip, nbrec_lb,
 node->key, node->value, template);
+if (lb_vip_nb->lb_health_check) {
+lb->health_checks = true;
+}
+
  if (IN6_IS_ADDR_V4MAPPED(_vip->vip)) {
  sset_add(>ips_v4, lb_vip->vip_str);
  } else {
@@ -713,6 +717,13 @@ ovn_northd_lb_create(const struct nbrec_load_balancer 
*nbrec_lb)
  ds_chomp(_fields, ',');
  lb->selection_fields = ds_steal_cstr(_fields);
  }
+}
+
+struct ovn_northd_lb *
+ovn_northd_lb_create(const struct nbrec_load_balancer *nbrec_lb)
+{
+struct ovn_northd_lb *lb = xzalloc(sizeof *lb);
+ovn_northd_lb_init(lb, nbrec_lb);
  return lb;
  }
  
@@ -738,8 +749,8 @@ ovn_northd_lb_get_vips(const struct ovn_northd_lb *lb)

  return >nlb->vips;
  }
  
-void

-ovn_northd_lb_destroy(struct ovn_northd_lb *lb)
+static void
+ovn_northd_lb_cleanup(struct ovn_northd_lb *lb)
  {
  for (size_t i = 0; i < lb->n_vips; i++) {
  ovn_lb_vip_destroy(>vips[i]);
@@ -747,26 +758,36 @@ ovn_northd_lb_destroy(struct ovn_northd_lb *lb)
  }
  free(lb->vips);
  free(lb->vips_nb);
+lb->vips = NULL;
+lb->vips_nb = NULL;
  smap_destroy(>template_vips);
  sset_destroy(>ips_v4);
  sset_destroy(>ips_v6);
  free(lb->selection_fields);
+lb->selection_fields = NULL;
+lb->health_checks = false;
+}
+
+void
+ovn_northd_lb_destroy(struct ovn_northd_lb *lb)
+{
+ovn_northd_lb_cleanup(lb);
  free(lb);
  }
  
-/* Constructs a new 'struct ovn_lb_group' object from the Nb LB Group record

- * and a hash map of all existing 'struct ovn_northd_lb' objects.  Space will
- * be allocated for 'max_ls_datapaths' logical switches and 'max_lr_datapaths'
- * logical routers to which this LB Group is applied.  Can be filled later
- * with ovn_lb_group_add_ls() and ovn_lb_group_add_lr() respectively. */
-struct ovn_lb_group *
-ovn_lb_group_create(const struct nbrec_load_balancer_group *nbrec_lb_group,
-const struct hmap *lbs)
+void
+ovn_northd_lb_reinit(struct ovn_northd_lb *lb,
+ const struct nbrec_load_balancer *nbrec_lb)
  {
-struct ovn_lb_group *lb_group;
+ovn_northd_lb_cleanup(lb);
+ovn_northd_lb_init(lb, nbrec_lb);
+}
  
-lb_group = xzalloc(sizeof *lb_group);

-lb_group->uuid = nbrec_lb_group->header_.uuid;
+static void
+ovn_lb_group_init(struct ovn_lb_group *lb_group,
+  const struct nbrec_load_balancer_group *nbrec_lb_group,
+  const struct hmap *lbs)
+{
  lb_group->n_lbs = nbrec_lb_group->n_load_balancer;
  lb_group->lbs = xmalloc(lb_group->n_lbs * sizeof *lb_group->lbs);
  lb_group->lb_ips = ovn_lb_ip_set_create();
@@ -776,10 +797,30 @@ ovn_lb_group_create(const struct 
nbrec_load_balancer_group *nbrec_lb_group,
  

Re: [ovs-dev] [PATCH v9 4/4] userspace: Enable L4 checksum offloading by default.

2023-07-07 Thread Flavio Leitner

Please ignore this email.
fbl

On 7/7/23 16:21, Flavio Leitner wrote:

From: Ilya Maximets 

On 11/24/22 06:30, Mike Pattrick wrote:

From: Flavio Leitner 

The netdev receiving packets is supposed to provide the flags
indicating if the L4 checksum was verified and it is OK or BAD,
otherwise the stack will check when appropriate by software.

If the packet comes with good checksum, then postpone the
checksum calculation to the egress device if needed.

When encapsulate a packet with that flag, set the checksum
of the inner L4 header since that is not yet supported.

Calculate the L4 checksum when the packet is going to be sent
over a device that doesn't support the feature.

Linux tap devices allows enabling L3 and L4 offload, so this
patch enables the feature. However, Linux socket interface
remains disabled because the API doesn't allow enabling
those two features without enabling TSO too.

Signed-off-by: Flavio Leitner 
Co-authored-by: Mike Pattrick 
Signed-off-by: Mike Pattrick 
---

Didn't test this as well.  Only visual review.

Should we enable checksum offloading in CONFIGURE_VETH_OFFLOADS for
check-system-userspace testsuite since support is enabled by default?

More comments inline.

Best regards, Ilya Maximets.


  lib/conntrack.c |  15 +--
  lib/dp-packet.c |  25 
  lib/dp-packet.h |  78 -
  lib/flow.c  |  23 
  lib/netdev-dpdk.c   | 188 --
  lib/netdev-linux.c  | 252 ++--
  lib/netdev-native-tnl.c |  32 +
  lib/netdev.c|  46 ++--
  lib/packets.c   | 175 ++--
  lib/packets.h   |   3 +
  10 files changed, 580 insertions(+), 257 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 12194cce8..57e6a55e0 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2118,13 +2118,12 @@ conn_key_extract(struct conntrack *ct, struct dp_packet 
*pkt, ovs_be16 dl_type,
  }
  
  if (ok) {

-bool hwol_bad_l4_csum = dp_packet_l4_checksum_bad(pkt);
-if (!hwol_bad_l4_csum) {
-bool  hwol_good_l4_csum = dp_packet_l4_checksum_good(pkt)
-  || dp_packet_hwol_tx_l4_checksum(pkt);
+if (!dp_packet_l4_checksum_bad(pkt)) {
  /* Validate the checksum only when hwol is not supported. */
  if (extract_l4(>key, l4, dp_packet_l4_size(pkt),
-   >icmp_related, l3, !hwol_good_l4_csum,
+   >icmp_related, l3,
+   !dp_packet_l4_checksum_good(pkt) &&
+   !dp_packet_hwol_tx_l4_checksum(pkt),
 NULL)) {
  ctx->hash = conn_key_hash(>key, ct->hash_basis);
  return true;
@@ -3453,8 +3452,10 @@ handle_ftp_ctl(struct conntrack *ct, const struct 
conn_lookup_ctx *ctx,
  adj_seqnum(>tcp_seq, ec->seq_skew);
  }
  
-th->tcp_csum = 0;

-if (!dp_packet_hwol_tx_l4_checksum(pkt)) {
+if (dp_packet_hwol_tx_l4_checksum(pkt)) {
+dp_packet_ol_reset_l4_csum_good(pkt);
+} else {
+th->tcp_csum = 0;
  if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
  th->tcp_csum = packet_csum_upperlayer6(nh6, th, ctx->key.nw_proto,
 dp_packet_l4_size(pkt));
diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 90ef85de3..2cfaf5274 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -38,6 +38,9 @@ dp_packet_init__(struct dp_packet *b, size_t allocated, enum 
dp_packet_source so
  dp_packet_init_specific(b);
  /* By default assume the packet type to be Ethernet. */
  b->packet_type = htonl(PT_ETH);
+/* Reset csum start and offset. */
+b->csum_start = 0;
+b->csum_offset = 0;
  }
  
  static void

@@ -544,4 +547,26 @@ dp_packet_ol_send_prepare(struct dp_packet *p, const 
uint64_t flags)
  dp_packet_ol_set_ip_csum_good(p);
  dp_packet_hwol_reset_tx_ip_csum(p);
  }
+
+if (dp_packet_l4_checksum_good(p) || !dp_packet_hwol_tx_l4_checksum(p)) {
+dp_packet_hwol_reset_tx_l4_csum(p);
+return;
+}
+
+if (dp_packet_hwol_l4_is_tcp(p)
+&& !(flags & NETDEV_TX_OFFLOAD_TCP_CKSUM)) {
+packet_tcp_complete_csum(p);
+dp_packet_ol_set_l4_csum_good(p);
+dp_packet_hwol_reset_tx_l4_csum(p);
+} else if (dp_packet_hwol_l4_is_udp(p)
+&& !(flags & NETDEV_TX_OFFLOAD_UDP_CKSUM)) {

Indentation.


+packet_udp_complete_csum(p);
+dp_packet_ol_set_l4_csum_good(p);
+dp_packet_hwol_reset_tx_l4_csum(p);
+} else if (!(flags & NETDEV_TX_OFFLOAD_SCTP_CKSUM)
+&& dp_packet_hwol_l4_is_sctp(p)) {

Indentation.


+packet_sctp_complete_csum(p);
+dp_packet_ol_set_l4_csum_good(p);
+dp_packet_hwol_reset_tx_l4_csum(p);
+}
  }
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index f60618716..d550b099c 

Re: [ovs-dev] [PATCH v9 4/4] userspace: Enable L4 checksum offloading by default.

2023-07-07 Thread Flavio Leitner
From: Ilya Maximets 

On 11/24/22 06:30, Mike Pattrick wrote:
> From: Flavio Leitner 
> 
> The netdev receiving packets is supposed to provide the flags
> indicating if the L4 checksum was verified and it is OK or BAD,
> otherwise the stack will check when appropriate by software.
> 
> If the packet comes with good checksum, then postpone the
> checksum calculation to the egress device if needed.
> 
> When encapsulate a packet with that flag, set the checksum
> of the inner L4 header since that is not yet supported.
> 
> Calculate the L4 checksum when the packet is going to be sent
> over a device that doesn't support the feature.
> 
> Linux tap devices allows enabling L3 and L4 offload, so this
> patch enables the feature. However, Linux socket interface
> remains disabled because the API doesn't allow enabling
> those two features without enabling TSO too.
> 
> Signed-off-by: Flavio Leitner 
> Co-authored-by: Mike Pattrick 
> Signed-off-by: Mike Pattrick 
> ---

Didn't test this as well.  Only visual review.

Should we enable checksum offloading in CONFIGURE_VETH_OFFLOADS for
check-system-userspace testsuite since support is enabled by default?

More comments inline.

Best regards, Ilya Maximets.

>  lib/conntrack.c |  15 +--
>  lib/dp-packet.c |  25 
>  lib/dp-packet.h |  78 -
>  lib/flow.c  |  23 
>  lib/netdev-dpdk.c   | 188 --
>  lib/netdev-linux.c  | 252 ++--
>  lib/netdev-native-tnl.c |  32 +
>  lib/netdev.c|  46 ++--
>  lib/packets.c   | 175 ++--
>  lib/packets.h   |   3 +
>  10 files changed, 580 insertions(+), 257 deletions(-)
> 
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 12194cce8..57e6a55e0 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -2118,13 +2118,12 @@ conn_key_extract(struct conntrack *ct, struct 
> dp_packet *pkt, ovs_be16 dl_type,
>  }
>  
>  if (ok) {
> -bool hwol_bad_l4_csum = dp_packet_l4_checksum_bad(pkt);
> -if (!hwol_bad_l4_csum) {
> -bool  hwol_good_l4_csum = dp_packet_l4_checksum_good(pkt)
> -  || dp_packet_hwol_tx_l4_checksum(pkt);
> +if (!dp_packet_l4_checksum_bad(pkt)) {
>  /* Validate the checksum only when hwol is not supported. */
>  if (extract_l4(>key, l4, dp_packet_l4_size(pkt),
> -   >icmp_related, l3, !hwol_good_l4_csum,
> +   >icmp_related, l3,
> +   !dp_packet_l4_checksum_good(pkt) &&
> +   !dp_packet_hwol_tx_l4_checksum(pkt),
> NULL)) {
>  ctx->hash = conn_key_hash(>key, ct->hash_basis);
>  return true;
> @@ -3453,8 +3452,10 @@ handle_ftp_ctl(struct conntrack *ct, const struct 
> conn_lookup_ctx *ctx,
>  adj_seqnum(>tcp_seq, ec->seq_skew);
>  }
>  
> -th->tcp_csum = 0;
> -if (!dp_packet_hwol_tx_l4_checksum(pkt)) {
> +if (dp_packet_hwol_tx_l4_checksum(pkt)) {
> +dp_packet_ol_reset_l4_csum_good(pkt);
> +} else {
> +th->tcp_csum = 0;
>  if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
>  th->tcp_csum = packet_csum_upperlayer6(nh6, th, 
> ctx->key.nw_proto,
> dp_packet_l4_size(pkt));
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> index 90ef85de3..2cfaf5274 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -38,6 +38,9 @@ dp_packet_init__(struct dp_packet *b, size_t allocated, 
> enum dp_packet_source so
>  dp_packet_init_specific(b);
>  /* By default assume the packet type to be Ethernet. */
>  b->packet_type = htonl(PT_ETH);
> +/* Reset csum start and offset. */
> +b->csum_start = 0;
> +b->csum_offset = 0;
>  }
>  
>  static void
> @@ -544,4 +547,26 @@ dp_packet_ol_send_prepare(struct dp_packet *p, const 
> uint64_t flags)
>  dp_packet_ol_set_ip_csum_good(p);
>  dp_packet_hwol_reset_tx_ip_csum(p);
>  }
> +
> +if (dp_packet_l4_checksum_good(p) || !dp_packet_hwol_tx_l4_checksum(p)) {
> +dp_packet_hwol_reset_tx_l4_csum(p);
> +return;
> +}
> +
> +if (dp_packet_hwol_l4_is_tcp(p)
> +&& !(flags & NETDEV_TX_OFFLOAD_TCP_CKSUM)) {
> +packet_tcp_complete_csum(p);
> +dp_packet_ol_set_l4_csum_good(p);
> +dp_packet_hwol_reset_tx_l4_csum(p);
> +} else if (dp_packet_hwol_l4_is_udp(p)
> +&& !(flags & NETDEV_TX_OFFLOAD_UDP_CKSUM)) {

Indentation.

> +packet_udp_complete_csum(p);
> +dp_packet_ol_set_l4_csum_good(p);
> +dp_packet_hwol_reset_tx_l4_csum(p);
> +} else if (!(flags & NETDEV_TX_OFFLOAD_SCTP_CKSUM)
> +&& dp_packet_hwol_l4_is_sctp(p)) {

Indentation.

> +packet_sctp_complete_csum(p);
> +dp_packet_ol_set_l4_csum_good(p);
> +

[ovs-dev] [OVSCONF] OVS+OVS '23: Call for Participation

2023-07-07 Thread Michael Santana
Hello everyone!

We are happy to announce this year's Open vSwitch and OVN conference!
The conference will be on November 1-2, 2023 and will be a fully virtual event.

We are seeking long and short ("lightning") talks on topics related to
Open vSwitch and OVN.  We expect long talks to last 25 minutes with an
additional 5 minutes for questions, and short talks to last 10 minutes.

We will use an online format.  Talks will be pre-recorded and played
during the conference.  During the playback, text chat will give the
attendees and the presenters an opportunity to interact.  During the
question period, the presenters will answer questions live over audio
(and video, at the presenters' discretion).  Video will also be
available for offline viewing.


How to propose a talk
-

We are soliciting proposals for full talks and short ("lightning")
talks in a single round.  You may propose a talk as a full talk, a
lightning talk, or for either one at the committee's discretion.

We will also accept proposals for panel discussions.  Please submit
them as full talks and make it clear in the description that it is a
panel.

Please submit proposals to ovs...@openvswitch.org by September 15.
Proposals should include:

* Title and abstract.

* Speaker names, email addresses, and affiliations.

* Whether you are proposing a full talk or a lightning talk or
  either one at the committee's discretion.

Speakers will be notified of acceptance by September 29.


How to attend
-

General registration is not yet open.  We will announce when it opens.
The event will be fully virtual. Will be available after registering.

At the moment we are unsure about whether attendance will
be free of cost this year. We will have a follow up soon on this.



Topics that may be considered, among others, include:
-

 * The future of Open vSwitch (e.g., AF_XDP, P4, eBPF).

 * NAT, DPI, and stateful processing with Open vSwitch.

 * Deploying and using OVN.

 * Testing and scaling OVN.

 * NIC acceleration of Open vSwitch.

 * Using Open vSwitch to realize NFV and service chaining.

 * Porting Open vSwitch to new operating systems, hypervisors,
   or container systems.

 * Integrating Open vSwitch into larger systems.

 * Troubleshooting and debugging Open vSwitch installations.

 * Open vSwitch development and testing practices.

 * Performance measurements or approaches to improving
   performance.

 * End-user or service provider experiences with Open vSwitch.

 * Hardware ports of Open vSwitch (existing, in progress, or
   speculative).

 * The relationship between OpenFlow and Open vSwitch.

 * Using, developing, or administering OpenFlow controllers in
   conjunction with Open vSwitch.

 * Comparisons to other implementations of features found in
   Open vSwitch (e.g. other OpenFlow implementations, other
   software switches, etc.).

 * Increasing the size and diversity of the Open vSwitch user
   and developer base.

 * Tutorials and walkthroughs of use cases.

 * Demos.


More information

To reach the organizers, email ovs...@openvswitch.org.  For general
discussion of the conference, please use the ovs-discuss mailing list
at ovs-disc...@openvswitch.org


Thank you,
Conference Team

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6] netdev-dpdk: Drop TSO in case of conflicting virtio features.

2023-07-07 Thread Ilya Maximets
On 7/7/23 17:18, Maxime Coquelin wrote:
> 
> 
> On 7/7/23 15:59, David Marchand wrote:
>> At some point in OVS history, some virtio features were announced as
>> supported (ECN and UFO virtio features).
>>
>> The userspace TSO code, which has been added later, does not support
>> those features and tries to disable them.
>>
>> This breaks OVS upgrades: if an existing VM already negotiated such
>> features, their lack on reconnection to an upgraded OVS triggers a
>> vhost socket disconnection by Qemu.
>> This results in an endless loop because Qemu then retries with the same
>> set of virtio features.
>>
>> This patch proposes to try and detect those vhost socket disconnection
>> and fallback restoring the old virtio features (and disabling TSO for this
>> vhost port).
>>
>> Acked-by: Mike Pattrick 
>> Acked-by: Simon Horman 
>> Signed-off-by: David Marchand 
>> ---
>> Changelog since v5:
>> - fixed coding style,
>>
>> Changelog since v4:
>> - I kept acks as the logic behind the state machine did not change much,
>> - fixed indent of enumeration in documentation,
>> - used status: in documentation instead of grep -o,
>> - renamed "disabled_tso" as "userspace-tso",
>> - switched to a state machine with flags,
>> - removed note on byte padding in netdev_dpdk struct,
>>
>> Changelog since v3:
>> - updated documentation now that the interface offloads status is reported
>>in ovsdb,
>> - fixed one coding style issue,
>>
>> Changelog since v2:
>> - reported workaround presence in the ovsdb port status field and
>>updated documentation accordingly,
>> - tried to use "better" names, to distinguish ECN virtio feature from
>>TSO OVS netdev feature,
>>
>> Changelog since v1:
>> - added a note in the documentation,
>> - fixed vhost unregister trigger (so that both disabling and re-enabling
>>TSO is handled),
>> - cleared netdev features when disabling TSO,
>> - changed level and ratelimited log message on vhost socket disconnect,
>>
>> ---
>>   Documentation/topics/userspace-tso.rst |  26 ++-
>>   lib/netdev-dpdk.c  | 100 -
>>   2 files changed, 120 insertions(+), 6 deletions(-)
>>
> 
> Acked-by: Maxime Coquelin 

Thanks, David, Mike, Simon and Maxime!

Applied.

Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next 2/2] net: openvswitch: add drop action

2023-07-07 Thread Ilya Maximets
On 7/7/23 17:29, Ilya Maximets wrote:
> On 7/7/23 17:00, Jakub Kicinski wrote:
>> On Fri, 7 Jul 2023 12:30:38 +0200 Ilya Maximets wrote:
>>> A wild idea:  How about we do not define actual reasons?  i.e. define a
>>> subsystem and just call kfree_skb_reason(skb, SUBSYSTEM | value), where
>>> 'value' is whatever userspace gives as long as it is within a subsystem
>>> range?
>>
>> That already exists, right? Johannes added it in the last release for WiFi.
> 
> I'm not sure.  The SKB_DROP_REASON_SUBSYS_MAC80211_UNUSABLE behaves similarly
> to that on a surface.  However, looking closer, any value that can be passed
> into ieee80211_rx_handlers_result() and ends up in the kfree_skb_reason() is
> kind of defined in net/mac80211/drop.h, unless I'm missing something (very
> possible, because I don't really know wifi code).
> 
> The difference, I guess, is that for openvswitch values will be provided by
> the userpsace application via netlink interface.  It'll be just a number not
> defined anywhere in the kernel.  Only the subsystem itself will be defined
> in order to occupy the range.  Garbage in, same garbage out, from the kernel's
> perspective.

To be clear, I think, not defining them in this particular case is better.
Definition of every reason that userspace can come up with will add extra
uAPI maintenance cost/issues with no practical benefits.  Values are not
going to be used for anything outside reporting a drop reason and subsystem
offset is not part of uAPI anyway.

> 
> Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next 1/4] selftests: openvswitch: add an initial flow programming case

2023-07-07 Thread Adrian Moreno



On 6/28/23 18:27, Aaron Conole wrote:

The openvswitch self-tests can test much of the control side of
the module (ie: what a vswitchd implementation would process),
but the actual packet forwarding cases aren't supported, making
the testing of limited value.

Add some flow parsing and an initial ARP based test case using
arping utility.  This lets us display flows, add some basic
output flows with simple matches, and test against a known good
forwarding case.

Signed-off-by: Aaron Conole 
---
NOTE: 3 lines flag the line-length checkpatch warning, but there didn't
   seem to bea good way of breaking the lines smaller for 2 of them.
   The third would still flag, even if broken at what looks like a
   good point to break it.

  .../selftests/net/openvswitch/openvswitch.sh  |  51 +++
  .../selftests/net/openvswitch/ovs-dpctl.py| 408 ++
  2 files changed, 459 insertions(+)

diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh 
b/tools/testing/selftests/net/openvswitch/openvswitch.sh
index 3117a4be0cd04..5cdacb3c8c925 100755
--- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
+++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
@@ -11,6 +11,7 @@ VERBOSE=0
  TRACING=0
  
  tests="

+   arp_pingeth-arp: Basic arp ping between 
two NS
netlink_checks  ovsnl: validate netlink attrs 
and settings
upcall_interfaces   ovs: test the upcall interfaces"
  
@@ -127,6 +128,16 @@ ovs_add_netns_and_veths () {

return 0
  }
  
+ovs_add_flow () {

+   info "Adding flow to DP: sbx:$1 br:$2 flow:$3 act:$4"
+   ovs_sbx "$1" python3 $ovs_base/ovs-dpctl.py add-flow "$2" "$3" "$4"
+   if [ $? -ne 0 ]; then
+   echo "Flow [ $3 : $4 ] failed" >> ${ovs_dir}/debug.log
+   return 1
+   fi
+   return 0
+}
+
  usage() {
echo
echo "$0 [OPTIONS] [TEST]..."
@@ -141,6 +152,46 @@ usage() {
exit 1
  }
  
+# arp_ping test

+# - client has 1500 byte MTU
+# - server has 1500 byte MTU
+# - send ARP ping between two ns
+test_arp_ping () {
+
+   which arping >/dev/null 2>&1 || return $ksft_skip
+
+   sbx_add "test_arp_ping" || return $?
+
+   ovs_add_dp "test_arp_ping" arpping || return 1
+
+   info "create namespaces"
+   for ns in client server; do
+   ovs_add_netns_and_veths "test_arp_ping" "arpping" "$ns" \
+   "${ns:0:1}0" "${ns:0:1}1" || return 1
+   done
+
+   # Setup client namespace
+   ip netns exec client ip addr add 172.31.110.10/24 dev c1
+   ip netns exec client ip link set c1 up
+   HW_CLIENT=`ip netns exec client ip link show dev c1 | grep -E 
'link/ether [0-9a-f:]+' | awk '{print $2;}'`
+   info "Client hwaddr: $HW_CLIENT"
+
+   # Setup server namespace
+   ip netns exec server ip addr add 172.31.110.20/24 dev s1
+   ip netns exec server ip link set s1 up
+   HW_SERVER=`ip netns exec server ip link show dev s1 | grep -E 
'link/ether [0-9a-f:]+' | awk '{print $2;}'`
+   info "Server hwaddr: $HW_SERVER"
+
+   ovs_add_flow "test_arp_ping" arpping \
+   
"in_port(1),eth(),eth_type(0x0806),arp(sip=172.31.110.10,tip=172.31.110.20,sha=$HW_CLIENT,tha=ff:ff:ff:ff:ff:ff)"
 '2' || return 1
+   ovs_add_flow "test_arp_ping" arpping \
+   "in_port(2),eth(),eth_type(0x0806),arp()" '1' || return 1
+
+   ovs_sbx "test_arp_ping" ip netns exec client arping -I c1 172.31.110.20 
-c 1 || return 1
+
+   return 0
+}
+
  # netlink_validation
  # - Create a dp
  # - check no warning with "old version" simulation
diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py 
b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index 1c8b36bc15d48..799bfb3064b90 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -9,9 +9,12 @@ import errno
  import ipaddress
  import logging
  import multiprocessing
+import re
  import struct
  import sys
  import time
+import types
+import uuid
  
  try:

  from pyroute2 import NDB
@@ -59,6 +62,104 @@ def macstr(mac):
  return outstr
  
  
+def strspn(str1, str2):

+tot = 0
+for char in str1:
+if str2.find(char) == -1:
+return tot
+tot += 1
+return tot
+
+
+def intparse(statestr, defmask="0x"):
+totalparse = strspn(statestr, "0123456789abcdefABCDEFx/")
+# scan until "/"
+count = strspn(statestr, "x0123456789abcdefABCDEF")
+
+firstnum = statestr[:count]
+if firstnum[-1] == "/":
+firstnum = firstnum[:-1]
+k = int(firstnum, 0)
+
+m = None
+if defmask is not None:
+secondnum = defmask
+if statestr[count] == "/":
+secondnum = statestr[count + 1 :]  # this is wrong...
+m = int(secondnum, 0)
+
+return statestr[totalparse + 1 :], k, m
+
+
+def 

Re: [ovs-dev] [PATCH v2 4/4] tests: system-traffic: add coverage for drop action

2023-07-07 Thread Ilya Maximets
On 7/7/23 16:50, Eelco Chaudron wrote:
> 
> 
> On 7 Jul 2023, at 16:48, Ilya Maximets wrote:
> 
>> On 7/7/23 16:42, Eelco Chaudron wrote:
>>>
>>>
>>> On 7 Jul 2023, at 16:06, Ilya Maximets wrote:
>>>
 On 7/7/23 15:51, Eric Garver wrote:
> On Fri, Jul 07, 2023 at 02:09:49PM +0200, Eelco Chaudron wrote:
>>
>>
>> On 7 Jul 2023, at 13:08, Ilya Maximets wrote:
>>
>>> On 7/5/23 21:47, Eelco Chaudron wrote:


 On Wed, Jul 5, 2023 at 6:46 PM Eric Garver  wrote:

 On Wed, Jul 05, 2023 at 04:58:47PM +0200, Eelco Chaudron wrote:
 >
 >
 > On 30 Jun 2023, at 21:05, Eric Garver wrote:
 > Hi Eric,
 >
 > I started reviewing the series, and this test was failing every 
 other run for me on ‘check-system-userspace’. I ended up making the 
 following additional change:
 >
 > diff --git a/tests/ofproto-macros.at  
 b/tests/ofproto-macros.at 
 > index d2e6ac768..573ecdd0f 100644
 > --- a/tests/ofproto-macros.at 
 > +++ b/tests/ofproto-macros.at 
 > @@ -120,7 +120,7 @@ strip_xids () {
 >
 >  # Changes all 'used:...' to say 'used:0.0', to make output 
 easier to compare.
 >  strip_used () {
 > -    sed 's/used:[[0-9]]\.[[0-9]]*/used:0.0/'
 > +    sed 's/used:\([[0-9]]\.[[0-9]]*s\|never\)/used:0.0s/'
 >  }

 In this test case I expect the flow to be used. My local test runs 
 do
 not yield "never".

 Maybe we need a "ovs-appctl time/warp 5000" before dumping the 
 flow?
 This is used in tests/drop-stats.at  .

 Tried that but did not work, if you look at the actual number of 
 packets received by the flow entry, it varies quite a lot from 0 (only 
 learning) to 5 when I run the test multiple times.

 I tried changing the following to generate more traffic:

 -dnl generate some traffic
 -NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
 [ignore])
 +dnl Generate some traffic.
 +NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1], 
 [ignore])

 And then run it 100 times, and the earlier problem did not happen:

   sudo bash -c 'for i in {1..100}; do make check-system-userspace 
 TESTSUITEFLAGS="-k drop -v" || break; echo "EC_DONE: $i"; done'

 However, I have some times it fails with no traffic:


 --- - 2023-07-05 21:46:59.860127196 +0200
 +++ 
 /home/echaudron/Documents/review/ovs_eric_DROP/tests/system-userspace-testsuite.dir/at-groups/45/stdout
  2023-07-05 21:46:59.857641231 +0200
 @@ -1,2 +1 @@
 -recirc_id(),in_port(2),eth_type(0x0800),ipv4(frag=no), 
 packets:0, bytes:0, used:0.0s, actions:drop
  
 Maybe this test needs some more love ;)
>>>
>>> The test likely needs a revalidator/wait call, since you're
>>> checking datapath flows.  Otherwise, it'll be time sensitive.
>>> Stats are gathered during revalidation.
>>
>> That was might thought initially, and I tried it (no luck), but it’s dp 
>> flows, not openflow.
>
> revalidator/wait still fails.
>
> The only thing I've found that makes this stable is the sleep below.
>
> --->8---
>
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index 50d6c2d05e5d..ab893c5c696b 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -2068,6 +2068,9 @@ AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
>
>  dnl Generate some traffic.
>  NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
> [ignore])
> +dnl sleep to let the flow hit the datapath
> +sleep 5
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
> [ignore])
>
>  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "eth_type(0x0800)" | dnl
>sed 
> 's/,packet_type(ns=[[0-9]]*,id=[[0-9]]*),/,/;s/,eth(),/,/;' | dnl
>

 Yeah, revalidator doesn't matter if we're talking about datapath flows, 
 true.
 The only thing that matters is how fast the kernel makes counter visible to
 userpsace.  It may not happen on very short time slots.

 However, waiting for 5 seconds is not a good solution.  If waiting is the 
 only
 option, we should use the OVS_WAIT_UNTIL_EQUAL instead.
>>>
>>> With the following ping command:
>>>
>>>   NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1],
>>>
>>> I never saw the 

Re: [ovs-dev] [PATCH net-next 2/2] net: openvswitch: add drop action

2023-07-07 Thread Ilya Maximets
On 7/7/23 17:00, Jakub Kicinski wrote:
> On Fri, 7 Jul 2023 12:30:38 +0200 Ilya Maximets wrote:
>> A wild idea:  How about we do not define actual reasons?  i.e. define a
>> subsystem and just call kfree_skb_reason(skb, SUBSYSTEM | value), where
>> 'value' is whatever userspace gives as long as it is within a subsystem
>> range?
> 
> That already exists, right? Johannes added it in the last release for WiFi.

I'm not sure.  The SKB_DROP_REASON_SUBSYS_MAC80211_UNUSABLE behaves similarly
to that on a surface.  However, looking closer, any value that can be passed
into ieee80211_rx_handlers_result() and ends up in the kfree_skb_reason() is
kind of defined in net/mac80211/drop.h, unless I'm missing something (very
possible, because I don't really know wifi code).

The difference, I guess, is that for openvswitch values will be provided by
the userpsace application via netlink interface.  It'll be just a number not
defined anywhere in the kernel.  Only the subsystem itself will be defined
in order to occupy the range.  Garbage in, same garbage out, from the kernel's
perspective.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6] netdev-dpdk: Drop TSO in case of conflicting virtio features.

2023-07-07 Thread Maxime Coquelin




On 7/7/23 15:59, David Marchand wrote:

At some point in OVS history, some virtio features were announced as
supported (ECN and UFO virtio features).

The userspace TSO code, which has been added later, does not support
those features and tries to disable them.

This breaks OVS upgrades: if an existing VM already negotiated such
features, their lack on reconnection to an upgraded OVS triggers a
vhost socket disconnection by Qemu.
This results in an endless loop because Qemu then retries with the same
set of virtio features.

This patch proposes to try and detect those vhost socket disconnection
and fallback restoring the old virtio features (and disabling TSO for this
vhost port).

Acked-by: Mike Pattrick 
Acked-by: Simon Horman 
Signed-off-by: David Marchand 
---
Changelog since v5:
- fixed coding style,

Changelog since v4:
- I kept acks as the logic behind the state machine did not change much,
- fixed indent of enumeration in documentation,
- used status: in documentation instead of grep -o,
- renamed "disabled_tso" as "userspace-tso",
- switched to a state machine with flags,
- removed note on byte padding in netdev_dpdk struct,

Changelog since v3:
- updated documentation now that the interface offloads status is reported
   in ovsdb,
- fixed one coding style issue,

Changelog since v2:
- reported workaround presence in the ovsdb port status field and
   updated documentation accordingly,
- tried to use "better" names, to distinguish ECN virtio feature from
   TSO OVS netdev feature,

Changelog since v1:
- added a note in the documentation,
- fixed vhost unregister trigger (so that both disabling and re-enabling
   TSO is handled),
- cleared netdev features when disabling TSO,
- changed level and ratelimited log message on vhost socket disconnect,

---
  Documentation/topics/userspace-tso.rst |  26 ++-
  lib/netdev-dpdk.c  | 100 -
  2 files changed, 120 insertions(+), 6 deletions(-)



Acked-by: Maxime Coquelin 

Thanks,
Maxime

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1 7/7] netdev-linux: support 64-bit rates in tc policing

2023-07-07 Thread Eelco Chaudron


On 7 Jul 2023, at 17:04, Ilya Maximets wrote:

> On 7/6/23 18:22, Eelco Chaudron wrote:
>>
>>
>> On 2 Jun 2023, at 16:13, Adrian Moreno wrote:
>>
>>> Use TCA_POLICE_RATE64 if the rate cannot be expressed using 32bits.
>>>
>>> This breaks the 32Gbps barrier. The new barrier is ~4Tbps caused by
>>> netdev's API expressing kbps rates using 32-bit integers.
>>>
>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2137643
>>> Signed-off-by: Adrian Moreno 
>>
>> This patch looks good to me, however, I have one additional general issue 
>> with this series.
>>
>> For example, if the running kernel does not support TCA_POLICE_RATE64, 
>> TCA_HTB_RATE64, or TCA_HTB_CEIL64 but the kernel it was compiled on does. 
>> This will break the implementation. Maybe we need some probe in 
>> netdev_tc_init_flow_api()?
>
> netdev_tc_init_flow_api() will not be called if hw-offload is not enabled.
> But anyway, we only add 64bit arguments if we have to.  i.e. if the value
> fits into 32bit range, 64bit argument will not be used.  That should provide
> required backward compatibility.  If we need 64bit, but kernel doesn't
> support, then we need to fail anyway.  Current code will configure incorrect
> values, there is no need preserving that.
>
>>
>> However looking at the current code we already use TCA_POLICE_PKTRATE64 
>> which is later than TCA_POLICE_RATE64, so we should be good?! If this is 
>> true we also do not need all the ‘#ifdef HAVE_TCA_POLICE_PKTRATE64’ and 
>> related code in the this patch.
>
> kpkts policing in a relatively new feature in OVS, so we do not expect it
> to be used on old kernels.  So, use of TCA_POLICE_PKTRATE64 should be fine.

Ok, I’m fine with both explanations on top of my own research, so:

Acked-by: Eelco Chaudron 

>>
>> The HTB ones were added in 2013, so maybe we are good? If we are we can 
>> remove all the ‘#ifdef HAVE_TCA_HTB_RATE64’ and related code in patch 4.
>>
>> Simon/Ilya any opinions on this?
>>
>> Cheers,
>>
>> Eelco
>>
>>
>>> ---
>>>  acinclude.m4| 10 ++
>>>  lib/netdev-linux.c  | 19 ---
>>>  lib/netdev-linux.h  |  2 +-
>>>  lib/tc.c|  2 ++
>>>  tests/atlocal.in|  1 +
>>>  tests/system-traffic.at | 21 +
>>>  6 files changed, 47 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/acinclude.m4 b/acinclude.m4
>>> index 1e5a9872d..3ac7072b5 100644
>>> --- a/acinclude.m4
>>> +++ b/acinclude.m4
>>> @@ -221,6 +221,16 @@ AC_DEFUN([OVS_CHECK_LINUX_TC], [
>>> [Define to 1 if TCA_HTB_RATE64 is available.])],
>>>  [AC_SUBST(HAVE_TCA_HTB_RATE64,no)]
>>>  )
>>> +
>>> +  AC_COMPILE_IFELSE([
>>> +AC_LANG_PROGRAM([#include ], [
>>> +int x = TCA_POLICE_PKTRATE64;
>>> +])],
>>> +[AC_SUBST(HAVE_TCA_POLICE_PKTRATE64,yes)
>>> + AC_DEFINE([HAVE_TCA_POLICE_PKTRATE64], [1],
>>> +   [Define to 1 if TCA_POLICE_PKTRATE64 is available.])],
>>> +[AC_SUBST(HAVE_TCA_POLICE_PKTRATE64,no)]
>>> +)
>>>  ])
>>>
>>>  dnl OVS_CHECK_LINUX_SCTP_CT
>>> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
>>> index 3526fbfc3..6a9f36f1e 100644
>>> --- a/lib/netdev-linux.c
>>> +++ b/lib/netdev-linux.c
>>> @@ -494,7 +494,7 @@ static struct tcmsg *netdev_linux_tc_make_request(const 
>>> struct netdev *,
>>>unsigned int flags,
>>>struct ofpbuf *);
>>>
>>> -static int tc_add_policer(struct netdev *, uint32_t kbits_rate,
>>> +static int tc_add_policer(struct netdev *, uint64_t kbits_rate,
>>>uint32_t kbits_burst, uint32_t kpkts_rate,
>>>uint32_t kpkts_burst);
>>>
>>> @@ -2660,6 +2660,7 @@ nl_msg_put_act_police(struct ofpbuf *request, 
>>> uint32_t index,
>>>uint64_t pkts_rate, uint64_t pkts_burst,
>>>uint32_t notexceed_act, bool single_action)
>>>  {
>>> +uint64_t bytes_rate = kbits_rate / 8 * 1000;
>>>  size_t offset, act_offset;
>>>  struct tc_police police;
>>>  uint32_t prio = 0;
>>> @@ -2674,8 +2675,13 @@ nl_msg_put_act_police(struct ofpbuf *request, 
>>> uint32_t index,
>>>  nl_msg_act_police_start_nest(request, ++prio, , _offset,
>>>   single_action);
>>>  if (police.rate.rate) {
>>> -tc_put_rtab(request, TCA_POLICE_RATE, , 0);
>>> +tc_put_rtab(request, TCA_POLICE_RATE, , bytes_rate);
>>>  }
>>> +#ifdef HAVE_TCA_POLICE_PKTRATE64
>>> +if (bytes_rate > UINT32_MAX) {
>>> +nl_msg_put_u64(request, TCA_POLICE_RATE64, bytes_rate);
>>> +}
>>> +#endif
>>>  if (pkts_rate) {
>>>  uint64_t pkt_burst_ticks;
>>>  /* Here tc_bytes_to_ticks is used to convert packets rather than 
>>> bytes
>>> @@ -2689,7 +2695,7 @@ nl_msg_put_act_police(struct ofpbuf *request, 
>>> uint32_t index,
>>>  }
>>>
>>>  static int
>>> -tc_add_matchall_policer(struct netdev 

Re: [ovs-dev] [PATCH v1 7/7] netdev-linux: support 64-bit rates in tc policing

2023-07-07 Thread Ilya Maximets
On 7/6/23 18:22, Eelco Chaudron wrote:
> 
> 
> On 2 Jun 2023, at 16:13, Adrian Moreno wrote:
> 
>> Use TCA_POLICE_RATE64 if the rate cannot be expressed using 32bits.
>>
>> This breaks the 32Gbps barrier. The new barrier is ~4Tbps caused by
>> netdev's API expressing kbps rates using 32-bit integers.
>>
>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2137643
>> Signed-off-by: Adrian Moreno 
> 
> This patch looks good to me, however, I have one additional general issue 
> with this series.
> 
> For example, if the running kernel does not support TCA_POLICE_RATE64, 
> TCA_HTB_RATE64, or TCA_HTB_CEIL64 but the kernel it was compiled on does. 
> This will break the implementation. Maybe we need some probe in 
> netdev_tc_init_flow_api()?

netdev_tc_init_flow_api() will not be called if hw-offload is not enabled.
But anyway, we only add 64bit arguments if we have to.  i.e. if the value
fits into 32bit range, 64bit argument will not be used.  That should provide
required backward compatibility.  If we need 64bit, but kernel doesn't
support, then we need to fail anyway.  Current code will configure incorrect
values, there is no need preserving that.

> 
> However looking at the current code we already use TCA_POLICE_PKTRATE64 which 
> is later than TCA_POLICE_RATE64, so we should be good?! If this is true we 
> also do not need all the ‘#ifdef HAVE_TCA_POLICE_PKTRATE64’ and related code 
> in the this patch.

kpkts policing in a relatively new feature in OVS, so we do not expect it
to be used on old kernels.  So, use of TCA_POLICE_PKTRATE64 should be fine.

> 
> The HTB ones were added in 2013, so maybe we are good? If we are we can 
> remove all the ‘#ifdef HAVE_TCA_HTB_RATE64’ and related code in patch 4.
> 
> Simon/Ilya any opinions on this?
> 
> Cheers,
> 
> Eelco
> 
> 
>> ---
>>  acinclude.m4| 10 ++
>>  lib/netdev-linux.c  | 19 ---
>>  lib/netdev-linux.h  |  2 +-
>>  lib/tc.c|  2 ++
>>  tests/atlocal.in|  1 +
>>  tests/system-traffic.at | 21 +
>>  6 files changed, 47 insertions(+), 8 deletions(-)
>>
>> diff --git a/acinclude.m4 b/acinclude.m4
>> index 1e5a9872d..3ac7072b5 100644
>> --- a/acinclude.m4
>> +++ b/acinclude.m4
>> @@ -221,6 +221,16 @@ AC_DEFUN([OVS_CHECK_LINUX_TC], [
>> [Define to 1 if TCA_HTB_RATE64 is available.])],
>>  [AC_SUBST(HAVE_TCA_HTB_RATE64,no)]
>>  )
>> +
>> +  AC_COMPILE_IFELSE([
>> +AC_LANG_PROGRAM([#include ], [
>> +int x = TCA_POLICE_PKTRATE64;
>> +])],
>> +[AC_SUBST(HAVE_TCA_POLICE_PKTRATE64,yes)
>> + AC_DEFINE([HAVE_TCA_POLICE_PKTRATE64], [1],
>> +   [Define to 1 if TCA_POLICE_PKTRATE64 is available.])],
>> +[AC_SUBST(HAVE_TCA_POLICE_PKTRATE64,no)]
>> +)
>>  ])
>>
>>  dnl OVS_CHECK_LINUX_SCTP_CT
>> diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
>> index 3526fbfc3..6a9f36f1e 100644
>> --- a/lib/netdev-linux.c
>> +++ b/lib/netdev-linux.c
>> @@ -494,7 +494,7 @@ static struct tcmsg *netdev_linux_tc_make_request(const 
>> struct netdev *,
>>unsigned int flags,
>>struct ofpbuf *);
>>
>> -static int tc_add_policer(struct netdev *, uint32_t kbits_rate,
>> +static int tc_add_policer(struct netdev *, uint64_t kbits_rate,
>>uint32_t kbits_burst, uint32_t kpkts_rate,
>>uint32_t kpkts_burst);
>>
>> @@ -2660,6 +2660,7 @@ nl_msg_put_act_police(struct ofpbuf *request, uint32_t 
>> index,
>>uint64_t pkts_rate, uint64_t pkts_burst,
>>uint32_t notexceed_act, bool single_action)
>>  {
>> +uint64_t bytes_rate = kbits_rate / 8 * 1000;
>>  size_t offset, act_offset;
>>  struct tc_police police;
>>  uint32_t prio = 0;
>> @@ -2674,8 +2675,13 @@ nl_msg_put_act_police(struct ofpbuf *request, 
>> uint32_t index,
>>  nl_msg_act_police_start_nest(request, ++prio, , _offset,
>>   single_action);
>>  if (police.rate.rate) {
>> -tc_put_rtab(request, TCA_POLICE_RATE, , 0);
>> +tc_put_rtab(request, TCA_POLICE_RATE, , bytes_rate);
>>  }
>> +#ifdef HAVE_TCA_POLICE_PKTRATE64
>> +if (bytes_rate > UINT32_MAX) {
>> +nl_msg_put_u64(request, TCA_POLICE_RATE64, bytes_rate);
>> +}
>> +#endif
>>  if (pkts_rate) {
>>  uint64_t pkt_burst_ticks;
>>  /* Here tc_bytes_to_ticks is used to convert packets rather than 
>> bytes
>> @@ -2689,7 +2695,7 @@ nl_msg_put_act_police(struct ofpbuf *request, uint32_t 
>> index,
>>  }
>>
>>  static int
>> -tc_add_matchall_policer(struct netdev *netdev, uint32_t kbits_rate,
>> +tc_add_matchall_policer(struct netdev *netdev, uint64_t kbits_rate,
>>  uint32_t kbits_burst, uint32_t kpkts_rate,
>>  uint32_t kpkts_burst)
>>  {
>> @@ -5703,9 +5709,8 @@ 

Re: [ovs-dev] [PATCH net-next 2/2] net: openvswitch: add drop action

2023-07-07 Thread Jakub Kicinski
On Fri, 7 Jul 2023 12:30:38 +0200 Ilya Maximets wrote:
> A wild idea:  How about we do not define actual reasons?  i.e. define a
> subsystem and just call kfree_skb_reason(skb, SUBSYSTEM | value), where
> 'value' is whatever userspace gives as long as it is within a subsystem
> range?

That already exists, right? Johannes added it in the last release for WiFi.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 4/4] tests: system-traffic: add coverage for drop action

2023-07-07 Thread Eelco Chaudron


On 7 Jul 2023, at 16:48, Ilya Maximets wrote:

> On 7/7/23 16:42, Eelco Chaudron wrote:
>>
>>
>> On 7 Jul 2023, at 16:06, Ilya Maximets wrote:
>>
>>> On 7/7/23 15:51, Eric Garver wrote:
 On Fri, Jul 07, 2023 at 02:09:49PM +0200, Eelco Chaudron wrote:
>
>
> On 7 Jul 2023, at 13:08, Ilya Maximets wrote:
>
>> On 7/5/23 21:47, Eelco Chaudron wrote:
>>>
>>>
>>> On Wed, Jul 5, 2023 at 6:46 PM Eric Garver  wrote:
>>>
>>> On Wed, Jul 05, 2023 at 04:58:47PM +0200, Eelco Chaudron wrote:
>>> >
>>> >
>>> > On 30 Jun 2023, at 21:05, Eric Garver wrote:
>>> > Hi Eric,
>>> >
>>> > I started reviewing the series, and this test was failing every 
>>> other run for me on ‘check-system-userspace’. I ended up making the 
>>> following additional change:
>>> >
>>> > diff --git a/tests/ofproto-macros.at  
>>> b/tests/ofproto-macros.at 
>>> > index d2e6ac768..573ecdd0f 100644
>>> > --- a/tests/ofproto-macros.at 
>>> > +++ b/tests/ofproto-macros.at 
>>> > @@ -120,7 +120,7 @@ strip_xids () {
>>> >
>>> >  # Changes all 'used:...' to say 'used:0.0', to make output 
>>> easier to compare.
>>> >  strip_used () {
>>> > -    sed 's/used:[[0-9]]\.[[0-9]]*/used:0.0/'
>>> > +    sed 's/used:\([[0-9]]\.[[0-9]]*s\|never\)/used:0.0s/'
>>> >  }
>>>
>>> In this test case I expect the flow to be used. My local test runs 
>>> do
>>> not yield "never".
>>>
>>> Maybe we need a "ovs-appctl time/warp 5000" before dumping the flow?
>>> This is used in tests/drop-stats.at  .
>>>
>>> Tried that but did not work, if you look at the actual number of 
>>> packets received by the flow entry, it varies quite a lot from 0 (only 
>>> learning) to 5 when I run the test multiple times.
>>>
>>> I tried changing the following to generate more traffic:
>>>
>>> -dnl generate some traffic
>>> -NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
>>> [ignore])
>>> +dnl Generate some traffic.
>>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1], 
>>> [ignore])
>>>
>>> And then run it 100 times, and the earlier problem did not happen:
>>>
>>>   sudo bash -c 'for i in {1..100}; do make check-system-userspace 
>>> TESTSUITEFLAGS="-k drop -v" || break; echo "EC_DONE: $i"; done'
>>>
>>> However, I have some times it fails with no traffic:
>>>
>>>
>>> --- - 2023-07-05 21:46:59.860127196 +0200
>>> +++ 
>>> /home/echaudron/Documents/review/ovs_eric_DROP/tests/system-userspace-testsuite.dir/at-groups/45/stdout
>>>  2023-07-05 21:46:59.857641231 +0200
>>> @@ -1,2 +1 @@
>>> -recirc_id(),in_port(2),eth_type(0x0800),ipv4(frag=no), 
>>> packets:0, bytes:0, used:0.0s, actions:drop
>>>  
>>> Maybe this test needs some more love ;)
>>
>> The test likely needs a revalidator/wait call, since you're
>> checking datapath flows.  Otherwise, it'll be time sensitive.
>> Stats are gathered during revalidation.
>
> That was might thought initially, and I tried it (no luck), but it’s dp 
> flows, not openflow.

 revalidator/wait still fails.

 The only thing I've found that makes this stable is the sleep below.

 --->8---

 diff --git a/tests/system-traffic.at b/tests/system-traffic.at
 index 50d6c2d05e5d..ab893c5c696b 100644
 --- a/tests/system-traffic.at
 +++ b/tests/system-traffic.at
 @@ -2068,6 +2068,9 @@ AT_CHECK([ovs-ofctl add-flows br0 flows.txt])

  dnl Generate some traffic.
  NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
 [ignore])
 +dnl sleep to let the flow hit the datapath
 +sleep 5
 +NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
 [ignore])

  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "eth_type(0x0800)" | dnl
sed 's/,packet_type(ns=[[0-9]]*,id=[[0-9]]*),/,/;s/,eth(),/,/;' 
 | dnl

>>>
>>> Yeah, revalidator doesn't matter if we're talking about datapath flows, 
>>> true.
>>> The only thing that matters is how fast the kernel makes counter visible to
>>> userpsace.  It may not happen on very short time slots.
>>>
>>> However, waiting for 5 seconds is not a good solution.  If waiting is the 
>>> only
>>> option, we should use the OVS_WAIT_UNTIL_EQUAL instead.
>>
>> With the following ping command:
>>
>>   NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1],
>>
>> I never saw the unused issue, only the one that showed no flow entry at all. 
>> Maybe with this ping instead and the OVS_WAIT_UNTIL_EQUAL() it might work?
>
> There is no point sending so many 

Re: [ovs-dev] [PATCH v2 4/4] tests: system-traffic: add coverage for drop action

2023-07-07 Thread Ilya Maximets
On 7/7/23 16:42, Eelco Chaudron wrote:
> 
> 
> On 7 Jul 2023, at 16:06, Ilya Maximets wrote:
> 
>> On 7/7/23 15:51, Eric Garver wrote:
>>> On Fri, Jul 07, 2023 at 02:09:49PM +0200, Eelco Chaudron wrote:


 On 7 Jul 2023, at 13:08, Ilya Maximets wrote:

> On 7/5/23 21:47, Eelco Chaudron wrote:
>>
>>
>> On Wed, Jul 5, 2023 at 6:46 PM Eric Garver  wrote:
>>
>> On Wed, Jul 05, 2023 at 04:58:47PM +0200, Eelco Chaudron wrote:
>> >
>> >
>> > On 30 Jun 2023, at 21:05, Eric Garver wrote:
>> > Hi Eric,
>> >
>> > I started reviewing the series, and this test was failing every 
>> other run for me on ‘check-system-userspace’. I ended up making the 
>> following additional change:
>> >
>> > diff --git a/tests/ofproto-macros.at  
>> b/tests/ofproto-macros.at 
>> > index d2e6ac768..573ecdd0f 100644
>> > --- a/tests/ofproto-macros.at 
>> > +++ b/tests/ofproto-macros.at 
>> > @@ -120,7 +120,7 @@ strip_xids () {
>> >
>> >  # Changes all 'used:...' to say 'used:0.0', to make output easier 
>> to compare.
>> >  strip_used () {
>> > -    sed 's/used:[[0-9]]\.[[0-9]]*/used:0.0/'
>> > +    sed 's/used:\([[0-9]]\.[[0-9]]*s\|never\)/used:0.0s/'
>> >  }
>>
>> In this test case I expect the flow to be used. My local test runs do
>> not yield "never".
>>
>> Maybe we need a "ovs-appctl time/warp 5000" before dumping the flow?
>> This is used in tests/drop-stats.at  .
>>
>> Tried that but did not work, if you look at the actual number of packets 
>> received by the flow entry, it varies quite a lot from 0 (only learning) 
>> to 5 when I run the test multiple times.
>>
>> I tried changing the following to generate more traffic:
>>
>> -dnl generate some traffic
>> -NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
>> [ignore])
>> +dnl Generate some traffic.
>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1], 
>> [ignore])
>>
>> And then run it 100 times, and the earlier problem did not happen:
>>
>>   sudo bash -c 'for i in {1..100}; do make check-system-userspace 
>> TESTSUITEFLAGS="-k drop -v" || break; echo "EC_DONE: $i"; done'
>>
>> However, I have some times it fails with no traffic:
>>
>>
>> --- - 2023-07-05 21:46:59.860127196 +0200
>> +++ 
>> /home/echaudron/Documents/review/ovs_eric_DROP/tests/system-userspace-testsuite.dir/at-groups/45/stdout
>>  2023-07-05 21:46:59.857641231 +0200
>> @@ -1,2 +1 @@
>> -recirc_id(),in_port(2),eth_type(0x0800),ipv4(frag=no), 
>> packets:0, bytes:0, used:0.0s, actions:drop
>>  
>> Maybe this test needs some more love ;)
>
> The test likely needs a revalidator/wait call, since you're
> checking datapath flows.  Otherwise, it'll be time sensitive.
> Stats are gathered during revalidation.

 That was might thought initially, and I tried it (no luck), but it’s dp 
 flows, not openflow.
>>>
>>> revalidator/wait still fails.
>>>
>>> The only thing I've found that makes this stable is the sleep below.
>>>
>>> --->8---
>>>
>>> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>>> index 50d6c2d05e5d..ab893c5c696b 100644
>>> --- a/tests/system-traffic.at
>>> +++ b/tests/system-traffic.at
>>> @@ -2068,6 +2068,9 @@ AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
>>>
>>>  dnl Generate some traffic.
>>>  NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
>>> [ignore])
>>> +dnl sleep to let the flow hit the datapath
>>> +sleep 5
>>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
>>> [ignore])
>>>
>>>  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "eth_type(0x0800)" | dnl
>>>sed 's/,packet_type(ns=[[0-9]]*,id=[[0-9]]*),/,/;s/,eth(),/,/;' 
>>> | dnl
>>>
>>
>> Yeah, revalidator doesn't matter if we're talking about datapath flows, true.
>> The only thing that matters is how fast the kernel makes counter visible to
>> userpsace.  It may not happen on very short time slots.
>>
>> However, waiting for 5 seconds is not a good solution.  If waiting is the 
>> only
>> option, we should use the OVS_WAIT_UNTIL_EQUAL instead.
> 
> With the following ping command:
> 
>   NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1],
> 
> I never saw the unused issue, only the one that showed no flow entry at all. 
> Maybe with this ping instead and the OVS_WAIT_UNTIL_EQUAL() it might work?

There is no point sending so many packets.  Just send 3 with 0.3 interval
as other tests do and wait for result with OVS_WAIT_UNTIL_EQUAL.
That should always work.  OVS_WAIT_UNTIL_EQUAL will check multiple times
with short 

Re: [ovs-dev] [PATCH v2 4/4] tests: system-traffic: add coverage for drop action

2023-07-07 Thread Eelco Chaudron


On 7 Jul 2023, at 16:06, Ilya Maximets wrote:

> On 7/7/23 15:51, Eric Garver wrote:
>> On Fri, Jul 07, 2023 at 02:09:49PM +0200, Eelco Chaudron wrote:
>>>
>>>
>>> On 7 Jul 2023, at 13:08, Ilya Maximets wrote:
>>>
 On 7/5/23 21:47, Eelco Chaudron wrote:
>
>
> On Wed, Jul 5, 2023 at 6:46 PM Eric Garver  wrote:
>
> On Wed, Jul 05, 2023 at 04:58:47PM +0200, Eelco Chaudron wrote:
> >
> >
> > On 30 Jun 2023, at 21:05, Eric Garver wrote:
> > Hi Eric,
> >
> > I started reviewing the series, and this test was failing every 
> other run for me on ‘check-system-userspace’. I ended up making the 
> following additional change:
> >
> > diff --git a/tests/ofproto-macros.at  
> b/tests/ofproto-macros.at 
> > index d2e6ac768..573ecdd0f 100644
> > --- a/tests/ofproto-macros.at 
> > +++ b/tests/ofproto-macros.at 
> > @@ -120,7 +120,7 @@ strip_xids () {
> >
> >  # Changes all 'used:...' to say 'used:0.0', to make output easier 
> to compare.
> >  strip_used () {
> > -    sed 's/used:[[0-9]]\.[[0-9]]*/used:0.0/'
> > +    sed 's/used:\([[0-9]]\.[[0-9]]*s\|never\)/used:0.0s/'
> >  }
>
> In this test case I expect the flow to be used. My local test runs do
> not yield "never".
>
> Maybe we need a "ovs-appctl time/warp 5000" before dumping the flow?
> This is used in tests/drop-stats.at  .
>
> Tried that but did not work, if you look at the actual number of packets 
> received by the flow entry, it varies quite a lot from 0 (only learning) 
> to 5 when I run the test multiple times.
>
> I tried changing the following to generate more traffic:
>
> -dnl generate some traffic
> -NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
> [ignore])
> +dnl Generate some traffic.
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1], 
> [ignore])
>
> And then run it 100 times, and the earlier problem did not happen:
>
>   sudo bash -c 'for i in {1..100}; do make check-system-userspace 
> TESTSUITEFLAGS="-k drop -v" || break; echo "EC_DONE: $i"; done'
>
> However, I have some times it fails with no traffic:
>
>
> --- - 2023-07-05 21:46:59.860127196 +0200
> +++ 
> /home/echaudron/Documents/review/ovs_eric_DROP/tests/system-userspace-testsuite.dir/at-groups/45/stdout
>  2023-07-05 21:46:59.857641231 +0200
> @@ -1,2 +1 @@
> -recirc_id(),in_port(2),eth_type(0x0800),ipv4(frag=no), 
> packets:0, bytes:0, used:0.0s, actions:drop
>  
> Maybe this test needs some more love ;)

 The test likely needs a revalidator/wait call, since you're
 checking datapath flows.  Otherwise, it'll be time sensitive.
 Stats are gathered during revalidation.
>>>
>>> That was might thought initially, and I tried it (no luck), but it’s dp 
>>> flows, not openflow.
>>
>> revalidator/wait still fails.
>>
>> The only thing I've found that makes this stable is the sleep below.
>>
>> --->8---
>>
>> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>> index 50d6c2d05e5d..ab893c5c696b 100644
>> --- a/tests/system-traffic.at
>> +++ b/tests/system-traffic.at
>> @@ -2068,6 +2068,9 @@ AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
>>
>>  dnl Generate some traffic.
>>  NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], [ignore])
>> +dnl sleep to let the flow hit the datapath
>> +sleep 5
>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], [ignore])
>>
>>  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "eth_type(0x0800)" | dnl
>>sed 's/,packet_type(ns=[[0-9]]*,id=[[0-9]]*),/,/;s/,eth(),/,/;' | 
>> dnl
>>
>
> Yeah, revalidator doesn't matter if we're talking about datapath flows, true.
> The only thing that matters is how fast the kernel makes counter visible to
> userpsace.  It may not happen on very short time slots.
>
> However, waiting for 5 seconds is not a good solution.  If waiting is the only
> option, we should use the OVS_WAIT_UNTIL_EQUAL instead.

With the following ping command:

  NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1],

I never saw the unused issue, only the one that showed no flow entry at all. 
Maybe with this ping instead and the OVS_WAIT_UNTIL_EQUAL() it might work?

>
> Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 4/4] tests: system-traffic: add coverage for drop action

2023-07-07 Thread Ilya Maximets
On 7/7/23 15:51, Eric Garver wrote:
> On Fri, Jul 07, 2023 at 02:09:49PM +0200, Eelco Chaudron wrote:
>>
>>
>> On 7 Jul 2023, at 13:08, Ilya Maximets wrote:
>>
>>> On 7/5/23 21:47, Eelco Chaudron wrote:


 On Wed, Jul 5, 2023 at 6:46 PM Eric Garver  wrote:

 On Wed, Jul 05, 2023 at 04:58:47PM +0200, Eelco Chaudron wrote:
 >
 >
 > On 30 Jun 2023, at 21:05, Eric Garver wrote:
 > Hi Eric,
 >
 > I started reviewing the series, and this test was failing every 
 other run for me on ‘check-system-userspace’. I ended up making the 
 following additional change:
 >
 > diff --git a/tests/ofproto-macros.at  
 b/tests/ofproto-macros.at 
 > index d2e6ac768..573ecdd0f 100644
 > --- a/tests/ofproto-macros.at 
 > +++ b/tests/ofproto-macros.at 
 > @@ -120,7 +120,7 @@ strip_xids () {
 >
 >  # Changes all 'used:...' to say 'used:0.0', to make output easier 
 to compare.
 >  strip_used () {
 > -    sed 's/used:[[0-9]]\.[[0-9]]*/used:0.0/'
 > +    sed 's/used:\([[0-9]]\.[[0-9]]*s\|never\)/used:0.0s/'
 >  }

 In this test case I expect the flow to be used. My local test runs do
 not yield "never".

 Maybe we need a "ovs-appctl time/warp 5000" before dumping the flow?
 This is used in tests/drop-stats.at  .

 Tried that but did not work, if you look at the actual number of packets 
 received by the flow entry, it varies quite a lot from 0 (only learning) 
 to 5 when I run the test multiple times.

 I tried changing the following to generate more traffic:

 -dnl generate some traffic
 -NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
 [ignore])
 +dnl Generate some traffic.
 +NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1], 
 [ignore])

 And then run it 100 times, and the earlier problem did not happen:

   sudo bash -c 'for i in {1..100}; do make check-system-userspace 
 TESTSUITEFLAGS="-k drop -v" || break; echo "EC_DONE: $i"; done'

 However, I have some times it fails with no traffic:


 --- - 2023-07-05 21:46:59.860127196 +0200
 +++ 
 /home/echaudron/Documents/review/ovs_eric_DROP/tests/system-userspace-testsuite.dir/at-groups/45/stdout
  2023-07-05 21:46:59.857641231 +0200
 @@ -1,2 +1 @@
 -recirc_id(),in_port(2),eth_type(0x0800),ipv4(frag=no), packets:0, 
 bytes:0, used:0.0s, actions:drop
  
 Maybe this test needs some more love ;)
>>>
>>> The test likely needs a revalidator/wait call, since you're
>>> checking datapath flows.  Otherwise, it'll be time sensitive.
>>> Stats are gathered during revalidation.
>>
>> That was might thought initially, and I tried it (no luck), but it’s dp 
>> flows, not openflow.
> 
> revalidator/wait still fails.
> 
> The only thing I've found that makes this stable is the sleep below.
> 
> --->8---
> 
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index 50d6c2d05e5d..ab893c5c696b 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -2068,6 +2068,9 @@ AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
>  
>  dnl Generate some traffic.
>  NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], [ignore])
> +dnl sleep to let the flow hit the datapath
> +sleep 5
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], [ignore])
>  
>  AT_CHECK([ovs-appctl dpctl/dump-flows | grep "eth_type(0x0800)" | dnl
>sed 's/,packet_type(ns=[[0-9]]*,id=[[0-9]]*),/,/;s/,eth(),/,/;' | 
> dnl
> 

Yeah, revalidator doesn't matter if we're talking about datapath flows, true.
The only thing that matters is how fast the kernel makes counter visible to
userpsace.  It may not happen on very short time slots.

However, waiting for 5 seconds is not a good solution.  If waiting is the only
option, we should use the OVS_WAIT_UNTIL_EQUAL instead.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4] netdev-dpdk:add set_queue datapath dpdk support

2023-07-07 Thread Ilya Maximets
On 7/6/23 10:27, wangze wrote:
> When doing set_queue action, skb_priority is marked in
> userspace, but there is no further processing in datapath
> dpdk.

Hi.  There actually is further processing for dpdk ports.

The naming is confusing, I understand that, but set_queue OpenFlow
action actually has nothing in common with hardware Tx queues of
the device.  These are QoS queues configured in the database in
QoS table.  And netdev-dpdk makes use of them via Two Rate Three
Color Marker policing (trtcm-policer).  You can find documentation
for it here:
  https://docs.openvswitch.org/en/latest/topics/dpdk/qos/#multi-queue-policer
Or in the ovs-vswitchd.conf.db (5) man page.

The general QoS with set_queue examples can be found here:
  https://docs.openvswitch.org/en/latest/faq/qos/#quality-of-service-qos

In general, we shouldn't overload the meaning of the action.
And there are no high-level OpenFlow fileds or actions that
work with hardware Rx/Tx queues, AFAIK.  And there shouldn't be
as OpenFlow is a much higher abstraction.

> This patch add set_queue support in datapath dpdk,
> sending skbs from specific queues according to the
> skb_priority.
> 
> The enqueue action can be applied to the scenario of
> double-sending arp packets in bond, usage:
> ovs-ofctl add-flow br-ext "priority=1,in_port=LOCAL,arp actions=push_vlan:
> 0x8100,set_field:4397->vlan_vid,set_queue:1,output:bond1,pop_queue,set_queue:2,
> output:bond1,pop_queue" -Oopenflow13

Why do you need to double-send ARP packets?
And why they need to go to different hardware queues?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v6] netdev-dpdk: Drop TSO in case of conflicting virtio features.

2023-07-07 Thread David Marchand
At some point in OVS history, some virtio features were announced as
supported (ECN and UFO virtio features).

The userspace TSO code, which has been added later, does not support
those features and tries to disable them.

This breaks OVS upgrades: if an existing VM already negotiated such
features, their lack on reconnection to an upgraded OVS triggers a
vhost socket disconnection by Qemu.
This results in an endless loop because Qemu then retries with the same
set of virtio features.

This patch proposes to try and detect those vhost socket disconnection
and fallback restoring the old virtio features (and disabling TSO for this
vhost port).

Acked-by: Mike Pattrick 
Acked-by: Simon Horman 
Signed-off-by: David Marchand 
---
Changelog since v5:
- fixed coding style,

Changelog since v4:
- I kept acks as the logic behind the state machine did not change much,
- fixed indent of enumeration in documentation,
- used status: in documentation instead of grep -o,
- renamed "disabled_tso" as "userspace-tso",
- switched to a state machine with flags,
- removed note on byte padding in netdev_dpdk struct,

Changelog since v3:
- updated documentation now that the interface offloads status is reported
  in ovsdb,
- fixed one coding style issue,

Changelog since v2:
- reported workaround presence in the ovsdb port status field and
  updated documentation accordingly,
- tried to use "better" names, to distinguish ECN virtio feature from
  TSO OVS netdev feature, 

Changelog since v1:
- added a note in the documentation,
- fixed vhost unregister trigger (so that both disabling and re-enabling
  TSO is handled),
- cleared netdev features when disabling TSO,
- changed level and ratelimited log message on vhost socket disconnect,

---
 Documentation/topics/userspace-tso.rst |  26 ++-
 lib/netdev-dpdk.c  | 100 -
 2 files changed, 120 insertions(+), 6 deletions(-)

diff --git a/Documentation/topics/userspace-tso.rst 
b/Documentation/topics/userspace-tso.rst
index 5a43c2e86b..c4b15f2604 100644
--- a/Documentation/topics/userspace-tso.rst
+++ b/Documentation/topics/userspace-tso.rst
@@ -68,7 +68,7 @@ as follows.
 connection is established, `TSO` is thus advertised to the guest as an
 available feature:
 
-QEMU Command Line Parameter::
+1. QEMU Command Line Parameter::
 
 $ sudo $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
 ...
@@ -77,12 +77,34 @@ QEMU Command Line Parameter::
 ...
 
 2. Ethtool. Assuming that the guest's OS also supports `TSO`, ethtool can be
-used to enable same::
+   used to enable same::
 
 $ ethtool -K eth0 sg on # scatter-gather is a prerequisite for TSO
 $ ethtool -K eth0 tso on
 $ ethtool -k eth0
 
+**Note:** Enabling this feature impacts the virtio features exposed by the DPDK
+vHost User backend to a guest. If a guest was already connected to OvS before
+enabling TSO and restarting OvS, this guest ports won't have TSO available::
+
+$ ovs-vsctl get interface vhost0 status:tx_tcp_seg_offload
+"false"
+
+To help diagnose the issue, those ports have some additional information in
+their status field in ovsdb::
+
+$ ovs-vsctl get interface vhost0 status:userspace-tso
+disabled
+
+To restore TSO for this guest ports, this guest QEMU process must be stopped,
+then started again. OvS will then report::
+
+   $ ovs-vsctl get interface vhost0 status:tx_tcp_seg_offload
+   "true"
+
+   $ ovs-vsctl get interface vhost0 status:userspace-tso
+   ovs-vsctl: no key "userspace-tso" in Interface record "vhost0" column status
+
 ~~~
 Limitations
 ~~~
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 63dac689e3..4415443924 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -418,6 +418,18 @@ enum dpdk_hw_ol_features {
 NETDEV_TX_TSO_OFFLOAD = 1 << 7,
 };
 
+/* Flags for the netdev_dpdk virtio_features_state field.
+ * This is used for the virtio features recovery mechanism linked to TSO
+ * support. */
+#define OVS_VIRTIO_F_CLEAN (UINT8_C(1) << 0)
+#define OVS_VIRTIO_F_WORKAROUND (UINT8_C(1) << 1)
+#define OVS_VIRTIO_F_NEGOTIATED (UINT8_C(1) << 2)
+#define OVS_VIRTIO_F_RECONF_PENDING (UINT8_C(1) << 3)
+#define OVS_VIRTIO_F_CLEAN_NEGOTIATED \
+(OVS_VIRTIO_F_CLEAN | OVS_VIRTIO_F_NEGOTIATED)
+#define OVS_VIRTIO_F_WORKAROUND_NEGOTIATED \
+(OVS_VIRTIO_F_WORKAROUND | OVS_VIRTIO_F_NEGOTIATED)
+
 /*
  * In order to avoid confusion in variables names, following naming convention
  * should be used, if possible:
@@ -474,7 +486,11 @@ struct netdev_dpdk {
 bool vhost_reconfigured;
 
 atomic_uint8_t vhost_tx_retries_max;
-/* 2 pad bytes here. */
+
+/* Flags for virtio features recovery mechanism. */
+uint8_t virtio_features_state;
+
+/* 1 pad byte here. */
 );
 
 PADDED_MEMBERS(CACHE_LINE_SIZE,
@@ -1359,6 +1375,7 @@ common_construct(struct netdev *netdev, dpdk_port_t 
port_no,
 dev->requested_lsc_interrupt_mode = 0;
 

Re: [ovs-dev] [PATCH v2 4/4] tests: system-traffic: add coverage for drop action

2023-07-07 Thread Eric Garver
On Fri, Jul 07, 2023 at 02:09:49PM +0200, Eelco Chaudron wrote:
> 
> 
> On 7 Jul 2023, at 13:08, Ilya Maximets wrote:
> 
> > On 7/5/23 21:47, Eelco Chaudron wrote:
> >>
> >>
> >> On Wed, Jul 5, 2023 at 6:46 PM Eric Garver  wrote:
> >>
> >> On Wed, Jul 05, 2023 at 04:58:47PM +0200, Eelco Chaudron wrote:
> >> >
> >> >
> >> > On 30 Jun 2023, at 21:05, Eric Garver wrote:
> >> > Hi Eric,
> >> >
> >> > I started reviewing the series, and this test was failing every 
> >> other run for me on ‘check-system-userspace’. I ended up making the 
> >> following additional change:
> >> >
> >> > diff --git a/tests/ofproto-macros.at  
> >> b/tests/ofproto-macros.at 
> >> > index d2e6ac768..573ecdd0f 100644
> >> > --- a/tests/ofproto-macros.at 
> >> > +++ b/tests/ofproto-macros.at 
> >> > @@ -120,7 +120,7 @@ strip_xids () {
> >> >
> >> >  # Changes all 'used:...' to say 'used:0.0', to make output easier 
> >> to compare.
> >> >  strip_used () {
> >> > -    sed 's/used:[[0-9]]\.[[0-9]]*/used:0.0/'
> >> > +    sed 's/used:\([[0-9]]\.[[0-9]]*s\|never\)/used:0.0s/'
> >> >  }
> >>
> >> In this test case I expect the flow to be used. My local test runs do
> >> not yield "never".
> >>
> >> Maybe we need a "ovs-appctl time/warp 5000" before dumping the flow?
> >> This is used in tests/drop-stats.at  .
> >>
> >> Tried that but did not work, if you look at the actual number of packets 
> >> received by the flow entry, it varies quite a lot from 0 (only learning) 
> >> to 5 when I run the test multiple times.
> >>
> >> I tried changing the following to generate more traffic:
> >>
> >> -dnl generate some traffic
> >> -NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], 
> >> [ignore])
> >> +dnl Generate some traffic.
> >> +NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1], 
> >> [ignore])
> >>
> >> And then run it 100 times, and the earlier problem did not happen:
> >>
> >>   sudo bash -c 'for i in {1..100}; do make check-system-userspace 
> >> TESTSUITEFLAGS="-k drop -v" || break; echo "EC_DONE: $i"; done'
> >>
> >> However, I have some times it fails with no traffic:
> >>
> >>
> >> --- - 2023-07-05 21:46:59.860127196 +0200
> >> +++ 
> >> /home/echaudron/Documents/review/ovs_eric_DROP/tests/system-userspace-testsuite.dir/at-groups/45/stdout
> >>  2023-07-05 21:46:59.857641231 +0200
> >> @@ -1,2 +1 @@
> >> -recirc_id(),in_port(2),eth_type(0x0800),ipv4(frag=no), packets:0, 
> >> bytes:0, used:0.0s, actions:drop
> >>  
> >> Maybe this test needs some more love ;)
> >
> > The test likely needs a revalidator/wait call, since you're
> > checking datapath flows.  Otherwise, it'll be time sensitive.
> > Stats are gathered during revalidation.
> 
> That was might thought initially, and I tried it (no luck), but it’s dp 
> flows, not openflow.

revalidator/wait still fails.

The only thing I've found that makes this stable is the sleep below.

--->8---

diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 50d6c2d05e5d..ab893c5c696b 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -2068,6 +2068,9 @@ AT_CHECK([ovs-ofctl add-flows br0 flows.txt])
 
 dnl Generate some traffic.
 NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], [ignore])
+dnl sleep to let the flow hit the datapath
+sleep 5
+NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], [ignore])
 
 AT_CHECK([ovs-appctl dpctl/dump-flows | grep "eth_type(0x0800)" | dnl
   sed 's/,packet_type(ns=[[0-9]]*,id=[[0-9]]*),/,/;s/,eth(),/,/;' | dnl

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 4/6] dpif-netdev: Remove pmd-sleep-max experimental tag.

2023-07-07 Thread David Marchand
On Wed, Jun 21, 2023 at 11:26 AM Kevin Traynor  wrote:
>
> Signed-off-by: Kevin Traynor 

Reviewed-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 6/6] pmd.at: Add per pmd max sleep unit tests.

2023-07-07 Thread David Marchand
On Wed, Jun 21, 2023 at 11:26 AM Kevin Traynor  wrote:
>
> Add unit tests for new per pmd options of pmd-sleep-max.
>
> Signed-off-by: Kevin Traynor 

Reviewed-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 5/6] dpif-netdev: Add per pmd sleep config.

2023-07-07 Thread David Marchand
On Wed, Jun 21, 2023 at 11:26 AM Kevin Traynor  wrote:
>
> Extend 'pmd-sleep-max' so that individual PMD thread cores
> may have a specified max sleep request value.
>
> Any PMD thread core without a value will use the datapath default
> (no sleep request) or datapath global value set by the user.
>
> To set PMD thread cores 8 and 9 to never request a load based sleep
> and all other PMD thread cores to be able to request a max sleep of
> 50 usecs:
>
> $ ovs-vsctl set open_vswitch . other_config:pmd-sleep-max=50,8:0,9:0
>
> To set PMD thread cores 10 and 11 to request a max sleep of 100 usecs
> and all other PMD thread cores to never request a sleep:
>
> $ ovs-vsctl set open_vswitch . other_config:pmd-sleep-max=10:100,11:100
>
> 'pmd-sleep-show' can be used to dump the global and individual PMD thread
> core max sleep request values.
>
> Signed-off-by: Kevin Traynor 

Reviewed-by: David Marchand 

I have some nits below which can be fixed when applying.
But I am ok too if we go with the current patch.



> diff --git a/Documentation/topics/dpdk/pmd.rst 
> b/Documentation/topics/dpdk/pmd.rst
> index 40e6b7843..eafcbc504 100644
> --- a/Documentation/topics/dpdk/pmd.rst
> +++ b/Documentation/topics/dpdk/pmd.rst
> @@ -375,4 +375,27 @@ system configuration (e.g. enabling processor C-states) 
> and workloads.
>  rate.
>
> +Max sleep request values can be set for individual PMDs using key:value 
> pairs.
> +Any PMD that has been assigned a specified value will use that. Any PMD that
> +does not have a specified value will use the current global default.
> +
> +Specified values for individual PMDs can be added or removed at any time.
> +
> +For example, to set PMD thread cores 8 and 9 to never request a load based

Nit: in the dpif-netdev/pmd-sleep-show command output, "PMD thread
core X" is short, and understandable.

But for descriptions in the documentation, "PMD thread cores" is strange.
We didn't use such denomination so far in the doc.

I would go with "PMD on cores 8 and 9" (or PMD threads on cores 8 and 9).

> +sleep and all others PMD cores to be able to request a max sleep of 50 
> usecs::

Nit: Idem, PMD (or PMD threads)


> +
> +$ ovs-vsctl set open_vswitch . other_config:pmd-sleep-max=50,8:0,9:0
> +
> +The max sleep request for each PMD can be checked in the logs or with::
> +
> +$ ovs-appctl dpif-netdev/pmd-sleep-show
> +PMD max sleep request is 50 usecs by default.
> +PMD load based sleeps are enabled by default.
> +PMD thread core   8 NUMA  0: Max sleep request set to0 usecs.
> +PMD thread core   9 NUMA  1: Max sleep request set to0 usecs.
> +PMD thread core  10 NUMA  0: Max sleep request set to   50 usecs.
> +PMD thread core  11 NUMA  1: Max sleep request set to   50 usecs.
> +PMD thread core  12 NUMA  0: Max sleep request set to   50 usecs.
> +PMD thread core  13 NUMA  1: Max sleep request set to   50 usecs.
> +
>  .. _ovs-vswitchd(8):
>  http://openvswitch.org/support/dist-docs/ovs-vswitchd.8.html

[snip]

> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 88d25f1da..d9ee53ff5 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c

[snip]

> @@ -5065,4 +5077,182 @@ parse_affinity_list(const char *affinity_list, 
> unsigned *core_ids, int n_rxq)
>  }
>
> +static int
> +parse_pmd_sleep_list(const char *max_sleep_list,
> + struct pmd_sleep **pmd_sleeps)
> +{
> +char *list, *copy, *key, *value;
> +int num_vals = 0;
> +
> +if (!max_sleep_list) {
> +return num_vals;
> +}
> +
> +list = copy = xstrdup(max_sleep_list);
> +
> +while (ofputil_parse_key_value(, , )) {
> +char *error = NULL;
> +unsigned core;
> +uint64_t temp, pmd_max_sleep;
> +int i;
> +
> +error = str_to_u64(key, );
> +if (error) {
> +free(error);
> +continue;
> +}
> +
> +error = str_to_u64(value, _max_sleep);
> +if (error) {
> +/* No value specified. key is dp default. */
> +core = UINT_MAX;
> +pmd_max_sleep = temp;
> +free(error);
> +} else {
> +/* Value specified. key is  pmd core id.*/

Nit: extra space between "is pmd".


> +if (temp >= UINT_MAX) {
> +continue;
> +}
> +core = (unsigned) temp;
> +}
> +

[snip]


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 3/6] dpif-netdev: Add pmd-sleep-show command.

2023-07-07 Thread David Marchand
On Wed, Jun 21, 2023 at 11:26 AM Kevin Traynor  wrote:
>
> Max requested sleep time and status for a PMD thread
> is logged at start up or when changed, but it can be
> convenient to have a command to dump this information
> explicitly.
>
> It is envisaged that this will be expanded when future
> additions are added.
>
> Signed-off-by: Kevin Traynor 

Reviewed-by: David Marchand 

-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 2/6] pmd.at: Add macro for checking pmd sleep max time and state.

2023-07-07 Thread David Marchand
On Wed, Jun 21, 2023 at 11:26 AM Kevin Traynor  wrote:
>
> This is just cosmetic. There is no change to the tests.
>
> Signed-off-by: Kevin Traynor 

Reviewed-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/6] dpif-netdev: Rename pmd-maxsleep config option.

2023-07-07 Thread David Marchand
On Wed, Jun 21, 2023 at 11:26 AM Kevin Traynor  wrote:
>
> other_config:pmd-maxsleep is a config option to allow
> PMD thread cores to sleep under low or no load conditions.
>
> Rename it to 'pmd-sleep-max' to allow a more structured
> name and so that additional options or command can follow
> the 'pmd-sleep-xyz' pattern.
>
> Signed-off-by: Kevin Traynor 

I don't have a strong opinion against this name change.
Having a clear prefix for this feature params and using this same
prefix for appctl commands (later in this series) seems a good thing
to me.

Reviewed-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ovn-northd.at: Fix occasional LSP I-P test failure due to initializtion phase.

2023-07-07 Thread Ales Musil
On Fri, Jul 7, 2023 at 8:30 AM Han Zhou  wrote:

> After the commit 0c1bde1c4a the recompute counters are more predictable,
> so we changed the LSP incremental processing test to not tolerate any
> failures (instead of 50% successful rate). But the test would then fail
> occasionally at the first check, because sometimes the update of the
> initial configurations from ovn-controller such as for tunnel interface
> creation come too late, after we cleaned the stats counters and start
> the LSP tests.
>
> This patch fixes it by creating a pilot port and wait for it to be up,
> so that we know the initial ovn-controller configurations are done, and
> will not interfere with our I-P tests.
>
> Fixes: 0c1bde1c4a47 ("ovn-northd: Avoid recompute caused by in-flight
> transactions.")
> Signed-off-by: Han Zhou 
> ---
>  tests/ovn-northd.at | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index e79d33b2aec5..3e06f14c9437 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -9532,6 +9532,13 @@ check_recompute_counter() {
>
>  check ovn-nbctl --wait=hv ls-add ls0
>
> +# Create a pilot port and wait it up to make sure we are ready for the
> real
> +# tests, so that the counters measured are accurate.
> +check ovn-nbctl --wait=hv lsp-add ls0 lsp-pilot -- lsp-set-addresses
> lsp-pilot "unknown"
> +ovs-vsctl add-port br-int lsp-pilot -- set interface lsp-pilot
> external_ids:iface-id=lsp-pilot
> +wait_for_ports_up
> +check ovn-nbctl --wait=hv sync
> +
>  check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
>  check ovn-nbctl --wait=hv lsp-add ls0 lsp0-0 -- lsp-set-addresses lsp0-0
> "unknown"
>  ovs-vsctl add-port br-int lsp0-0 -- set interface lsp0-0
> external_ids:iface-id=lsp0-0
> --
> 2.31.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Looks good to me, thanks.

Acked-by: Ales Musil 

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA 

amu...@redhat.comIM: amusil

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] vswitchd: Wait for a bridge exit before replying to exit unixctl.

2023-07-07 Thread Eelco Chaudron



On 7 Jul 2023, at 14:19, Ilya Maximets wrote:

> On 7/7/23 14:12, Eelco Chaudron wrote:
>>
>>
>> On 7 Jul 2023, at 12:58, Ilya Maximets wrote:
>>
>>> On 7/5/23 15:54, Eelco Chaudron wrote:


 On 4 Jul 2023, at 15:11, Ilya Maximets wrote:

> Before the cleanup option, the bridge_exit() call was fairly fast,
> because it didn't include any particularly long operations.  However,
> with the cleanup flag, this function destroys a lot of datapath
> resources freeing a lot of memory, waiting on RCU and talking to
> the kernel.  That may take a noticeable amount of time, especially
> on a busy system or under profilers/sanitizers.  However, the unixctl
> 'exit' command replies instantly without waiting for any work to
> actually be done.  This may cause system test failures or other
> issues where scripts expect ovs-vswitchd to exit or destroy all the
> datapath resources shortly after appctl call.
>
> Fix that by waiting for the bridge_exit() before replying to the user.
> At least, all the datapath resources will actually be destroyed by
> the time ovs-appctl exits.
>
> Also moving a structure from stack to global.  Seems cleaner this way.

 Thanks, yes it looks cleaner. One comment inline below.

 //Eelco


> Fixes: fe13ccdca6a2 ("vswitchd: Add --cleanup option to the 'appctl exit' 
> command")
> Signed-off-by: Ilya Maximets 
> ---
>  vswitchd/ovs-vswitchd.c | 38 +++---
>  1 file changed, 19 insertions(+), 19 deletions(-)
>
> diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c
> index a244d2f70..55b437871 100644
> --- a/vswitchd/ovs-vswitchd.c
> +++ b/vswitchd/ovs-vswitchd.c
> @@ -68,19 +68,18 @@ static unixctl_cb_func ovs_vswitchd_exit;
>  static char *parse_options(int argc, char *argv[], char **unixctl_path);
>  OVS_NO_RETURN static void usage(void);
>
> -struct ovs_vswitchd_exit_args {
> -bool *exiting;
> -bool *cleanup;
> -};
> +static struct ovs_vswitchd_exit_args {
> +struct unixctl_conn *conn;
> +bool exiting;
> +bool cleanup;
> +} exit_args;
>
>  int
>  main(int argc, char *argv[])
>  {
> -char *unixctl_path = NULL;
>  struct unixctl_server *unixctl;
> +char *unixctl_path = NULL;
>  char *remote;
> -bool exiting, cleanup;
> -struct ovs_vswitchd_exit_args exit_args = {, };
>  int retval;
>
>  set_program_name(argv[0]);
> @@ -111,14 +110,12 @@ main(int argc, char *argv[])
>  exit(EXIT_FAILURE);
>  }
>  unixctl_command_register("exit", "[--cleanup]", 0, 1,
> - ovs_vswitchd_exit, _args);
> + ovs_vswitchd_exit, NULL);
>
>  bridge_init(remote);
>  free(remote);
>
> -exiting = false;
> -cleanup = false;
> -while (!exiting) {
> +while (!exit_args.exiting) {
>  OVS_USDT_PROBE(main, run_start);
>  memory_run();
>  if (memory_should_report()) {
> @@ -137,16 +134,20 @@ main(int argc, char *argv[])
>  bridge_wait();
>  unixctl_server_wait(unixctl);
>  netdev_wait();
> -if (exiting) {
> +if (exit_args.exiting) {
>  poll_immediate_wake();
>  }
>  OVS_USDT_PROBE(main, poll_block);
>  poll_block();
>  if (should_service_stop()) {
> -exiting = true;
> +exit_args.exiting = true;
>  }
>  }
> -bridge_exit(cleanup);
> +bridge_exit(exit_args.cleanup);
> +
> +if (exit_args.conn) {
> +unixctl_command_reply(exit_args.conn, NULL);
> +}
>  unixctl_server_destroy(unixctl);
>  service_stop();
>  vlog_disable_async();
> @@ -304,10 +305,9 @@ usage(void)
>
>  static void
>  ovs_vswitchd_exit(struct unixctl_conn *conn, int argc,
> -  const char *argv[], void *exit_args_)
> +  const char *argv[], void *args OVS_UNUSED)
>  {
> -struct ovs_vswitchd_exit_args *exit_args = exit_args_;
> -*exit_args->exiting = true;
> -*exit_args->cleanup = argc == 2 && !strcmp(argv[1], "--cleanup");
> -unixctl_command_reply(conn, NULL);
> +exit_args.conn = conn;

 Should we try to protect from two exit command in the same 
 unixctl_server_run()?
 Something like:

 if (exit_args.conn) {
  unixctl_command_reply(conn, NULL);
 } else {
 exit_args.conn = conn;
 }

>>>
>>> Good point.  It's unlikely, but can happen.
>>> We could either do what you suggested or store an array of connections
>>> and try to reply to all 

Re: [ovs-dev] [PATCH v2 2/4] dpif: probe support for OVS_ACTION_ATTR_DROP

2023-07-07 Thread Eelco Chaudron



On 7 Jul 2023, at 14:25, Ilya Maximets wrote:

> On 7/7/23 14:11, Eelco Chaudron wrote:
>>
>>
>> On 7 Jul 2023, at 13:03, Ilya Maximets wrote:
>>
>>> On 7/6/23 11:30, Eelco Chaudron wrote:


 On 30 Jun 2023, at 21:05, Eric Garver wrote:

 Hi Eric,

 I guess some description here would be good. Maybe add some info that this 
 is added due to kernel support being added.


 In general, the patch looks good, and I would ack it. However, there is a 
 potential issue with TC hardware offload. If the patch gets applied as is, 
 and the kernel fix gets in, the additional DROP actions get added, but 
 they will not be offloaded to hardware and are only visible as kernel dp 
 entries.

 I guess this might not be a real problem. However, people who upgrade OVS 
 with tc offload might get confused why they now get these drop rules in 
 the kernel dp. To solve this, we could simply skip adding them if hardware 
 offload is enabled. So the behavior stays the same. Something like this 
 (not tested, and we probably need to add dpif_is_netlink()):


 diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
 index 29f4daa63..bc39e45bd 100644
 --- a/ofproto/ofproto-dpif-xlate.c
 +++ b/ofproto/ofproto-dpif-xlate.c
 @@ -34,12 +34,14 @@
  #include "csum.h"
  #include "dp-packet.h"
  #include "dpif.h"
 +#include "dpif-netdev.h"
  #include "in-band.h"
  #include "lacp.h"
  #include "learn.h"
  #include "mac-learning.h"
  #include "mcast-snooping.h"
  #include "multipath.h"
 +#include "netdev-offload.h"
  #include "netdev-vport.h"
  #include "netlink.h"
  #include "nx-match.h"
 @@ -8199,7 +8201,8 @@ exit:

  /* Install drop action if datapath supports explicit drop action. */
  if (xin->odp_actions && !xin->odp_actions->size &&
 -ovs_explicit_drop_action_supported(ctx.xbridge->ofproto)) {
 +ovs_explicit_drop_action_supported(ctx.xbridge->ofproto) &&
 +/* Do not add drop actions if TC hardware offload is enabled, as
 + * as TC does not support offloading the drop action. */
 +(!netdev_is_flow_api_enabled() || dpif_is_netdev(xbridge->dpif))) 
 {
>>>
>>> I'd argue that the check should not be here.  We should not set the
>>> baker.rt_support flag in the first place instead.
>>
>> Yes, that was my initial thought also, but what if hw offload gets enabled 
>> later? We might need some re-evaluation at that stage. I know the docs say 
>> you have to restart ovs when you enable HW offload, but that is not always 
>> happening in practice.
>
> But if we keep it this way we'll have misleading information that it is
> supported in the log, while it will not be used in practice.

True, maybe we can re-evaluate if the hw offload flag changes?

>>
  put_drop_action(xin->odp_actions, ctx.error);
  }

 I do not see a way for TC to support this as is in the kernel. We probably 
 need some new kind of drop reason supported TC action.

 I know other people are exploring to see if this explicit drop action can 
 be used for other, more user-level things. If this is true this TC part 
 needs some proper fixing or we end up with rules not being offloaded.

 Added some more people, who might have some opinions on TC offload / ideas 
 on how to add SKB_DROP_REASON support to TC.

 Cheers,


 Eelco

> Signed-off-by: Eric Garver 
> ---
>  include/linux/openvswitch.h |  2 +-
>  lib/dpif.c  |  6 --
>  lib/dpif.h  |  1 -
>  ofproto/ofproto-dpif.c  | 34 --
>  4 files changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
> index a265e05ad253..fce6456f47c8 100644
> --- a/include/linux/openvswitch.h
> +++ b/include/linux/openvswitch.h
> @@ -1086,11 +1086,11 @@ enum ovs_action_attr {
>   OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
>   OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
>   OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
> + OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */
>
>  #ifndef __KERNEL__
>   OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
>   OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
> - OVS_ACTION_ATTR_DROP,  /* u32 xlate_error. */
>   OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id. */
>  #endif
>   __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
> diff --git a/lib/dpif.c b/lib/dpif.c
> index d328bf288de0..dc02ec912693 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -1928,12 +1928,6 @@ dpif_supports_tnl_push_pop(const 

Re: [ovs-dev] [PATCH v5] netdev-dpdk: Drop TSO in case of conflicting virtio features.

2023-07-07 Thread Ilya Maximets
On 7/7/23 14:22, David Marchand wrote:
> On Fri, Jul 7, 2023 at 1:50 PM Ilya Maximets  wrote:
>>> @@ -4268,7 +4295,9 @@ new_device(int vid)
>>>  dev->hw_ol_features |= NETDEV_TX_SCTP_CKSUM_OFFLOAD;
>>>  }
>>>
>>> -if (userspace_tso_enabled()) {
>>> +if (userspace_tso_enabled()
>>> +&& dev->virtio_features_state & OVS_VIRTIO_F_CLEAN) {
>>
>> If this needed?  There is no harm in using these features on Tx if 
>> negotiated.
> 
> I did not test without it.
> 
> Without this limitation, this means that the host/OVS can make tso
> offloading requests to the guest, but the guest can't make them to the
> host.
> Iow, on the host side, TSO would be advertised to OVS layers, and
> would be displayed in the port status field.
> On the guest side, TSO would be forbidden.
> I find this asymmetry confusing.
> 
> If you don't have access to the guest (which happens in deployments
> with VNF as black boxes), the only chance to catch that the workaround
> is in effect is to look at the "userspace-tso" field that gets added
> by this patch.
> But to know about this field, you would have to get a hint something
> is wrong on the TSO side.
> 
> With announcing that TSO is off for both sides of this vhost port, I
> think we have more chances a user would search in OVS documentation
> and find out about the "userspace-tso" status field.

OK.  Makes sense.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] ci: Remove '--recheck' in CI.

2023-07-07 Thread Ales Musil
On Wed, Jul 5, 2023 at 7:08 PM Dumitru Ceara  wrote:

> If we want to catch new failures faster we have a better chance if CI
> doesn't auto-retry (once).
>
> There are some tests that are still "unstable" and fail every now and
> then.  In order to reduce the number of false negatives keep the
> --recheck for them.  To achieve that we use a new macro, TAG_UNSTABLE,
> to tag all these tests.  The list of "unstable" tests is compiled based
> on the following discussion:
> https://mail.openvswitch.org/pipermail/ovs-dev/2023-June/405465.html
>
> In order to avoid new GitHub actions jobs, we re-purpose the last job of
> each target type to also run the unstable tests.  These jobs were
> already running less tests than others so the additional run time should
> not be an issue.
>
> Signed-off-by: Dumitru Ceara 
>

Hi,

I have one question below.


> ---
> Changes in v1 (since RFC):
> - kept recheck for unstable tests
> - introduced TAG_UNSTABLE
> - changed test.yml to run unstable tests in the last batch of every
>   test target type.
> ---
>  .ci/ci.sh  |  2 +-
>  .ci/linux-build.sh | 54 +++---
>  .github/workflows/test.yml | 15 ++-
>  tests/ovn-ic.at|  1 +
>  tests/ovn-ipsec.at |  1 +
>  tests/ovn-macros.at|  5 
>  tests/ovn-northd.at|  1 +
>  tests/ovn-performance.at   |  1 +
>  tests/ovn.at   | 13 +
>  9 files changed, 70 insertions(+), 23 deletions(-)
>
> diff --git a/.ci/ci.sh b/.ci/ci.sh
> index 10f11939c5..a500aba764 100755
> --- a/.ci/ci.sh
> +++ b/.ci/ci.sh
> @@ -101,7 +101,7 @@ function run_tests() {
>  && \
>  ARCH=$ARCH CC=$CC LIBS=$LIBS OPTS=$OPTS TESTSUITE=$TESTSUITE \
>  TEST_RANGE=$TEST_RANGE SANITIZERS=$SANITIZERS DPDK=$DPDK \
> -./.ci/linux-build.sh
> +UNSTABLE=$UNSTABLE ./.ci/linux-build.sh
>  "
>  }
>
> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
> index 5a79a52daf..58e41a7b68 100755
> --- a/.ci/linux-build.sh
> +++ b/.ci/linux-build.sh
> @@ -9,6 +9,7 @@ COMMON_CFLAGS=""
>  OVN_CFLAGS=""
>  OPTS="$OPTS --enable-Werror"
>  JOBS=${JOBS:-"-j4"}
> +RECHECK=${RECHECK:-"no"}
>
>  function install_dpdk()
>  {
> @@ -99,6 +100,17 @@ function configure_clang()
>  COMMON_CFLAGS="${COMMON_CFLAGS}
> -Wno-error=unused-command-line-argument"
>  }
>
> +function run_tests()
> +{
> +if ! make distcheck CFLAGS="${COMMON_CFLAGS} ${OVN_CFLAGS}" $JOBS \
> +TESTSUITEFLAGS="$JOBS $TEST_RANGE" RECHECK=$RECHECK
> +then
> +# testsuite.log is necessary for debugging.
> +cat */_build/sub/tests/testsuite.log
> +exit 1
>

This essentially mean that if we fail in the stable part we won't run the
unstable,
which might be ok. But IMO it would be better to run both and fail
depending on results
from both WDYT? The same applies for system tests.


> +fi
> +}
> +
>  function execute_tests()
>  {
>  # 'distcheck' will reconfigure with required options.
> @@ -106,27 +118,39 @@ function execute_tests()
>  configure_ovn
>
>  export DISTCHECK_CONFIGURE_FLAGS="$OPTS"
> -if ! make distcheck CFLAGS="${COMMON_CFLAGS} ${OVN_CFLAGS}" $JOBS \
> -TESTSUITEFLAGS="$JOBS $TEST_RANGE" RECHECK=yes
> -then
> -# testsuite.log is necessary for debugging.
> -cat */_build/sub/tests/testsuite.log
> +
> +SKIP_UNSTABLE=yes run_tests
> +
> +if [ "$UNSTABLE" ]; then
> +SKIP_UNSTABLE=no TEST_RANGE="-k unstable" RECHECK=yes \
> +run_tests
> +fi
> +}
> +
> +function run_system_tests()
> +{
> +type=$1
> +log_file=$2
> +
> +if ! sudo make $JOBS $type TESTSUITEFLAGS="$TEST_RANGE" \
> +RECHECK=$RECHECK; then
> +# $log_file is necessary for debugging.
> +cat tests/$log_file
>  exit 1
>  fi
>  }
>
>  function execute_system_tests()
>  {
> -  type=$1
> -  log_file=$2
> -
> -  configure_ovn $OPTS
> -  make $JOBS || { cat config.log; exit 1; }
> -  if ! sudo make $JOBS $type TESTSUITEFLAGS="$TEST_RANGE"
> RECHECK=yes; then
> -  # $log_file is necessary for debugging.
> -  cat tests/$log_file
> -  exit 1
> -  fi
> +configure_ovn $OPTS
> +make $JOBS || { cat config.log; exit 1; }
> +
> +SKIP_UNSTABLE=yes run_system_tests $@
> +
> +if [ "$UNSTABLE" ]; then
> +SKIP_UNSTABLE=no TEST_RANGE="-k unstable" RECHECK=yes \
> +run_system_tests $@
> +fi
>  }
>
>  configure_$CC
> diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
> index fe2a14c401..7d40251003 100644
> --- a/.github/workflows/test.yml
> +++ b/.github/workflows/test.yml
> @@ -92,6 +92,7 @@ jobs:
>TESTSUITE:   ${{ matrix.cfg.testsuite }}
>TEST_RANGE:  ${{ matrix.cfg.test_range }}
>SANITIZERS:  ${{ matrix.cfg.sanitizers }}
> +  UNSTABLE:${{ matrix.cfg.unstable }}
>
>  name: linux ${{ join(matrix.cfg.*, ' ') }}
>  runs-on: 

Re: [ovs-dev] [PATCH v2 2/4] dpif: probe support for OVS_ACTION_ATTR_DROP

2023-07-07 Thread Ilya Maximets
On 7/7/23 14:11, Eelco Chaudron wrote:
> 
> 
> On 7 Jul 2023, at 13:03, Ilya Maximets wrote:
> 
>> On 7/6/23 11:30, Eelco Chaudron wrote:
>>>
>>>
>>> On 30 Jun 2023, at 21:05, Eric Garver wrote:
>>>
>>> Hi Eric,
>>>
>>> I guess some description here would be good. Maybe add some info that this 
>>> is added due to kernel support being added.
>>>
>>>
>>> In general, the patch looks good, and I would ack it. However, there is a 
>>> potential issue with TC hardware offload. If the patch gets applied as is, 
>>> and the kernel fix gets in, the additional DROP actions get added, but they 
>>> will not be offloaded to hardware and are only visible as kernel dp entries.
>>>
>>> I guess this might not be a real problem. However, people who upgrade OVS 
>>> with tc offload might get confused why they now get these drop rules in the 
>>> kernel dp. To solve this, we could simply skip adding them if hardware 
>>> offload is enabled. So the behavior stays the same. Something like this 
>>> (not tested, and we probably need to add dpif_is_netlink()):
>>>
>>>
>>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
>>> index 29f4daa63..bc39e45bd 100644
>>> --- a/ofproto/ofproto-dpif-xlate.c
>>> +++ b/ofproto/ofproto-dpif-xlate.c
>>> @@ -34,12 +34,14 @@
>>>  #include "csum.h"
>>>  #include "dp-packet.h"
>>>  #include "dpif.h"
>>> +#include "dpif-netdev.h"
>>>  #include "in-band.h"
>>>  #include "lacp.h"
>>>  #include "learn.h"
>>>  #include "mac-learning.h"
>>>  #include "mcast-snooping.h"
>>>  #include "multipath.h"
>>> +#include "netdev-offload.h"
>>>  #include "netdev-vport.h"
>>>  #include "netlink.h"
>>>  #include "nx-match.h"
>>> @@ -8199,7 +8201,8 @@ exit:
>>>
>>>  /* Install drop action if datapath supports explicit drop action. */
>>>  if (xin->odp_actions && !xin->odp_actions->size &&
>>> -ovs_explicit_drop_action_supported(ctx.xbridge->ofproto)) {
>>> +ovs_explicit_drop_action_supported(ctx.xbridge->ofproto) &&
>>> +/* Do not add drop actions if TC hardware offload is enabled, as
>>> + * as TC does not support offloading the drop action. */
>>> +(!netdev_is_flow_api_enabled() || dpif_is_netdev(xbridge->dpif))) {
>>
>> I'd argue that the check should not be here.  We should not set the
>> baker.rt_support flag in the first place instead.
> 
> Yes, that was my initial thought also, but what if hw offload gets enabled 
> later? We might need some re-evaluation at that stage. I know the docs say 
> you have to restart ovs when you enable HW offload, but that is not always 
> happening in practice.

But if we keep it this way we'll have misleading information that it is
supported in the log, while it will not be used in practice.

> 
>>>  put_drop_action(xin->odp_actions, ctx.error);
>>>  }
>>>
>>> I do not see a way for TC to support this as is in the kernel. We probably 
>>> need some new kind of drop reason supported TC action.
>>>
>>> I know other people are exploring to see if this explicit drop action can 
>>> be used for other, more user-level things. If this is true this TC part 
>>> needs some proper fixing or we end up with rules not being offloaded.
>>>
>>> Added some more people, who might have some opinions on TC offload / ideas 
>>> on how to add SKB_DROP_REASON support to TC.
>>>
>>> Cheers,
>>>
>>>
>>> Eelco
>>>
 Signed-off-by: Eric Garver 
 ---
  include/linux/openvswitch.h |  2 +-
  lib/dpif.c  |  6 --
  lib/dpif.h  |  1 -
  ofproto/ofproto-dpif.c  | 34 --
  4 files changed, 33 insertions(+), 10 deletions(-)

 diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
 index a265e05ad253..fce6456f47c8 100644
 --- a/include/linux/openvswitch.h
 +++ b/include/linux/openvswitch.h
 @@ -1086,11 +1086,11 @@ enum ovs_action_attr {
OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
 +  OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */

  #ifndef __KERNEL__
OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
 -  OVS_ACTION_ATTR_DROP,  /* u32 xlate_error. */
OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id. */
  #endif
__OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
 diff --git a/lib/dpif.c b/lib/dpif.c
 index d328bf288de0..dc02ec912693 100644
 --- a/lib/dpif.c
 +++ b/lib/dpif.c
 @@ -1928,12 +1928,6 @@ dpif_supports_tnl_push_pop(const struct dpif *dpif)
  return dpif_is_netdev(dpif);
  }

 -bool
 -dpif_supports_explicit_drop_action(const struct dpif *dpif)
 -{
 -return dpif_is_netdev(dpif);
 -}
 -
  bool
  

Re: [ovs-dev] [PATCH v5] netdev-dpdk: Drop TSO in case of conflicting virtio features.

2023-07-07 Thread David Marchand
On Fri, Jul 7, 2023 at 1:50 PM Ilya Maximets  wrote:
> > @@ -4268,7 +4295,9 @@ new_device(int vid)
> >  dev->hw_ol_features |= NETDEV_TX_SCTP_CKSUM_OFFLOAD;
> >  }
> >
> > -if (userspace_tso_enabled()) {
> > +if (userspace_tso_enabled()
> > +&& dev->virtio_features_state & OVS_VIRTIO_F_CLEAN) {
>
> If this needed?  There is no harm in using these features on Tx if negotiated.

I did not test without it.

Without this limitation, this means that the host/OVS can make tso
offloading requests to the guest, but the guest can't make them to the
host.
Iow, on the host side, TSO would be advertised to OVS layers, and
would be displayed in the port status field.
On the guest side, TSO would be forbidden.
I find this asymmetry confusing.

If you don't have access to the guest (which happens in deployments
with VNF as black boxes), the only chance to catch that the workaround
is in effect is to look at the "userspace-tso" field that gets added
by this patch.
But to know about this field, you would have to get a hint something
is wrong on the TSO side.

With announcing that TSO is off for both sides of this vhost port, I
think we have more chances a user would search in OVS documentation
and find out about the "userspace-tso" status field.


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] vswitchd: Wait for a bridge exit before replying to exit unixctl.

2023-07-07 Thread Ilya Maximets
On 7/7/23 14:12, Eelco Chaudron wrote:
> 
> 
> On 7 Jul 2023, at 12:58, Ilya Maximets wrote:
> 
>> On 7/5/23 15:54, Eelco Chaudron wrote:
>>>
>>>
>>> On 4 Jul 2023, at 15:11, Ilya Maximets wrote:
>>>
 Before the cleanup option, the bridge_exit() call was fairly fast,
 because it didn't include any particularly long operations.  However,
 with the cleanup flag, this function destroys a lot of datapath
 resources freeing a lot of memory, waiting on RCU and talking to
 the kernel.  That may take a noticeable amount of time, especially
 on a busy system or under profilers/sanitizers.  However, the unixctl
 'exit' command replies instantly without waiting for any work to
 actually be done.  This may cause system test failures or other
 issues where scripts expect ovs-vswitchd to exit or destroy all the
 datapath resources shortly after appctl call.

 Fix that by waiting for the bridge_exit() before replying to the user.
 At least, all the datapath resources will actually be destroyed by
 the time ovs-appctl exits.

 Also moving a structure from stack to global.  Seems cleaner this way.
>>>
>>> Thanks, yes it looks cleaner. One comment inline below.
>>>
>>> //Eelco
>>>
>>>
 Fixes: fe13ccdca6a2 ("vswitchd: Add --cleanup option to the 'appctl exit' 
 command")
 Signed-off-by: Ilya Maximets 
 ---
  vswitchd/ovs-vswitchd.c | 38 +++---
  1 file changed, 19 insertions(+), 19 deletions(-)

 diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c
 index a244d2f70..55b437871 100644
 --- a/vswitchd/ovs-vswitchd.c
 +++ b/vswitchd/ovs-vswitchd.c
 @@ -68,19 +68,18 @@ static unixctl_cb_func ovs_vswitchd_exit;
  static char *parse_options(int argc, char *argv[], char **unixctl_path);
  OVS_NO_RETURN static void usage(void);

 -struct ovs_vswitchd_exit_args {
 -bool *exiting;
 -bool *cleanup;
 -};
 +static struct ovs_vswitchd_exit_args {
 +struct unixctl_conn *conn;
 +bool exiting;
 +bool cleanup;
 +} exit_args;

  int
  main(int argc, char *argv[])
  {
 -char *unixctl_path = NULL;
  struct unixctl_server *unixctl;
 +char *unixctl_path = NULL;
  char *remote;
 -bool exiting, cleanup;
 -struct ovs_vswitchd_exit_args exit_args = {, };
  int retval;

  set_program_name(argv[0]);
 @@ -111,14 +110,12 @@ main(int argc, char *argv[])
  exit(EXIT_FAILURE);
  }
  unixctl_command_register("exit", "[--cleanup]", 0, 1,
 - ovs_vswitchd_exit, _args);
 + ovs_vswitchd_exit, NULL);

  bridge_init(remote);
  free(remote);

 -exiting = false;
 -cleanup = false;
 -while (!exiting) {
 +while (!exit_args.exiting) {
  OVS_USDT_PROBE(main, run_start);
  memory_run();
  if (memory_should_report()) {
 @@ -137,16 +134,20 @@ main(int argc, char *argv[])
  bridge_wait();
  unixctl_server_wait(unixctl);
  netdev_wait();
 -if (exiting) {
 +if (exit_args.exiting) {
  poll_immediate_wake();
  }
  OVS_USDT_PROBE(main, poll_block);
  poll_block();
  if (should_service_stop()) {
 -exiting = true;
 +exit_args.exiting = true;
  }
  }
 -bridge_exit(cleanup);
 +bridge_exit(exit_args.cleanup);
 +
 +if (exit_args.conn) {
 +unixctl_command_reply(exit_args.conn, NULL);
 +}
  unixctl_server_destroy(unixctl);
  service_stop();
  vlog_disable_async();
 @@ -304,10 +305,9 @@ usage(void)

  static void
  ovs_vswitchd_exit(struct unixctl_conn *conn, int argc,
 -  const char *argv[], void *exit_args_)
 +  const char *argv[], void *args OVS_UNUSED)
  {
 -struct ovs_vswitchd_exit_args *exit_args = exit_args_;
 -*exit_args->exiting = true;
 -*exit_args->cleanup = argc == 2 && !strcmp(argv[1], "--cleanup");
 -unixctl_command_reply(conn, NULL);
 +exit_args.conn = conn;
>>>
>>> Should we try to protect from two exit command in the same 
>>> unixctl_server_run()?
>>> Something like:
>>>
>>> if (exit_args.conn) {
>>>  unixctl_command_reply(conn, NULL);
>>> } else {
>>> exit_args.conn = conn;
>>> }
>>>
>>
>> Good point.  It's unlikely, but can happen.
>> We could either do what you suggested or store an array of connections
>> and try to reply to all of them.  It becomes a bit inconsistent though
>> if different calls have different options.
>>
>> What do you think?
> 
> I was thinking of an array also but what happens with the one 

Re: [ovs-dev] [PATCH] vswitchd: Wait for a bridge exit before replying to exit unixctl.

2023-07-07 Thread Eelco Chaudron



On 7 Jul 2023, at 12:58, Ilya Maximets wrote:

> On 7/5/23 15:54, Eelco Chaudron wrote:
>>
>>
>> On 4 Jul 2023, at 15:11, Ilya Maximets wrote:
>>
>>> Before the cleanup option, the bridge_exit() call was fairly fast,
>>> because it didn't include any particularly long operations.  However,
>>> with the cleanup flag, this function destroys a lot of datapath
>>> resources freeing a lot of memory, waiting on RCU and talking to
>>> the kernel.  That may take a noticeable amount of time, especially
>>> on a busy system or under profilers/sanitizers.  However, the unixctl
>>> 'exit' command replies instantly without waiting for any work to
>>> actually be done.  This may cause system test failures or other
>>> issues where scripts expect ovs-vswitchd to exit or destroy all the
>>> datapath resources shortly after appctl call.
>>>
>>> Fix that by waiting for the bridge_exit() before replying to the user.
>>> At least, all the datapath resources will actually be destroyed by
>>> the time ovs-appctl exits.
>>>
>>> Also moving a structure from stack to global.  Seems cleaner this way.
>>
>> Thanks, yes it looks cleaner. One comment inline below.
>>
>> //Eelco
>>
>>
>>> Fixes: fe13ccdca6a2 ("vswitchd: Add --cleanup option to the 'appctl exit' 
>>> command")
>>> Signed-off-by: Ilya Maximets 
>>> ---
>>>  vswitchd/ovs-vswitchd.c | 38 +++---
>>>  1 file changed, 19 insertions(+), 19 deletions(-)
>>>
>>> diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c
>>> index a244d2f70..55b437871 100644
>>> --- a/vswitchd/ovs-vswitchd.c
>>> +++ b/vswitchd/ovs-vswitchd.c
>>> @@ -68,19 +68,18 @@ static unixctl_cb_func ovs_vswitchd_exit;
>>>  static char *parse_options(int argc, char *argv[], char **unixctl_path);
>>>  OVS_NO_RETURN static void usage(void);
>>>
>>> -struct ovs_vswitchd_exit_args {
>>> -bool *exiting;
>>> -bool *cleanup;
>>> -};
>>> +static struct ovs_vswitchd_exit_args {
>>> +struct unixctl_conn *conn;
>>> +bool exiting;
>>> +bool cleanup;
>>> +} exit_args;
>>>
>>>  int
>>>  main(int argc, char *argv[])
>>>  {
>>> -char *unixctl_path = NULL;
>>>  struct unixctl_server *unixctl;
>>> +char *unixctl_path = NULL;
>>>  char *remote;
>>> -bool exiting, cleanup;
>>> -struct ovs_vswitchd_exit_args exit_args = {, };
>>>  int retval;
>>>
>>>  set_program_name(argv[0]);
>>> @@ -111,14 +110,12 @@ main(int argc, char *argv[])
>>>  exit(EXIT_FAILURE);
>>>  }
>>>  unixctl_command_register("exit", "[--cleanup]", 0, 1,
>>> - ovs_vswitchd_exit, _args);
>>> + ovs_vswitchd_exit, NULL);
>>>
>>>  bridge_init(remote);
>>>  free(remote);
>>>
>>> -exiting = false;
>>> -cleanup = false;
>>> -while (!exiting) {
>>> +while (!exit_args.exiting) {
>>>  OVS_USDT_PROBE(main, run_start);
>>>  memory_run();
>>>  if (memory_should_report()) {
>>> @@ -137,16 +134,20 @@ main(int argc, char *argv[])
>>>  bridge_wait();
>>>  unixctl_server_wait(unixctl);
>>>  netdev_wait();
>>> -if (exiting) {
>>> +if (exit_args.exiting) {
>>>  poll_immediate_wake();
>>>  }
>>>  OVS_USDT_PROBE(main, poll_block);
>>>  poll_block();
>>>  if (should_service_stop()) {
>>> -exiting = true;
>>> +exit_args.exiting = true;
>>>  }
>>>  }
>>> -bridge_exit(cleanup);
>>> +bridge_exit(exit_args.cleanup);
>>> +
>>> +if (exit_args.conn) {
>>> +unixctl_command_reply(exit_args.conn, NULL);
>>> +}
>>>  unixctl_server_destroy(unixctl);
>>>  service_stop();
>>>  vlog_disable_async();
>>> @@ -304,10 +305,9 @@ usage(void)
>>>
>>>  static void
>>>  ovs_vswitchd_exit(struct unixctl_conn *conn, int argc,
>>> -  const char *argv[], void *exit_args_)
>>> +  const char *argv[], void *args OVS_UNUSED)
>>>  {
>>> -struct ovs_vswitchd_exit_args *exit_args = exit_args_;
>>> -*exit_args->exiting = true;
>>> -*exit_args->cleanup = argc == 2 && !strcmp(argv[1], "--cleanup");
>>> -unixctl_command_reply(conn, NULL);
>>> +exit_args.conn = conn;
>>
>> Should we try to protect from two exit command in the same 
>> unixctl_server_run()?
>> Something like:
>>
>> if (exit_args.conn) {
>>  unixctl_command_reply(conn, NULL);
>> } else {
>> exit_args.conn = conn;
>> }
>>
>
> Good point.  It's unlikely, but can happen.
> We could either do what you suggested or store an array of connections
> and try to reply to all of them.  It becomes a bit inconsistent though
> if different calls have different options.
>
> What do you think?

I was thinking of an array also but what happens with the one not fitting in 
the array!?

However giving it another thought, it might just break with a pipe error and 
quit. So maybe we should just reply to the first one and let the other dingle 

Re: [ovs-dev] [PATCH ovn] binding: fixed port claims as additional_chassis

2023-07-07 Thread Ales Musil
On Tue, Jun 27, 2023 at 11:37 AM Xavier Simonart 
wrote:

> When sb is read-only, the port claim is delayed until sb is rw.
> However, before this patch, this resulted in the chassis always
> claiming the port as main (while it was maybe an additional chassis).
>
> Fixes: a7c7d4519e50 ("controller: avoid recomputes triggered by SBDB
> Port_Binding updates.")
>
> Signed-off-by: Xavier Simonart 
> ---
>  controller/binding.c   | 14 +++---
>  controller/binding.h   |  3 ++-
>  controller/if-status.c | 10 ++
>  3 files changed, 19 insertions(+), 8 deletions(-)
>
> diff --git a/controller/binding.c b/controller/binding.c
> index 359ad6d66..9fb90cf9f 100644
> --- a/controller/binding.c
> +++ b/controller/binding.c
> @@ -1164,7 +1164,9 @@ local_bindings_pb_chassis_is_set(struct shash
> *local_bindings,
>  local_binding_find(local_bindings, pb_name);
>  struct binding_lport *b_lport =
> local_binding_get_primary_lport(lbinding);
>
> -if (b_lport && b_lport->pb && b_lport->pb->chassis == chassis_rec) {
> +if (b_lport && b_lport->pb &&
> +   ((b_lport->pb->chassis == chassis_rec) ||
> + is_additional_chassis(b_lport->pb, chassis_rec))) {
>  return true;
>  }
>  return false;
> @@ -1173,14 +1175,20 @@ local_bindings_pb_chassis_is_set(struct shash
> *local_bindings,
>  void
>  local_binding_set_pb(struct shash *local_bindings, const char *pb_name,
>   const struct sbrec_chassis *chassis_rec,
> - struct hmap *tracked_datapaths, bool is_set)
> + struct hmap *tracked_datapaths, bool is_set,
> + enum can_bind bind_type)
>  {
>  struct local_binding *lbinding =
>  local_binding_find(local_bindings, pb_name);
>  struct binding_lport *b_lport =
> local_binding_get_primary_lport(lbinding);
>
>  if (b_lport) {
> -set_pb_chassis_in_sbrec(b_lport->pb, chassis_rec, is_set);
> +if (bind_type == CAN_BIND_AS_MAIN) {
> +set_pb_chassis_in_sbrec(b_lport->pb, chassis_rec, is_set);
> +} else  if (bind_type == CAN_BIND_AS_ADDITIONAL) {
> +set_pb_additional_chassis_in_sbrec(b_lport->pb, chassis_rec,
> +   is_set);
> +}
>  if (tracked_datapaths) {
>  update_lport_tracking(b_lport->pb, tracked_datapaths, true);
>  }
> diff --git a/controller/binding.h b/controller/binding.h
> index 319cbd263..abc3d6117 100644
> --- a/controller/binding.h
> +++ b/controller/binding.h
> @@ -23,6 +23,7 @@
>  #include "openvswitch/uuid.h"
>  #include "openvswitch/list.h"
>  #include "sset.h"
> +#include "lport.h"
>
>  struct hmap;
>  struct ovsdb_idl;
> @@ -177,7 +178,7 @@ void local_binding_set_down(struct shash
> *local_bindings, const char *pb_name,
>  void local_binding_set_pb(struct shash *local_bindings, const char
> *pb_name,
>const struct sbrec_chassis *chassis_rec,
>struct hmap *tracked_datapaths,
> -  bool is_set);
> +  bool is_set, enum can_bind);
>  bool local_bindings_pb_chassis_is_set(struct shash *local_bindings,
>const char *pb_name,
>const struct sbrec_chassis
> *chassis_rec);
> diff --git a/controller/if-status.c b/controller/if-status.c
> index 2b2eb1679..b45208746 100644
> --- a/controller/if-status.c
> +++ b/controller/if-status.c
> @@ -184,6 +184,7 @@ struct ovs_iface {
>   * OIF_INSTALL_FLOWS.
>   */
>  uint16_t mtu;   /* Extracted from OVS interface.mtu field. */
> +enum can_bind bind_type;/* CAN_BIND_AS_MAIN or CAN_BIND_AS_ADDITIONAL
> */
>  };
>
>  static uint64_t ifaces_usage;
> @@ -285,6 +286,7 @@ if_status_mgr_claim_iface(struct if_status_mgr *mgr,
>  if (!iface) {
>  iface = ovs_iface_create(mgr, iface_id, iface_rec, OIF_CLAIMED);
>  }
> +iface->bind_type = bind_type;
>
>  memcpy(>pb_uuid, >header_.uuid, sizeof(iface->pb_uuid));
>  if (!sb_readonly) {
> @@ -406,7 +408,7 @@ if_status_handle_claims(struct if_status_mgr *mgr,
>  struct ovs_iface *iface = node->data;
>  VLOG_INFO("if_status_handle_claims for %s", iface->id);
>  local_binding_set_pb(bindings, iface->id, chassis_rec,
> - tracked_datapath, true);
> + tracked_datapath, true, iface->bind_type);
>  rc = true;
>  }
>  return rc;
> @@ -473,7 +475,7 @@ if_status_mgr_update(struct if_status_mgr *mgr,
>  chassis_rec)) {
>  if (!sb_readonly) {
>  local_binding_set_pb(bindings, iface->id, chassis_rec,
> - NULL, true);
> + NULL, true, iface->bind_type);
>  } else {
>  continue;
>

Re: [ovs-dev] [PATCH v2 2/4] dpif: probe support for OVS_ACTION_ATTR_DROP

2023-07-07 Thread Eelco Chaudron



On 7 Jul 2023, at 13:03, Ilya Maximets wrote:

> On 7/6/23 11:30, Eelco Chaudron wrote:
>>
>>
>> On 30 Jun 2023, at 21:05, Eric Garver wrote:
>>
>> Hi Eric,
>>
>> I guess some description here would be good. Maybe add some info that this 
>> is added due to kernel support being added.
>>
>>
>> In general, the patch looks good, and I would ack it. However, there is a 
>> potential issue with TC hardware offload. If the patch gets applied as is, 
>> and the kernel fix gets in, the additional DROP actions get added, but they 
>> will not be offloaded to hardware and are only visible as kernel dp entries.
>>
>> I guess this might not be a real problem. However, people who upgrade OVS 
>> with tc offload might get confused why they now get these drop rules in the 
>> kernel dp. To solve this, we could simply skip adding them if hardware 
>> offload is enabled. So the behavior stays the same. Something like this (not 
>> tested, and we probably need to add dpif_is_netlink()):
>>
>>
>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
>> index 29f4daa63..bc39e45bd 100644
>> --- a/ofproto/ofproto-dpif-xlate.c
>> +++ b/ofproto/ofproto-dpif-xlate.c
>> @@ -34,12 +34,14 @@
>>  #include "csum.h"
>>  #include "dp-packet.h"
>>  #include "dpif.h"
>> +#include "dpif-netdev.h"
>>  #include "in-band.h"
>>  #include "lacp.h"
>>  #include "learn.h"
>>  #include "mac-learning.h"
>>  #include "mcast-snooping.h"
>>  #include "multipath.h"
>> +#include "netdev-offload.h"
>>  #include "netdev-vport.h"
>>  #include "netlink.h"
>>  #include "nx-match.h"
>> @@ -8199,7 +8201,8 @@ exit:
>>
>>  /* Install drop action if datapath supports explicit drop action. */
>>  if (xin->odp_actions && !xin->odp_actions->size &&
>> -ovs_explicit_drop_action_supported(ctx.xbridge->ofproto)) {
>> +ovs_explicit_drop_action_supported(ctx.xbridge->ofproto) &&
>> +/* Do not add drop actions if TC hardware offload is enabled, as
>> + * as TC does not support offloading the drop action. */
>> +(!netdev_is_flow_api_enabled() || dpif_is_netdev(xbridge->dpif))) {
>
> I'd argue that the check should not be here.  We should not set the
> baker.rt_support flag in the first place instead.

Yes, that was my initial thought also, but what if hw offload gets enabled 
later? We might need some re-evaluation at that stage. I know the docs say you 
have to restart ovs when you enable HW offload, but that is not always 
happening in practice.

>>  put_drop_action(xin->odp_actions, ctx.error);
>>  }
>>
>> I do not see a way for TC to support this as is in the kernel. We probably 
>> need some new kind of drop reason supported TC action.
>>
>> I know other people are exploring to see if this explicit drop action can be 
>> used for other, more user-level things. If this is true this TC part needs 
>> some proper fixing or we end up with rules not being offloaded.
>>
>> Added some more people, who might have some opinions on TC offload / ideas 
>> on how to add SKB_DROP_REASON support to TC.
>>
>> Cheers,
>>
>>
>> Eelco
>>
>>> Signed-off-by: Eric Garver 
>>> ---
>>>  include/linux/openvswitch.h |  2 +-
>>>  lib/dpif.c  |  6 --
>>>  lib/dpif.h  |  1 -
>>>  ofproto/ofproto-dpif.c  | 34 --
>>>  4 files changed, 33 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>>> index a265e05ad253..fce6456f47c8 100644
>>> --- a/include/linux/openvswitch.h
>>> +++ b/include/linux/openvswitch.h
>>> @@ -1086,11 +1086,11 @@ enum ovs_action_attr {
>>> OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
>>> OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
>>> OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
>>> +   OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */
>>>
>>>  #ifndef __KERNEL__
>>> OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
>>> OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
>>> -   OVS_ACTION_ATTR_DROP,  /* u32 xlate_error. */
>>> OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id. */
>>>  #endif
>>> __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
>>> diff --git a/lib/dpif.c b/lib/dpif.c
>>> index d328bf288de0..dc02ec912693 100644
>>> --- a/lib/dpif.c
>>> +++ b/lib/dpif.c
>>> @@ -1928,12 +1928,6 @@ dpif_supports_tnl_push_pop(const struct dpif *dpif)
>>>  return dpif_is_netdev(dpif);
>>>  }
>>>
>>> -bool
>>> -dpif_supports_explicit_drop_action(const struct dpif *dpif)
>>> -{
>>> -return dpif_is_netdev(dpif);
>>> -}
>>> -
>>>  bool
>>>  dpif_supports_lb_output_action(const struct dpif *dpif)
>>>  {
>>> diff --git a/lib/dpif.h b/lib/dpif.h
>>> index 129cbf6a1d5f..fc1719f88b4f 100644
>>> --- a/lib/dpif.h
>>> +++ b/lib/dpif.h
>>> @@ -938,7 +938,6 @@ int dpif_get_pmds_for_port(const struct dpif * dpif, 
>>> odp_port_t port_no,
>>>
>>>  

Re: [ovs-dev] [PATCH v2 4/4] tests: system-traffic: add coverage for drop action

2023-07-07 Thread Eelco Chaudron


On 7 Jul 2023, at 13:08, Ilya Maximets wrote:

> On 7/5/23 21:47, Eelco Chaudron wrote:
>>
>>
>> On Wed, Jul 5, 2023 at 6:46 PM Eric Garver  wrote:
>>
>> On Wed, Jul 05, 2023 at 04:58:47PM +0200, Eelco Chaudron wrote:
>> >
>> >
>> > On 30 Jun 2023, at 21:05, Eric Garver wrote:
>> > Hi Eric,
>> >
>> > I started reviewing the series, and this test was failing every other 
>> run for me on ‘check-system-userspace’. I ended up making the following 
>> additional change:
>> >
>> > diff --git a/tests/ofproto-macros.at  
>> b/tests/ofproto-macros.at 
>> > index d2e6ac768..573ecdd0f 100644
>> > --- a/tests/ofproto-macros.at 
>> > +++ b/tests/ofproto-macros.at 
>> > @@ -120,7 +120,7 @@ strip_xids () {
>> >
>> >  # Changes all 'used:...' to say 'used:0.0', to make output easier to 
>> compare.
>> >  strip_used () {
>> > -    sed 's/used:[[0-9]]\.[[0-9]]*/used:0.0/'
>> > +    sed 's/used:\([[0-9]]\.[[0-9]]*s\|never\)/used:0.0s/'
>> >  }
>>
>> In this test case I expect the flow to be used. My local test runs do
>> not yield "never".
>>
>> Maybe we need a "ovs-appctl time/warp 5000" before dumping the flow?
>> This is used in tests/drop-stats.at  .
>>
>> Tried that but did not work, if you look at the actual number of packets 
>> received by the flow entry, it varies quite a lot from 0 (only learning) to 
>> 5 when I run the test multiple times.
>>
>> I tried changing the following to generate more traffic:
>>
>> -dnl generate some traffic
>> -NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], [ignore])
>> +dnl Generate some traffic.
>> +NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1], 
>> [ignore])
>>
>> And then run it 100 times, and the earlier problem did not happen:
>>
>>   sudo bash -c 'for i in {1..100}; do make check-system-userspace 
>> TESTSUITEFLAGS="-k drop -v" || break; echo "EC_DONE: $i"; done'
>>
>> However, I have some times it fails with no traffic:
>>
>>
>> --- - 2023-07-05 21:46:59.860127196 +0200
>> +++ 
>> /home/echaudron/Documents/review/ovs_eric_DROP/tests/system-userspace-testsuite.dir/at-groups/45/stdout
>>  2023-07-05 21:46:59.857641231 +0200
>> @@ -1,2 +1 @@
>> -recirc_id(),in_port(2),eth_type(0x0800),ipv4(frag=no), packets:0, 
>> bytes:0, used:0.0s, actions:drop
>>  
>> Maybe this test needs some more love ;)
>
> The test likely needs a revalidator/wait call, since you're
> checking datapath flows.  Otherwise, it'll be time sensitive.
> Stats are gathered during revalidation.

That was might thought initially, and I tried it (no luck), but it’s dp flows, 
not openflow.

>>
>> >
>> > This is also the reason why the intel tests are failing in patchwork; 
>> https://mail.openvswitch.org/pipermail/ovs-build/2023-June/032011.html 
>> 
>> >
>> > I still need to review the other patches in the series, but some small 
>> comments while testing the patch set:
>> >
>> > We are trying to align all commit messages to start with a Capital and 
>> end with a dot. So in your case:
>> >
>> > ‘tests: system-traffic: Add coverage for drop action.’
>>
>> ACK. I'll address this in v3.
>>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5] netdev-dpdk: Drop TSO in case of conflicting virtio features.

2023-07-07 Thread Ilya Maximets
On 7/6/23 21:42, David Marchand wrote:
> At some point in OVS history, some virtio features were announced as
> supported (ECN and UFO virtio features).
> 
> The userspace TSO code, which has been added later, does not support
> those features and tries to disable them.
> 
> This breaks OVS upgrades: if an existing VM already negotiated such
> features, their lack on reconnection to an upgraded OVS triggers a
> vhost socket disconnection by Qemu.
> This results in an endless loop because Qemu then retries with the same
> set of virtio features.
> 
> This patch proposes to try and detect those vhost socket disconnection
> and fallback restoring the old virtio features (and disabling TSO for this
> vhost port).
> 
> Acked-by: Mike Pattrick 
> Acked-by: Simon Horman 
> Signed-off-by: David Marchand 
> ---
> Changelog since v4:
> - I kept acks as the logic behind the state machine did not change much,
> - fixed indent of enumeration in documentation,
> - used status: in documentation instead of grep -o,
> - renamed "disabled_tso" as "userspace-tso",
> - switched to a state machine with flags,
> - removed note on byte padding in netdev_dpdk struct,

Thanks, David!  A few nits inline.

Best regards, Ilya Maximets.

> 
> Changelog since v3:
> - updated documentation now that the interface offloads status is reported
>   in ovsdb,
> - fixed one coding style issue,
> 
> Changelog since v2:
> - reported workaround presence in the ovsdb port status field and
>   updated documentation accordingly,
> - tried to use "better" names, to distinguish ECN virtio feature from
>   TSO OVS netdev feature, 
> 
> Changelog since v1:
> - added a note in the documentation,
> - fixed vhost unregister trigger (so that both disabling and re-enabling
>   TSO is handled),
> - cleared netdev features when disabling TSO,
> - changed level and ratelimited log message on vhost socket disconnect,
> 
> ---
>  Documentation/topics/userspace-tso.rst | 26 ++-
>  lib/netdev-dpdk.c  | 98 --
>  2 files changed, 118 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/topics/userspace-tso.rst 
> b/Documentation/topics/userspace-tso.rst
> index 5a43c2e86b..c4b15f2604 100644
> --- a/Documentation/topics/userspace-tso.rst
> +++ b/Documentation/topics/userspace-tso.rst
> @@ -68,7 +68,7 @@ as follows.
>  connection is established, `TSO` is thus advertised to the guest as an
>  available feature:
>  
> -QEMU Command Line Parameter::
> +1. QEMU Command Line Parameter::
>  
>  $ sudo $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
>  ...
> @@ -77,12 +77,34 @@ QEMU Command Line Parameter::
>  ...
>  
>  2. Ethtool. Assuming that the guest's OS also supports `TSO`, ethtool can be
> -used to enable same::
> +   used to enable same::
>  
>  $ ethtool -K eth0 sg on # scatter-gather is a prerequisite for TSO
>  $ ethtool -K eth0 tso on
>  $ ethtool -k eth0
>  
> +**Note:** Enabling this feature impacts the virtio features exposed by the 
> DPDK
> +vHost User backend to a guest. If a guest was already connected to OvS before
> +enabling TSO and restarting OvS, this guest ports won't have TSO available::
> +
> +$ ovs-vsctl get interface vhost0 status:tx_tcp_seg_offload
> +"false"
> +
> +To help diagnose the issue, those ports have some additional information in
> +their status field in ovsdb::
> +
> +$ ovs-vsctl get interface vhost0 status:userspace-tso
> +disabled
> +
> +To restore TSO for this guest ports, this guest QEMU process must be stopped,
> +then started again. OvS will then report::
> +
> +   $ ovs-vsctl get interface vhost0 status:tx_tcp_seg_offload
> +   "true"
> +
> +   $ ovs-vsctl get interface vhost0 status:userspace-tso
> +   ovs-vsctl: no key "userspace-tso" in Interface record "vhost0" column 
> status
> +
>  ~~~
>  Limitations
>  ~~~
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 63dac689e3..ae28eaa6b6 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -418,6 +418,18 @@ enum dpdk_hw_ol_features {
>  NETDEV_TX_TSO_OFFLOAD = 1 << 7,
>  };
>  
> +/* Flags for the netdev_dpdk virtio_features_state field.
> + * This is used for the virtio features recovery mechanism linked to TSO
> + * support. */
> +#define OVS_VIRTIO_F_CLEAN (UINT8_C(1) << 0)
> +#define OVS_VIRTIO_F_WORKAROUND (UINT8_C(1) << 1)
> +#define OVS_VIRTIO_F_NEGOTIATED (UINT8_C(1) << 2)
> +#define OVS_VIRTIO_F_RECONF_PENDING (UINT8_C(1) << 3)
> +#define OVS_VIRTIO_F_CLEAN_NEGOTIATED (OVS_VIRTIO_F_CLEAN \
> +   | OVS_VIRTIO_F_NEGOTIATED)
> +#define OVS_VIRTIO_F_WORKAROUND_NEGOTIATED (OVS_VIRTIO_F_WORKAROUND \
> +| OVS_VIRTIO_F_NEGOTIATED)

Maybe break the lines before the '(' ?  Might be easier to read:

#define OVS_VIRTIO_F_CLEAN_NEGOTIATED \
(OVS_VIRTIO_F_CLEAN  | OVS_VIRTIO_F_NEGOTIATED)
#define OVS_VIRTIO_F_WORKAROUND_NEGOTIATED \

Re: [ovs-dev] [ovs-dev v4] dpif-netdev: fix dpif_netdev_flow_put

2023-07-07 Thread Ilya Maximets
On 7/5/23 16:22, Eelco Chaudron wrote:
>>> +dnl Hold the prefix 10.1.2.2/24 by another 10s.
>>> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 
>>> 'ipv4(src=10.0.0.1,dst=10.1.2.2),tcp(src=1,dst=2)'])
>>> +dnl Send more 10.1.0.2 to make 10.1.0.0/16 tuple preprend 10.1.2.0/24 
>>> tuple in pvector of subtable.
>>> +for i in `seq 0 256`;do
>>> +AT_CHECK([ovs-appctl netdev-dummy/receive p1 
>>> 'ipv4(src=10.0.0.1,dst=10.1.0.2),tcp(src=1,dst=2)'])
> 
> Missing indent on AT_CHECK(). However different styles exist. No indent, 2 
> spaces, 4 spaces. What is the preference, Ilya?

Most of the code in automake/autotest is using 2 spaces.
So, I'd go with 2 space indentation here.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 4/4] tests: system-traffic: add coverage for drop action

2023-07-07 Thread Ilya Maximets
On 7/5/23 21:47, Eelco Chaudron wrote:
> 
> 
> On Wed, Jul 5, 2023 at 6:46 PM Eric Garver  wrote:
> 
> On Wed, Jul 05, 2023 at 04:58:47PM +0200, Eelco Chaudron wrote:
> >
> >
> > On 30 Jun 2023, at 21:05, Eric Garver wrote:
> > Hi Eric,
> >
> > I started reviewing the series, and this test was failing every other 
> run for me on ‘check-system-userspace’. I ended up making the following 
> additional change:
> >
> > diff --git a/tests/ofproto-macros.at  
> b/tests/ofproto-macros.at 
> > index d2e6ac768..573ecdd0f 100644
> > --- a/tests/ofproto-macros.at 
> > +++ b/tests/ofproto-macros.at 
> > @@ -120,7 +120,7 @@ strip_xids () {
> >
> >  # Changes all 'used:...' to say 'used:0.0', to make output easier to 
> compare.
> >  strip_used () {
> > -    sed 's/used:[[0-9]]\.[[0-9]]*/used:0.0/'
> > +    sed 's/used:\([[0-9]]\.[[0-9]]*s\|never\)/used:0.0s/'
> >  }
> 
> In this test case I expect the flow to be used. My local test runs do
> not yield "never".
> 
> Maybe we need a "ovs-appctl time/warp 5000" before dumping the flow?
> This is used in tests/drop-stats.at .
> 
> Tried that but did not work, if you look at the actual number of packets 
> received by the flow entry, it varies quite a lot from 0 (only learning) to 5 
> when I run the test multiple times.
> 
> I tried changing the following to generate more traffic:
> 
> -dnl generate some traffic
> -NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -w 2 10.1.1.2], [1], [ignore])
> +dnl Generate some traffic.
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 100 -i 0.01 -w 1 10.1.1.2], [1], 
> [ignore])
> 
> And then run it 100 times, and the earlier problem did not happen:
> 
>   sudo bash -c 'for i in {1..100}; do make check-system-userspace 
> TESTSUITEFLAGS="-k drop -v" || break; echo "EC_DONE: $i"; done'
> 
> However, I have some times it fails with no traffic:
> 
> 
> --- - 2023-07-05 21:46:59.860127196 +0200
> +++ 
> /home/echaudron/Documents/review/ovs_eric_DROP/tests/system-userspace-testsuite.dir/at-groups/45/stdout
>  2023-07-05 21:46:59.857641231 +0200
> @@ -1,2 +1 @@
> -recirc_id(),in_port(2),eth_type(0x0800),ipv4(frag=no), packets:0, 
> bytes:0, used:0.0s, actions:drop
>  
> Maybe this test needs some more love ;)

The test likely needs a revalidator/wait call, since you're
checking datapath flows.  Otherwise, it'll be time sensitive.
Stats are gathered during revalidation.

> 
> >
> > This is also the reason why the intel tests are failing in patchwork; 
> https://mail.openvswitch.org/pipermail/ovs-build/2023-June/032011.html 
> 
> >
> > I still need to review the other patches in the series, but some small 
> comments while testing the patch set:
> >
> > We are trying to align all commit messages to start with a Capital and 
> end with a dot. So in your case:
> >
> > ‘tests: system-traffic: Add coverage for drop action.’
> 
> ACK. I'll address this in v3.
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 2/4] dpif: probe support for OVS_ACTION_ATTR_DROP

2023-07-07 Thread Ilya Maximets
On 7/6/23 11:30, Eelco Chaudron wrote:
> 
> 
> On 30 Jun 2023, at 21:05, Eric Garver wrote:
> 
> Hi Eric,
> 
> I guess some description here would be good. Maybe add some info that this is 
> added due to kernel support being added.
> 
> 
> In general, the patch looks good, and I would ack it. However, there is a 
> potential issue with TC hardware offload. If the patch gets applied as is, 
> and the kernel fix gets in, the additional DROP actions get added, but they 
> will not be offloaded to hardware and are only visible as kernel dp entries.
> 
> I guess this might not be a real problem. However, people who upgrade OVS 
> with tc offload might get confused why they now get these drop rules in the 
> kernel dp. To solve this, we could simply skip adding them if hardware 
> offload is enabled. So the behavior stays the same. Something like this (not 
> tested, and we probably need to add dpif_is_netlink()):
> 
> 
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 29f4daa63..bc39e45bd 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -34,12 +34,14 @@
>  #include "csum.h"
>  #include "dp-packet.h"
>  #include "dpif.h"
> +#include "dpif-netdev.h"
>  #include "in-band.h"
>  #include "lacp.h"
>  #include "learn.h"
>  #include "mac-learning.h"
>  #include "mcast-snooping.h"
>  #include "multipath.h"
> +#include "netdev-offload.h"
>  #include "netdev-vport.h"
>  #include "netlink.h"
>  #include "nx-match.h"
> @@ -8199,7 +8201,8 @@ exit:
> 
>  /* Install drop action if datapath supports explicit drop action. */
>  if (xin->odp_actions && !xin->odp_actions->size &&
> -ovs_explicit_drop_action_supported(ctx.xbridge->ofproto)) {
> +ovs_explicit_drop_action_supported(ctx.xbridge->ofproto) &&
> +/* Do not add drop actions if TC hardware offload is enabled, as
> + * as TC does not support offloading the drop action. */
> +(!netdev_is_flow_api_enabled() || dpif_is_netdev(xbridge->dpif))) {

I'd argue that the check should not be here.  We should not set the
baker.rt_support flag in the first place instead.

>  put_drop_action(xin->odp_actions, ctx.error);
>  }
> 
> I do not see a way for TC to support this as is in the kernel. We probably 
> need some new kind of drop reason supported TC action.
> 
> I know other people are exploring to see if this explicit drop action can be 
> used for other, more user-level things. If this is true this TC part needs 
> some proper fixing or we end up with rules not being offloaded.
> 
> Added some more people, who might have some opinions on TC offload / ideas on 
> how to add SKB_DROP_REASON support to TC.
> 
> Cheers,
> 
> 
> Eelco
> 
>> Signed-off-by: Eric Garver 
>> ---
>>  include/linux/openvswitch.h |  2 +-
>>  lib/dpif.c  |  6 --
>>  lib/dpif.h  |  1 -
>>  ofproto/ofproto-dpif.c  | 34 --
>>  4 files changed, 33 insertions(+), 10 deletions(-)
>>
>> diff --git a/include/linux/openvswitch.h b/include/linux/openvswitch.h
>> index a265e05ad253..fce6456f47c8 100644
>> --- a/include/linux/openvswitch.h
>> +++ b/include/linux/openvswitch.h
>> @@ -1086,11 +1086,11 @@ enum ovs_action_attr {
>>  OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
>>  OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
>>  OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
>> +OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */
>>
>>  #ifndef __KERNEL__
>>  OVS_ACTION_ATTR_TUNNEL_PUSH,   /* struct ovs_action_push_tnl*/
>>  OVS_ACTION_ATTR_TUNNEL_POP,/* u32 port number. */
>> -OVS_ACTION_ATTR_DROP,  /* u32 xlate_error. */
>>  OVS_ACTION_ATTR_LB_OUTPUT, /* u32 bond-id. */
>>  #endif
>>  __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
>> diff --git a/lib/dpif.c b/lib/dpif.c
>> index d328bf288de0..dc02ec912693 100644
>> --- a/lib/dpif.c
>> +++ b/lib/dpif.c
>> @@ -1928,12 +1928,6 @@ dpif_supports_tnl_push_pop(const struct dpif *dpif)
>>  return dpif_is_netdev(dpif);
>>  }
>>
>> -bool
>> -dpif_supports_explicit_drop_action(const struct dpif *dpif)
>> -{
>> -return dpif_is_netdev(dpif);
>> -}
>> -
>>  bool
>>  dpif_supports_lb_output_action(const struct dpif *dpif)
>>  {
>> diff --git a/lib/dpif.h b/lib/dpif.h
>> index 129cbf6a1d5f..fc1719f88b4f 100644
>> --- a/lib/dpif.h
>> +++ b/lib/dpif.h
>> @@ -938,7 +938,6 @@ int dpif_get_pmds_for_port(const struct dpif * dpif, 
>> odp_port_t port_no,
>>
>>  char *dpif_get_dp_version(const struct dpif *);
>>  bool dpif_supports_tnl_push_pop(const struct dpif *);
>> -bool dpif_supports_explicit_drop_action(const struct dpif *);
>>  bool dpif_synced_dp_layers(struct dpif *);
>>
>>  /* Log functions. */
>> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
>> index fad7342b0b02..c490a3c1da48 100644
>> --- a/ofproto/ofproto-dpif.c
>> +++ 

Re: [ovs-dev] [PATCH] vswitchd: Wait for a bridge exit before replying to exit unixctl.

2023-07-07 Thread Ilya Maximets
On 7/5/23 15:54, Eelco Chaudron wrote:
> 
> 
> On 4 Jul 2023, at 15:11, Ilya Maximets wrote:
> 
>> Before the cleanup option, the bridge_exit() call was fairly fast,
>> because it didn't include any particularly long operations.  However,
>> with the cleanup flag, this function destroys a lot of datapath
>> resources freeing a lot of memory, waiting on RCU and talking to
>> the kernel.  That may take a noticeable amount of time, especially
>> on a busy system or under profilers/sanitizers.  However, the unixctl
>> 'exit' command replies instantly without waiting for any work to
>> actually be done.  This may cause system test failures or other
>> issues where scripts expect ovs-vswitchd to exit or destroy all the
>> datapath resources shortly after appctl call.
>>
>> Fix that by waiting for the bridge_exit() before replying to the user.
>> At least, all the datapath resources will actually be destroyed by
>> the time ovs-appctl exits.
>>
>> Also moving a structure from stack to global.  Seems cleaner this way.
> 
> Thanks, yes it looks cleaner. One comment inline below.
> 
> //Eelco
> 
> 
>> Fixes: fe13ccdca6a2 ("vswitchd: Add --cleanup option to the 'appctl exit' 
>> command")
>> Signed-off-by: Ilya Maximets 
>> ---
>>  vswitchd/ovs-vswitchd.c | 38 +++---
>>  1 file changed, 19 insertions(+), 19 deletions(-)
>>
>> diff --git a/vswitchd/ovs-vswitchd.c b/vswitchd/ovs-vswitchd.c
>> index a244d2f70..55b437871 100644
>> --- a/vswitchd/ovs-vswitchd.c
>> +++ b/vswitchd/ovs-vswitchd.c
>> @@ -68,19 +68,18 @@ static unixctl_cb_func ovs_vswitchd_exit;
>>  static char *parse_options(int argc, char *argv[], char **unixctl_path);
>>  OVS_NO_RETURN static void usage(void);
>>
>> -struct ovs_vswitchd_exit_args {
>> -bool *exiting;
>> -bool *cleanup;
>> -};
>> +static struct ovs_vswitchd_exit_args {
>> +struct unixctl_conn *conn;
>> +bool exiting;
>> +bool cleanup;
>> +} exit_args;
>>
>>  int
>>  main(int argc, char *argv[])
>>  {
>> -char *unixctl_path = NULL;
>>  struct unixctl_server *unixctl;
>> +char *unixctl_path = NULL;
>>  char *remote;
>> -bool exiting, cleanup;
>> -struct ovs_vswitchd_exit_args exit_args = {, };
>>  int retval;
>>
>>  set_program_name(argv[0]);
>> @@ -111,14 +110,12 @@ main(int argc, char *argv[])
>>  exit(EXIT_FAILURE);
>>  }
>>  unixctl_command_register("exit", "[--cleanup]", 0, 1,
>> - ovs_vswitchd_exit, _args);
>> + ovs_vswitchd_exit, NULL);
>>
>>  bridge_init(remote);
>>  free(remote);
>>
>> -exiting = false;
>> -cleanup = false;
>> -while (!exiting) {
>> +while (!exit_args.exiting) {
>>  OVS_USDT_PROBE(main, run_start);
>>  memory_run();
>>  if (memory_should_report()) {
>> @@ -137,16 +134,20 @@ main(int argc, char *argv[])
>>  bridge_wait();
>>  unixctl_server_wait(unixctl);
>>  netdev_wait();
>> -if (exiting) {
>> +if (exit_args.exiting) {
>>  poll_immediate_wake();
>>  }
>>  OVS_USDT_PROBE(main, poll_block);
>>  poll_block();
>>  if (should_service_stop()) {
>> -exiting = true;
>> +exit_args.exiting = true;
>>  }
>>  }
>> -bridge_exit(cleanup);
>> +bridge_exit(exit_args.cleanup);
>> +
>> +if (exit_args.conn) {
>> +unixctl_command_reply(exit_args.conn, NULL);
>> +}
>>  unixctl_server_destroy(unixctl);
>>  service_stop();
>>  vlog_disable_async();
>> @@ -304,10 +305,9 @@ usage(void)
>>
>>  static void
>>  ovs_vswitchd_exit(struct unixctl_conn *conn, int argc,
>> -  const char *argv[], void *exit_args_)
>> +  const char *argv[], void *args OVS_UNUSED)
>>  {
>> -struct ovs_vswitchd_exit_args *exit_args = exit_args_;
>> -*exit_args->exiting = true;
>> -*exit_args->cleanup = argc == 2 && !strcmp(argv[1], "--cleanup");
>> -unixctl_command_reply(conn, NULL);
>> +exit_args.conn = conn;
> 
> Should we try to protect from two exit command in the same 
> unixctl_server_run()?
> Something like:
> 
> if (exit_args.conn) {
>  unixctl_command_reply(conn, NULL);
> } else {
> exit_args.conn = conn;
> }
> 

Good point.  It's unlikely, but can happen.
We could either do what you suggested or store an array of connections
and try to reply to all of them.  It becomes a bit inconsistent though
if different calls have different options.

What do you think?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next 2/2] net: openvswitch: add drop action

2023-07-07 Thread Ilya Maximets
On 7/6/23 15:57, Eric Garver wrote:
> On Thu, Jul 06, 2023 at 08:54:16AM -0400, Aaron Conole wrote:
>> Eric Garver  writes:
>>
>>> This adds an explicit drop action. This is used by OVS to drop packets
>>> for which it cannot determine what to do. An explicit action in the
>>> kernel allows passing the reason _why_ the packet is being dropped. We
>>> can then use perf tracing to match on the drop reason.
>>>
>>> e.g. trace all OVS dropped skbs
>>>
>>>  # perf trace -e skb:kfree_skb --filter="reason >= 0x3"
>>>  [..]
>>>  106.023 ping/2465 skb:kfree_skb(skbaddr: 0xa0e8765f2000, \
>>>   location:0xc0d9b462, protocol: 2048, reason: 196610)
>>>
>>> reason: 196610 --> 0x30002 (OVS_XLATE_RECURSION_TOO_DEEP)
>>>
>>> Signed-off-by: Eric Garver 
>>> ---
>>>  include/uapi/linux/openvswitch.h|  2 ++
>>>  net/openvswitch/actions.c   | 13 +
>>>  net/openvswitch/flow_netlink.c  | 12 +++-
>>>  .../testing/selftests/net/openvswitch/ovs-dpctl.py  |  3 +++
>>>  4 files changed, 29 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/uapi/linux/openvswitch.h 
>>> b/include/uapi/linux/openvswitch.h
>>> index e94870e77ee9..a967dbca3574 100644
>>> --- a/include/uapi/linux/openvswitch.h
>>> +++ b/include/uapi/linux/openvswitch.h
>>> @@ -965,6 +965,7 @@ struct check_pkt_len_arg {
>>>   * start of the packet or at the start of the l3 header depending on the 
>>> value
>>>   * of l3 tunnel flag in the tun_flags field of OVS_ACTION_ATTR_ADD_MPLS
>>>   * argument.
>>> + * @OVS_ACTION_ATTR_DROP: Explicit drop action.
>>>   *
>>>   * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  
>>> Not all
>>>   * fields within a header are modifiable, e.g. the IPv4 protocol and 
>>> fragment
>>> @@ -1002,6 +1003,7 @@ enum ovs_action_attr {
>>> OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
>>> OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */
>>> OVS_ACTION_ATTR_DEC_TTL,  /* Nested OVS_DEC_TTL_ATTR_*. */
>>> +   OVS_ACTION_ATTR_DROP, /* u32 xlate_error. */
>>>  
>>> __OVS_ACTION_ATTR_MAX,/* Nothing past this will be accepted
>>>* from userspace. */
>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>> index cab1e02b63e0..4ad9a45dc042 100644
>>> --- a/net/openvswitch/actions.c
>>> +++ b/net/openvswitch/actions.c
>>> @@ -32,6 +32,7 @@
>>>  #include "vport.h"
>>>  #include "flow_netlink.h"
>>>  #include "openvswitch_trace.h"
>>> +#include "drop.h"
>>>  
>>>  struct deferred_action {
>>> struct sk_buff *skb;
>>> @@ -1477,6 +1478,18 @@ static int do_execute_actions(struct datapath *dp, 
>>> struct sk_buff *skb,
>>> return dec_ttl_exception_handler(dp, skb,
>>>  key, a);
>>> break;
>>> +
>>> +   case OVS_ACTION_ATTR_DROP:
>>> +   u32 reason = nla_get_u32(a);
>>> +
>>> +   reason |= SKB_DROP_REASON_SUBSYS_OPENVSWITCH <<
>>> +   SKB_DROP_REASON_SUBSYS_SHIFT;
>>> +
>>> +   if (reason == OVS_XLATE_OK)
>>> +   break;
>>> +
>>> +   kfree_skb_reason(skb, reason);
>>> +   return 0;
>>> }
>>>  
>>> if (unlikely(err)) {
>>> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
>>> index 41116361433d..23d39eae9a0d 100644
>>> --- a/net/openvswitch/flow_netlink.c
>>> +++ b/net/openvswitch/flow_netlink.c
>>> @@ -39,6 +39,7 @@
>>>  #include 
>>>  
>>>  #include "flow_netlink.h"
>>> +#include "drop.h"
>>>  
>>>  struct ovs_len_tbl {
>>> int len;
>>> @@ -61,6 +62,7 @@ static bool actions_may_change_flow(const struct nlattr 
>>> *actions)
>>> case OVS_ACTION_ATTR_RECIRC:
>>> case OVS_ACTION_ATTR_TRUNC:
>>> case OVS_ACTION_ATTR_USERSPACE:
>>> +   case OVS_ACTION_ATTR_DROP:
>>> break;
>>>  
>>> case OVS_ACTION_ATTR_CT:
>>> @@ -2394,7 +2396,7 @@ static void ovs_nla_free_nested_actions(const struct 
>>> nlattr *actions, int len)
>>> /* Whenever new actions are added, the need to update this
>>>  * function should be considered.
>>>  */
>>> -   BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 23);
>>> +   BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 24);
>>>  
>>> if (!actions)
>>> return;
>>> @@ -3182,6 +3184,7 @@ static int __ovs_nla_copy_actions(struct net *net, 
>>> const struct nlattr *attr,
>>> [OVS_ACTION_ATTR_CHECK_PKT_LEN] = (u32)-1,
>>> [OVS_ACTION_ATTR_ADD_MPLS] = sizeof(struct 
>>> ovs_action_add_mpls),
>>> [OVS_ACTION_ATTR_DEC_TTL] = (u32)-1,
>>> +   [OVS_ACTION_ATTR_DROP] = sizeof(u32),
>>> };
>>> const struct ovs_action_push_vlan 

Re: [ovs-dev] [PATCH net-next 4/4] selftests: openvswitch: add ct-nat test case with ipv4

2023-07-07 Thread Adrian Moreno



On 6/28/23 18:27, Aaron Conole wrote:

Building on the previous work, add a very simplistic NAT case
using ipv4.  This just tests dnat transformation

Signed-off-by: Aaron Conole 


Hi Aaron,

I know that the goal is not to support the full syntax, and that nat is a 
specially convoluted action, so I'm just commenting on the low-hanging fruits 
(see below).



---
  .../selftests/net/openvswitch/openvswitch.sh  | 64 +++
  .../selftests/net/openvswitch/ovs-dpctl.py| 60 +
  2 files changed, 124 insertions(+)

diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh 
b/tools/testing/selftests/net/openvswitch/openvswitch.sh
index 40a66c72af0f0..dced4f612a78c 100755
--- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
+++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
@@ -14,6 +14,7 @@ tests="
arp_pingeth-arp: Basic arp ping between 
two NS
ct_connect_v4   ip4-ct-xon: Basic ipv4 tcp 
connection using ct
connect_v4  ip4-xon: Basic ipv4 ping 
between two NS
+   nat_connect_v4  ip4-nat-xon: Basic ipv4 tcp 
connection via NAT
netlink_checks  ovsnl: validate netlink attrs 
and settings
upcall_interfaces   ovs: test the upcall interfaces"
  
@@ -300,6 +301,69 @@ test_connect_v4 () {

return 0
  }
  
+# nat_connect_v4 test

+#  - client has 1500 byte MTU
+#  - server has 1500 byte MTU
+#  - use ICMP to ping in each direction
+#  - only allow CT state stuff to pass through new in c -> s
+test_nat_connect_v4 () {
+   which nc >/dev/null 2>/dev/null || return $ksft_skip
+
+   sbx_add "test_nat_connect_v4" || return $?
+
+   ovs_add_dp "test_nat_connect_v4" nat4 || return 1
+   info "create namespaces"
+   for ns in client server; do
+   ovs_add_netns_and_veths "test_nat_connect_v4" "nat4" "$ns" \
+   "${ns:0:1}0" "${ns:0:1}1" || return 1
+   done
+
+   ip netns exec client ip addr add 172.31.110.10/24 dev c1
+   ip netns exec client ip link set c1 up
+   ip netns exec server ip addr add 172.31.110.20/24 dev s1
+   ip netns exec server ip link set s1 up
+
+   ip netns exec client ip route add default via 172.31.110.20
+
+   ovs_add_flow "test_nat_connect_v4" nat4 \
+   'in_port(1),eth(),eth_type(0x0806),arp()' '2' || return 1
+   ovs_add_flow "test_nat_connect_v4" nat4 \
+   'in_port(2),eth(),eth_type(0x0806),arp()' '1' || return 1
+   ovs_add_flow "test_nat_connect_v4" nat4 \
+   
"ct_state(-trk),in_port(1),eth(),eth_type(0x0800),ipv4(dst=192.168.0.20)" \
+   "ct(commit,nat(dst=172.31.110.20)),recirc(0x1)"
+   ovs_add_flow "test_nat_connect_v4" nat4 \
+   "ct_state(-trk),in_port(2),eth(),eth_type(0x0800),ipv4()" \
+   "ct(commit,nat),recirc(0x2)"
+
+   ovs_add_flow "test_nat_connect_v4" nat4 \
+   
"recirc_id(0x1),ct_state(+trk-inv),in_port(1),eth(),eth_type(0x0800),ipv4()" "2"
+   ovs_add_flow "test_nat_connect_v4" nat4 \
+   
"recirc_id(0x2),ct_state(+trk-inv),in_port(2),eth(),eth_type(0x0800),ipv4()" "1"
+
+   # do a ping
+   ovs_sbx "test_nat_connect_v4" ip netns exec client ping 192.168.0.20 -c 
3 || return 1
+
+   # create an echo server in 'server'
+   echo "server" | \
+   ovs_netns_spawn_daemon "test_nat_connect_v4" "server" \
+   nc -lvnp 4443
+   ovs_sbx "test_nat_connect_v4" ip netns exec client nc -i 1 -zv 
192.168.0.20 4443 || return 1
+
+   # Now test in the other direction (should fail)
+   echo "client" | \
+   ovs_netns_spawn_daemon "test_nat_connect_v4" "client" \
+   nc -lvnp 4443
+   ovs_sbx "test_nat_connect_v4" ip netns exec client nc -i 1 -zv 
172.31.110.10 4443
+   if [ $? == 0 ]; then
+  info "connect to client was successful"
+  return 1
+   fi
+
+   info "done..."
+   return 0
+}
+
  # netlink_validation
  # - Create a dp
  # - check no warning with "old version" simulation
diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py 
b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index 704cb4adf79a9..12ba5265b88fb 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -511,6 +511,66 @@ class ovsactions(nla):
  else:
  ctact["attrs"].append([scan[1], None])
  actstr = actstr[strspn(actstr, ", ") :]
+# it seems strange to put this here, but nat() is a complex
+# sub-action and this lets it sit anywhere in the ct() 
action
+if actstr.startswith("nat"):
+

Re: [ovs-dev] [PATCH net-next 3/4] selftests: openvswitch: add basic ct test case parsing

2023-07-07 Thread Adrian Moreno



On 6/28/23 18:27, Aaron Conole wrote:

Forwarding via ct() action is an important use case for openvswitch, but
generally would require using a full ovs-vswitchd to get working. Add a
ct action parser for basic ct test case.

Signed-off-by: Aaron Conole 
---
NOTE: 3 lines flag the line-length checkpatch warning, but there didnt
   seem to be a really good way of breaking the lines smaller.

  .../selftests/net/openvswitch/openvswitch.sh  | 68 +++
  .../selftests/net/openvswitch/ovs-dpctl.py| 39 +++
  2 files changed, 107 insertions(+)

diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh 
b/tools/testing/selftests/net/openvswitch/openvswitch.sh
index 5d60a9466dab3..40a66c72af0f0 100755
--- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
+++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
@@ -12,6 +12,7 @@ TRACING=0
  
  tests="

arp_pingeth-arp: Basic arp ping between 
two NS
+   ct_connect_v4   ip4-ct-xon: Basic ipv4 tcp 
connection using ct
connect_v4  ip4-xon: Basic ipv4 ping 
between two NS
netlink_checks  ovsnl: validate netlink attrs 
and settings
upcall_interfaces   ovs: test the upcall interfaces"
@@ -193,6 +194,73 @@ test_arp_ping () {
return 0
  }
  
+# ct_connect_v4 test

+#  - client has 1500 byte MTU
+#  - server has 1500 byte MTU
+#  - use ICMP to ping in each direction
+#  - only allow CT state stuff to pass through new in c -> s
+test_ct_connect_v4 () {
+
+   which nc >/dev/null 2>/dev/null || return $ksft_skip
+
+   sbx_add "test_ct_connect_v4" || return $?
+
+   ovs_add_dp "test_ct_connect_v4" ct4 || return 1
+   info "create namespaces"
+   for ns in client server; do
+   ovs_add_netns_and_veths "test_ct_connect_v4" "ct4" "$ns" \
+   "${ns:0:1}0" "${ns:0:1}1" || return 1
+   done
+
+   ip netns exec client ip addr add 172.31.110.10/24 dev c1
+   ip netns exec client ip link set c1 up
+   ip netns exec server ip addr add 172.31.110.20/24 dev s1
+   ip netns exec server ip link set s1 up
+
+   # Add forwarding for ARP and ip packets - completely wildcarded
+   ovs_add_flow "test_ct_connect_v4" ct4 \
+   'in_port(1),eth(),eth_type(0x0806),arp()' '2' || return 1
+   ovs_add_flow "test_ct_connect_v4" ct4 \
+   'in_port(2),eth(),eth_type(0x0806),arp()' '1' || return 1
+   ovs_add_flow "test_ct_connect_v4" ct4 \
+'ct_state(-trk),eth(),eth_type(0x0800),ipv4()' \
+'ct(commit),recirc(0x1)' || return 1
+   ovs_add_flow "test_ct_connect_v4" ct4 \
+
'recirc_id(0x1),ct_state(+trk+new),in_port(1),eth(),eth_type(0x0800),ipv4(src=172.31.110.10)'
 \
+'2' || return 1
+   ovs_add_flow "test_ct_connect_v4" ct4 \
+
'recirc_id(0x1),ct_state(+trk+est),in_port(1),eth(),eth_type(0x0800),ipv4(src=172.31.110.10)'
 \
+'2' || return 1
+   ovs_add_flow "test_ct_connect_v4" ct4 \
+
'recirc_id(0x1),ct_state(+trk+est),in_port(2),eth(),eth_type(0x0800),ipv4(dst=172.31.110.10)'
 \
+'1' || return 1
+   ovs_add_flow "test_ct_connect_v4" ct4 \
+
'recirc_id(0x1),ct_state(+trk+inv),eth(),eth_type(0x0800),ipv4()' 'drop' || \
+return 1
+
+   # do a ping
+   ovs_sbx "test_ct_connect_v4" ip netns exec client ping 172.31.110.20 -c 
3 || return 1
+
+   # create an echo server in 'server'
+   echo "server" | \
+   ovs_netns_spawn_daemon "test_ct_connect_v4" "server" \
+   nc -lvnp 4443
+   ovs_sbx "test_ct_connect_v4" ip netns exec client nc -i 1 -zv 
172.31.110.20 4443 || return 1
+
+   # Now test in the other direction (should fail)
+   echo "client" | \
+   ovs_netns_spawn_daemon "test_ct_connect_v4" "client" \
+   nc -lvnp 4443
+   ovs_sbx "test_ct_connect_v4" ip netns exec client nc -i 1 -zv 
172.31.110.10 4443
+   if [ $? == 0 ]; then
+  info "ct connect to client was successful"
+  return 1
+   fi
+
+   info "done..."
+   return 0
+}
+
  # connect_v4 test
  #  - client has 1500 byte MTU
  #  - server has 1500 byte MTU
diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py 
b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index 799bfb3064b90..704cb4adf79a9 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -62,6 +62,15 @@ def macstr(mac):
  return outstr
  
  
+def strcspn(str1, str2):

+tot = 0
+for char in str1:
+if str2.find(char) != -1:
+return tot
+tot += 1
+return tot
+
+
  def strspn(str1, 

Re: [ovs-dev] [PATCH ovn v2 1/2] qos: fix potential double deletion of ovs idl row

2023-07-07 Thread Ales Musil
On Tue, Jun 20, 2023 at 4:26 PM Xavier Simonart  wrote:

> If an interface with an qos option is deleted at the same
> time as an ofport notification from ovs (causing runtime_data recompute)
> is received, the binding module was trying to delete twice the same qos
> queue, causing ovs to raise an exception.
>
> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2213219
> Fixes: 7d1d111ff213 ("controller: configure qos through ovs qos table and
> do not run tc directly")
> Signed-off-by: Xavier Simonart 
>

Hi,
I have one question below.


> ---
> v2: rebased on origin/main
> ---
>  controller/binding.c| 22 
>  controller/binding.h|  1 +
>  controller/ovn-controller.c | 12 +++
>  tests/ovn-macros.at | 34 ++
>  tests/ovn.at| 70 +
>  tests/system-ovn.at | 18 --
>  6 files changed, 139 insertions(+), 18 deletions(-)
>
> diff --git a/controller/binding.c b/controller/binding.c
> index 486095ca7..8069a2e0d 100644
> --- a/controller/binding.c
> +++ b/controller/binding.c
> @@ -396,9 +396,23 @@ configure_qos(const struct sbrec_port_binding *pb,
>  q->burst = burst;
>  }
>
> +static const struct ovsrec_queue *
> +find_qos_queue_by_external_ids(const struct smap *external_ids,
> +struct ovsdb_idl_index *ovsrec_queue_by_external_ids)
> +{
> +const struct ovsrec_queue *queue =
> +ovsrec_queue_index_init_row(ovsrec_queue_by_external_ids);
> +ovsrec_queue_index_set_external_ids(queue, external_ids);
> +const struct ovsrec_queue *retval =
> +ovsrec_queue_index_find(ovsrec_queue_by_external_ids, queue);
> +ovsrec_queue_index_destroy_row(queue);
> +return retval;
> +}
> +
>  static void
>  ovs_qos_entries_gc(struct ovsdb_idl_txn *ovs_idl_txn,
> struct ovsdb_idl_index *ovsrec_port_by_qos,
> +   struct ovsdb_idl_index *ovsrec_queue_by_external_ids,
> const struct ovsrec_qos_table *qos_table,
> struct hmap *queue_map)
>  {
> @@ -414,6 +428,13 @@ ovs_qos_entries_gc(struct ovsdb_idl_txn *ovs_idl_txn,
>  if (!queue) {
>  continue;
>  }
> +const struct ovsrec_queue *ovsrec_queue =
> +find_qos_queue_by_external_ids(>external_ids,
> +
>  ovsrec_queue_by_external_ids);
>

Since we do not completely control the external ids, isn't there a chance
that we will have
outdated external ids which will result in leaked qos records in the end?
Maybe my understanding of the index search over smap is wrong.


> +if (!ovsrec_queue) {
> +VLOG_DBG("queue already deleted !");
> +continue;
> +}
>
>  const char *port = smap_get(>external_ids, "ovn_port");
>  if (!port) {
> @@ -2165,6 +2186,7 @@ binding_run(struct binding_ctx_in *b_ctx_in, struct
> binding_ctx_out *b_ctx_out)
>  shash_destroy(_mappings);
>  /* Remove stale QoS entries. */
>  ovs_qos_entries_gc(b_ctx_in->ovs_idl_txn,
> b_ctx_in->ovsrec_port_by_qos,
> +   b_ctx_in->ovsrec_queue_by_external_ids,
> b_ctx_in->qos_table, b_ctx_out->qos_map);
>
>  cleanup_claimed_port_timestamps();
> diff --git a/controller/binding.h b/controller/binding.h
> index 0e57f02ee..e3ab1d7ca 100644
> --- a/controller/binding.h
> +++ b/controller/binding.h
> @@ -47,6 +47,7 @@ struct binding_ctx_in {
>  struct ovsdb_idl_index *sbrec_port_binding_by_datapath;
>  struct ovsdb_idl_index *sbrec_port_binding_by_name;
>  struct ovsdb_idl_index *ovsrec_port_by_qos;
> +struct ovsdb_idl_index *ovsrec_queue_by_external_ids;
>  const struct ovsrec_qos_table *qos_table;
>  const struct sbrec_port_binding_table *port_binding_table;
>  const struct ovsrec_bridge *br_int;
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index a47406979..bb84554fc 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -1116,6 +1116,7 @@ enum sb_engine_node {
>  OVS_NODE(port, "port") \
>  OVS_NODE(interface, "interface") \
>  OVS_NODE(qos, "qos") \
> +OVS_NODE(queue, "queue") \
>  OVS_NODE(flow_sample_collector_set, "flow_sample_collector_set")
>
>  enum ovs_engine_node {
> @@ -1576,6 +1577,10 @@ init_binding_ctx(struct engine_node *node,
>  engine_ovsdb_node_get_index(
>  engine_get_input("OVS_port", node), "qos");
>
> +struct ovsdb_idl_index *ovsrec_queue_by_external_ids =
> +engine_ovsdb_node_get_index(
> +engine_get_input("OVS_queue", node), "external_ids");
> +
>  struct controller_engine_ctx *ctrl_ctx =
> engine_get_context()->client_ctx;
>
>  b_ctx_in->ovnsb_idl_txn = engine_get_context()->ovnsb_idl_txn;
> @@ -1584,6 +1589,7 @@ init_binding_ctx(struct engine_node *node,
>  b_ctx_in->sbrec_port_binding_by_datapath =
> 

Re: [ovs-dev] [PATCH ovn v2] tests: fixed userspace-system tests not properly cleaned up

2023-07-07 Thread Ales Musil
On Wed, Jul 5, 2023 at 5:57 PM Xavier Simonart  wrote:

> Usually, when cleaning up a system test we execute (for the ovn-vswitchd
> process):
> - OVS_TRAFFIC_VSWITCHD_STOP (i.e. ovs-appctl -t $1 exit --cleanup, wait up
> to 30 seconds
>   and report failure if not properly cleaned up).
> - kill_ovs_vswitchd (i.e. if ovs-vsitwhd still running,
>   ovs-appctl -t ovs-vswitchd exit --cleanup, wait 1 to 10 seconds and kill
> ovs-vswitvchd
>   if not properly cleaned up).
>
> If we do not execute OVS_TRAFFIC_VSWITCHD_STOP, wait time is much smaller
> which might result in
> - clean up not done properly, ovs-vswitchd killed, ovs-netdev not deleted,
> and no error reported
> - next tests will fail
>
> Using this patch, we properly execute OVS_TRAFFIC_VSWITCHD_STOP, so the
> delay waiting for
> ovs-vswitchd to exit is longer, and, if by any chance this delay is still
> too short,
> the proper test will fail (i.e. the first test failing will be the one
> unable to clean).
>
> Note that, after this patch, some extra tests might fail such as
> "load-balancer template IPv4".
> This is due to the fact that, instead of killing ovn-controller at exit,
> we now properly exit and check the unexpected presence of warnings. In
> other words, we might now higlight some existing issues which were
> ignoring before.
>
> Signed-off-by: Xavier Simonart 
>
> ---
> v2: - updated based on Ales' feedback:
>   - some cleanup wrongly done twice (e.g. for DNAT LR hairpin IPv4)
> causing test to always fail
>   - replaced the 'kill ovn-controller' in some tests by proper cleanup
> - added a few additional missing OVS_WAIT_AND_EXIT (ovsdb-server)
> ---
>  tests/system-ovn.at | 103 ++--
>  1 file changed, 99 insertions(+), 4 deletions(-)
>
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index 44a8072d6..8f6a647a0 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -5819,6 +5819,21 @@
> tcp,orig=(src=42.42.42.3,dst=42.42.42.2,sport=,dport=4242),reply=(s
>
>  
> tcp,orig=(src=42.42.42.3,dst=66.66.66.66,sport=,dport=666),reply=(src=42.42.42.2,dst=42.42.42.3,sport=4242,dport=),zone=,mark=2,protoinfo=(state=)
>  ])
>
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> +
> +as ovn-sb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as ovn-nb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as northd
> +OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
> +
> +as
> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> +/connection dropped.*/d"])
> +
>  AT_CLEANUP
>  ])
>
> @@ -5915,6 +5930,21 @@
> tcp,orig=(src=4242::3,dst=4242::2,sport=,dport=4242),reply=(src=424
>
>  
> tcp,orig=(src=4242::3,dst=::1,sport=,dport=666),reply=(src=4242::2,dst=4242::3,sport=4242,dport=),zone=,mark=2,protoinfo=(state=)
>  ])
>
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> +
> +as ovn-sb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as ovn-nb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as northd
> +OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
> +
> +as
> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> +/connection dropped.*/d"])
> +
>  AT_CLEANUP
>  ])
>
> @@ -6716,7 +6746,7 @@ AT_CHECK([ovn-nbctl set Logical_Switch_Port sw02
> options:qos_burst=100])
>  OVS_WAIT_UNTIL([tc class show dev ovs-public | \
>  grep -q 'class htb .* prio 0 rate 5Gbit ceil 6Gbit burst
> 125000b cburst 124500b'])
>
> -kill $(pidof ovn-controller)
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>
>  as ovn-sb
>  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> @@ -6867,7 +6897,7 @@ Allowing connections from 1000::a
>  wait_column "up" nb:bfd status logical_port=rp-public
>  ovn-nbctl destroy bfd $uuid_v6
>
> -kill $(pidof ovn-controller)
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>
>  as ovn-sb
>  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> @@ -7121,7 +7151,7 @@ NS_CHECK_EXEC([vm2], [ping -q -c 3 -i 0.3 -w 2
> 172.18.2.10 | FORMAT_PING], \
>  3 packets transmitted, 3 received, 0% packet loss, time 0ms
>  ])
>
> -kill $(pidof ovn-controller)
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>
>  as ovn-sb
>  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> @@ -7238,6 +7268,21 @@ NS_CHECK_EXEC([vm1], [ping -q -c 3 -i 0.3 -w 2
> 172.18.2.12 | FORMAT_PING], \
>  3 packets transmitted, 3 received, 0% packet loss, time 0ms
>  ])
>
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
> +
> +as ovn-sb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as ovn-nb
> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> +
> +as northd
> +OVS_APP_EXIT_AND_WAIT([NORTHD_TYPE])
> +
> +as
> +OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-.*/d
> +/connection dropped.*/d"])
> +
>  AT_CLEANUP
>  ])
>
> @@ -7421,7 +7466,7 @@ OVS_WAIT_UNTIL([
>  ])
>  kill $(pidof tcpdump)
>
> -kill $(pidof ovn-controller)
> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>
>  as ovn-sb
>  OVS_APP_EXIT_AND_WAIT([ovsdb-server])
> @@ -8494,6 +8539,16 @@
> tcp,orig=(src=192.168.2.2,dst=172.16.1.2,sport=,dport=),reply=
>  AT_CHECK([ovs-appctl dpctl/flush-conntrack])
>
>  

Re: [ovs-dev] [PATCH ovn 3/3] ovn-northd: Avoid recompute caused by in-flight transactions.

2023-07-07 Thread Han Zhou
On Fri, Jul 7, 2023 at 2:02 PM Numan Siddique  wrote:
>
> On Fri, Jul 7, 2023 at 6:40 AM Han Zhou  wrote:
> >
> > On Thu, Jul 6, 2023 at 9:55 PM Numan Siddique  wrote:
> > >
> > > On Thu, Jul 6, 2023 at 3:32 PM Han Zhou  wrote:
> > > >
> > > > Although each individual VIF port related changes are handled
> > > > incrementally, it still triggers recompute if there are in-flight
> > > > transactions (either to NB or SB) when a change comes, which is very
> > > > common in real production environment if change happens frequently.
> > > > It is also easy to hit such situation in test cases where nb_cfg
> > > > mechanism is heavily used, which makes it difficult to write
reliable
> > > > and stable tests, such as what the commit 8c30ba1386 was trying to
work
> > > > around.
> > > >
> > > > This patch skips the I-P engine execution until the NB & SB
transaction
> > > > handles are available (no in-flight transactions), and when
skippiing
> > > > the runs it keeps the tracked changes in IDL across main loop
> > > > iterations. This way we avoid recompute without worrying about
missing
> > > > any changes.
> > > >
> > > > Signed-off-by: Han Zhou 
> > >
> > > Acked-by: Numan Siddique 
> > >
> > > Numan
> >
> > Thanks Numan! I applied the series to main.
>
> Hi Han,
>
> Looks like the test case - "LSP incremental processing" is a little bit
flaky.
>
> If you see the latest CI run,  this test is failing for ARM -
> https://github.com/ovn-org/ovn/runs/14845008643
> It also failed for the load balancer I-P patch series
> (http://patchwork.ozlabs.org/project/ovn/list/?series=362800)
> in my local repo and it passed in a re-run.
>
>
> --
> as northd ovn-appctl -t ovn-northd inc-engine/clear-stats
> ../../../tests/ovn-macros.at:419: "$@"
> ovn-nbctl --wait=hv lsp-add ls0 lsp0-0 -- lsp-set-addresses lsp0-0 unknown
> ../../../tests/ovn-macros.at:419: "$@"
> Waiting until 0 rows in nb Logical_Switch_Port with up!=true type=""...
> ovn-macros.at:470: waiting until test $count = $(count_rows $db:$table
> $a $b $c $d $e)...
> ovn-macros.at:470: wait succeeded immediately
> Waiting until 0 rows in nb Logical_Switch_Port with up!=true
type=router...
> ovn-macros.at:470: waiting until test $count = $(count_rows $db:$table
> $a $b $c $d $e)...
> ovn-macros.at:470: wait succeeded immediately
> ovn-nbctl --wait=hv sync
> ../../../tests/ovn-macros.at:419: "$@"
> ../../../tests/ovn-northd.at:9515: test x$northd_recomp = x$1
> ../../../tests/ovn-northd.at:9515: exit code was 1, expected 0
> 
> Looks like its failing here -
> https://github.com/ovn-org/ovn/blob/main/tests/ovn-northd.at#L9540
>
> Maybe we shouldn't do exact matches on the counters ?
>
Sorry for this. I think the below patch should fix it:
https://patchwork.ozlabs.org/project/ovn/patch/20230707062938.761299-1-hz...@ovn.org/

Thanks,
Han

> Thanks
> Numan
>
>
> >
> > Han
> > >
> > > > ---
> > > >  northd/inc-proc-northd.c |   6 +--
> > > >  northd/ovn-northd.c  |  22 +
> > > >  tests/ovn-northd.at  | 104
+++
> > > >  3 files changed, 66 insertions(+), 66 deletions(-)
> > > >
> > > > diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
> > > > index 19fc67795643..d328deb222e6 100644
> > > > --- a/northd/inc-proc-northd.c
> > > > +++ b/northd/inc-proc-northd.c
> > > > @@ -296,6 +296,7 @@ void inc_proc_northd_init(struct ovsdb_idl_loop
*nb,
> > > >  bool inc_proc_northd_run(struct ovsdb_idl_txn *ovnnb_txn,
> > > >   struct ovsdb_idl_txn *ovnsb_txn,
> > > >   bool recompute) {
> > > > +ovs_assert(ovnnb_txn && ovnsb_txn);
> > > >  engine_init_run();
> > > >
> > > >  /* Force a full recompute if instructed to, for example, after
a
> > NB/SB
> > > > @@ -312,10 +313,7 @@ bool inc_proc_northd_run(struct ovsdb_idl_txn
> > *ovnnb_txn,
> > > >  };
> > > >
> > > >  engine_set_context(_ctx);
> > > > -
> > > > -if (ovnnb_txn && ovnsb_txn) {
> > > > -engine_run(true);
> > > > -}
> > > > +engine_run(true);
> > > >
> > > >  if (!engine_has_run()) {
> > > >  if (engine_need_run()) {
> > > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > > index 3e0848e41b8e..4fa1b039ea32 100644
> > > > --- a/northd/ovn-northd.c
> > > > +++ b/northd/ovn-northd.c
> > > > @@ -881,6 +881,7 @@ main(int argc, char *argv[])
> > > >  simap_destroy();
> > > >  }
> > > >
> > > > +bool clear_idl_track = true;
> > > >  if (!state.paused) {
> > > >  if (!ovsdb_idl_has_lock(ovnsb_idl_loop.idl) &&
> > > >  !ovsdb_idl_is_lock_contended(ovnsb_idl_loop.idl))
> > > > @@ -930,25 +931,26 @@ main(int argc, char *argv[])
> > > >  }
> > > >
> > > >  if (ovsdb_idl_has_lock(ovnsb_idl_loop.idl)) {
> > > > -int64_t loop_start_time = time_wall_msec();
> > > > -bool activity = inc_proc_northd_run(ovnnb_txn,
> > ovnsb_txn,
> > > > -   

[ovs-dev] [PATCH ovn] ovn-northd.at: Fix occasional LSP I-P test failure due to initializtion phase.

2023-07-07 Thread Han Zhou
After the commit 0c1bde1c4a the recompute counters are more predictable,
so we changed the LSP incremental processing test to not tolerate any
failures (instead of 50% successful rate). But the test would then fail
occasionally at the first check, because sometimes the update of the
initial configurations from ovn-controller such as for tunnel interface
creation come too late, after we cleaned the stats counters and start
the LSP tests.

This patch fixes it by creating a pilot port and wait for it to be up,
so that we know the initial ovn-controller configurations are done, and
will not interfere with our I-P tests.

Fixes: 0c1bde1c4a47 ("ovn-northd: Avoid recompute caused by in-flight 
transactions.")
Signed-off-by: Han Zhou 
---
 tests/ovn-northd.at | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index e79d33b2aec5..3e06f14c9437 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -9532,6 +9532,13 @@ check_recompute_counter() {
 
 check ovn-nbctl --wait=hv ls-add ls0
 
+# Create a pilot port and wait it up to make sure we are ready for the real
+# tests, so that the counters measured are accurate.
+check ovn-nbctl --wait=hv lsp-add ls0 lsp-pilot -- lsp-set-addresses lsp-pilot 
"unknown"
+ovs-vsctl add-port br-int lsp-pilot -- set interface lsp-pilot 
external_ids:iface-id=lsp-pilot
+wait_for_ports_up
+check ovn-nbctl --wait=hv sync
+
 check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
 check ovn-nbctl --wait=hv lsp-add ls0 lsp0-0 -- lsp-set-addresses lsp0-0 
"unknown"
 ovs-vsctl add-port br-int lsp0-0 -- set interface lsp0-0 
external_ids:iface-id=lsp0-0
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 3/3] ovn-northd: Avoid recompute caused by in-flight transactions.

2023-07-07 Thread Numan Siddique
On Fri, Jul 7, 2023 at 6:40 AM Han Zhou  wrote:
>
> On Thu, Jul 6, 2023 at 9:55 PM Numan Siddique  wrote:
> >
> > On Thu, Jul 6, 2023 at 3:32 PM Han Zhou  wrote:
> > >
> > > Although each individual VIF port related changes are handled
> > > incrementally, it still triggers recompute if there are in-flight
> > > transactions (either to NB or SB) when a change comes, which is very
> > > common in real production environment if change happens frequently.
> > > It is also easy to hit such situation in test cases where nb_cfg
> > > mechanism is heavily used, which makes it difficult to write reliable
> > > and stable tests, such as what the commit 8c30ba1386 was trying to work
> > > around.
> > >
> > > This patch skips the I-P engine execution until the NB & SB transaction
> > > handles are available (no in-flight transactions), and when skippiing
> > > the runs it keeps the tracked changes in IDL across main loop
> > > iterations. This way we avoid recompute without worrying about missing
> > > any changes.
> > >
> > > Signed-off-by: Han Zhou 
> >
> > Acked-by: Numan Siddique 
> >
> > Numan
>
> Thanks Numan! I applied the series to main.

Hi Han,

Looks like the test case - "LSP incremental processing" is a little bit flaky.

If you see the latest CI run,  this test is failing for ARM -
https://github.com/ovn-org/ovn/runs/14845008643
It also failed for the load balancer I-P patch series
(http://patchwork.ozlabs.org/project/ovn/list/?series=362800)
in my local repo and it passed in a re-run.


--
as northd ovn-appctl -t ovn-northd inc-engine/clear-stats
../../../tests/ovn-macros.at:419: "$@"
ovn-nbctl --wait=hv lsp-add ls0 lsp0-0 -- lsp-set-addresses lsp0-0 unknown
../../../tests/ovn-macros.at:419: "$@"
Waiting until 0 rows in nb Logical_Switch_Port with up!=true type=""...
ovn-macros.at:470: waiting until test $count = $(count_rows $db:$table
$a $b $c $d $e)...
ovn-macros.at:470: wait succeeded immediately
Waiting until 0 rows in nb Logical_Switch_Port with up!=true type=router...
ovn-macros.at:470: waiting until test $count = $(count_rows $db:$table
$a $b $c $d $e)...
ovn-macros.at:470: wait succeeded immediately
ovn-nbctl --wait=hv sync
../../../tests/ovn-macros.at:419: "$@"
../../../tests/ovn-northd.at:9515: test x$northd_recomp = x$1
../../../tests/ovn-northd.at:9515: exit code was 1, expected 0

Looks like its failing here -
https://github.com/ovn-org/ovn/blob/main/tests/ovn-northd.at#L9540

Maybe we shouldn't do exact matches on the counters ?

Thanks
Numan


>
> Han
> >
> > > ---
> > >  northd/inc-proc-northd.c |   6 +--
> > >  northd/ovn-northd.c  |  22 +
> > >  tests/ovn-northd.at  | 104 +++
> > >  3 files changed, 66 insertions(+), 66 deletions(-)
> > >
> > > diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
> > > index 19fc67795643..d328deb222e6 100644
> > > --- a/northd/inc-proc-northd.c
> > > +++ b/northd/inc-proc-northd.c
> > > @@ -296,6 +296,7 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
> > >  bool inc_proc_northd_run(struct ovsdb_idl_txn *ovnnb_txn,
> > >   struct ovsdb_idl_txn *ovnsb_txn,
> > >   bool recompute) {
> > > +ovs_assert(ovnnb_txn && ovnsb_txn);
> > >  engine_init_run();
> > >
> > >  /* Force a full recompute if instructed to, for example, after a
> NB/SB
> > > @@ -312,10 +313,7 @@ bool inc_proc_northd_run(struct ovsdb_idl_txn
> *ovnnb_txn,
> > >  };
> > >
> > >  engine_set_context(_ctx);
> > > -
> > > -if (ovnnb_txn && ovnsb_txn) {
> > > -engine_run(true);
> > > -}
> > > +engine_run(true);
> > >
> > >  if (!engine_has_run()) {
> > >  if (engine_need_run()) {
> > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > > index 3e0848e41b8e..4fa1b039ea32 100644
> > > --- a/northd/ovn-northd.c
> > > +++ b/northd/ovn-northd.c
> > > @@ -881,6 +881,7 @@ main(int argc, char *argv[])
> > >  simap_destroy();
> > >  }
> > >
> > > +bool clear_idl_track = true;
> > >  if (!state.paused) {
> > >  if (!ovsdb_idl_has_lock(ovnsb_idl_loop.idl) &&
> > >  !ovsdb_idl_is_lock_contended(ovnsb_idl_loop.idl))
> > > @@ -930,25 +931,26 @@ main(int argc, char *argv[])
> > >  }
> > >
> > >  if (ovsdb_idl_has_lock(ovnsb_idl_loop.idl)) {
> > > -int64_t loop_start_time = time_wall_msec();
> > > -bool activity = inc_proc_northd_run(ovnnb_txn,
> ovnsb_txn,
> > > -recompute);
> > > -recompute = false;
> > > -if (ovnsb_txn) {
> > > +bool activity = false;
> > > +if (ovnnb_txn && ovnsb_txn) {
> > > +int64_t loop_start_time = time_wall_msec();
> > > +activity = inc_proc_northd_run(ovnnb_txn,
> ovnsb_txn,
> > > +