date:20220405

Re: [ovs-dev] [PATCH net] net: openvswitch: don't send internal clone attribute to the userspace.

2022-04-05 Thread patchwork-bot+netdevbpf

Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski :

On Mon,  4 Apr 2022 12:41:50 +0200 you wrote:
> 'OVS_CLONE_ATTR_EXEC' is an internal attribute that is used for
> performance optimization inside the kernel.  It's added by the kernel
> while parsing user-provided actions and should not be sent during the
> flow dump as it's not part of the uAPI.
> 
> The issue doesn't cause any significant problems to the ovs-vswitchd
> process, because reported actions are not really used in the
> application lifecycle and only supposed to be shown to a human via
> ovs-dpctl flow dump.  However, the action list is still incorrect
> and causes the following error if the user wants to look at the
> datapath flows:
> 
> [...]

Here is the summary with links:
  - [net] net: openvswitch: don't send internal clone attribute to the 
userspace.
https://git.kernel.org/netdev/net/c/3f2a3050b4a3

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2] handlers: Fix handlers mapping

2022-04-05 Thread Ilya Maximets

On 4/4/22 14:36, Michael Santana wrote:
> The handler and CPU mapping in upcalls are incorrect, and this is
> specially noticeable systems with cpu isolation enabled.
> 
> Say we have a 12 core system where only every even number CPU is enabled
> C0, C2, C4, C6, C8, C10
> 
> This means we will create an array of size 6 that will be sent to
> kernel that is populated with sockets [S0, S1, S2, S3, S4, S5]
> 
> The problem is when the kernel does an upcall it checks the socket array
> via the index of the CPU, effectively adding additional load on some
> CPUs while leaving no work on other CPUs.
> 
> e.g.
> 
> C0  indexes to S0
> C2  indexes to S2 (should be S1)
> C4  indexes to S4 (should be S2)
> 
> Modulo of 6 (size of socket array) is applied, so we wrap back to S0
> C6  indexes to S0 (should be S3)
> C8  indexes to S2 (should be S4)
> C10 indexes to S4 (should be S5)
> 
> Effectively sockets S0, S2, S4 get overloaded while sockets S1, S3, S5
> get no work assigned to them
> 
> This leads to the kernel to throw the following message:
> "openvswitch: cpu_id mismatch with handler threads"
> 
> To fix this we send the kernel a corrected array of sockets the size
> of all CPUs in the system. In the above example we would create a
> corrected array as follows:
> [S0, S1, S1, S2, S2, S3, S3, S4, S4, S5, S5, S0]
> 
> This guarantees that regardless of which CPU a packet comes in the kernel
> will correctly map it to the correct socket

Hey, Michael.  Thanks for working on this issue!

I agree that the array is too short and ovs-vswitchd has to supply
a full array with values for each physical core.
This can be even a separate patch with just filling in the array
for each physical core in a round-robin manner.

However, I'm not sure it is possible to predict what the good
distribution of sockets among cores should look like.

Taking your current implementation as an example.  I'm assuming
that suggested distribution (socket duplication for neighboring
CPU cores) can be good for a system with 2 NUMA nodes and cores
enumerated as odd - NUMA0, even - NUMA1.  But the enumeration
scheme of CPU cores is not set in stone, most multi-NUMA systems
has a BIOS configuration knob to change the enumeration method.
One of the most popular ones is: lower cores - NUMA0, higher cores
- NUMA1.  In this case, since interrupts from a NIC are likely
arriving on the cores from the same NUMA, sockets S1-S3 will
carry all the traffic, while sockets S4-S5 will remain unused.

Another point is that we don't really know how many irq vectors
are registered for NICs and what is the irq affinity for them.
IOW, we don't know on which cores actual interrupts will be
handled and on which sockets they will end up.  IRQ affinity
and number of queues are also dynamic and can be changed over
time.

So, AFAICT, regardless of the positioning of the sockets in the
array, there will be very good and very bad cases for it with
more or less same probability.

It might be that the only option to guarantee more or less even
load distribution is to always create N handlers, where the N
is the number of actual physical cores.  And let the scheduler
and in-kernel IRQ affinity to deal with load balancing.
Yes, we're adding an extra overhead in case of limited CPU
affinity for the ovs-vswitchd, but OTOH if we have high volume
of traffic received on big number of CPU cores, while OVS is
limited to a single core, that sounds like a misconfiguration.
Some performance testing is required, for sure.

Thoughts?

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v8 1/4] dpif-netdev/mfex: Add ipv4 profile based hashing.

2022-04-05 Thread Ilya Maximets

On 4/5/22 15:51, Stokes, Ian wrote:
> 
> 
>> -Original Message-
>> From: Amber, Kumar 
>> Sent: Friday, April 1, 2022 12:24 PM
>> To: ovs-dev@openvswitch.org
>> Cc: Stokes, Ian ; echau...@redhat.com; Ferriter, Cian
>> ; f...@sysclose.org; Van Haaren, Harry
>> ; Amber, Kumar 
>> Subject: [PATCH v8 1/4] dpif-netdev/mfex: Add ipv4 profile based hashing.
>>
>> For packets which don't already have a hash calculated,
>> miniflow_hash_5tuple() calculates the hash of a packet
>> using the previously built miniflow.
>>
>> This commit adds IPv4 profile specific hashing which
>> uses fixed offsets into the packet to improve hashing
>> performance.
>>
>> Signed-off-by: Kumar Amber 
>> Signed-off-by: Harry van Haaren 
>> Co-authored-by: Harry van Haaren 
> 
> Thanks for working on this Amber/Harry and the reviews /testing Cian.
> 
> Looking at this it seems to be in a good sate. Will apply to master unless 
> there are any objection?

Hmm.  On a quick glance over the first patch I see a lot of
unaligned memory accesses that are clearly an undefined behavior
from the compiler point of view.  Some of them are actually
done for odd (!) memory addresses, e.g. &((uint8_t)pkt)[27] is
converted to a uint32_t pointer and dereferenced.  And unlike
some other memory accesses in this file, alignment for these
ones can actually easily be pre-calculated by a compiler
potentially causing all sorts of fun stuff.

It's not a coincidence that most of memory accesses in the
generic flow.c are performed with memcpy() or get_16aligned_be32()
or similar functions.  This helps to avoid issues with incorrect
compiler assumptions and potentially broken logic.  Not only
to support various different architectures.

We just cleared a big pile of undefined behavior issues that
were causing real mis-compilations in the code and some more
fixes are coming, including enabling of UBsan in CI.  So, I
don't think the series should be accepted in the form it is now.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn 1/2] vtep: correctly bring vtep lport up in SBDB

2022-04-05 Thread Mark Michelson


Hi,

The subject claims this is patch 1/2, but I don't see patch 2/2 on 
patchwork or on the mailing list.


Provisionally, I'm acking this patch

Acked-by: Mark Michelson 

but if there's a part 2, I'd prefer that it gets reviewed before this 
gets merged.


Also, I have one minor thing below.

On 4/1/22 04:16, Vladislav Odintsov wrote:

Signed-off-by: Vladislav Odintsov 
---
  controller-vtep/binding.c | 9 -
  1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/controller-vtep/binding.c b/controller-vtep/binding.c
index 01d5a16d2..7c7bea90a 100644
--- a/controller-vtep/binding.c
+++ b/controller-vtep/binding.c
@@ -109,12 +109,11 @@ update_pb_chassis(const struct sbrec_port_binding 
*port_binding_rec,
   port_binding_rec->chassis->name,
   chassis_rec->name);
  }
-
  sbrec_port_binding_set_chassis(port_binding_rec, chassis_rec);
-if (port_binding_rec->n_up) {
-bool up = true;
-sbrec_port_binding_set_up(port_binding_rec, , 1);
-}
+}
+else if (port_binding_rec->n_up) {


This is a coding guidelines violation. The else should be on the same 
line as the closing curly brace:


} else if (port_binding_rec->n_up) {

This is minor enough not to prevent me from acking the patch. But this 
should be fixed before merging.

+bool up = true;
+sbrec_port_binding_set_up(port_binding_rec, , 1);
  }
  }
  



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] Fixed multiple other flaky tests

2022-04-05 Thread Mark Michelson


Thanks a bunch Xavier!

Acked-by: Mark Michelson 

On 3/30/22 07:26, Xavier Simonart wrote:

- options:requested-chassis for logical port
- NAT and Load Balancer flows
- 2LSs IGMP and MLD
- Dynamic neighbor between LRs
- (userspace test) conntrack TCP reset
- Dont flood fill local datapaths beyond distributed gateway port
- 1 LR with distributed router gateway port

Signed-off-by: Xavier Simonart 
---
  tests/ovn-northd.at |  2 +-
  tests/ovn.at| 17 -
  tests/system-ovn.at |  9 +++--
  3 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index d9de4aacb..c252f61a5 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -4566,7 +4566,7 @@ check ovn-nbctl lr-nat-del ro1
  check ovn-nbctl lr-nat-del ro2
  
  check ovn-nbctl --add-route lr-nat-add ro1 dnat_and_snat 10.0.0.100 192.168.1.2 vm1 00:00:00:00:00:01

-check ovn-nbctl --add-route lr-nat-add ro2 dnat_and_snat 20.0.0.100 
192.168.2.2 vm2 00:00:00:00:00:02
+check ovn-nbctl --wait=sb --add-route lr-nat-add ro2 dnat_and_snat 20.0.0.100 
192.168.2.2 vm2 00:00:00:00:00:02
  
  check_lflows 1
  
diff --git a/tests/ovn.at b/tests/ovn.at

index 0c2fe9f97..2d13a1cb2 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -76,6 +76,12 @@ m4_define([OVN_CHECK_PACKETS_REMOVE_BROADCAST],
  m4_define([OVN_CHECK_PACKETS_CONTAIN],
[ovn_wait_packets__ "$1" "$2" "__file__:__line__"])
  
+# OVN_CHECK_PACKETS_UNIQ succeeds if some expected packets are duplicated.

+# It fails if unexpected packets are received.
+m4_define([OVN_CHECK_PACKETS_UNIQ],
+  [ovn_check_packets__ "$1" "$2" "__file__:__line__"
+   AT_CHECK([sort $rcv_text | uniq], [0], [expout])])
+
  AT_BANNER([OVN components])
  
  AT_SETUP([lexer])

@@ -11418,7 +11424,7 @@ echo $outside_ip
  test_arp 3 1 f0010204 $outside_ip $rtr_ip
  
  # Now check the packets actually received against the ones expected.

-OVN_CHECK_PACKETS([hv3/vif1-tx.pcap], [hv3-vif1.expected])
+OVN_CHECK_PACKETS_UNIQ([hv3/vif1-tx.pcap], [hv3-vif1.expected])
  
  # Send ip packet between foo1 and outside1

  src_mac="f0010203"
@@ -11428,7 +11434,7 @@ dst_ip=`ip_to_hex 172 16 1 3`
  
packet=${dst_mac}${src_mac}0800451c4011${src_ip}${dst_ip}00350008
  
  # Now check the packets actually received against the ones expected.

-OVN_CHECK_PACKETS([hv3/vif1-tx.pcap], [hv3-vif1.expected])
+OVN_CHECK_PACKETS_UNIQ([hv3/vif1-tx.pcap], [hv3-vif1.expected])
  
  # Now add bridge-mappings on hv2, which should make everything work

  as hv2 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
@@ -11449,7 +11455,7 @@ 
exp_gw_ip_garp=020102030806000108000604000102010203ac1001010
  echo $exp_gw_ip_garp >> hv3-vif1.expected
  
  # Now check the packets actually received against the ones expected.

-OVN_CHECK_PACKETS([hv3/vif1-tx.pcap], [hv3-vif1.expected])
+OVN_CHECK_PACKETS_UNIQ([hv3/vif1-tx.pcap], [hv3-vif1.expected])
  
  # Send ip packet between foo1 and outside1

  src_mac="f0010203"
@@ -13838,7 +13844,7 @@ as hv1
  ovs-vsctl set interface hv1-vif0 external-ids:iface-id=lsp0
  
  OVS_WAIT_UNTIL([test 1 = $(grep -c "Claiming lport lsp0" hv1/ovn-controller.log)])

-check_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
+wait_column "$hv1_uuid" Port_Binding chassis logical_port=lsp0
  
  # (4) Chassis hv1 should add flows in OFTABLE_PHY_TO_LOG and OFTABLE_LOG_TO_PHY tables.

  as hv1 ovs-ofctl dump-flows br-int
@@ -25636,7 +25642,6 @@ src_ip=`ip_to_hex 10 0 1 2`
  dst_ip=`ip_to_hex 10 0 2 2`
  
packet=${dst_mac}${src_mac}0800451c4011${src_ip}${dst_ip}00350008
  as hv1 ovs-appctl netdev-dummy/receive hv1-vif1 $packet
-as hv1 ovs-appctl ofproto/trace br-int in_port=1 $packet
  
  # Packet to Expect at p2

  src_mac="0201"
@@ -25646,6 +25651,7 @@ dst_ip=`ip_to_hex 10 0 2 2`
  echo 
"${dst_mac}${src_mac}0800451c3e110200${src_ip}${dst_ip}00350008"
 > expected
  
  OVN_CHECK_PACKETS([hv2/vif1-tx.pcap], [expected])

+as hv1 ovs-appctl ofproto/trace br-int in_port=1 $packet
  
  # MAC binding entry should have generated

  AT_CHECK([ovn-sbctl find mac ip=10.0.0.2 mac='"00:00:00:00:03:02"' 
logical_port=lrp-r1-join | grep 10\.0\.0\.2], [0], [ignore], [])
@@ -29553,6 +29559,7 @@ AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=26 | 
grep 10.0.1.2], [0], [ig
  # doesn't remove.)
  check ovn-nbctl --wait=hv lrp-set-gateway-chassis lrp_lr_ls1 hv1 1
  as hv2 check ovn-appctl -t ovn-controller recompute
+ovn-nbctl --wait=hv sync
  AT_CHECK([as hv2 ovs-ofctl dump-flows br-int table=26 | grep 10.0.1.2], [1])
  
  # Enable dnat_and_snat on lr, and now hv2 should see flows for lsp1.

diff --git a/tests/system-ovn.at b/tests/system-ovn.at
index 5d90be1c9..2de21e9f5 100644
--- a/tests/system-ovn.at
+++ b/tests/system-ovn.at
@@ -4174,13 +4174,17 @@ start_daemon ovn-controller
  #   - subnet 20.0.0.0/8
  #   - 2 port (sw2-p1 - sw2-p2)
  #   - IGMP Querier from

Re: [ovs-dev] OVN weekly meeting time change

2022-04-05 Thread Ben Pfaff

On Thu, Mar 31, 2022 at 02:03:41PM -0400, Mark Michelson wrote:
> @Ben
> Can you please change the channel topic message of #openvswitch so that the
> new meeting time is reflected? Thanks!

I've done this now.

Does anyone know how I would give someone else permission to do this
kind of thing?  I don't think I should be the only one who can do this.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] controller/binding: prevent claiming container lport to ovs bridge

2022-04-05 Thread Mark Michelson


Hi Mohammad,

The change itself seems like a good idea. I have a question. This code 
is only written for the incremental case, not the recompute case. Is 
this because this situation can only arise in the incremental case?


On 3/31/22 07:01, Mohammad Heib wrote:

currently ovn-controller allow users to claim lport of type container
to ovs bridge which is invalid use-case.

This patch will prevent such invalid use-cases by ignoring the claiming
requests for container lports and will throw a warning message to the
controller logs.

Signed-off-by: Mohammad Heib 
---
  controller/binding.c | 11 +++
  tests/ovn.at |  5 +
  2 files changed, 16 insertions(+)

diff --git a/controller/binding.c b/controller/binding.c
index 4d62b0858..1051651f4 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -2012,6 +2012,17 @@ binding_handle_ovs_interface_changes(struct 
binding_ctx_in *b_ctx_in,
  int64_t ofport = iface_rec->n_ofport ? *iface_rec->ofport : 0;
  if (iface_id && ofport > 0 &&
  is_iface_in_int_bridge(iface_rec, b_ctx_in->br_int)) {
+const struct sbrec_port_binding *pb = NULL;
+pb = lport_lookup_by_name(b_ctx_in->sbrec_port_binding_by_name,
+  iface_id);
+if (pb && (get_lport_type(pb) == LP_CONTAINER)) {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
+VLOG_WARN_RL(, "Can't claim lport %s of type container to "
+ "OVS bridge,\nplease remove the lport prent_name"
+ " before claiming it.", pb->logical_port);
+continue;
+}
+
  handled = consider_iface_claim(iface_rec, iface_id, b_ctx_in,
 b_ctx_out, qos_map_ptr);
  if (!handled) {
diff --git a/tests/ovn.at b/tests/ovn.at
index 0c2fe9f97..d32240cf4 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -28232,6 +28232,11 @@ check ovn-nbctl --wait=sb set logical_switch_port vm1 
parent_name=vm-cont1
  
  wait_for_ports_up
  
+# Try to claim container port to ovs

+check ovn-nbctl set logical_switch_port vm-cont2 parent_name=vm2
+check as hv1 ovs-vsctl set Interface vm1 external_ids:iface-id=vm-cont2
+AT_CHECK([test 1 = `cat hv1/ovn-controller.log |grep -c "claim lport vm-cont2 of 
type container"`])
+
  # Delete vm1, vm-cont1 and vm-cont2 and recreate again.
  check ovn-nbctl lsp-del vm1
  check ovn-nbctl lsp-del vm-cont1



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] northd: handle container lport type update

2022-04-05 Thread Mark Michelson


Hi Mohammad,

I have a few findings down below, pretty minor.

On 3/30/22 14:09, Mohammad Heib wrote:

currently, when a lport with a nonempty parent_name field
is created or updated in the NBDB the ovn-controller will
handle this port as a container lport and will do all the
required operations to maintain this lport.

This behavior will be changed if the user has explicitly
removed the parent_name field from the NBDB and the ovn-controller
will start handler this port as a VIF port without cleaning its
previous allocations and without removing its previous relations
with other lports and that can lead to undefined behavior when
handling such types of lports.

This patch trying to fix the above issue by removing the port_binding row
for a container port that its parent_port got removed and that makes
the ovn-controller remove all its allocations and relations with other
ports and then create a new port_binding row with the same logical_port
as the deleted row and let northd handle it as a newly created port.

This approach can be applied to other lport types change not only for
container port but for now it's only handling container lport type change.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2037433
Signed-off-by: Mohammad Heib 
---
  northd/northd.c | 91 +
  tests/ovn.at| 12 +++
  2 files changed, 89 insertions(+), 14 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 2fb0a93c2..1ac80dfad 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -1801,6 +1801,36 @@ lsp_is_remote(const struct nbrec_logical_switch_port 
*nbsp)
  return !strcmp(nbsp->type, "remote");
  }
  
+static bool

+lsp_is_type_changed(const struct sbrec_port_binding *sb,
+const struct nbrec_logical_switch_port *nbsp,
+bool *is_old_container_lport)
+{
+if (!sb || !nbsp) {
+return false;
+}
+/* Both lports are not "VIF's" it is safe to use strcmp. */
+if ((sb->type && sb->type[0]) &&
+   (nbsp->type && nbsp->type[0])) {
+return strcmp(sb->type, nbsp->type);
+} else if ((!sb->type || !sb->type[0]) &&
+(!nbsp->type || !nbsp->type[0])) {
+/* Two "VIF's" interface make sure both have parent_port
+ * set or both have parent_port unset, otherwisre they are
+ * different ports type.
+ */
+if ((sb->parent_port && nbsp->parent_name) ||
+(!sb->parent_port && !nbsp->parent_name)) {
+return false;
+} else {
+*is_old_container_lport = true;
+return true;
+}
+}
+
+return true;
+}


The above function can be simplified a bit:
1) sb->type and nbsp->type are guaranteed to be non-NULL, so there is no 
reason to check their NULL-ness.
2) The default true return and the strcmp-based returns are essentially 
the same thing and could be grouped together.


In addition, it's good practice to ensure that the output parameter 
(is_old_container_lport in this case) is always explicitly set by the 
function. The current code relies on the caller to set 
is_old_container_lport to false before calling the function. If it is 
uninitialized or is set true before calling the function, the results 
will not be correct.


Here's a proposed (untested) re-write:

*is_old_container_lport = false;
if (!sb->type[0] && !nbsp->type[0]) {
/* Both are apparently VIF ports.
 * Check if one was previously a container.
 */
if ((sb->parent_port && !nbsp->parent_name) ||
!sb->parent_port && nbsp->parent_name)) {
*is_old_container_lport = true;
return true;
} else {
return false;
}
}

return strcmp(sb->type, nbsp->type);


+
  static bool
  lrport_is_enabled(const struct nbrec_logical_router_port *lrport)
  {
@@ -2481,22 +2511,55 @@ join_logical_ports(struct northd_input *input_data,
  VLOG_WARN_RL(, "duplicate logical port %s", 
nbsp->name);
  continue;
  } else if (op && (!op->sb || op->sb->datapath == od->sb)) {
-ovn_port_set_nb(op, nbsp, NULL);
-ovs_list_remove(>list);
-
-uint32_t queue_id = smap_get_int(>sb->options,
- "qdisc_queue_id", 0);
-if (queue_id && op->sb->chassis) {
-add_chassis_queue(
- chassis_qdisc_queues, 
>sb->chassis->header_.uuid,
- queue_id);
-}
+/*
+ * Handle cases where lport type was explicitly changed
+ * in the NBDB, in such cases:
+ * 1. remove the current sberc of the affected lport from


s/sberc/sbrec/
The same spelling mistake is made below in list item (2).


+ *the port_binding table.
+

Re: [ovs-dev] [PATCH v5 2/6] treewide: Avoid offsetting NULL pointers.

2022-04-05 Thread Aaron Conole

Dumitru Ceara  writes:

> On 4/5/22 16:41, Aaron Conole wrote:
>> Dumitru Ceara  writes:
>> 
>>> This is undefined behavior and was reported by UB Sanitizer:
>>>   lib/meta-flow.c:3445:16: runtime error: member access within null pointer 
>>> of type 'struct vl_mf_field'
>>>   #0 0x6aad0f in mf_get_vl_mff lib/meta-flow.c:3445
>>>   #1 0x6d96d7 in mf_from_oxm_header lib/nx-match.c:260
>>>   #2 0x6d9e2e in nx_pull_header__ lib/nx-match.c:341
>>>   #3 0x6daafa in nx_pull_header lib/nx-match.c:488
>>>   #4 0x6abcb6 in mf_vl_mff_nx_pull_header lib/meta-flow.c:3605
>>>   #5 0x73b9be in decode_NXAST_RAW_REG_MOVE lib/ofp-actions.c:2652
>>>   #6 0x764ccd in ofpact_decode lib/ofp-actions.inc2:4681
>>>   [...]
>>>   lib/sset.c:315:12: runtime error: applying zero offset to null pointer
>>>   #0 0xcc2e6a in sset_at_position /root/ovs/lib/sset.c:315:12
>>>   #1 0x5734b3 in port_dump_next /root/ovs/ofproto/ofproto-dpif.c:4083:20
>>>   [...]
>>>   lib/ovsdb-data.c:2194:56: runtime error: applying zero offset to null 
>>> pointer
>>>   #0 0x5e9530 in ovsdb_datum_added_removed 
>>> /root/ovs/lib/ovsdb-data.c:2194:56
>>>   #1 0x4d6258 in update_row_ref_count 
>>> /root/ovs/ovsdb/transaction.c:335:17
>>>   #2 0x4c360b in for_each_txn_row /root/ovs/ovsdb/transaction.c:1572:33
>>>   [...]
>>>   lib/ofpbuf.c:440:30: runtime error: applying zero offset to null pointer
>>>   #0 0x75066d in ofpbuf_push_uninit lib/ofpbuf.c:440
>>>   #1 0x46ac8a in ovnacts_parse lib/actions.c:4190
>>>   #2 0x46ad91 in ovnacts_parse_string lib/actions.c:4208
>>>   #3 0x4106d1 in test_parse_actions tests/test-ovn.c:1324
>>>   [...]
>>>   lib/ofp-actions.c:3205:22: runtime error: applying non-zero offset 2 to 
>>> null pointer
>>>   #0 0x6e1641 in set_field_split_str /root/ovs/lib/ofp-actions.c:3205:22
>>>   [...]
>>>   lib/tnl-ports.c:74:12: runtime error: applying zero offset to null pointer
>>>   #0 0xceffe7 in tnl_port_cast /root/ovs/lib/tnl-ports.c:74:12
>>>   #1 0xcf14c3 in map_insert /root/ovs/lib/tnl-ports.c:116:13
>>>   [...]
>>>   ofproto/ofproto.c:8905:16: runtime error: applying zero offset to null 
>>> pointer
>>>   #0 0x556795 in eviction_group_hash_rule 
>>> /root/ovs/ofproto/ofproto.c:8905:16
>>>   #1 0x503f8d in eviction_group_add_rule 
>>> /root/ovs/ofproto/ofproto.c:9022:42
>>>   [...]
>>>
>>> Also, it's valid to have an empty ofpact list and we should be able to
>>> try to iterate through it.
>>>
>>> UB Sanitizer report:
>>>   include/openvswitch/ofp-actions.h:222:12: runtime error: applying zero 
>>> offset to null pointer
>>>   #0 0x665d69 in ofpact_end 
>>> /root/ovs/./include/openvswitch/ofp-actions.h:222:12
>>>   #1 0x66b2cf in ofpacts_put_openflow_actions 
>>> /root/ovs/lib/ofp-actions.c:8861:5
>>>   #2 0x6ffdd1 in ofputil_encode_flow_mod /root/ovs/lib/ofp-flow.c:447:9
>>>   [...]
>>>
>>> Signed-off-by: Dumitru Ceara 
>>> ---
>>> v5:
>>> - Rebase.
>>> v4:
>>> - Addressed Ilya's comments.
>>> ---
>> 
>> Glad to see that the undefined behavior got removed, BUT this
>> can introduce some different undefined behavior - places where we
>> have a calls to ofpbuf_at_...() always assume a valid pointer is
>> returned.
>> 
>
> Thanks for the review!
>
>> I think it makes sense to abort if b->data is NULL in these cases.
>> Maybe something like:
>> 
>>   ovs_abort(0, "invalid buffer data pointer");
>> 
>> WDYT?
>> 
>
> Calling ovs_abort() directly from openvswitch/util.h will be a challenge
> because it's an internal function and the openvswitch/util.h header is
> public.  Worst case we just call ovs_assert() like we already do in
> ofpbuf_at_assert().

Maybe we can expose ovs_abort as well?

> But, just to make sure I understood properly, you'd like to assert that
> b->data is not NULL only in ofpbuf_at() and ofpbuf_at_assert(), right?

right - only for those places where we have the assumption that the
return must be !NULL

> Because the other ofpact_...() functions are also called in valid
> scenarios on ofpbufs that have b->data = NULL.
>
>>>  include/openvswitch/ofp-actions.h |4 +++-
>>>  include/openvswitch/ofpbuf.h  |   17 -
>>>  lib/dynamic-string.c  |8 ++--
>>>  lib/meta-flow.c   |4 +++-
>>>  lib/ofp-actions.c |8 
>>>  lib/ofpbuf.c  |4 
>>>  lib/ovsdb-data.c  |   37 
>>> +
>>>  lib/ovsdb-data.h  |4 
>>>  lib/sset.c|4 +++-
>>>  lib/tnl-ports.c   |2 +-
>>>  ofproto/ofproto.c |2 +-
>>>  11 files changed, 62 insertions(+), 32 deletions(-)
>>>
>>> diff --git a/include/openvswitch/ofp-actions.h 
>>> b/include/openvswitch/ofp-actions.h
>>> index 41bcb55d2056..b7231c7bb334 100644
>>> --- a/include/openvswitch/ofp-actions.h
>>> +++

Re: [ovs-dev] [PATCH v2] netdev:clear out vlan flow fields while processing native tunnel

2022-04-05 Thread Ilya Maximets

Hi.  Thanks for the patch!

One general comment for the subject line:
'netdev' doesn't seem to be a correct 'area' for the change.
It should be 'ofproto-dpif-xlate' instead.  And, please,
add the space after the ':'.  'summary' should preferably
start with a capital letter and end with a period.

More details on how to format the patch here:
  Documentation/internals/contributing/submitting-patches.rst

See some more comments inline.

On 4/5/22 01:01, Thilak Raj Surendra Babu wrote:
> when a packet is received over an access port that needs to be sent over a 
> vxlan tunnel,
> the access port VLAN id is used in the lookup leading to a wrong packet being 
> crafted and
> sent over the tunnel.Clear out the flow 's VLAN field as it should not be 
> used while performing
> mac lookup for the outer tunnel and also at this point the VLAN action 
> related to inner flow is
> already committed.

Please, limit the line length for a commit message.
72 characters is the most common limit for it.

> 

Please, add a 'Fixes' tag.

> Signed-off-by: Thilak Raj Surendra Babu 
> ---

While sending new versions of a patch, please, add here
a brief explanation of what changed between versions.
E.g.:

Version 2:
  - Fixed line length warning from checkpatch.
  - Dropped unrelated changes.

>  ofproto/ofproto-dpif-xlate.c |  3 ++
>  tests/system-traffic.at  | 54 

I see you added a system test.  That is OK, good to have one.
However, the issue doesn't require any actual network to be
reproduced.  So, we should have a unit test for it.

I know that Rosemarie worked on the same problem over the
past few weeks, and she already has a working unit test.  So,
I think it would be great if you can cooperate and include
the unit test into the patch.

Rosemarie, could you share what you have?

>  2 files changed, 57 insertions(+)
> 
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index bfd4960dd..34f74aade 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -3541,6 +3541,9 @@ propagate_tunnel_data_to_flow__(struct flow *dst_flow,
>  {
>  dst_flow->dl_dst = dmac;
>  dst_flow->dl_src = smac;

An empty line here would be nice.

> +/* Clear VLAN entries which do not apply for tunnel flows */

Comments should end with a period.

> +memset (dst_flow->vlans, 0,

No need for extra space between the function name and the '('.

> +sizeof(union flow_vlan_hdr) * FLOW_MAX_VLAN_HEADERS);

Please, avoid use of sizeof with the type as an argument.
Use the expression (actual object) as an argument instead.

For more details see the coding-style guide:
  Documentation/internals/contributing/coding-style.rst

>  
>  dst_flow->packet_type = htonl(PT_ETH);
>  dst_flow->nw_dst = src_flow->tunnel.ip_dst;
> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> index 4a7fa49fc..b0003c2e5 100644
> --- a/tests/system-traffic.at
> +++ b/tests/system-traffic.at
> @@ -259,6 +259,60 @@ NS_CHECK_EXEC([at_ns0], [ping -s 3200 -q -c 3 -i 0.3 -w 
> 2 10.1.1.100 | FORMAT_PI
>  OVS_TRAFFIC_VSWITCHD_STOP
>  AT_CLEANUP
>  
> +

This extra space is unnecessary.

> +AT_SETUP([datapath - ping vlan over vxlan tunnel])
> +OVS_CHECK_TUNNEL_TSO()
> +OVS_CHECK_VXLAN()
> +
> +OVS_TRAFFIC_VSWITCHD_START()
> +ADD_BR([br-underlay])
> +
> +AT_CHECK([ovs-vsctl -- add-port br0 patch0 -- set interface patch0 
> type=patch options:peer=patch1 -- add-port br-underlay patch1 -- set 
> interface patch1 type=patch options:peer=patch0])
> +
> +AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
> +AT_CHECK([ovs-ofctl add-flow br-underlay "actions=normal"])
> +
> +ADD_NAMESPACES(at_ns0)
> +
> +dnl Set up underlay link from host into the namespace using veth pair.
> +ADD_VETH(p0, at_ns0, br-underlay, "172.31.1.1/24")
> +AT_CHECK([ip addr add dev br-underlay "172.31.1.100/24"])
> +AT_CHECK([ip link set dev br-underlay up])
> +
> +
> +dnl Set up tunnel endpoints on OVS outside the namespace and with a native
> +dnl linux device inside the namespace.
> +
> +ADD_NATIVE_TUNNEL([vxlan], [at_vxlan1], [at_ns0], [172.31.1.100], 
> [10.1.1.1/24],
> +  [id 0 dstport 4789])
> +
> +ADD_OVS_TUNNEL([vxlan], [br-underlay], [at_vxlan0], [172.31.1.1], 
> [10.1.1.100/24])
> +
> +NS_EXEC([at_ns0], [ip link add link at_vxlan1 name at_vxlan1.100 type vlan 
> id 100])
> +
> +NS_EXEC([at_ns0], [ip addr flush dev at_vxlan1])
> +NS_EXEC([at_ns0], [ip addr add dev at_vxlan1.100 "10.1.1.30/24"])
> +NS_EXEC([at_ns0], [ip link set dev at_vxlan1.100 up])
> +
> +ADD_NAMESPACES(at_ns1)
> +ADD_VETH(p1, at_ns1, br0, "10.1.1.10/24")
> +
> +AT_CHECK([ovs-vsctl set port ovs-p1 tag=100])
> +
> +dnl First, check the underlay
> +NS_CHECK_EXEC([at_ns0], [ping -q -c 3 -i 0.3 -w 2 172.31.1.100 | 
> FORMAT_PING], [0], [dnl
> +3 packets transmitted, 3 received, 0% packet loss, time 0ms
> +])
> +
> +NS_CHECK_EXEC([at_ns1], [ping -q -c 3 -i 0.3 -w 2 10.1.1.30 | FORMAT_PING], 
> [0],

Re: [ovs-dev] [PATCH ovn v2] northd: avoid snat on reply packets

2022-04-05 Thread Mark Michelson


Thanks Xavier. It's amazing this issue has been around so long.

Acked-by: Mark Michelson 

On 4/1/22 12:34, Xavier Simonart wrote:

On gateway routers egress packets might be both:
- unDNATted
- SNATted
Reply packets should not be SNATted (they must of course be UnDNATted
if DNAT was applied).

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2061593

Signed-off-by: Xavier Simonart 

---
v2: replaced "<<<" in tests as failing on github
---
  northd/northd.c |   1 +
  tests/ovn-northd.at |  33 ++--
  tests/system-ovn.at | 119 
  3 files changed, 137 insertions(+), 16 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 2fb0a93c2..511bd3331 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -12919,6 +12919,7 @@ build_lrouter_out_snat_flow(struct hmap *lflows, struct 
ovn_datapath *od,
  ds_put_format(actions, "ip%s.src=%s; next;",
is_v6 ? "6" : "4", nat->external_ip);
  } else {
+ds_put_format(match, " && (!ct.trk || !ct.rpl)");
  ds_put_format(actions, "ct_snat(%s", nat->external_ip);
  
  if (nat->external_port_range[0]) {

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index daa664982..f462fca86 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -1030,7 +1030,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows | sed 
's/table=../table=??/' | sort], [0
  AT_CHECK([grep -e "lr_out_snat" crflows | sed 's/table=../table=??/' | sort], 
[0], [dnl
table=??(lr_out_snat), priority=0, match=(1), action=(next;)
table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 
&& ip4.dst == $allowed_range), action=(ct_snat(172.16.1.1);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 && 
ip4.dst == $allowed_range && (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.1);)
  ])
  
  
@@ -1062,7 +1062,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows2 | sed 's/table=../table=??/' | sort], [

  AT_CHECK([grep -e "lr_out_snat" crflows2 | sed 's/table=../table=??/' | 
sort], [0], [dnl
table=??(lr_out_snat), priority=0, match=(1), action=(next;)
table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11), action=(ct_snat(172.16.1.1);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 
&& (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.1);)
table=??(lr_out_snat), priority=35   , match=(ip && ip4.src == 50.0.0.11 
&& ip4.dst == $disallowed_range), action=(next;)
  ])
  
@@ -1091,7 +1091,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows3 | sed 's/table=../table=??/' | sort], [

  AT_CHECK([grep -e "lr_out_snat" crflows3 | sed 's/table=../table=??/' | 
sort], [0], [dnl
table=??(lr_out_snat), priority=0, match=(1), action=(next;)
table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 
&& ip4.dst == $allowed_range), action=(ct_snat(172.16.1.2);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 && 
ip4.dst == $allowed_range && (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.2);)
  ])
  
  # Stateful FIP with DISALLOWED_IPs

@@ -1120,7 +1120,7 @@ AT_CHECK([grep -e "lr_out_snat" drflows4 | sed 
's/table=../table=??/' | sort], [
  AT_CHECK([grep -e "lr_out_snat" crflows4 | sed 's/table=../table=??/' | 
sort], [0], [dnl
table=??(lr_out_snat), priority=0, match=(1), action=(next;)
table=??(lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 
50.0.0.11), action=(ct_snat(172.16.1.2);)
+  table=??(lr_out_snat), priority=33   , match=(ip && ip4.src == 50.0.0.11 
&& (!ct.trk || !ct.rpl)), action=(ct_snat(172.16.1.2);)
table=??(lr_out_snat), priority=35   , match=(ip && ip4.src == 50.0.0.11 
&& ip4.dst == $disallowed_range), action=(next;)
  ])
  
@@ -5140,11 +5140,12 @@ AT_CHECK([grep "lr_out_post_undnat" lr0flows | sed 's/table=./table=?/' | sort],

  AT_CHECK([grep "lr_out_snat" lr0flows | sed 's/table=./table=?/' | sort], 
[0], [dnl
table=? (lr_out_snat), priority=0, match=(1), action=(next;)
table=? (lr_out_snat), priority=120  , match=(nd_ns), action=(next;)
-  table=? (lr_out_snat), priority=25   , match=(ip && ip4.src == 
10.0.0.0/24), action=(ct_snat(172.168.0.10);)
-  table=? (lr_out_snat), priority=33   , match=(ip && ip4.src == 
10.0.0.10), action=(ct_snat(172.168.0.30);)
-  table=? (lr_out_snat), priority=33   , match=(ip && ip4.src == 
10.0.0.3), action=(ct_snat(172.168.0.20);)
+  table=? (lr_out_snat),

Re: [ovs-dev] [PATCH ovn] expr.c: Fix the comment of expr_to_matches.

2022-04-05 Thread Mark Michelson


Acked-by: Mark Michelson 

On 3/30/22 01:50, Han Zhou wrote:

The comment is misleading. The conjunctive match IDs always starts from
1 instead of 0.

Signed-off-by: Han Zhou 
---
  lib/expr.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/expr.c b/lib/expr.c
index 47ef6108e..caae0d4b2 100644
--- a/lib/expr.c
+++ b/lib/expr.c
@@ -3257,7 +3257,7 @@ add_cmp_flow(const struct expr *cmp,
   * a collection of Open vSwitch flows in 'matches', which this function
   * initializes to an hmap of "struct expr_match" structures.  Returns the
   * number of conjunctive match IDs consumed by 'matches', which uses
- * conjunctive match IDs beginning with 0; the caller must offset or remap them
+ * conjunctive match IDs beginning with 1; the caller must offset or remap them
   * into the desired range as necessary.
   *
   * The matches inserted into 'matches' will be of three distinct kinds:



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] expr.c: Remove the force_crossproduct related code.

2022-04-05 Thread Mark Michelson


This is my favorite type of change :)

Acked-by: Mark Michelson 

On 3/30/22 01:49, Han Zhou wrote:

This is not needed any more and force_crossproduct is always false.

Signed-off-by: Han Zhou 
---
  lib/expr.c | 15 +--
  1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/lib/expr.c b/lib/expr.c
index 47ef6108e..77c67cde4 100644
--- a/lib/expr.c
+++ b/lib/expr.c
@@ -33,19 +33,6 @@
  
  VLOG_DEFINE_THIS_MODULE(expr);
  
-/* Right now conjunction flows generated by ovn-controller

- * has issues. If there are multiple flows with the same
- * match for different conjunctions, ovn-controller doesn't
- * handle it properly.
- * Eg.
- * match 1 - ip4.src == {IP1, IP2} && tcp.dst >=500 && tcp.src <=600
- * action - drop
- *
- * match 2 - ip4.src == {IP1, IP2} && tcp.dst >=700 && tcp.src <=800
- * action - allow.
- */
-static bool force_crossproduct = false;
-
  static struct expr *parse_and_annotate(const char *s,
 const struct shash *symtab,
 struct ovs_list *nesting,
@@ -2916,7 +2903,7 @@ expr_normalize_and(struct expr *expr)
  
  ovs_assert(sub->type == EXPR_T_OR);

  const struct expr_symbol *symbol = expr_get_unique_symbol(sub);
-if (!symbol || force_crossproduct || symbol->must_crossproduct ) {
+if (!symbol || symbol->must_crossproduct) {
  struct expr *or = expr_create_andor(EXPR_T_OR);
  struct expr *k;
  



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v8 4/4] tests/mfex: Improve pcap script for mfex tests.

2022-04-05 Thread Amber, Kumar

HI Michael Ian,

> -Original Message-
> From: Phelan, Michael 
> Sent: Tuesday, April 5, 2022 7:47 PM
> To: Stokes, Ian ; Amber, Kumar
> ; ovs-dev@openvswitch.org
> Cc: echau...@redhat.com; Ferriter, Cian ;
> f...@sysclose.org; Van Haaren, Harry 
> Subject: RE: [PATCH v8 4/4] tests/mfex: Improve pcap script for mfex tests.
> 
> 
> > -Original Message-
> > From: Stokes, Ian 
> > Sent: Tuesday 5 April 2022 14:52
> > To: Amber, Kumar ; ovs-dev@openvswitch.org;
> > Phelan, Michael 
> > Cc: echau...@redhat.com; Ferriter, Cian ;
> > f...@sysclose.org; Van Haaren, Harry 
> > Subject: RE: [PATCH v8 4/4] tests/mfex: Improve pcap script for mfex tests.
> >
> > > The mfex pcap generation script is improved for varied length
> > > traffic and also removes the hard coded mfex_pcap and instead uses
> > > the script itself to generate complex traffic patterns for testing.
> > >
> > > Signed-off-by: Kumar Amber 
> > >
> >
> > So in general I think this approach is a bit more improved.
> >
> > @Phelan, Michael do you know if we currently run this testing in our
> > CI? Also @Amber, Kumar et al, should the CI be expanded now to do both
> > fuzzy and non-fuzzy testing as both seem to be an option now?
> 
> Currently our CI runs only the MFEX Autovalidator and the MFEX
> Configuration unit tests, the fuzzy test is disabled as it was giving 
> inconsistent
> results. I believe Amber had a separate patch which fixed this issue but I'm
> not certain whether it has been merged into OVS master yet, if it has then
> the MFEX fuzzy test should be okay to be re-enabled on the CI and run for
> any new patches coming in.
> 

The patch to ignore the warning for context swithing is here :
http://patchwork.ozlabs.org/project/openvswitch/patch/20220224151517.317034-1-kumar.am...@intel.com/

Regards
Amber

> >
> > Regards
> > Ian
> >
> > > ---
> > > v8:
> > > - Reduce IO writes.
> > > --
> > > ---
> > >  tests/automake.mk |   1 -
> > >  tests/mfex_fuzzy.py   |  66 +++---
> > >  tests/pcap/mfex_test.pcap | Bin 416 -> 0 bytes
> > >  tests/system-dpdk.at  |  23 +
> > >  4 files changed, 63 insertions(+), 27 deletions(-)  delete mode
> > > 100644 tests/pcap/mfex_test.pcap
> > >
> > > diff --git a/tests/automake.mk b/tests/automake.mk index
> > > 8a9151f81..507da2ee8 100644
> > > --- a/tests/automake.mk
> > > +++ b/tests/automake.mk
> > > @@ -145,7 +145,6 @@ $(srcdir)/tests/fuzz-regression-list.at:
> > > tests/automake.mk
> > >
> > >  EXTRA_DIST += $(MFEX_AUTOVALIDATOR_TESTS)
> > MFEX_AUTOVALIDATOR_TESTS =
> > > \ -tests/pcap/mfex_test.pcap \  tests/mfex_fuzzy.py
> > >
> > >  OVSDB_CLUSTER_TESTSUITE_AT = \
> > > diff --git a/tests/mfex_fuzzy.py b/tests/mfex_fuzzy.py index
> > > 3efe1152d..dbde5fe1b 100755
> > > --- a/tests/mfex_fuzzy.py
> > > +++ b/tests/mfex_fuzzy.py
> > > @@ -3,30 +3,58 @@
> > >  import sys
> > >
> > >  from scapy.all import RandMAC, RandIP, PcapWriter, RandIP6,
> > > RandShort, fuzz -from scapy.all import IPv6, Dot1Q, IP, Ether, UDP,
> > > TCP
> > > +from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP, random
> > >
> > > +# Relative path for the pcap file location.
> > >  path = str(sys.argv[1]) + "/pcap/fuzzy.pcap"
> > > +# The number of packets generated will be size * 8.
> > > +size = int(sys.argv[2])
> > > +# Traffic option is used to choose between fuzzy or simple packet type.
> > > +traffic_opt = str(sys.argv[3])
> > > +
> > >  pktdump = PcapWriter(path, append=False, sync=True)
> > >
> > > -for i in range(0, 2000):
> > > +pkt = []
> > > +
> > > +for i in range(0, size):
> > >
> > > -# Generate random protocol bases, use a fuzz() over the combined
> > packet
> > > -# for full fuzzing.
> > >  eth = Ether(src=RandMAC(), dst=RandMAC())
> > >  vlan = Dot1Q()
> > > -ipv4 = IP(src=RandIP(), dst=RandIP())
> > > -ipv6 = IPv6(src=RandIP6(), dst=RandIP6())
> > >  udp = UDP(dport=RandShort(), sport=RandShort())
> > > -tcp = TCP(dport=RandShort(), sport=RandShort())
> > > -
> > > -# IPv4 packets with fuzzing
> > > -pktdump.write(fuzz(eth / ipv4 / udp))
> > > -pktdump.write(fuzz(eth / ipv4 / tcp))
> > > -pktdump.write(fuzz(eth / vlan / ipv4 / udp))
> > > -pktdump.write(fuzz(eth / vlan / ipv4 / tcp))
> > > -
> > > -# IPv6 packets with fuzzing
> > > -pktdump.write(fuzz(eth / ipv6 / udp))
> > > -pktdump.write(fuzz(eth / ipv6 / tcp))
> > > -pktdump.write(fuzz(eth / vlan / ipv6 / udp))
> > > -pktdump.write(fuzz(eth / vlan / ipv6 / tcp))
> > > +
> > > +if traffic_opt == "fuzzy":
> > > +
> > > +ipv4 = IP(src=RandIP(), dst=RandIP(), len=random.randint(0, 100))
> > > +ipv6 = IPv6(src=RandIP6(), dst=RandIP6(),
> > > + plen=random.randint(0,
> > 100))
> > > +tcp = TCP(dport=RandShort(), sport=RandShort(), flags='S',
> > > +  dataofs=random.randint(0, 20))
> > > +# IPv4 packets with fuzzing
> > > +pkt.append(fuzz(eth / ipv4 / udp))
> >

Re: [ovs-dev] [PATCH ovn] Fixed multiple flaky tests

2022-04-05 Thread Dumitru Ceara

On 4/5/22 17:22, Mark Michelson wrote:
> On 3/30/22 06:40, Xavier Simonart wrote:
>> Hi Dumitru
>>
>> On Mon, Mar 21, 2022 at 12:02 PM Dumitru Ceara  wrote:
>>
>>> On 3/21/22 10:56, Xavier Simonart wrote:
 - send gratuitous arp on localnet
 - send gratuitous arp for nat ips in localnet
 - dns lookup : 1 HV, 2 LS, 2 LSPs/LS -- ovn-northd -- dp-groups=yes
 - send gratuitous arp for NAT rules on distributed router
 - 2 HVs, 1 lport/HV, localport ports

 Signed-off-by: Xavier Simonart 
 ---
>>>
>>> Hi Xavier,
>>>
   tests/ovn.at | 20 +---
   1 file changed, 13 insertions(+), 7 deletions(-)

 diff --git a/tests/ovn.at b/tests/ovn.at
 index 166b5f72e..cf027703f 100644
 --- a/tests/ovn.at
 +++ b/tests/ovn.at
>>>
>>> [...]
>>>
 @@ -10391,7 +10395,7 @@ test_dns6() {
   }

   AT_CAPTURE_FILE([ofctl_monitor0.log])
 -as hv1 ovs-ofctl monitor br-int resume --detach --no-chdir \
 +as hv1 ovs-ofctl -t 300 monitor br-int resume --detach --no-chdir \
   --pidfile=ovs-ofctl0.pid 2> ofctl_monitor0.log

>>>
>>> Is this really needed?  We set OVS_CTL_TIMEOUT=30 in atlocal.in.
>>>
>>> Thanks for looking into this.
>>
>> Yes :-)
>> The issue here is that we start an ovs-ofctl monitor, then run a few
>> times
>> (send packet / check monitor.log).
>> The test is quite long, and on slow systems (e.g. VM running s390
>> architecture on a x86_64 host), the 30 seconds are not enough for all
>> tests, and hence monitoring stops. Then we check (for 30 seconds) whether
>> monitor.log contains what we expect ...
>> 300 seconds might be a little much... just depends on how slow the system
>> is... Requires at least one minute on my system...
> 
> Thanks for the clarification on this.
> 
> Acked-by: Mark Michelson
> 
> I think this should be committed as-is since it's fixing the flakiness
> of the tests.
> 

Sounds good to me too, the CI has been quite unhappy lately.

> With regards to the DNS test in particular, my assumption is that the
> long timeout is needed because the DNS queries are falling back to
> system default behavior when OVN is not able to answer the queries
> successfully. This can take a very long time since the system will wait
> a certain amount of time for a response and may retry the query a
> multiple times as well. If there were some way to temporarily override
> default system behavior it may end up speeding up the test and
> eliminating the need to set the timeout so large.
> 
> It's also possible I'm completely wrong in the above paragraph :)
> 
>>
>> Thanks
>> Xavier
>>
>> Thanks,
>>> Dumitru
>>>
>>>
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>
> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] controller: properly remove qos policy meters

2022-04-05 Thread Mark Michelson


Thanks Lorenzo,

Acked-by: Mark Michelson 

On 4/1/22 17:11, Lorenzo Bianconi wrote:

Remove meters created through set_meter() action if the related entry in
the NB QoS table has been deleted.

Fixes: 885655e16 ("controller: reconfigure ovs meters for ovn meters")
Signed-off-by: Lorenzo Bianconi 
---
  controller/ofctrl.c | 7 +++
  tests/ovn.at| 4 
  2 files changed, 11 insertions(+)

diff --git a/controller/ofctrl.c b/controller/ofctrl.c
index a7c2d2011..2d8a8ec9c 100644
--- a/controller/ofctrl.c
+++ b/controller/ofctrl.c
@@ -2665,6 +2665,13 @@ ofctrl_put(struct ovn_desired_flow_table *lflow_table,
  EXTEND_TABLE_FOR_EACH_INSTALLED (m_installed, next_meter, meters) {
  /* Delete the meter. */
  ofctrl_meter_bands_erase(m_installed, );
+if (!strncmp(m_installed->name, "__string: ", 10)) {
+struct ofputil_meter_mod mm = {
+.command = OFPMC13_DELETE,
+.meter = { .meter_id = m_installed->table_id },
+};
+add_meter_mod(, );
+}
  ovn_extend_table_remove_existing(meters, m_installed);
  }
  
diff --git a/tests/ovn.at b/tests/ovn.at

index 0c2fe9f97..078a6276b 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -9512,6 +9512,10 @@ check ovn-nbctl --wait=hv qos-add lsw0 to-lport 1002 
'inport=="lp2" && is_chassi
  AT_CHECK([as hv ovs-ofctl dump-meters br-int -O OpenFlow13 | grep meter | wc 
-l], [0], [4
  ])
  
+check ovn-nbctl qos-del lsw0

+AT_CHECK([as hv ovs-ofctl dump-meters br-int -O OpenFlow13 | grep meter | wc 
-l], [0], [0
+])
+
  OVN_CLEANUP([hv])
  AT_CLEANUP
  ])



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v5 2/6] treewide: Avoid offsetting NULL pointers.

2022-04-05 Thread Dumitru Ceara

On 4/5/22 16:41, Aaron Conole wrote:
> Dumitru Ceara  writes:
> 
>> This is undefined behavior and was reported by UB Sanitizer:
>>   lib/meta-flow.c:3445:16: runtime error: member access within null pointer 
>> of type 'struct vl_mf_field'
>>   #0 0x6aad0f in mf_get_vl_mff lib/meta-flow.c:3445
>>   #1 0x6d96d7 in mf_from_oxm_header lib/nx-match.c:260
>>   #2 0x6d9e2e in nx_pull_header__ lib/nx-match.c:341
>>   #3 0x6daafa in nx_pull_header lib/nx-match.c:488
>>   #4 0x6abcb6 in mf_vl_mff_nx_pull_header lib/meta-flow.c:3605
>>   #5 0x73b9be in decode_NXAST_RAW_REG_MOVE lib/ofp-actions.c:2652
>>   #6 0x764ccd in ofpact_decode lib/ofp-actions.inc2:4681
>>   [...]
>>   lib/sset.c:315:12: runtime error: applying zero offset to null pointer
>>   #0 0xcc2e6a in sset_at_position /root/ovs/lib/sset.c:315:12
>>   #1 0x5734b3 in port_dump_next /root/ovs/ofproto/ofproto-dpif.c:4083:20
>>   [...]
>>   lib/ovsdb-data.c:2194:56: runtime error: applying zero offset to null 
>> pointer
>>   #0 0x5e9530 in ovsdb_datum_added_removed 
>> /root/ovs/lib/ovsdb-data.c:2194:56
>>   #1 0x4d6258 in update_row_ref_count 
>> /root/ovs/ovsdb/transaction.c:335:17
>>   #2 0x4c360b in for_each_txn_row /root/ovs/ovsdb/transaction.c:1572:33
>>   [...]
>>   lib/ofpbuf.c:440:30: runtime error: applying zero offset to null pointer
>>   #0 0x75066d in ofpbuf_push_uninit lib/ofpbuf.c:440
>>   #1 0x46ac8a in ovnacts_parse lib/actions.c:4190
>>   #2 0x46ad91 in ovnacts_parse_string lib/actions.c:4208
>>   #3 0x4106d1 in test_parse_actions tests/test-ovn.c:1324
>>   [...]
>>   lib/ofp-actions.c:3205:22: runtime error: applying non-zero offset 2 to 
>> null pointer
>>   #0 0x6e1641 in set_field_split_str /root/ovs/lib/ofp-actions.c:3205:22
>>   [...]
>>   lib/tnl-ports.c:74:12: runtime error: applying zero offset to null pointer
>>   #0 0xceffe7 in tnl_port_cast /root/ovs/lib/tnl-ports.c:74:12
>>   #1 0xcf14c3 in map_insert /root/ovs/lib/tnl-ports.c:116:13
>>   [...]
>>   ofproto/ofproto.c:8905:16: runtime error: applying zero offset to null 
>> pointer
>>   #0 0x556795 in eviction_group_hash_rule 
>> /root/ovs/ofproto/ofproto.c:8905:16
>>   #1 0x503f8d in eviction_group_add_rule 
>> /root/ovs/ofproto/ofproto.c:9022:42
>>   [...]
>>
>> Also, it's valid to have an empty ofpact list and we should be able to
>> try to iterate through it.
>>
>> UB Sanitizer report:
>>   include/openvswitch/ofp-actions.h:222:12: runtime error: applying zero 
>> offset to null pointer
>>   #0 0x665d69 in ofpact_end 
>> /root/ovs/./include/openvswitch/ofp-actions.h:222:12
>>   #1 0x66b2cf in ofpacts_put_openflow_actions 
>> /root/ovs/lib/ofp-actions.c:8861:5
>>   #2 0x6ffdd1 in ofputil_encode_flow_mod /root/ovs/lib/ofp-flow.c:447:9
>>   [...]
>>
>> Signed-off-by: Dumitru Ceara 
>> ---
>> v5:
>> - Rebase.
>> v4:
>> - Addressed Ilya's comments.
>> ---
> 
> Glad to see that the undefined behavior got removed, BUT this
> can introduce some different undefined behavior - places where we
> have a calls to ofpbuf_at_...() always assume a valid pointer is
> returned.
> 

Thanks for the review!

> I think it makes sense to abort if b->data is NULL in these cases.
> Maybe something like:
> 
>   ovs_abort(0, "invalid buffer data pointer");
> 
> WDYT?
> 

Calling ovs_abort() directly from openvswitch/util.h will be a challenge
because it's an internal function and the openvswitch/util.h header is
public.  Worst case we just call ovs_assert() like we already do in
ofpbuf_at_assert().

But, just to make sure I understood properly, you'd like to assert that
b->data is not NULL only in ofpbuf_at() and ofpbuf_at_assert(), right?

Because the other ofpact_...() functions are also called in valid
scenarios on ofpbufs that have b->data = NULL.

>>  include/openvswitch/ofp-actions.h |4 +++-
>>  include/openvswitch/ofpbuf.h  |   17 -
>>  lib/dynamic-string.c  |8 ++--
>>  lib/meta-flow.c   |4 +++-
>>  lib/ofp-actions.c |8 
>>  lib/ofpbuf.c  |4 
>>  lib/ovsdb-data.c  |   37 
>> +
>>  lib/ovsdb-data.h  |4 
>>  lib/sset.c|4 +++-
>>  lib/tnl-ports.c   |2 +-
>>  ofproto/ofproto.c |2 +-
>>  11 files changed, 62 insertions(+), 32 deletions(-)
>>
>> diff --git a/include/openvswitch/ofp-actions.h 
>> b/include/openvswitch/ofp-actions.h
>> index 41bcb55d2056..b7231c7bb334 100644
>> --- a/include/openvswitch/ofp-actions.h
>> +++ b/include/openvswitch/ofp-actions.h
>> @@ -218,7 +218,9 @@ struct ofpact *ofpact_next_flattened(const struct ofpact 
>> *);
>>  static inline struct ofpact *
>>  ofpact_end(const struct ofpact *ofpacts, size_t ofpacts_len)
>>  {
>> -return ALIGNED_CAST(struct ofpact *, (uint8_t *) ofpacts +

Re: [ovs-dev] [PATCH ovn] Fixed multiple flaky tests

2022-04-05 Thread Mark Michelson


On 3/30/22 06:40, Xavier Simonart wrote:

Hi Dumitru

On Mon, Mar 21, 2022 at 12:02 PM Dumitru Ceara  wrote:


On 3/21/22 10:56, Xavier Simonart wrote:

- send gratuitous arp on localnet
- send gratuitous arp for nat ips in localnet
- dns lookup : 1 HV, 2 LS, 2 LSPs/LS -- ovn-northd -- dp-groups=yes
- send gratuitous arp for NAT rules on distributed router
- 2 HVs, 1 lport/HV, localport ports

Signed-off-by: Xavier Simonart 
---


Hi Xavier,


  tests/ovn.at | 20 +---
  1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index 166b5f72e..cf027703f 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at


[...]


@@ -10391,7 +10395,7 @@ test_dns6() {
  }

  AT_CAPTURE_FILE([ofctl_monitor0.log])
-as hv1 ovs-ofctl monitor br-int resume --detach --no-chdir \
+as hv1 ovs-ofctl -t 300 monitor br-int resume --detach --no-chdir \
  --pidfile=ovs-ofctl0.pid 2> ofctl_monitor0.log



Is this really needed?  We set OVS_CTL_TIMEOUT=30 in atlocal.in.

Thanks for looking into this.


Yes :-)
The issue here is that we start an ovs-ofctl monitor, then run a few times
(send packet / check monitor.log).
The test is quite long, and on slow systems (e.g. VM running s390
architecture on a x86_64 host), the 30 seconds are not enough for all
tests, and hence monitoring stops. Then we check (for 30 seconds) whether
monitor.log contains what we expect ...
300 seconds might be a little much... just depends on how slow the system
is... Requires at least one minute on my system...


Thanks for the clarification on this.

Acked-by: Mark Michelson

I think this should be committed as-is since it's fixing the flakiness 
of the tests.


With regards to the DNS test in particular, my assumption is that the 
long timeout is needed because the DNS queries are falling back to 
system default behavior when OVN is not able to answer the queries 
successfully. This can take a very long time since the system will wait 
a certain amount of time for a response and may retry the query a 
multiple times as well. If there were some way to temporarily override 
default system behavior it may end up speeding up the test and 
eliminating the need to set the timeout so large.


It's also possible I'm completely wrong in the above paragraph :)



Thanks
Xavier

Thanks,

Dumitru



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] flow: Allow matches on nw_proto also for IPv6 later frags.

2022-04-05 Thread Paolo Valerio

Aaron Conole  writes:

> Eelco Chaudron  writes:
>
>> On 4 Apr 2022, at 23:54, Paolo Valerio wrote:
>>
>>> Aaron Conole  writes:
>>>
 Paolo Valerio  writes:

> The next header contained in the last extension header of the IPv6
> later frags still contain the information of the upper layer protocol
> number.
>
> Similarly to what OvS does for IPv4, allow L4 matches for later IPv6
> frags as well by processing later frags and storing the nw_proto
> information.
>
> Signed-off-by: Paolo Valerio 
> ---

 Digging in, I could never find a reason why ipv6 was treated differently
 in this manner, but I guess there could be cases where the header layout
 could cause problems (see the top of the while() block).  I don't know
 that it's a practical concern (other values are probably less common),
 but it was the only thing I could come up with for why the match might
 behave this way.  I haven't done enough mailing list archeology to have
 a good idea.
>>>
>>> Guess this may probably be due to (rfc8200 section-4.5):
>>>
>>> "The subsequent fragment packets are composed of:
>>>
>>>[...]
>>>
>>>(2)  A Fragment header containing:
>>>
>>>The Next Header value that identifies the first header
>>>after the Per-Fragment headers of the original packet.
>>>
>>>[...]"
>>>
>>> where the "first header" in the original packet could be another
>>> extension header or an upper-layer header, while the Fragment header is
>>> the last EH for later fragments (according to section 4.5).
>
> Thanks for digging in and finding this.
>
>>> If I'm not missing something, it makes sense to drop the patch.
>>> It's probably a good thing if we document the different behavior between
>>> IPv4 and IPv6 for later frags, though.
>>
>> Alternatively, we could check if the next header is a non-extension header 
>> and if so, mark it as such.
>>
>> The only problem might be the case where packets do have extension
>> headers after the fragmentation header, and they might be dropped
>> (handled differently).
>
> I think it might be best to keep the system as-is.  If we add this
> "feature," users will still need to do matches like:
>
>   table=X,ip6,ct_state=-trk actions=ct(...)
>
> They can try to optimize the flow table looks by doing individual
> selections, but it will be confusing to flow writers (and take an extra
> bit of time to debug) when they get missed packets.  Even if we try to
> make things convenient, there can be edge cases we miss - and today
> systems like OVN already accommodate the nuance of v4 vs v6.
>

I agree. Although the proposal is reasonable and covers the majority of
the cases, it's probably better to stick with the current behavior.

> I'm in favor of a documentation patch instead that spells it out for
> flow writers to clearly understand.  That way we don't have a custom
> state machine to try and work around this limitation.
>

If we all agree, I'd proceed with that.

>> Anyhow, whatever we decide, we should document it in both the documentation 
>> and code.
>>
>> //Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/5] netdev-offload-tc: Move flower_to_match action handling to isolated function.

2022-04-05 Thread Mike Pattrick

On Mon, Mar 28, 2022 at 7:10 AM Eelco Chaudron  wrote:
>
> Move handling of the individual actions in the parse_tc_flower_to_match()
> function to a separate function that will make recursive action handling
> easier.
>
> Signed-off-by: Eelco Chaudron 

Acked-by: Mike Pattrick 

> ---
>  lib/netdev-offload-tc.c |  422 
> +--
>  1 file changed, 221 insertions(+), 201 deletions(-)
>
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index ae6f1134d..a7838e503 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -617,6 +617,226 @@ parse_tc_flower_terse_to_match(struct tc_flower *flower,
>  return 0;
>  }
>
> +static int
> +_parse_tc_flower_to_actions(struct tc_flower *flower, struct ofpbuf *buf,
> +int start_index, int max_index)
> +{
> +struct tc_action *action;
> +int i;
> +
> +if (max_index <= 0 || max_index > flower->action_count) {
> +max_index = flower->action_count;
> +}
> +
> +for (i = start_index; i < max_index; i++) {
> +action = >actions[i];
> +
> +switch (action->type) {
> +case TC_ACT_VLAN_POP: {
> +nl_msg_put_flag(buf, OVS_ACTION_ATTR_POP_VLAN);
> +}
> +break;
> +case TC_ACT_VLAN_PUSH: {
> +struct ovs_action_push_vlan *push;
> +
> +push = nl_msg_put_unspec_zero(buf, OVS_ACTION_ATTR_PUSH_VLAN,
> +  sizeof *push);
> +push->vlan_tpid = action->vlan.vlan_push_tpid;
> +push->vlan_tci = htons(action->vlan.vlan_push_id
> +   | (action->vlan.vlan_push_prio << 13)
> +   | VLAN_CFI);
> +}
> +break;
> +case TC_ACT_MPLS_POP: {
> +nl_msg_put_be16(buf, OVS_ACTION_ATTR_POP_MPLS,
> +action->mpls.proto);
> +}
> +break;
> +case TC_ACT_MPLS_PUSH: {
> +struct ovs_action_push_mpls *push;
> +ovs_be32 mpls_lse = 0;
> +
> +flow_set_mpls_lse_label(_lse, action->mpls.label);
> +flow_set_mpls_lse_tc(_lse, action->mpls.tc);
> +flow_set_mpls_lse_ttl(_lse, action->mpls.ttl);
> +flow_set_mpls_lse_bos(_lse, action->mpls.bos);
> +
> +push = nl_msg_put_unspec_zero(buf, OVS_ACTION_ATTR_PUSH_MPLS,
> +  sizeof *push);
> +push->mpls_ethertype = action->mpls.proto;
> +push->mpls_lse = mpls_lse;
> +}
> +break;
> +case TC_ACT_MPLS_SET: {
> +size_t set_offset = nl_msg_start_nested(buf,
> +OVS_ACTION_ATTR_SET);
> +struct ovs_key_mpls *set_mpls;
> +ovs_be32 mpls_lse = 0;
> +
> +flow_set_mpls_lse_label(_lse, action->mpls.label);
> +flow_set_mpls_lse_tc(_lse, action->mpls.tc);
> +flow_set_mpls_lse_ttl(_lse, action->mpls.ttl);
> +flow_set_mpls_lse_bos(_lse, action->mpls.bos);
> +
> +set_mpls = nl_msg_put_unspec_zero(buf, OVS_KEY_ATTR_MPLS,
> +  sizeof *set_mpls);
> +set_mpls->mpls_lse = mpls_lse;
> +nl_msg_end_nested(buf, set_offset);
> +}
> +break;
> +case TC_ACT_PEDIT: {
> +parse_flower_rewrite_to_netlink_action(buf, action);
> +}
> +break;
> +case TC_ACT_ENCAP: {
> +size_t set_offset = nl_msg_start_nested(buf, 
> OVS_ACTION_ATTR_SET);
> +size_t tunnel_offset =
> +nl_msg_start_nested(buf, OVS_KEY_ATTR_TUNNEL);
> +
> +if (action->encap.id_present) {
> +nl_msg_put_be64(buf, OVS_TUNNEL_KEY_ATTR_ID, 
> action->encap.id);
> +}
> +if (action->encap.ipv4.ipv4_src) {
> +nl_msg_put_be32(buf, OVS_TUNNEL_KEY_ATTR_IPV4_SRC,
> +action->encap.ipv4.ipv4_src);
> +}
> +if (action->encap.ipv4.ipv4_dst) {
> +nl_msg_put_be32(buf, OVS_TUNNEL_KEY_ATTR_IPV4_DST,
> +action->encap.ipv4.ipv4_dst);
> +}
> +if (ipv6_addr_is_set(>encap.ipv6.ipv6_src)) {
> +nl_msg_put_in6_addr(buf, OVS_TUNNEL_KEY_ATTR_IPV6_SRC,
> +>encap.ipv6.ipv6_src);
> +}
> +if (ipv6_addr_is_set(>encap.ipv6.ipv6_dst)) {
> +nl_msg_put_in6_addr(buf, OVS_TUNNEL_KEY_ATTR_IPV6_DST,
> +>encap.ipv6.ipv6_dst);
> +}
> +if (action->encap.tos) {
> +nl_msg_put_u8(buf, OVS_TUNNEL_KEY_ATTR_TOS,
> +  action->encap.tos);
> +}
> +if (action->encap.ttl) {
> +nl_msg_put_u8(buf,

Re: [ovs-dev] [PATCH 1/5] netdev-offload-tc: Move flow_put action handling to isolated function.

2022-04-05 Thread Mike Pattrick

On Mon, Mar 28, 2022 at 7:10 AM Eelco Chaudron  wrote:
>
> Move handling of the individual actions in the netdev_tc_flow_put()
> function to a separate function that will make recursive action handling
> easier.
>
> Signed-off-by: Eelco Chaudron 

Acked-by: Mike Pattrick 

> ---
>  lib/netdev-offload-tc.c |  255 
> +--
>  1 file changed, 138 insertions(+), 117 deletions(-)
>
> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
> index 12d0a9af3..ae6f1134d 100644
> --- a/lib/netdev-offload-tc.c
> +++ b/lib/netdev-offload-tc.c
> @@ -1558,6 +1558,139 @@ parse_match_ct_state_to_flower(struct tc_flower 
> *flower, struct match *match)
>  }
>  }
>
> +static int
> +netdev_tc_parse_nl_actions(struct netdev *netdev, struct tc_flower *flower,
> +   struct offload_info *info,
> +   const struct nlattr *actions, size_t actions_len,
> +   bool *recirc_act)
> +{
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
> +const struct nlattr *nla;
> +size_t left;
> +
> +NL_ATTR_FOR_EACH (nla, left, actions, actions_len) {
> +struct tc_action *action;
> +int err;
> +
> +if (flower->action_count >= TCA_ACT_MAX_NUM) {
> +VLOG_DBG_RL(, "Can only support %d actions", TCA_ACT_MAX_NUM);
> +return EOPNOTSUPP;
> +}
> +
> +action = >actions[flower->action_count];
> +
> +if (nl_attr_type(nla) == OVS_ACTION_ATTR_OUTPUT) {
> +odp_port_t port = nl_attr_get_odp_port(nla);
> +struct netdev *outdev = netdev_ports_get(
> +port, netdev_get_dpif_type(netdev));
> +
> +if (!outdev) {
> +VLOG_DBG_RL(, "Can't find netdev for output port %d", 
> port);
> +return ENODEV;
> +}
> +
> +if (!netdev_flow_api_equals(netdev, outdev)) {
> +VLOG_DBG_RL(,
> +"Flow API provider mismatch between ingress (%s) 
> "
> +"and egress (%s) ports",
> +netdev_get_name(netdev), 
> netdev_get_name(outdev));
> +netdev_close(outdev);
> +return EOPNOTSUPP;
> +}
> +
> +action->out.ifindex_out = netdev_get_ifindex(outdev);
> +if (action->out.ifindex_out < 0) {
> +VLOG_DBG_RL(,
> +"Can't find ifindex for output port %s, error 
> %d",
> +netdev_get_name(outdev), 
> action->out.ifindex_out);
> +netdev_close(outdev);
> +return -action->out.ifindex_out;
> +}
> +
> +action->out.ingress = is_internal_port(netdev_get_type(outdev));
> +action->type = TC_ACT_OUTPUT;
> +flower->action_count++;
> +netdev_close(outdev);
> +} else if (nl_attr_type(nla) == OVS_ACTION_ATTR_PUSH_VLAN) {
> +const struct ovs_action_push_vlan *vlan_push = nl_attr_get(nla);
> +
> +action->vlan.vlan_push_tpid = vlan_push->vlan_tpid;
> +action->vlan.vlan_push_id = vlan_tci_to_vid(vlan_push->vlan_tci);
> +action->vlan.vlan_push_prio = 
> vlan_tci_to_pcp(vlan_push->vlan_tci);
> +action->type = TC_ACT_VLAN_PUSH;
> +flower->action_count++;
> +} else if (nl_attr_type(nla) == OVS_ACTION_ATTR_POP_VLAN) {
> +action->type = TC_ACT_VLAN_POP;
> +flower->action_count++;
> +} else if (nl_attr_type(nla) == OVS_ACTION_ATTR_PUSH_MPLS) {
> +const struct ovs_action_push_mpls *mpls_push = nl_attr_get(nla);
> +
> +action->mpls.proto = mpls_push->mpls_ethertype;
> +action->mpls.label = mpls_lse_to_label(mpls_push->mpls_lse);
> +action->mpls.tc = mpls_lse_to_tc(mpls_push->mpls_lse);
> +action->mpls.ttl = mpls_lse_to_ttl(mpls_push->mpls_lse);
> +action->mpls.bos = mpls_lse_to_bos(mpls_push->mpls_lse);
> +action->type = TC_ACT_MPLS_PUSH;
> +flower->action_count++;
> +} else if (nl_attr_type(nla) == OVS_ACTION_ATTR_POP_MPLS) {
> +action->mpls.proto = nl_attr_get_be16(nla);
> +action->type = TC_ACT_MPLS_POP;
> +flower->action_count++;
> +} else if (nl_attr_type(nla) == OVS_ACTION_ATTR_SET) {
> +const struct nlattr *set = nl_attr_get(nla);
> +const size_t set_len = nl_attr_get_size(nla);
> +
> +err = parse_put_flow_set_action(flower, action, set, set_len);
> +if (err) {
> +return err;
> +}
> +if (action->type == TC_ACT_ENCAP) {
> +action->encap.tp_dst = info->tp_dst_port;
> +action->encap.no_csum = !info->tunnel_csum_on;
> +}
> +} else if (nl_attr_type(nla)

Re: [ovs-dev] [PATCH v5 4/6] ofp-errors: Ensure parsed OFPT_ERROR messages are properly aligned.

2022-04-05 Thread Aaron Conole

Dumitru Ceara  writes:

> Trim the ofpbuf to ensure proper alignment.
>
> UB Sanitizer report:
>   lib/ofp-print.c:1218:24: runtime error: member access within misaligned 
> address 0x019229d2 for type 'const struct ofp_header', which requires 4 
> byte alignment
>   0x019229d2: note: pointer points here
>00 00  5a 5a 05 22 00 3e 00 00  00 09 00 00 00 00 00 00  00 03 05 0d 00 2e 
> 00 00  00 09 ff ff ff ff
> ^
>   #0 0x7d45cc in ofp_to_string lib/ofp-print.c:1218
>   #1 0x774fa8 in ofperr_msg_format lib/ofp-errors.c:253
>   #2 0x7d2617 in ofp_print_error_msg lib/ofp-print.c:435
>   #3 0x7d3eb7 in ofp_to_string__ lib/ofp-print.c:998
>   #4 0x7d47fb in ofp_to_string lib/ofp-print.c:1244
>   #5 0x8dcc4b in do_send lib/vconn.c:688
>   #6 0x8dca64 in vconn_send lib/vconn.c:671
>
> Signed-off-by: Dumitru Ceara 
> ---

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v5 3/6] ofp-actions: Ensure aligned accesses to masked fields.

2022-04-05 Thread Aaron Conole

Dumitru Ceara  writes:

> For example is parsing the OVN "eth.dst[40] = 1;" action, which
> triggered the following warning from UndefinedBehaviorSanitizer:
>
>   lib/meta-flow.c:3210:9: runtime error: member access within misaligned 
> address 0x00de4e36 for type 'const union mf_value', which requires 8 byte 
> alignment
>   0x00de4e36: note: pointer points here
>00 00 00 00 01 00  00 00 00 00 00 00 00 00  70 4e de 00 00 00 00 00  10 51 
> de 00 00 00 00 00  c0 4f
>^
>   #0 0x5818bc in mf_format lib/meta-flow.c:3210
>   #1 0x5b6047 in format_SET_FIELD lib/ofp-actions.c:3342
>   #2 0x5d68ab in ofpact_format lib/ofp-actions.c:9213
>   #3 0x5d6ee0 in ofpacts_format lib/ofp-actions.c:9237
>   #4 0x410922 in test_parse_actions tests/test-ovn.c:1360
>
> To avoid this we now change the internal representation of the set_field
> actions, struct ofpact_set_field, such that the mask is always stored
> at a correctly aligned address, multiple of OFPACT_ALIGNTO.
>
> We also need to adapt the "ovs-ofctl show-flows - Oversized flow" test
> because now the ofpact representation of the set_field action uses more
> bytes in memory (for the extra alignment).  Change the test to use
> dec_ttl instead.
>
> Signed-off-by: Dumitru Ceara 
> ---

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] flow: Allow matches on nw_proto also for IPv6 later frags.

2022-04-05 Thread Aaron Conole

Eelco Chaudron  writes:

> On 4 Apr 2022, at 23:54, Paolo Valerio wrote:
>
>> Aaron Conole  writes:
>>
>>> Paolo Valerio  writes:
>>>
 The next header contained in the last extension header of the IPv6
 later frags still contain the information of the upper layer protocol
 number.

 Similarly to what OvS does for IPv4, allow L4 matches for later IPv6
 frags as well by processing later frags and storing the nw_proto
 information.

 Signed-off-by: Paolo Valerio 
 ---
>>>
>>> Digging in, I could never find a reason why ipv6 was treated differently
>>> in this manner, but I guess there could be cases where the header layout
>>> could cause problems (see the top of the while() block).  I don't know
>>> that it's a practical concern (other values are probably less common),
>>> but it was the only thing I could come up with for why the match might
>>> behave this way.  I haven't done enough mailing list archeology to have
>>> a good idea.
>>
>> Guess this may probably be due to (rfc8200 section-4.5):
>>
>> "The subsequent fragment packets are composed of:
>>
>>[...]
>>
>>(2)  A Fragment header containing:
>>
>>The Next Header value that identifies the first header
>>after the Per-Fragment headers of the original packet.
>>
>>[...]"
>>
>> where the "first header" in the original packet could be another
>> extension header or an upper-layer header, while the Fragment header is
>> the last EH for later fragments (according to section 4.5).

Thanks for digging in and finding this.

>> If I'm not missing something, it makes sense to drop the patch.
>> It's probably a good thing if we document the different behavior between
>> IPv4 and IPv6 for later frags, though.
>
> Alternatively, we could check if the next header is a non-extension header 
> and if so, mark it as such.
>
> The only problem might be the case where packets do have extension
> headers after the fragmentation header, and they might be dropped
> (handled differently).

I think it might be best to keep the system as-is.  If we add this
"feature," users will still need to do matches like:

  table=X,ip6,ct_state=-trk actions=ct(...)

They can try to optimize the flow table looks by doing individual
selections, but it will be confusing to flow writers (and take an extra
bit of time to debug) when they get missed packets.  Even if we try to
make things convenient, there can be edge cases we miss - and today
systems like OVN already accommodate the nuance of v4 vs v6.

I'm in favor of a documentation patch instead that spells it out for
flow writers to clearly understand.  That way we don't have a custom
state machine to try and work around this limitation.

> Anyhow, whatever we decide, we should document it in both the documentation 
> and code.
>
> //Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v5 2/6] treewide: Avoid offsetting NULL pointers.

2022-04-05 Thread Aaron Conole

Dumitru Ceara  writes:

> This is undefined behavior and was reported by UB Sanitizer:
>   lib/meta-flow.c:3445:16: runtime error: member access within null pointer 
> of type 'struct vl_mf_field'
>   #0 0x6aad0f in mf_get_vl_mff lib/meta-flow.c:3445
>   #1 0x6d96d7 in mf_from_oxm_header lib/nx-match.c:260
>   #2 0x6d9e2e in nx_pull_header__ lib/nx-match.c:341
>   #3 0x6daafa in nx_pull_header lib/nx-match.c:488
>   #4 0x6abcb6 in mf_vl_mff_nx_pull_header lib/meta-flow.c:3605
>   #5 0x73b9be in decode_NXAST_RAW_REG_MOVE lib/ofp-actions.c:2652
>   #6 0x764ccd in ofpact_decode lib/ofp-actions.inc2:4681
>   [...]
>   lib/sset.c:315:12: runtime error: applying zero offset to null pointer
>   #0 0xcc2e6a in sset_at_position /root/ovs/lib/sset.c:315:12
>   #1 0x5734b3 in port_dump_next /root/ovs/ofproto/ofproto-dpif.c:4083:20
>   [...]
>   lib/ovsdb-data.c:2194:56: runtime error: applying zero offset to null 
> pointer
>   #0 0x5e9530 in ovsdb_datum_added_removed 
> /root/ovs/lib/ovsdb-data.c:2194:56
>   #1 0x4d6258 in update_row_ref_count /root/ovs/ovsdb/transaction.c:335:17
>   #2 0x4c360b in for_each_txn_row /root/ovs/ovsdb/transaction.c:1572:33
>   [...]
>   lib/ofpbuf.c:440:30: runtime error: applying zero offset to null pointer
>   #0 0x75066d in ofpbuf_push_uninit lib/ofpbuf.c:440
>   #1 0x46ac8a in ovnacts_parse lib/actions.c:4190
>   #2 0x46ad91 in ovnacts_parse_string lib/actions.c:4208
>   #3 0x4106d1 in test_parse_actions tests/test-ovn.c:1324
>   [...]
>   lib/ofp-actions.c:3205:22: runtime error: applying non-zero offset 2 to 
> null pointer
>   #0 0x6e1641 in set_field_split_str /root/ovs/lib/ofp-actions.c:3205:22
>   [...]
>   lib/tnl-ports.c:74:12: runtime error: applying zero offset to null pointer
>   #0 0xceffe7 in tnl_port_cast /root/ovs/lib/tnl-ports.c:74:12
>   #1 0xcf14c3 in map_insert /root/ovs/lib/tnl-ports.c:116:13
>   [...]
>   ofproto/ofproto.c:8905:16: runtime error: applying zero offset to null 
> pointer
>   #0 0x556795 in eviction_group_hash_rule 
> /root/ovs/ofproto/ofproto.c:8905:16
>   #1 0x503f8d in eviction_group_add_rule 
> /root/ovs/ofproto/ofproto.c:9022:42
>   [...]
>
> Also, it's valid to have an empty ofpact list and we should be able to
> try to iterate through it.
>
> UB Sanitizer report:
>   include/openvswitch/ofp-actions.h:222:12: runtime error: applying zero 
> offset to null pointer
>   #0 0x665d69 in ofpact_end 
> /root/ovs/./include/openvswitch/ofp-actions.h:222:12
>   #1 0x66b2cf in ofpacts_put_openflow_actions 
> /root/ovs/lib/ofp-actions.c:8861:5
>   #2 0x6ffdd1 in ofputil_encode_flow_mod /root/ovs/lib/ofp-flow.c:447:9
>   [...]
>
> Signed-off-by: Dumitru Ceara 
> ---
> v5:
> - Rebase.
> v4:
> - Addressed Ilya's comments.
> ---

Glad to see that the undefined behavior got removed, BUT this
can introduce some different undefined behavior - places where we
have a calls to ofpbuf_at_...() always assume a valid pointer is
returned.

I think it makes sense to abort if b->data is NULL in these cases.
Maybe something like:

  ovs_abort(0, "invalid buffer data pointer");

WDYT?

>  include/openvswitch/ofp-actions.h |4 +++-
>  include/openvswitch/ofpbuf.h  |   17 -
>  lib/dynamic-string.c  |8 ++--
>  lib/meta-flow.c   |4 +++-
>  lib/ofp-actions.c |8 
>  lib/ofpbuf.c  |4 
>  lib/ovsdb-data.c  |   37 
> +
>  lib/ovsdb-data.h  |4 
>  lib/sset.c|4 +++-
>  lib/tnl-ports.c   |2 +-
>  ofproto/ofproto.c |2 +-
>  11 files changed, 62 insertions(+), 32 deletions(-)
>
> diff --git a/include/openvswitch/ofp-actions.h 
> b/include/openvswitch/ofp-actions.h
> index 41bcb55d2056..b7231c7bb334 100644
> --- a/include/openvswitch/ofp-actions.h
> +++ b/include/openvswitch/ofp-actions.h
> @@ -218,7 +218,9 @@ struct ofpact *ofpact_next_flattened(const struct ofpact 
> *);
>  static inline struct ofpact *
>  ofpact_end(const struct ofpact *ofpacts, size_t ofpacts_len)
>  {
> -return ALIGNED_CAST(struct ofpact *, (uint8_t *) ofpacts + ofpacts_len);
> +return ofpacts
> +   ? ALIGNED_CAST(struct ofpact *, (uint8_t *) ofpacts + ofpacts_len)
> +   : NULL;
>  }
>  
>  static inline bool
> diff --git a/include/openvswitch/ofpbuf.h b/include/openvswitch/ofpbuf.h
> index 1136ba04c84e..7b6aba9dc29c 100644
> --- a/include/openvswitch/ofpbuf.h
> +++ b/include/openvswitch/ofpbuf.h
> @@ -179,7 +179,9 @@ static inline void ofpbuf_delete(struct ofpbuf *b)
>  static inline void *ofpbuf_at(const struct ofpbuf *b, size_t offset,
>size_t size)
>  {
> -return offset + size <= b->size ? (char *) b->data + offset : NULL;
> +return (offset + size <= b->size) &&

Re: [ovs-dev] [PATCH ovn v2] controller/pinctrl: avoid accessing invalid memory

2022-04-05 Thread Mark Michelson


Thanks for the update Mohammad. I like this version much better.

Acked-by: Mark Michelson 

On 4/3/22 07:26, Mohammad Heib wrote:

currently pinctrl uses some hash tables that were supplied
by ovn-controller to prepare and send IPv6 RAs, those
hash tables are not updated properly when a port_bindings record
deleted and in this case, some port_bindings records remain in the
hash table where they should be removed.

This patch handles such changes and update those lists
to avoid invalid memory access.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2052945
Signed-off-by: Mohammad Heib 
---
  controller/binding.c | 17 +
  1 file changed, 17 insertions(+)

diff --git a/controller/binding.c b/controller/binding.c
index 4d62b0858..36884b1ea 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -481,6 +481,20 @@ remove_related_lport(const struct sbrec_port_binding *pb,
  }
  }
  
+static void

+delete_active_pb_ras_pd(const struct sbrec_port_binding *pb,
+struct shash *map)
+{
+struct shash_node *iter = shash_find(map, pb->logical_port);
+
+if (iter) {
+struct pb_ld_binding *ras_pd = iter->data;
+shash_delete(map, iter);
+free(ras_pd);
+return;
+}
+}
+
  static void
  update_active_pb_ras_pd(const struct sbrec_port_binding *pb,
  struct hmap *local_datapaths,
@@ -2251,6 +2265,9 @@ binding_handle_port_binding_changes(struct binding_ctx_in 
*b_ctx_in,
  continue;
  }
  
+delete_active_pb_ras_pd(pb, b_ctx_out->local_active_ports_ipv6_pd);

+delete_active_pb_ras_pd(pb, b_ctx_out->local_active_ports_ras);
+
  enum en_lport_type lport_type = get_lport_type(pb);
  
  struct binding_lport *b_lport =




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v8 4/4] tests/mfex: Improve pcap script for mfex tests.

2022-04-05 Thread Phelan, Michael



> -Original Message-
> From: Stokes, Ian 
> Sent: Tuesday 5 April 2022 14:52
> To: Amber, Kumar ; ovs-dev@openvswitch.org;
> Phelan, Michael 
> Cc: echau...@redhat.com; Ferriter, Cian ;
> f...@sysclose.org; Van Haaren, Harry 
> Subject: RE: [PATCH v8 4/4] tests/mfex: Improve pcap script for mfex tests.
> 
> > The mfex pcap generation script is improved for varied length traffic
> > and also removes the hard coded mfex_pcap and instead uses the script
> > itself to generate complex traffic patterns for testing.
> >
> > Signed-off-by: Kumar Amber 
> >
> 
> So in general I think this approach is a bit more improved.
> 
> @Phelan, Michael do you know if we currently run this testing in our CI? Also
> @Amber, Kumar et al, should the CI be expanded now to do both fuzzy and
> non-fuzzy testing as both seem to be an option now?

Currently our CI runs only the MFEX Autovalidator and the MFEX Configuration 
unit tests, the fuzzy test is disabled as it was giving inconsistent results. I 
believe Amber had a separate patch which fixed this issue but I'm not certain 
whether it has been merged into OVS master yet, if it has then the MFEX fuzzy 
test should be okay to be re-enabled on the CI and run for any new patches 
coming in.

> 
> Regards
> Ian
> 
> > ---
> > v8:
> > - Reduce IO writes.
> > --
> > ---
> >  tests/automake.mk |   1 -
> >  tests/mfex_fuzzy.py   |  66 +++---
> >  tests/pcap/mfex_test.pcap | Bin 416 -> 0 bytes
> >  tests/system-dpdk.at  |  23 +
> >  4 files changed, 63 insertions(+), 27 deletions(-)  delete mode
> > 100644 tests/pcap/mfex_test.pcap
> >
> > diff --git a/tests/automake.mk b/tests/automake.mk index
> > 8a9151f81..507da2ee8 100644
> > --- a/tests/automake.mk
> > +++ b/tests/automake.mk
> > @@ -145,7 +145,6 @@ $(srcdir)/tests/fuzz-regression-list.at:
> > tests/automake.mk
> >
> >  EXTRA_DIST += $(MFEX_AUTOVALIDATOR_TESTS)
> MFEX_AUTOVALIDATOR_TESTS =
> > \ -tests/pcap/mfex_test.pcap \  tests/mfex_fuzzy.py
> >
> >  OVSDB_CLUSTER_TESTSUITE_AT = \
> > diff --git a/tests/mfex_fuzzy.py b/tests/mfex_fuzzy.py index
> > 3efe1152d..dbde5fe1b 100755
> > --- a/tests/mfex_fuzzy.py
> > +++ b/tests/mfex_fuzzy.py
> > @@ -3,30 +3,58 @@
> >  import sys
> >
> >  from scapy.all import RandMAC, RandIP, PcapWriter, RandIP6,
> > RandShort, fuzz -from scapy.all import IPv6, Dot1Q, IP, Ether, UDP,
> > TCP
> > +from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP, random
> >
> > +# Relative path for the pcap file location.
> >  path = str(sys.argv[1]) + "/pcap/fuzzy.pcap"
> > +# The number of packets generated will be size * 8.
> > +size = int(sys.argv[2])
> > +# Traffic option is used to choose between fuzzy or simple packet type.
> > +traffic_opt = str(sys.argv[3])
> > +
> >  pktdump = PcapWriter(path, append=False, sync=True)
> >
> > -for i in range(0, 2000):
> > +pkt = []
> > +
> > +for i in range(0, size):
> >
> > -# Generate random protocol bases, use a fuzz() over the combined
> packet
> > -# for full fuzzing.
> >  eth = Ether(src=RandMAC(), dst=RandMAC())
> >  vlan = Dot1Q()
> > -ipv4 = IP(src=RandIP(), dst=RandIP())
> > -ipv6 = IPv6(src=RandIP6(), dst=RandIP6())
> >  udp = UDP(dport=RandShort(), sport=RandShort())
> > -tcp = TCP(dport=RandShort(), sport=RandShort())
> > -
> > -# IPv4 packets with fuzzing
> > -pktdump.write(fuzz(eth / ipv4 / udp))
> > -pktdump.write(fuzz(eth / ipv4 / tcp))
> > -pktdump.write(fuzz(eth / vlan / ipv4 / udp))
> > -pktdump.write(fuzz(eth / vlan / ipv4 / tcp))
> > -
> > -# IPv6 packets with fuzzing
> > -pktdump.write(fuzz(eth / ipv6 / udp))
> > -pktdump.write(fuzz(eth / ipv6 / tcp))
> > -pktdump.write(fuzz(eth / vlan / ipv6 / udp))
> > -pktdump.write(fuzz(eth / vlan / ipv6 / tcp))
> > +
> > +if traffic_opt == "fuzzy":
> > +
> > +ipv4 = IP(src=RandIP(), dst=RandIP(), len=random.randint(0, 100))
> > +ipv6 = IPv6(src=RandIP6(), dst=RandIP6(), plen=random.randint(0,
> 100))
> > +tcp = TCP(dport=RandShort(), sport=RandShort(), flags='S',
> > +  dataofs=random.randint(0, 20))
> > +# IPv4 packets with fuzzing
> > +pkt.append(fuzz(eth / ipv4 / udp))
> > +pkt.append(fuzz(eth / ipv4 / tcp))
> > +pkt.append(fuzz(eth / vlan / ipv4 / udp))
> > +pkt.append(fuzz(eth / vlan / ipv4 / tcp))
> > +
> > +# IPv6 packets with fuzzing
> > +pkt.append(fuzz(eth / ipv6 / udp))
> > +pkt.append(fuzz(eth / ipv6 / tcp))
> > +pkt.append(fuzz(eth / vlan / ipv6 / udp))
> > +pkt.append(fuzz(eth / vlan / ipv6 / tcp))
> > +
> > +else:
> > +
> > +ipv4 = IP(src=RandIP(), dst=RandIP())
> > +ipv6 = IPv6(src=RandIP6(), dst=RandIP6())
> > +tcp = TCP(dport=RandShort(), sport=RandShort(), flags='S')
> > +# IPv4 packets
> > +pkt.append(eth / ipv4 / udp)
> > +pkt.append(eth / ipv4 / tcp)
> > +

[ovs-dev] [PATCH v2 2/2] dpif-netdev: Restructure rxq schedule logging.

2022-04-05 Thread Kevin Traynor

Previously logging about rxq scheduling was done in a code branch with
the selection of the PMD thread core after checking that a numa was
selected.

By splitting out the logging from the PMD thread core selection, it can
simplify the code complexity and make it more extendable for future
additions.

Also, minor updates to a couple of variables to improve readability and
fix a log indent while working on this code block.

There is no user visible change in behaviour or logs.

Signed-off-by: Kevin Traynor 
Acked-by: David Marchand 
---
 lib/dpif-netdev.c | 61 ++-
 1 file changed, 34 insertions(+), 27 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 0888d3a8b..14f1fdc0c 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -5996,5 +5996,5 @@ sched_numa_list_schedule(struct sched_numa_list 
*numa_list,
 struct sched_pmd *sched_pmd = NULL;
 struct sched_numa *numa;
-int numa_id;
+int port_numa_id;
 uint64_t proc_cycles;
 char rxq_cyc_log[MAX_RXQ_CYC_STRLEN];
@@ -6008,7 +6008,9 @@ sched_numa_list_schedule(struct sched_numa_list 
*numa_list,
 /* Store the cycles for this rxq as we will log these later. */
 proc_cycles = dp_netdev_rxq_get_cycles(rxq, RXQ_CYCLES_PROC_HIST);
-/* Select the numa that should be used for this rxq. */
-numa_id = netdev_get_numa_id(rxq->port->netdev);
-numa = sched_numa_list_lookup(numa_list, numa_id);
+
+port_numa_id = netdev_get_numa_id(rxq->port->netdev);
+
+/* Select numa. */
+numa = sched_numa_list_lookup(numa_list, port_numa_id);
 
 /* Check if numa has no PMDs or no non-isolated PMDs. */
@@ -6028,33 +6030,38 @@ sched_numa_list_schedule(struct sched_numa_list 
*numa_list,
 
 if (numa) {
-if (numa->numa_id != numa_id) {
+/* Select the PMD that should be used for this rxq. */
+sched_pmd = sched_pmd_next(numa, algo,
+   proc_cycles ? true : false);
+}
+
+/* Check that a pmd has been selected. */
+if (sched_pmd) {
+int pmd_numa_id;
+
+pmd_numa_id = sched_pmd->numa->numa_id;
+/* Check if selected pmd numa matches port numa. */
+if (pmd_numa_id != port_numa_id) {
 VLOG(level, "There's no available (non-isolated) pmd thread "
 "on numa node %d. Port \'%s\' rx queue %d will "
 "be assigned to a pmd on numa node %d. "
 "This may lead to reduced performance.",
-numa_id, netdev_rxq_get_name(rxq->rx),
-netdev_rxq_get_queue_id(rxq->rx), numa->numa_id);
+port_numa_id, netdev_rxq_get_name(rxq->rx),
+netdev_rxq_get_queue_id(rxq->rx), pmd_numa_id);
 }
-
-/* Select the PMD that should be used for this rxq. */
-sched_pmd = sched_pmd_next(numa, algo, proc_cycles ? true : false);
-if (sched_pmd) {
-VLOG(level, "Core %2u on numa node %d assigned port \'%s\' "
-"rx queue %d%s.",
-sched_pmd->pmd->core_id, sched_pmd->pmd->numa_id,
-netdev_rxq_get_name(rxq->rx),
-netdev_rxq_get_queue_id(rxq->rx),
-get_rxq_cyc_log(rxq_cyc_log, algo, proc_cycles));
-sched_pmd_add_rxq(sched_pmd, rxq, proc_cycles);
-}
-}
-if (!sched_pmd) {
+VLOG(level, "Core %2u on numa node %d assigned port \'%s\' "
+"rx queue %d%s.",
+sched_pmd->pmd->core_id, sched_pmd->pmd->numa_id,
+netdev_rxq_get_name(rxq->rx),
+netdev_rxq_get_queue_id(rxq->rx),
+get_rxq_cyc_log(rxq_cyc_log, algo, proc_cycles));
+sched_pmd_add_rxq(sched_pmd, rxq, proc_cycles);
+} else  {
 VLOG(level == VLL_DBG ? level : VLL_WARN,
-"No non-isolated pmd on any numa available for "
-"port \'%s\' rx queue %d%s. "
-"This rx queue will not be polled.",
-netdev_rxq_get_name(rxq->rx),
-netdev_rxq_get_queue_id(rxq->rx),
-get_rxq_cyc_log(rxq_cyc_log, algo, proc_cycles));
+ "No non-isolated pmd on any numa available for "
+ "port \'%s\' rx queue %d%s. "
+ "This rx queue will not be polled.",
+ netdev_rxq_get_name(rxq->rx),
+ netdev_rxq_get_queue_id(rxq->rx),
+ get_rxq_cyc_log(rxq_cyc_log, algo, proc_cycles));
 }
 }
-- 
2.34.1

___
dev mailing list
d...@openvswitch.org

[ovs-dev] [PATCH v2 1/2] dpif-netdev: Split function to find lowest loaded PMD thread core.

2022-04-05 Thread Kevin Traynor

This splits up the looping through each PMD thread core on a numa node
with the check to compare cycles or rxqs.

This is done so in future the compare could be reused with any group
of PMD thread cores.

There is no user visible change in behaviour.

Signed-off-by: Kevin Traynor 
Acked-by: David Marchand 
---
 lib/dpif-netdev.c | 34 --
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 676434308..0888d3a8b 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -5756,13 +5756,34 @@ compare_rxq_cycles(const void *a, const void *b)
 }
 
+static bool
+sched_pmd_new_lowest(struct sched_pmd *current_lowest, struct sched_pmd *pmd,
+ bool has_proc) {
+uint64_t current_num, pmd_num;
+
+if (current_lowest == NULL) {
+return true;
+}
+
+if (has_proc) {
+current_num = current_lowest->pmd_proc_cycles;
+pmd_num = pmd->pmd_proc_cycles;
+} else {
+current_num = current_lowest->n_rxq;
+pmd_num = pmd->n_rxq;
+}
+
+if (pmd_num < current_num) {
+return true;
+}
+return false;
+}
+
 static struct sched_pmd *
 sched_pmd_get_lowest(struct sched_numa *numa, bool has_cyc)
 {
 struct sched_pmd *lowest_sched_pmd = NULL;
-uint64_t lowest_num = UINT64_MAX;
 
 for (unsigned i = 0; i < numa->n_pmds; i++) {
 struct sched_pmd *sched_pmd;
-uint64_t pmd_num;
 
 sched_pmd = >pmds[i];
@@ -5770,12 +5791,5 @@ sched_pmd_get_lowest(struct sched_numa *numa, bool 
has_cyc)
 continue;
 }
-if (has_cyc) {
-pmd_num = sched_pmd->pmd_proc_cycles;
-} else {
-pmd_num = sched_pmd->n_rxq;
-}
-
-if (pmd_num < lowest_num) {
-lowest_num = pmd_num;
+if (sched_pmd_new_lowest(lowest_sched_pmd, sched_pmd, has_cyc)) {
 lowest_sched_pmd = sched_pmd;
 }
-- 
2.34.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v2 0/2] Rxq scheduling code rework.

2022-04-05 Thread Kevin Traynor

These are intended to reduce code complexity a bit and make the
code more extensible for additional features like cross-numa
polling [0].

There is no change in behaviour.

v2:
- Address minor comments from David
- Minor rebase as [1] has now merged

GHA: https://github.com/kevintraynor/ovs/actions/runs/2096405657

[0] https://mail.openvswitch.org/pipermail/ovs-dev/2022-March/392433.html
[1] https://mail.openvswitch.org/pipermail/ovs-dev/2022-March/392689.html

Kevin Traynor (2):
  dpif-netdev: Split function to find lowest loaded PMD thread core.
  dpif-netdev: Restructure rxq schedule logging.

 lib/dpif-netdev.c | 95 +--
 1 file changed, 58 insertions(+), 37 deletions(-)

-- 
2.34.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 5.15 675/913] net: prefer nf_ct_put instead of nf_conntrack_put

2022-04-05 Thread Greg Kroah-Hartman

From: Florian Westphal 

[ Upstream commit 408bdcfce8dfd6902f75fbcd3b99d8b24b506597 ]

Its the same as nf_conntrack_put(), but without the
need for an indirect call.  The downside is a module dependency on
nf_conntrack, but all of these already depend on conntrack anyway.

Cc: Paul Blakey 
Cc: d...@openvswitch.org
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Sasha Levin 
---
 net/netfilter/nf_conntrack_core.c |  4 ++--
 net/openvswitch/conntrack.c   | 14 ++
 net/sched/act_ct.c|  6 +++---
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c 
b/net/netfilter/nf_conntrack_core.c
index 7f7997460764..917e708a4561 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -989,7 +989,7 @@ static int __nf_ct_resolve_clash(struct sk_buff *skb,
 
nf_ct_acct_merge(ct, ctinfo, loser_ct);
nf_ct_add_to_dying_list(loser_ct);
-   nf_conntrack_put(_ct->ct_general);
+   nf_ct_put(loser_ct);
nf_ct_set(skb, ct, ctinfo);
 
NF_CT_STAT_INC(net, clash_resolve);
@@ -1920,7 +1920,7 @@ nf_conntrack_in(struct sk_buff *skb, const struct 
nf_hook_state *state)
/* Invalid: inverse of the return code tells
 * the netfilter core what to do */
pr_debug("nf_conntrack_in: Can't track with proto module\n");
-   nf_conntrack_put(>ct_general);
+   nf_ct_put(ct);
skb->_nfct = 0;
/* Special case: TCP tracker reports an attempt to reopen a
 * closed/aborted connection. We have to go back and create a
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 8f47f4e78d32..f2b64cab9af7 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -574,7 +574,7 @@ ovs_ct_expect_find(struct net *net, const struct 
nf_conntrack_zone *zone,
struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
 
nf_ct_delete(ct, 0, 0);
-   nf_conntrack_put(>ct_general);
+   nf_ct_put(ct);
}
}
 
@@ -723,7 +723,7 @@ static bool skb_nfct_cached(struct net *net,
if (nf_ct_is_confirmed(ct))
nf_ct_delete(ct, 0, 0);
 
-   nf_conntrack_put(>ct_general);
+   nf_ct_put(ct);
nf_ct_set(skb, NULL, 0);
return false;
}
@@ -967,7 +967,8 @@ static int __ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
 
/* Associate skb with specified zone. */
if (tmpl) {
-   nf_conntrack_put(skb_nfct(skb));
+   ct = nf_ct_get(skb, );
+   nf_ct_put(ct);
nf_conntrack_get(>ct_general);
nf_ct_set(skb, tmpl, IP_CT_NEW);
}
@@ -1328,7 +1329,12 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb,
 
 int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key)
 {
-   nf_conntrack_put(skb_nfct(skb));
+   enum ip_conntrack_info ctinfo;
+   struct nf_conn *ct;
+
+   ct = nf_ct_get(skb, );
+
+   nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
ovs_ct_fill_key(skb, key, false);
 
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index 4ffea1290ce1..240b3c5d2eb1 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -592,7 +592,7 @@ static bool tcf_ct_skb_nfct_cached(struct net *net, struct 
sk_buff *skb,
if (nf_ct_is_confirmed(ct))
nf_ct_kill(ct);
 
-   nf_conntrack_put(>ct_general);
+   nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 
return false;
@@ -757,7 +757,7 @@ static void tcf_ct_params_free(struct rcu_head *head)
tcf_ct_flow_table_put(params);
 
if (params->tmpl)
-   nf_conntrack_put(>tmpl->ct_general);
+   nf_ct_put(params->tmpl);
kfree(params);
 }
 
@@ -967,7 +967,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct 
tc_action *a,
tc_skb_cb(skb)->post_ct = false;
ct = nf_ct_get(skb, );
if (ct) {
-   nf_conntrack_put(>ct_general);
+   nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
}
 
-- 
2.34.1



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 5.16 0744/1017] net: prefer nf_ct_put instead of nf_conntrack_put

2022-04-05 Thread Greg Kroah-Hartman

From: Florian Westphal 

[ Upstream commit 408bdcfce8dfd6902f75fbcd3b99d8b24b506597 ]

Its the same as nf_conntrack_put(), but without the
need for an indirect call.  The downside is a module dependency on
nf_conntrack, but all of these already depend on conntrack anyway.

Cc: Paul Blakey 
Cc: d...@openvswitch.org
Signed-off-by: Florian Westphal 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Sasha Levin 
---
 net/netfilter/nf_conntrack_core.c |  4 ++--
 net/openvswitch/conntrack.c   | 14 ++
 net/sched/act_ct.c|  6 +++---
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/net/netfilter/nf_conntrack_core.c 
b/net/netfilter/nf_conntrack_core.c
index 7f7997460764..917e708a4561 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -989,7 +989,7 @@ static int __nf_ct_resolve_clash(struct sk_buff *skb,
 
nf_ct_acct_merge(ct, ctinfo, loser_ct);
nf_ct_add_to_dying_list(loser_ct);
-   nf_conntrack_put(_ct->ct_general);
+   nf_ct_put(loser_ct);
nf_ct_set(skb, ct, ctinfo);
 
NF_CT_STAT_INC(net, clash_resolve);
@@ -1920,7 +1920,7 @@ nf_conntrack_in(struct sk_buff *skb, const struct 
nf_hook_state *state)
/* Invalid: inverse of the return code tells
 * the netfilter core what to do */
pr_debug("nf_conntrack_in: Can't track with proto module\n");
-   nf_conntrack_put(>ct_general);
+   nf_ct_put(ct);
skb->_nfct = 0;
/* Special case: TCP tracker reports an attempt to reopen a
 * closed/aborted connection. We have to go back and create a
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 8f47f4e78d32..f2b64cab9af7 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -574,7 +574,7 @@ ovs_ct_expect_find(struct net *net, const struct 
nf_conntrack_zone *zone,
struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
 
nf_ct_delete(ct, 0, 0);
-   nf_conntrack_put(>ct_general);
+   nf_ct_put(ct);
}
}
 
@@ -723,7 +723,7 @@ static bool skb_nfct_cached(struct net *net,
if (nf_ct_is_confirmed(ct))
nf_ct_delete(ct, 0, 0);
 
-   nf_conntrack_put(>ct_general);
+   nf_ct_put(ct);
nf_ct_set(skb, NULL, 0);
return false;
}
@@ -967,7 +967,8 @@ static int __ovs_ct_lookup(struct net *net, struct 
sw_flow_key *key,
 
/* Associate skb with specified zone. */
if (tmpl) {
-   nf_conntrack_put(skb_nfct(skb));
+   ct = nf_ct_get(skb, );
+   nf_ct_put(ct);
nf_conntrack_get(>ct_general);
nf_ct_set(skb, tmpl, IP_CT_NEW);
}
@@ -1328,7 +1329,12 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb,
 
 int ovs_ct_clear(struct sk_buff *skb, struct sw_flow_key *key)
 {
-   nf_conntrack_put(skb_nfct(skb));
+   enum ip_conntrack_info ctinfo;
+   struct nf_conn *ct;
+
+   ct = nf_ct_get(skb, );
+
+   nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
ovs_ct_fill_key(skb, key, false);
 
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index 4ffea1290ce1..240b3c5d2eb1 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -592,7 +592,7 @@ static bool tcf_ct_skb_nfct_cached(struct net *net, struct 
sk_buff *skb,
if (nf_ct_is_confirmed(ct))
nf_ct_kill(ct);
 
-   nf_conntrack_put(>ct_general);
+   nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
 
return false;
@@ -757,7 +757,7 @@ static void tcf_ct_params_free(struct rcu_head *head)
tcf_ct_flow_table_put(params);
 
if (params->tmpl)
-   nf_conntrack_put(>tmpl->ct_general);
+   nf_ct_put(params->tmpl);
kfree(params);
 }
 
@@ -967,7 +967,7 @@ static int tcf_ct_act(struct sk_buff *skb, const struct 
tc_action *a,
tc_skb_cb(skb)->post_ct = false;
ct = nf_ct_get(skb, );
if (ct) {
-   nf_conntrack_put(>ct_general);
+   nf_ct_put(ct);
nf_ct_set(skb, NULL, IP_CT_UNTRACKED);
}
 
-- 
2.34.1



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v8 4/4] tests/mfex: Improve pcap script for mfex tests.

2022-04-05 Thread Stokes, Ian

> The mfex pcap generation script is improved for varied length
> traffic and also removes the hard coded mfex_pcap and instead uses
> the script itself to generate complex traffic patterns for testing.
> 
> Signed-off-by: Kumar Amber 
> 

So in general I think this approach is a bit more improved.

@Phelan, Michael do you know if we currently run this testing in our CI? Also 
@Amber, Kumar et al, should the CI be expanded now to do both fuzzy and 
non-fuzzy testing as both seem to be an option now?

Regards
Ian

> ---
> v8:
> - Reduce IO writes.
> --
> ---
>  tests/automake.mk |   1 -
>  tests/mfex_fuzzy.py   |  66 +++---
>  tests/pcap/mfex_test.pcap | Bin 416 -> 0 bytes
>  tests/system-dpdk.at  |  23 +
>  4 files changed, 63 insertions(+), 27 deletions(-)
>  delete mode 100644 tests/pcap/mfex_test.pcap
> 
> diff --git a/tests/automake.mk b/tests/automake.mk
> index 8a9151f81..507da2ee8 100644
> --- a/tests/automake.mk
> +++ b/tests/automake.mk
> @@ -145,7 +145,6 @@ $(srcdir)/tests/fuzz-regression-list.at:
> tests/automake.mk
> 
>  EXTRA_DIST += $(MFEX_AUTOVALIDATOR_TESTS)
>  MFEX_AUTOVALIDATOR_TESTS = \
> - tests/pcap/mfex_test.pcap \
>   tests/mfex_fuzzy.py
> 
>  OVSDB_CLUSTER_TESTSUITE_AT = \
> diff --git a/tests/mfex_fuzzy.py b/tests/mfex_fuzzy.py
> index 3efe1152d..dbde5fe1b 100755
> --- a/tests/mfex_fuzzy.py
> +++ b/tests/mfex_fuzzy.py
> @@ -3,30 +3,58 @@
>  import sys
> 
>  from scapy.all import RandMAC, RandIP, PcapWriter, RandIP6, RandShort, fuzz
> -from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP
> +from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP, random
> 
> +# Relative path for the pcap file location.
>  path = str(sys.argv[1]) + "/pcap/fuzzy.pcap"
> +# The number of packets generated will be size * 8.
> +size = int(sys.argv[2])
> +# Traffic option is used to choose between fuzzy or simple packet type.
> +traffic_opt = str(sys.argv[3])
> +
>  pktdump = PcapWriter(path, append=False, sync=True)
> 
> -for i in range(0, 2000):
> +pkt = []
> +
> +for i in range(0, size):
> 
> -# Generate random protocol bases, use a fuzz() over the combined packet
> -# for full fuzzing.
>  eth = Ether(src=RandMAC(), dst=RandMAC())
>  vlan = Dot1Q()
> -ipv4 = IP(src=RandIP(), dst=RandIP())
> -ipv6 = IPv6(src=RandIP6(), dst=RandIP6())
>  udp = UDP(dport=RandShort(), sport=RandShort())
> -tcp = TCP(dport=RandShort(), sport=RandShort())
> -
> -# IPv4 packets with fuzzing
> -pktdump.write(fuzz(eth / ipv4 / udp))
> -pktdump.write(fuzz(eth / ipv4 / tcp))
> -pktdump.write(fuzz(eth / vlan / ipv4 / udp))
> -pktdump.write(fuzz(eth / vlan / ipv4 / tcp))
> -
> -# IPv6 packets with fuzzing
> -pktdump.write(fuzz(eth / ipv6 / udp))
> -pktdump.write(fuzz(eth / ipv6 / tcp))
> -pktdump.write(fuzz(eth / vlan / ipv6 / udp))
> -pktdump.write(fuzz(eth / vlan / ipv6 / tcp))
> +
> +if traffic_opt == "fuzzy":
> +
> +ipv4 = IP(src=RandIP(), dst=RandIP(), len=random.randint(0, 100))
> +ipv6 = IPv6(src=RandIP6(), dst=RandIP6(), plen=random.randint(0, 
> 100))
> +tcp = TCP(dport=RandShort(), sport=RandShort(), flags='S',
> +  dataofs=random.randint(0, 20))
> +# IPv4 packets with fuzzing
> +pkt.append(fuzz(eth / ipv4 / udp))
> +pkt.append(fuzz(eth / ipv4 / tcp))
> +pkt.append(fuzz(eth / vlan / ipv4 / udp))
> +pkt.append(fuzz(eth / vlan / ipv4 / tcp))
> +
> +# IPv6 packets with fuzzing
> +pkt.append(fuzz(eth / ipv6 / udp))
> +pkt.append(fuzz(eth / ipv6 / tcp))
> +pkt.append(fuzz(eth / vlan / ipv6 / udp))
> +pkt.append(fuzz(eth / vlan / ipv6 / tcp))
> +
> +else:
> +
> +ipv4 = IP(src=RandIP(), dst=RandIP())
> +ipv6 = IPv6(src=RandIP6(), dst=RandIP6())
> +tcp = TCP(dport=RandShort(), sport=RandShort(), flags='S')
> +# IPv4 packets
> +pkt.append(eth / ipv4 / udp)
> +pkt.append(eth / ipv4 / tcp)
> +pkt.append(eth / vlan / ipv4 / udp)
> +pkt.append(eth / vlan / ipv4 / tcp)
> +
> +# IPv6 packets
> +pkt.append(eth / ipv6 / udp)
> +pkt.append(eth / ipv6 / tcp)
> +pkt.append(eth / vlan / ipv6 / udp)
> +pkt.append(eth / vlan / ipv6 / tcp)
> +
> +pktdump.write(pkt)
> diff --git a/tests/pcap/mfex_test.pcap b/tests/pcap/mfex_test.pcap
> deleted file mode 100644
> index
> 1aac67b8d643ecb016c758cba4cc32212a80f52a..0
> 000
> GIT binary patch
> literal 0
> HcmV?d1
> 
> literal 416
> zcmca|c+)~A1{MYw`2U}Qff2}QK`M68ITRa|G@yFii5$Gfk6YL%z>@uY&}o
> |
> z2s4N<1VH2&7y^V87$)XGOtD~MV$cFgfG~zBGGJ2#YtF$ z
> xK>KST_NTIwYriok6N4Vm)gX-
> Q@c^{cp<7_5LgK^UuU{2>VS0RZ!RQ+EIW
> 
> diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
> index 7d2715c4a..1476e470c 100644
> --- a/tests/system-dpdk.at
> +++ b/tests/system-dpdk.at
> @@ -226,12 +226,13 @@ dnl 
>

Re: [ovs-dev] [PATCH v8 3/4] dpif-netdev/mfex: Avoid hashing when opt mfex called.

2022-04-05 Thread Stokes, Ian

> This patch avoids calculating the software hash of the packet again
> if the optimized miniflow-extract hit. In cases of scalar miniflow
> extract, the normal hashing calculation is performed.
> 
> Signed-off-by: Kumar Amber 

Thanks for looking at this, seems straight forward enough. Think this can be 
merged once preceding patches are in place.

Thanks
Ian
> ---
>  lib/dpif-netdev-avx512.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/dpif-netdev-avx512.c b/lib/dpif-netdev-avx512.c
> index b7131ba3f..c68b79f6b 100644
> --- a/lib/dpif-netdev-avx512.c
> +++ b/lib/dpif-netdev-avx512.c
> @@ -212,15 +212,15 @@ dp_netdev_input_outer_avx512(struct
> dp_netdev_pmd_thread *pmd,
>  if (!mfex_hit) {
>  /* Do a scalar miniflow extract into keys. */
>  miniflow_extract(packet, >mf);
> +key->len = netdev_flow_key_size(miniflow_n_values(>mf));
> +key->hash = dpif_netdev_packet_get_rss_hash_orig_pkt(packet,
> + >mf);
>  }
> 
>  /* Cache TCP and byte values for all packets. */
>  pkt_meta[i].bytes = dp_packet_size(packet);
>  pkt_meta[i].tcp_flags = miniflow_get_tcp_flags(>mf);
> 
> -key->len = netdev_flow_key_size(miniflow_n_values(>mf));
> -key->hash = dpif_netdev_packet_get_rss_hash_orig_pkt(packet, 
> >mf);
> -
>  if (emc_enabled) {
>  f = emc_lookup(>emc_cache, key);
> 
> --
> 2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v8 1/4] dpif-netdev/mfex: Add ipv4 profile based hashing.

2022-04-05 Thread Stokes, Ian




> -Original Message-
> From: Amber, Kumar 
> Sent: Friday, April 1, 2022 12:24 PM
> To: ovs-dev@openvswitch.org
> Cc: Stokes, Ian ; echau...@redhat.com; Ferriter, Cian
> ; f...@sysclose.org; Van Haaren, Harry
> ; Amber, Kumar 
> Subject: [PATCH v8 1/4] dpif-netdev/mfex: Add ipv4 profile based hashing.
> 
> For packets which don't already have a hash calculated,
> miniflow_hash_5tuple() calculates the hash of a packet
> using the previously built miniflow.
> 
> This commit adds IPv4 profile specific hashing which
> uses fixed offsets into the packet to improve hashing
> performance.
> 
> Signed-off-by: Kumar Amber 
> Signed-off-by: Harry van Haaren 
> Co-authored-by: Harry van Haaren 

Thanks for working on this Amber/Harry and the reviews /testing Cian.

Looking at this it seems to be in a good sate. Will apply to master unless 
there are any objection?

Thanks
Ian
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v8 2/4] dpif-netdev/mfex: Add packet hash check to autovalidator.

2022-04-05 Thread Stokes, Ian

> This patch adds the scalar hash calls to the autovalidator.
> It also adds checks for comparing the scalar hash against
> the profile based hash calculated as part of AVX512 MFEX implementations.
> 
> The per profile AVX512 optimized hash was added to the autovalidator
> in the last commit. The autovalidator was already calling that code,
> we just add the checks and scalar hashing in this commit.
> 
> Signed-off-by: Kumar Amber 
> 

Minor nit below.

> ---
> v8:
> - Fix hash validation.
> ---
> ---
>  lib/dpif-netdev-private-extract.c | 24 +---
>  1 file changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/dpif-netdev-private-extract.c 
> b/lib/dpif-netdev-private-extract.c
> index 4b2f12015..e8550c1fb 100644
> --- a/lib/dpif-netdev-private-extract.c
> +++ b/lib/dpif-netdev-private-extract.c
> @@ -252,6 +252,9 @@ dpif_miniflow_extract_autovalidator(struct
> dp_packet_batch *packets,
>  DP_PACKET_BATCH_FOR_EACH (i, packet, packets) {
>  pkt_metadata_init(>md, in_port);
>  miniflow_extract(packet, [i].mf);
> +keys[i].len = netdev_flow_key_size(miniflow_n_values([i].mf));
> +keys[i].hash = dpif_netdev_packet_get_rss_hash_orig_pkt(packet,
> +[i].mf);
> 
>  /* Store known good metadata to compare with optimized metadata. */
>  good_l2_5_ofs[i] = packet->l2_5_ofs;
> @@ -266,10 +269,11 @@ dpif_miniflow_extract_autovalidator(struct
> dp_packet_batch *packets,
>  if (!mfex_impls[j].available) {
>  continue;
>  }
> -/* Reset keys and offsets before each implementation. */
> +/* Reset keys, offsets and hash before each implementation. */
>  memset(test_keys, 0, keys_size * sizeof(struct netdev_flow_key));
>  DP_PACKET_BATCH_FOR_EACH (i, packet, packets) {
>  dp_packet_reset_offsets(packet);
> +*dp_packet_ol_flags_ptr(packet) &= ~DP_PACKET_OL_RSS_HASH;
>  }
>  /* Call optimized miniflow for each batch of packet. */
>  uint32_t hit_mask = mfex_impls[j].extract_func(packets, test_keys,
> @@ -335,6 +339,15 @@ dpif_miniflow_extract_autovalidator(struct
> dp_packet_batch *packets,
>  failed = 1;
>  }
> 
> +/* Check hashes are equal. */
> +if ((keys[i].hash != test_keys[i].hash) ||
> +(keys[i].len != test_keys[i].len)) {
> +ds_put_format(_msg, "Good hash: %d len: %d\tTest hash:%d"
> +  " len:%d\n", keys[i].hash, keys[i].len,
> +  test_keys[i].hash, test_keys[i].len);

Should the Log message be more verbose?  Should it be made to clearer to end 
user that the test has failed due to the hashes not matching rather than just 
providing a Good Hash value and Test Hash Value? Just a thought.

Regards
Ian


> +failed = 1;
> +}
> +
>  if (failed) {
>  VLOG_ERR("Autovalidation for %s failed in pkt %d,"
>   " disabling.", mfex_impls[j].name, i);
> @@ -351,13 +364,10 @@ dpif_miniflow_extract_autovalidator(struct
> dp_packet_batch *packets,
>  atomic_store_relaxed(>miniflow_extract_opt, NULL);
>  }
> 
> -/* Preserve packet correctness by storing back the good offsets in
> - * packets back. */
> +/* Reset all packet values. */
>  DP_PACKET_BATCH_FOR_EACH (i, packet, packets) {
> -packet->l2_5_ofs = good_l2_5_ofs[i];
> -packet->l3_ofs = good_l3_ofs[i];
> -packet->l4_ofs = good_l4_ofs[i];
> -packet->l2_pad_size = good_l2_pad_size[i];
> +dp_packet_reset_offsets(packet);
> +*dp_packet_ol_flags_ptr(packet) &= ~DP_PACKET_OL_RSS_HASH;
>  }
> 
>  /* Returning zero implies no packets were hit by autovalidation. This
> --
> 2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net] net: openvswitch: don't send internal clone attribute to the userspace.

2022-04-05 Thread Aaron Conole

Ilya Maximets  writes:

> 'OVS_CLONE_ATTR_EXEC' is an internal attribute that is used for
> performance optimization inside the kernel.  It's added by the kernel
> while parsing user-provided actions and should not be sent during the
> flow dump as it's not part of the uAPI.
>
> The issue doesn't cause any significant problems to the ovs-vswitchd
> process, because reported actions are not really used in the
> application lifecycle and only supposed to be shown to a human via
> ovs-dpctl flow dump.  However, the action list is still incorrect
> and causes the following error if the user wants to look at the
> datapath flows:
>
>   # ovs-dpctl add-dp system@ovs-system
>   # ovs-dpctl add-flow "" "clone(ct(commit),0)"
>   # ovs-dpctl dump-flows
>   , packets:0, bytes:0, used:never,
> actions:clone(bad length 4, expected -1 for: action0(01 00 00 00),
>   ct(commit),0)
>
> With the fix:
>
>   # ovs-dpctl dump-flows
>   , packets:0, bytes:0, used:never,
> actions:clone(ct(commit),0)
>
> Additionally fixed an incorrect attribute name in the comment.
>
> Fixes: b233504033db ("openvswitch: kernel datapath clone action")
> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net] net: openvswitch: fix leak of nested actions

2022-04-05 Thread Aaron Conole

Ilya Maximets  writes:

> While parsing user-provided actions, openvswitch module may dynamically
> allocate memory and store pointers in the internal copy of the actions.
> So this memory has to be freed while destroying the actions.
>
> Currently there are only two such actions: ct() and set().  However,
> there are many actions that can hold nested lists of actions and
> ovs_nla_free_flow_actions() just jumps over them leaking the memory.
>
> For example, removal of the flow with the following actions will lead
> to a leak of the memory allocated by nf_ct_tmpl_alloc():
>
>   actions:clone(ct(commit),0)
>
> Non-freed set() action may also leak the 'dst' structure for the
> tunnel info including device references.
>
> Under certain conditions with a high rate of flow rotation that may
> cause significant memory leak problem (2MB per second in reporter's
> case).  The problem is also hard to mitigate, because the user doesn't
> have direct control over the datapath flows generated by OVS.
>
> Fix that by iterating over all the nested actions and freeing
> everything that needs to be freed recursively.
>
> New build time assertion should protect us from this problem if new
> actions will be added in the future.
>
> Unfortunately, openvswitch module doesn't use NLA_F_NESTED, so all
> attributes has to be explicitly checked.  sample() and clone() actions
> are mixing extra attributes into the user-provided action list.  That
> prevents some code generalization too.
>
> Fixes: 34ae932a4036 ("openvswitch: Make tunnel set action attach a metadata 
> dst")
> Link: https://mail.openvswitch.org/pipermail/ovs-dev/2022-March/392922.html
> Reported-by: Stéphane Graber 
> Signed-off-by: Ilya Maximets 
> ---

Acked-by: Aaron Conole 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v5 6/6] ci: Add UB Sanitizer.

2022-04-05 Thread Dumitru Ceara

Note: There still is an UB instance when using SSE4.2 as reported here:
https://mail.openvswitch.org/pipermail/ovs-dev/2022-January/390904.html

Acked-by: Aaron Conole 
Signed-off-by: Dumitru Ceara 
---
v5:
- Rebased.
v4:
- Rebased.
v3:
- Fix typo in variable name.
- Added SSE4.2 UB note to commit log.
---
 .ci/linux-build.sh   |6 ++
 .github/workflows/build-and-test.yml |5 +
 configure.ac |1 +
 tests/atlocal.in |   16 
 tests/automake.mk|1 +
 tests/daemon.at  |8 
 tests/ovs-macros.at  |5 +
 tests/ovsdb-server.at|   16 
 8 files changed, 58 insertions(+)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index 6cd38ff3efb5..afd4379e6923 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -250,6 +250,12 @@ if [ "$ASAN" ]; then
 CFLAGS_FOR_OVS="${CFLAGS_FOR_OVS} ${CFLAGS_ASAN}"
 fi
 
+if [ "$UBSAN" ]; then
+# Use the default options configured in tests/atlocal.in, in UBSAN_OPTIONS.
+CFLAGS_UBSAN="-O1 -fno-omit-frame-pointer -fno-common -fsanitize=undefined"
+CFLAGS_FOR_OVS="${CFLAGS_FOR_OVS} ${CFLAGS_UBSAN}"
+fi
+
 save_OPTS="${OPTS} $*"
 OPTS="${EXTRA_OPTS} ${save_OPTS}"
 
diff --git a/.github/workflows/build-and-test.yml 
b/.github/workflows/build-and-test.yml
index eac3504e48f8..9e3583781baa 100644
--- a/.github/workflows/build-and-test.yml
+++ b/.github/workflows/build-and-test.yml
@@ -13,6 +13,7 @@ jobs:
 linux-headers-$(uname -r) build-essential fakeroot devscripts equivs
   AFXDP:   ${{ matrix.afxdp }}
   ASAN:${{ matrix.asan }}
+  UBSAN:   ${{ matrix.ubsan }}
   CC:  ${{ matrix.compiler }}
   DEB_PACKAGE: ${{ matrix.deb_package }}
   DPDK:${{ matrix.dpdk }}
@@ -44,6 +45,10 @@ jobs:
 testsuite:test
 kernel:   3.16
 asan: asan
+  - compiler: clang
+testsuite:test
+kernel:   3.16
+ubsan:ubsan
 
   - compiler: gcc
 testsuite:test
diff --git a/configure.ac b/configure.ac
index 298ea85aba22..198653a60528 100644
--- a/configure.ac
+++ b/configure.ac
@@ -197,6 +197,7 @@ OVS_CHECK_LINUX_SCTP_CT
 OVS_CHECK_LINUX_VIRTIO_TYPES
 OVS_CHECK_DPDK
 OVS_CHECK_PRAGMA_MESSAGE
+AC_SUBST([CFLAGS])
 AC_SUBST([OVS_CFLAGS])
 AC_SUBST([OVS_LDFLAGS])
 
diff --git a/tests/atlocal.in b/tests/atlocal.in
index a0ad239ecef2..142ea2090bc8 100644
--- a/tests/atlocal.in
+++ b/tests/atlocal.in
@@ -4,6 +4,7 @@ OPENSSL_SUPPORTS_SNI='@OPENSSL_SUPPORTS_SNI@'
 HAVE_UNBOUND='@HAVE_UNBOUND@'
 EGREP='@EGREP@'
 PYTHON3='@PYTHON3@'
+CFLAGS='@CFLAGS@'
 
 # PYTHONCOERCECLOCALE=0 disables the Unicode compatibility warning on
 # stderr that breaks almost any Python3 test (PEP 0538)
@@ -197,6 +198,16 @@ else
 DIFF_SUPPORTS_NORMAL_FORMAT=no
 fi
 
+# Check whether UB Sanitizer is being used.
+case "$CFLAGS" in
+*fsanitize=undefined*)
+TESTS_WITH_UBSAN=yes
+;;
+*)
+TESTS_WITH_UBSAN=no
+;;
+esac
+
 # Turn off proxies.
 unset http_proxy
 unset https_proxy
@@ -222,3 +233,8 @@ export OVS_CTL_TIMEOUT
 # matter break everything.
 ASAN_OPTIONS=detect_leaks=0:abort_on_error=true:log_path=asan:$ASAN_OPTIONS
 export ASAN_OPTIONS
+
+# Add some default flags for UndefinedBehaviorSanitizer, if it was used
+# for the build.
+UBSAN_OPTIONS=print_stacktrace=1:halt_on_error=true:log_path=ubsan:$UBSAN_OPTIONS
+export UBSAN_OPTIONS
diff --git a/tests/automake.mk b/tests/automake.mk
index 8a9151f81b86..a526be84b493 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -213,6 +213,7 @@ check-local:
set $(SHELL) '$(TESTSUITE)' -C tests AUTOTEST_PATH=$(AUTOTEST_PATH); \
"$$@" $(TESTSUITEFLAGS) || \
(test -z "$$(find $(TESTSUITE_DIR) -name 'asan.*')" && \
+test -z "$$(find $(TESTSUITE_DIR) -name 'ubsan.*')" && \
 test X'$(RECHECK)' = Xyes && "$$@" --recheck)
 
 # Python Coverage support.
diff --git a/tests/daemon.at b/tests/daemon.at
index 39d9aa391e9b..d7981f9d23a6 100644
--- a/tests/daemon.at
+++ b/tests/daemon.at
@@ -81,6 +81,10 @@ AT_SKIP_IF([test "$IS_WIN32" = "yes"])
 # This test intentionally causes SIGSEGV, so make Address Sanitizer ignore it.
 ASAN_OPTIONS=$ASAN_OPTIONS:handle_segv=0; export ASAN_OPTIONS
 
+# Skip it if UB Sanitizer is being used.  There's no way to disable the
+# SEGV check at runtime.
+AT_SKIP_IF([test $TESTS_WITH_UBSAN = yes])
+
 # Start the daemon and wait for the pidfile to get created.
 on_exit 'kill $(cat *.pid)'
 AT_CHECK([ovsdb-server --monitor --pidfile --no-db 2>/dev/null & echo $!],
@@ -149,6 +153,10 @@ AT_SKIP_IF([test "$IS_WIN32" = "yes"])
 # This test intentionally causes SIGSEGV, so make Address Sanitizer ignore it.
 ASAN_OPTIONS=$ASAN_OPTIONS:handle_segv=0; export ASAN_OPTIONS
 
+# Skip it if UB Sanitizer is being used.  There's no way

[ovs-dev] [PATCH v5 5/6] ofp-actions: Use aligned structures when decoding ofp actions.

2022-04-05 Thread Dumitru Ceara

Some openflow actions can be misaligned, e.g., actions whithin OF 1.0
replies to statistics reply messages which have a header of 12 bytes
and no additional padding.

Also, buggy controllers might incorrectly encode actions.

When decoding multiple actions in ofpacts_decode(), make sure that
when advancing to the next action it will be properly aligned
(multiple of OFPACT_ALIGNTO).

Detected by UB Sanitizer when running one of the fuzz tests:
  lib/ofp-actions.c:5347:12: runtime error: member access within misaligned 
address 0x016ba274 for type 'const struct nx_action_learn', which requires 
8 byte alignment
  0x016ba274: note: pointer points here
20 20 20 20 ff ff 00 38  00 00 23 20 00 10 20 20  20 20 20 20 20 20 20 20  
20 20 20 20 00 03 20 00
^
  #0 0x52cece in decode_LEARN_common lib/ofp-actions.c:5347
  #1 0x52dcf6 in decode_NXAST_RAW_LEARN lib/ofp-actions.c:5463
  #2 0x548604 in ofpact_decode lib/ofp-actions.inc2:4723
  #3 0x53ee43 in ofpacts_decode lib/ofp-actions.c:7781
  #4 0x53efc1 in ofpacts_pull_openflow_actions__ lib/ofp-actions.c:7820
  #5 0x5409e1 in ofpacts_pull_openflow_instructions lib/ofp-actions.c:8396
  #6 0x5608a8 in ofputil_decode_flow_stats_reply lib/ofp-flow.c:1100

Signed-off-by: Dumitru Ceara 
---
v5: Addressed Adrian's comments:
- Correctly compute decoded_len.
- Fix ofpbuf_align() comment.
v4: Rebased.
v3: Split out from old patch 07/11.
---
 include/openvswitch/ofp-actions.h |1 
 include/openvswitch/ofpbuf.h  |2 +
 lib/ofp-actions.c |   81 +
 lib/ofpbuf.c  |   39 ++
 4 files changed, 105 insertions(+), 18 deletions(-)

diff --git a/include/openvswitch/ofp-actions.h 
b/include/openvswitch/ofp-actions.h
index 711e7c773d6c..7b57e49ad657 100644
--- a/include/openvswitch/ofp-actions.h
+++ b/include/openvswitch/ofp-actions.h
@@ -203,6 +203,7 @@ BUILD_ASSERT_DECL(sizeof(struct ofpact) == 4);
 /* Alignment. */
 #define OFPACT_ALIGNTO 8
 #define OFPACT_ALIGN(SIZE) ROUND_UP(SIZE, OFPACT_ALIGNTO)
+#define OFPACT_IS_ALIGNED(ADDR) ((uintptr_t) (ADDR) % OFPACT_ALIGNTO == 0)
 #define OFPACT_PADDED_MEMBERS(MEMBERS) PADDED_MEMBERS(OFPACT_ALIGNTO, MEMBERS)
 
 /* Returns the ofpact following 'ofpact'. */
diff --git a/include/openvswitch/ofpbuf.h b/include/openvswitch/ofpbuf.h
index 7b6aba9dc29c..50630c9f9baf 100644
--- a/include/openvswitch/ofpbuf.h
+++ b/include/openvswitch/ofpbuf.h
@@ -111,6 +111,7 @@ void ofpbuf_use_ds(struct ofpbuf *, const struct ds *);
 void ofpbuf_use_stack(struct ofpbuf *, void *, size_t);
 void ofpbuf_use_stub(struct ofpbuf *, void *, size_t);
 void ofpbuf_use_const(struct ofpbuf *, const void *, size_t);
+void ofpbuf_use_data(struct ofpbuf *, const void *, size_t);
 
 void ofpbuf_init(struct ofpbuf *, size_t);
 void ofpbuf_uninit(struct ofpbuf *);
@@ -149,6 +150,7 @@ static inline size_t ofpbuf_msgsize(const struct ofpbuf *);
 void ofpbuf_prealloc_headroom(struct ofpbuf *, size_t);
 void ofpbuf_prealloc_tailroom(struct ofpbuf *, size_t);
 void ofpbuf_trim(struct ofpbuf *);
+void ofpbuf_align(struct ofpbuf *);
 void ofpbuf_padto(struct ofpbuf *, size_t);
 void ofpbuf_shift(struct ofpbuf *, int);
 
diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index 783af84439e2..d31b26583759 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -424,7 +424,8 @@ static void put_reg_load(struct ofpbuf *openflow,
  const struct mf_subfield *, uint64_t value);
 
 static enum ofperr ofpact_pull_raw(struct ofpbuf *, enum ofp_version,
-   enum ofp_raw_action_type *, uint64_t *arg);
+   enum ofp_raw_action_type *, uint64_t *arg,
+   size_t *raw_len);
 static void *ofpact_put_raw(struct ofpbuf *, enum ofp_version,
 enum ofp_raw_action_type, uint64_t arg);
 
@@ -7768,45 +7769,87 @@ check_GOTO_TABLE(const struct ofpact_goto_table *a,
 
 static void
 log_bad_action(const struct ofp_action_header *actions, size_t actions_len,
-   const struct ofp_action_header *bad_action, enum ofperr error)
+   size_t action_offset, enum ofperr error)
 {
 if (!VLOG_DROP_WARN()) {
 struct ds s;
 
 ds_init();
 ds_put_hex_dump(, actions, actions_len, 0, false);
-VLOG_WARN("bad action at offset %#"PRIxPTR" (%s):\n%s",
-  (char *)bad_action - (char *)actions,
+VLOG_WARN("bad action at offset %"PRIuSIZE" (%s):\n%s", action_offset,
   ofperr_get_name(error), ds_cstr());
 ds_destroy();
 }
 }
 
 static enum ofperr
-ofpacts_decode(const void *actions, size_t actions_len,
-   enum ofp_version ofp_version,
-   const struct vl_mff_map *vl_mff_map,
-   uint64_t *ofpacts_tlv_bitmap, struct ofpbuf *ofpacts)
+ofpacts_decode_aligned(struct ofpbuf *openflow, enum ofp_version

[ovs-dev] [PATCH v5 4/6] ofp-errors: Ensure parsed OFPT_ERROR messages are properly aligned.

2022-04-05 Thread Dumitru Ceara

Trim the ofpbuf to ensure proper alignment.

UB Sanitizer report:
  lib/ofp-print.c:1218:24: runtime error: member access within misaligned 
address 0x019229d2 for type 'const struct ofp_header', which requires 4 
byte alignment
  0x019229d2: note: pointer points here
   00 00  5a 5a 05 22 00 3e 00 00  00 09 00 00 00 00 00 00  00 03 05 0d 00 2e 
00 00  00 09 ff ff ff ff
^
  #0 0x7d45cc in ofp_to_string lib/ofp-print.c:1218
  #1 0x774fa8 in ofperr_msg_format lib/ofp-errors.c:253
  #2 0x7d2617 in ofp_print_error_msg lib/ofp-print.c:435
  #3 0x7d3eb7 in ofp_to_string__ lib/ofp-print.c:998
  #4 0x7d47fb in ofp_to_string lib/ofp-print.c:1244
  #5 0x8dcc4b in do_send lib/vconn.c:688
  #6 0x8dca64 in vconn_send lib/vconn.c:671

Signed-off-by: Dumitru Ceara 
---
v5: Rebased.
v4: Rebased.
v3: Split out from old patch 07/11.
---
 lib/ofp-errors.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/ofp-errors.c b/lib/ofp-errors.c
index 35191abf16dd..eb099d355270 100644
--- a/lib/ofp-errors.c
+++ b/lib/ofp-errors.c
@@ -20,6 +20,7 @@
 #include "openflow/openflow.h"
 #include "openflow/nicira-ext.h"
 #include "openvswitch/dynamic-string.h"
+#include "openvswitch/ofp-actions.h"
 #include "openvswitch/ofp-errors.h"
 #include "openvswitch/ofp-msgs.h"
 #include "openvswitch/ofp-print.h"
@@ -346,6 +347,7 @@ ofperr_decode_msg(const struct ofp_header *oh, struct 
ofpbuf *payload)
 if (error && payload) {
 ofpbuf_init(payload, b.size);
 ofpbuf_push(payload, b.data, b.size);
+ofpbuf_trim(payload);
 }
 return error;
 }

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v5 3/6] ofp-actions: Ensure aligned accesses to masked fields.

2022-04-05 Thread Dumitru Ceara

For example is parsing the OVN "eth.dst[40] = 1;" action, which
triggered the following warning from UndefinedBehaviorSanitizer:

  lib/meta-flow.c:3210:9: runtime error: member access within misaligned 
address 0x00de4e36 for type 'const union mf_value', which requires 8 byte 
alignment
  0x00de4e36: note: pointer points here
   00 00 00 00 01 00  00 00 00 00 00 00 00 00  70 4e de 00 00 00 00 00  10 51 
de 00 00 00 00 00  c0 4f
   ^
  #0 0x5818bc in mf_format lib/meta-flow.c:3210
  #1 0x5b6047 in format_SET_FIELD lib/ofp-actions.c:3342
  #2 0x5d68ab in ofpact_format lib/ofp-actions.c:9213
  #3 0x5d6ee0 in ofpacts_format lib/ofp-actions.c:9237
  #4 0x410922 in test_parse_actions tests/test-ovn.c:1360

To avoid this we now change the internal representation of the set_field
actions, struct ofpact_set_field, such that the mask is always stored
at a correctly aligned address, multiple of OFPACT_ALIGNTO.

We also need to adapt the "ovs-ofctl show-flows - Oversized flow" test
because now the ofpact representation of the set_field action uses more
bytes in memory (for the extra alignment).  Change the test to use
dec_ttl instead.

Signed-off-by: Dumitru Ceara 
---
v5: Implemented Adrian's suggestion to always align the 'mask' within
'struct ofpact_set_field' to 8 bytes.
v4: Rebase.
v3: Split out from old patch 07/11.
---
 include/openvswitch/ofp-actions.h |   10 ++
 lib/ofp-actions.c |   19 +--
 tests/ovs-ofctl.at|2 +-
 3 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/include/openvswitch/ofp-actions.h 
b/include/openvswitch/ofp-actions.h
index b7231c7bb334..711e7c773d6c 100644
--- a/include/openvswitch/ofp-actions.h
+++ b/include/openvswitch/ofp-actions.h
@@ -549,7 +549,8 @@ struct ofpact_set_field {
 const struct mf_field *field;
 );
 union mf_value value[];  /* Significant value bytes followed by
-  * significant mask bytes. */
+  * significant mask bytes aligned at
+  * OFPACT_ALIGNTO bytes. */
 };
 BUILD_ASSERT_DECL(offsetof(struct ofpact_set_field, value)
   % OFPACT_ALIGNTO == 0);
@@ -557,9 +558,10 @@ BUILD_ASSERT_DECL(offsetof(struct ofpact_set_field, value)
   == sizeof(struct ofpact_set_field));
 
 /* Use macro to not have to deal with constness. */
-#define ofpact_set_field_mask(SF)   \
-ALIGNED_CAST(union mf_value *,  \
- (uint8_t *)(SF)->value + (SF)->field->n_bytes)
+#define ofpact_set_field_mask(SF)   \
+ALIGNED_CAST(union mf_value *,  \
+ (uint8_t *)(SF)->value +   \
+ROUND_UP((SF)->field->n_bytes, OFPACT_ALIGNTO))
 
 /* OFPACT_PUSH_VLAN/MPLS/PBB
  *
diff --git a/lib/ofp-actions.c b/lib/ofp-actions.c
index 4ada0f4a3c49..783af84439e2 100644
--- a/lib/ofp-actions.c
+++ b/lib/ofp-actions.c
@@ -3376,21 +3376,28 @@ check_SET_FIELD(struct ofpact_set_field *a,
 return 0;
 }
 
-/* Appends an OFPACT_SET_FIELD ofpact with enough space for the value and mask
- * for the 'field' to 'ofpacts' and returns it.  Fills in the value from
- * 'value', if non-NULL, and mask from 'mask' if non-NULL.  If 'value' is
- * non-NULL and 'mask' is NULL, an all-ones mask will be filled in. */
+/* Appends an OFPACT_SET_FIELD ofpact with enough space for the value and a
+ * properly aligned mask for the 'field' to 'ofpacts' and returns it.  Fills
+ * in the value from 'value', if non-NULL, and mask from 'mask' if non-NULL.
+ * If 'value' is non-NULL and 'mask' is NULL, an all-ones mask will be
+ * filled in. */
 struct ofpact_set_field *
 ofpact_put_set_field(struct ofpbuf *ofpacts, const struct mf_field *field,
  const void *value, const void *mask)
 {
+/* Ensure there's enough space for:
+ * - value (8-byte aligned)
+ * - mask  (8-byte aligned)
+ * - padding (to make the whole ofpact_set_field 8-byte aligned)
+ */
+size_t total_size = 2 * ROUND_UP(field->n_bytes, OFPACT_ALIGNTO);
 struct ofpact_set_field *sf = ofpact_put_SET_FIELD(ofpacts);
 sf->field = field;
 
 /* Fill in the value and mask if given, otherwise put zeroes so that the
  * caller may fill in the value and mask itself. */
 if (value) {
-ofpbuf_put_uninit(ofpacts, 2 * field->n_bytes);
+ofpbuf_put_uninit(ofpacts, total_size);
 sf = ofpacts->header;
 memcpy(sf->value, value, field->n_bytes);
 if (mask) {
@@ -3399,7 +3406,7 @@ ofpact_put_set_field(struct ofpbuf *ofpacts, const struct 
mf_field *field,
 memset(ofpact_set_field_mask(sf), 0xff, field->n_bytes);
 }
 } else {
-ofpbuf_put_zeros(ofpacts, 2 * field->n_bytes);
+

[ovs-dev] [PATCH v5 2/6] treewide: Avoid offsetting NULL pointers.

2022-04-05 Thread Dumitru Ceara

This is undefined behavior and was reported by UB Sanitizer:
  lib/meta-flow.c:3445:16: runtime error: member access within null pointer of 
type 'struct vl_mf_field'
  #0 0x6aad0f in mf_get_vl_mff lib/meta-flow.c:3445
  #1 0x6d96d7 in mf_from_oxm_header lib/nx-match.c:260
  #2 0x6d9e2e in nx_pull_header__ lib/nx-match.c:341
  #3 0x6daafa in nx_pull_header lib/nx-match.c:488
  #4 0x6abcb6 in mf_vl_mff_nx_pull_header lib/meta-flow.c:3605
  #5 0x73b9be in decode_NXAST_RAW_REG_MOVE lib/ofp-actions.c:2652
  #6 0x764ccd in ofpact_decode lib/ofp-actions.inc2:4681
  [...]
  lib/sset.c:315:12: runtime error: applying zero offset to null pointer
  #0 0xcc2e6a in sset_at_position /root/ovs/lib/sset.c:315:12
  #1 0x5734b3 in port_dump_next /root/ovs/ofproto/ofproto-dpif.c:4083:20
  [...]
  lib/ovsdb-data.c:2194:56: runtime error: applying zero offset to null pointer
  #0 0x5e9530 in ovsdb_datum_added_removed 
/root/ovs/lib/ovsdb-data.c:2194:56
  #1 0x4d6258 in update_row_ref_count /root/ovs/ovsdb/transaction.c:335:17
  #2 0x4c360b in for_each_txn_row /root/ovs/ovsdb/transaction.c:1572:33
  [...]
  lib/ofpbuf.c:440:30: runtime error: applying zero offset to null pointer
  #0 0x75066d in ofpbuf_push_uninit lib/ofpbuf.c:440
  #1 0x46ac8a in ovnacts_parse lib/actions.c:4190
  #2 0x46ad91 in ovnacts_parse_string lib/actions.c:4208
  #3 0x4106d1 in test_parse_actions tests/test-ovn.c:1324
  [...]
  lib/ofp-actions.c:3205:22: runtime error: applying non-zero offset 2 to null 
pointer
  #0 0x6e1641 in set_field_split_str /root/ovs/lib/ofp-actions.c:3205:22
  [...]
  lib/tnl-ports.c:74:12: runtime error: applying zero offset to null pointer
  #0 0xceffe7 in tnl_port_cast /root/ovs/lib/tnl-ports.c:74:12
  #1 0xcf14c3 in map_insert /root/ovs/lib/tnl-ports.c:116:13
  [...]
  ofproto/ofproto.c:8905:16: runtime error: applying zero offset to null pointer
  #0 0x556795 in eviction_group_hash_rule 
/root/ovs/ofproto/ofproto.c:8905:16
  #1 0x503f8d in eviction_group_add_rule /root/ovs/ofproto/ofproto.c:9022:42
  [...]

Also, it's valid to have an empty ofpact list and we should be able to
try to iterate through it.

UB Sanitizer report:
  include/openvswitch/ofp-actions.h:222:12: runtime error: applying zero offset 
to null pointer
  #0 0x665d69 in ofpact_end 
/root/ovs/./include/openvswitch/ofp-actions.h:222:12
  #1 0x66b2cf in ofpacts_put_openflow_actions 
/root/ovs/lib/ofp-actions.c:8861:5
  #2 0x6ffdd1 in ofputil_encode_flow_mod /root/ovs/lib/ofp-flow.c:447:9
  [...]

Signed-off-by: Dumitru Ceara 
---
v5:
- Rebase.
v4:
- Addressed Ilya's comments.
---
 include/openvswitch/ofp-actions.h |4 +++-
 include/openvswitch/ofpbuf.h  |   17 -
 lib/dynamic-string.c  |8 ++--
 lib/meta-flow.c   |4 +++-
 lib/ofp-actions.c |8 
 lib/ofpbuf.c  |4 
 lib/ovsdb-data.c  |   37 +
 lib/ovsdb-data.h  |4 
 lib/sset.c|4 +++-
 lib/tnl-ports.c   |2 +-
 ofproto/ofproto.c |2 +-
 11 files changed, 62 insertions(+), 32 deletions(-)

diff --git a/include/openvswitch/ofp-actions.h 
b/include/openvswitch/ofp-actions.h
index 41bcb55d2056..b7231c7bb334 100644
--- a/include/openvswitch/ofp-actions.h
+++ b/include/openvswitch/ofp-actions.h
@@ -218,7 +218,9 @@ struct ofpact *ofpact_next_flattened(const struct ofpact *);
 static inline struct ofpact *
 ofpact_end(const struct ofpact *ofpacts, size_t ofpacts_len)
 {
-return ALIGNED_CAST(struct ofpact *, (uint8_t *) ofpacts + ofpacts_len);
+return ofpacts
+   ? ALIGNED_CAST(struct ofpact *, (uint8_t *) ofpacts + ofpacts_len)
+   : NULL;
 }
 
 static inline bool
diff --git a/include/openvswitch/ofpbuf.h b/include/openvswitch/ofpbuf.h
index 1136ba04c84e..7b6aba9dc29c 100644
--- a/include/openvswitch/ofpbuf.h
+++ b/include/openvswitch/ofpbuf.h
@@ -179,7 +179,9 @@ static inline void ofpbuf_delete(struct ofpbuf *b)
 static inline void *ofpbuf_at(const struct ofpbuf *b, size_t offset,
   size_t size)
 {
-return offset + size <= b->size ? (char *) b->data + offset : NULL;
+return (offset + size <= b->size) && b->data
+   ? (char *) b->data + offset
+   : NULL;
 }
 
 /* Returns a pointer to byte 'offset' in 'b', which must contain at least
@@ -188,20 +190,20 @@ static inline void *ofpbuf_at_assert(const struct ofpbuf 
*b, size_t offset,
  size_t size)
 {
 ovs_assert(offset + size <= b->size);
-return ((char *) b->data) + offset;
+return b->data ? (char *) b->data + offset : NULL;
 }
 
 /* Returns a pointer to byte following the last byte of data in use in 'b'. */
 static inline void *ofpbuf_tail(const struct ofpbuf *b)
 {

[ovs-dev] [PATCH v5 1/6] treewide: Fix invalid bit shift operations.

2022-04-05 Thread Dumitru Ceara

UB Sanitizer reports:
  tests/test-hash.c:59:40: runtime error: shift exponent 64 is too large for 
64-bit type 'long unsigned int'
  #0 0x44c3c9 in get_range128 tests/test-hash.c:59
  #1 0x44cb2e in check_hash_bytes128 tests/test-hash.c:178
  #2 0x44d14d in test_hash_main tests/test-hash.c:282
  [...]
  ofproto/ofproto-dpif-xlate.c:5607:45: runtime error: left shift of 65535 by 
16 places cannot be represented in type 'int'
  #0 0x53fe9f in xlate_sample_action ofproto/ofproto-dpif-xlate.c:5607
  #1 0x54d625 in do_xlate_actions ofproto/ofproto-dpif-xlate.c:7160
  #2 0x553b76 in xlate_actions ofproto/ofproto-dpif-xlate.c:7806
  #3 0x4fcb49 in upcall_xlate ofproto/ofproto-dpif-upcall.c:1237
  #4 0x4fe02f in process_upcall ofproto/ofproto-dpif-upcall.c:1456
  #5 0x4fda99 in upcall_cb ofproto/ofproto-dpif-upcall.c:1358
  [...]
  tests/test-util.c:89:23: runtime error: left shift of 1 by 31 places cannot 
be represented in type 'int'
  #0 0x476415 in test_ctz tests/test-util.c:89
  [...]
  lib/dpif-netlink.c:396:33: runtime error: left shift of 1 by 31 places cannot 
be represented in type 'int'
  #0 0x571b9f in dpif_netlink_open lib/dpif-netlink.c:396

Acked-by: Aaron Conole 
Acked-by: Paolo Valerio 
Signed-off-by: Dumitru Ceara 
---
v5:
- Rebase.
v4:
- Added Paolo's ack.
- Addressed Ilya's comments.
v3:
- Added Aaron's ack.
---
 lib/dpif-netlink.c   |2 +-
 ofproto/ofproto-dpif-xlate.c |3 ++-
 tests/test-hash.c|3 +++
 tests/test-util.c|   13 ++---
 4 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 71e35ccddaae..06e1e8ca0283 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -85,7 +85,7 @@ enum { MAX_PORTS = USHRT_MAX };
 #define EPOLLEXCLUSIVE (1u << 28)
 #endif
 
-#define OVS_DP_F_UNSUPPORTED (1 << 31);
+#define OVS_DP_F_UNSUPPORTED (1u << 31);
 
 /* This PID is not used by the kernel datapath when using dispatch per CPU,
  * but it is required to be set (not zero). */
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index bfd4960ddb8f..792d9a3cd0eb 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -5641,7 +5641,8 @@ xlate_sample_action(struct xlate_ctx *ctx,
 
 /* Scale the probability from 16-bit to 32-bit while representing
  * the same percentage. */
-uint32_t probability = (os->probability << 16) | os->probability;
+uint32_t probability =
+((uint32_t) os->probability << 16) | os->probability;
 
 /* If ofp_port in flow sample action is equel to ofp_port,
  * this sample action is a input port action. */
diff --git a/tests/test-hash.c b/tests/test-hash.c
index 5d3f8ea43f65..aec5f580bb75 100644
--- a/tests/test-hash.c
+++ b/tests/test-hash.c
@@ -55,6 +55,9 @@ set_bit128(ovs_u128 *values, int bit, int n_bits)
 static uint64_t
 get_range128(ovs_u128 *value, int ofs, uint64_t mask)
 {
+if (ofs == 0) {
+return value->u64.lo & mask;
+}
 return ((ofs < 64 ? (value->u64.lo >> ofs) : 0) & mask)
 | ((ofs <= 64 ? (value->u64.hi << (64 - ofs)) : (value->u64.hi >> (ofs 
- 64)) & mask));
 }
diff --git a/tests/test-util.c b/tests/test-util.c
index f0fd0421086b..7d899fbbfd92 100644
--- a/tests/test-util.c
+++ b/tests/test-util.c
@@ -43,17 +43,16 @@ check_log_2_floor(uint32_t x, int n)
 static void
 test_log_2_floor(struct ovs_cmdl_context *ctx OVS_UNUSED)
 {
-int n;
-
-for (n = 0; n < 32; n++) {
+for (uint32_t n = 0; n < 32; n++) {
 /* Check minimum x such that f(x) == n. */
-check_log_2_floor(1 << n, n);
+check_log_2_floor(UINT32_C(1) << n, n);
 
 /* Check maximum x such that f(x) == n. */
-check_log_2_floor((1 << n) | ((1 << n) - 1), n);
+check_log_2_floor((UINT32_C(1) << n) | ((UINT32_C(1) << n) - 1), n);
 
 /* Check a random value in the middle. */
-check_log_2_floor((random_uint32() & ((1 << n) - 1)) | (1 << n), n);
+check_log_2_floor((random_uint32() & ((UINT32_C(1) << n) - 1))
+  | (UINT32_C(1) << n), n);
 }
 
 /* log_2_floor(0) is undefined, so don't check it. */
@@ -86,7 +85,7 @@ test_ctz(struct ovs_cmdl_context *ctx OVS_UNUSED)
 
 for (n = 0; n < 32; n++) {
 /* Check minimum x such that f(x) == n. */
-check_ctz32(1 << n, n);
+check_ctz32(UINT32_C(1) << n, n);
 
 /* Check maximum x such that f(x) == n. */
 check_ctz32(UINT32_MAX << n, n);

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH v5 0/6] Fix UndefinedBehaviorSanitizer reported issues and enable it in CI.

2022-04-05 Thread Dumitru Ceara

As privately reported by Aaron Conole, and by Jeffrey Walton [0]
there's currently a number of undefined behavior instances in
the OVS code base.  Running the OVS (and OVN) tests with UBSan [1]
enabled uncovers these.

This series fixes the issues reported by UBSan and, through the last
patch, enables UBSan tests in GitHub Actions CI.

[0] https://mail.openvswitch.org/pipermail/ovs-dev/2022-January/390894.html
[1] https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html
[2] https://patchwork.ozlabs.org/project/openvswitch/list/?series=286494=*

Changes in v5:
- Rebased, dropped patch 'util: Avoid UB when iterating collections.',
  the proper fix from Adrian was accepted.
- Addressed Adrian's comments.

Changes in v4:
- Rebased, dropping patches that were already applied.
- Addressed Ilya's comments.
- Added acks from Paolo from v3 as nothing major changed in this revision.
- Rephrased the commit message for "util: Avoid UB when iterating collections."

Changes in v3:
- Added acks to the patches acked by Aaron.
- Addressed Aaron's comments.
- Split previous patch 07/11 "ofp-actions: ofp-errors: Use aligned
  structures when decoding ofp actions." into three separate patches
  addressing independent issues.
- Reordered patches such that the ones that might need follow up are
  towards the end of the series.
- Added a new patch, patch 13/14, fixing a typo in the CFLAGS_ASAN
  variables in the script used for building OVS in CI.  This typo was
  originally copy/pasted in the CFLAGS_UBSAN flags in v1 and v2 of
  this series.

Changes in v2:
- Patch 3/11:
  - Remove cache line size aligment markers instead, as suggested by
Ilya.

Dumitru Ceara (6):
  treewide: Fix invalid bit shift operations.
  treewide: Avoid offsetting NULL pointers.
  ofp-actions: Ensure aligned accesses to masked fields.
  ofp-errors: Ensure parsed OFPT_ERROR messages are properly aligned.
  ofp-actions: Use aligned structures when decoding ofp actions.
  ci: Add UB Sanitizer.

 .ci/linux-build.sh   |6 ++
 .github/workflows/build-and-test.yml |5 ++
 configure.ac |1 
 include/openvswitch/ofp-actions.h|   15 +++--
 include/openvswitch/ofpbuf.h |   19 --
 lib/dpif-netlink.c   |2 -
 lib/dynamic-string.c |8 ++-
 lib/meta-flow.c  |4 +
 lib/ofp-actions.c|  108 +-
 lib/ofp-errors.c |2 +
 lib/ofpbuf.c |   43 ++
 lib/ovsdb-data.c |   37 +++-
 lib/ovsdb-data.h |4 +
 lib/sset.c   |4 +
 lib/tnl-ports.c  |2 -
 ofproto/ofproto-dpif-xlate.c |3 +
 ofproto/ofproto.c|2 -
 tests/atlocal.in |   16 +
 tests/automake.mk|1 
 tests/daemon.at  |8 +++
 tests/ovs-macros.at  |5 ++
 tests/ovs-ofctl.at   |2 -
 tests/ovsdb-server.at|   16 +
 tests/test-hash.c|3 +
 tests/test-util.c|   13 ++--
 25 files changed, 259 insertions(+), 70 deletions(-)


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] Missing the IP Header Identification Field in tunnels.

2022-04-05 Thread Dincer Beken

Hello,

On the GTP-U implementation in the userspace, I see that the IP Header 
Identification Field is not set in the encapsulated packets. This is not like 
this in the Kernel Implementation. I don't get replies when I send these 
tunneled packets.

>From the Specification, I see that this fiedl should be unique within a window.

Does anybody have an idea about this?

Regards,
Dincer



Dincer Beken
Software Developer

t: +49 8331 99 59-852

e: dbe...@blackned.de
w: https://blackned.de


[cid:blackned_d1650eb0-ca34-4593-90da-c73d0e84f2cd.png]

[cid:linkedin_d8fe2332-efbd-461a-923a-3153732a489e.png]


blackned GmbH · zugspitzstraße 1 · 87751 heimertingen
geschäftsführer: timo haas · hrb memmingen 17319

Diese eMail enthält vertrauliche und/oder rechtlich geschützte Informationen.
Wenn Sie nicht der richtige Adressat sind oder diese eMail irrtümlich erhalten 
haben, informieren Sie bitte sofort den Absender und löschen Sie diese eMail.
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser eMail ist nicht 
gestattet.

This eMail may contain confidential and/or privileged information.
If you are not the intended recipient (or have received this eMail in error) 
please notify the sender immediately and delete this eMail.
Any unauthorized copying, disclosure or distribution of the contents in this 
eMail is strictly forbidden.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] OVS DPDK DMA-Dev library/Design Discussion

2022-04-05 Thread Bruce Richardson

On Tue, Apr 05, 2022 at 01:29:25PM +0200, Ilya Maximets wrote:
> On 3/30/22 16:09, Bruce Richardson wrote:
> > On Wed, Mar 30, 2022 at 01:41:34PM +0200, Ilya Maximets wrote:
> >> On 3/30/22 13:12, Bruce Richardson wrote:
> >>> On Wed, Mar 30, 2022 at 12:52:15PM +0200, Ilya Maximets wrote:
>  On 3/30/22 12:41, Ilya Maximets wrote:
> > Forking the thread to discuss a memory consistency/ordering model.
> >
> > AFAICT, dmadev can be anything from part of a CPU to a completely
> > separate PCI device.  However, I don't see any memory ordering being
> > enforced or even described in the dmadev API or documentation.
> > Please, point me to the correct documentation, if I somehow missed it.
> >
> > We have a DMA device (A) and a CPU core (B) writing respectively
> > the data and the descriptor info.  CPU core (C) is reading the
> > descriptor and the data it points too.
> >
> > A few things about that process:
> >
> > 1. There is no memory barrier between writes A and B (Did I miss
> >them?).  Meaning that those operations can be seen by C in a
> >different order regardless of barriers issued by C and regardless
> >of the nature of devices A and B.
> >
> > 2. Even if there is a write barrier between A and B, there is
> >no guarantee that C will see these writes in the same order
> >as C doesn't use real memory barriers because vhost advertises
> 
>  s/advertises/does not advertise/
> 
> >VIRTIO_F_ORDER_PLATFORM.
> >
> > So, I'm getting to conclusion that there is a missing write barrier
> > on the vhost side and vhost itself must not advertise the
> 
>  s/must not/must/
> 
>  Sorry, I wrote things backwards. :)
> 
> > VIRTIO_F_ORDER_PLATFORM, so the virtio driver can use actual memory
> > barriers.
> >
> > Would like to hear some thoughts on that topic.  Is it a real issue?
> > Is it an issue considering all possible CPU architectures and DMA
> > HW variants?
> >
> >>>
> >>> In terms of ordering of operations using dmadev:
> >>>
> >>> * Some DMA HW will perform all operations strictly in order e.g. Intel
> >>>   IOAT, while other hardware may not guarantee order of operations/do
> >>>   things in parallel e.g. Intel DSA. Therefore the dmadev API provides the
> >>>   fence operation which allows the order to be enforced. The fence can be
> >>>   thought of as a full memory barrier, meaning no jobs after the barrier 
> >>> can
> >>>   be started until all those before it have completed. Obviously, for HW
> >>>   where order is always enforced, this will be a no-op, but for hardware 
> >>> that
> >>>   parallelizes, we want to reduce the fences to get best performance.
> >>>
> >>> * For synchronization between DMA devices and CPUs, where a CPU can only
> >>>   write after a DMA copy has been done, the CPU must wait for the dma
> >>>   completion to guarantee ordering. Once the completion has been returned
> >>>   the completed operation is globally visible to all cores.
> >>
> >> Thanks for explanation!  Some questions though:
> >>
> >> In our case one CPU waits for completion and another CPU is actually using
> >> the data.  IOW, "CPU must wait" is a bit ambiguous.  Which CPU must wait?
> >>
> >> Or should it be "Once the completion is visible on any core, the completed
> >> operation is globally visible to all cores." ?
> >>
> > 
> > The latter.
> > Once the change to memory/cache is visible to any core, it is visible to
> > all ones. This applies to regular CPU memory writes too - at least on IA,
> > and I expect on many other architectures - once the write is visible
> > outside the current core it is visible to every other core. Once the data
> > hits the l1 or l2 cache of any core, any subsequent requests for that data
> > from any other core will "snoop" the latest data from the cores cache, even
> > if it has not made its way down to a shared cache, e.g. l3 on most IA
> > systems.
> 
> It sounds like you're referring to the "multicopy atomicity" of the
> architecture.  However, that is not universally supported thing.
> AFAICT, POWER and older ARM systems doesn't support it, so writes
> performed by one core are not necessarily available to all other
> cores at the same time.  That means that if the CPU0 writes the data
> and the completion flag, CPU1 reads the completion flag and writes
> the ring, CPU2 may see the ring write, but may still not see the
> write of the data, even though there was a control dependency on CPU1.
> There should be a full memory barrier on CPU1 in order to fulfill
> the memory ordering requirements for CPU2, IIUC.
> 
> In our scenario the CPU0 is a DMA device, which may or may not be
> part of a CPU and may have different memory consistency/ordering
> requirements.  So, the question is: does DPDK DMA API guarantee
> multicopy atomicity between DMA device and all CPU cores regardless
> of CPU

Re: [ovs-dev] OVS DPDK DMA-Dev library/Design Discussion

2022-04-05 Thread Ilya Maximets

On 3/30/22 16:09, Bruce Richardson wrote:
> On Wed, Mar 30, 2022 at 01:41:34PM +0200, Ilya Maximets wrote:
>> On 3/30/22 13:12, Bruce Richardson wrote:
>>> On Wed, Mar 30, 2022 at 12:52:15PM +0200, Ilya Maximets wrote:
 On 3/30/22 12:41, Ilya Maximets wrote:
> Forking the thread to discuss a memory consistency/ordering model.
>
> AFAICT, dmadev can be anything from part of a CPU to a completely
> separate PCI device.  However, I don't see any memory ordering being
> enforced or even described in the dmadev API or documentation.
> Please, point me to the correct documentation, if I somehow missed it.
>
> We have a DMA device (A) and a CPU core (B) writing respectively
> the data and the descriptor info.  CPU core (C) is reading the
> descriptor and the data it points too.
>
> A few things about that process:
>
> 1. There is no memory barrier between writes A and B (Did I miss
>them?).  Meaning that those operations can be seen by C in a
>different order regardless of barriers issued by C and regardless
>of the nature of devices A and B.
>
> 2. Even if there is a write barrier between A and B, there is
>no guarantee that C will see these writes in the same order
>as C doesn't use real memory barriers because vhost advertises

 s/advertises/does not advertise/

>VIRTIO_F_ORDER_PLATFORM.
>
> So, I'm getting to conclusion that there is a missing write barrier
> on the vhost side and vhost itself must not advertise the

 s/must not/must/

 Sorry, I wrote things backwards. :)

> VIRTIO_F_ORDER_PLATFORM, so the virtio driver can use actual memory
> barriers.
>
> Would like to hear some thoughts on that topic.  Is it a real issue?
> Is it an issue considering all possible CPU architectures and DMA
> HW variants?
>
>>>
>>> In terms of ordering of operations using dmadev:
>>>
>>> * Some DMA HW will perform all operations strictly in order e.g. Intel
>>>   IOAT, while other hardware may not guarantee order of operations/do
>>>   things in parallel e.g. Intel DSA. Therefore the dmadev API provides the
>>>   fence operation which allows the order to be enforced. The fence can be
>>>   thought of as a full memory barrier, meaning no jobs after the barrier can
>>>   be started until all those before it have completed. Obviously, for HW
>>>   where order is always enforced, this will be a no-op, but for hardware 
>>> that
>>>   parallelizes, we want to reduce the fences to get best performance.
>>>
>>> * For synchronization between DMA devices and CPUs, where a CPU can only
>>>   write after a DMA copy has been done, the CPU must wait for the dma
>>>   completion to guarantee ordering. Once the completion has been returned
>>>   the completed operation is globally visible to all cores.
>>
>> Thanks for explanation!  Some questions though:
>>
>> In our case one CPU waits for completion and another CPU is actually using
>> the data.  IOW, "CPU must wait" is a bit ambiguous.  Which CPU must wait?
>>
>> Or should it be "Once the completion is visible on any core, the completed
>> operation is globally visible to all cores." ?
>>
> 
> The latter.
> Once the change to memory/cache is visible to any core, it is visible to
> all ones. This applies to regular CPU memory writes too - at least on IA,
> and I expect on many other architectures - once the write is visible
> outside the current core it is visible to every other core. Once the data
> hits the l1 or l2 cache of any core, any subsequent requests for that data
> from any other core will "snoop" the latest data from the cores cache, even
> if it has not made its way down to a shared cache, e.g. l3 on most IA
> systems.

It sounds like you're referring to the "multicopy atomicity" of the
architecture.  However, that is not universally supported thing.
AFAICT, POWER and older ARM systems doesn't support it, so writes
performed by one core are not necessarily available to all other
cores at the same time.  That means that if the CPU0 writes the data
and the completion flag, CPU1 reads the completion flag and writes
the ring, CPU2 may see the ring write, but may still not see the
write of the data, even though there was a control dependency on CPU1.
There should be a full memory barrier on CPU1 in order to fulfill
the memory ordering requirements for CPU2, IIUC.

In our scenario the CPU0 is a DMA device, which may or may not be
part of a CPU and may have different memory consistency/ordering
requirements.  So, the question is: does DPDK DMA API guarantee
multicopy atomicity between DMA device and all CPU cores regardless
of CPU architecture and a nature of the DMA device?

> 
>> And the main question:
>>   Are these synchronization claims documented somewhere?
>>
> 
> Not explicitly, no. However, the way DMA devices works in the regard if
> global observability is absolutely no different from

Re: [ovs-dev] [PATCH v3 8/9] netdev-offload-tc: Check for none offloadable ct_state flag combination

2022-04-05 Thread Ilya Maximets

On 4/5/22 12:24, Eelco Chaudron wrote:
> 
> 
> On 4 Apr 2022, at 21:05, Ilya Maximets wrote:
> 
>> On 3/30/22 11:35, Eelco Chaudron wrote:
>>>
>>>
>>> On 29 Mar 2022, at 13:06, Ilya Maximets wrote:
>>>
 On 2/25/22 13:29, Marcelo Ricardo Leitner wrote:
> On Thu, Feb 24, 2022 at 03:20:23PM +0100, Eelco Chaudron wrote:
>>
>>
>> On 23 Feb 2022, at 17:23, Marcelo Ricardo Leitner wrote:
>>
>>> On Tue, Feb 22, 2022 at 04:26:10PM +0100, Eelco Chaudron wrote:
 This patch checks for none offloadable ct_state match flag 
 combinations.
 If they exist force the +trk flag down to TC Flower

 Signed-off-by: Eelco Chaudron 
 ---
 v3:
  - Instead of warning about an invalid flag combination fix it.

  lib/netdev-offload-tc.c |6 ++
  1 file changed, 6 insertions(+)

 diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
 index 0105d883f..3d2c1d844 100644
 --- a/lib/netdev-offload-tc.c
 +++ b/lib/netdev-offload-tc.c
 @@ -1541,6 +1541,12 @@ parse_match_ct_state_to_flower(struct tc_flower 
 *flower, struct match *match)
  flower->key.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
  flower->mask.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
  }
 +
 +if (flower->key.ct_state &&
 +!(flower->key.ct_state & 
 TCA_FLOWER_KEY_CT_FLAGS_TRACKED)) {
 +flower->key.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
 +flower->mask.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
 +}
>>>
>>> Instead, what about:
>>
>> I guess we still need the patch in OVS, as older kernels will exist ;)
>
> Hah, indeed!

 My main concern around such changes is that revlaidators might
 cycle that flow back a forth if the match is not right, so we
 should be very careful.

 This one seems safe as it only makes the match more strict and
 revalidators will not remove such flow.
>>>
>>> I did test for this scenario, and as you already concluded this is not an 
>>> issue for this specific change.
>>>
>>>
 However, I'm not sure this workaround covers all the cases.
 The kernel part inside the fl_validate_ct_state tests masked key,
 not the raw one.  So, in case +trk set in the key, but not set
 in the mask, the code above will not add the mask bit and the
 kernel will reject the flow.
>>>
>>> I guess you mean the following situations in fl_validate_ct_state, where 
>>> the mask/flag are set:
>>>
>>>
>>> - TCA_FLOWER_KEY_CT_FLAGS_NEW && TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED  -> 
>>> mutual exclusive
>>>   OVS handles this since day one, 576126a9, and does this by removing the 
>>> TCA_FLOWER_KEY_CT_FLAGS_NEW flag before programming.
>>>   Not sure why they added this in the first place, maybe there is a corner 
>>> case?
>>>
>>>
>>> - TCA_FLOWER_KEY_CT_FLAGS_NEW && TCA_FLOWER_KEY_CT_FLAGS_REPLY -> mutual 
>>> exlusive
>>> - TCA_FLOWER_KEY_CT_FLAGS_INVALID only TCA_FLOWER_KEY_CT_FLAGS_TRACKED can 
>>> be set
>>>
>>> These two are currently not handled by OVS.
>>>
>>> I guess as the documentation already explains these three combinations 
>>> would never happen, and thus in a real-life scenario, this will never 
>>> result in a match.
>>>
>>>
>>> I can do a follow-up patch once I update the system test cases to work with 
>>> TC, to handle these three (two) cases.
>>
>> What I meant was:
>>
>> flower->key.ct_state  = TCA_FLOWER_KEY_CT_FLAGS_TRACKED | 
>> TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED
>> flower->mask.ct_state = TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED
> 
> Maybe I’m still not understanding you correctly, but the above can not 
> happen. The key.ct_state will only be set if the corresponding mask is set.
> 
> 1504  if (mask->ct_state & OVS_CS_F_TRACKED) {
> 1505  if (key->ct_state & OVS_CS_F_TRACKED) {
> 1506  flower->key.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
> 1507  }
> 1508  flower->mask.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
> 1509  mask->ct_state &= ~OVS_CS_F_TRACKED;
> 1510  }

Oh.  You're right.  Sorry, I was thinking in terms of ofproto-generated
keys and masks, but forgot that flower keys/masks are already processed
versions of them.  Thanks for pointing that out.

> 
>> So, ofproto layer requests to match only on the 
>> TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED,
>> while TCA_FLOWER_KEY_CT_FLAGS_TRACKED is set only in the key.
> 
> The TC match is only made on the masked bits but ofproto, all none mask bits 
> are not matched.
> 
>> Kernel will take (key.ct_state & mask.ct_state) and will fail the validation,
>> because TCA_FLOWER_KEY_CT_FLAGS_TRACKED will not be set.
> 
> My above patch will take care of this, as it will set the TRACKED key and 
> mask.
> 
>> Does that

Re: [ovs-dev] [PATCH v3 8/9] netdev-offload-tc: Check for none offloadable ct_state flag combination

2022-04-05 Thread Eelco Chaudron



On 4 Apr 2022, at 21:05, Ilya Maximets wrote:

> On 3/30/22 11:35, Eelco Chaudron wrote:
>>
>>
>> On 29 Mar 2022, at 13:06, Ilya Maximets wrote:
>>
>>> On 2/25/22 13:29, Marcelo Ricardo Leitner wrote:
 On Thu, Feb 24, 2022 at 03:20:23PM +0100, Eelco Chaudron wrote:
>
>
> On 23 Feb 2022, at 17:23, Marcelo Ricardo Leitner wrote:
>
>> On Tue, Feb 22, 2022 at 04:26:10PM +0100, Eelco Chaudron wrote:
>>> This patch checks for none offloadable ct_state match flag combinations.
>>> If they exist force the +trk flag down to TC Flower
>>>
>>> Signed-off-by: Eelco Chaudron 
>>> ---
>>> v3:
>>>  - Instead of warning about an invalid flag combination fix it.
>>>
>>>  lib/netdev-offload-tc.c |6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
>>> index 0105d883f..3d2c1d844 100644
>>> --- a/lib/netdev-offload-tc.c
>>> +++ b/lib/netdev-offload-tc.c
>>> @@ -1541,6 +1541,12 @@ parse_match_ct_state_to_flower(struct tc_flower 
>>> *flower, struct match *match)
>>>  flower->key.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
>>>  flower->mask.ct_state &= ~(TCA_FLOWER_KEY_CT_FLAGS_NEW);
>>>  }
>>> +
>>> +if (flower->key.ct_state &&
>>> +!(flower->key.ct_state & TCA_FLOWER_KEY_CT_FLAGS_TRACKED)) 
>>> {
>>> +flower->key.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
>>> +flower->mask.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
>>> +}
>>
>> Instead, what about:
>
> I guess we still need the patch in OVS, as older kernels will exist ;)

 Hah, indeed!
>>>
>>> My main concern around such changes is that revlaidators might
>>> cycle that flow back a forth if the match is not right, so we
>>> should be very careful.
>>>
>>> This one seems safe as it only makes the match more strict and
>>> revalidators will not remove such flow.
>>
>> I did test for this scenario, and as you already concluded this is not an 
>> issue for this specific change.
>>
>>
>>> However, I'm not sure this workaround covers all the cases.
>>> The kernel part inside the fl_validate_ct_state tests masked key,
>>> not the raw one.  So, in case +trk set in the key, but not set
>>> in the mask, the code above will not add the mask bit and the
>>> kernel will reject the flow.
>>
>> I guess you mean the following situations in fl_validate_ct_state, where the 
>> mask/flag are set:
>>
>>
>> - TCA_FLOWER_KEY_CT_FLAGS_NEW && TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED  -> 
>> mutual exclusive
>>   OVS handles this since day one, 576126a9, and does this by removing the 
>> TCA_FLOWER_KEY_CT_FLAGS_NEW flag before programming.
>>   Not sure why they added this in the first place, maybe there is a corner 
>> case?
>>
>>
>> - TCA_FLOWER_KEY_CT_FLAGS_NEW && TCA_FLOWER_KEY_CT_FLAGS_REPLY -> mutual 
>> exlusive
>> - TCA_FLOWER_KEY_CT_FLAGS_INVALID only TCA_FLOWER_KEY_CT_FLAGS_TRACKED can 
>> be set
>>
>> These two are currently not handled by OVS.
>>
>> I guess as the documentation already explains these three combinations would 
>> never happen, and thus in a real-life scenario, this will never result in a 
>> match.
>>
>>
>> I can do a follow-up patch once I update the system test cases to work with 
>> TC, to handle these three (two) cases.
>
> What I meant was:
>
> flower->key.ct_state  = TCA_FLOWER_KEY_CT_FLAGS_TRACKED | 
> TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED
> flower->mask.ct_state = TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED

Maybe I’m still not understanding you correctly, but the above can not happen. 
The key.ct_state will only be set if the corresponding mask is set.

1504  if (mask->ct_state & OVS_CS_F_TRACKED) {
1505  if (key->ct_state & OVS_CS_F_TRACKED) {
1506  flower->key.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
1507  }
1508  flower->mask.ct_state |= TCA_FLOWER_KEY_CT_FLAGS_TRACKED;
1509  mask->ct_state &= ~OVS_CS_F_TRACKED;
1510  }

> So, ofproto layer requests to match only on the 
> TCA_FLOWER_KEY_CT_FLAGS_ESTABLISHED,
> while TCA_FLOWER_KEY_CT_FLAGS_TRACKED is set only in the key.

The TC match is only made on the masked bits but ofproto, all none mask bits 
are not matched.

> Kernel will take (key.ct_state & mask.ct_state) and will fail the validation,
> because TCA_FLOWER_KEY_CT_FLAGS_TRACKED will not be set.

My above patch will take care of this, as it will set the TRACKED key and mask.

> Does that make sense?
>
>>
>>
>>> Another thing is that we do have a way to probe support of
>>> kernel flags, so we may check if kernel accept flows without
>>> match on +trk and only add an extra match for such kernels.
>>> Might be useful in combination with the kernel patch below.
>>>
>>> What do you think?
>>
>> Does it make sense to do all this in OVS as we know that +trk is always set 
>>

[ovs-dev] [PATCH 1/2] ofproto-dpif: trigger revalidate if ct tp changes

2022-04-05 Thread lic121

Once ct_zone timeout policy changes, revalidator is supposed to be
triggered.

Fixes: 993cae678bca ("ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy 
tables")
Signed-off-by: lic121 
---
 ofproto/ofproto-dpif.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 6601f23..7a1bb0d 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -5597,6 +5597,7 @@ ct_set_zone_timeout_policy(const char *datapath_type, 
uint16_t zone_id,
 ct_timeout_policy_unref(backer, ct_zone->ct_tp);
 ct_zone->ct_tp = ct_tp;
 ct_tp->ref_count++;
+backer->need_revalidate = REV_RECONFIGURE;
 }
 } else {
 struct ct_zone *new_ct_zone = ct_zone_alloc(zone_id);
@@ -5604,6 +5605,7 @@ ct_set_zone_timeout_policy(const char *datapath_type, 
uint16_t zone_id,
 cmap_insert(>ct_zones, _ct_zone->node,
 hash_int(zone_id, 0));
 ct_tp->ref_count++;
+backer->need_revalidate = REV_RECONFIGURE;
 }
 }
 
@@ -5620,6 +5622,7 @@ ct_del_zone_timeout_policy(const char *datapath_type, 
uint16_t zone_id)
 if (ct_zone) {
 ct_timeout_policy_unref(backer, ct_zone->ct_tp);
 ct_zone_remove_and_destroy(backer, ct_zone);
+backer->need_revalidate = REV_RECONFIGURE;
 }
 }
 
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 2/2] ofproto-dpif: avoid unneccesary backer revalidation

2022-04-05 Thread lic121

If lldp didn't change, we are not supposed to trigger backer
revalidation.
Without this patch, bridge_reconfigure() always trigger udpif
revalidator because of lldp.

This patch also fix lldp memory leak bug:
lldp_create() malloc memory for lldp->lldpd->g_hardware. lldp_unref
is supposed to free the memory, but it doesn't.

Signed-off-by: lic121 
Signed-off-by: Eelco Chaudron 
Co-authored-by: Eelco Chaudron 
---
 lib/lldp/lldpd.c   | 10 +++---
 lib/ovs-lldp.c |  8 
 lib/ovs-lldp.h |  1 +
 ofproto/ofproto-dpif.c |  7 +--
 tests/ofproto-dpif.at  | 33 +
 5 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/lib/lldp/lldpd.c b/lib/lldp/lldpd.c
index 403f1f5..4bff7b0 100644
--- a/lib/lldp/lldpd.c
+++ b/lib/lldp/lldpd.c
@@ -140,13 +140,9 @@ lldpd_cleanup(struct lldpd *cfg)
 VLOG_DBG("cleanup all ports");
 
 LIST_FOR_EACH_SAFE (hw, h_entries, >g_hardware) {
-if (!hw->h_flags) {
-ovs_list_remove(>h_entries);
-lldpd_remote_cleanup(hw, NULL, true);
-lldpd_hardware_cleanup(cfg, hw);
-} else {
-lldpd_remote_cleanup(hw, NULL, false);
-}
+ovs_list_remove(>h_entries);
+lldpd_remote_cleanup(hw, NULL, true);
+lldpd_hardware_cleanup(cfg, hw);
 }
 
 VLOG_DBG("cleanup all chassis");
diff --git a/lib/ovs-lldp.c b/lib/ovs-lldp.c
index a9d205e..2d13e97 100644
--- a/lib/ovs-lldp.c
+++ b/lib/ovs-lldp.c
@@ -738,6 +738,14 @@ lldp_put_packet(struct lldp *lldp, struct dp_packet 
*packet,
 ovs_mutex_unlock();
 }
 
+/* Is LLDP enabled?
+ */
+bool
+lldp_is_enabled(struct lldp *lldp)
+{
+return lldp ? lldp->enabled : false;
+}
+
 /* Configures the LLDP stack.
  */
 bool
diff --git a/lib/ovs-lldp.h b/lib/ovs-lldp.h
index 0e536e8..661ac4e 100644
--- a/lib/ovs-lldp.h
+++ b/lib/ovs-lldp.h
@@ -86,6 +86,7 @@ void lldp_run(struct lldpd *cfg);
 bool lldp_should_send_packet(struct lldp *cfg);
 bool lldp_should_process_flow(struct lldp *lldp, const struct flow *flow);
 bool lldp_configure(struct lldp *lldp, const struct smap *cfg);
+bool lldp_is_enabled(struct lldp *lldp);
 void lldp_process_packet(struct lldp *cfg, const struct dp_packet *);
 void lldp_put_packet(struct lldp *lldp, struct dp_packet *packet,
  const struct eth_addr eth_src);
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 7a1bb0d..3ae44f0 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -2490,11 +2490,11 @@ set_lldp(struct ofport *ofport_,
 {
 struct ofport_dpif *ofport = ofport_dpif_cast(ofport_);
 struct ofproto_dpif *ofproto = ofproto_dpif_cast(ofport->up.ofproto);
+bool old_enable = lldp_is_enabled(ofport->lldp);
 int error = 0;
 
-if (cfg) {
+if (cfg && !smap_is_empty(cfg)) {
 if (!ofport->lldp) {
-ofproto->backer->need_revalidate = REV_RECONFIGURE;
 ofport->lldp = lldp_create(ofport->up.netdev, ofport_->mtu, cfg);
 }
 
@@ -2506,6 +2506,9 @@ set_lldp(struct ofport *ofport_,
 } else if (ofport->lldp) {
 lldp_unref(ofport->lldp);
 ofport->lldp = NULL;
+}
+
+if (lldp_is_enabled(ofport->lldp) != old_enable) {
 ofproto->backer->need_revalidate = REV_RECONFIGURE;
 }
 
diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at
index dbb3b6d..1c9a94c 100644
--- a/tests/ofproto-dpif.at
+++ b/tests/ofproto-dpif.at
@@ -29,6 +29,39 @@ AT_CHECK([ovs-appctl revalidator/wait])
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([ofproto-dpif - lldp revalidator event(REV_RECONFIGURE)])
+OVS_VSWITCHD_START(
+[add-port br0 p1 -- set interface p1 ofport_request=1 type=dummy]
+)
+dnl first revalidation triggered by add interface
+AT_CHECK([ovs-appctl coverage/read-counter rev_reconfigure], [0], [dnl
+1
+])
+
+dnl enable lldp
+AT_CHECK([ovs-vsctl set interface p1 lldp:enable=true])
+AT_CHECK([ovs-appctl revalidator/wait])
+AT_CHECK([ovs-appctl coverage/read-counter rev_reconfigure], [0], [dnl
+2
+])
+
+dnl disable lldp
+AT_CHECK([ovs-vsctl set interface p1 lldp:enable=false])
+AT_CHECK([ovs-appctl revalidator/wait])
+AT_CHECK([ovs-appctl coverage/read-counter rev_reconfigure], [0], [dnl
+3
+])
+
+dnl remove lldp, no revalidation as lldp was disabled
+AT_CHECK([ovs-vsctl remove interface p1 lldp enable])
+AT_CHECK([ovs-appctl revalidator/wait])
+AT_CHECK([ovs-appctl coverage/read-counter rev_reconfigure], [0], [dnl
+3
+])
+
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
 AT_SETUP([ofproto-dpif - active-backup bonding (with primary)])
 
 dnl Create br0 with members p1, p2 and p7, creating bond0 with p1 and
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 0/2] fix revalidation triggers

2022-04-05 Thread lic121

This series fix two revalidation triggers:
  - trigger revalidation if ct_zone timeout policy changes
  - avoid revalidation if lldp config doesn't change

The second patch relies on the first one because of ct test cases

lic121 (2):
  ofproto-dpif: trigger revalidate if ct tp changes
  ofproto-dpif: avoid unneccesary backer revalidation

 lib/lldp/lldpd.c   | 10 +++---
 lib/ovs-lldp.c |  8 
 lib/ovs-lldp.h |  1 +
 ofproto/ofproto-dpif.c | 10 --
 tests/ofproto-dpif.at  | 33 +
 5 files changed, 53 insertions(+), 9 deletions(-)

-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] flow: Allow matches on nw_proto also for IPv6 later frags.

2022-04-05 Thread Eelco Chaudron




On 4 Apr 2022, at 23:54, Paolo Valerio wrote:

> Aaron Conole  writes:
>
>> Paolo Valerio  writes:
>>
>>> The next header contained in the last extension header of the IPv6
>>> later frags still contain the information of the upper layer protocol
>>> number.
>>>
>>> Similarly to what OvS does for IPv4, allow L4 matches for later IPv6
>>> frags as well by processing later frags and storing the nw_proto
>>> information.
>>>
>>> Signed-off-by: Paolo Valerio 
>>> ---
>>
>> Digging in, I could never find a reason why ipv6 was treated differently
>> in this manner, but I guess there could be cases where the header layout
>> could cause problems (see the top of the while() block).  I don't know
>> that it's a practical concern (other values are probably less common),
>> but it was the only thing I could come up with for why the match might
>> behave this way.  I haven't done enough mailing list archeology to have
>> a good idea.
>
> Guess this may probably be due to (rfc8200 section-4.5):
>
> "The subsequent fragment packets are composed of:
>
>[...]
>
>(2)  A Fragment header containing:
>
>The Next Header value that identifies the first header
>after the Per-Fragment headers of the original packet.
>
>[...]"
>
> where the "first header" in the original packet could be another
> extension header or an upper-layer header, while the Fragment header is
> the last EH for later fragments (according to section 4.5).
>
> If I'm not missing something, it makes sense to drop the patch.
> It's probably a good thing if we document the different behavior between
> IPv4 and IPv6 for later frags, though.

Alternatively, we could check if the next header is a non-extension header and 
if so, mark it as such.

The only problem might be the case where packets do have extension headers 
after the fragmentation header, and they might be dropped (handled differently).

Anyhow, whatever we decide, we should document it in both the documentation and 
code.

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/2] dpif-netdev: Restructure rxq schedule logging.

2022-04-05 Thread David Marchand

On Tue, Mar 29, 2022 at 3:19 PM Kevin Traynor  wrote:
>
> Previously logging about rxq scheduling was done in a code branch with
> the selection of the PMD thread core after checking that a numa was
> selected.
>
> By splitting out the logging from the PMD thread core selection, it can
> simplify the code complexity and make it more extendable for future
> additions.
>
> Also, minor updates to a couple of variables to improve readability and
> fix a log indent while working on this code block.
>
> There is no user visibile change in behaviour or logs.

visible*

>
> Signed-off-by: Kevin Traynor 

I have two small comments (see below), but otherwise:

Acked-by: David Marchand 

> ---
>  lib/dpif-netdev.c | 59 +--
>  1 file changed, 32 insertions(+), 27 deletions(-)
>
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 40a62fd9f..dc414ae8f 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -5991,5 +5991,6 @@ sched_numa_list_schedule(struct sched_numa_list 
> *numa_list,
>  struct sched_pmd *sched_pmd = NULL;
>  struct sched_numa *numa;
> -int numa_id;
> +int port_numa_id;
> +int pmd_numa_id;

pmd_numa_id can be moved to the block where a pmd has been selected.


>  uint64_t proc_cycles;
>  char rxq_cyc_log[MAX_RXQ_CYC_STRLEN];
> @@ -6003,7 +6004,8 @@ sched_numa_list_schedule(struct sched_numa_list 
> *numa_list,
>  /* Store the cycles for this rxq as we will log these later. */
>  proc_cycles = dp_netdev_rxq_get_cycles(rxq, RXQ_CYCLES_PROC_HIST);
> -/* Select the numa that should be used for this rxq. */
> -numa_id = netdev_get_numa_id(rxq->port->netdev);
> -numa = sched_numa_list_lookup(numa_list, numa_id);
> +port_numa_id = netdev_get_numa_id(rxq->port->netdev);
> +

Nit: this makes port_numa_id next to a block with an unrelated
comment. I would add a separating newline.


> +/* Select numa. */
> +numa = sched_numa_list_lookup(numa_list, port_numa_id);
>
>  /* Check if numa has no PMDs or no non-isolated PMDs. */


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 1/2] dpif-netdev: Split function to find lowest loaded PMD thread core.

2022-04-05 Thread David Marchand

On Tue, Mar 29, 2022 at 3:19 PM Kevin Traynor  wrote:
>
> This splits up the looping through each PMD thread core on a numa node
> with the check to compare cycles or rxqs.
>
> This is done so in future the compare could be reused with any group
> of PMD thread cores.
>
> There is no user visibile change in behaviour.

visible*

>
> Signed-off-by: Kevin Traynor 
Acked-by: David Marchand 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

56 matches

Mail list logo