Re: [ovs-dev] [PATCH ovn 1/3] Don't save original dst IP and Port to avoid megaflow unwildcarding.

2022-07-06 Thread Han Zhou
On Wed, Jul 6, 2022 at 8:45 AM Dumitru Ceara  wrote:
>
> Hi Han,
>
> On 7/6/22 00:41, Han Zhou wrote:
> > The ls_in_pre_stateful priority 120 flow that saves dst IP and Port to
> > registers is causing a critical dataplane performance impact to
> > short-lived connections, because it unwildcards megaflows with exact
> > match on dst IP and L4 ports. Any new connections with a different
> > client side L4 port will encounter datapath flow miss and upcall to
> > ovs-vswitchd.
> >
> > These fields (dst IP and port) were saved to registers to solve a
> > problem of LB hairpin use case when different VIPs are sharing
> > overlapping backend+port [0]. The change [0] might not have as wide
> > performance impact as it is now because at that time one of the match
> > condition "REGBIT_CONNTRACK_NAT == 1" was set only for established and
> > natted traffic, while now the impact is more obvious because
> > REGBIT_CONNTRACK_NAT is now set for all IP traffic (if any VIP
> > configured on the LS) since commit [1], after several other indirectly
> > related optimizations and refactors.
> >
> > Since the changes that introduced the performance problem had their
> > own values (fixes problems or optimizes performance), so we don't want
> > to revert any of the changes (and it is also not straightforward to
> > revert any of them because there have been lots of changes and refactors
> > on top of them).
> >
> > Change [0] itself has added an alternative way to solve the overlapping
> > backends problem, which utilizes ct fields instead of saving dst IP and
> > port to registers. This patch forces to that approach and removes the
> > flows/actions that saves the dst IP and port to avoid the dataplane
> > performance problem for short-lived connections.
> >
> > (With this approach, the ct_state DNAT is not HW offload friendly, so it
> > may result in those flows not being offloaded, which is supposed to be
> > solved in a follow-up patch)
> >
> > [0] ce0ef8d59850 ("Properly handle hairpin traffic for VIPs with shared
backends.")
> > [1] 0038579d1928 ("northd: Optimize ct nat for load balancer traffic.")
> >
> > Signed-off-by: Han Zhou 
> > ---
>
> I think the main concern I have is that this forces us to choose between:
> a. non hwol friendly flows (reduced performance)
> b. less functionality (with the knob in patch 3/3 set to false).
>
Thanks Dumitru for the comments! I agree the solution is not ideal, but if
we look at it from a different angle, even with a), for most pod->service
traffic the performance is still much better than how it is today (not
offloaded kernel datapath is still much better than userspace slowpath).
And *hopefully* b) is ok for most use cases to get HW-offload capability.

> Change [0] was added to address the case when a service in kubernetes is
> exposed via two different k8s services objects that share the same
> endpoints.  That translates in ovn-k8s to two different OVN load
> balancer VIPs that share the same backends.  For such cases, if the
> service is being accessed by one of its own backends we need to be able
> to differentiate based on the VIP address it used to connect to.
>
> CC: Tim Rozet, Dan Williams for some more input from the ovn-k8s side on
> how common it is that an OVN-networked pod accesses two (or more)
> services that might have the pod itself as a backend.
>

Yes, we definitely need input from ovn-k8s side. The information we got so
far: the change [0] was to fix a bug [2] reported by Tim. However, the bug
description didn't mention anything about two VIPs sharing the same
backend. Tim also mentioned in the ovn-k8s meeting last week that the
original user bug report for [2] was [3], and [3] was in fact a completely
different problem (although it is related to hairpin, too). So, I am under
the impression that "an OVN-networked pod accesses two (or more) services
that might have the pod itself as a backend" might be a very rare use case,
if it exists at all.

> If this turns out to be mandatory I guess we might want to also look
> into alternatives like:
> - getting help from the HW to offload matches like ct_tuple()

I believe this is going to happen in the future. HWOL is continuously
enhanced.

> - limiting the impact of "a." only to some load balancers (e.g., would
> it help to use different hairpin lookup tables for such load balancers?)

I am not sure if this would work, and not sure if this is a good approach,
either. In general, I believe it is possible to solve the problem with more
complex pipelines, but we need to keep in mind it is quite easy to
introduce other performance problems (either control plane or data plane) -
many of the changes lead to the current implementation were for performance
optimizations, some for control plane, some for HWOL, and some for reducing
recirculations. I'd avoid complexity unless it is really necessary. Let's
get more input for the problem, and based on that we can decide if we want
to move to a more complex solution.

[2] 

Re: [ovs-dev] [PATCH 0/6] Remove OVS kernel driver

2022-07-06 Thread Gregory Rose




On 7/6/2022 9:57 AM, Greg Rose wrote:

It is time to remove support for the OVS kernel driver and push
towards use of the upstream Linux openvswitch kernel driver
in it's place [1].

There are many Linux specific source modules in the datapath that
will need eventual removal but some headers are still required for
the userspace code (which seems counterintuitive but...)

1.  https://mail.openvswitch.org/pipermail/ovs-dev/2022-April/393292.html

Greg Rose (6):
   acinclude.m4: Remove support for building the OVS kernel module
   rhel: Remove kernel mode spec
   rhel: Remove RHEL 6 kernel module spec
   tests: Remove support for check-kmod test
   Documentation: Remove kernel module documentation
   Disable unsupported kernel builds

  .github/workflows/build-and-test.yml  |  53 --
  Documentation/faq/releases.rst|   5 +-
  .../contributing/backporting-patches.rst  |   7 +
  Documentation/intro/install/fedora.rst|  24 -
  Documentation/intro/install/general.rst   |  63 --
  acinclude.m4  | 683 +-
  rhel/automake.mk  |  17 -
  rhel/kmod-openvswitch-rhel6.spec.in   | 123 
  rhel/openvswitch-kmod-fedora.spec.in  | 152 
  tests/automake.mk |   6 -
  10 files changed, 11 insertions(+), 1122 deletions(-)
  delete mode 100644 rhel/kmod-openvswitch-rhel6.spec.in
  delete mode 100644 rhel/openvswitch-kmod-fedora.spec.in



It seems I should have put the last patch first... Will reorder
the series and resend V2.

- Greg
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 14/14] userspace: Enable TSO if available.

2022-07-06 Thread David Marchand
On Fri, Jul 1, 2022 at 5:58 AM Mike Pattrick  wrote:
>
> From: Flavio Leitner 
>
> Now that there is a segmentation in software as a fall back in
> case a netdev doesn't support TCP segmentation offloading (TSO),
> enable it by default on all possible netdevs.
>
> The existing TSO control is inverted, so that now it will disable
> TSO globally, in case TSO is not desired for some deployment.
>
> Signed-off-by: Flavio Leitner 
> Co-authored-by: Mike Pattrick 
> Signed-off-by: Mike Pattrick 
> ---
>  Documentation/topics/userspace-tso.rst |  21 ++--
>  NEWS   |   4 +
>  lib/netdev-linux.c | 133 -
>  lib/userspace-tso.c|   9 +-
>  tests/ofproto-macros.at|   1 +
>  vswitchd/vswitch.xml   |  17 +---
>  6 files changed, 154 insertions(+), 31 deletions(-)

We have one issue with vhost user client ports.

Consider the case with an OVS running  before this series is applied,
with userspace tso disabled (which is the case for existing OVS
installation).
I see that qemu negotiates TSO + ECN feature for a virtio port in the
vhost-user backend on OVS side:
2022-07-06T19:46:38.225Z|00175|dpdk|INFO|VHOST_CONFIG: negotiated
Virtio features: 0x17020a783

Next, I apply the whole series and restart OVS:
2022-07-06T19:53:29.121Z|00069|netdev_dpdk|INFO|vHost User device
'vhost1' created in 'client' mode, using client socket
'/var/lib/vhost_sockets/vhost1'
2022-07-06T19:53:29.122Z|00070|dpdk|INFO|VHOST_CONFIG: new device,
handle is 0, path is /var/lib/vhost_sockets/vhost1
2022-07-06T19:53:29.122Z|1|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_GET_FEATURES
2022-07-06T19:53:29.122Z|2|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_GET_PROTOCOL_FEATURES
2022-07-06T19:53:29.122Z|3|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_SET_PROTOCOL_FEATURES
2022-07-06T19:53:29.122Z|4|dpdk|INFO|VHOST_CONFIG: negotiated
Vhost-user protocol features: 0xcbf
2022-07-06T19:53:29.122Z|5|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_GET_QUEUE_NUM
2022-07-06T19:53:29.122Z|6|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_SET_SLAVE_REQ_FD
2022-07-06T19:53:29.122Z|7|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_SET_OWNER
2022-07-06T19:53:29.122Z|8|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_GET_FEATURES
2022-07-06T19:53:29.122Z|9|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_SET_VRING_CALL
2022-07-06T19:53:29.123Z|00010|dpdk|INFO|VHOST_CONFIG: vring call idx:0 file:109
2022-07-06T19:53:29.123Z|00011|dpdk|INFO|VHOST_CONFIG: read message
VHOST_USER_SET_VRING_CALL
2022-07-06T19:53:29.123Z|00012|dpdk|INFO|VHOST_CONFIG: vring call idx:1 file:110
2022-07-06T19:53:29.123Z|00013|dpdk|INFO|VHOST_CONFIG: vhost peer closed

This happens for every vhost ports I have, in a loop flooding OVS logs.
Looking at qemu logs:
2022-07-06T19:53:17.581363Z qemu-kvm: Unexpected end-of-file before
all data were read
2022-07-06T19:53:17.583183Z qemu-kvm: Unexpected end-of-file before
all data were read
2022-07-06T19:53:17.587613Z qemu-kvm: Unexpected end-of-file before
all data were read
2022-07-06T19:53:17.588464Z qemu-kvm: Unexpected end-of-file before
all data were read
2022-07-06T19:53:17.641010Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.641023Z qemu-kvm: vhost VQ 0 ring restore failed:
-1: Invalid argument (22)
2022-07-06T19:53:17.641035Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.641040Z qemu-kvm: vhost VQ 1 ring restore failed:
-1: Invalid argument (22)
2022-07-06T19:53:17.645027Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.645039Z qemu-kvm: vhost VQ 0 ring restore failed:
-1: Resource temporarily unavailable (11)
2022-07-06T19:53:17.645047Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.645052Z qemu-kvm: vhost VQ 1 ring restore failed:
-1: Resource temporarily unavailable (11)
2022-07-06T19:53:17.648953Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.648964Z qemu-kvm: vhost VQ 0 ring restore failed:
-1: Resource temporarily unavailable (11)
2022-07-06T19:53:17.648971Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.648976Z qemu-kvm: vhost VQ 1 ring restore failed:
-1: Resource temporarily unavailable (11)
2022-07-06T19:53:17.652951Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.652962Z qemu-kvm: vhost VQ 0 ring restore failed:
-1: Resource temporarily unavailable (11)
2022-07-06T19:53:17.652970Z qemu-kvm: Failed to set msg fds.
2022-07-06T19:53:17.652975Z qemu-kvm: vhost VQ 1 ring restore failed:
-1: Resource temporarily unavailable (11)
vhost lacks feature mask 8192 for backend
2022-07-06T19:53:29.122990Z qemu-kvm: failed to init vhost_net for queue 0
vhost lacks feature mask 8192 for backend
2022-07-06T19:53:29.259739Z qemu-kvm: failed to init vhost_net for queue 0
vhost lacks feature mask 8192 for backend

Afaiu, 8192 == 0x2000 which translates to bit 13.
VIRTIO_NET_F_HOST_ECN (13) Device can receive TSO with ECN.

Even though this 

Re: [ovs-dev] [PATCH v4 11/14] userspace: Enable L4 csum offloading by default.

2022-07-06 Thread David Marchand
I did not finish reviewing.

On Fri, Jul 1, 2022 at 5:58 AM Mike Pattrick  wrote:
>
> From: Flavio Leitner 
>
> The netdev receiving packets is supposed to provide the flags
> indicating if the L4 csum was verified and it is OK or BAD,
> otherwise the stack will check when appropriate by software.
>
> If the packet comes with good checksum, then postpone the
> checksum calculation to the egress device if needed.
>
> When encapsulate a packet with that flag, set the checksum
> of the inner L4 header since that is not yet supported.
>
> Calculate the L4 csum when the packet is going to be sent over
> a device that doesn't support the feature.
>
> Linux tap devices allows enabling L3 and L4 offload, so this
> patch enables the feature. However, Linux socket interface
> remains disabled because the API doesn't allow enabling
> those those features without enabling TSO too.

those two*

>
> Signed-off-by: Flavio Leitner 
> Co-authored-by: Mike Pattrick 
> Signed-off-by: Mike Pattrick 
> ---
>  lib/conntrack.c |  16 +--
>  lib/dp-packet.c |  23 +++-
>  lib/dp-packet.h |  97 ---
>  lib/flow.c  |  23 
>  lib/netdev-dpdk.c   | 136 -
>  lib/netdev-linux.c  | 253 ++--
>  lib/netdev-native-tnl.c |  32 +
>  lib/netdev.c|  39 ++-
>  lib/packets.c   | 175 +--
>  lib/packets.h   |   3 +
>  10 files changed, 542 insertions(+), 255 deletions(-)
>
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 11768da00..d7072e1e9 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -2105,13 +2105,13 @@ conn_key_extract(struct conntrack *ct, struct 
> dp_packet *pkt, ovs_be16 dl_type,
>  }
>
>  if (ok) {
> -bool hwol_bad_l4_csum = dp_packet_ol_l4_checksum_bad(pkt);
> -if (!hwol_bad_l4_csum) {
> -bool  hwol_good_l4_csum = dp_packet_ol_l4_checksum_good(pkt)
> -  || dp_packet_ol_tx_l4_checksum(pkt);
> +if (!dp_packet_ol_l4_checksum_bad(pkt)) {
> +
>  /* Validate the checksum only when hwol is not supported. */
>  if (extract_l4(>key, l4, dp_packet_l4_size(pkt),
> -   >icmp_related, l3, !hwol_good_l4_csum,
> +   >icmp_related, l3,
> +   !dp_packet_ol_l4_checksum_good(pkt) &&
> +   !dp_packet_ol_tx_l4_checksum(pkt),

Do we need to check for dp_packet_ol_tx_l4_checksum?


> NULL)) {
>  ctx->hash = conn_key_hash(>key, ct->hash_basis);
>  return true;
> @@ -3423,8 +3423,10 @@ handle_ftp_ctl(struct conntrack *ct, const struct 
> conn_lookup_ctx *ctx,
>  adj_seqnum(>tcp_seq, ec->seq_skew);
>  }
>
> -th->tcp_csum = 0;
> -if (!dp_packet_ol_tx_l4_checksum(pkt)) {
> +if (dp_packet_ol_tx_tcp_csum(pkt)) {
> +dp_packet_ol_reset_l4_csum_good(pkt);
> +} else {
> +th->tcp_csum = 0;
>  if (ctx->key.dl_type == htons(ETH_TYPE_IPV6)) {
>  th->tcp_csum = packet_csum_upperlayer6(nh6, th, 
> ctx->key.nw_proto,
> dp_packet_l4_size(pkt));

[snip]

> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 1eb2954ab..bfeb75add 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -145,17 +145,6 @@ typedef uint16_t dpdk_port_t;
>
>  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
>
> -/* List of required flags advertised by the hardware that will be used
> - * if TSO is enabled. Ideally this should include
> - * RTE_ETH_TX_OFFLOAD_SCTP_CKSUM. However, very few drivers support that
> - * at the moment and SCTP is not a widely used protocol like TCP and UDP,
> - * so it's optional. */
> -#define DPDK_TX_TSO_OFFLOAD_FLAGS (RTE_ETH_TX_OFFLOAD_TCP_TSO\
> -   | RTE_ETH_TX_OFFLOAD_TCP_CKSUM\
> -   | RTE_ETH_TX_OFFLOAD_UDP_CKSUM\
> -   | RTE_ETH_TX_OFFLOAD_IPV4_CKSUM)
> -
> -
>  static const struct rte_eth_conf port_conf = {
>  .rxmode = {
>  .split_hdr_size = 0,
> @@ -398,8 +387,10 @@ enum dpdk_hw_ol_features {
>  NETDEV_RX_HW_CRC_STRIP = 1 << 1,
>  NETDEV_RX_HW_SCATTER = 1 << 2,
>  NETDEV_TX_IPV4_CKSUM_OFFLOAD = 1 << 3,
> -NETDEV_TX_TSO_OFFLOAD = 1 << 4,
> -NETDEV_TX_SCTP_CHECKSUM_OFFLOAD = 1 << 5,
> +NETDEV_TX_TCP_CKSUM_OFFLOAD = 1 << 4,
> +NETDEV_TX_UDP_CKSUM_OFFLOAD = 1 << 5,
> +NETDEV_TX_SCTP_CKSUM_OFFLOAD = 1 << 6,
> +NETDEV_TX_TSO_OFFLOAD = 1 << 7,
>  };
>
>  /*
> @@ -953,6 +944,35 @@ dpdk_watchdog(void *dummy OVS_UNUSED)
>  return NULL;
>  }
>
> +static void
> +netdev_dpdk_update_netdev_flag(struct netdev_dpdk *dev,
> +   enum dpdk_hw_ol_features hw_ol_features,
> +   enum 

Re: [ovs-dev] [PATCH v4 10/14] userspace: Enable IP checksum offloading by default.

2022-07-06 Thread David Marchand
On Wed, Jul 6, 2022 at 5:34 PM Mike Pattrick  wrote:
> > > Having to touch the packet data this late is scary, but I don't know a
> > > better place for now.
> >
> > We can move it to dp_netdev_upcall which only affects the userspace 
> > datapath.
> > Then the upcall handling code gets "complete" packet data, the same as
> > when it gets invoked in the kernel datapath case.
> >
>
> I think this solution also works; if it's preferred stylistically then
> I will make the change for the v5.
>
> Regarding the other comments, I generally agree with the areas you
> identified for refactor but think that it would fit best as a later
> patch. A lot of this code is modified throughout the patchset, so
> applying it now would be difficult.
>
> I propose the next version including:
>
> - conntrack comment
> - { on newline
> - reset dpdk ol_flags
> - relocating dp_packet_ol_send_prepare
>
> And the remaining changes to be added later.
>
> LMKWYT

Ok for me.
Thanks Mike.


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v3] nb: Remove possibility of disabling logical datapath groups.

2022-07-06 Thread Mark Michelson

Thanks, Dumitru.

I pushed this to main.

On 7/6/22 09:58, Dumitru Ceara wrote:

In large scale scenarios this option hugely reduces the size of the
Southbound database positively affecting end to end performance.  In
such scenarios there's no real reason to ever disable datapath groups.

In lower scale scenarios any potential overhead due to logical datapath
groups is, very likely, negligible.

Aside from potential scalability concerns, the
NB.NB_Global.options:use_logical_dp_group knob was kept until now to
ensure that in case of a bug in the logical datapath groups code a CMS
may turn it off and fall back to the mode in which logical flows are not
grouped together.  As far as I know, this has never happened until now.

Moreover, datpath_groups are enabled by default since v21.09.0 (4 stable
releases ago), via 90daa7ce18dc ("northd: Enable logical dp groups by
default.").

 From a testing perspective removing this knob will halve the CI matrix.
This is desirable, especially in the context of more tests being added,
e.g.:
https://patchwork.ozlabs.org/project/ovn/patch/20220623110239.2973854-1-mh...@redhat.com/

This commit also adds OVN_FOR_EACH_NORTHD to the "ovn-northd -- lr
multiple gw ports NAT" test case.  That was previously skipped because
the duplicate logical flow would have been merged when running with
datapath groups enabled.  Instead, change the expected output to not
include the duplicate.

Acked-by: Mark Michelson 
Signed-off-by: Dumitru Ceara 
---
V3:
- Rebased.
V2:
- Removed unused lflow_segs as suggested by Mark.
- Added Mark's ack.
---
  NEWS|  1 +
  TODO.rst|  6 +++
  northd/northd.c | 78 ++---
  ovn-nb.xml  | 19 +++--
  tests/ovn-macros.at | 21 ++
  tests/ovn-northd.at | 95 +
  tests/ovs-macros.at |  4 +-
  7 files changed, 47 insertions(+), 177 deletions(-)

diff --git a/NEWS b/NEWS
index 3737907ca..20cea579e 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,7 @@ Post v22.06.0
  and ipsec_forceencaps=true (strongswan) to unconditionally enforce
  NAT-T UDP encapsulation. Requires OVS support for IPsec custom tunnel
  options (which will be available in OVS 2.18).
+  - Removed possibility of disabling logical datapath groups.
  
  OVN v22.06.0 - 03 Jun 2022

  --
diff --git a/TODO.rst b/TODO.rst
index 618ea4844..acbcacc4e 100644
--- a/TODO.rst
+++ b/TODO.rst
@@ -170,3 +170,9 @@ OVN To-do List
* physical.c has a global simap -localvif_to_ofport which stores the
  local OVS interfaces and the ofport numbers. Move this to the engine data
  of the engine data node - ed_type_pflow_output.
+
+* ovn-northd parallel logical flow processing
+
+  * Multi-threaded logical flow computation was optimized for the case
+when datapath groups are disabled.  Datpath groups are always enabled
+now so northd parallel processing should be revisited.
diff --git a/northd/northd.c b/northd/northd.c
index 964af992f..509186392 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4896,10 +4896,6 @@ ovn_lflow_equal(const struct ovn_lflow *a, const struct 
ovn_datapath *od,
  && nullable_string_is_equal(a->ctrl_meter, ctrl_meter));
  }
  
-/* If this option is 'true' northd will combine logical flows that differ by

- * logical datapath only by creating a datapath group. */
-static bool use_logical_dp_groups = false;
-
  enum {
  STATE_NULL,   /* parallelization is off */
  STATE_INIT_HASH_SIZES,/* parallelization is on; hashes sizing needed 
*/
@@ -4924,8 +4920,7 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct 
ovn_datapath *od,
  lflow->ctrl_meter = ctrl_meter;
  lflow->dpg = NULL;
  lflow->where = where;
-if ((parallelization_state != STATE_NULL)
-&& use_logical_dp_groups) {
+if (parallelization_state != STATE_NULL) {
  ovs_mutex_init(>odg_lock);
  }
  }
@@ -4935,7 +4930,7 @@ ovn_dp_group_add_with_reference(struct ovn_lflow 
*lflow_ref,
  struct ovn_datapath *od)
  OVS_NO_THREAD_SAFETY_ANALYSIS
  {
-if (!use_logical_dp_groups || !lflow_ref) {
+if (!lflow_ref) {
  return false;
  }
  
@@ -5014,13 +5009,11 @@ do_ovn_lflow_add(struct hmap *lflow_map, struct ovn_datapath *od,

  struct ovn_lflow *old_lflow;
  struct ovn_lflow *lflow;
  
-if (use_logical_dp_groups) {

-old_lflow = ovn_lflow_find(lflow_map, NULL, stage, priority, match,
-   actions, ctrl_meter, hash);
-if (old_lflow) {
-ovn_dp_group_add_with_reference(old_lflow, od);
-return old_lflow;
-}
+old_lflow = ovn_lflow_find(lflow_map, NULL, stage, priority, match,
+   actions, ctrl_meter, hash);
+if (old_lflow) {
+ovn_dp_group_add_with_reference(old_lflow, od);
+return old_lflow;
 

[ovs-dev] [PATCH 6/6] Disable unsupported kernel builds

2022-07-06 Thread Greg Rose
Remove kernel based github workflows since the OVS kernel driver is
no longer supported since Release 2.18

Signed-off-by: Greg Rose 
---
 .github/workflows/build-and-test.yml | 53 
 1 file changed, 53 deletions(-)

diff --git a/.github/workflows/build-and-test.yml 
b/.github/workflows/build-and-test.yml
index 9e3583781..64454c5ea 100644
--- a/.github/workflows/build-and-test.yml
+++ b/.github/workflows/build-and-test.yml
@@ -9,21 +9,14 @@ jobs:
 automake libtool gcc bc libjemalloc1 libjemalloc-dev\
 libssl-dev llvm-dev libelf-dev libnuma-dev libpcap-dev  \
 ninja-build selinux-policy-dev
-  deb_dependencies: |
-linux-headers-$(uname -r) build-essential fakeroot devscripts equivs
-  AFXDP:   ${{ matrix.afxdp }}
   ASAN:${{ matrix.asan }}
   UBSAN:   ${{ matrix.ubsan }}
   CC:  ${{ matrix.compiler }}
-  DEB_PACKAGE: ${{ matrix.deb_package }}
   DPDK:${{ matrix.dpdk }}
   DPDK_SHARED: ${{ matrix.dpdk_shared }}
-  KERNEL:  ${{ matrix.kernel }}
-  KERNEL_LIST: ${{ matrix.kernel_list }}
   LIBS:${{ matrix.libs }}
   M32: ${{ matrix.m32 }}
   OPTS:${{ matrix.opts }}
-  TESTSUITE:   ${{ matrix.testsuite }}
 
 name: linux ${{ join(matrix.*, ' ') }}
 runs-on: ubuntu-18.04
@@ -38,56 +31,13 @@ jobs:
   - compiler: clang
 opts: --disable-ssl
 
-  - compiler: gcc
-testsuite:test
-kernel:   3.16
   - compiler: clang
 testsuite:test
-kernel:   3.16
 asan: asan
   - compiler: clang
 testsuite:test
-kernel:   3.16
 ubsan:ubsan
 
-  - compiler: gcc
-testsuite:test
-opts: --enable-shared
-  - compiler: clang
-testsuite:test
-opts: --enable-shared
-
-  - compiler: gcc
-testsuite:test
-dpdk: dpdk
-  - compiler: clang
-testsuite:test
-dpdk: dpdk
-
-  - compiler: gcc
-testsuite:test
-libs: -ljemalloc
-  - compiler: clang
-testsuite:test
-libs: -ljemalloc
-
-  - compiler: gcc
-kernel_list:  5.8 5.5 5.4 4.19
-  - compiler: clang
-kernel_list:  5.8 5.5 5.4 4.19
-
-  - compiler: gcc
-kernel_list:  4.14 4.9 4.4 3.16
-  - compiler: clang
-kernel_list:  4.14 4.9 4.4 3.16
-
-  - compiler: gcc
-afxdp:afxdp
-kernel:   5.3
-  - compiler: clang
-afxdp:afxdp
-kernel:   5.3
-
   - compiler: gcc
 dpdk: dpdk
 opts: --enable-shared
@@ -111,9 +61,6 @@ jobs:
 m32:  m32
 opts: --disable-ssl
 
-  - compiler: gcc
-deb_package:  deb
-
 steps:
 - name: checkout
   uses: actions/checkout@v2
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 4/6] tests: Remove support for check-kmod test

2022-07-06 Thread Greg Rose
The OVS kernel module is no longer supported as of OVS 2.18

Signed-off-by: Greg Rose 
---
 tests/automake.mk | 6 --
 1 file changed, 6 deletions(-)

diff --git a/tests/automake.mk b/tests/automake.mk
index b29cb783e..3496f9002 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -343,12 +343,6 @@ check-kernel: all
set $(SHELL) '$(SYSTEM_KMOD_TESTSUITE)' -C tests  
AUTOTEST_PATH='$(AUTOTEST_PATH)'; \
"$$@" $(TESTSUITEFLAGS) -j1 || (test X'$(RECHECK)' = Xyes && "$$@" 
--recheck)
 
-# Testing the out of tree Kernel module
-check-kmod: all
-   $(MAKE) modules_install
-   modprobe -r -a vport-geneve vport-gre vport-lisp vport-stt vport-vxlan 
openvswitch
-   $(MAKE) check-kernel
-
 check-system-userspace: all
set $(SHELL) '$(SYSTEM_USERSPACE_TESTSUITE)' -C tests  
AUTOTEST_PATH='$(AUTOTEST_PATH)'; \
"$$@" $(TESTSUITEFLAGS) -j1 || (test X'$(RECHECK)' = Xyes && "$$@" 
--recheck)
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 5/6] Documentation: Remove kernel module documentation

2022-07-06 Thread Greg Rose
As of Open vSwitch release 2.18 the OVS kernel module is no longer
supported.  Pull the documentation references.

Signed-off-by: Greg Rose 
---
 Documentation/faq/releases.rst|  5 +-
 .../contributing/backporting-patches.rst  |  7 +++
 Documentation/intro/install/fedora.rst| 24 ---
 Documentation/intro/install/general.rst   | 63 ---
 4 files changed, 10 insertions(+), 89 deletions(-)

diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index 8cfe2d392..ab2925a62 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -74,7 +74,7 @@ Q: What Linux kernel versions does each Open vSwitch release 
work with?
 2.14.x   3.16 to 5.5
 2.15.x   3.16 to 5.8
 2.16.x   3.16 to 5.8
-2.17.x   3.16 to 5.8
+2.17.x   N/A
  ==
 
 Open vSwitch userspace should also work with the Linux kernel module built
@@ -110,7 +110,8 @@ Q: Are all features available with all datapaths?
 Linux OVS tree
   The datapath implemented by the Linux kernel module distributed with
   the OVS source tree. This datapath is deprecated starting with OVS
-  2.15.x and support capped at Linux kernel version 5.8.
+  2.15.x and support capped at Linux kernel version 5.8. As of OVS 2.17.x
+  the Linux OVS tree is no longer supported.
 
 Userspace
   This datapath supports conventional system devices as well as
diff --git a/Documentation/internals/contributing/backporting-patches.rst 
b/Documentation/internals/contributing/backporting-patches.rst
index 162e9d209..8370c954d 100644
--- a/Documentation/internals/contributing/backporting-patches.rst
+++ b/Documentation/internals/contributing/backporting-patches.rst
@@ -119,6 +119,13 @@ userspace changes.
 How to backport kernel patches
 ~~
 
+These instructions only apply to Open vSwitch releases 2.16 and older.
+As of Open vSwitch branch 2.17 the Open vSwitch kernel module is no
+longer supported and only the Linux openvswitch kernel module is used.
+In the case of Open vSwitch releases 2.16 and older kernel backports
+may be required for bux fixes and feature implementation so these
+instructions are preserved for that reason.
+
 First, the patch should be submitted upstream to `netdev`. When the patch has
 been applied to `net-next`, it is ready to be backported. Starting from the
 Linux tree, use ``git format-patch`` to format each patch that should be
diff --git a/Documentation/intro/install/fedora.rst 
b/Documentation/intro/install/fedora.rst
index 06a0bd3d5..02481597f 100644
--- a/Documentation/intro/install/fedora.rst
+++ b/Documentation/intro/install/fedora.rst
@@ -132,36 +132,12 @@ tests.  This can take several minutes.
 $ make rpm-fedora RPMBUILD_OPT="--with check"
 
 
-Kernel OVS Tree Datapath RPM
-
-
-To build the Open vSwitch kernel module for the currently running kernel
-version, run:
-
-::
-
-$ make rpm-fedora-kmod
-
-To build the Open vSwitch kernel module for another kernel version, the desired
-kernel version can be specified via the `kversion` macro.  For example:
-
-::
-
-$ make rpm-fedora-kmod \
- RPMBUILD_OPT='-D "kversion 4.3.4-300.fc23.x86_64"'
-
 Installing
 --
 
 RPM packages can be installed by using the command ``rpm -i``. Package
 installation requires superuser privileges.
 
-The `openvswitch-kmod` RPM should be installed first if the Linux OVS tree
-datapath module is to be used. The `openvswitch-kmod` RPM should not be
-installed if only the in-tree Linux datapath or user-space datapath is needed.
-Refer to the :doc:`/faq/index` for more information about the various Open
-vSwitch datapath options.
-
 In most cases only the `openvswitch` RPM will need to be installed. The
 `python3-openvswitch`, `openvswitch-test`, `openvswitch-devel`, and
 `openvswitch-debuginfo` RPMs are optional unless required for a specific
diff --git a/Documentation/intro/install/general.rst 
b/Documentation/intro/install/general.rst
index a297aadac..c2208bbed 100644
--- a/Documentation/intro/install/general.rst
+++ b/Documentation/intro/install/general.rst
@@ -302,24 +302,6 @@ example::
 
 $ ./configure CFLAGS="-g -O2 -fsanitize=address -fno-omit-frame-pointer 
-fno-common"
 
-To build the Linux kernel module, so that you can run the kernel-based switch,
-pass the location of the kernel build directory on ``--with-linux``. For
-example, to build for a running instance of Linux::
-
-$ ./configure --with-linux=/lib/modules/$(uname -r)/build
-
-.. note::
-  If ``--with-linux`` requests building for an unsupported version of Linux,
-  then ``configure`` will fail with an error message. Refer to the
-  :doc:`/faq/index` for advice in that case.
-
-If you wish to build the kernel module for an architecture other than the
-architecture of the machine used for the build, you may specify the kernel
-architecture 

[ovs-dev] [PATCH 1/6] acinclude.m4: Remove support for building the OVS kernel module

2022-07-06 Thread Greg Rose
Since the openvswitch project inception it has had support for building
a Linux kernel module to support the OVS kernel datapath.  Since Linux
kernel release 5.8 support for newer kernels has been deprecated.  Now
is the time to fully discontinue support for building the openvswitch
kernel driver. Since Linux 5.9 the Linux built-in openvswitch kernel
driver supports all necessary features and functions of the kernel
datapath and the need to support this additional "out of tree" kernel
module is gone.

Remove the --with-linux configuration support from the acinclude.m4
configuration and warn user it is not supported any longer.

Signed-off-by: Greg Rose 
---
 acinclude.m4 | 683 +--
 1 file changed, 1 insertion(+), 682 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index b518aa624..30ee143e4 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -115,132 +115,11 @@ dnl OVS_CHECK_LINUX
 dnl
 dnl Configure linux kernel source tree
 AC_DEFUN([OVS_CHECK_LINUX], [
-  AC_ARG_WITH([linux],
-  [AC_HELP_STRING([--with-linux=/path/to/linux],
-  [Specify the Linux kernel build directory])])
-  AC_ARG_WITH([linux-source],
-  [AC_HELP_STRING([--with-linux-source=/path/to/linux-source],
-  [Specify the Linux kernel source directory
-   (usually figured out automatically from build
-   directory)])])
-
-  # Deprecated equivalents to --with-linux, --with-linux-source.
-  AC_ARG_WITH([l26])
-  AC_ARG_WITH([l26-source])
-
   if test X"$with_linux" != X; then
-KBUILD=$with_linux
-AC_MSG_WARN([--with-linux is deprecated and kernel support is limited to 
5.8 and below])
-  elif test X"$with_l26" != X; then
-KBUILD=$with_l26
-AC_MSG_WARN([--with-l26 is deprecated, please use --with-linux instead])
-  else
+AC_MSG_WARN([--with-linux is no longer supported])
 KBUILD=
   fi
 
-  if test X"$KBUILD" != X; then
-if test X"$with_linux_source" != X; then
-  KSRC=$with_linux_source
-elif test X"$with_l26_source" != X; then
-  KSRC=$with_l26_source
-  AC_MSG_WARN([--with-l26-source is deprecated, please use 
--with-linux-source instead])
-else
-  KSRC=
-fi
-  elif test X"$with_linux_source" != X || test X"$with_l26_source" != X; then
-AC_MSG_ERROR([Linux source directory may not be specified without Linux 
build directory])
-  fi
-
-  if test -n "$KBUILD"; then
-KBUILD=`eval echo "$KBUILD"`
-case $KBUILD in
-/*) ;;
-*) KBUILD=`pwd`/$KBUILD ;;
-esac
-
-# The build directory is what the user provided.
-# Make sure that it exists.
-AC_MSG_CHECKING([for Linux build directory])
-if test -d "$KBUILD"; then
-AC_MSG_RESULT([$KBUILD])
-AC_SUBST(KBUILD)
-else
-AC_MSG_RESULT([no])
-AC_ERROR([source dir $KBUILD doesn't exist])
-fi
-
-# Debian breaks kernel headers into "source" header and "build" headers.
-# We want the source headers, but $KBUILD gives us the "build" headers.
-# Use heuristics to find the source headers.
-AC_MSG_CHECKING([for Linux source directory])
-if test -n "$KSRC"; then
-  KSRC=`eval echo "$KSRC"`
-  case $KSRC in
-  /*) ;;
-  *) KSRC=`pwd`/$KSRC ;;
-  esac
-  if test ! -e $KSRC/include/linux/kernel.h; then
-AC_MSG_ERROR([$KSRC is not a kernel source directory])
-  fi
-else
-  KSRC=$KBUILD
-  if test ! -e $KSRC/include/linux/kernel.h; then
-# Debian kernel build Makefiles tend to include a line of the form:
-# MAKEARGS := -C /usr/src/linux-headers-3.2.0-1-common 
O=/usr/src/linux-headers-3.2.0-1-486
-# First try to extract the source directory from this line.
-KSRC=`sed -n 's/.*-C \([[^ ]]*\).*/\1/p' "$KBUILD"/Makefile`
-if test ! -e "$KSRC"/include/linux/kernel.h; then
-  # Didn't work.  Fall back to name-based heuristics that used to work.
-  case `echo "$KBUILD" | sed 's,/*$,,'` in # (
-*/build)
-  KSRC=`echo "$KBUILD" | sed 's,/build/*$,/source,'`
-  ;; # (
-*)
-  KSRC=`(cd $KBUILD && pwd -P) | sed 's,-[[^-]]*$,-common,'`
-  ;;
-  esac
-fi
-  fi
-  if test ! -e "$KSRC"/include/linux/kernel.h; then
-AC_MSG_ERROR([cannot find source directory (please use 
--with-linux-source)])
-  fi
-fi
-AC_MSG_RESULT([$KSRC])
-
-AC_MSG_CHECKING([for kernel version])
-version=`sed -n 's/^VERSION = //p' "$KSRC/Makefile"`
-patchlevel=`sed -n 's/^PATCHLEVEL = //p' "$KSRC/Makefile"`
-sublevel=`sed -n 's/^SUBLEVEL = //p' "$KSRC/Makefile"`
-if test X"$version" = X || test X"$patchlevel" = X; then
-   AC_ERROR([cannot determine kernel version])
-elif test X"$sublevel" = X; then
-   kversion=$version.$patchlevel
-else
-   

[ovs-dev] [PATCH 3/6] rhel: Remove RHEL 6 kernel module spec

2022-07-06 Thread Greg Rose
Remove the RHEL 6 kernel driver module specification.

Signed-off-by: Greg Rose 
---
 rhel/automake.mk|   5 --
 rhel/kmod-openvswitch-rhel6.spec.in | 123 
 2 files changed, 128 deletions(-)
 delete mode 100644 rhel/kmod-openvswitch-rhel6.spec.in

diff --git a/rhel/automake.mk b/rhel/automake.mk
index 235779b49..e0f5ec013 100644
--- a/rhel/automake.mk
+++ b/rhel/automake.mk
@@ -15,8 +15,6 @@ EXTRA_DIST += \
rhel/etc_sysconfig_network-scripts_ifup-ovs \
rhel/openvswitch-dkms.spec \
rhel/openvswitch-dkms.spec.in \
-   rhel/kmod-openvswitch-rhel6.spec \
-   rhel/kmod-openvswitch-rhel6.spec.in \
rhel/openvswitch.spec \
rhel/openvswitch.spec.in \
rhel/openvswitch-fedora.spec \
@@ -42,9 +40,6 @@ update_rhel_spec = \
 $(srcdir)/rhel/openvswitch-dkms.spec: rhel/openvswitch-dkms.spec.in 
$(top_builddir)/config.status
$(update_rhel_spec)
 
-$(srcdir)/rhel/kmod-openvswitch-rhel6.spec: 
rhel/kmod-openvswitch-rhel6.spec.in $(top_builddir)/config.status
-   $(update_rhel_spec)
-
 $(srcdir)/rhel/openvswitch.spec: rhel/openvswitch.spec.in 
$(top_builddir)/config.status
$(update_rhel_spec)
 
diff --git a/rhel/kmod-openvswitch-rhel6.spec.in 
b/rhel/kmod-openvswitch-rhel6.spec.in
deleted file mode 100644
index de69863d7..0
--- a/rhel/kmod-openvswitch-rhel6.spec.in
+++ /dev/null
@@ -1,123 +0,0 @@
-# Spec file for Open vSwitch kernel modules on Red Hat Enterprise
-# Linux 6.
-
-# Copyright (C) 2011, 2012, 2018 Nicira, Inc.
-#
-# Copying and distribution of this file, with or without modification,
-# are permitted in any medium without royalty provided the copyright
-# notice and this notice are preserved.  This file is offered as-is,
-# without warranty of any kind.
-
-%define oname openvswitch
-%{!?release_number:%define release_number 1}
-
-Name:   kmod-%{oname}
-Version:@VERSION@
-Release:%{release_number}%{?dist}
-Summary:Open vSwitch kernel module
-
-Group:  System/Kernel
-License:GPLv2
-URL:http://openvswitch.org/
-Source0:%{oname}-%{version}.tar.gz
-BuildRoot:  %(mktemp -ud %{_tmppath}/%{name}-%{version}-%{release}-XX)
-
-# Without this we get an empty openvswitch-debuginfo package (whose name
-# conflicts with the openvswitch-debuginfo package for OVS userspace).
-%undefine _enable_debug_packages
-
-%define kernel_source_extended() /usr/src/kernels/%{2}$([ %{1} = default ] || 
echo ".%{1}")
-
-# Use -D 'kversion 2.6.32-131.6.1.el6.x86_64' to build package
-# for specified kernel version.
-# Use -D 'kversion 3.10.0-693.1.1.el7.x86_64 3.10.0-693.17.1.el7.x86_64'
-# to build package for mulitple kernel versions in the same package
-# This only works for kernel 3.10.0 major revision 693 (RHEL 7.4)
-# and major revision 327 (RHEL 7.2)
-# By default, build against the latest installed kernel-devel
-%{!?kversion:%global kversion %(rpm -qa | egrep "^kernel(-rt|-aarch64)?-devel" 
| /usr/lib/rpm/redhat/rpmsort -r | head -n 1| sed "s/^kernel.*-devel-//")}
-
-# Use -D 'kflavors default debug kdump' to build packages for
-# specified kernel variants.
-%{!?kflavors:%global kflavors default}
-
-%description
-Open vSwitch Linux kernel module.
-
-%prep
-
-%setup -n %{oname}-%{version}
-
-%build
-for kv in %{kversion}; do
-for flavor in %{kflavors}; do
-mkdir -p _$flavor/_$kv
-(cd _$flavor/_$kv && ../../configure 
--with-linux="%{kernel_source_extended $flavor $kv}")
-%{__make} -C _$flavor/_$kv/datapath/linux %{?_smp_mflags}
-done
-done
-
-%install
-export INSTALL_MOD_PATH=$RPM_BUILD_ROOT
-export INSTALL_MOD_DIR=extra/%{oname}
-for kv in %{kversion}; do
-for flavor in %{kflavors} ; do
-make -C %{kernel_source_extended $flavor $kv} modules_install \
-M="`pwd`"/_$flavor/_$kv/datapath/linux
-# Cleanup unnecessary kernel-generated module dependency files.
-find $INSTALL_MOD_PATH/lib/modules -iname 'modules.*' -exec rm {} \;
-done
-done
-install -d %{buildroot}%{_sysconfdir}/depmod.d/
-for kv in %{kversion}; do
-for module in %{buildroot}/lib/modules/$kv/$INSTALL_MOD_DIR/*.ko;
-do
-modname="$(basename ${module})"
-grep -qsPo "^\s*override ${modname%.ko} \* extra\/%{oname}" 
%{oname}.conf || \
-echo "override ${modname%.ko} * extra/%{oname}" >> %{oname}.conf
-grep -qsPo "^\s*override ${modname%.ko} \* weak-updates\/%{oname}" 
%{oname}.conf || \
-echo "override ${modname%.ko} * weak-updates/%{oname}" >> 
%{oname}.conf
-done
-done
-install -m 644 %{oname}.conf %{buildroot}%{_sysconfdir}/depmod.d/
-install -d -m 0755 $RPM_BUILD_ROOT/usr/share/%{oname}/scripts
-install -p -m 0755 rhel/usr_share_openvswitch_scripts_ovs-kmod-manage.sh \
-$RPM_BUILD_ROOT/usr/share/%{oname}/scripts/ovs-kmod-manage.sh
-
-%post
-current_kernel=$(uname -r)
-IFS=. read installed_major installed_minor installed_micro 

[ovs-dev] [PATCH 2/6] rhel: Remove kernel mode spec

2022-07-06 Thread Greg Rose
Remove the kernel driver specification for RHEL 7.x, 8.x and Fedora.

Signed-off-by: Greg Rose 
---
 rhel/automake.mk |  12 ---
 rhel/openvswitch-kmod-fedora.spec.in | 152 ---
 2 files changed, 164 deletions(-)
 delete mode 100644 rhel/openvswitch-kmod-fedora.spec.in

diff --git a/rhel/automake.mk b/rhel/automake.mk
index c75406e05..235779b49 100644
--- a/rhel/automake.mk
+++ b/rhel/automake.mk
@@ -17,8 +17,6 @@ EXTRA_DIST += \
rhel/openvswitch-dkms.spec.in \
rhel/kmod-openvswitch-rhel6.spec \
rhel/kmod-openvswitch-rhel6.spec.in \
-   rhel/openvswitch-kmod-fedora.spec \
-   rhel/openvswitch-kmod-fedora.spec.in \
rhel/openvswitch.spec \
rhel/openvswitch.spec.in \
rhel/openvswitch-fedora.spec \
@@ -47,9 +45,6 @@ $(srcdir)/rhel/openvswitch-dkms.spec: 
rhel/openvswitch-dkms.spec.in $(top_buildd
 $(srcdir)/rhel/kmod-openvswitch-rhel6.spec: 
rhel/kmod-openvswitch-rhel6.spec.in $(top_builddir)/config.status
$(update_rhel_spec)
 
-$(srcdir)/rhel/openvswitch-kmod-fedora.spec: 
rhel/openvswitch-kmod-fedora.spec.in $(top_builddir)/config.status
-   $(update_rhel_spec)
-
 $(srcdir)/rhel/openvswitch.spec: rhel/openvswitch.spec.in 
$(top_builddir)/config.status
$(update_rhel_spec)
 
@@ -67,10 +62,3 @@ rpm-fedora: dist $(srcdir)/rhel/openvswitch-fedora.spec
  -D "_topdir ${RPMBUILD_TOP}" \
  -ba $(srcdir)/rhel/openvswitch-fedora.spec
 
-# Build kernel datapath RPM
-rpm-fedora-kmod: dist $(srcdir)/rhel/openvswitch-kmod-fedora.spec
-   ${MKDIR_P} ${RPMBUILD_TOP}/SOURCES
-   cp ${DIST_ARCHIVES} ${RPMBUILD_TOP}/SOURCES
-   rpmbuild -D "kversion $(shell uname -r)" ${RPMBUILD_OPT} \
- -D "_topdir ${RPMBUILD_TOP}" \
- -ba $(srcdir)/rhel/openvswitch-kmod-fedora.spec
diff --git a/rhel/openvswitch-kmod-fedora.spec.in 
b/rhel/openvswitch-kmod-fedora.spec.in
deleted file mode 100644
index e5f78701f..0
--- a/rhel/openvswitch-kmod-fedora.spec.in
+++ /dev/null
@@ -1,152 +0,0 @@
-# Spec file for Open vSwitch.
-
-# Copyright (C) 2009, 2010, 2015, 2018 Nicira Networks, Inc.
-#
-# Copying and distribution of this file, with or without modification,
-# are permitted in any medium without royalty provided the copyright
-# notice and this notice are preserved.  This file is offered as-is,
-# without warranty of any kind.
-
-%global debug_package %{nil}
-
-# Use the kversion macro such as
-# RPMBUILD_OPT='-D "kversion 3.10.0-693.1.1.el7.x86_64 
3.10.0-693.17.1.el7.x86_64"'
-# to build package for mulitple kernel versions in the same package
-# This only works for the following kernels.
-#   - 3.10.0 major revision 327  (RHEL 7.2)
-#   - 3.10.0 major revision 693  (RHEL 7.4)
-#   - 3.10.0 major revision 957  (RHEL 7.6)
-#   - 3.10.0 major revision 1062 (RHEL 7.7)
-#   - 3.10.0 major revision 1101 (RHEL 7.8 Beta)
-#   - 3.10.0 major revision 1127 (RHEL 7.8 GA)
-#   - 3.10.0 major revision 1160 (RHEL 7.9 GA)
-# By default, build against the current running kernel version
-#%define kernel 3.1.5-1.fc16.x86_64
-#define kernel %{kernel_source}
-%{?kversion:%define kernel %kversion}
-
-%{!?release_number:%define release_number 1}
-
-Name: openvswitch-kmod
-Summary: Open vSwitch Kernel Modules
-Group: System Environment/Daemons
-URL: http://www.openvswitch.org/
-Vendor: OpenSource Security Ralf Spenneberg 
-Version: @VERSION@
-
-# The entire source code is ASL 2.0 except datapath/ which is GPLv2
-License: GPLv2
-Release: %{release_number}%{?dist}
-Source: openvswitch-%{version}.tar.gz
-#Source1: openvswitch-init
-Buildroot: /tmp/openvswitch-xen-rpm
-Provides: kmod-openvswitch
-Obsoletes: kmod-openvswitch < %{version}-%{release}
-
-%description
-Open vSwitch provides standard network bridging functions augmented with
-support for the OpenFlow protocol for remote per-flow control of
-traffic. This package contains the kernel modules.
-
-%prep
-%setup -q -n openvswitch-%{version}
-
-%build
-for kv in %{kversion}; do
-mkdir -p _$kv
-(cd _$kv && /bin/cp -f ../configure . && %configure --srcdir=.. \
---with-linux=/lib/modules/${kv}/build --enable-ssl 
%{_ovs_config_extra_flags})
-make %{_smp_mflags} -C _$kv/datapath/linux
-done
-
-%install
-export INSTALL_MOD_DIR=extra/openvswitch
-rm -rf $RPM_BUILD_ROOT
-for kv in %{kversion}; do
-make INSTALL_MOD_PATH=$RPM_BUILD_ROOT -C _$kv/datapath/linux 
modules_install
-done
-mkdir -p $RPM_BUILD_ROOT/etc/depmod.d
-for kv in %{kversion}; do
-for module in $RPM_BUILD_ROOT/lib/modules/${kv}/extra/openvswitch/*.ko
-do
-modname="$(basename ${module})"
-grep -qsPo "^\s*override ${modname%.ko} \* extra\/openvwitch" \
-$RPM_BUILD_ROOT/etc/depmod.d/kmod-openvswitch.conf || \
-echo "override ${modname%.ko} * extra/openvswitch" >> \
-$RPM_BUILD_ROOT/etc/depmod.d/kmod-openvswitch.conf
-grep -qsPo "^\s*override ${modname%.ko} \* 

[ovs-dev] [PATCH 0/6] Remove OVS kernel driver

2022-07-06 Thread Greg Rose
It is time to remove support for the OVS kernel driver and push
towards use of the upstream Linux openvswitch kernel driver
in it's place [1].

There are many Linux specific source modules in the datapath that
will need eventual removal but some headers are still required for
the userspace code (which seems counterintuitive but...)

1.  https://mail.openvswitch.org/pipermail/ovs-dev/2022-April/393292.html

Greg Rose (6):
  acinclude.m4: Remove support for building the OVS kernel module
  rhel: Remove kernel mode spec
  rhel: Remove RHEL 6 kernel module spec
  tests: Remove support for check-kmod test
  Documentation: Remove kernel module documentation
  Disable unsupported kernel builds

 .github/workflows/build-and-test.yml  |  53 --
 Documentation/faq/releases.rst|   5 +-
 .../contributing/backporting-patches.rst  |   7 +
 Documentation/intro/install/fedora.rst|  24 -
 Documentation/intro/install/general.rst   |  63 --
 acinclude.m4  | 683 +-
 rhel/automake.mk  |  17 -
 rhel/kmod-openvswitch-rhel6.spec.in   | 123 
 rhel/openvswitch-kmod-fedora.spec.in  | 152 
 tests/automake.mk |   6 -
 10 files changed, 11 insertions(+), 1122 deletions(-)
 delete mode 100644 rhel/kmod-openvswitch-rhel6.spec.in
 delete mode 100644 rhel/openvswitch-kmod-fedora.spec.in

-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn 1/3] Don't save original dst IP and Port to avoid megaflow unwildcarding.

2022-07-06 Thread Dumitru Ceara
Hi Han,

On 7/6/22 00:41, Han Zhou wrote:
> The ls_in_pre_stateful priority 120 flow that saves dst IP and Port to
> registers is causing a critical dataplane performance impact to
> short-lived connections, because it unwildcards megaflows with exact
> match on dst IP and L4 ports. Any new connections with a different
> client side L4 port will encounter datapath flow miss and upcall to
> ovs-vswitchd.
> 
> These fields (dst IP and port) were saved to registers to solve a
> problem of LB hairpin use case when different VIPs are sharing
> overlapping backend+port [0]. The change [0] might not have as wide
> performance impact as it is now because at that time one of the match
> condition "REGBIT_CONNTRACK_NAT == 1" was set only for established and
> natted traffic, while now the impact is more obvious because
> REGBIT_CONNTRACK_NAT is now set for all IP traffic (if any VIP
> configured on the LS) since commit [1], after several other indirectly
> related optimizations and refactors.
> 
> Since the changes that introduced the performance problem had their
> own values (fixes problems or optimizes performance), so we don't want
> to revert any of the changes (and it is also not straightforward to
> revert any of them because there have been lots of changes and refactors
> on top of them).
> 
> Change [0] itself has added an alternative way to solve the overlapping
> backends problem, which utilizes ct fields instead of saving dst IP and
> port to registers. This patch forces to that approach and removes the
> flows/actions that saves the dst IP and port to avoid the dataplane
> performance problem for short-lived connections.
> 
> (With this approach, the ct_state DNAT is not HW offload friendly, so it
> may result in those flows not being offloaded, which is supposed to be
> solved in a follow-up patch)
> 
> [0] ce0ef8d59850 ("Properly handle hairpin traffic for VIPs with shared 
> backends.")
> [1] 0038579d1928 ("northd: Optimize ct nat for load balancer traffic.")
> 
> Signed-off-by: Han Zhou 
> ---

I think the main concern I have is that this forces us to choose between:
a. non hwol friendly flows (reduced performance)
b. less functionality (with the knob in patch 3/3 set to false).

Change [0] was added to address the case when a service in kubernetes is
exposed via two different k8s services objects that share the same
endpoints.  That translates in ovn-k8s to two different OVN load
balancer VIPs that share the same backends.  For such cases, if the
service is being accessed by one of its own backends we need to be able
to differentiate based on the VIP address it used to connect to.

CC: Tim Rozet, Dan Williams for some more input from the ovn-k8s side on
how common it is that an OVN-networked pod accesses two (or more)
services that might have the pod itself as a backend.

If this turns out to be mandatory I guess we might want to also look
into alternatives like:
- getting help from the HW to offload matches like ct_tuple()
- limiting the impact of "a." only to some load balancers (e.g., would
it help to use different hairpin lookup tables for such load balancers?)

Thanks,
Dumitru

>  controller/lflow.c   |  74 ++--
>  include/ovn/logical-fields.h |   5 -
>  lib/lb.c |   3 -
>  lib/lb.h |   3 -
>  northd/northd.c  |  95 ---
>  northd/ovn-northd.8.xml  |  30 +
>  tests/ovn-northd.at  |  78 -
>  tests/ovn.at | 218 +--
>  8 files changed, 141 insertions(+), 365 deletions(-)
> 
> diff --git a/controller/lflow.c b/controller/lflow.c
> index 934b23698..a44f6d056 100644
> --- a/controller/lflow.c
> +++ b/controller/lflow.c
> @@ -1932,10 +1932,6 @@ add_lb_vip_hairpin_reply_action(struct in6_addr *vip6, 
> ovs_be32 vip,
>  }
>  
>  /* Adds flows to detect hairpin sessions.
> - *
> - * For backwards compatibilty with older ovn-northd versions, uses
> - * ct_nw_dst(), ct_ipv6_dst(), ct_tp_dst(), otherwise uses the
> - * original destination tuple stored by ovn-northd.
>   */
>  static void
>  add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
> @@ -1956,10 +1952,8 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
>  /* Matching on ct_nw_dst()/ct_ipv6_dst()/ct_tp_dst() requires matching
>   * on ct_state first.
>   */
> -if (!lb->hairpin_orig_tuple) {
> -uint32_t ct_state = OVS_CS_F_TRACKED | OVS_CS_F_DST_NAT;
> -match_set_ct_state_masked(_match, ct_state, ct_state);
> -}
> +uint32_t ct_state = OVS_CS_F_TRACKED | OVS_CS_F_DST_NAT;
> +match_set_ct_state_masked(_match, ct_state, ct_state);
>  
>  if (IN6_IS_ADDR_V4MAPPED(_vip->vip)) {
>  ovs_be32 bip4 = in6_addr_get_mapped_ipv4(_backend->ip);
> @@ -1971,14 +1965,7 @@ add_lb_vip_hairpin_flows(struct ovn_controller_lb *lb,
>  match_set_dl_type(_match, htons(ETH_TYPE_IP));
>  match_set_nw_src(_match, bip4);
>  

Re: [ovs-dev] [PATCH v4 10/14] userspace: Enable IP checksum offloading by default.

2022-07-06 Thread Mike Pattrick
On Wed, Jul 6, 2022 at 9:02 AM David Marchand  wrote:
>
> On Mon, Jul 4, 2022 at 10:24 PM David Marchand
>  wrote:
> > > diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> > > index 57f94df54..164738503 100644
> > > --- a/ofproto/ofproto-dpif-upcall.c
> > > +++ b/ofproto/ofproto-dpif-upcall.c
> > > @@ -215,7 +215,7 @@ struct upcall {
> > >  enum odp_key_fitness fitness;  /* Fitness of 'flow' relative to ODP 
> > > key. */
> > >  const ovs_u128 *ufid;  /* Unique identifier for 'flow'. */
> > >  unsigned pmd_id;   /* Datapath poll mode driver id. */
> > > -const struct dp_packet *packet;   /* Packet associated with this 
> > > upcall. */
> > > +struct dp_packet *packet;  /* Packet associated with this 
> > > upcall. */
> > >  ofp_port_t ofp_in_port;/* OpenFlow in port, or OFPP_NONE. */
> > >  uint16_t mru;  /* If !0, Maximum receive unit of
> > >fragmented IP packet */
> > > @@ -395,7 +395,7 @@ static void delete_op_init(struct udpif *udpif, 
> > > struct ukey_op *op,
> > > struct udpif_key *ukey);
> > >
> > >  static int upcall_receive(struct upcall *, const struct dpif_backer *,
> > > -  const struct dp_packet *packet, enum 
> > > dpif_upcall_type,
> > > +  struct dp_packet *packet, enum 
> > > dpif_upcall_type,
> > >const struct nlattr *userdata, const struct 
> > > flow *,
> > >const unsigned int mru,
> > >const ovs_u128 *ufid, const unsigned pmd_id);
> > > @@ -1140,7 +1140,7 @@ compose_slow_path(struct udpif *udpif, struct 
> > > xlate_out *xout,
> > >   * since the 'upcall->put_actions' remains uninitialized. */
> > >  static int
> > >  upcall_receive(struct upcall *upcall, const struct dpif_backer *backer,
> > > -   const struct dp_packet *packet, enum dpif_upcall_type 
> > > type,
> > > +   struct dp_packet *packet, enum dpif_upcall_type type,
> > > const struct nlattr *userdata, const struct flow *flow,
> > > const unsigned int mru,
> > > const ovs_u128 *ufid, const unsigned pmd_id)
> > > @@ -1336,7 +1336,7 @@ should_install_flow(struct udpif *udpif, struct 
> > > upcall *upcall)
> > >  }
> > >
> > >  static int
> > > -upcall_cb(const struct dp_packet *packet, const struct flow *flow, 
> > > ovs_u128 *ufid,
> > > +upcall_cb(struct dp_packet *packet, const struct flow *flow, ovs_u128 
> > > *ufid,
> > >unsigned pmd_id, enum dpif_upcall_type type,
> > >const struct nlattr *userdata, struct ofpbuf *actions,
> > >struct flow_wildcards *wc, struct ofpbuf *put_actions, void 
> > > *aux)
> > > @@ -1446,7 +1446,7 @@ static int
> > >  process_upcall(struct udpif *udpif, struct upcall *upcall,
> > > struct ofpbuf *odp_actions, struct flow_wildcards *wc)
> > >  {
> > > -const struct dp_packet *packet = upcall->packet;
> > > +struct dp_packet *packet = upcall->packet;
> > >  const struct flow *flow = upcall->flow;
> > >  size_t actions_len = 0;
> > >
> > > @@ -1524,6 +1524,10 @@ process_upcall(struct udpif *udpif, struct upcall 
> > > *upcall,
> > >  break;
> > >  }
> > >
> > > +/* The packet is going to be encapsulated and sent to
> > > + * the controller. */
> > > +dp_packet_ol_send_prepare(packet, 0);
> >
> > Having to touch the packet data this late is scary, but I don't know a
> > better place for now.
>
> We can move it to dp_netdev_upcall which only affects the userspace datapath.
> Then the upcall handling code gets "complete" packet data, the same as
> when it gets invoked in the kernel datapath case.
>

I think this solution also works; if it's preferred stylistically then
I will make the change for the v5.

Regarding the other comments, I generally agree with the areas you
identified for refactor but think that it would fit best as a later
patch. A lot of this code is modified throughout the patchset, so
applying it now would be difficult.

I propose the next version including:

- conntrack comment
- { on newline
- reset dpdk ol_flags
- relocating dp_packet_ol_send_prepare

And the remaining changes to be added later.

LMKWYT

Cheers,
M

>
> --
> David Marchand
>

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v5 1/2] handlers: Create additional handler threads when using CPU isolation

2022-07-06 Thread Aaron Conole
Michael Santana  writes:

> On 7/4/22 09:46, Ilya Maximets wrote:
>> Hi, Michael.  Thanks for the new version!
> Hi Ilya, thanks for the response
>> On 6/6/22 20:59, Michael Santana wrote:
>>> Additional threads are required to service upcalls when we have CPU
>>> isolation (in per-cpu dispatch mode). The reason additional threads
>>> are required is because it creates a more fair distribution.
>> I think, the description above lacks the information on why we need
>> to create additional threads, and why we choose the number to be a
>> prime number.  Yes, you said that it creates a more fair distribution,
>> but we probably need to describe that more with an example.
>> 
>>> With more
>>> threads we decrease the load of each thread as more threads would
>>> decrease the number of cores each threads is assigned.
>> While that is true, it's not obvious why increasing the number of
>> threads beyond the number of available cores is good for us.
>> The key factor is more fair load distribution among threads, so all
>> cores are utilized, but that is not highlighted.
> Yes agreed.
>
> We want additional threads because we want to minimize the number of
> cores assigned to individual threads. 75 handler threads servicing 100 
> cores is more optimal than 50 threads servicing 100 cores. This is
> because on the 75 handler case, each thread would have to service on 
> average 1.33 cores where as the 50 handler case each thread would have
> to service 2 cores. Less cores assigned to individual threads less to 
> less work that thread has to do. This example ignores the number of
> actual cores available to use for OVS userspace.
>
> On the flip side, we do not wish to create too many threads as we fear
> we end up in a case where we have too many threads and not enough
> active cores OVS can use
>
> Which brings me to my last point. The prime schema is arbitrary (or
> good enough for now). We really just want more threads and there is no
> reason why prime schema is better than the next (at least not without
> knowing how many threads we can get away with adding before kernel
> overhead hurts performance). I think you had mentioned it several
> times but I think we need to do testing to figure out exactly how many
> threads we can add before kernel overhead hurts performance. I
> speculate the prime schema is on the low end and I think that we can
> add more threads without hurting performance than the prime schema
> will allow. We can look at how the upcalls/s rate changes based on the
> number of handlers and the number of active cores. The prime schema is
> an easy solution but it is a little driving blindly without knowing
> more stats.

Let's also not forget that a user tells OVS to be bound only to certain
cores, and will be slightly confused when there are more handler threads
than the number of isolated CPUs.  This will be confusing no matter
what schema is chosen - so it will be important to have some good
documentation, and perhaps an INFO log that tells something about this.

> On the other hand, stats might be different from one system to
> another. What might be good on my test system doesn't necessarily
> translate to another system. LMK what you think about this, if this is
> worth the effort
>
> The prime schema is sufficient as is. I just think that we can improve
> it a little bit. But we dont have to go down that road if need be
>
>
>
>
>> 
>>> Adding as many
>>> threads are there are cores could have a performance hit when the number
>>> of active cores (which all threads have to share) is very low. For this
>>> reason we avoid creating as many threads as there are cores and instead
>>> meet somewhere in the middle.
>>>
>>> The formula used to calculate the number of handler threads to create
>>> is as follows:
>>>
>>> handlers_n = min(next_prime(active_cores+1), total_cores)
>>>
>>> Where next_prime is a function that returns the next numeric prime
>>> number that is strictly greater than active_cores
>>>
>>> Assume default behavior when total_cores <= 2, that is do not create
>>> additional threads when we have less than 2 total cores on the system
>>>
>>> Fixes: b1e517bd2f81 ("dpif-netlink: Introduce per-cpu upcall dispatch.")
>>> Signed-off-by: Michael Santana 
>>> ---
>>>   lib/dpif-netlink.c | 86 --
>>>   lib/ovs-thread.c   | 16 +
>>>   lib/ovs-thread.h   |  1 +
>>>   3 files changed, 101 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
>>> index 06e1e8ca0..e948a829b 100644
>>> --- a/lib/dpif-netlink.c
>>> +++ b/lib/dpif-netlink.c
>>> @@ -31,6 +31,7 @@
>>>   #include 
>>>   #include 
>>>   #include 
>>> +#include 
>>> #include "bitmap.h"
>>>   #include "dpif-netlink-rtnl.h"
>>> @@ -2506,6 +2507,87 @@ dpif_netlink_handler_uninit(struct dpif_handler 
>>> *handler)
>>>   }
>>>   #endif
>>>   +/*
>>> + * Returns 1 if num is a prime number,
>>> + * Otherwise return 0
>>> + */
>>> +static uint32_t

Re: [ovs-dev] [RFC PATCH 2/2] rhel: use the complete version as OVS version

2022-07-06 Thread Aaron Conole
Timothy Redaelli  writes:

> Since on CentOS/RHEL the builds are based on stable branches and not on
> tags for debugging purpose it's better to have the downstream version as
> version so it's easier to know which commits are included in a build.
>
> This commit changes the version of OVS on Fedora/CentOS/RHEL to be aligned
> with the downstream one.
>
> Signed-off-by: Timothy Redaelli 
> ---
>  rhel/openvswitch-fedora.spec.in | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/rhel/openvswitch-fedora.spec.in b/rhel/openvswitch-fedora.spec.in
> index 16ef1ac3a..f8398c74c 100644
> --- a/rhel/openvswitch-fedora.spec.in
> +++ b/rhel/openvswitch-fedora.spec.in
> @@ -162,6 +162,9 @@ This package provides IPsec tunneling support for OVS 
> tunnels.
>  %setup -q
>  
>  %build
> +# Append release to version
> +sed -i -e "s/^AC_INIT(openvswitch,.*,/AC_INIT(openvswitch, 
> %{version}-%{release},/" configure.ac
> +

If we're making this change here, why not also just run sed across the
.te.in file (from patch 1/2) rather than making that change a dist
wide one.

We could simply add something like (NOTE: the syntax is untested):

 sed -i -e "s/%{version}-%{release}/%{version}/" selinux/openvswitch-custom.te

Maybe there's also an alternative where we use an extra_dist kind of
variable to create a version file that we use in those places we will
want to use the full version?

>  %configure \
>  %if %{with libcapng}
>  --enable-libcapng \

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] tc: Support masks for tunnel destination ports.

2022-07-06 Thread Roi Dayan via dev




On 2022-07-06 3:34 PM, Roi Dayan wrote:



On 2022-07-06 1:26 PM, Eelco Chaudron wrote:



On 5 Jul 2022, at 0:45, Ilya Maximets wrote:


TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may
create a masked match on this field.

This change is important as we're not clearing the masks which wasn't
really used, so if OVS requests match on ports, we should use the
mask and clear, otherwise offloading will fail.

Signed-off-by: Ilya Maximets 


Changes look good to me, and all system-traffic.at tests that were 
passing without the change are still passing, including the failing 
ERSPAN ones.


Acked-by: Eelco Chaudron 



can we postpone to merge this?
After this patch simple vxlan rules won't have enc dst port on decap
rules and currently mlx5 driver must have enc dst port to do the
offload of the decap rule.
I'm checking about this limitation and if we can remove it but
if we can postpone a bit so we won't break users with our nics..

thanks,
Roi



I finished with the testing and checking I needed.
I don't have a problem with the patch. if something else will raise i'll
start a different discussion.

Acked-by: Roi Dayan 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] netdev-offload-tc: Fix ignoring unknown tunnel keys.

2022-07-06 Thread Roi Dayan via dev




On 2022-07-05 1:45 AM, Ilya Maximets wrote:

Current offloading code supports only limited number of tunnel keys
and silently ignores everything it doesn't understand.  This is
causing, for example, offloaded ERSPAN tunnels to not work, because
flow is offloaded, but ERSPAN options are not provided to TC.

There is a number of tunnel keys, which are supported by the userspace,
but silently ignored during offloading:

   OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT
   OVS_TUNNEL_KEY_ATTR_OAM
   OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS
   OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS

OVS_TUNNEL_KEY_ATTR_CSUM is kind of supported, but only for actions
and for some reason is set from the tunnel port instead of the
provided action, and not currently supported for the tunnel key in
the match.

Addig a default case to fail offloading of unknown attributes.  For
now explicitly allowing incorrect behavior for the DONT_FRAGMENT flag,
otherwise we'll break all tunnel offloading by default.  VXLAN and
ERSPAN options has to fail offloading, because the tunnel will not
work otherwise.  OAM is not a default configurations, so failing it
as well. The missing DONT_FRAGMENT flag though should, probably,
cause frequent flow revalidation, but that is not new with this patch.

Same for the 'match' key, only clearing masks that was actually
consumed, except for the DONT_FRAGMENT and CSUM flags, which are
explicitly allowed and highlighted as broken.

Also, ttl and tos were incorrectly checked by the value instead of a
mask during the flow key dump. Destination port as well as CSUM
configuration for unknown reason was not taken from the actions list
and were passed via HW offload info.

Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/395522.html
Reported-by: Eelco Chaudron 
Fixes: 8f283af89298 ("netdev-tc-offloads: Implement netdev flow put using tc 
interface")
Signed-off-by: Ilya Maximets 
---
  lib/dpif-netlink.c  | 14 +-
  lib/netdev-offload-tc.c | 57 ++---
  lib/netdev-offload.h|  3 ---
  3 files changed, 49 insertions(+), 25 deletions(-)

diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
index 06e1e8ca0..1e116b4ad 100644
--- a/lib/dpif-netlink.c
+++ b/lib/dpif-netlink.c
@@ -2237,8 +2237,6 @@ parse_flow_put(struct dpif_netlink *dpif, struct 
dpif_flow_put *put)
  size_t left;
  struct netdev *dev;
  struct offload_info info;
-ovs_be16 dst_port = 0;
-uint8_t csum_on = false;
  int err;
  
  info.tc_modify_flow_deleted = false;

@@ -2258,10 +2256,9 @@ parse_flow_put(struct dpif_netlink *dpif, struct 
dpif_flow_put *put)
  return EOPNOTSUPP;
  }
  
-/* Get tunnel dst port */

+/* Check the output port for a tunnel. */
  NL_ATTR_FOR_EACH(nla, left, put->actions, put->actions_len) {
  if (nl_attr_type(nla) == OVS_ACTION_ATTR_OUTPUT) {
-const struct netdev_tunnel_config *tnl_cfg;
  struct netdev *outdev;
  odp_port_t out_port;
  
@@ -2271,19 +2268,10 @@ parse_flow_put(struct dpif_netlink *dpif, struct dpif_flow_put *put)

  err = EOPNOTSUPP;
  goto out;
  }
-tnl_cfg = netdev_get_tunnel_config(outdev);
-if (tnl_cfg && tnl_cfg->dst_port != 0) {
-dst_port = tnl_cfg->dst_port;
-}
-if (tnl_cfg) {
-csum_on = tnl_cfg->csum;
-}
  netdev_close(outdev);
  }
  }
  
-info.tp_dst_port = dst_port;

-info.tunnel_csum_on = csum_on;
  info.recirc_id_shared_with_tc = (dpif->user_features
   & OVS_DP_F_TC_RECIRC_SHARING);
  err = netdev_flow_put(dev, ,
diff --git a/lib/netdev-offload-tc.c b/lib/netdev-offload-tc.c
index 262faf3c6..e62687c82 100644
--- a/lib/netdev-offload-tc.c
+++ b/lib/netdev-offload-tc.c
@@ -805,11 +805,11 @@ parse_tc_flower_to_match(struct tc_flower *flower,
>key.tunnel.ipv6.ipv6_src,
>mask.tunnel.ipv6.ipv6_src);
  }
-if (flower->key.tunnel.tos) {
+if (flower->mask.tunnel.tos) {
  match_set_tun_tos_masked(match, flower->key.tunnel.tos,
   flower->mask.tunnel.tos);
  }
-if (flower->key.tunnel.ttl) {
+if (flower->mask.tunnel.ttl) {
  match_set_tun_ttl_masked(match, flower->key.tunnel.ttl,
   flower->mask.tunnel.ttl);
  }
@@ -1284,9 +1284,11 @@ static int
  parse_put_flow_set_action(struct tc_flower *flower, struct tc_action *action,
const struct nlattr *set, size_t set_len)
  {
+static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
  const struct nlattr *tunnel;
  const struct nlattr *tun_attr;
  size_t tun_left, tunnel_len;
+bool attr_csum_present;
  
  if (nl_attr_type(set) == 

[ovs-dev] [PATCH ovn v3] nb: Remove possibility of disabling logical datapath groups.

2022-07-06 Thread Dumitru Ceara
In large scale scenarios this option hugely reduces the size of the
Southbound database positively affecting end to end performance.  In
such scenarios there's no real reason to ever disable datapath groups.

In lower scale scenarios any potential overhead due to logical datapath
groups is, very likely, negligible.

Aside from potential scalability concerns, the
NB.NB_Global.options:use_logical_dp_group knob was kept until now to
ensure that in case of a bug in the logical datapath groups code a CMS
may turn it off and fall back to the mode in which logical flows are not
grouped together.  As far as I know, this has never happened until now.

Moreover, datpath_groups are enabled by default since v21.09.0 (4 stable
releases ago), via 90daa7ce18dc ("northd: Enable logical dp groups by
default.").

>From a testing perspective removing this knob will halve the CI matrix.
This is desirable, especially in the context of more tests being added,
e.g.:
https://patchwork.ozlabs.org/project/ovn/patch/20220623110239.2973854-1-mh...@redhat.com/

This commit also adds OVN_FOR_EACH_NORTHD to the "ovn-northd -- lr
multiple gw ports NAT" test case.  That was previously skipped because
the duplicate logical flow would have been merged when running with
datapath groups enabled.  Instead, change the expected output to not
include the duplicate.

Acked-by: Mark Michelson 
Signed-off-by: Dumitru Ceara 
---
V3:
- Rebased.
V2:
- Removed unused lflow_segs as suggested by Mark.
- Added Mark's ack.
---
 NEWS|  1 +
 TODO.rst|  6 +++
 northd/northd.c | 78 ++---
 ovn-nb.xml  | 19 +++--
 tests/ovn-macros.at | 21 ++
 tests/ovn-northd.at | 95 +
 tests/ovs-macros.at |  4 +-
 7 files changed, 47 insertions(+), 177 deletions(-)

diff --git a/NEWS b/NEWS
index 3737907ca..20cea579e 100644
--- a/NEWS
+++ b/NEWS
@@ -13,6 +13,7 @@ Post v22.06.0
 and ipsec_forceencaps=true (strongswan) to unconditionally enforce
 NAT-T UDP encapsulation. Requires OVS support for IPsec custom tunnel
 options (which will be available in OVS 2.18).
+  - Removed possibility of disabling logical datapath groups.
 
 OVN v22.06.0 - 03 Jun 2022
 --
diff --git a/TODO.rst b/TODO.rst
index 618ea4844..acbcacc4e 100644
--- a/TODO.rst
+++ b/TODO.rst
@@ -170,3 +170,9 @@ OVN To-do List
   * physical.c has a global simap -localvif_to_ofport which stores the
 local OVS interfaces and the ofport numbers. Move this to the engine data
 of the engine data node - ed_type_pflow_output.
+
+* ovn-northd parallel logical flow processing
+
+  * Multi-threaded logical flow computation was optimized for the case
+when datapath groups are disabled.  Datpath groups are always enabled
+now so northd parallel processing should be revisited.
diff --git a/northd/northd.c b/northd/northd.c
index 964af992f..509186392 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -4896,10 +4896,6 @@ ovn_lflow_equal(const struct ovn_lflow *a, const struct 
ovn_datapath *od,
 && nullable_string_is_equal(a->ctrl_meter, ctrl_meter));
 }
 
-/* If this option is 'true' northd will combine logical flows that differ by
- * logical datapath only by creating a datapath group. */
-static bool use_logical_dp_groups = false;
-
 enum {
 STATE_NULL,   /* parallelization is off */
 STATE_INIT_HASH_SIZES,/* parallelization is on; hashes sizing needed */
@@ -4924,8 +4920,7 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct 
ovn_datapath *od,
 lflow->ctrl_meter = ctrl_meter;
 lflow->dpg = NULL;
 lflow->where = where;
-if ((parallelization_state != STATE_NULL)
-&& use_logical_dp_groups) {
+if (parallelization_state != STATE_NULL) {
 ovs_mutex_init(>odg_lock);
 }
 }
@@ -4935,7 +4930,7 @@ ovn_dp_group_add_with_reference(struct ovn_lflow 
*lflow_ref,
 struct ovn_datapath *od)
 OVS_NO_THREAD_SAFETY_ANALYSIS
 {
-if (!use_logical_dp_groups || !lflow_ref) {
+if (!lflow_ref) {
 return false;
 }
 
@@ -5014,13 +5009,11 @@ do_ovn_lflow_add(struct hmap *lflow_map, struct 
ovn_datapath *od,
 struct ovn_lflow *old_lflow;
 struct ovn_lflow *lflow;
 
-if (use_logical_dp_groups) {
-old_lflow = ovn_lflow_find(lflow_map, NULL, stage, priority, match,
-   actions, ctrl_meter, hash);
-if (old_lflow) {
-ovn_dp_group_add_with_reference(old_lflow, od);
-return old_lflow;
-}
+old_lflow = ovn_lflow_find(lflow_map, NULL, stage, priority, match,
+   actions, ctrl_meter, hash);
+if (old_lflow) {
+ovn_dp_group_add_with_reference(old_lflow, od);
+return old_lflow;
 }
 
 lflow = xmalloc(sizeof *lflow);
@@ -5076,8 +5069,7 @@ ovn_lflow_add_at_with_hash(struct hmap *lflow_map, 

Re: [ovs-dev] [PATCH v4 10/14] userspace: Enable IP checksum offloading by default.

2022-07-06 Thread David Marchand
On Mon, Jul 4, 2022 at 10:24 PM David Marchand
 wrote:
> > diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> > index 57f94df54..164738503 100644
> > --- a/ofproto/ofproto-dpif-upcall.c
> > +++ b/ofproto/ofproto-dpif-upcall.c
> > @@ -215,7 +215,7 @@ struct upcall {
> >  enum odp_key_fitness fitness;  /* Fitness of 'flow' relative to ODP 
> > key. */
> >  const ovs_u128 *ufid;  /* Unique identifier for 'flow'. */
> >  unsigned pmd_id;   /* Datapath poll mode driver id. */
> > -const struct dp_packet *packet;   /* Packet associated with this 
> > upcall. */
> > +struct dp_packet *packet;  /* Packet associated with this upcall. 
> > */
> >  ofp_port_t ofp_in_port;/* OpenFlow in port, or OFPP_NONE. */
> >  uint16_t mru;  /* If !0, Maximum receive unit of
> >fragmented IP packet */
> > @@ -395,7 +395,7 @@ static void delete_op_init(struct udpif *udpif, struct 
> > ukey_op *op,
> > struct udpif_key *ukey);
> >
> >  static int upcall_receive(struct upcall *, const struct dpif_backer *,
> > -  const struct dp_packet *packet, enum 
> > dpif_upcall_type,
> > +  struct dp_packet *packet, enum dpif_upcall_type,
> >const struct nlattr *userdata, const struct flow 
> > *,
> >const unsigned int mru,
> >const ovs_u128 *ufid, const unsigned pmd_id);
> > @@ -1140,7 +1140,7 @@ compose_slow_path(struct udpif *udpif, struct 
> > xlate_out *xout,
> >   * since the 'upcall->put_actions' remains uninitialized. */
> >  static int
> >  upcall_receive(struct upcall *upcall, const struct dpif_backer *backer,
> > -   const struct dp_packet *packet, enum dpif_upcall_type type,
> > +   struct dp_packet *packet, enum dpif_upcall_type type,
> > const struct nlattr *userdata, const struct flow *flow,
> > const unsigned int mru,
> > const ovs_u128 *ufid, const unsigned pmd_id)
> > @@ -1336,7 +1336,7 @@ should_install_flow(struct udpif *udpif, struct 
> > upcall *upcall)
> >  }
> >
> >  static int
> > -upcall_cb(const struct dp_packet *packet, const struct flow *flow, 
> > ovs_u128 *ufid,
> > +upcall_cb(struct dp_packet *packet, const struct flow *flow, ovs_u128 
> > *ufid,
> >unsigned pmd_id, enum dpif_upcall_type type,
> >const struct nlattr *userdata, struct ofpbuf *actions,
> >struct flow_wildcards *wc, struct ofpbuf *put_actions, void *aux)
> > @@ -1446,7 +1446,7 @@ static int
> >  process_upcall(struct udpif *udpif, struct upcall *upcall,
> > struct ofpbuf *odp_actions, struct flow_wildcards *wc)
> >  {
> > -const struct dp_packet *packet = upcall->packet;
> > +struct dp_packet *packet = upcall->packet;
> >  const struct flow *flow = upcall->flow;
> >  size_t actions_len = 0;
> >
> > @@ -1524,6 +1524,10 @@ process_upcall(struct udpif *udpif, struct upcall 
> > *upcall,
> >  break;
> >  }
> >
> > +/* The packet is going to be encapsulated and sent to
> > + * the controller. */
> > +dp_packet_ol_send_prepare(packet, 0);
>
> Having to touch the packet data this late is scary, but I don't know a
> better place for now.

We can move it to dp_netdev_upcall which only affects the userspace datapath.
Then the upcall handling code gets "complete" packet data, the same as
when it gets invoked in the kernel datapath case.


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC PATCH 0/6] Remove OVS kernel driver

2022-07-06 Thread Ilya Maximets
On 7/5/22 20:28, Gregory Rose wrote:
> 
> 
> On 6/29/2022 3:42 PM, Ilya Maximets wrote:
>> On 6/10/22 02:31, Frode Nordahl wrote:
>>> On Fri, Jun 3, 2022 at 4:16 PM Frode Nordahl
>>>  wrote:

 On Thu, Jun 2, 2022 at 6:58 PM Ilya Maximets  wrote:
>
> On 6/1/22 22:53, Gregory Rose wrote:
>>
>>
>> On 5/31/2022 12:22 PM, Frode Nordahl wrote:
>>> On Tue, May 31, 2022 at 7:05 PM Ilya Maximets  
>>> wrote:

 On 5/31/22 17:36, Gregory Rose wrote:
>
>
> On 5/25/2022 6:47 AM, Flavio Leitner wrote:
>>
>> Hi Greg,
>>
>>
>> On Mon, May 23, 2022 at 09:10:36PM +0200, Ilya Maximets wrote:
>>> On 5/19/22 20:04, Gregory Rose wrote:


 On 4/15/2022 2:42 PM, Greg Rose wrote:
> It is time to remove support for the OVS kernel driver and push
> towards use of the upstream Linux openvswitch kernel driver
> in it's place [1].
>
> This patch series represents a first attempt but there are a few
> primary remaining issues that I have yet to address.
>
> A) Removal of debian packing support for the dkms kernel driver
>    module. The debian/rules are not well known to me - I've 
> never
>    actually made any changes in that area and do not have a
>    well formed understanding of how debian packaging works.  
> I wil
>    attempt to fix that up in upcoming patch series.
> B) Figuring out how the github workflow - I removed the tests I
>    could find that depend on the Linux kernel (i.e. they use
>    install_kernel() function.  Several other tests are  
> failing
>    that would not seem to depend on the Linux kernel.  I need 
> to
>    read and understand that code better.
> C) There are many Linux specific source modules in the datapath 
> that
>    will need eventual removal but some headers are still 
> required for
>    the userspace code (which seems counterintuitive but...)
>
> Reviews, suggestions, etc. are appreciated!
>
> 1.  
> https://mail.openvswitch.org/pipermail/ovs-dev/2022-April/393292.html

 I would like to suggest at this time that rather than removing the 
 OVS
 Linux kernel path that we "freeze" it at Linux 5.8. This will make 
 it
 easier for some consumers of OVS that are continuing to support the
 Linux kernel datapath in old distributions.

 The ultimate goal of shifting toward DPDK and AFXDP datapaths is 
 still
 preserved but we are placing less burden on some consumers of OVS 
 for
 older Linux distributions.

 Perhaps in suggesting removal of the kernel datapath I was being a 
 bit
 overly aggressive.

 Thoughts? Concerns? Other suggestions?
>>>
>>> Hi.  I think we discussed that before.  Removal from the master 
>>> branch
>>> doesn't mean that we will stop supporting the kernel module 
>>> immediately.
>>> It will remain in branch 2.17 which will become our new LTS series 
>>> soon.
>>> This branch will be supported until 2025.  And we also talked about
>>> possibility of extending the support just for a kernel module on 
>>> that
>>> branch, if required.  It's not necassary to use the kernel module 
>>> and
>>> OVS form the same branch, obviously.
>>>
>>> Removal from the master branch will just make it possible to remove
>>> the maintenance burden eventually, not right away.
>>>
>>> And FWIW, the goal is not to force everyone to use userspace 
>>> datapath,
>>> but remove a maintenance burden and push users to use a better 
>>> supported
>>> version of a code.  Frankly, we're not doing a great job supporting 
>>> the
>>> out-of-tree module these days.  It's getting hard to backport bug 
>>> fixes.
>>> And will be even harder over time since the code drifts away from 
>>> the
>>> version in the upstream kernel.  Mainly because we're not 
>>> backporting
>>> new features for a few years already.
>>>
>>> Does that make sense?
>>
>> Any thoughts on this? The freeze time is approaching, so it would
>> be great to know your plans for this patch set.
>>
>> Thanks,
>> fbl
>>
>
> Hi Flavio and Ilya,

Re: [ovs-dev] [PATCH 2/2] tc: Support masks for tunnel destination ports.

2022-07-06 Thread Roi Dayan via dev




On 2022-07-06 1:26 PM, Eelco Chaudron wrote:



On 5 Jul 2022, at 0:45, Ilya Maximets wrote:


TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may
create a masked match on this field.

This change is important as we're not clearing the masks which wasn't
really used, so if OVS requests match on ports, we should use the
mask and clear, otherwise offloading will fail.

Signed-off-by: Ilya Maximets 


Changes look good to me, and all system-traffic.at tests that were passing 
without the change are still passing, including the failing ERSPAN ones.

Acked-by: Eelco Chaudron 



can we postpone to merge this?
After this patch simple vxlan rules won't have enc dst port on decap
rules and currently mlx5 driver must have enc dst port to do the
offload of the decap rule.
I'm checking about this limitation and if we can remove it but
if we can postpone a bit so we won't break users with our nics..

thanks,
Roi
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Need Inputs on --- ovs-appctl "JSON format"

2022-07-06 Thread Veda Barrenkala
Hi

In "ovs-vsctl" utility, we have a parameter --format=json which outputs the
data in JSON format with proper semantics. I would like to add similar
"JSON" format option for 'ovs-appctl coverage/show' command. This helps the
user to consume the output data easily. We do have 2 options for this
1) Implementing JSON format for ovs-appctl globally or
2) Implementing JSON format for each command.

Which would be the proper way to add this feature? Kindly provide your
views and input on this.

Thanks and regards
B.Veda
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] netdev-offload-tc: Fix ignoring unknown tunnel keys.

2022-07-06 Thread Eelco Chaudron


On 5 Jul 2022, at 0:45, Ilya Maximets wrote:

> Current offloading code supports only limited number of tunnel keys
> and silently ignores everything it doesn't understand.  This is
> causing, for example, offloaded ERSPAN tunnels to not work, because
> flow is offloaded, but ERSPAN options are not provided to TC.
>
> There is a number of tunnel keys, which are supported by the userspace,
> but silently ignored during offloading:
>
>   OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT
>   OVS_TUNNEL_KEY_ATTR_OAM
>   OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS
>   OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS
>
> OVS_TUNNEL_KEY_ATTR_CSUM is kind of supported, but only for actions
> and for some reason is set from the tunnel port instead of the
> provided action, and not currently supported for the tunnel key in
> the match.
>
> Addig a default case to fail offloading of unknown attributes.  For
> now explicitly allowing incorrect behavior for the DONT_FRAGMENT flag,
> otherwise we'll break all tunnel offloading by default.  VXLAN and
> ERSPAN options has to fail offloading, because the tunnel will not
> work otherwise.  OAM is not a default configurations, so failing it
> as well. The missing DONT_FRAGMENT flag though should, probably,
> cause frequent flow revalidation, but that is not new with this patch.
>
> Same for the 'match' key, only clearing masks that was actually
> consumed, except for the DONT_FRAGMENT and CSUM flags, which are
> explicitly allowed and highlighted as broken.
>
> Also, ttl and tos were incorrectly checked by the value instead of a
> mask during the flow key dump. Destination port as well as CSUM
> configuration for unknown reason was not taken from the actions list
> and were passed via HW offload info.
>
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-dev/2022-July/395522.html
> Reported-by: Eelco Chaudron 
> Fixes: 8f283af89298 ("netdev-tc-offloads: Implement netdev flow put using tc 
> interface")
> Signed-off-by: Ilya Maximets 

Changes look good to me, and all system-traffic.at tests that were passing 
without the change are still passing, including the failing ERSPAN ones.

Acked-by: Eelco Chaudron 


Two small nits below.

> ---
>  lib/dpif-netlink.c  | 14 +-
>  lib/netdev-offload-tc.c | 57 ++---
>  lib/netdev-offload.h|  3 ---
>  3 files changed, 49 insertions(+), 25 deletions(-)
>

> +case OVS_TUNNEL_KEY_ATTR_DONT_FRAGMENT: {
> +/* XXX: This is wrong!  We're ignoring the DF flag configuration
> + * requested by the user.  However, TC for now has no way to pass
> + * that flag and it is set by default, meaning tunnel offloading
> + * will not work if 'options:df_default=false' is not set.
> + * Keeping incorrect behavior for now. */



> +
> +/* XXX: This is wrong!  We're ignoring DF and CSUM flags 
> configuration
> + * requested by the user.  However, TC for now has no way to pass
> + * these flags in a flower key and their masks are set by default,
> + * meaning tunnel offloading will not work at all if not cleared.
> + * Keeping incorrect behavior for now. */
> +tnl_mask->flags &= ~(FLOW_TNL_F_DONT_FRAGMENT | FLOW_TNL_F_CSUM);
> +

Small nit, 2x two spaces before “  However, TC for...”.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/2] tc: Support masks for tunnel destination ports.

2022-07-06 Thread Eelco Chaudron



On 5 Jul 2022, at 0:45, Ilya Maximets wrote:

> TCA_FLOWER_KEY_ENC_UDP_DST_PORT_MASK is supported by TC, and OVS may
> create a masked match on this field.
>
> This change is important as we're not clearing the masks which wasn't
> really used, so if OVS requests match on ports, we should use the
> mask and clear, otherwise offloading will fail.
>
> Signed-off-by: Ilya Maximets 

Changes look good to me, and all system-traffic.at tests that were passing 
without the change are still passing, including the failing ERSPAN ones.

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-vport: Register IFINDEX for ERSPAN device

2022-07-06 Thread Eelco Chaudron


On 5 Jul 2022, at 0:48, Ilya Maximets wrote:

> On 7/4/22 18:52, Eelco Chaudron wrote:
>> Hi Abhiram,
>>
>> It’s been there since your patch was committed to the repo.
>> I did a quick verification again, and it is:
>>
>> git checkout 7539b4e45
>> 
>> git diff
>> diff --git a/tests/system-traffic.at b/tests/system-traffic.at
>> index 4a7fa49fc..b55d757bf 100644
>> --- a/tests/system-traffic.at
>> +++ b/tests/system-traffic.at
>> @@ -393,6 +393,9 @@ OVS_CHECK_GRE()
>> OVS_CHECK_ERSPAN()
>>
>> OVS_TRAFFIC_VSWITCHD_START()
>> +
>> +ovs-vsctl --no-wait set Open_vSwitch . other_config:hw-offload=true
>> +
>> ADD_BR([br-underlay])
>>
>> make -j 32 check-kernel TESTSUITEFLAGS='-k "ping over erspan v1 tunnel" -v'
>> ...
>> ./system-traffic.at:417: ip netns exec at_ns0 sh << NS_EXEC_HEREDOC
>> ping -q -c 3 -i 0.3 -w 2 172.31.1.100 | grep "transmitted" | sed 
>> 's/time.*ms$/time 0ms/'
>> NS_EXEC_HEREDOC
>> ./system-traffic.at:423: ip netns exec at_ns0 sh << NS_EXEC_HEREDOC
>> ping -s 1200 -i 0.3 -c 3 10.1.1.100 | grep "transmitted" | sed 
>> 's/time.*ms$/time 0ms/'
>> NS_EXEC_HEREDOC
>> --- - 2022-07-04 18:48:07.788952607 +0200
>> +++ 
>> /home/echaudron/Documents/Scratch/ovs/tests/system-kmod-testsuite.dir/at-groups/13/stdout
>>  2022-07-04 18:48:07.785404896 +0200
>> @@ -1,2 +1,2 @@
>> -3 packets transmitted, 3 received, 0% packet loss, time 0ms
>> +3 packets transmitted, 1 received, 66.6667% packet loss, time 0ms
>> ...
>>
>>
>> Can you please investigate? I know reverting the patch will not offload to 
>> TC, but it’s not working, so it’s better to not offload and work than to 
>> offload and not work ;)
>
> I sent some patches to resolve the issue:
> https://patchwork.ozlabs.org/project/openvswitch/list/?series=307976=*
>
> It's not a problem of ERSPAN in particular, but a wider
> problem with ignoring a lot of tunnel keys and flags.
>
> ERSPAN offload will probably work if no options are specified,
> but I'm not sure how much sense it makes this way.

Ilya, Your patches look good, however, the test cases for ERSPAN use 
ERSPAN-specific options, so they will fall back to kerne dp.

Abhiram, so if you need a specific ERSPAN option with HW offload this will not 
work. This is because the TC kernel infrastructure does not allow these to be 
passed to the kernel.

>>
>> //Eelco
>>
>>
>> On 4 Jul 2022, at 17:25, Abhiram RN wrote:
>>
>> Hi Eelco,
>>
>> I do remember testing with this. I don't have full logs as such. But 
>> from one of the mail I had sent out I found below logs (this is with the 
>> change of course) where it's offloaded to TC (PSB). [ Because there seems to 
>> be support needed in HW driver (which is not present) and hence flows are 
>> not offloaded to HW which is expected]. But as you can see flows are in 
>> *dp:tc*.. (I can see 39 packets... so it's not only the first one). Not sure 
>> if anything got changed post that which might have caused this to fail now.
>>
>> [root@rhos-nfv-09 ~]#  ovs-appctl dpctl/dump-flows -m
>> ufid:edb7d581-9f14-42df-9e50-69a4923e1dbd, 
>> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_0),packet_type(ns=0/0,id=0/0),eth(src=6a:62:68:b2:61:f6,dst=66:f0:e4:1e:2c:bb),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=5.5.5.2/255.255.255.254,proto=0/0,tos=0/0x3,ttl=0/0,frag=no
>>  
>> 
>>  ), packets:39, bytes:3276, used:0.260s, *dp:tc*, 
>> actions:set(tunnel(tun_id=0x0,src=10.10.10.1,dst=10.10.10.2,ttl=64,flags(key))),erspan_sys,enp4s0f0_1
>> ufid:519d8cfd-9821-467f-9f61-e2bf61cdde60, 
>> skb_priority(0/0),skb_mark(0/0),ct_state(0/0),ct_zone(0/0),ct_mark(0/0),ct_label(0/0),recirc_id(0),dp_hash(0/0),in_port(enp4s0f0_1),packet_type(ns=0/0,id=0/0),eth(src=66:f0:e4:1e:2c:bb,dst=6a:62:68:b2:61:f6),eth_type(0x0800),ipv4(src=0.0.0.0/0.0.0.0,dst=5.5.5.2/255.255.255.254,proto=0/0,tos=0/0x3,ttl=0/0,frag=no
>>  
>> 
>>  ), packets:39, bytes:3276, used:0.260s, *dp:tc*, 
>> actions:enp4s0f0_0,set(tunnel(tun_id=0x0,src=10.10.10.1,dst=10.10.10.2,ttl=64,flags(key))),erspan_sys
>> [root@rhos-nfv-09 ~]#
>>
>> JFYI,
>> Without this change still flows won't get offloaded to TC. Below is the 
>> error what I used to get and the flows used to be in *dp:ovs*
>> 2022-02-25T14:33:04.134Z|00151|netdev_offload_tc|INFO|init: failed to 
>> get ifindex for erspan0: Operation not supported
>> 2022-02-25T14:33:04.134Z|00152|netdev_offload|INFO|erspan0: No suitable 
>> flow API found.
>>
>> Thanks & Regards,
>> Abhiram R N
>>
>> On Mon, Jul 4, 2022 at 7:48 PM Eelco Chaudron > > wrote:
>>
>>
>>
>> On 5 Apr 2022, at 0:48, Ilya Maximets wrote:
>>
>> > On 3/23/22 14:01, Eelco Chaudron wrote:
>> >>
>> >>
>> >> On 23 Mar 2022, at 11:43, 

Re: [ovs-dev] [PATCH] netdev-linux: set correct action for packets that passed policer

2022-07-06 Thread Roi Dayan via dev




On 2022-07-06 9:53 AM, Vlad Buslov wrote:

Referenced commit changed policer action type from TC_ACT_UNSPEC (continue)
to TC_ACT_PIPE. However, since neither TC hardware offload layer nor mlx5
driver at the time validated action type and always assumed 'continue', the
breakage wasn't caught until later validation code was added. The change
also broke valid configuration when sending from offload-capable device to
non-offload capable. For example, when sending from mlx5 VF to OvS bridge
netdevice the traffic that passed matchall classifier with policer could no
longer match the following flower rule in software:

filter protocol all pref 1 matchall chain 0
filter protocol all pref 1 matchall chain 0 handle 0x1
   in_hw (rule hit 7863)
 action order 1:  police 0x1 rate 32Mbit burst 1000Kb mtu 64Kb action 
drop/pipe overhead 0b
 ref 1 bind 1  installed 17 sec firstused 17 sec
 Action statistics:
 Sent 152199634 bytes 102550 pkt (dropped 1315, overlimits 1315 
requeues 0)
 Sent software 74612172 bytes 51275 pkt
 Sent hardware 77587462 bytes 51275 pkt
 backlog 0b 0p requeues 0
 used_hw_stats delayed

filter protocol ip pref 3 flower chain 0
filter protocol ip pref 3 flower chain 0 handle 0x1
   dst_mac aa:94:1f:f2:f8:44
   src_mac e4:00:01:08:00:02
   eth_type ipv4
   ip_flags nofrag
   not_in_hw
 action order 1: skbedit  ptype host pipe
  index 1 ref 1 bind 1 installed 6 sec used 6 sec
 Action statistics:
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0

 action order 2: mirred (Ingress Redirect to device br-ovs) stolen
 index 1 ref 1 bind 1 installed 6 sec used 6 sec
 Action statistics:
 Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
 backlog 0b 0p requeues 0
 cookie 401a9c8b3d403c62240d3eb5e21c1604
 no_percpu

Fix the issue by restoring pps policer action type to 'continue'.

Fixes: c2567e533f8a ("add port-based ingress policing based packet-per-second 
rate-limiting")
Signed-off-by: Vlad Buslov 
---
  lib/netdev-linux.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 2766b3f2bf67..d73391624555 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -2621,9 +2621,9 @@ nl_msg_act_police_start_nest(struct ofpbuf *request, 
uint32_t prio,
  
  static void

  nl_msg_act_police_end_nest(struct ofpbuf *request, size_t offset,
-   size_t act_offset)
+   size_t act_offset, uint32_t notexceed_act)
  {
-nl_msg_put_u32(request, TCA_POLICE_RESULT, TC_ACT_PIPE);
+nl_msg_put_u32(request, TCA_POLICE_RESULT, notexceed_act);
  nl_msg_end_nested(request, offset);
  nl_msg_end_nested(request, act_offset);
  }
@@ -2643,7 +2643,7 @@ nl_msg_put_act_police(struct ofpbuf *request, struct 
tc_police police,
  nl_msg_act_police_start_nest(request, ++prio, , _offset);
  tc_put_rtab(request, TCA_POLICE_RATE, );
  nl_msg_put_unspec(request, TCA_POLICE_TBF, , sizeof police);
-nl_msg_act_police_end_nest(request, offset, act_offset);
+nl_msg_act_police_end_nest(request, offset, act_offset, TC_ACT_UNSPEC);
  }
  if (kpkts_rate) {
  unsigned int pkt_burst_ticks, pps_rate, size;
@@ -2658,7 +2658,7 @@ nl_msg_put_act_police(struct ofpbuf *request, struct 
tc_police police,
 (uint64_t) pkt_burst_ticks);
  nl_msg_put_unspec(request, TCA_POLICE_TBF, _police,
sizeof null_police);
-nl_msg_act_police_end_nest(request, offset, act_offset);
+nl_msg_act_police_end_nest(request, offset, act_offset, TC_ACT_PIPE);
  }
  }
  



Acked-by: Roi Dayan 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] netdev-linux: set correct action for packets that passed policer

2022-07-06 Thread Vlad Buslov via dev
Referenced commit changed policer action type from TC_ACT_UNSPEC (continue)
to TC_ACT_PIPE. However, since neither TC hardware offload layer nor mlx5
driver at the time validated action type and always assumed 'continue', the
breakage wasn't caught until later validation code was added. The change
also broke valid configuration when sending from offload-capable device to
non-offload capable. For example, when sending from mlx5 VF to OvS bridge
netdevice the traffic that passed matchall classifier with policer could no
longer match the following flower rule in software:

filter protocol all pref 1 matchall chain 0
filter protocol all pref 1 matchall chain 0 handle 0x1
  in_hw (rule hit 7863)
action order 1:  police 0x1 rate 32Mbit burst 1000Kb mtu 64Kb action 
drop/pipe overhead 0b
ref 1 bind 1  installed 17 sec firstused 17 sec
Action statistics:
Sent 152199634 bytes 102550 pkt (dropped 1315, overlimits 1315 requeues 
0)
Sent software 74612172 bytes 51275 pkt
Sent hardware 77587462 bytes 51275 pkt
backlog 0b 0p requeues 0
used_hw_stats delayed

filter protocol ip pref 3 flower chain 0
filter protocol ip pref 3 flower chain 0 handle 0x1
  dst_mac aa:94:1f:f2:f8:44
  src_mac e4:00:01:08:00:02
  eth_type ipv4
  ip_flags nofrag
  not_in_hw
action order 1: skbedit  ptype host pipe
 index 1 ref 1 bind 1 installed 6 sec used 6 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

action order 2: mirred (Ingress Redirect to device br-ovs) stolen
index 1 ref 1 bind 1 installed 6 sec used 6 sec
Action statistics:
Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0
cookie 401a9c8b3d403c62240d3eb5e21c1604
no_percpu

Fix the issue by restoring pps policer action type to 'continue'.

Fixes: c2567e533f8a ("add port-based ingress policing based packet-per-second 
rate-limiting")
Signed-off-by: Vlad Buslov 
---
 lib/netdev-linux.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index 2766b3f2bf67..d73391624555 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -2621,9 +2621,9 @@ nl_msg_act_police_start_nest(struct ofpbuf *request, 
uint32_t prio,
 
 static void
 nl_msg_act_police_end_nest(struct ofpbuf *request, size_t offset,
-   size_t act_offset)
+   size_t act_offset, uint32_t notexceed_act)
 {
-nl_msg_put_u32(request, TCA_POLICE_RESULT, TC_ACT_PIPE);
+nl_msg_put_u32(request, TCA_POLICE_RESULT, notexceed_act);
 nl_msg_end_nested(request, offset);
 nl_msg_end_nested(request, act_offset);
 }
@@ -2643,7 +2643,7 @@ nl_msg_put_act_police(struct ofpbuf *request, struct 
tc_police police,
 nl_msg_act_police_start_nest(request, ++prio, , _offset);
 tc_put_rtab(request, TCA_POLICE_RATE, );
 nl_msg_put_unspec(request, TCA_POLICE_TBF, , sizeof police);
-nl_msg_act_police_end_nest(request, offset, act_offset);
+nl_msg_act_police_end_nest(request, offset, act_offset, TC_ACT_UNSPEC);
 }
 if (kpkts_rate) {
 unsigned int pkt_burst_ticks, pps_rate, size;
@@ -2658,7 +2658,7 @@ nl_msg_put_act_police(struct ofpbuf *request, struct 
tc_police police,
(uint64_t) pkt_burst_ticks);
 nl_msg_put_unspec(request, TCA_POLICE_TBF, _police,
   sizeof null_police);
-nl_msg_act_police_end_nest(request, offset, act_offset);
+nl_msg_act_police_end_nest(request, offset, act_offset, TC_ACT_PIPE);
 }
 }
 
-- 
2.36.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev