Re: [ovs-dev] [PATCH v2 4/4] Tunnel: Snoop ingress packets and update neigh cache if needed.

2021-11-24 Thread Flavio Leitner
On Wed, Nov 10, 2021 at 11:46:55AM +0100, Paolo Valerio wrote:
> In case of native tunnel with bfd enabled, if the MAC address of the
> remote end's interface changes (e.g. because it got rebooted, and the
> MAC address is allocated dynamically), the BFD session will never be
> re-established.
> 
> This happens because the local tunnel neigh entry doesn't get updated,
> and the local end keeps sending BFD packets with the old destination
> MAC address. This was not an issue until
> b23ddcc57d41 ("tnl-neigh-cache: tighten arp and nd snooping.")
> because ARP requests were snooped as well avoiding the problem.
> 
> Fix this by snooping the incoming packets in the slow path, and
> updating the neigh cache accordingly.
> 
> Signed-off-by: Paolo Valerio 
> Fixes: b23ddcc57d41 ("tnl-neigh-cache: tighten arp and nd snooping.")
> Acked-by: Gaetan Rivet 
> ---

If you happen to respin the series, maybe you could add
the tag Reported-at: .

Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 3/4] Native tunnel: Do not refresh the entry while revalidating.

2021-11-24 Thread Flavio Leitner
On Wed, Nov 10, 2021 at 11:46:49AM +0100, Paolo Valerio wrote:
> This is a minor issue but visible e.g. when you try to flush the neigh
> cache while the ARP flow is still present in the datapath, triggering
> the revalidation of the datapath flows which subsequently
> refreshes/adds the entry in the cache.
> 
> Signed-off-by: Paolo Valerio 
> ---

Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 2/4] Native tunnel: Add tnl/neigh/aging command.

2021-11-24 Thread Flavio Leitner
On Wed, Nov 10, 2021 at 11:46:42AM +0100, Paolo Valerio wrote:
> with the command is now possible to change the aging time of the
> cache entries.
> 
> For the existing entries the aging time is updated only if the
> current expiration is greater than the new one. In any case, the next
> refresh will set it to the new value.
> 
> This is intended mostly for debugging purpose.
> 
> Signed-off-by: Paolo Valerio 
> ---
> v2:
> - fixed NEIGH_ENTRY_MAX_AGEING_TIME (turned to seconds) correcting a
>   leftover.
> - turned relaxed atomics to acq/rel.
> - added range checks to tunnel-push-pop.at. It was useless to
>   duplicate the test for both ipv6 and ipv4, so only the latter
>   includes it.
> - slightly modified the NEWS entry.
> ---
>  NEWS|2 +
>  lib/tnl-neigh-cache.c   |   79 
> +++
>  ofproto/ofproto-tnl-unixctl.man |9 
>  tests/tunnel-push-pop-ipv6.at   |   30 +++
>  tests/tunnel-push-pop.at|   47 +++
>  5 files changed, 158 insertions(+), 9 deletions(-)
> 
> diff --git a/NEWS b/NEWS
> index 434ee570f..1aa233a0d 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -16,6 +16,8 @@ Post-v2.16.0
> - ovs-dpctl and 'ovs-appctl dpctl/':
>   * New commands 'cache-get-size' and 'cache-set-size' that allows to
> get or configure linux kernel datapath cache sizes.
> +   - ovs-appctl:
> + * New command tnl/neigh/aging to read/write the neigh aging time.
>  
>  
>  v2.16.0 - 16 Aug 2021
> diff --git a/lib/tnl-neigh-cache.c b/lib/tnl-neigh-cache.c
> index 1e6cc31db..a4d56e4cc 100644
> --- a/lib/tnl-neigh-cache.c
> +++ b/lib/tnl-neigh-cache.c
> @@ -46,6 +46,7 @@
>  
>  
>  #define NEIGH_ENTRY_DEFAULT_IDLE_TIME_MS  (15 * 60 * 1000)
> +#define NEIGH_ENTRY_MAX_AGING_TIME  3600

Shouldn't we include the unit suffix here too?

fbl

>  
>  struct tnl_neigh_entry {
>  struct cmap_node cmap_node;
> @@ -57,6 +58,7 @@ struct tnl_neigh_entry {
>  
>  static struct cmap table = CMAP_INITIALIZER;
>  static struct ovs_mutex mutex = OVS_MUTEX_INITIALIZER;
> +static atomic_uint32_t neigh_aging;
>  
>  static uint32_t
>  tnl_neigh_hash(const struct in6_addr *ip)
> @@ -74,6 +76,15 @@ tnl_neigh_expired(struct tnl_neigh_entry *neigh)
>  return expires <= time_msec();
>  }
>  
> +static uint32_t
> +tnl_neigh_get_aging(void)
> +{
> +unsigned int aging;
> +
> +atomic_read_explicit(_aging, , memory_order_acquire);
> +return aging;
> +}
> +
>  static struct tnl_neigh_entry *
>  tnl_neigh_lookup__(const char br_name[IFNAMSIZ], const struct in6_addr *dst)
>  {
> @@ -88,7 +99,7 @@ tnl_neigh_lookup__(const char br_name[IFNAMSIZ], const 
> struct in6_addr *dst)
>  }
>  
>  atomic_store_explicit(>expires, time_msec() +
> -  NEIGH_ENTRY_DEFAULT_IDLE_TIME_MS,
> +  tnl_neigh_get_aging(),
>memory_order_release);
>  return neigh;
>  }
> @@ -134,7 +145,7 @@ tnl_neigh_set__(const char name[IFNAMSIZ], const struct 
> in6_addr *dst,
>  if (neigh) {
>  if (eth_addr_equals(neigh->mac, mac)) {
>  atomic_store_relaxed(>expires, time_msec() +
> - NEIGH_ENTRY_DEFAULT_IDLE_TIME_MS);
> + tnl_neigh_get_aging());
>  ovs_mutex_unlock();
>  return;
>  }
> @@ -147,7 +158,7 @@ tnl_neigh_set__(const char name[IFNAMSIZ], const struct 
> in6_addr *dst,
>  neigh->ip = *dst;
>  neigh->mac = mac;
>  atomic_store_relaxed(>expires, time_msec() +
> - NEIGH_ENTRY_DEFAULT_IDLE_TIME_MS);
> + tnl_neigh_get_aging());
>  ovs_strlcpy(neigh->br_name, name, sizeof neigh->br_name);
>  cmap_insert(, >cmap_node, tnl_neigh_hash(>ip));
>  ovs_mutex_unlock();
> @@ -273,6 +284,45 @@ tnl_neigh_cache_flush(struct unixctl_conn *conn, int 
> argc OVS_UNUSED,
>  unixctl_command_reply(conn, "OK");
>  }
>  
> +static void
> +tnl_neigh_cache_aging(struct unixctl_conn *conn, int argc,
> +const char *argv[], void *aux OVS_UNUSED)
> +{
> +long long int new_exp, curr_exp;
> +struct tnl_neigh_entry *neigh;
> +uint32_t aging;
> +
> +if (argc == 1) {
> +struct ds ds = DS_EMPTY_INITIALIZER;
> +ds_put_format(, "%"PRIu32, tnl_neigh_get_aging() / 1000);
> +unixctl_command_reply(conn, ds_cstr());
> +ds_destroy();
> +
> +return;
> +}
> +
> +if (!ovs_scan(argv[1], "%"SCNu32, ) ||
> +!aging || aging > NEIGH_ENTRY_MAX_AGING_TIME) {
> +unixctl_command_reply_error(conn, "bad aging value");
> +return;
> +}
> +
> +aging *= 1000;
> +atomic_store_explicit(_aging, aging, memory_order_release);
> +new_exp = time_msec() + aging;
> +
> +CMAP_FOR_EACH (neigh, cmap_node, ) {
> +atomic_read_explicit(>expires, _exp,

Re: [ovs-dev] [PATCH v2 1/4] Native tunnel: Read/write expires atomically.

2021-11-24 Thread Flavio Leitner
On Wed, Nov 10, 2021 at 11:46:36AM +0100, Paolo Valerio wrote:
> Expires is modified in different threads (revalidator, pmd-rx, bfd-tx).
> It's better to use atomics for such potentially parallel write.
> 
> Signed-off-by: Paolo Valerio 
> ---

Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 4/4] system-dpdk: Add tests for HXPS

2021-11-24 Thread Maxime Coquelin
This patch adds two tests for Vhost-user client ports with
multiqueue. First test is with HXPS disabled, where the
expectation is that a single queue Vhost-user queue
receives the 256 packets injected on the dummy port.
Second test is with HXPS enabled, where the expectation
is that the 256 packets injected on the dummy port are
are spread on the two vhsot-user queues.

Signed-off-by: Maxime Coquelin 
---
 tests/system-dpdk.at | 135 +++
 1 file changed, 135 insertions(+)

diff --git a/tests/system-dpdk.at b/tests/system-dpdk.at
index 894e01316..2cb81729c 100644
--- a/tests/system-dpdk.at
+++ b/tests/system-dpdk.at
@@ -223,6 +223,141 @@ OVS_VSWITCHD_STOP("m4_join([], 
[SYSTEM_DPDK_ALLOWED_LOGS], [
 AT_CLEANUP
 dnl --
 
+dnl --
+dnl Multiqueue, HXPS disabled (default) with vhost-user-client port
+AT_SETUP([OVS-DPDK - MQ, HXPS disabled with vhost-user-client port])
+AT_KEYWORDS([dpdk])
+OVS_DPDK_PRE_CHECK()
+AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null])
+AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], [])
+OVS_DPDK_START()
+
+dnl Find number of sockets
+AT_CHECK([lscpu], [], [stdout])
+AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
+
+dnl Add userspace bridge and attach it to OVS
+AT_CHECK([ovs-vsctl add-br br10 -- set bridge br10 datapath_type=netdev])
+AT_CHECK([ovs-vsctl add-port br10 dummy0 -- set interface dummy0 
type=dummy-pmd])
+AT_CHECK([ovs-vsctl add-port br10 dpdkvhostuserclient0 -- set Interface \
+  dpdkvhostuserclient0 \
+  type=dpdkvhostuserclient \
+  options:vhost-server-path=$OVS_RUNDIR/dpdkvhostclient0], [],
+ [stdout], [stderr])
+AT_CHECK([ovs-vsctl show], [], [stdout])
+
+dnl Parse log file
+AT_CHECK([grep "VHOST_CONFIG: vhost-user client: socket created" 
ovs-vswitchd.log], [], [stdout])
+AT_CHECK([grep "vHost User device 'dpdkvhostuserclient0' created in 'client' 
mode, using client socket" ovs-vswitchd.log], [], [stdout])
+AT_CHECK([grep "VHOST_CONFIG: $OVS_RUNDIR/dpdkvhostclient0: reconnecting..." 
ovs-vswitchd.log], [], [stdout])
+
+dnl Execute testpmd in background
+on_exit "pkill -f -x -9 'tail -f /dev/null'"
+tail -f /dev/null | dpdk-testpmd --socket-mem="$(cat NUMA_NODE)" --no-pci\
+   
--vdev="net_virtio_user,path=$OVS_RUNDIR/dpdkvhostclient0,server=1,queues=2" \
+   --file-prefix page0 --single-file-segments -- \
+   --rxq=2 --txq=2 --nb-cores=1 --forward-mode=rxonly -a \
+   >$OVS_RUNDIR/testpmd-dpdkvhostuserclient0.log 2>&1 &
+
+OVS_WAIT_UNTIL([grep "assigned port 'dpdkvhostuserclient0' rx queue 1" 
ovs-vswitchd.log])
+
+on_exit "pkill -f -x -9 '$PYTHON3 $srcdir/mfex_pkts.py 256 fuzz'"
+($PYTHON3 $srcdir/genpkts.py 256 fuzz | while read pkt; do
+ ovs-appctl netdev-dummy/receive dummy0 "$pkt" || break
+ done)
+
+OVS_WAIT_UNTIL([test `ovs-vsctl get Interface dpdkvhostuserclient0 
statistics:tx_packets` -eq 256])
+AT_CHECK([ovs-vsctl get Interface dpdkvhostuserclient0 statistics], [], 
[stdout])
+AT_CHECK( `ovs-vsctl get Interface dpdkvhostuserclient0 
statistics:tx_q0_packets` -eq 0 &&  \
+  `ovs-vsctl get Interface dpdkvhostuserclient0 
statistics:tx_q1_packets` -eq 256 ]]] || \
+  [[[ `ovs-vsctl get Interface dpdkvhostuserclient0 
statistics:tx_q0_packets` -eq 256 && \
+  `ovs-vsctl get Interface dpdkvhostuserclient0 
statistics:tx_q1_packets` -eq 0 )
+
+dnl Clean up the testpmd now
+pkill -f -x -9 'tail -f /dev/null'
+
+dnl Clean up
+AT_CHECK([ovs-vsctl del-port br10 dpdkvhostuserclient0], [], [stdout], 
[stderr])
+AT_CHECK([ovs-vsctl del-port br10 dummy0], [], [stdout], [stderr])
+OVS_VSWITCHD_STOP("m4_join([], [SYSTEM_DPDK_ALLOWED_LOGS], [
+\@Failed to enable flow control@d
+\@VHOST_CONFIG: recvmsg failed@d
+\@VHOST_CONFIG: failed to connect to $OVS_RUNDIR/dpdkvhostclient0: No such 
file or directory@d
+\@dpdkvhostuser ports are considered deprecated;  please migrate to 
dpdkvhostuserclient ports.@d
+\@failed to enumerate system datapaths: No such file or directory@d
+])")
+AT_CLEANUP
+dnl --
+
+dnl --
+dnl Multiqueue, HXPS enabled with vhost-user-client port
+AT_SETUP([OVS-DPDK - MQ, HXPS enabled with vhost-user-client port])
+AT_KEYWORDS([dpdk])
+OVS_DPDK_PRE_CHECK()
+AT_SKIP_IF([! which dpdk-testpmd >/dev/null 2>/dev/null])
+AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], [])
+OVS_DPDK_START()
+
+dnl Find number of sockets
+AT_CHECK([lscpu], [], [stdout])
+AT_CHECK([cat stdout | grep "NUMA node(s)" | awk '{c=1; while (c++<$(3)) 
{printf "512,"}; print "512"}' > NUMA_NODE])
+
+dnl Add userspace bridge and attach it to OVS
+AT_CHECK([ovs-vsctl add-br br10 -- set 

[ovs-dev] [PATCH 3/4] dpif-netdev: Add HXPS Tx queue mode

2021-11-24 Thread Maxime Coquelin
This patch adds a new HXPS Tx mode that distributes the
traffic on all the Tx queues, whatever the number of PMD
threads. It would be useful for guests expecting traffic
to be distributed on all the vCPUs.

The idea here is to re-use the 5-tuple hash of the packets,
already computed to build the flows batches (and so it
does not provide flexibility on which fields are part of
the hash).

There are also no user-configurable indirection table,
given the feature is transparent to the guest. The queue
selection is just a modulo operation between the packet
hash and the number of Tx queues.

There are no (at least intentionnally) functionnal changes
for the existing XPS and static modes. There should not be
noticeable performance changes for these modes (only one
more branch in the hot path).

For the HXPS mode, performance could be impacted due to
locking when multiple PMD threads are in use (same as
XPS mode) and also because of the second level of batching.

Regarding the batching, the existing Tx port output_pkts
is not modified. It means that at maximum, NETDEV_MAX_BURST
can be batched for all the Tx queues. A second level of
batching is done in dp_netdev_pmd_flush_output_on_port(),
only for this HXPS mode.

Signed-off-by: Maxime Coquelin 
---
 Documentation/automake.mk   |  1 +
 Documentation/topics/dpdk/hxps.rst  | 51 +++
 Documentation/topics/dpdk/index.rst |  1 +
 lib/dpif-netdev.c   | 78 +++--
 4 files changed, 117 insertions(+), 14 deletions(-)
 create mode 100644 Documentation/topics/dpdk/hxps.rst

diff --git a/Documentation/automake.mk b/Documentation/automake.mk
index 137cc57c5..c982207d5 100644
--- a/Documentation/automake.mk
+++ b/Documentation/automake.mk
@@ -33,6 +33,7 @@ DOC_SOURCE = \
Documentation/topics/datapath.rst \
Documentation/topics/design.rst \
Documentation/topics/dpdk/index.rst \
+   Documentation/topics/dpdk/hxps.rst \
Documentation/topics/dpdk/bridge.rst \
Documentation/topics/dpdk/jumbo-frames.rst \
Documentation/topics/dpdk/memory.rst \
diff --git a/Documentation/topics/dpdk/hxps.rst 
b/Documentation/topics/dpdk/hxps.rst
new file mode 100644
index 0..1395f88c3
--- /dev/null
+++ b/Documentation/topics/dpdk/hxps.rst
@@ -0,0 +1,51 @@
+..
+  Licensed under the Apache License, Version 2.0 (the "License"); you may
+  not use this file except in compliance with the License. You may obtain
+  a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+  License for the specific language governing permissions and limitations
+  under the License.
+
+  Convention for heading levels in Open vSwitch documentation:
+
+  ===  Heading 0 (reserved for the title in a document)
+  ---  Heading 1
+  ~~~  Heading 2
+  +++  Heading 3
+  '''  Heading 4
+
+  Avoid deeper levels because they do not render well.
+
+=
+Hash-based Tx packet steering
+=
+
+HXPS mode distributes the traffic on all the port transmit queues, whatever the
+number of PMD threads. Queue selection is based on the 5-tuples hash already
+computed to build the flows batches, the selected queue being the modulo
+between the hash and the number of Tx queues of the port.
+
+HXPS may be used for example with Vhost-user ports, when the number of vCPUs
+and queues of the guest are greater than the number of PMD threads, aq without
+HXPS, the Tx queues used would be limited to the number of PMD.
+
+Hash-based Tx packet steering may have an impact on the performance, given the
+Tx lock acquisition is required and a second level of batching is performed.
+
+This feature is disabled by default.
+
+Usage
+~
+
+To enable HXPS::
+
+$ ovs-vsctl set Interface  other_config:hxps=true
+
+To disable HXPS::
+
+$ ovs-vsctl set Interface  other_config:hxps=false
diff --git a/Documentation/topics/dpdk/index.rst 
b/Documentation/topics/dpdk/index.rst
index a5be5e344..ab6132357 100644
--- a/Documentation/topics/dpdk/index.rst
+++ b/Documentation/topics/dpdk/index.rst
@@ -39,3 +39,4 @@ DPDK Support
/topics/dpdk/qos
/topics/dpdk/jumbo-frames
/topics/dpdk/memory
+   /topics/dpdk/hxps
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 0407f30b3..966741751 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -388,12 +388,13 @@ struct dp_netdev_rxq {
 enum txq_mode {
 TXQ_MODE_STATIC,
 TXQ_MODE_XPS,
+TXQ_MODE_HXPS,
 };
 
 /* A port in a netdev-based datapath. */
 struct dp_netdev_port {
 odp_port_t port_no;
-enum txq_mode txq_mode; /* static, XPS */
+enum txq_mode txq_mode; /* static, XPS, HXPS 

[ovs-dev] [PATCH 2/4] dpif-netdev: Introduce Tx queue mode

2021-11-24 Thread Maxime Coquelin
A boolean is currently used to differenciate between the
static and XPS Tx queue modes.

Since we are going to introduce a new RSS mode, replace
this boolean with an enum.

This patch does not introduce functionnal changes.

Signed-off-by: Maxime Coquelin 
---
 lib/dpif-netdev.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 69d7ec26e..0407f30b3 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -385,10 +385,15 @@ struct dp_netdev_rxq {
 atomic_ullong cycles_intrvl[PMD_INTERVAL_MAX];
 };
 
+enum txq_mode {
+TXQ_MODE_STATIC,
+TXQ_MODE_XPS,
+};
+
 /* A port in a netdev-based datapath. */
 struct dp_netdev_port {
 odp_port_t port_no;
-bool dynamic_txqs;  /* If true XPS will be used. */
+enum txq_mode txq_mode; /* static, XPS */
 bool need_reconfigure;  /* True if we should reconfigure netdev. */
 struct netdev *netdev;
 struct hmap_node node;  /* Node in dp_netdev's 'ports'. */
@@ -4591,24 +4596,25 @@ dp_netdev_pmd_flush_output_on_port(struct 
dp_netdev_pmd_thread *pmd,
 int i;
 int tx_qid;
 int output_cnt;
-bool dynamic_txqs;
+bool concurrent_txqs;
 struct cycle_timer timer;
 uint64_t cycles;
 uint32_t tx_flush_interval;
 
 cycle_timer_start(>perf_stats, );
 
-dynamic_txqs = p->port->dynamic_txqs;
-if (dynamic_txqs) {
+if (p->port->txq_mode == TXQ_MODE_XPS) {
 tx_qid = dpif_netdev_xps_get_tx_qid(pmd, p);
+concurrent_txqs = true;
 } else {
 tx_qid = pmd->static_tx_qid;
+concurrent_txqs = false;
 }
 
 output_cnt = dp_packet_batch_size(>output_pkts);
 ovs_assert(output_cnt > 0);
 
-netdev_send(p->port->netdev, tx_qid, >output_pkts, dynamic_txqs);
+netdev_send(p->port->netdev, tx_qid, >output_pkts, concurrent_txqs);
 dp_packet_batch_init(>output_pkts);
 
 /* Update time of the next flush. */
@@ -5761,14 +5767,14 @@ reconfigure_datapath(struct dp_netdev *dp)
  * 'port->need_reconfigure', because netdev_is_reconf_required() can
  * change at any time.
  * Also mark for reconfiguration all ports which will likely change their
- * 'dynamic_txqs' parameter.  It's required to stop using them before
+ * 'txq_mode' parameter.  It's required to stop using them before
  * changing this setting and it's simpler to mark ports here and allow
  * 'pmd_remove_stale_ports' to remove them from threads.  There will be
  * no actual reconfiguration in 'port_reconfigure' because it's
  * unnecessary.  */
 HMAP_FOR_EACH (port, node, >ports) {
 if (netdev_is_reconf_required(port->netdev)
-|| (port->dynamic_txqs
+|| ((port->txq_mode == TXQ_MODE_XPS)
 != (netdev_n_txq(port->netdev) < wanted_txqs))) {
 port->need_reconfigure = true;
 }
@@ -5804,7 +5810,8 @@ reconfigure_datapath(struct dp_netdev *dp)
 seq_change(dp->port_seq);
 port_destroy(port);
 } else {
-port->dynamic_txqs = netdev_n_txq(port->netdev) < wanted_txqs;
+port->txq_mode = (netdev_n_txq(port->netdev) < wanted_txqs) ?
+TXQ_MODE_XPS : TXQ_MODE_STATIC;
 }
 }
 
@@ -7818,7 +7825,7 @@ dpif_netdev_xps_revalidate_pmd(const struct 
dp_netdev_pmd_thread *pmd,
 long long interval;
 
 HMAP_FOR_EACH (tx, node, >send_port_cache) {
-if (!tx->port->dynamic_txqs) {
+if (tx->port->txq_mode != TXQ_MODE_XPS) {
 continue;
 }
 interval = pmd->ctx.now - tx->last_used;
-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/4] netdev-dpdk: Introduce per rxq/txq Vhost-user statistics

2021-11-24 Thread Maxime Coquelin
HXPS feature will enable steering Tx packets on transmist
queues based on their hashes. In order to test the feature,
it is needed to be able to get the per-queue statistics for
Vhost-user ports.

This patch introduces "bytes", "packets" and "error"
per-queue custom statistics for Vhost-user ports.

Suggested-by David Marchand 
Signed-off-by: Maxime Coquelin 
---
 lib/netdev-dpdk.c | 143 +++---
 1 file changed, 135 insertions(+), 8 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ca92c947a..e80d5b4ab 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -192,6 +192,13 @@ static const struct vhost_device_ops virtio_net_device_ops 
=
 .guest_notified = vhost_guest_notified,
 };
 
+/* Custome software per-queue stats for Vhost ports */
+struct netdev_dpdk_vhost_q_stats {
+uint64_t bytes;
+uint64_t packets;
+uint64_t errors;
+};
+
 /* Custom software stats for dpdk ports */
 struct netdev_dpdk_sw_stats {
 /* No. of retries when unable to transmit. */
@@ -206,6 +213,10 @@ struct netdev_dpdk_sw_stats {
 uint64_t rx_qos_drops;
 /* Packet drops in HWOL processing. */
 uint64_t tx_invalid_hwol_drops;
+/* Per-queue Vhost Tx stats */
+struct netdev_dpdk_vhost_q_stats *txq;
+/* Per-queue Vhost Rx stats */
+struct netdev_dpdk_vhost_q_stats *rxq;
 };
 
 enum dpdk_dev_type {
@@ -1276,6 +1287,13 @@ common_construct(struct netdev *netdev, dpdk_port_t 
port_no,
 dev->sw_stats = xzalloc(sizeof *dev->sw_stats);
 dev->sw_stats->tx_retries = (dev->type == DPDK_DEV_VHOST) ? 0 : UINT64_MAX;
 
+if (dev->type == DPDK_DEV_VHOST) {
+dev->sw_stats->txq = xcalloc(netdev->n_txq,
+ sizeof *dev->sw_stats->txq);
+dev->sw_stats->rxq = xcalloc(netdev->n_rxq,
+ sizeof *dev->sw_stats->rxq);
+}
+
 return 0;
 }
 
@@ -2353,17 +2371,21 @@ netdev_dpdk_vhost_update_rx_size_counters(struct 
netdev_stats *stats,
 }
 
 static inline void
-netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk *dev,
+netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk *dev, int qid,
  struct dp_packet **packets, int count,
  int qos_drops)
 {
 struct netdev_stats *stats = >stats;
+struct netdev_dpdk_vhost_q_stats *q_stats = >sw_stats->rxq[qid];
 struct dp_packet *packet;
 unsigned int packet_size;
 int i;
 
 stats->rx_packets += count;
+q_stats->packets += count;
 stats->rx_dropped += qos_drops;
+q_stats->errors += qos_drops;
+
 for (i = 0; i < count; i++) {
 packet = packets[i];
 packet_size = dp_packet_size(packet);
@@ -2374,6 +2396,7 @@ netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk 
*dev,
  * further processing. */
 stats->rx_errors++;
 stats->rx_length_errors++;
+q_stats->errors++;
 continue;
 }
 
@@ -2385,6 +2408,7 @@ netdev_dpdk_vhost_update_rx_counters(struct netdev_dpdk 
*dev,
 }
 
 stats->rx_bytes += packet_size;
+q_stats->bytes += packet_size;
 }
 
 if (OVS_UNLIKELY(qos_drops)) {
@@ -2437,7 +2461,7 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
 }
 
 rte_spinlock_lock(>stats_lock);
-netdev_dpdk_vhost_update_rx_counters(dev, batch->packets,
+netdev_dpdk_vhost_update_rx_counters(dev, qid, batch->packets,
  nb_rx, qos_drops);
 rte_spinlock_unlock(>stats_lock);
 
@@ -2551,7 +2575,7 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, 
struct rte_mbuf **pkts,
 }
 
 static inline void
-netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev,
+netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk *dev, int qid,
  struct dp_packet **packets,
  int attempted,
  struct netdev_dpdk_sw_stats *sw_stats_add)
@@ -2561,14 +2585,20 @@ netdev_dpdk_vhost_update_tx_counters(struct netdev_dpdk 
*dev,
   sw_stats_add->tx_failure_drops +
   sw_stats_add->tx_invalid_hwol_drops;
 struct netdev_stats *stats = >stats;
+struct netdev_dpdk_vhost_q_stats *q_stats = >sw_stats->txq[qid];
 int sent = attempted - dropped;
 int i;
 
 stats->tx_packets += sent;
+q_stats->packets += sent;
 stats->tx_dropped += dropped;
+q_stats->errors += dropped;
 
 for (i = 0; i < sent; i++) {
-stats->tx_bytes += dp_packet_size(packets[i]);
+uint64_t bytes = dp_packet_size(packets[i]);
+
+stats->tx_bytes += bytes;
+q_stats->bytes += bytes;
 }
 
 if (OVS_UNLIKELY(dropped || sw_stats_add->tx_retries)) {
@@ -2656,7 +2686,7 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
 sw_stats_add.tx_retries = MIN(retries, max_retries);
 
 

[ovs-dev] [PATCH 0/4] dpif-netdev: Hash-based Tx packet steering

2021-11-24 Thread Maxime Coquelin
This series introduces a new HXPS Tx mode alognside existing
XPS and static modes. The goal is to provide a mode where all
the transmit queues are used, whatever the number of PMD
threads. This may be used with Vhost-user ports, where the
guest application driving the Virtio device expects packets
to be distributed on all the queues.

As a preliminary step, in order to be able to validate the
feature at OVS level, the first patch introduces per-queue
basic statistics for Vhost-user ports. This patch is
complementary to David's patch [0] adding per-queue
statistics to DPDK ports using xstats.

The series also introduces two DPDK tests for Vhost-user
multiqueue, with and without HXPS enabled.

Maxime Coquelin (4):
  netdev-dpdk: Introduce per rxq/txq Vhost-user statistics
  dpif-netdev: Introduce Tx queue mode
  dpif-netdev: Add HXPS Tx queue mode
  system-dpdk: Add tests for HXPS

 Documentation/automake.mk   |   1 +
 Documentation/topics/dpdk/hxps.rst  |  51 ++
 Documentation/topics/dpdk/index.rst |   1 +
 lib/dpif-netdev.c   |  95 ++
 lib/netdev-dpdk.c   | 143 ++--
 tests/system-dpdk.at| 135 ++
 6 files changed, 399 insertions(+), 27 deletions(-)
 create mode 100644 Documentation/topics/dpdk/hxps.rst

-- 
2.31.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 0/2] ovsdb-data: Consolidate ovsdb atom and json strings.

2021-11-24 Thread Mike Pattrick
On Mon, 2021-11-22 at 01:09 +0100, Ilya Maximets wrote:
> This patch-set is a continuation of work for deduplication of
> ovsdb strings.  Previous patch (accepted):
>   
> https://patchwork.ozlabs.org/project/openvswitch/patch/20210923001353.4072533-1-i.maxim...@ovn.org/
> 
> Ilya Maximets (2):
>   json: Inline clone and destroy functions.
>   ovsdb-data: Consolidate ovsdb atom and json strings.
> 
>  include/openvswitch/json.h | 26 +--
>  lib/json.c | 53 +++-
> --
>  lib/ovsdb-data.c   | 27 ---
>  lib/ovsdb-data.h   | 25 ++
>  ovsdb/ovsdb-idlc.in|  5 ++--
>  ovsdb/ovsdb-server.c   |  7 ++---
>  ovsdb/rbac.c   |  8 +++---
>  python/ovs/db/data.py  | 10 ---
>  tests/test-ovsdb.c |  9 ---
>  9 files changed, 86 insertions(+), 84 deletions(-)
> 

Looks good to me!

Acked-by: Mike Pattrick

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/2] stream-ssl: Fix handling of default ciphers/protocols

2021-11-24 Thread Flavio Leitner
On Mon, Nov 15, 2021 at 10:40:47AM +0100, Frode Nordahl wrote:
> On Mon, Sep 13, 2021 at 4:23 AM Frode Nordahl
>  wrote:
> >
> > On Sat, Sep 11, 2021 at 10:23 PM Flavio Leitner  wrote:
> > >
> > > On Fri, Sep 10, 2021 at 06:20:45PM +0200, Frode Nordahl wrote:
> > > > On Thu, Sep 9, 2021 at 9:53 PM Flavio Leitner  wrote:
> > > > >
> > > > >
> > > > > Hi Frode,
> > > > >
> > > > > Thanks for your patch.
> > > > > Please see my comments below.
> > > >
> > > > Flavio, thank you for taking the time to review.
> > > >
> > > > > On Wed, Aug 25, 2021 at 01:05:13PM +0200, Frode Nordahl wrote:
> > > > > > Contrary to what is stated in the documentation, when SSL
> > > > > > ciphers or protocols options are omitted the default values
> > > > > > will not be set.  The SSL library default settings will be used
> > > > > > instead.
> > > > > >
> > > > > > Fix handling of default ciphers and protocols so that we actually
> > > > > > enforce what is listed as defaults.
> > > > > >
> > > > > > Fixes: e18a1d086133 ("Add support for specifying SSL connection 
> > > > > > parameters to ovsdb")
> > > > > > Signed-off-by: Frode Nordahl 
> > > > > > ---
> > > > > >  lib/stream-ssl.c | 30 ++
> > > > > >  1 file changed, 22 insertions(+), 8 deletions(-)
> > > > > >
> > > > > > diff --git a/lib/stream-ssl.c b/lib/stream-ssl.c
> > > > > > index 0ea3f2c08..6b4cf6970 100644
> > > > > > --- a/lib/stream-ssl.c
> > > > > > +++ b/lib/stream-ssl.c
> > > > > > @@ -162,8 +162,10 @@ struct ssl_config_file {
> > > > > >  static struct ssl_config_file private_key;
> > > > > >  static struct ssl_config_file certificate;
> > > > > >  static struct ssl_config_file ca_cert;
> > > > > > -static char *ssl_protocols = "TLSv1,TLSv1.1,TLSv1.2";
> > > > > > -static char *ssl_ciphers = "HIGH:!aNULL:!MD5";
> > > > > > +static char *default_ssl_protocols = "TLSv1,TLSv1.1,TLSv1.2";
> > > > > > +static char *default_ssl_ciphers = "HIGH:!aNULL:!MD5";
> > > > > > +static char *ssl_protocols = NULL;
> > > > > > +static char *ssl_ciphers = NULL;
> > > > > >
> > > > > >  /* Ordinarily, the SSL client and server verify each other's 
> > > > > > certificates using
> > > > > >   * a CA certificate.  Setting this to false disables this 
> > > > > > behavior.  (This is a
> > > > > > @@ -1225,14 +1227,19 @@ stream_ssl_set_key_and_cert(const char 
> > > > > > *private_key_file,
> > > > > >  void
> > > > > >  stream_ssl_set_ciphers(const char *arg)
> > > > > >  {
> > > > > > -if (ssl_init() || !arg || !strcmp(ssl_ciphers, arg)) {
> > > > >
> > > > > The ssl_init() calls at least one time do_ssl_init() which then
> > > > > calls SSL_CTX_set_cipher_list(ctx, "HIGH:!aNULL:!MD5").
> > > > > Those are the defaults in the man-page and not from the library.
> > > > >
> > > > > The do_ssl_init() also does:
> > > > >method = CONST_CAST(SSL_METHOD *, SSLv23_method());
> > > > >
> > > > > That should return SSLv3, TLSv1, TLSv1.1 and TLS1.2.
> > > > >
> > > > >ctx = SSL_CTX_new(method);
> > > > >SSL_CTX_set_options(ctx, SSL_OP_NO_SSLv2 | SSL_OP_NO_SSLv3);
> > > > >
> > > > > And there it excludes those SSL v2 and v3.
> > > > >
> > > > > Therefore, the default would be "TLSv1,TLSv1.1,TLSv1.2" which is
> > > > > the same in the man-page.
> > > > >
> > > > > Did I miss something?
> > > >
> > > > Thank you for pointing out that, I did not realize we manipulated
> > > > these options multiple places.
> > > >
> > > > I do need to rephrase the commit message, but there is still a problem
> > > > here. It became apparent when working on the next patch in the series,
> > > > where functional tests behave unexpectedly when passing the
> > > > ssl-protocols options. The reason being that when the protocol list
> > > > matches what is already in the static char *ssl_protocols in
> > > > lib/stream-ssl.c stream_ssl_set_protocols returns early without
> > > > setting any option.
> > >
> > > If that matches then it is because the default is set, so it
> > > shouldn't make any difference to return early. Do you still
> > > have the next patch without the default_ssl_protocols change
> > > for me to take a look? That might help me to see the issue.
> >
> > That would be true if do_ssl_init() and stream_ssl_set_protocols()
> > manipulated the options the same way, the reality is that they do not,
> > and the effective output from do_ssl_init() differ depending on the
> > version of OpenSSL Open vSwitch is built with.
> >
> > > > So I guess the question then becomes, should we stop doing this
> > > > multiple places or do you want me to update the call to
> > > > SSL_CTX_set_options in do_ssl_init and not introduce this change?
> > >
> > > Not sure yet because I didn't see the problem yet.
> >
> > Let me demonstrate, with OVS built from master against OpenSSL 1.1.1f
> > and the following modification to the ovs-sandbox script:
> > --- a/tutorial/ovs-sandbox
> > +++ b/tutorial/ovs-sandbox
> > @@ -270,7 +270,7 @@ trap 'kill `cat "$sandbox"/*.pid`' 0 1 2 3 13 

[ovs-dev] [PATCH net-next v8] net: openvswitch: IPv6: Add IPv6 extension header support

2021-11-24 Thread Toms Atteka
This change adds a new OpenFlow field OFPXMT_OFB_IPV6_EXTHDR and
packets can be filtered using ipv6_ext flag.

Signed-off-by: Toms Atteka 
---
 include/uapi/linux/openvswitch.h |   6 ++
 net/openvswitch/flow.c   | 140 +++
 net/openvswitch/flow.h   |  14 
 net/openvswitch/flow_netlink.c   |  26 +-
 4 files changed, 184 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index a87b44cd5590..43790f07e4a2 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -342,6 +342,7 @@ enum ovs_key_attr {
OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV4,   /* struct ovs_key_ct_tuple_ipv4 */
OVS_KEY_ATTR_CT_ORIG_TUPLE_IPV6,   /* struct ovs_key_ct_tuple_ipv6 */
OVS_KEY_ATTR_NSH,   /* Nested set of ovs_nsh_key_* */
+   OVS_KEY_ATTR_IPV6_EXTHDRS,  /* struct ovs_key_ipv6_exthdr */
 
 #ifdef __KERNEL__
OVS_KEY_ATTR_TUNNEL_INFO,  /* struct ip_tunnel_info */
@@ -421,6 +422,11 @@ struct ovs_key_ipv6 {
__u8   ipv6_frag;   /* One of OVS_FRAG_TYPE_*. */
 };
 
+/* separate structure to support backward compatibility with older user space 
*/
+struct ovs_key_ipv6_exthdrs {
+   __u16  hdrs;
+};
+
 struct ovs_key_tcp {
__be16 tcp_src;
__be16 tcp_dst;
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 9d375e74b607..28acb40437ca 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -239,6 +239,144 @@ static bool icmphdr_ok(struct sk_buff *skb)
  sizeof(struct icmphdr));
 }
 
+/**
+ * get_ipv6_ext_hdrs() - Parses packet and sets IPv6 extension header flags.
+ *
+ * @skb: buffer where extension header data starts in packet
+ * @nh: ipv6 header
+ * @ext_hdrs: flags are stored here
+ *
+ * OFPIEH12_UNREP is set if more than one of a given IPv6 extension header
+ * is unexpectedly encountered. (Two destination options headers may be
+ * expected and would not cause this bit to be set.)
+ *
+ * OFPIEH12_UNSEQ is set if IPv6 extension headers were not in the order
+ * preferred (but not required) by RFC 2460:
+ *
+ * When more than one extension header is used in the same packet, it is
+ * recommended that those headers appear in the following order:
+ *  IPv6 header
+ *  Hop-by-Hop Options header
+ *  Destination Options header
+ *  Routing header
+ *  Fragment header
+ *  Authentication header
+ *  Encapsulating Security Payload header
+ *  Destination Options header
+ *  upper-layer header
+ */
+static void get_ipv6_ext_hdrs(struct sk_buff *skb, struct ipv6hdr *nh,
+ u16 *ext_hdrs)
+{
+   u8 next_type = nh->nexthdr;
+   unsigned int start = skb_network_offset(skb) + sizeof(struct ipv6hdr);
+   int dest_options_header_count = 0;
+
+   *ext_hdrs = 0;
+
+   while (ipv6_ext_hdr(next_type)) {
+   struct ipv6_opt_hdr _hdr, *hp;
+
+   switch (next_type) {
+   case IPPROTO_NONE:
+   *ext_hdrs |= OFPIEH12_NONEXT;
+   /* stop parsing */
+   return;
+
+   case IPPROTO_ESP:
+   if (*ext_hdrs & OFPIEH12_ESP)
+   *ext_hdrs |= OFPIEH12_UNREP;
+   if ((*ext_hdrs & ~(OFPIEH12_HOP | OFPIEH12_DEST |
+  OFPIEH12_ROUTER | IPPROTO_FRAGMENT |
+  OFPIEH12_AUTH | OFPIEH12_UNREP)) ||
+   dest_options_header_count >= 2) {
+   *ext_hdrs |= OFPIEH12_UNSEQ;
+   }
+   *ext_hdrs |= OFPIEH12_ESP;
+   break;
+
+   case IPPROTO_AH:
+   if (*ext_hdrs & OFPIEH12_AUTH)
+   *ext_hdrs |= OFPIEH12_UNREP;
+   if ((*ext_hdrs &
+~(OFPIEH12_HOP | OFPIEH12_DEST | OFPIEH12_ROUTER |
+  IPPROTO_FRAGMENT | OFPIEH12_UNREP)) ||
+   dest_options_header_count >= 2) {
+   *ext_hdrs |= OFPIEH12_UNSEQ;
+   }
+   *ext_hdrs |= OFPIEH12_AUTH;
+   break;
+
+   case IPPROTO_DSTOPTS:
+   if (dest_options_header_count == 0) {
+   if (*ext_hdrs &
+   ~(OFPIEH12_HOP | OFPIEH12_UNREP))
+   *ext_hdrs |= OFPIEH12_UNSEQ;
+   *ext_hdrs |= OFPIEH12_DEST;
+   } else if (dest_options_header_count == 1) {
+   if (*ext_hdrs &
+   ~(OFPIEH12_HOP | OFPIEH12_DEST |
+ OFPIEH12_ROUTER | OFPIEH12_FRAG |
+   

Re: [ovs-dev] [PATCH] tests/flowgen: Fix length field of 802.2 data link header.

2021-11-24 Thread Paolo Valerio
Hello Ilya,

Ilya Maximets  writes:

> Length in Data Link Header for these packets should not include
> source and destination MACs or the length field itself.
>
> Therefore, it should be 14 bytes less, otherwise other network
> tools like wireshark complains:
>
>   Expert Info (Error/Malformed):
> Length field value goes past the end of the payload
>
> Additionally fixing the printing of the packet/flow configuration,
> as it currently prints '%s=%s' strings without any real data.
>
> Signed-off-by: Ilya Maximets 
> ---

The patch LGTM and works as expected.
Do you think it makes sense to include a 'Fixes' tag?

In any case:

Acked-by: Paolo Valerio 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: I-P: fix BFD dependency.

2021-11-24 Thread Dumitru Ceara
On 11/24/21 17:21, Numan Siddique wrote:
> On Wed, Nov 24, 2021 at 11:19 AM Dumitru Ceara  wrote:
>>
>> On 11/24/21 17:15, Numan Siddique wrote:
>>> On Tue, Nov 23, 2021 at 7:52 PM Dumitru Ceara  wrote:

 BFD entries are updated and their ->ref field is set to 'true' when
 static routes are parsed, within build_lflows(), in the 'en_lflow' I-P
 node.  Therefore we cannot clean up BFD entries in 'en_northd' which is
 an input of 'en_lflow'.

 To fix this we now move all BFD handling in the 'en_lflow' node.

 This fixes the CI failures that we've recently started hitting when
 running the ovn-kubernetes jobs, for example:
 https://github.com/ovn-org/ovn/runs/4294661249?check_suite_focus=true#step:10:9154

 Fixes: 1fa6612ffebf ("northd: Add lflow node")
 Signed-off-by: Dumitru Ceara 
>>>
>>> Hi Dumitru,
>>>
>>
>> Hi Numan,
>>
>>> Thanks for fixing this issue.
>>>
>>
>> Thanks for reviewing this.
>>
>>> In my opinion en_northd engine node should handle the DB syncs and
>>> en_lflow engine node
>>> should handle the flow generation. I think it would be better to move
>>> the syncing  of
>>> NB bfd->state  from en_lflow engine node (ie frombuild_lflows())  to
>>> en_northd engine node.
>>> This would mean en_northd engine node should iterate the logical
>>> static routes and set the
>>> status. Which would mean we need to move out the function
>>> parsed_routes_add() from
>>> build_static_route_flows_for_lrouter().
>>>
>>> What do you think ? Would you agree with this ?
>>>
>>
>> Sounds like this would be the ideal way to implement this, yes.
>>
>>> I'm not too sure about how easy it is ?  I'm inclined to accept this
>>
>> I don't see a very clean way of implementing it though.
>>
>>> patch and then revisit this
>>> as a follow up or once we branch out of 21.12.
>>
>> I think this would be the safest way to go for now.
> 
> Sounds good.  I applied this patch to the main branch.
> 

Thanks!

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: I-P: fix BFD dependency.

2021-11-24 Thread Numan Siddique
On Wed, Nov 24, 2021 at 11:19 AM Dumitru Ceara  wrote:
>
> On 11/24/21 17:15, Numan Siddique wrote:
> > On Tue, Nov 23, 2021 at 7:52 PM Dumitru Ceara  wrote:
> >>
> >> BFD entries are updated and their ->ref field is set to 'true' when
> >> static routes are parsed, within build_lflows(), in the 'en_lflow' I-P
> >> node.  Therefore we cannot clean up BFD entries in 'en_northd' which is
> >> an input of 'en_lflow'.
> >>
> >> To fix this we now move all BFD handling in the 'en_lflow' node.
> >>
> >> This fixes the CI failures that we've recently started hitting when
> >> running the ovn-kubernetes jobs, for example:
> >> https://github.com/ovn-org/ovn/runs/4294661249?check_suite_focus=true#step:10:9154
> >>
> >> Fixes: 1fa6612ffebf ("northd: Add lflow node")
> >> Signed-off-by: Dumitru Ceara 
> >
> > Hi Dumitru,
> >
>
> Hi Numan,
>
> > Thanks for fixing this issue.
> >
>
> Thanks for reviewing this.
>
> > In my opinion en_northd engine node should handle the DB syncs and
> > en_lflow engine node
> > should handle the flow generation. I think it would be better to move
> > the syncing  of
> > NB bfd->state  from en_lflow engine node (ie frombuild_lflows())  to
> > en_northd engine node.
> > This would mean en_northd engine node should iterate the logical
> > static routes and set the
> > status. Which would mean we need to move out the function
> > parsed_routes_add() from
> > build_static_route_flows_for_lrouter().
> >
> > What do you think ? Would you agree with this ?
> >
>
> Sounds like this would be the ideal way to implement this, yes.
>
> > I'm not too sure about how easy it is ?  I'm inclined to accept this
>
> I don't see a very clean way of implementing it though.
>
> > patch and then revisit this
> > as a follow up or once we branch out of 21.12.
>
> I think this would be the safest way to go for now.

Sounds good.  I applied this patch to the main branch.

Numan

>
> >
> > Let me know your thoughts.
> >
> > Thanks
> > Numan
> >
>
> Thanks,
> Dumitru
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: I-P: fix BFD dependency.

2021-11-24 Thread Dumitru Ceara
On 11/24/21 17:15, Numan Siddique wrote:
> On Tue, Nov 23, 2021 at 7:52 PM Dumitru Ceara  wrote:
>>
>> BFD entries are updated and their ->ref field is set to 'true' when
>> static routes are parsed, within build_lflows(), in the 'en_lflow' I-P
>> node.  Therefore we cannot clean up BFD entries in 'en_northd' which is
>> an input of 'en_lflow'.
>>
>> To fix this we now move all BFD handling in the 'en_lflow' node.
>>
>> This fixes the CI failures that we've recently started hitting when
>> running the ovn-kubernetes jobs, for example:
>> https://github.com/ovn-org/ovn/runs/4294661249?check_suite_focus=true#step:10:9154
>>
>> Fixes: 1fa6612ffebf ("northd: Add lflow node")
>> Signed-off-by: Dumitru Ceara 
> 
> Hi Dumitru,
> 

Hi Numan,

> Thanks for fixing this issue.
> 

Thanks for reviewing this.

> In my opinion en_northd engine node should handle the DB syncs and
> en_lflow engine node
> should handle the flow generation. I think it would be better to move
> the syncing  of
> NB bfd->state  from en_lflow engine node (ie frombuild_lflows())  to
> en_northd engine node.
> This would mean en_northd engine node should iterate the logical
> static routes and set the
> status. Which would mean we need to move out the function
> parsed_routes_add() from
> build_static_route_flows_for_lrouter().
> 
> What do you think ? Would you agree with this ?
> 

Sounds like this would be the ideal way to implement this, yes.

> I'm not too sure about how easy it is ?  I'm inclined to accept this

I don't see a very clean way of implementing it though.

> patch and then revisit this
> as a follow up or once we branch out of 21.12.

I think this would be the safest way to go for now.

> 
> Let me know your thoughts.
> 
> Thanks
> Numan
> 

Thanks,
Dumitru

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 3/3] system-dpdk: Test with mlx5 devices.

2021-11-24 Thread Eelco Chaudron


On 24 Nov 2021, at 17:14, Eelco Chaudron wrote:

> On 23 Nov 2021, at 15:15, David Marchand wrote:
>
>> The DPDK unit test only runs if vfio or igb_uio kernel module is loaded:
>> on systems with only mlx5, this test is always skipped.
>>
>> Besides, the test tries to grab the first device listed by dpdk-devbind.py,
>> regardless of the PCI device status regarding kmod binding.
>>
>> Remove dependency on this DPDK script and use a minimal script that
>> reads PCI sysfs.
>>
>> This script is not perfect, as one can imagine PCI devices bound to
>> vfio-pci for virtual machines.
>> Add a new environment variable DPDK_PCI_ADDR for testers to select the
>> PCI device of their liking.
>> For consistency and grep, the temporary file PCI_ADDR is renamed
>> to DPDK_PCI_ADDR.
>>
>> Note: with mlx5 devices, there is now more OVS/DPDK warnings to waive.
>
> Uncovered some build issues with my scripts as now my MLX5 card in the system 
> is used :)
>
> Changes look good to me, one small nit.
>
> Acked-by: Eelco Chaudron 
>
>> Signed-off-by: David Marchand 
>> ---
>>  Documentation/topics/testing.rst |  1 -
>>  tests/automake.mk|  1 +
>>  tests/system-dpdk-find-device.py | 44 
>>  tests/system-dpdk-macros.at  | 10 ++--
>>  tests/system-dpdk.at |  4 ++-
>>  5 files changed, 50 insertions(+), 10 deletions(-)
>>  create mode 100755 tests/system-dpdk-find-device.py
>>
>> diff --git a/Documentation/topics/testing.rst 
>> b/Documentation/topics/testing.rst
>> index 0ddcbd58ab..00f734a77a 100644
>> --- a/Documentation/topics/testing.rst
>> +++ b/Documentation/topics/testing.rst
>> @@ -343,7 +343,6 @@ To see a list of all the available tests, run::
>>
>>  These tests support a `DPDK supported NIC`_. The tests operate on a wider 
>> set of
>>  environments, for instance, when a virtual port is used.
>> -They do require proper DPDK variables (``DPDK_DIR`` and ``DPDK_BUILD``).
>>  Moreover you need to have root privileges to load the required modules and 
>> to bind
>>  the NIC to the DPDK-compatible driver.
>>
>> diff --git a/tests/automake.mk b/tests/automake.mk
>> index c519a5cb36..ead92ace0f 100644
>> --- a/tests/automake.mk
>> +++ b/tests/automake.mk
>> @@ -188,6 +188,7 @@ SYSTEM_OFFLOADS_TESTSUITE_AT = \
>>
>>  SYSTEM_DPDK_TESTSUITE_AT = \
>>  tests/system-common-macros.at \
>> +tests/system-dpdk-find-device.py \
>>  tests/system-dpdk-macros.at \
>>  tests/system-dpdk-testsuite.at \
>>  tests/system-dpdk.at
>> diff --git a/tests/system-dpdk-find-device.py 
>> b/tests/system-dpdk-find-device.py
>> new file mode 100755
>> index 00..4253326e75
>> --- /dev/null
>> +++ b/tests/system-dpdk-find-device.py
>> @@ -0,0 +1,44 @@
>> +#!/usr/bin/env python3

Forgot to complain about the this, i.e. “#!/usr/bin/env python3” ;)

>> +# Copyright (c) 2021 Red Hat, Inc.
>> +#
>> +# Licensed under the Apache License, Version 2.0 (the "License");
>> +# you may not use this file except in compliance with the License.
>> +# You may obtain a copy of the License at:
>> +#
>> +# http://www.apache.org/licenses/LICENSE-2.0
>> +#
>> +# Unless required by applicable law or agreed to in writing, software
>> +# distributed under the License is distributed on an "AS IS" BASIS,
>> +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> +# See the License for the specific language governing permissions and
>> +# limitations under the License.
>> +
>> +
>> +from pathlib import Path
>> +import os
>> +import sys
>> +
>> +# The tester might want to select a PCI device, if so, trust it.
>> +if 'DPDK_PCI_ADDR' in os.environ:
>> +print(os.environ['DPDK_PCI_ADDR'])
>> +sys.exit(0)
>> +
>> +mlx5_ib_available = Path('/sys/module/mlx5_ib').exists()
>> +
>> +for device in sorted(Path('/sys/bus/pci/devices').iterdir()):
>> +class_path = device / 'class'
>> +# Only consider Network class devices
>> +if class_path.read_text().strip() != '0x02':
>> +continue
>> +kmod_path = device / 'driver' / 'module'
>> +kmod_name = kmod_path.resolve().name
>> +# Devices of interest:
>> +# - device is bound to vfio_pci or igb_uio,
>> +# - device is bound to mlx5_core and mlx5 is loaded,
>> +if (kmod_name not in ['vfio_pci', 'igb_uio'] and
>> +(kmod_name not in ['mlx5_core'] or not mlx5_ib_available)):
>
> Flake8 is complaining about this. But for some reason, I like this better ;)
> Quick online search reviews unclear interpretations, so I would leave it as 
> is…
>
> system-dpdk-find-device.py:39:9: E129 visually indented line with same indent 
> as next logical line
>
>> +continue
>> +print(device.resolve().name)
>> +sys.exit(0)
>> +
>> +sys.exit(1)
>> diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
>> index b3614f0565..066fae0ba5 100644
>> --- a/tests/system-dpdk-macros.at
>> +++ b/tests/system-dpdk-macros.at
>> @@ -22,14 +22,8 @@ 

Re: [ovs-dev] [PATCH ovn] northd: I-P: fix BFD dependency.

2021-11-24 Thread Numan Siddique
On Tue, Nov 23, 2021 at 7:52 PM Dumitru Ceara  wrote:
>
> BFD entries are updated and their ->ref field is set to 'true' when
> static routes are parsed, within build_lflows(), in the 'en_lflow' I-P
> node.  Therefore we cannot clean up BFD entries in 'en_northd' which is
> an input of 'en_lflow'.
>
> To fix this we now move all BFD handling in the 'en_lflow' node.
>
> This fixes the CI failures that we've recently started hitting when
> running the ovn-kubernetes jobs, for example:
> https://github.com/ovn-org/ovn/runs/4294661249?check_suite_focus=true#step:10:9154
>
> Fixes: 1fa6612ffebf ("northd: Add lflow node")
> Signed-off-by: Dumitru Ceara 

Hi Dumitru,

Thanks for fixing this issue.

In my opinion en_northd engine node should handle the DB syncs and
en_lflow engine node
should handle the flow generation. I think it would be better to move
the syncing  of
NB bfd->state  from en_lflow engine node (ie frombuild_lflows())  to
en_northd engine node.
This would mean en_northd engine node should iterate the logical
static routes and set the
status. Which would mean we need to move out the function
parsed_routes_add() from
build_static_route_flows_for_lrouter().

What do you think ? Would you agree with this ?

I'm not too sure about how easy it is ?  I'm inclined to accept this
patch and then revisit this
as a follow up or once we branch out of 21.12.

Let me know your thoughts.

Thanks
Numan


>  northd/en-lflow.c|  8 
>  northd/en-northd.c   |  4 
>  northd/inc-proc-northd.c |  4 ++--
>  northd/northd.c  | 11 ---
>  northd/northd.h  | 11 +--
>  tests/ovn-northd.at  |  8 
>  6 files changed, 31 insertions(+), 15 deletions(-)
>
> diff --git a/northd/en-lflow.c b/northd/en-lflow.c
> index 5bef35cf6..ffbdaf4e8 100644
> --- a/northd/en-lflow.c
> +++ b/northd/en-lflow.c
> @@ -38,6 +38,10 @@ void en_lflow_run(struct engine_node *node, void *data 
> OVS_UNUSED)
>
>  struct northd_data *northd_data = engine_get_input_data("northd", node);
>
> +lflow_input.nbrec_bfd_table =
> +EN_OVSDB_GET(engine_get_input("NB_bfd", node));
> +lflow_input.sbrec_bfd_table =
> +EN_OVSDB_GET(engine_get_input("SB_bfd", node));
>  lflow_input.sbrec_logical_flow_table =
>  EN_OVSDB_GET(engine_get_input("SB_logical_flow", node));
>  lflow_input.sbrec_multicast_group_table =
> @@ -60,7 +64,11 @@ void en_lflow_run(struct engine_node *node, void *data 
> OVS_UNUSED)
>northd_data->ovn_internal_version_changed;
>
>  stopwatch_start(BUILD_LFLOWS_STOPWATCH_NAME, time_msec());
> +build_bfd_table(_input, eng_ctx->ovnsb_idl_txn,
> +_data->bfd_connections,
> +_data->ports);
>  build_lflows(_input, eng_ctx->ovnsb_idl_txn);
> +bfd_cleanup_connections(_input, _data->bfd_connections);
>  stopwatch_stop(BUILD_LFLOWS_STOPWATCH_NAME, time_msec());
>
>  engine_set_node_state(node, EN_UPDATED);
> diff --git a/northd/en-northd.c b/northd/en-northd.c
> index 0523560f8..79da7e1c4 100644
> --- a/northd/en-northd.c
> +++ b/northd/en-northd.c
> @@ -66,8 +66,6 @@ void en_northd_run(struct engine_node *node, void *data)
>  EN_OVSDB_GET(engine_get_input("NB_load_balancer", node));
>  input_data.nbrec_port_group_table =
>  EN_OVSDB_GET(engine_get_input("NB_port_group", node));
> -input_data.nbrec_bfd_table =
> -EN_OVSDB_GET(engine_get_input("NB_bfd", node));
>  input_data.nbrec_address_set_table =
>  EN_OVSDB_GET(engine_get_input("NB_address_set", node));
>  input_data.nbrec_meter_table =
> @@ -93,8 +91,6 @@ void en_northd_run(struct engine_node *node, void *data)
>  EN_OVSDB_GET(engine_get_input("SB_load_balancer", node));
>  input_data.sbrec_service_monitor_table =
>  EN_OVSDB_GET(engine_get_input("SB_service_monitor", node));
> -input_data.sbrec_bfd_table =
> -EN_OVSDB_GET(engine_get_input("SB_bfd", node));
>  input_data.sbrec_address_set_table =
>  EN_OVSDB_GET(engine_get_input("SB_address_set", node));
>  input_data.sbrec_port_group_table =
> diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
> index 8e7428dda..c02c5a44a 100644
> --- a/northd/inc-proc-northd.c
> +++ b/northd/inc-proc-northd.c
> @@ -178,7 +178,6 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
>  engine_add_input(_northd, _nb_gateway_chassis, NULL);
>  engine_add_input(_northd, _nb_ha_chassis_group, NULL);
>  engine_add_input(_northd, _nb_ha_chassis, NULL);
> -engine_add_input(_northd, _nb_bfd, NULL);
>
>  engine_add_input(_northd, _sb_sb_global, NULL);
>  engine_add_input(_northd, _sb_chassis, NULL);
> @@ -206,8 +205,9 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
>  engine_add_input(_northd, _sb_ip_multicast, NULL);
>  engine_add_input(_northd, _sb_service_monitor, NULL);
>  engine_add_input(_northd, _sb_load_balancer, NULL);
> -

Re: [ovs-dev] [PATCH v2 3/3] system-dpdk: Test with mlx5 devices.

2021-11-24 Thread Eelco Chaudron


On 23 Nov 2021, at 15:15, David Marchand wrote:

> The DPDK unit test only runs if vfio or igb_uio kernel module is loaded:
> on systems with only mlx5, this test is always skipped.
>
> Besides, the test tries to grab the first device listed by dpdk-devbind.py,
> regardless of the PCI device status regarding kmod binding.
>
> Remove dependency on this DPDK script and use a minimal script that
> reads PCI sysfs.
>
> This script is not perfect, as one can imagine PCI devices bound to
> vfio-pci for virtual machines.
> Add a new environment variable DPDK_PCI_ADDR for testers to select the
> PCI device of their liking.
> For consistency and grep, the temporary file PCI_ADDR is renamed
> to DPDK_PCI_ADDR.
>
> Note: with mlx5 devices, there is now more OVS/DPDK warnings to waive.

Uncovered some build issues with my scripts as now my MLX5 card in the system 
is used :)

Changes look good to me, one small nit.

Acked-by: Eelco Chaudron 

> Signed-off-by: David Marchand 
> ---
>  Documentation/topics/testing.rst |  1 -
>  tests/automake.mk|  1 +
>  tests/system-dpdk-find-device.py | 44 
>  tests/system-dpdk-macros.at  | 10 ++--
>  tests/system-dpdk.at |  4 ++-
>  5 files changed, 50 insertions(+), 10 deletions(-)
>  create mode 100755 tests/system-dpdk-find-device.py
>
> diff --git a/Documentation/topics/testing.rst 
> b/Documentation/topics/testing.rst
> index 0ddcbd58ab..00f734a77a 100644
> --- a/Documentation/topics/testing.rst
> +++ b/Documentation/topics/testing.rst
> @@ -343,7 +343,6 @@ To see a list of all the available tests, run::
>
>  These tests support a `DPDK supported NIC`_. The tests operate on a wider 
> set of
>  environments, for instance, when a virtual port is used.
> -They do require proper DPDK variables (``DPDK_DIR`` and ``DPDK_BUILD``).
>  Moreover you need to have root privileges to load the required modules and 
> to bind
>  the NIC to the DPDK-compatible driver.
>
> diff --git a/tests/automake.mk b/tests/automake.mk
> index c519a5cb36..ead92ace0f 100644
> --- a/tests/automake.mk
> +++ b/tests/automake.mk
> @@ -188,6 +188,7 @@ SYSTEM_OFFLOADS_TESTSUITE_AT = \
>
>  SYSTEM_DPDK_TESTSUITE_AT = \
>   tests/system-common-macros.at \
> + tests/system-dpdk-find-device.py \
>   tests/system-dpdk-macros.at \
>   tests/system-dpdk-testsuite.at \
>   tests/system-dpdk.at
> diff --git a/tests/system-dpdk-find-device.py 
> b/tests/system-dpdk-find-device.py
> new file mode 100755
> index 00..4253326e75
> --- /dev/null
> +++ b/tests/system-dpdk-find-device.py
> @@ -0,0 +1,44 @@
> +#!/usr/bin/env python3
> +# Copyright (c) 2021 Red Hat, Inc.
> +#
> +# Licensed under the Apache License, Version 2.0 (the "License");
> +# you may not use this file except in compliance with the License.
> +# You may obtain a copy of the License at:
> +#
> +# http://www.apache.org/licenses/LICENSE-2.0
> +#
> +# Unless required by applicable law or agreed to in writing, software
> +# distributed under the License is distributed on an "AS IS" BASIS,
> +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> +# See the License for the specific language governing permissions and
> +# limitations under the License.
> +
> +
> +from pathlib import Path
> +import os
> +import sys
> +
> +# The tester might want to select a PCI device, if so, trust it.
> +if 'DPDK_PCI_ADDR' in os.environ:
> +print(os.environ['DPDK_PCI_ADDR'])
> +sys.exit(0)
> +
> +mlx5_ib_available = Path('/sys/module/mlx5_ib').exists()
> +
> +for device in sorted(Path('/sys/bus/pci/devices').iterdir()):
> +class_path = device / 'class'
> +# Only consider Network class devices
> +if class_path.read_text().strip() != '0x02':
> +continue
> +kmod_path = device / 'driver' / 'module'
> +kmod_name = kmod_path.resolve().name
> +# Devices of interest:
> +# - device is bound to vfio_pci or igb_uio,
> +# - device is bound to mlx5_core and mlx5 is loaded,
> +if (kmod_name not in ['vfio_pci', 'igb_uio'] and
> +(kmod_name not in ['mlx5_core'] or not mlx5_ib_available)):

Flake8 is complaining about this. But for some reason, I like this better ;)
Quick online search reviews unclear interpretations, so I would leave it as is…

system-dpdk-find-device.py:39:9: E129 visually indented line with same indent 
as next logical line

> +continue
> +print(device.resolve().name)
> +sys.exit(0)
> +
> +sys.exit(1)
> diff --git a/tests/system-dpdk-macros.at b/tests/system-dpdk-macros.at
> index b3614f0565..066fae0ba5 100644
> --- a/tests/system-dpdk-macros.at
> +++ b/tests/system-dpdk-macros.at
> @@ -22,14 +22,8 @@ m4_define([OVS_DPDK_PRE_PHY_SKIP],
>[dnl Perform the precheck
> OVS_DPDK_PRE_CHECK()
>
> -   dnl Check if VFIO or UIO driver is loaded
> -   AT_SKIP_IF([ ! (lsmod | grep -E "igb_uio|vfio") ], [], [stdout])
> -
> -   dnl Find PCI address candidate, skip if there is no 

Re: [ovs-dev] [PATCH v2 2/3] system-dpdk: Use dummy-pmd port for packet injection.

2021-11-24 Thread Eelco Chaudron
On 23 Nov 2021, at 15:15, David Marchand wrote:

> net_pcap is not always available in DPDK (like, in a dev
> environment when you forgot to install the libpcap-devel).
> On the other hand, OVS already has its own way to inject packets into a
> bridge. Let's make use of it.
>
> This solution is slower than net/pcap DPDK, so lower the number of
> expected packets so that it runs in OVS_CTL_TIMEOUT seconds.
>
> While at it, convert "known" packets from pcap to scapy so that the
> injected packets can be updated without having to read/write a pcap file.
>
> Note: this change also (avoids/) fixes a python exception in PcapWriter
> with scapy 2.4.3 that comes from EPEL.
>
> Suggested-by: Ilya Maximets 
> Signed-off-by: David Marchand 

One small nit below, rest looks good!

> ---
> Changes since v1:
> - renamed generator script,
> - decreased packet count for fuzzy test,
> - simplified wait expression for packet count,
>
> ---
>  tests/automake.mk   |   5 ++--
>  tests/genpkts.py|  56 
>  tests/mfex_fuzzy.py |  32 -
>  tests/pcap/mfex_test.pcap   | Bin 416 -> 0 bytes
>  tests/system-dpdk-macros.at |   4 +--
>  tests/system-dpdk.at|  30 +++
>  6 files changed, 84 insertions(+), 43 deletions(-)
>  create mode 100755 tests/genpkts.py
>  delete mode 100755 tests/mfex_fuzzy.py
>  delete mode 100644 tests/pcap/mfex_test.pcap
>
> diff --git a/tests/automake.mk b/tests/automake.mk
> index 43731d0973..c519a5cb36 100644
> --- a/tests/automake.mk
> +++ b/tests/automake.mk
> @@ -145,8 +145,7 @@ $(srcdir)/tests/fuzz-regression-list.at: tests/automake.mk
>
>  EXTRA_DIST += $(MFEX_AUTOVALIDATOR_TESTS)
>  MFEX_AUTOVALIDATOR_TESTS = \
> - tests/pcap/mfex_test.pcap \
> - tests/mfex_fuzzy.py
> + tests/genpkts.py
>
>  OVSDB_CLUSTER_TESTSUITE_AT = \
>   tests/ovsdb-cluster-testsuite.at \
> @@ -518,7 +517,7 @@ tests_test_type_props_SOURCES = tests/test-type-props.c
>  CHECK_PYFILES = \
>   tests/appctl.py \
>   tests/flowgen.py \
> - tests/mfex_fuzzy.py \
> + tests/genpkts.py \
>   tests/ovsdb-monitor-sort.py \
>   tests/test-daemon.py \
>   tests/test-json.py \
> diff --git a/tests/genpkts.py b/tests/genpkts.py
> new file mode 100755
> index 00..e243bb960f
> --- /dev/null
> +++ b/tests/genpkts.py
> @@ -0,0 +1,56 @@
> +#!/usr/bin/python3

Maybe change this to “#!/usr/bin/env python3” as other scripts have.

> +
> +import sys
> +
> +from scapy.all import RandMAC, RandIP, RandIP6, RandShort, fuzz
> +from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP
> +
> +if len(sys.argv) < 2:
> +print('usage: {} packets_count [fuzz]'.format(sys.argv[0]))
> +sys.exit(1)
> +
> +tmpl = []
> +
> +if len(sys.argv) == 2:
> +eth = Ether(dst='ff:ff:ff:ff:ff:ff')
> +vlan = eth / Dot1Q(vlan=1)
> +p = eth / IP() / TCP(sport=20, dport=80, flags='SA', window=8192)
> +tmpl += [p.build().hex()]
> +p = eth / IP() / UDP(sport=53, dport=53)
> +tmpl += [p.build().hex()]
> +p = eth / IP() / TCP(sport=20, dport=80, flags='S', window=8192)
> +tmpl += [p.build().hex()]
> +p = eth / IP() / UDP(sport=53, dport=53)
> +tmpl += [p.build().hex()]
> +p = vlan / IP() / UDP(sport=53, dport=53)
> +tmpl += [p.build().hex()]
> +p = vlan / IP() / TCP(sport=20, dport=80, flags='S', window=8192)
> +tmpl += [p.build().hex()]
> +elif sys.argv[2] == 'fuzz':
> +# Generate random protocol bases, use a fuzz() over the combined packet
> +# for full fuzzing.
> +eth = Ether(src=RandMAC(), dst=RandMAC())
> +vlan = Dot1Q()
> +ipv4 = IP(src=RandIP(), dst=RandIP())
> +ipv6 = IPv6(src=RandIP6(), dst=RandIP6())
> +udp = UDP(dport=RandShort(), sport=RandShort())
> +tcp = TCP(dport=RandShort(), sport=RandShort())
> +
> +# IPv4 packets with fuzzing
> +tmpl += [fuzz(eth / ipv4 / udp).build().hex()]
> +tmpl += [fuzz(eth / ipv4 / tcp).build().hex()]
> +tmpl += [fuzz(eth / vlan / ipv4 / udp).build().hex()]
> +tmpl += [fuzz(eth / vlan / ipv4 / tcp).build().hex()]
> +
> +# IPv6 packets with fuzzing
> +tmpl += [fuzz(eth / ipv6 / udp).build().hex()]
> +tmpl += [fuzz(eth / ipv6 / tcp).build().hex()]
> +tmpl += [fuzz(eth / vlan / ipv6 / udp).build().hex()]
> +tmpl += [fuzz(eth / vlan / ipv6 / tcp).build().hex()]
> +
> +i = 0
> +count = int(sys.argv[1])
> +while count != 0:
> +print(tmpl[i % len(tmpl)])
> +i += 1
> +count -= 1
> diff --git a/tests/mfex_fuzzy.py b/tests/mfex_fuzzy.py
> deleted file mode 100755
> index 3efe1152da..00
> --- a/tests/mfex_fuzzy.py
> +++ /dev/null
> @@ -1,32 +0,0 @@
> -#!/usr/bin/python3
> -
> -import sys
> -
> -from scapy.all import RandMAC, RandIP, PcapWriter, RandIP6, RandShort, fuzz
> -from scapy.all import IPv6, Dot1Q, IP, Ether, UDP, TCP
> -
> -path = str(sys.argv[1]) + "/pcap/fuzzy.pcap"
> -pktdump = PcapWriter(path, append=False, sync=True)
> -
> -for i in range(0, 

Re: [ovs-dev] [PATCH v2 1/3] system-dpdk: Refactor common logs matching.

2021-11-24 Thread Eelco Chaudron



On 23 Nov 2021, at 15:15, David Marchand wrote:

> Move EAL logs and commonly ignored logs to a common macro.
> Remove obsolete ones (like i40e [1] and timer [2] logs).
> Extend regex on hugepage logs since a check on hugepages availability is
> already present on OVS side, and as a consequence, we don't care about
> the warnings on availability for certain hugepage size.
> Add logs checks for MFEX tests that were missing them.
>
> 1: https://git.dpdk.org/dpdk/commit/?id=a075ce2b3e8c
> 2: https://git.dpdk.org/dpdk/commit/?id=c1077933d45b
>
> Signed-off-by: David Marchand 

Changes look good to me! Could not test with AVX512 as I have no hardware.

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] netdev-vport : Fix userspace tunnel ioctl(SIOCGIFINDEX) info logs.

2021-11-24 Thread 0-day Robot
Bleep bloop.  Greetings lin huang, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Author linhuang  needs to sign off.
WARNING: Unexpected sign-offs from developers who are not authors or co-authors 
or committers: Lin Huang 
Lines checked: 60, Warnings: 1, Errors: 1


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] netdev-vport : Fix userspace tunnel ioctl(SIOCGIFINDEX) info logs.

2021-11-24 Thread lin huang
From: linhuang 

Userspace tunnel doesn't have a valid device in the kernel. So
get_ifindex() function (ioctl) always get error during
adding a port, deleting a port or updating a port status.

The info log is
"2021-08-29T09:17:39.830Z|00059|netdev_linux|INFO|ioctl(SIOCGIFINDEX)
on vxlan_sys_4789 device failed: No such device"

If there are a lot of userspace tunnel ports on a bridge, the
iface_refresh_netdev_status() function will spend a lot of time.

So ignore userspace tunnel port ioctl(SIOCGIFINDEX) operation, just
return -ENODEV.

Signed-off-by: Lin Huang 
Test-by: Mike Pattrick 
Reviewed-by: Aaron Conole 
Reviewed-by: Ilya Maximets 
---
 lib/netdev-vport.c | 4 +++-
 vswitchd/bridge.c  | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-vport.c b/lib/netdev-vport.c
index 499c029..f0ff02b 100644
--- a/lib/netdev-vport.c
+++ b/lib/netdev-vport.c
@@ -1151,8 +1151,10 @@ netdev_vport_get_ifindex(const struct netdev *netdev_)
 {
 char buf[NETDEV_VPORT_NAME_BUFSIZE];
 const char *name = netdev_vport_get_dpif_port(netdev_, buf, sizeof(buf));
+const char *dpif_type = netdev_get_dpif_type(netdev_);

-return linux_get_ifindex(name);
+return (dpif_type && !strcmp(dpif_type, "system")
+? linux_get_ifindex(name) : -ENODEV);
 }

 #define NETDEV_VPORT_GET_IFINDEX netdev_vport_get_ifindex
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index 5223aa8..513ef7e 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -2052,6 +2052,8 @@ iface_do_create(const struct bridge *br,
 goto error;
 }

+netdev_set_dpif_type(netdev, br->ofproto->type);
+
 error = iface_set_netdev_config(iface_cfg, netdev, errp);
 if (error) {
 goto error;
--
1.8.3.1
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netlink-socket: Check for null sock in nl_sock_recv__()

2021-11-24 Thread Flavio Leitner
On Tue, Nov 23, 2021 at 07:58:52PM -0300, Murilo Opsfelder Araújo wrote:
> Hi, Flavio.
> 
> On 11/23/21 15:32, Flavio Leitner wrote:
> > On Mon, Nov 22, 2021 at 03:46:28PM -0300, Murilo Opsfelder Araújo wrote:
> > > Hi, Ilya Maximets.
> > > 
> > > On 11/19/21 13:23, Ilya Maximets wrote:
> > > > On 11/18/21 22:19, David Christensen wrote:
> > > > > 
> > > > > 
> > > > > On 11/18/21 11:56 AM, Murilo Opsfelder Araújo wrote:
> > > > > > On 11/16/21 19:31, Ilya Maximets wrote:
> > > > > > > On 10/25/21 19:45, David Christensen wrote:
> > > > > > > > In certain high load situations, such as when creating a large 
> > > > > > > > number of
> > > > > > > > ports on a switch, the parameter 'sock' may be passed to 
> > > > > > > > nl_sock_recv__()
> > > > > > > > as null, resulting in a segmentation fault when 'sock' is later
> > > > > > > > dereferenced, such as when calling recvmsg().
> > > > > > > 
> > > > > > > Hi, David.  Thanks for the patch.
> > > > > > > 
> > > > > > > It's OK to check for a NULL pointer there, I guess.  However,
> > > > > > > do you know from where it was actually called?  This function,
> > > > > > > in general, should not be called without the actual socket,
> > > > > > > so we, probably, should fix the caller instead.
> > > > > > > 
> > > > > > > Best regards, Ilya Maximets.
> > > > > > 
> > > > > > Hi, Ilya Maximets.
> > > > > > 
> > > > > > When I looked at the coredump file, ch->sock was nil and was passed 
> > > > > > to nl_sock_recv():
> > > > > > 
> > > > > > (gdb) l
> > > > > > 2701
> > > > > > 2702    while (handler->event_offset < handler->n_events) {
> > > > > > 2703    int idx = 
> > > > > > handler->epoll_events[handler->event_offset].data.u32;
> > > > > > 2704    struct dpif_channel *ch = >channels[idx];
> > > > > > 
> > > > > > (gdb) p idx
> > > > > > $26 = 4
> > > > > > (gdb) p *dpif->channels@5
> > > > > > $27 = {{sock = 0x1001ae88240, last_poll = -9223372036854775808}, 
> > > > > > {sock = 0x1001aa9a8a0, last_poll = -9223372036854775808}, {sock = 
> > > > > > 0x1001ae09510, last_poll = 60634070}, {sock = 0x1001a9dbb60, 
> > > > > > last_poll = 60756950}, {sock = 0x0,
> > > > > >    last_poll = 61340749}}
> > > > > > 
> > > > > > 
> > > > > > The above snippet is from lib/dpif-netlink.c and the caller is 
> > > > > > dpif_netlink_recv_vport_dispatch().
> > > > > > 
> > > > > > The channel at idx=4 had sock=0x0, which was passed to 
> > > > > > nl_sock_recv() via ch->sock parameter.
> > > > > > In that function, it tried to access sock->fd when calling 
> > > > > > recvmsg(), causing the segfault.
> > > > > > 
> > > > > > I'm not enough experienced in Open vSwitch to explain why sock was 
> > > > > > nil at that given index.
> > > > > > The fix seems worth, though.
> > > > > 
> > > > > A few other points of note:
> > > > > 
> > > > > - Test system was a very large configuration (2K CPUs, > 1TB RAM)
> > > > > - OVS Switch was configured with 6K ports as follows:
> > > > > 
> > > > > # b=br0; cmds=; for i in {1..6000}; do cmds+=" -- add-port $b p$i -- 
> > > > > set interface p$i type=internal"; done
> > > > > # time sudo ovs-vsctl $cmds
> > > > > 
> > > > > - OVS was installed from RHEL RPM.  Build from source did not exhibit 
> > > > > the same problem.
> > > > > - Unable to reproduce on a different system (128 CPUs, 256GB RAM), 
> > > > > even with 10K ports.
> > > > > 
> > > > > Dave
> > > > 
> > > > Thanks for all the information.  Having a NULL socket in the 'channels'
> > > > array doesn't look a good sign.  This may mean that even if OVS will
> > > > not crash, it may not receive upcalls for certain ports, which is not
> > > > good eihter.
> > > > 
> > > > Could you provide more info on which RHEL package did you use and which
> > > > version you built from source?  Maybe we can find the difference.
> > > 
> > > When the bug was reported internally, the distro package was actually 
> > > from Fedora 33 ppc64le.
> > > 
> > > One difference I've found at that time was regarding the configure 
> > > options.
> > > 
> > >  From the logs at [0], the distro package was configured with these 
> > > options:
> > > 
> > >  + ./configure --build=ppc64le-redhat-linux-gnu 
> > > --host=ppc64le-redhat-linux-gnu --program-prefix= 
> > > --disable-dependency-tracking --prefix=/usr --exec-prefix=/usr 
> > > --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc 
> > > --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 
> > > --libexecdir=/usr/libexec --localstatedir=/var --sharedstatedir=/var/lib 
> > > --mandir=/usr/share/man --infodir=/usr/share/info --enable-libcapng 
> > > --disable-static --enable-shared --enable-ssl --with-dpdk=shared 
> > > --with-pkidir=/var/lib/openvswitch/pki
> > > 
> > > And the built-from-source binaries from Dave were built like this:
> > > 
> > >  git clone https://github.com/openvswitch/ovs.git
> > >  cd ovs/
> > >  git checkout v2.15.0
> > 
> > Do you mean 2.16.0? Because the 

Re: [ovs-dev] [PATCH dpdk-latest v3 0/5] Fixes for DPDK v21.11

2021-11-24 Thread David Marchand
Hello Ian,

On Wed, Nov 24, 2021 at 12:17 PM Stokes, Ian  wrote:
> > On Wed, Nov 10, 2021 at 3:19 PM Stokes, Ian  wrote:
> > > Thanks for the series. Looks good on the github actions to myself. I'd 
> > > like to
> > get OVS up and running on validation with rc2 so I've pushed the series to
> > dpdk-latest and rebased.
> > >
> > > Folks on the Intel side will be carrying out mode detailed OVS testing 
> > > over
> > the next few days so it's great to have this enabled now with RC2.
> >
> > It would be cool to get some test report before v21.11 is released
> > (expected end of next week).
> > Did you guys have a go with v21.11-rc2, or rc3?
>
> Hi David, we have finished our testing with RC2, are now looking at what 
> tests to re-run for rc3. I can provide a response to the DPDK ML or here when 
> finished RC3. What would you prefer?

If you have a report for rc2, yes, please send it.


Fwiw, the changes affecting main components and Intel drivers in rc3 are:
$ git ll v21.11-rc2..v21.11-rc3 -- lib/eal lib/ethdev lib/kvargs
lib/mbuf lib/mempool lib/net lib/pci lib/stack lib/vhost
drivers/bus/{vdev,pci} drivers/net/{ixgbe,i40e,ice,iavf}
drivers/common/iavf
25cf263074 - net: add macro for VLAN header length (7 days ago) 
46e461565c - net/iavf: fix VLAN insertion (8 days ago) 
798155194c - net/ice: fix secondary process Rx offload path (8 days
ago) 
b5d753f233 - net/ice/base: fix GTPU UL/DL flag (8 days ago) 
af4d7ad5d9 - vhost: fix packed ring descriptor update in async enqueue
(8 days ago) 
4a6672c2d3 - fix spelling in comments and doxygen (8 days ago)

f2a66612ee - eal/ppc: support ASan (8 days ago) 
5139502783 - ethdev: fix typos (9 days ago) 
c1c99d3421 - net/ice: fix pointer of meta data (12 days ago) 
f96789aeaf - net/iavf: fix pointer of meta data (13 days ago) 
cdce4bd37c - net/iavf: fix xstats reset (13 days ago) 
c4d3e8fbe4 - net/i40e: fix risk in descriptor read in scalar Rx (13
days ago) 
9143a09ded - net/ice: fix adminq handling on DCF passive reset (2
weeks ago) 
64be0e779f - ethdev: fix device capability to string translation (2
weeks ago) 
001d402c89 - eal/arm64: support ASan (12 days ago) 

There is not much chance DPDK broke something :-).
Maybe worth double checking VLAN.


rc4 is scheduled today (end of day, GMT+1) and, seeing how we got no
report for critical bugs, the final release is still on track for
11/26.


Thanks.


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] northd: I-P: fix BFD dependency.

2021-11-24 Thread Dumitru Ceara
On 11/24/21 01:51, Dumitru Ceara wrote:
> BFD entries are updated and their ->ref field is set to 'true' when
> static routes are parsed, within build_lflows(), in the 'en_lflow' I-P
> node.  Therefore we cannot clean up BFD entries in 'en_northd' which is
> an input of 'en_lflow'.
> 
> To fix this we now move all BFD handling in the 'en_lflow' node.
> 
> This fixes the CI failures that we've recently started hitting when
> running the ovn-kubernetes jobs, for example:
> https://github.com/ovn-org/ovn/runs/4294661249?check_suite_focus=true#step:10:9154
> 
> Fixes: 1fa6612ffebf ("northd: Add lflow node")
> Signed-off-by: Dumitru Ceara 
> ---

Successful CI run:

https://mail.openvswitch.org/pipermail/ovs-build/2021-November/018639.html

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4] utilities: Add another GDB macro for ovs-vswitchd

2021-11-24 Thread Eelco Chaudron


On 19 Nov 2021, at 15:35, Mike Pattrick wrote:

> This commit adds a basic packet metadata macro to the already existing
> macros in ovs_gdb.py, ovs_dump_packets will print out information about
> one or more packets. It feeds packets into tcpdump, and the user can
> pass in tcpdump options to modify how packets are parsed or even write
> out packets to a pcap file.
>
> Example usage:
> (gdb) break fast_path_processing
> (gdb) commands
>  ovs_dump_packets packets_
>  continue
>  end

Guess you removed the > before the commands from the previous commit, guess 
another mail client issue.
However should not matter, it’s just a commit message.

Rest looks good to me (one small nit below, maybe can be fixed on commit)

Acked-by: Eelco Chaudron 

> (gdb) continue
>
> Thread 1 "ovs-vswitchd" hit Breakpoint 2, fast_path_processing ...
> 12:01:05.962485 ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.1.1.1 
> tell 10.1.1.2, length 28
> Thread 1 "ovs-vswitchd" hit Breakpoint 1, fast_path_processing ...
> 12:01:05.981214 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.1.1.1 is-at 
> a6:0f:c3:f0:5f:bd (oui Unknown), length 28
>
> Signed-off-by: Mike Pattrick 
> ---
>  utilities/gdb/ovs_gdb.py | 114 ++-
>  1 file changed, 112 insertions(+), 2 deletions(-)
>
> diff --git a/utilities/gdb/ovs_gdb.py b/utilities/gdb/ovs_gdb.py
> index 0b2ecb81b..9da83920b 100644
> --- a/utilities/gdb/ovs_gdb.py
> +++ b/utilities/gdb/ovs_gdb.py
> @@ -29,6 +29,7 @@
>  #- ovs_dump_netdev
>  #- ovs_dump_netdev_provider
>  #- ovs_dump_ovs_list  {[] [] 
> {dump}]}
> +#- ovs_dump_packets  [tcpdump 
> options]
>  #- ovs_dump_simap 
>  #- ovs_dump_smap 
>  #- ovs_dump_udpif_keys {|} {short}
> @@ -56,8 +57,15 @@
>  #...
>  #
>  import gdb
> +import struct
>  import sys
>  import uuid
> +try:
> +from scapy.layers.l2 import Ether
> +from scapy.utils import tcpdump
> +except ModuleNotFoundError:
> +Ether = None
> +tcpdump = None
>
>
>  #
> @@ -138,7 +146,7 @@ def get_time_msec():
>
>  def get_time_now():
>  # See get_time_msec() above
> -return int(get_global_variable("coverage_run_time"))/1000, -5
> +return int(get_global_variable("coverage_run_time")) / 1000, -5
>
>
>  def eth_addr_to_string(eth_addr):
> @@ -156,7 +164,7 @@ def eth_addr_to_string(eth_addr):
>  #
>  class ProgressIndicator(object):
>  def __init__(self, message=None):
> -self.spinner = "/-\|"
> +self.spinner = "/-\\|"
>  self.spinner_index = 0
>  self.message = message
>
> @@ -1306,6 +1314,107 @@ class CmdDumpOfpacts(gdb.Command):
>  print("(struct ofpact *) {}: {}".format(node, 
> node.dereference()))
>
>
> +#
> +# Implements the GDB "ovs_dump_packets" command
> +#
> +class CmdDumpPackets(gdb.Command):
> +"""Dump metadata about dp_packets
> +Usage: ovs_dump_packets  
> [tcpdump options]
> +
> +This command can take either a dp_packet_batch struct and print out
> +metadata about all packets in this batch, or a single dp_packet struct 
> and
> +print out metadata about a single packet.
> +
> +Everything after the struct reference is passed into tcpdump. If nothing
> +is passed in as a tcpdump option, the default is "-n".
> +
> +(gdb) ovs_dump_packets packets_
> +12:01:05.981214 ARP, Ethernet (len 6), IPv4 (len 4), Reply 10.1.1.1 
> is-at a6:0f:c3:f0:5f:bd (oui Unknown), length 28
> +"""
> +def __init__(self):
> +super().__init__("ovs_dump_packets", gdb.COMMAND_DATA)
> +
> +def invoke(self, arg, from_tty):
> +if Ether is None:
> +print("ERROR: This command requires scapy to be installed.")
> +
> +arg_list = gdb.string_to_argv(arg)
> +if len(arg_list) == 0:
> +print("Usage: ovs_dump_packets  +  "struct dp_packet> [tcpdump options]")
> +return
> +
> +symb_name = arg_list[0]
> +tcpdump_args = arg_list[1:]
> +
> +if not tcpdump_args:
> +# Add a sane default
> +tcpdump_args = ["-n"]
> +
> +val = gdb.parse_and_eval(symb_name)
> +while val.type.code == gdb.TYPE_CODE_PTR:
> +val = val.dereference()
> +
> +pkt_list = []
> +if str(val.type).startswith("struct dp_packet_batch"):
> +for idx in range(val['count']):
> +pkt_struct = val['packets'][idx].dereference()
> +pkt = self.extract_pkt(pkt_struct)
> +if pkt is None:
> +continue
> +pkt_list.append(pkt)
> +elif str(val.type) == "struct dp_packet":
> +pkt = self.extract_pkt(val)
> +if pkt is None:
> +return
> +pkt_list.append(pkt)
> +else:
> +print("Error, unsupported argument type: 
> {}".format(str(val.type)))

Guess you format to fix the general ERROR format, i.e.:

print(“ERROR: Unsupported argument 

Re: [ovs-dev] [PATCH dpdk-latest v3 0/5] Fixes for DPDK v21.11

2021-11-24 Thread Stokes, Ian
> Hello Ian,
> 
> On Wed, Nov 10, 2021 at 3:19 PM Stokes, Ian  wrote:
> > Thanks for the series. Looks good on the github actions to myself. I'd like 
> > to
> get OVS up and running on validation with rc2 so I've pushed the series to
> dpdk-latest and rebased.
> >
> > Folks on the Intel side will be carrying out mode detailed OVS testing over
> the next few days so it's great to have this enabled now with RC2.
> 
> It would be cool to get some test report before v21.11 is released
> (expected end of next week).
> Did you guys have a go with v21.11-rc2, or rc3?

Hi David, we have finished our testing with RC2, are now looking at what tests 
to re-run for rc3. I can provide a response to the DPDK ML or here when 
finished RC3. What would you prefer?

So far no issues discovered (fingers crossed it stays the same for rc3).

Thanks
Ian
> 
> 
> Thanks.
> 
> --
> David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev