[ovs-dev] [PATCH v9 1/1] netdev-offload-dpdk: Replace action PORT_ID with REPRESENTED_PORT.

2024-01-15 Thread Ivan Malov via dev
Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.

Signed-off-by: Ivan Malov 
---
 v8: split from 
https://mail.openvswitch.org/pipermail/ovs-dev/2023-June/406049.html
 v9: address warnings by 0-robot regarding the summary line

 lib/netdev-offload-dpdk.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 992627fa2..623005b1c 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -735,14 +735,15 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
-} else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
-const struct rte_flow_action_port_id *port_id = actions->conf;
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
 
-ds_put_cstr(s, "port_id ");
-if (port_id) {
-ds_put_format(s, "original %d id %d ",
-  port_id->original, port_id->id);
+ds_put_cstr(s, "represented_port ");
+
+if (ethdev) {
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);
 }
+
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
 ds_put_cstr(s, "drop / ");
@@ -1776,19 +1777,22 @@ add_count_action(struct flow_actions *actions)
 }
 
 static int
-add_port_id_action(struct flow_actions *actions,
-   struct netdev *outdev)
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
 {
-struct rte_flow_action_port_id *port_id;
+struct rte_flow_action_ethdev *ethdev;
 int outdev_id;
 
 outdev_id = netdev_dpdk_get_port_id(outdev);
 if (outdev_id < 0) {
 return -1;
 }
-port_id = xzalloc(sizeof *port_id);
-port_id->id = outdev_id;
-add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
+
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+
 return 0;
 }
 
@@ -1808,7 +1812,7 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
-add_port_id_action(actions, outdev)) {
+add_represented_port_action(actions, outdev)) {
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;
-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] dpdk: Update to use v23.11.

2024-01-15 Thread Ivan Malov via dev

Hi Ilya,

On Tue, 16 Jan 2024, Ilya Maximets wrote:


On 1/15/24 17:13, Kevin Traynor wrote:

On 15/01/2024 14:28, David Marchand wrote:

This commit adds support for DPDK v23.11.
It updates the CI script and documentation and includes the following
changes coming from the dpdk-latest branch:

- sparse: Add some compiler intrinsics for DPDK build.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=371129=*

- ci: Cache DPDK installed libraries only.
- ci: Reduce optional libraries in DPDK.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=383367=*

- system-dpdk: Ignore net/ice error log about QinQ offloading.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=385259=*

There is a known issue with i40e VF devices where OVS main thread may
block when adding such devices as dpif-netdev dpdk ports.

Signed-off-by: David Marchand 
---


Acked-by: Kevin Traynor 



I removed the mentioning of the internal 'dpif-netdev' name from
the NEWS file and applied the change.

Michael, could you, please, update the Intel CI to use 23.11 ?

Ivan, please, re-post (with a version bump) your rte_flow patch,
so we can accept it.


Done. See 
https://patchwork.ozlabs.org/project/openvswitch/patch/20240116004848.7957-1-ivan.ma...@arknetworks.am/
 .

Thank you.



Thanks!
Best regards, Ilya Maximets.


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 1/1] netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

2024-01-15 Thread Ivan Malov via dev
Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.

Signed-off-by: Ivan Malov 
---
 v8: split from 
https://mail.openvswitch.org/pipermail/ovs-dev/2023-June/406049.html

 lib/netdev-offload-dpdk.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 992627fa2..623005b1c 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -735,14 +735,15 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
-} else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
-const struct rte_flow_action_port_id *port_id = actions->conf;
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
 
-ds_put_cstr(s, "port_id ");
-if (port_id) {
-ds_put_format(s, "original %d id %d ",
-  port_id->original, port_id->id);
+ds_put_cstr(s, "represented_port ");
+
+if (ethdev) {
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);
 }
+
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
 ds_put_cstr(s, "drop / ");
@@ -1776,19 +1777,22 @@ add_count_action(struct flow_actions *actions)
 }
 
 static int
-add_port_id_action(struct flow_actions *actions,
-   struct netdev *outdev)
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
 {
-struct rte_flow_action_port_id *port_id;
+struct rte_flow_action_ethdev *ethdev;
 int outdev_id;
 
 outdev_id = netdev_dpdk_get_port_id(outdev);
 if (outdev_id < 0) {
 return -1;
 }
-port_id = xzalloc(sizeof *port_id);
-port_id->id = outdev_id;
-add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
+
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+
 return 0;
 }
 
@@ -1808,7 +1812,7 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
-add_port_id_action(actions, outdev)) {
+add_represented_port_action(actions, outdev)) {
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;
-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpdk: Update to use v23.11.

2023-12-23 Thread Ivan Malov via dev

Ilya, David,

I appreciate your work.

With regard to the deferred change waiting to be revisited,
do you want me to take some action on it? Rework maybe?

Thank you.

On Sat, 16 Dec 2023, Ilya Maximets wrote:


On 12/13/23 14:06, David Marchand wrote:

This commit adds support for DPDK v23.11.
It updates the CI script and documentation and includes the following
changes coming from the dpdk-latest branch:

- sparse: Add some compiler intrinsics for DPDK build.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=371129=*

- ci: Cache DPDK installed libraries only.
- ci: Reduce optional libraries in DPDK.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=383367=*

- system-dpdk: Ignore net/ice error log about QinQ offloading.
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=385259=*

Signed-off-by: David Marchand 
---


Thanks, David!  I didn't test this much, but it looks
correct to me:

Acked-by: Ilya Maximets 


BTW, Just to not forget, there is one deferred change that
is waiting for 23.11 to be revisited:
 
https://patchwork.ozlabs.org/project/openvswitch/patch/20230630024630.6464-3-ivan.ma...@arknetworks.am/

Best regards, Ilya Maximets.


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 1/2] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-07-16 Thread Ivan Malov via dev

Hi Ilya,

A quick follow-up from me: I made a new version of this patch [1].
[1] 
https://patchwork.ozlabs.org/project/openvswitch/patch/20230716115720.6789-1-ivan.ma...@arknetworks.am/

Thank you.

On Fri, 14 Jul 2023, Ilya Maximets wrote:


On 6/30/23 04:46, Ivan Malov wrote:

This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 


Hi, Ivan.  Thanks for the patch!
I suppose, it can be considered as a bug fix.  Could you add
a Fixes tag to the commit message?


---
 lib/netdev-dpdk.c | 51 +++
 1 file changed, 51 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 63dac689e..d9d1b43f6 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -492,6 +492,9 @@ struct netdev_dpdk {

 /* Array of vhost rxq states, see vring_state_changed. */
 bool *vhost_rxq_enabled;
+
+/* Ensures that Rx metadata delivery is configured only once. */
+bool rx_metadata_delivery_configured;
 );

 PADDED_MEMBERS(CACHE_LINE_SIZE,
@@ -1187,6 +1190,42 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }

+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+if (dev->rx_metadata_delivery_configured) {
+return;
+}
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */


This doesn't seem to be correct.  This flag is only needed for
tunnel offload.  Should also be under experimental api ifdef?


+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);


For all the logs above, use the following format instead:

   VLOG_*("%s: ...", netdev_get_name(>up));

For the last two, you may use:

   VLOG(ret == -ENOTSUP ? VLL_DBG : VLL_WARN,
"%s: Cannot negotiate Rx metadata: %s",
netdev_get_name(>up), rte_strerror(-ret));


+}
+
+dev->rx_metadata_delivery_configured = true;
+}
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1200,6 +1239,16 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;

+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);


It's a bit strange that we request metadata regardless of hw-offload
being enabled.  Should this be under netdev_is_flow_api_enabled() ?


+
 rte_eth_dev_info_get(dev->port_id, );

 if (strstr(info.driver_name, "vf") != NULL) {
@@ -1382,6 +1431,8 @@ common_construct(struct netdev *netdev, dpdk_port_t 
port_no,
 /* Initilize the hardware offload flags to 0 */
 dev->hw_ol_features = 0;

+dev->rx_metadata_delivery_configured = false;
+
 dev->flags = NETDEV_UP | NETDEV_PROMISC;

 ovs_list_push_back(_list, >list_node);




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 1/1] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-07-16 Thread Ivan Malov via dev
This may be required by some PMDs in offload scenarios.

Fixes: e8a2b5bf92bb ("netdev-dpdk: implement flow offload with rte flow")
Signed-off-by: Ivan Malov 
---
 v2: add missing experimental api ifdef

 lib/netdev-dpdk.c | 56 +++
 1 file changed, 56 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index aa87ee546..ea4cc6977 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -512,6 +512,9 @@ struct netdev_dpdk {
 
 /* Array of vhost rxq states, see vring_state_changed. */
 bool *vhost_rxq_enabled;
+
+/* Ensures that Rx metadata delivery is configured only once. */
+bool rx_metadata_delivery_configured;
 );
 
 PADDED_MEMBERS(CACHE_LINE_SIZE,
@@ -1220,6 +1223,45 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+if (dev->rx_metadata_delivery_configured) {
+return;
+}
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+
+#ifdef ALLOW_EXPERIMENTAL_API
+/* For the tunnel offload  */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+#endif /* ALLOW_EXPERIMENTAL_API */
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("%s: The NIC will not provide per-packet USER_MARK",
+ netdev_get_name(>up));
+}
+#ifdef ALLOW_EXPERIMENTAL_API
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("%s: The NIC will not provide per-packet TUNNEL_ID",
+ netdev_get_name(>up));
+}
+#endif /* ALLOW_EXPERIMENTAL_API */
+} else {
+VLOG(ret == -ENOTSUP ? VLL_DBG : VLL_WARN,
+ "%s: Cannot negotiate Rx metadata: %s",
+ netdev_get_name(>up), rte_strerror(-ret));
+}
+
+dev->rx_metadata_delivery_configured = true;
+}
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1233,6 +1275,18 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
 
+if (netdev_is_flow_api_enabled()) {
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+}
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
@@ -1421,6 +1475,8 @@ common_construct(struct netdev *netdev, dpdk_port_t 
port_no,
 /* Initilize the hardware offload flags to 0 */
 dev->hw_ol_features = 0;
 
+dev->rx_metadata_delivery_configured = false;
+
 dev->flags = NETDEV_UP | NETDEV_PROMISC;
 
 ovs_list_push_back(_list, >list_node);
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/1] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-07-16 Thread Ivan Malov via dev
This may be required by some PMDs in offload scenarios.

Fixes: e8a2b5bf92bb ("netdev-dpdk: implement flow offload with rte flow")
Signed-off-by: Ivan Malov 
---
 lib/netdev-dpdk.c | 54 +++
 1 file changed, 54 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index aa87ee546..3d39a3afd 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -512,6 +512,9 @@ struct netdev_dpdk {
 
 /* Array of vhost rxq states, see vring_state_changed. */
 bool *vhost_rxq_enabled;
+
+/* Ensures that Rx metadata delivery is configured only once. */
+bool rx_metadata_delivery_configured;
 );
 
 PADDED_MEMBERS(CACHE_LINE_SIZE,
@@ -1220,6 +1223,43 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+if (dev->rx_metadata_delivery_configured) {
+return;
+}
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+
+#ifdef ALLOW_EXPERIMENTAL_API
+/* For the tunnel offload  */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+#endif /* ALLOW_EXPERIMENTAL_API */
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("%s: The NIC will not provide per-packet USER_MARK",
+ netdev_get_name(>up));
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("%s: The NIC will not provide per-packet TUNNEL_ID",
+ netdev_get_name(>up));
+}
+} else {
+VLOG(ret == -ENOTSUP ? VLL_DBG : VLL_WARN,
+ "%s: Cannot negotiate Rx metadata: %s",
+ netdev_get_name(>up), rte_strerror(-ret));
+}
+
+dev->rx_metadata_delivery_configured = true;
+}
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1233,6 +1273,18 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
 
+if (netdev_is_flow_api_enabled()) {
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+}
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
@@ -1421,6 +1473,8 @@ common_construct(struct netdev *netdev, dpdk_port_t 
port_no,
 /* Initilize the hardware offload flags to 0 */
 dev->hw_ol_features = 0;
 
+dev->rx_metadata_delivery_configured = false;
+
 dev->flags = NETDEV_UP | NETDEV_PROMISC;
 
 ovs_list_push_back(_list, >list_node);
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 1/2] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-07-16 Thread Ivan Malov via dev

Hi Ilya,

Thanks for reviewing this. PSB.

On Fri, 14 Jul 2023, Ilya Maximets wrote:


On 6/30/23 04:46, Ivan Malov wrote:

This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 


Hi, Ivan.  Thanks for the patch!
I suppose, it can be considered as a bug fix.  Could you add
a Fixes tag to the commit message?


---
 lib/netdev-dpdk.c | 51 +++
 1 file changed, 51 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 63dac689e..d9d1b43f6 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -492,6 +492,9 @@ struct netdev_dpdk {

 /* Array of vhost rxq states, see vring_state_changed. */
 bool *vhost_rxq_enabled;
+
+/* Ensures that Rx metadata delivery is configured only once. */
+bool rx_metadata_delivery_configured;
 );

 PADDED_MEMBERS(CACHE_LINE_SIZE,
@@ -1187,6 +1190,42 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }

+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+if (dev->rx_metadata_delivery_configured) {
+return;
+}
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */


This doesn't seem to be correct.  This flag is only needed for
tunnel offload.  Should also be under experimental api ifdef?


My guess, it's comment which is incorrect. I'm going to fix it, yes.
But, regarding the experimental api ifdef, according to review [1]
from Eli, it might be unneeded. Furthermore, the dpdk api being
invoked is no longer experimental. What do you think?

[1] 
https://patchwork.ozlabs.org/project/openvswitch/patch/20220720121823.2497727-4-ivan.ma...@oktetlabs.ru/#2935847




+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);


For all the logs above, use the following format instead:

   VLOG_*("%s: ...", netdev_get_name(>up));

For the last two, you may use:

   VLOG(ret == -ENOTSUP ? VLL_DBG : VLL_WARN,
"%s: Cannot negotiate Rx metadata: %s",
netdev_get_name(>up), rte_strerror(-ret));


+}
+
+dev->rx_metadata_delivery_configured = true;
+}
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1200,6 +1239,16 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;

+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);


It's a bit strange that we request metadata regardless of hw-offload
being enabled.  Should this be under netdev_is_flow_api_enabled() ?


+
 rte_eth_dev_info_get(dev->port_id, );

 if (strstr(info.driver_name, "vf") != NULL) {
@@ -1382,6 +1431,8 @@ common_construct(struct netdev *netdev, dpdk_port_t 
port_no,
 /* Initilize the hardware offload flags to 0 */
 dev->hw_ol_features = 0;

+dev->rx_metadata_delivery_configured = false;
+
 dev->flags = NETDEV_UP | NETDEV_PROMISC;

 ovs_list_push_back(_list, >list_node);




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 1/2] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-06-29 Thread Ivan Malov via dev

Hi Ilya,

Thanks again for reviewing this. I attempted to fix review notes in
https://patchwork.ozlabs.org/project/openvswitch/list/?series=361784 .

Thank you.

On Thu, 29 Jun 2023, Ilya Maximets wrote:


On 6/29/23 15:58, Simon Horman wrote:

On Tue, Jun 06, 2023 at 03:35:32PM +0400, Ivan Malov via dev wrote:

This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
---
 lib/netdev-dpdk.c | 50 +++
 1 file changed, 50 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 8cb1a7703..98765fe6e 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -542,6 +542,9 @@ struct netdev_dpdk {
 int rte_xstats_names_size;
 int rte_xstats_ids_size;
 uint64_t *rte_xstats_ids;
+
+/* Ensures that Rx metadata delivery is configured only once */


This also needs a period at the end of a comment.


+bool rx_metadata_delivery_configured;


And this doesn't really belong in the 'stats' section of the structure.
Might be better fit in one of the previous sections.  The one with 'up',
for example.


 );
 };

@@ -1140,6 +1143,41 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }

+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+if (dev->rx_metadata_delivery_configured)
+return;


Hi Ivan,

IIRC, the OVS coding style calls for the use of { } for all conditional blocks.
So please update this to:

if (dev->rx_metadata_delivery_configured) {
return;
}

Likewise, a similar update is needed for patch 2/2.

Thanks!




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 2/2] netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

2023-06-29 Thread Ivan Malov via dev

Hi Ilya,

Thanks for reviewing this. Please see below.

On Thu, 29 Jun 2023, Ilya Maximets wrote:


On 6/6/23 13:35, Ivan Malov wrote:

Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.


AFAICT, not all drivers moved to a REPRESENTED_PORT action.
I don't see support in NFP driver, for example.


This is the full story:

1) On Oct 13, 2021, commit [1] introduced
   a deprecation note for action PORT_ID.

2) On Aug 13, 2022, series [2] was sent to provide replacement
   for action PORT_ID in drivers that still had been using it.
   The series was accepted on Aug 22. NFP was not among the
   drivers to address because it had not been using PORT_ID.

3) On Oct 21, 2022, commit [3] added support for action PORT_ID to NFP.

Based on this input, it appears that deprecation notice for
action PORT_ID might have been ignored by NFP developers
for some reason. They added support for this action
after series [2] had been applied, so developers
behind [2] could not have a chance to offer help.

[1] 5da44faa809a ("ethdev: deprecate hard-to-use or ambiguous items and 
actions")
[2] 
https://patchwork.dpdk.org/project/dpdk/list/?series=24302=both=*
[3] 4d946034bf9c ("net/nfp: support basic flow actions")



Are we OK with dropping hardware offload support for NFP ?


I'm not saying above explanation justifies dropping such support,
but perhaps it pays to ask NFP team for their opinion on this.

Thank you.


Or are there plans to support this new action in that driver?

Simon, maybe you have some visibility on that?

Best regards, Ilya Maximets.



Signed-off-by: Ivan Malov 
---
 lib/netdev-offload-dpdk.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 14bc87771..ef0a266c5 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -735,14 +735,14 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
-} else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
-const struct rte_flow_action_port_id *port_id = actions->conf;
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
+
+ds_put_cstr(s, "represented_port ");
+
+if (ethdev)
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);

-ds_put_cstr(s, "port_id ");
-if (port_id) {
-ds_put_format(s, "original %d id %d ",
-  port_id->original, port_id->id);
-}
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
 ds_put_cstr(s, "drop / ");
@@ -1776,19 +1776,22 @@ add_count_action(struct flow_actions *actions)
 }

 static int
-add_port_id_action(struct flow_actions *actions,
-   struct netdev *outdev)
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
 {
-struct rte_flow_action_port_id *port_id;
+struct rte_flow_action_ethdev *ethdev;
 int outdev_id;

 outdev_id = netdev_dpdk_get_port_id(outdev);
 if (outdev_id < 0) {
 return -1;
 }
-port_id = xzalloc(sizeof *port_id);
-port_id->id = outdev_id;
-add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
+
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+
 return 0;
 }

@@ -1808,7 +1811,7 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
-add_port_id_action(actions, outdev)) {
+add_represented_port_action(actions, outdev)) {
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v7 1/2] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-06-29 Thread Ivan Malov via dev
This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
---
 lib/netdev-dpdk.c | 51 +++
 1 file changed, 51 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 63dac689e..d9d1b43f6 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -492,6 +492,9 @@ struct netdev_dpdk {
 
 /* Array of vhost rxq states, see vring_state_changed. */
 bool *vhost_rxq_enabled;
+
+/* Ensures that Rx metadata delivery is configured only once. */
+bool rx_metadata_delivery_configured;
 );
 
 PADDED_MEMBERS(CACHE_LINE_SIZE,
@@ -1187,6 +1190,42 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+if (dev->rx_metadata_delivery_configured) {
+return;
+}
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);
+}
+
+dev->rx_metadata_delivery_configured = true;
+}
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1200,6 +1239,16 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
 
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
@@ -1382,6 +1431,8 @@ common_construct(struct netdev *netdev, dpdk_port_t 
port_no,
 /* Initilize the hardware offload flags to 0 */
 dev->hw_ol_features = 0;
 
+dev->rx_metadata_delivery_configured = false;
+
 dev->flags = NETDEV_UP | NETDEV_PROMISC;
 
 ovs_list_push_back(_list, >list_node);
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v7 2/2] netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

2023-06-29 Thread Ivan Malov via dev
Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.

Signed-off-by: Ivan Malov 
---
 lib/netdev-offload-dpdk.c | 30 +-
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 14bc87771..40fdf03c1 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -735,14 +735,15 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
-} else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
-const struct rte_flow_action_port_id *port_id = actions->conf;
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
 
-ds_put_cstr(s, "port_id ");
-if (port_id) {
-ds_put_format(s, "original %d id %d ",
-  port_id->original, port_id->id);
+ds_put_cstr(s, "represented_port ");
+
+if (ethdev) {
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);
 }
+
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
 ds_put_cstr(s, "drop / ");
@@ -1776,19 +1777,22 @@ add_count_action(struct flow_actions *actions)
 }
 
 static int
-add_port_id_action(struct flow_actions *actions,
-   struct netdev *outdev)
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
 {
-struct rte_flow_action_port_id *port_id;
+struct rte_flow_action_ethdev *ethdev;
 int outdev_id;
 
 outdev_id = netdev_dpdk_get_port_id(outdev);
 if (outdev_id < 0) {
 return -1;
 }
-port_id = xzalloc(sizeof *port_id);
-port_id->id = outdev_id;
-add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
+
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+
 return 0;
 }
 
@@ -1808,7 +1812,7 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
-add_port_id_action(actions, outdev)) {
+add_represented_port_action(actions, outdev)) {
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v7 0/2] DPDK: align flow offloads with 22.11

2023-06-29 Thread Ivan Malov via dev
Address two problems using two corresponding features in DPDK,
which have been around for a year and are now stable:

1) The need to make sure that metadata generated by flow
   rule execution be delivered from NIC to application;

2) Replacing PORT_ID action with REPRESENTED_PORT;
---
 v2: amendments to care about proxy detach and port ID logging
 v3: a minor adjustment in the cover letter subject line
 v4: edits to apply review notes and align with 22.11
 v5: drop the transfer proxy patch; fix CI in [1/2]
 v6: another attempt to fix re-configure in [1/2]
 v7: address review notes from Simon and Ilya

Ivan Malov (2):
  netdev-dpdk: negotiate delivery of per-packet Rx metadata
  netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

 lib/netdev-dpdk.c | 51 +++
 lib/netdev-offload-dpdk.c | 30 +--
 2 files changed, 68 insertions(+), 13 deletions(-)

-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 1/2] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-06-29 Thread Ivan Malov via dev

On Thu, 29 Jun 2023, Simon Horman wrote:


On Tue, Jun 06, 2023 at 03:35:32PM +0400, Ivan Malov via dev wrote:

This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
---
 lib/netdev-dpdk.c | 50 +++
 1 file changed, 50 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 8cb1a7703..98765fe6e 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -542,6 +542,9 @@ struct netdev_dpdk {
 int rte_xstats_names_size;
 int rte_xstats_ids_size;
 uint64_t *rte_xstats_ids;
+
+/* Ensures that Rx metadata delivery is configured only once */
+bool rx_metadata_delivery_configured;
 );
 };

@@ -1140,6 +1143,41 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }

+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+if (dev->rx_metadata_delivery_configured)
+return;


Hi Ivan,

IIRC, the OVS coding style calls for the use of { } for all conditional blocks.
So please update this to:

   if (dev->rx_metadata_delivery_configured) {
   return;
   }

Likewise, a similar update is needed for patch 2/2.

Thanks!


Hi Simon,

Very valuable information indeed. Now I understand
the true reason behind the check-bot warning.

Thank you.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v6 2/2] netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

2023-06-06 Thread Ivan Malov via dev
Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.

Signed-off-by: Ivan Malov 
---
 lib/netdev-offload-dpdk.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 14bc87771..ef0a266c5 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -735,14 +735,14 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
-} else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
-const struct rte_flow_action_port_id *port_id = actions->conf;
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
+
+ds_put_cstr(s, "represented_port ");
+
+if (ethdev)
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);
 
-ds_put_cstr(s, "port_id ");
-if (port_id) {
-ds_put_format(s, "original %d id %d ",
-  port_id->original, port_id->id);
-}
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
 ds_put_cstr(s, "drop / ");
@@ -1776,19 +1776,22 @@ add_count_action(struct flow_actions *actions)
 }
 
 static int
-add_port_id_action(struct flow_actions *actions,
-   struct netdev *outdev)
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
 {
-struct rte_flow_action_port_id *port_id;
+struct rte_flow_action_ethdev *ethdev;
 int outdev_id;
 
 outdev_id = netdev_dpdk_get_port_id(outdev);
 if (outdev_id < 0) {
 return -1;
 }
-port_id = xzalloc(sizeof *port_id);
-port_id->id = outdev_id;
-add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
+
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+
 return 0;
 }
 
@@ -1808,7 +1811,7 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
-add_port_id_action(actions, outdev)) {
+add_represented_port_action(actions, outdev)) {
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v6 0/2] DPDK: align flow offloads with 22.11

2023-06-06 Thread Ivan Malov via dev
Address two problems using two corresponding features in DPDK,
which have been around for a year and are now stable:

1) The need to make sure that metadata generated by flow
   rule execution be delivered from NIC to application;

2) Replacing PORT_ID action with REPRESENTED_PORT;
---
 v2: amendments to care about proxy detach and port ID logging
 v3: a minor adjustment in the cover letter subject line
 v4: edits to apply review notes and align with 22.11
 v5: drop the transfer proxy patch; fix CI in [1/2]
 v6: another attempt to fix re-configure in [1/2]

Ivan Malov (2):
  netdev-dpdk: negotiate delivery of per-packet Rx metadata
  netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

 lib/netdev-dpdk.c | 50 +++
 lib/netdev-offload-dpdk.c | 31 +---
 2 files changed, 67 insertions(+), 14 deletions(-)

-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v6 1/2] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-06-06 Thread Ivan Malov via dev
This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
---
 lib/netdev-dpdk.c | 50 +++
 1 file changed, 50 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 8cb1a7703..98765fe6e 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -542,6 +542,9 @@ struct netdev_dpdk {
 int rte_xstats_names_size;
 int rte_xstats_ids_size;
 uint64_t *rte_xstats_ids;
+
+/* Ensures that Rx metadata delivery is configured only once */
+bool rx_metadata_delivery_configured;
 );
 };
 
@@ -1140,6 +1143,41 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+if (dev->rx_metadata_delivery_configured)
+return;
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);
+}
+
+dev->rx_metadata_delivery_configured = true;
+}
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1154,6 +1192,16 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
 
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
@@ -1320,6 +1368,8 @@ common_construct(struct netdev *netdev, dpdk_port_t 
port_no,
 /* Initilize the hardware offload flags to 0 */
 dev->hw_ol_features = 0;
 
+dev->rx_metadata_delivery_configured = false;
+
 dev->flags = NETDEV_UP | NETDEV_PROMISC;
 
 ovs_list_push_back(_list, >list_node);
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5 2/2] netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

2023-06-06 Thread Ivan Malov via dev
Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.

Signed-off-by: Ivan Malov 
---
 lib/netdev-offload-dpdk.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 14bc87771..ef0a266c5 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -735,14 +735,14 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
-} else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
-const struct rte_flow_action_port_id *port_id = actions->conf;
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
+
+ds_put_cstr(s, "represented_port ");
+
+if (ethdev)
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);
 
-ds_put_cstr(s, "port_id ");
-if (port_id) {
-ds_put_format(s, "original %d id %d ",
-  port_id->original, port_id->id);
-}
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
 ds_put_cstr(s, "drop / ");
@@ -1776,19 +1776,22 @@ add_count_action(struct flow_actions *actions)
 }
 
 static int
-add_port_id_action(struct flow_actions *actions,
-   struct netdev *outdev)
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
 {
-struct rte_flow_action_port_id *port_id;
+struct rte_flow_action_ethdev *ethdev;
 int outdev_id;
 
 outdev_id = netdev_dpdk_get_port_id(outdev);
 if (outdev_id < 0) {
 return -1;
 }
-port_id = xzalloc(sizeof *port_id);
-port_id->id = outdev_id;
-add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
+
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+
 return 0;
 }
 
@@ -1808,7 +1811,7 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
-add_port_id_action(actions, outdev)) {
+add_represented_port_action(actions, outdev)) {
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5 0/2] DPDK: align flow offloads with 22.11

2023-06-06 Thread Ivan Malov via dev
Address two problems using two corresponding features in DPDK,
which have been around for a year and are now stable:

1) The need to make sure that metadata generated by flow
   rule execution be delivered from NIC to application;

2) Replacing PORT_ID action with REPRESENTED_PORT.
---
 v2: amendments to care about proxy detach and port ID logging
 v3: a minor adjustment in the cover letter subject line
 v4: edits to apply review notes and align with 22.11
 v5: drop the transfer proxy patch; fix CI in [1/2]

Ivan Malov (2):
  netdev-dpdk: negotiate delivery of per-packet Rx metadata
  netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

 lib/netdev-dpdk.c | 43 +++
 lib/netdev-offload-dpdk.c | 31 +++-
 2 files changed, 60 insertions(+), 14 deletions(-)

-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5 1/2] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-06-06 Thread Ivan Malov via dev
This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
---
 lib/netdev-dpdk.c | 43 +++
 1 file changed, 43 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 8cb1a7703..774ca8d51 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1140,6 +1140,39 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else if (ret == -EBUSY) {
+/* Already configured */
+return;
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);
+}
+}
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1154,6 +1187,16 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
 
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
-- 
2.17.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v4 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2023-06-06 Thread Ivan Malov via dev

Hi Eli,

Thanks for responding. PSB.

On Tue, 6 Jun 2023, Eli Britstein wrote:





-Original Message-
From: Ivan Malov 
Sent: Sunday, 4 June 2023 15:58
To: Eli Britstein 
Cc: ovs-dev@openvswitch.org; Ilya Maximets ; Ori
Kam ; David Marchand 
Subject: RE: [PATCH v4 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Hi Eli,

Thanks for reviewing this. Please see below.

On Tue, 21 Feb 2023, Eli Britstein wrote:





-Original Message-
From: Ivan Malov 
Sent: Tuesday, 21 February 2023 2:41
To: ovs-dev@openvswitch.org
Cc: Ilya Maximets ; Eli Britstein
; Ori Kam ; David Marchand

Subject: [PATCH v4 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified explicitly,
via the corresponding pattern item.

Signed-off-by: Ivan Malov 
---
lib/netdev-dpdk.c | 88 +++
lib/netdev-dpdk.h |  4 +-
lib/netdev-offload-dpdk.c | 55 +++-
3 files changed, 117 insertions(+), 30 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
2cebc3cca..3a9c9d9a0
100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -434,6 +434,7 @@ enum dpdk_hw_ol_features {

struct netdev_dpdk {
PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE,

cacheline0,

+dpdk_port_t flow_transfer_proxy_port_id;

This extra field here makes it overflow one cache line.

dpdk_port_t port_id;

/* If true, device was attached by rte_eth_dev_attach(). */
@@ -1183,6
+1184,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
uint32_t rx_chksm_offload_capa =

RTE_ETH_RX_OFFLOAD_UDP_CKSUM |

 RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
 RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
+int ret;

/*
 * Full tunnel offload requires that tunnel ID metadata be @@
-1194,6
+1196,24 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 */
dpdk_eth_dev_init_rx_metadata(dev);

+/*
+ * Managing "transfer" flows requires that the user communicate them
+ * via a port which has the privilege to control the embedded switch.
+ * For some vendors, all ports in a given switching domain have
+ * this privilege. For other vendors, it's only one port.
+ *
+ * Get the proxy port ID and remember it for later use.
+ */
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id, 
NULL);
+if (ret != 0) {
+/*
+ * The PMD does not indicate the proxy port.
+ * Assume the proxy is unneeded.
+ */
+dev->flow_transfer_proxy_port_id = dev->port_id;
+}
+
rte_eth_dev_info_get(dev->port_id, );

if (strstr(info.driver_name, "vf") != NULL) { @@ -3981,8 +4001,10
@@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc

OVS_UNUSED,

   const char *argv[], void *aux OVS_UNUSED)  {
struct ds used_interfaces = DS_EMPTY_INITIALIZER;
+struct netdev_dpdk *dev_self = NULL;
struct rte_eth_dev_info dev_info;
dpdk_port_t sibling_port_id;
+struct netdev_dpdk *dev;
dpdk_port_t port_id;
bool used = false;
char *response;
@@ -4000,8 +4022,6 @@ netdev_dpdk_detach(struct unixctl_conn *conn,
int argc OVS_UNUSED,
  argv[1]);

RTE_ETH_FOREACH_DEV_SIBLING (sibling_port_id, port_id) {
-struct netdev_dpdk *dev;
-
LIST_FOR_EACH (dev, list_node, _list) {
if (dev->port_id != sibling_port_id) {
continue;
@@ -4021,6 +4041,25 @@ netdev_dpdk_detach(struct unixctl_conn

*conn,

int argc OVS_UNUSED,
}
ds_destroy(_interfaces);

+/*
+ * The device being detached may happen to be a flow proxy port
+ * for another device (still attached). If so, do not allow to
+ * detach. Devices dependent on this one must be detached first.
+ */
+LIST_FOR_EACH (dev, list_node, _list) {
+if (dev->port_id == port_id) {
+dev_self = dev;
+} else if (dev->flow_transfer_proxy_port_id == port_id) {
+response = xasprintf("Device '%s' can not be detached (flow

proxy)",

+ argv[1]);

This is not acceptable.
When removing a port, we clean the offloads using

netdev_offload_dpdk_flow_flush().

It should be enhanced to check if the proxy port is detached, remove the

offloads of all the ports that used it.

There is a related patch proposed in [1].
[1]


https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpatch



work.ozlabs.org%2Fproject%2Fopenvswitch%2Fpatch%2F20220905144603.358
51

05-1-

elibr%40nvidia.com%2F=05%7C01%7Celibr%40nvidia.com%7Ca5083e7



6e913441830f108db64fb5c71%7C43083d157273

Re: [ovs-dev] [PATCH v4 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2023-06-04 Thread Ivan Malov via dev

Hi Eli,

Thanks for reviewing this. Please see below.

On Tue, 21 Feb 2023, Eli Britstein wrote:





-Original Message-
From: Ivan Malov 
Sent: Tuesday, 21 February 2023 2:41
To: ovs-dev@openvswitch.org
Cc: Ilya Maximets ; Eli Britstein ; Ori
Kam ; David Marchand 
Subject: [PATCH v4 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified explicitly, via the
corresponding pattern item.

Signed-off-by: Ivan Malov 
---
lib/netdev-dpdk.c | 88 +++
lib/netdev-dpdk.h |  4 +-
lib/netdev-offload-dpdk.c | 55 +++-
3 files changed, 117 insertions(+), 30 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 2cebc3cca..3a9c9d9a0
100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -434,6 +434,7 @@ enum dpdk_hw_ol_features {

struct netdev_dpdk {
PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline0,
+dpdk_port_t flow_transfer_proxy_port_id;

This extra field here makes it overflow one cache line.

dpdk_port_t port_id;

/* If true, device was attached by rte_eth_dev_attach(). */ @@ -1183,6
+1184,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
uint32_t rx_chksm_offload_capa = RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
 RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
 RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
+int ret;

/*
 * Full tunnel offload requires that tunnel ID metadata be @@ -1194,6
+1196,24 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 */
dpdk_eth_dev_init_rx_metadata(dev);

+/*
+ * Managing "transfer" flows requires that the user communicate them
+ * via a port which has the privilege to control the embedded switch.
+ * For some vendors, all ports in a given switching domain have
+ * this privilege. For other vendors, it's only one port.
+ *
+ * Get the proxy port ID and remember it for later use.
+ */
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id, 
NULL);
+if (ret != 0) {
+/*
+ * The PMD does not indicate the proxy port.
+ * Assume the proxy is unneeded.
+ */
+dev->flow_transfer_proxy_port_id = dev->port_id;
+}
+
rte_eth_dev_info_get(dev->port_id, );

if (strstr(info.driver_name, "vf") != NULL) { @@ -3981,8 +4001,10 @@
netdev_dpdk_detach(struct unixctl_conn *conn, int argc OVS_UNUSED,
   const char *argv[], void *aux OVS_UNUSED)  {
struct ds used_interfaces = DS_EMPTY_INITIALIZER;
+struct netdev_dpdk *dev_self = NULL;
struct rte_eth_dev_info dev_info;
dpdk_port_t sibling_port_id;
+struct netdev_dpdk *dev;
dpdk_port_t port_id;
bool used = false;
char *response;
@@ -4000,8 +4022,6 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int
argc OVS_UNUSED,
  argv[1]);

RTE_ETH_FOREACH_DEV_SIBLING (sibling_port_id, port_id) {
-struct netdev_dpdk *dev;
-
LIST_FOR_EACH (dev, list_node, _list) {
if (dev->port_id != sibling_port_id) {
continue;
@@ -4021,6 +4041,25 @@ netdev_dpdk_detach(struct unixctl_conn *conn,
int argc OVS_UNUSED,
}
ds_destroy(_interfaces);

+/*
+ * The device being detached may happen to be a flow proxy port
+ * for another device (still attached). If so, do not allow to
+ * detach. Devices dependent on this one must be detached first.
+ */
+LIST_FOR_EACH (dev, list_node, _list) {
+if (dev->port_id == port_id) {
+dev_self = dev;
+} else if (dev->flow_transfer_proxy_port_id == port_id) {
+response = xasprintf("Device '%s' can not be detached (flow 
proxy)",
+ argv[1]);

This is not acceptable.
When removing a port, we clean the offloads using 
netdev_offload_dpdk_flow_flush().
It should be enhanced to check if the proxy port is detached, remove the 
offloads of all the ports that used it.
There is a related patch proposed in [1].
[1] 
http://patchwork.ozlabs.org/project/openvswitch/patch/20220905144603.3585105-1-el...@nvidia.com/


I hear you. But do you want me to first wait for patch [1] to be applied
and then send the proxy enhancement to be applied on top of it?
Or do you believe I can do something in parallel here?

As I can see, patch [1] has not been applied yet.
Is there any problem with it?

If I need to wait, then I believe I can revoke this [3/3]
patch from the series for the time being and send v5.
What's your opinion?




+goto error;
+}
+}
+
+/* Indicate that the device being detached no longer needs a flow proxy.
*/
+if (dev_self != NULL)
+dev_self-&g

Re: [ovs-dev] [PATCH dpdk-latest v4 1/5] netdev-dpdk: use flow transfer proxy

2023-05-08 Thread Ivan Malov via dev

Hi Nole, Simon, Ilya,

On Mon, 8 May 2023, Nole Zhang wrote:





-Original Message-
From: Ilya Maximets 
Sent: Saturday, May 6, 2023 3:07 AM
To: Ivan Malov ; Simon Horman

Cc: i.maxim...@ovn.org; d...@openvswitch.org; Eli Britstein
; Kevin Liu ; Chaoyong He
; oss-drivers ; Nole
Zhang 
Subject: Re: [ovs-dev] [PATCH dpdk-latest v4 1/5] netdev-dpdk: use flow
transfer proxy

On 5/3/23 16:00, Ivan Malov wrote:

Hello Simon,

This patch has me intrigued. By the looks of it, it bears uncanny
resemblance to patch [1] by another author. Is your patch based on
patch [1]? If yes, could you please comment on the following:

1) Your patch does not seem to reference the original author.
   Why is it so? Is there a problem, colleagues?


When re-using someone else's work, please, retain the original authorship.  I
see there are changes made to the patch, but it's the same as the original in
many parts.  Since you made changes, you should add yourself as co-authors.
If you feel that changes made are more significant than the original ptch,
then you may swap the authorship, but you should add the original author to
the list of co-authors anyway.


Thanks for the clarification Ilya.



@Ivan Malov sorry again.
It is my fault, I'm very sorry for these.
Next version, I will add signed-by.


Glad we see eye to eye about this and can understand
each other after all.





2) Your patch does not seem to address review feedback [2].
   There's a problem that has been indicated by Eli,
   regarding flow flush. Doesn't it still stand?


In this version the rte_flow_flush() call is added instead of failing the 
detach.
However,

a. the flush operation should have already been executed from
   the higher layer from do_del_port() in dpif-netdev.  So,
   it should not be needed.

b. The problem doesn't apper to be addressed, because related
   ports will not get their flows flushed.


Ok, I think you are right, do you think if I use the  rte_flow_flush() for 
related ports is OK?


Please find Eli's review of the original patch where he explains
how it should be done:

https://mail.openvswitch.org/pipermail/ovs-dev/2023-February/402172.html

Thank you.





Best regards, Ilya Maximets.



Interested to hear your input on this. Thank you.

[1]
https://mail.openvswitch.org/pipermail/ovs-dev/2023-February/402152.ht
ml

[2]
https://mail.openvswitch.org/pipermail/ovs-dev/2023-February/402172.ht
ml



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH dpdk-latest v4 1/5] netdev-dpdk: use flow transfer proxy

2023-05-03 Thread Ivan Malov via dev

Hello Simon,

This patch has me intrigued. By the looks of it, it bears uncanny
resemblance to patch [1] by another author. Is your patch based
on patch [1]? If yes, could you please comment on the following:

1) Your patch does not seem to reference the original author.
   Why is it so? Is there a problem, colleagues?

2) Your patch does not seem to address review feedback [2].
   There's a problem that has been indicated by Eli,
   regarding flow flush. Doesn't it still stand?

Interested to hear your input on this. Thank you.

[1] 
https://mail.openvswitch.org/pipermail/ovs-dev/2023-February/402152.html


[2] 
https://mail.openvswitch.org/pipermail/ovs-dev/2023-February/402172.html


On Wed, 26 Apr 2023, Simon Horman wrote:


From: Peng Zhang 

Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified
explicitly, via the corresponding pattern item.

Signed-off-by: Peng Zhang 
Signed-off-by: Jin Liu 
Co-authored-by: Jin Liu 
Signed-off-by: Simon Horman 
---
lib/netdev-dpdk.c | 97 +--
lib/netdev-dpdk.h |  3 ++
lib/netdev-offload-dpdk.c | 16 ---
3 files changed, 96 insertions(+), 20 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index fff57f78279a..278d6afc08af 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -437,6 +437,7 @@ enum dpdk_hw_ol_features {
struct netdev_dpdk {
PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline0,
dpdk_port_t port_id;
+dpdk_port_t flow_transfer_proxy_port_id;

/* If true, device was attached by rte_eth_dev_attach(). */
bool attached;
@@ -1155,6 +1156,24 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
uint32_t rx_chksm_offload_capa = RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
 RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
 RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
+int ret;
+
+/* Managing "transfer" flows requires that the user communicate them
+ * via a port which has the privilege to control the embedded switch.
+ * For some vendors, all ports in a given switching domain have
+ * this privilege. For other vendors, it's only one port.
+ *
+ * Get the proxy port ID and remember it for later use.
+ */
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id,
+   NULL);
+if (ret != 0) {
+/* The PMD does not indicate the proxy port.
+ * Assume the proxy is unneeded.
+ */
+dev->flow_transfer_proxy_port_id = dev->port_id;
+}

rte_eth_dev_info_get(dev->port_id, );

@@ -3767,8 +3786,10 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
   const char *argv[], void *aux OVS_UNUSED)
{
struct ds used_interfaces = DS_EMPTY_INITIALIZER;
+struct rte_flow_error ferror;
struct rte_eth_dev_info dev_info;
dpdk_port_t sibling_port_id;
+struct netdev_dpdk *dev;
dpdk_port_t port_id;
bool used = false;
char *response;
@@ -3786,8 +3807,6 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
  argv[1]);

RTE_ETH_FOREACH_DEV_SIBLING (sibling_port_id, port_id) {
-struct netdev_dpdk *dev;
-
LIST_FOR_EACH (dev, list_node, _list) {
if (dev->port_id != sibling_port_id) {
continue;
@@ -3807,6 +3826,22 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
}
ds_destroy(_interfaces);

+/* The device being detached may happen to be a flow proxy port
+ * for another device (still attached). Update the flow proxy port id.
+ */
+LIST_FOR_EACH (dev, list_node, _list) {
+if (dev->port_id != port_id &&
+dev->flow_transfer_proxy_port_id == port_id) {
+if (rte_flow_flush(dev->flow_transfer_proxy_port_id, )) {
+response = xasprintf("Flush port failed, Device '%s' can not"
+ "be detached (flow proxy)", argv[1]);
+goto error;
+}
+break;
+}
+}
+
+
rte_eth_dev_info_get(port_id, _info);
rte_eth_dev_close(port_id);
if (rte_dev_remove(dev_info.device) < 0) {
@@ -3817,6 +3852,16 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
response = xasprintf("All devices shared with device '%s' "
 "have been detached", argv[1]);

+/* The device being detached may happen to be a flow proxy port.
+ * After the flow proxy port was detached, the related ports
+ * will reconfigure the device and update the proxy_port_id.
+ */
+LIST_FOR_EACH (dev, list_node, _list) {
+ if (dev->flow_transfer_proxy_port_id == port_id) {
+netdev_request_reconfigure(>up);
+}
+}
+
ovs_mutex_unlock(_mutex);
unixctl_command_reply(conn, 

[ovs-dev] [PATCH v4 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2023-02-20 Thread Ivan Malov via dev
Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified
explicitly, via the corresponding pattern item.

Signed-off-by: Ivan Malov 
---
 lib/netdev-dpdk.c | 88 +++
 lib/netdev-dpdk.h |  4 +-
 lib/netdev-offload-dpdk.c | 55 +++-
 3 files changed, 117 insertions(+), 30 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2cebc3cca..3a9c9d9a0 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -434,6 +434,7 @@ enum dpdk_hw_ol_features {
 
 struct netdev_dpdk {
 PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline0,
+dpdk_port_t flow_transfer_proxy_port_id;
 dpdk_port_t port_id;
 
 /* If true, device was attached by rte_eth_dev_attach(). */
@@ -1183,6 +1184,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 uint32_t rx_chksm_offload_capa = RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
+int ret;
 
 /*
  * Full tunnel offload requires that tunnel ID metadata be
@@ -1194,6 +1196,24 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  */
 dpdk_eth_dev_init_rx_metadata(dev);
 
+/*
+ * Managing "transfer" flows requires that the user communicate them
+ * via a port which has the privilege to control the embedded switch.
+ * For some vendors, all ports in a given switching domain have
+ * this privilege. For other vendors, it's only one port.
+ *
+ * Get the proxy port ID and remember it for later use.
+ */
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id, 
NULL);
+if (ret != 0) {
+/*
+ * The PMD does not indicate the proxy port.
+ * Assume the proxy is unneeded.
+ */
+dev->flow_transfer_proxy_port_id = dev->port_id;
+}
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
@@ -3981,8 +4001,10 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
const char *argv[], void *aux OVS_UNUSED)
 {
 struct ds used_interfaces = DS_EMPTY_INITIALIZER;
+struct netdev_dpdk *dev_self = NULL;
 struct rte_eth_dev_info dev_info;
 dpdk_port_t sibling_port_id;
+struct netdev_dpdk *dev;
 dpdk_port_t port_id;
 bool used = false;
 char *response;
@@ -4000,8 +4022,6 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
   argv[1]);
 
 RTE_ETH_FOREACH_DEV_SIBLING (sibling_port_id, port_id) {
-struct netdev_dpdk *dev;
-
 LIST_FOR_EACH (dev, list_node, _list) {
 if (dev->port_id != sibling_port_id) {
 continue;
@@ -4021,6 +4041,25 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
 }
 ds_destroy(_interfaces);
 
+/*
+ * The device being detached may happen to be a flow proxy port
+ * for another device (still attached). If so, do not allow to
+ * detach. Devices dependent on this one must be detached first.
+ */
+LIST_FOR_EACH (dev, list_node, _list) {
+if (dev->port_id == port_id) {
+dev_self = dev;
+} else if (dev->flow_transfer_proxy_port_id == port_id) {
+response = xasprintf("Device '%s' can not be detached (flow 
proxy)",
+ argv[1]);
+goto error;
+}
+}
+
+/* Indicate that the device being detached no longer needs a flow proxy. */
+if (dev_self != NULL)
+dev_self->flow_transfer_proxy_port_id = port_id;
+
 rte_eth_dev_info_get(port_id, _info);
 rte_eth_dev_close(port_id);
 if (rte_dev_remove(dev_info.device) < 0) {
@@ -5470,7 +5509,8 @@ unlock:
 }
 
 int
-netdev_dpdk_get_port_id(struct netdev *netdev)
+netdev_dpdk_get_port_id(struct netdev *netdev,
+bool flow_transfer_proxy)
 {
 struct netdev_dpdk *dev;
 int ret = -1;
@@ -5481,7 +5521,7 @@ netdev_dpdk_get_port_id(struct netdev *netdev)
 
 dev = netdev_dpdk_cast(netdev);
 ovs_mutex_lock(>mutex);
-ret = dev->port_id;
+ret = flow_transfer_proxy ? dev->flow_transfer_proxy_port_id : 
dev->port_id;
 ovs_mutex_unlock(>mutex);
 out:
 return ret;
@@ -5517,13 +5557,15 @@ out:
 
 int
 netdev_dpdk_rte_flow_destroy(struct netdev *netdev,
- struct rte_flow *rte_flow,
+ bool transfer, struct rte_flow *rte_flow,
  struct rte_flow_error *error)
 {
 struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
 int ret;
 
-ret = rte_flow_destroy(dev->port_id, rte_flow, error);
+ret = rte_flow_destroy(transfer ?
+   dev-

[ovs-dev] [PATCH v4 0/3] DPDK: align flow offloads with 22.11

2023-02-20 Thread Ivan Malov via dev
Address three problems using three corresponding features in
DPDK, which have been around for a year and are now stable:

1) The need to make sure that metadata generated by flow
   rule execution be delivered from NIC to application;

2) Replacing PORT_ID action with REPRESENTED_PORT;

3) Potential need to manage "transfer" flows
   via a privileged ("transfer proxy") port.
---
 v2: amendments to care about proxy detach and port ID logging
 v3: a minor adjustment in the cover letter subject line
 v4: edits to apply review notes and align with 22.11

Ivan Malov (3):
  netdev-dpdk: negotiate delivery of per-packet Rx metadata
  netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT
  netdev-offload-dpdk: use flow transfer proxy mechanism

 lib/netdev-dpdk.c | 128 +-
 lib/netdev-dpdk.h |   4 +-
 lib/netdev-offload-dpdk.c |  86 +
 3 files changed, 174 insertions(+), 44 deletions(-)

-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 1/3] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-02-20 Thread Ivan Malov via dev
This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
---
 lib/netdev-dpdk.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index fb0dd43f7..2cebc3cca 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1140,6 +1140,36 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);
+}
+}
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1154,6 +1184,16 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
 
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 2/3] netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

2023-02-20 Thread Ivan Malov via dev
Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.

Signed-off-by: Ivan Malov 
---
 lib/netdev-offload-dpdk.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index b3421c099..7f2598a53 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -735,14 +735,14 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
-} else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
-const struct rte_flow_action_port_id *port_id = actions->conf;
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
+
+ds_put_cstr(s, "represented_port ");
+
+if (ethdev)
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);
 
-ds_put_cstr(s, "port_id ");
-if (port_id) {
-ds_put_format(s, "original %d id %d ",
-  port_id->original, port_id->id);
-}
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
 ds_put_cstr(s, "drop / ");
@@ -1776,19 +1776,22 @@ add_count_action(struct flow_actions *actions)
 }
 
 static int
-add_port_id_action(struct flow_actions *actions,
-   struct netdev *outdev)
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
 {
-struct rte_flow_action_port_id *port_id;
+struct rte_flow_action_ethdev *ethdev;
 int outdev_id;
 
 outdev_id = netdev_dpdk_get_port_id(outdev);
 if (outdev_id < 0) {
 return -1;
 }
-port_id = xzalloc(sizeof *port_id);
-port_id->id = outdev_id;
-add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
+
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+
 return 0;
 }
 
@@ -1808,7 +1811,7 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
-add_port_id_action(actions, outdev)) {
+add_represented_port_action(actions, outdev)) {
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;
-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 2/3] netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

2023-02-20 Thread Ivan Malov via dev
Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.

Signed-off-by: Ivan Malov 
---
 lib/netdev-offload-dpdk.c | 31 +--
 1 file changed, 17 insertions(+), 14 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index b3421c099..7f2598a53 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -735,14 +735,14 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
-} else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
-const struct rte_flow_action_port_id *port_id = actions->conf;
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
+
+ds_put_cstr(s, "represented_port ");
+
+if (ethdev)
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);
 
-ds_put_cstr(s, "port_id ");
-if (port_id) {
-ds_put_format(s, "original %d id %d ",
-  port_id->original, port_id->id);
-}
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
 ds_put_cstr(s, "drop / ");
@@ -1776,19 +1776,22 @@ add_count_action(struct flow_actions *actions)
 }
 
 static int
-add_port_id_action(struct flow_actions *actions,
-   struct netdev *outdev)
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
 {
-struct rte_flow_action_port_id *port_id;
+struct rte_flow_action_ethdev *ethdev;
 int outdev_id;
 
 outdev_id = netdev_dpdk_get_port_id(outdev);
 if (outdev_id < 0) {
 return -1;
 }
-port_id = xzalloc(sizeof *port_id);
-port_id->id = outdev_id;
-add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
+
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+
 return 0;
 }
 
@@ -1808,7 +1811,7 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
-add_port_id_action(actions, outdev)) {
+add_represented_port_action(actions, outdev)) {
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;
-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 1/3] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2023-02-20 Thread Ivan Malov via dev
This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
---
 lib/netdev-dpdk.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index fb0dd43f7..2cebc3cca 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1140,6 +1140,36 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);
+}
+}
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1154,6 +1184,16 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
 
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2023-02-20 Thread Ivan Malov via dev
Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified
explicitly, via the corresponding pattern item.

Signed-off-by: Ivan Malov 
---
 lib/netdev-dpdk.c | 88 +++
 lib/netdev-dpdk.h |  4 +-
 lib/netdev-offload-dpdk.c | 55 +++-
 3 files changed, 117 insertions(+), 30 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 2cebc3cca..3a9c9d9a0 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -434,6 +434,7 @@ enum dpdk_hw_ol_features {
 
 struct netdev_dpdk {
 PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline0,
+dpdk_port_t flow_transfer_proxy_port_id;
 dpdk_port_t port_id;
 
 /* If true, device was attached by rte_eth_dev_attach(). */
@@ -1183,6 +1184,7 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 uint32_t rx_chksm_offload_capa = RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
+int ret;
 
 /*
  * Full tunnel offload requires that tunnel ID metadata be
@@ -1194,6 +1196,24 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  */
 dpdk_eth_dev_init_rx_metadata(dev);
 
+/*
+ * Managing "transfer" flows requires that the user communicate them
+ * via a port which has the privilege to control the embedded switch.
+ * For some vendors, all ports in a given switching domain have
+ * this privilege. For other vendors, it's only one port.
+ *
+ * Get the proxy port ID and remember it for later use.
+ */
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id, 
NULL);
+if (ret != 0) {
+/*
+ * The PMD does not indicate the proxy port.
+ * Assume the proxy is unneeded.
+ */
+dev->flow_transfer_proxy_port_id = dev->port_id;
+}
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
@@ -3981,8 +4001,10 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
const char *argv[], void *aux OVS_UNUSED)
 {
 struct ds used_interfaces = DS_EMPTY_INITIALIZER;
+struct netdev_dpdk *dev_self = NULL;
 struct rte_eth_dev_info dev_info;
 dpdk_port_t sibling_port_id;
+struct netdev_dpdk *dev;
 dpdk_port_t port_id;
 bool used = false;
 char *response;
@@ -4000,8 +4022,6 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
   argv[1]);
 
 RTE_ETH_FOREACH_DEV_SIBLING (sibling_port_id, port_id) {
-struct netdev_dpdk *dev;
-
 LIST_FOR_EACH (dev, list_node, _list) {
 if (dev->port_id != sibling_port_id) {
 continue;
@@ -4021,6 +4041,25 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
 }
 ds_destroy(_interfaces);
 
+/*
+ * The device being detached may happen to be a flow proxy port
+ * for another device (still attached). If so, do not allow to
+ * detach. Devices dependent on this one must be detached first.
+ */
+LIST_FOR_EACH (dev, list_node, _list) {
+if (dev->port_id == port_id) {
+dev_self = dev;
+} else if (dev->flow_transfer_proxy_port_id == port_id) {
+response = xasprintf("Device '%s' can not be detached (flow 
proxy)",
+ argv[1]);
+goto error;
+}
+}
+
+/* Indicate that the device being detached no longer needs a flow proxy. */
+if (dev_self != NULL)
+dev_self->flow_transfer_proxy_port_id = port_id;
+
 rte_eth_dev_info_get(port_id, _info);
 rte_eth_dev_close(port_id);
 if (rte_dev_remove(dev_info.device) < 0) {
@@ -5470,7 +5509,8 @@ unlock:
 }
 
 int
-netdev_dpdk_get_port_id(struct netdev *netdev)
+netdev_dpdk_get_port_id(struct netdev *netdev,
+bool flow_transfer_proxy)
 {
 struct netdev_dpdk *dev;
 int ret = -1;
@@ -5481,7 +5521,7 @@ netdev_dpdk_get_port_id(struct netdev *netdev)
 
 dev = netdev_dpdk_cast(netdev);
 ovs_mutex_lock(>mutex);
-ret = dev->port_id;
+ret = flow_transfer_proxy ? dev->flow_transfer_proxy_port_id : 
dev->port_id;
 ovs_mutex_unlock(>mutex);
 out:
 return ret;
@@ -5517,13 +5557,15 @@ out:
 
 int
 netdev_dpdk_rte_flow_destroy(struct netdev *netdev,
- struct rte_flow *rte_flow,
+ bool transfer, struct rte_flow *rte_flow,
  struct rte_flow_error *error)
 {
 struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
 int ret;
 
-ret = rte_flow_destroy(dev->port_id, rte_flow, error);
+ret = rte_flow_destroy(transfer ?
+   dev-

[ovs-dev] [PATCH v4 0/3] DPDK: align flow offloads with 22.11

2023-02-20 Thread Ivan Malov via dev
Address three problems using three corresponding features in
DPDK, which have been around for a year and are now stable:

1) The need to make sure that metadata generated by flow
   rule execution be delivered from NIC to application;

2) Replacing PORT_ID action with REPRESENTED_PORT;

3) Potential need to manage "transfer" flows
   via a privileged ("transfer proxy") port.
---
 v2: amendments to care about proxy detach and port ID logging
 v3: a minor adjustment in the cover letter subject line
 v4: edits to apply review notes and align with 22.11

Ivan Malov (3):
  netdev-dpdk: negotiate delivery of per-packet Rx metadata
  netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT
  netdev-offload-dpdk: use flow transfer proxy mechanism

 lib/netdev-dpdk.c | 128 +-
 lib/netdev-dpdk.h |   4 +-
 lib/netdev-offload-dpdk.c |  86 +
 3 files changed, 174 insertions(+), 44 deletions(-)

-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 0/3] Rework the usage of DPDK transfer flow offloads

2022-07-22 Thread Ivan Malov

Hi Ilya, Eli,

Many thanks for your feedback. Please see below.

On Wed, 20 Jul 2022, Ilya Maximets wrote:


On 7/20/22 14:18, Ivan Malov wrote:

DPDK has got support for offloads involving assignment of per-packet
Rx metadata (flag, mark, tunnel ID and the likes). However, delivery
of such metadata from the NIC to the PMD might need to be negotiated
in advance. API [1] addresses that problem. Make OvS invoke this API.

Another problem is how flow rules with attribute "transfer" refer to
embedded switch ports by means of their representors. Action PORT_ID
has proved ambiguous: it refers to a DPDK ethdev, but in fact steers
packets to the entity represented by the ethdev, in example, to a VF.
The problem is addressed by [2]. Use the solution in OvS accordingly.

In addition, [2] and [3] address filtering traffic by input ports of
the embedded switch. In the suggested approach, "transfer" rules are
managed via the only ethdev with sufficient privileges, whilst match
criteria include an explicit item to indicate the desired input port.
Revisit OvS support for "transfer" rules to follow the said approach.

The following tests have been considered so far:
- build check with the current dpdk-next-net;
- running "make check" for every patch;
- tunnel offload demo with net/sfc PMD.

[1] http://mails.dpdk.org/archives/dev/2021-October/224291.html
[2] http://mails.dpdk.org/archives/dev/2021-October/224620.html
[3] http://mails.dpdk.org/archives/dev/2021-October/225081.html
---
 v1 -> v2: amendments to care about proxy detach and port ID logging
 v2 -> v3: a minor adjustment in the cover letter's subject line

Ivan Malov (3):
  netdev-dpdk: negotiate delivery of per-packet Rx metadata
  netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT
  netdev-offload-dpdk: use flow transfer proxy mechanism

 lib/netdev-dpdk.c | 141 +-
 lib/netdev-dpdk.h |   4 +-
 lib/netdev-offload-dpdk.c |  91 +---
 3 files changed, 209 insertions(+), 27 deletions(-)



Hi.  Since this patch set contains use of DPDK's experimental API,
it needs to be based on dpdk-latest branch and have 'dpdk-latest'
in the subject prefix.  OVS doesn't normally use experimental API.
The tunnel offload support was an exception, because it required
significant re-work of the generic code that wasn't feasible to
carry in the dpdk-latest branch.


Very useful information indeed.



CC: Ian, dpdk-latest branch might need some rebase.

Also, please, don't send new versions in reply to previous one.
It's only required for dpdk-dev list for some reason, but prohibited
or frown upon in every other list that I know of.  It's very
inconvenient to work this way.


Thanks for the hint. I should've known.



For the patch set itself, a brief look over current state of DPDK
shows that not all drivers that support PORT_ID support the
REPRESENTED_PORT action as well.  I see at least dpaa2 and cnxk
that doesn't.  So, this patch set essentially breaks offloading
for these NICs, IIUC.  Yet another rte_flow inconsistency.
Any plans for addressing that?

Best regards, Ilya Maximets.



This observation of yours is correct. The REPRESENTED_PORT commit is:

commit 88caad251c8de3a84e353b0b2a27014bc303df87
Author: Ivan Malov 
Date:   Wed Oct 13 20:34:40 2021 +0300

ethdev: add represented port action to flow API

While the PORT_ID commits in dpaa2 and cnxk, respectively, are:

commit 028d1dfd183043db03eb3d726579b18219c02624
Author: Jun Yang 
Date:   Wed Oct 6 22:31:23 2021 +0530

net/dpaa2: support Tx flow redirection action

commit 00ea15e7a39617903ad39a9b89280a028ec7783a
Author: Satheesh Paul 
Date:   Thu Oct 21 10:41:15 2021 +0530

net/cnxk: support port ID flow action

Indeed, I might've lost at least one driver from my view when
sending the REPRESENTED_PORT series updating the capable PMDs.
And in the case of cnxk, perhaps maintainers were busy enough
and thus did not spot the PORT_ID deprecation being under way.

So yes, the problem exists, and many thanks for noticing that.

Looking into it, I see that we've had a deprecation notice
in DPDK for almost a year already:

* ethdev: Items and actions ``PF``, ``VF``, ``PHY_PORT``, ``PORT_ID`` are
  deprecated as hard-to-use / ambiguous and will be removed in DPDK 22.11.

* ethdev: The use of attributes ``ingress`` / ``egress`` in "transfer" flows
  is deprecated as ambiguous with respect to the embedded switch. The use of
  these attributes will become invalid starting from DPDK 22.11.

So, based on that, I devise the following plan:

1) I will send a patch to make the new items / actions non-experimental.
2) I will send two patches to add REPRESENTED_PORT to dpaa2 and cnxk.
3) I will dispose of the deprecated items / actions and the notice.
   Doing so will also remove "experimental" from transfer proxy
   as well as from Rx metadata configuration API.
4) When (1-3) hav

Re: [ovs-dev] [PATCH v3 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-07-20 Thread Ivan Malov

Hi Eli,

Thanks a lot for a prompt review. PSB.

On Wed, 20 Jul 2022, Eli Britstein wrote:





-Original Message-
From: Ivan Malov 
Sent: Wednesday, July 20, 2022 3:18 PM
To: d...@openvswitch.org
Cc: Eli Britstein ; Stephen Hemminger
; Ilya Maximets ; Ori
Kam ; Maxime Coquelin
; David Marchand
; Andrew Rybchenko

Subject: [PATCH v3 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified explicitly, via the
corresponding pattern item.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
lib/netdev-dpdk.c | 99 ---
lib/netdev-dpdk.h |  4 +-
lib/netdev-offload-dpdk.c | 61 
3 files changed, 135 insertions(+), 29 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
45e5d26d2..01fb40255 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -420,6 +420,7 @@ enum dpdk_hw_ol_features {

struct netdev_dpdk {
PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline0,
+dpdk_port_t flow_transfer_proxy_port_id;
dpdk_port_t port_id;

/* If true, device was attached by rte_eth_dev_attach(). */ @@ -1130,8
+1131,9 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
uint32_t rx_chksm_offload_capa = RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
 RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
 RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
-
#ifdef ALLOW_EXPERIMENTAL_API

There are all over the patches this ifdef, even in cases it's harmless to do 
without. It makes the code less readable and might cause future cherry-picking 
issues. Try to minimize it only to places of a must.


I see your point. But I'm a rookie here, in OvS, and from the existing
code it's not clear which kinds of places would be harmless to stay
without if-defs and which would not. May I kindly request that
you point out at least one example in my patch = a place where
it would be safe to remove the checks. - ?


+int ret;
+
/*
 * Full tunnel offload requires that tunnel ID metadata be
 * delivered with "miss" packets from the hardware to the @@ -1141,6
+1143,27 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 * Request delivery of such metadata.
 */
dpdk_eth_dev_init_rx_metadata(dev);
+
+/*
+ * Managing "transfer" flows requires that the user communicate them
+ * via a port which has the privilege to control the embedded switch.
+ * For some vendors, all ports in a given switching domain have
+ * this privilege. For other vendors, it's only one port.
+ *
+ * Get the proxy port ID and remember it for later use.
+ */
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id, 
NULL);
+if (ret != 0) {
+/*
+ * The PMD does not indicate the proxy port.
+ * Assume the proxy is unneeded.
+ */
+dev->flow_transfer_proxy_port_id = dev->port_id;
+}
+#else /* ! ALLOW_EXPERIMENTAL_API */
+/* No API to get transfer proxy; assume the proxy is unneeded. */
+dev->flow_transfer_proxy_port_id = dev->port_id;
#endif /* ALLOW_EXPERIMENTAL_API */

rte_eth_dev_info_get(dev->port_id, ); @@ -3762,8 +3785,12 @@
netdev_dpdk_detach(struct unixctl_conn *conn, int argc OVS_UNUSED,
   const char *argv[], void *aux OVS_UNUSED)  {
struct ds used_interfaces = DS_EMPTY_INITIALIZER;
+#ifdef ALLOW_EXPERIMENTAL_API
+struct netdev_dpdk *dev_self = NULL; #endif /*
+ALLOW_EXPERIMENTAL_API */
struct rte_eth_dev_info dev_info;
dpdk_port_t sibling_port_id;
+struct netdev_dpdk *dev;
dpdk_port_t port_id;
bool used = false;
char *response;
@@ -3781,8 +3808,6 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int
argc OVS_UNUSED,
  argv[1]);

RTE_ETH_FOREACH_DEV_SIBLING (sibling_port_id, port_id) {
-struct netdev_dpdk *dev;
-
LIST_FOR_EACH (dev, list_node, _list) {
if (dev->port_id != sibling_port_id) {
continue;
@@ -3802,6 +3827,27 @@ netdev_dpdk_detach(struct unixctl_conn *conn,
int argc OVS_UNUSED,
}
ds_destroy(_interfaces);

+#ifdef ALLOW_EXPERIMENTAL_API
+/*
+ * The device being detached may happen to be a flow proxy port
+ * for another device (still attached). If so, do not allow to
+ * detach. Devices dependent on this one must be detached first.

I don't think this is acceptable to deny the port from being detached, or to 
enforce such ordering. For example, ports are being detached upon shutdown, 
with unknown order.


I see your point. However, as far as I can understand, this specific
function is not the one that unplugs devices upon the "global" OvS
shutdown. It might be the one

Re: [ovs-dev] [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-07-20 Thread Ivan Malov

Hi Eli,

Thank you for your review efforts. Please find the new version:

http://patchwork.ozlabs.org/project/openvswitch/patch/20220720121823.2497727-4-ivan.ma...@oktetlabs.ru/

It cares about the detach scenario we've been talking about.
And it cares about port ID logging for transfer flows, too.

On Wed, 8 Jun 2022, Eli Britstein wrote:


Hi Ivan,


-Original Message-
From: Ivan Malov 
Sent: Wednesday, June 8, 2022 10:02 PM
To: Eli Britstein 
Cc: d...@openvswitch.org; Andrew Rybchenko
; Ilya Maximets ;
Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Hi Eli,

On Wed, 8 Jun 2022, Eli Britstein wrote:


Hi Ivan,


-Original Message-
From: Ivan Malov 
Sent: Wednesday, June 8, 2022 5:46 PM
To: Eli Britstein 
Cc: d...@openvswitch.org; Andrew Rybchenko
; Ilya Maximets ;
Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Hi Eli,

On Wed, 8 Jun 2022, Eli Britstein wrote:


Hi Ivan,


-Original Message-
From: Ivan Malov 
Sent: Tuesday, June 7, 2022 11:56 PM
To: Eli Britstein 
Cc: d...@openvswitch.org; Andrew Rybchenko
; Ilya Maximets
; Ori Kam ;
NBU-Contact-Thomas Monjalon (EXTERNAL) ;
Stephen Hemminger ; David Marchand
; Gaetan Rivet ;

Maxime

Coquelin 
Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer
proxy mechanism

External email: Use caution opening links or attachments


Hi Eli,

On Wed, 1 Jun 2022, Eli Britstein wrote:


- Missing proper handling of the testpmd syntax logging. It
changes the used

port according to "transfer", but the log still uses

netdev_dpdk_get_port_id().


Thanks for noticing. I will see to it in the next version.


- The usage of the "proxy" port for rte-flow implies that this
proxy port is

attached to OVS, otherwise it is not "started" and creation of
flows will

fail.


That's the way it is. If there is no proxy for a given port, then
the original port value will be used for managing flows. For
vendors that don't need the proxy, this will work. For others, it won't.

That's OK.



I don't really understand why this can't be done inside dpdk domain
(if there

is a proxy, and it is up, use it, otherwise don't).

That's *currently* the way it is. I understand that if dpdk works
like this OVS

should align, but maybe you or someone else here knows why dpdk works
like this? (not too late to change, this is experimental...).


Regardless of DPDK, on some NICs, it is possible to insert rules via
unprivileged PFs or VFs, but there are also NICs which cannot do it.

In DPDK, this contradiction has to be resolved somehow.
In example, for NICs that can only manage flows via privileged ports,
two possible solutions exist:

1. Route flow management requests from unprivileged ethdevs
   to the privileged one implicitly, inside the PMD. This
   is transparent to users, but, at the same time, it is
   tricky because the application does not realise that
   flows it manages via an ethdev "B" are in fact
   communicated to the NIC via an ethdev "A".

   Unbeknownst of the implicit scheme, the application may
   detach the privileged ethdev "A" in-between. And, when
   time comes to remove flows, doing so via ethdev "B"
   will fail. This scheme breaks in-app housekeeping.

2. Expose the "proxy" port existence to the application.
   If it knows the truth about the real ethdev that
   handles the transfer flows, it won't attempt to
   detach it in-between. The housekeeping is fine.

Outing the existence of the "proxy" port to users seems like the most
reasonable approach. This is why it was implemented in DPDK like this.
Currently, it's indeed an experimental feature. DPDK PMDs which need
it, are supposed to switch to it during the transition phase.

Thanks very much for the explanation, though IMHO relevant PMDs could

still hide it and not do this "outing" of their internals.


Sort of yes, they could hide it. But that would mean doing additional record-
keeping internally, in order to return EBUSY when the app asks to detach the
privileged port which still has active flows on it that have been originally
requested via an unprivileged port. Might be quite error prone. Also, given the
fact that quite a few vendors might need this, isn't it better to make the
feature generic?

Discussing such scenario, this patch does not handle it. Suppose A is the 
privileged port, served as a proxy for port B.
Now, there are flows applied on port B (but actually on A). Nothing prevents 
OVS to detach port A. The flows applied on port B remain ghosted in OVS. Wh

[ovs-dev] [PATCH v3 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-07-20 Thread Ivan Malov
Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified
explicitly, via the corresponding pattern item.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
 lib/netdev-dpdk.c | 99 ---
 lib/netdev-dpdk.h |  4 +-
 lib/netdev-offload-dpdk.c | 61 
 3 files changed, 135 insertions(+), 29 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 45e5d26d2..01fb40255 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -420,6 +420,7 @@ enum dpdk_hw_ol_features {
 
 struct netdev_dpdk {
 PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline0,
+dpdk_port_t flow_transfer_proxy_port_id;
 dpdk_port_t port_id;
 
 /* If true, device was attached by rte_eth_dev_attach(). */
@@ -1130,8 +1131,9 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 uint32_t rx_chksm_offload_capa = RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
-
 #ifdef ALLOW_EXPERIMENTAL_API
+int ret;
+
 /*
  * Full tunnel offload requires that tunnel ID metadata be
  * delivered with "miss" packets from the hardware to the
@@ -1141,6 +1143,27 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  * Request delivery of such metadata.
  */
 dpdk_eth_dev_init_rx_metadata(dev);
+
+/*
+ * Managing "transfer" flows requires that the user communicate them
+ * via a port which has the privilege to control the embedded switch.
+ * For some vendors, all ports in a given switching domain have
+ * this privilege. For other vendors, it's only one port.
+ *
+ * Get the proxy port ID and remember it for later use.
+ */
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id, 
NULL);
+if (ret != 0) {
+/*
+ * The PMD does not indicate the proxy port.
+ * Assume the proxy is unneeded.
+ */
+dev->flow_transfer_proxy_port_id = dev->port_id;
+}
+#else /* ! ALLOW_EXPERIMENTAL_API */
+/* No API to get transfer proxy; assume the proxy is unneeded. */
+dev->flow_transfer_proxy_port_id = dev->port_id;
 #endif /* ALLOW_EXPERIMENTAL_API */
 
 rte_eth_dev_info_get(dev->port_id, );
@@ -3762,8 +3785,12 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
const char *argv[], void *aux OVS_UNUSED)
 {
 struct ds used_interfaces = DS_EMPTY_INITIALIZER;
+#ifdef ALLOW_EXPERIMENTAL_API
+struct netdev_dpdk *dev_self = NULL;
+#endif /* ALLOW_EXPERIMENTAL_API */
 struct rte_eth_dev_info dev_info;
 dpdk_port_t sibling_port_id;
+struct netdev_dpdk *dev;
 dpdk_port_t port_id;
 bool used = false;
 char *response;
@@ -3781,8 +3808,6 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
   argv[1]);
 
 RTE_ETH_FOREACH_DEV_SIBLING (sibling_port_id, port_id) {
-struct netdev_dpdk *dev;
-
 LIST_FOR_EACH (dev, list_node, _list) {
 if (dev->port_id != sibling_port_id) {
 continue;
@@ -3802,6 +3827,27 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
 }
 ds_destroy(_interfaces);
 
+#ifdef ALLOW_EXPERIMENTAL_API
+/*
+ * The device being detached may happen to be a flow proxy port
+ * for another device (still attached). If so, do not allow to
+ * detach. Devices dependent on this one must be detached first.
+ */
+LIST_FOR_EACH (dev, list_node, _list) {
+if (dev->port_id == port_id) {
+dev_self = dev;
+} else if (dev->flow_transfer_proxy_port_id == port_id) {
+response = xasprintf("Device '%s' can not be detached (flow 
proxy)",
+ argv[1]);
+goto error;
+}
+}
+
+/* Indicate that the device being detached no longer needs a flow proxy. */
+if (dev_self != NULL)
+dev_self->flow_transfer_proxy_port_id = port_id;
+#endif /* ALLOW_EXPERIMENTAL_API */
+
 rte_eth_dev_info_get(port_id, _info);
 rte_eth_dev_close(port_id);
 if (rte_dev_remove(dev_info.device) < 0) {
@@ -5167,7 +5213,8 @@ unlock:
 }
 
 int
-netdev_dpdk_get_port_id(struct netdev *netdev)
+netdev_dpdk_get_port_id(struct netdev *netdev,
+bool flow_transfer_proxy)
 {
 struct netdev_dpdk *dev;
 int ret = -1;
@@ -5178,7 +5225,7 @@ netdev_dpdk_get_port_id(struct netdev *netdev)
 
 dev = netdev_dpdk_cast(netdev);
 ovs_mutex_lock(>mutex);
-ret = dev->port_id;
+ret = flow_transfer_proxy ? dev->flow_transfer_proxy_port_id : 
dev->port_id;
 ovs_mutex_unlock(>mutex);
 out:
 return r

[ovs-dev] [PATCH v3 2/3] netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

2022-07-20 Thread Ivan Malov
Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
 lib/netdev-offload-dpdk.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 12d299603..9cd5a0159 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -733,6 +733,14 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
+#ifdef ALLOW_EXPERIMENTAL_API
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
+
+ds_put_cstr(s, "represented_port ");
+if (ethdev) {
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);
+#else /* ! ALLOW_EXPERIMENTAL_API */
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
 const struct rte_flow_action_port_id *port_id = actions->conf;
 
@@ -740,6 +748,7 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 if (port_id) {
 ds_put_format(s, "original %d id %d ",
   port_id->original, port_id->id);
+#endif /* ALLOW_EXPERIMENTAL_API */
 }
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
@@ -1767,6 +1776,24 @@ add_count_action(struct flow_actions *actions)
 add_flow_action(actions, RTE_FLOW_ACTION_TYPE_COUNT, count);
 }
 
+#ifdef ALLOW_EXPERIMENTAL_API
+static int
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
+{
+struct rte_flow_action_ethdev *ethdev;
+int outdev_id;
+
+outdev_id = netdev_dpdk_get_port_id(outdev);
+if (outdev_id < 0) {
+return -1;
+}
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+return 0;
+}
+#else /* ! ALLOW_EXPERIMENTAL_API */
 static int
 add_port_id_action(struct flow_actions *actions,
struct netdev *outdev)
@@ -1783,6 +1810,7 @@ add_port_id_action(struct flow_actions *actions,
 add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
 return 0;
 }
+#endif /* ALLOW_EXPERIMENTAL_API */
 
 static int
 add_output_action(struct netdev *netdev,
@@ -1800,7 +1828,11 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
+#ifdef ALLOW_EXPERIMENTAL_API
+add_represented_port_action(actions, outdev)) {
+#else /* ! ALLOW_EXPERIMENTAL_API */
 add_port_id_action(actions, outdev)) {
+#endif /* ALLOW_EXPERIMENTAL_API */
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 1/3] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2022-07-20 Thread Ivan Malov
This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
 lib/netdev-dpdk.c | 44 
 1 file changed, 44 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f9535bfb4..45e5d26d2 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1085,6 +1085,38 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+#ifdef ALLOW_EXPERIMENTAL_API
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);
+}
+}
+#endif /* ALLOW_EXPERIMENTAL_API */
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1099,6 +1131,18 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
 
+#ifdef ALLOW_EXPERIMENTAL_API
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+#endif /* ALLOW_EXPERIMENTAL_API */
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3 0/3] Rework the usage of DPDK transfer flow offloads

2022-07-20 Thread Ivan Malov
DPDK has got support for offloads involving assignment of per-packet
Rx metadata (flag, mark, tunnel ID and the likes). However, delivery
of such metadata from the NIC to the PMD might need to be negotiated
in advance. API [1] addresses that problem. Make OvS invoke this API.

Another problem is how flow rules with attribute "transfer" refer to
embedded switch ports by means of their representors. Action PORT_ID
has proved ambiguous: it refers to a DPDK ethdev, but in fact steers
packets to the entity represented by the ethdev, in example, to a VF.
The problem is addressed by [2]. Use the solution in OvS accordingly.

In addition, [2] and [3] address filtering traffic by input ports of
the embedded switch. In the suggested approach, "transfer" rules are
managed via the only ethdev with sufficient privileges, whilst match
criteria include an explicit item to indicate the desired input port.
Revisit OvS support for "transfer" rules to follow the said approach.

The following tests have been considered so far:
- build check with the current dpdk-next-net;
- running "make check" for every patch;
- tunnel offload demo with net/sfc PMD.

[1] http://mails.dpdk.org/archives/dev/2021-October/224291.html
[2] http://mails.dpdk.org/archives/dev/2021-October/224620.html
[3] http://mails.dpdk.org/archives/dev/2021-October/225081.html
---
 v1 -> v2: amendments to care about proxy detach and port ID logging
 v2 -> v3: a minor adjustment in the cover letter's subject line

Ivan Malov (3):
  netdev-dpdk: negotiate delivery of per-packet Rx metadata
  netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT
  netdev-offload-dpdk: use flow transfer proxy mechanism

 lib/netdev-dpdk.c | 141 +-
 lib/netdev-dpdk.h |   4 +-
 lib/netdev-offload-dpdk.c |  91 +---
 3 files changed, 209 insertions(+), 27 deletions(-)

-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-07-20 Thread Ivan Malov
Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified
explicitly, via the corresponding pattern item.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
 lib/netdev-dpdk.c | 99 ---
 lib/netdev-dpdk.h |  4 +-
 lib/netdev-offload-dpdk.c | 61 
 3 files changed, 135 insertions(+), 29 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 45e5d26d2..01fb40255 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -420,6 +420,7 @@ enum dpdk_hw_ol_features {
 
 struct netdev_dpdk {
 PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline0,
+dpdk_port_t flow_transfer_proxy_port_id;
 dpdk_port_t port_id;
 
 /* If true, device was attached by rte_eth_dev_attach(). */
@@ -1130,8 +1131,9 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 uint32_t rx_chksm_offload_capa = RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
-
 #ifdef ALLOW_EXPERIMENTAL_API
+int ret;
+
 /*
  * Full tunnel offload requires that tunnel ID metadata be
  * delivered with "miss" packets from the hardware to the
@@ -1141,6 +1143,27 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  * Request delivery of such metadata.
  */
 dpdk_eth_dev_init_rx_metadata(dev);
+
+/*
+ * Managing "transfer" flows requires that the user communicate them
+ * via a port which has the privilege to control the embedded switch.
+ * For some vendors, all ports in a given switching domain have
+ * this privilege. For other vendors, it's only one port.
+ *
+ * Get the proxy port ID and remember it for later use.
+ */
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id, 
NULL);
+if (ret != 0) {
+/*
+ * The PMD does not indicate the proxy port.
+ * Assume the proxy is unneeded.
+ */
+dev->flow_transfer_proxy_port_id = dev->port_id;
+}
+#else /* ! ALLOW_EXPERIMENTAL_API */
+/* No API to get transfer proxy; assume the proxy is unneeded. */
+dev->flow_transfer_proxy_port_id = dev->port_id;
 #endif /* ALLOW_EXPERIMENTAL_API */
 
 rte_eth_dev_info_get(dev->port_id, );
@@ -3762,8 +3785,12 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
const char *argv[], void *aux OVS_UNUSED)
 {
 struct ds used_interfaces = DS_EMPTY_INITIALIZER;
+#ifdef ALLOW_EXPERIMENTAL_API
+struct netdev_dpdk *dev_self = NULL;
+#endif /* ALLOW_EXPERIMENTAL_API */
 struct rte_eth_dev_info dev_info;
 dpdk_port_t sibling_port_id;
+struct netdev_dpdk *dev;
 dpdk_port_t port_id;
 bool used = false;
 char *response;
@@ -3781,8 +3808,6 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
   argv[1]);
 
 RTE_ETH_FOREACH_DEV_SIBLING (sibling_port_id, port_id) {
-struct netdev_dpdk *dev;
-
 LIST_FOR_EACH (dev, list_node, _list) {
 if (dev->port_id != sibling_port_id) {
 continue;
@@ -3802,6 +3827,27 @@ netdev_dpdk_detach(struct unixctl_conn *conn, int argc 
OVS_UNUSED,
 }
 ds_destroy(_interfaces);
 
+#ifdef ALLOW_EXPERIMENTAL_API
+/*
+ * The device being detached may happen to be a flow proxy port
+ * for another device (still attached). If so, do not allow to
+ * detach. Devices dependent on this one must be detached first.
+ */
+LIST_FOR_EACH (dev, list_node, _list) {
+if (dev->port_id == port_id) {
+dev_self = dev;
+} else if (dev->flow_transfer_proxy_port_id == port_id) {
+response = xasprintf("Device '%s' can not be detached (flow 
proxy)",
+ argv[1]);
+goto error;
+}
+}
+
+/* Indicate that the device being detached no longer needs a flow proxy. */
+if (dev_self != NULL)
+dev_self->flow_transfer_proxy_port_id = port_id;
+#endif /* ALLOW_EXPERIMENTAL_API */
+
 rte_eth_dev_info_get(port_id, _info);
 rte_eth_dev_close(port_id);
 if (rte_dev_remove(dev_info.device) < 0) {
@@ -5167,7 +5213,8 @@ unlock:
 }
 
 int
-netdev_dpdk_get_port_id(struct netdev *netdev)
+netdev_dpdk_get_port_id(struct netdev *netdev,
+bool flow_transfer_proxy)
 {
 struct netdev_dpdk *dev;
 int ret = -1;
@@ -5178,7 +5225,7 @@ netdev_dpdk_get_port_id(struct netdev *netdev)
 
 dev = netdev_dpdk_cast(netdev);
 ovs_mutex_lock(>mutex);
-ret = dev->port_id;
+ret = flow_transfer_proxy ? dev->flow_transfer_proxy_port_id : 
dev->port_id;
 ovs_mutex_unlock(>mutex);
 out:
 return r

[ovs-dev] [PATCH v2 2/3] netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

2022-07-20 Thread Ivan Malov
Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
 lib/netdev-offload-dpdk.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 12d299603..9cd5a0159 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -733,6 +733,14 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
+#ifdef ALLOW_EXPERIMENTAL_API
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
+
+ds_put_cstr(s, "represented_port ");
+if (ethdev) {
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);
+#else /* ! ALLOW_EXPERIMENTAL_API */
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
 const struct rte_flow_action_port_id *port_id = actions->conf;
 
@@ -740,6 +748,7 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 if (port_id) {
 ds_put_format(s, "original %d id %d ",
   port_id->original, port_id->id);
+#endif /* ALLOW_EXPERIMENTAL_API */
 }
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
@@ -1767,6 +1776,24 @@ add_count_action(struct flow_actions *actions)
 add_flow_action(actions, RTE_FLOW_ACTION_TYPE_COUNT, count);
 }
 
+#ifdef ALLOW_EXPERIMENTAL_API
+static int
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
+{
+struct rte_flow_action_ethdev *ethdev;
+int outdev_id;
+
+outdev_id = netdev_dpdk_get_port_id(outdev);
+if (outdev_id < 0) {
+return -1;
+}
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+return 0;
+}
+#else /* ! ALLOW_EXPERIMENTAL_API */
 static int
 add_port_id_action(struct flow_actions *actions,
struct netdev *outdev)
@@ -1783,6 +1810,7 @@ add_port_id_action(struct flow_actions *actions,
 add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
 return 0;
 }
+#endif /* ALLOW_EXPERIMENTAL_API */
 
 static int
 add_output_action(struct netdev *netdev,
@@ -1800,7 +1828,11 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
+#ifdef ALLOW_EXPERIMENTAL_API
+add_represented_port_action(actions, outdev)) {
+#else /* ! ALLOW_EXPERIMENTAL_API */
 add_port_id_action(actions, outdev)) {
+#endif /* ALLOW_EXPERIMENTAL_API */
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 0/3] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2022-07-20 Thread Ivan Malov
DPDK has got support for offloads involving assignment of per-packet
Rx metadata (flag, mark, tunnel ID and the likes). However, delivery
of such metadata from the NIC to the PMD might need to be negotiated
in advance. API [1] addresses that problem. Make OvS invoke this API.

Another problem is how flow rules with attribute "transfer" refer to
embedded switch ports by means of their representors. Action PORT_ID
has proved ambiguous: it refers to a DPDK ethdev, but in fact steers
packets to the entity represented by the ethdev, in example, to a VF.
The problem is addressed by [2]. Use the solution in OvS accordingly.

In addition, [2] and [3] address filtering traffic by input ports of
the embedded switch. In the suggested approach, "transfer" rules are
managed via the only ethdev with sufficient privileges, whilst match
criteria include an explicit item to indicate the desired input port.
Revisit OvS support for "transfer" rules to follow the said approach.

The following tests have been considered so far:
- build check with the current dpdk-next-net;
- running "make check" for every patch;
- tunnel offload demo with net/sfc PMD.

[1] http://mails.dpdk.org/archives/dev/2021-October/224291.html
[2] http://mails.dpdk.org/archives/dev/2021-October/224620.html
[3] http://mails.dpdk.org/archives/dev/2021-October/225081.html
---
 v1 -> v2: amendments to care about proxy detach and port ID logging

Ivan Malov (3):
  netdev-dpdk: negotiate delivery of per-packet Rx metadata
  netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT
  netdev-offload-dpdk: use flow transfer proxy mechanism

 lib/netdev-dpdk.c | 141 +-
 lib/netdev-dpdk.h |   4 +-
 lib/netdev-offload-dpdk.c |  91 +---
 3 files changed, 209 insertions(+), 27 deletions(-)

-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 1/3] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2022-07-20 Thread Ivan Malov
This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
 lib/netdev-dpdk.c | 44 
 1 file changed, 44 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f9535bfb4..45e5d26d2 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1085,6 +1085,38 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+#ifdef ALLOW_EXPERIMENTAL_API
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);
+}
+}
+#endif /* ALLOW_EXPERIMENTAL_API */
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1099,6 +1131,18 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
 
+#ifdef ALLOW_EXPERIMENTAL_API
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+#endif /* ALLOW_EXPERIMENTAL_API */
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-06-08 Thread Ivan Malov

Hi Eli,

On Wed, 8 Jun 2022, Eli Britstein wrote:


Hi Ivan,


-Original Message-
From: Ivan Malov 
Sent: Wednesday, June 8, 2022 5:46 PM
To: Eli Britstein 
Cc: d...@openvswitch.org; Andrew Rybchenko
; Ilya Maximets ;
Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Hi Eli,

On Wed, 8 Jun 2022, Eli Britstein wrote:


Hi Ivan,


-Original Message-
From: Ivan Malov 
Sent: Tuesday, June 7, 2022 11:56 PM
To: Eli Britstein 
Cc: d...@openvswitch.org; Andrew Rybchenko
; Ilya Maximets ;
Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Hi Eli,

On Wed, 1 Jun 2022, Eli Britstein wrote:


- Missing proper handling of the testpmd syntax logging. It changes
the used

port according to "transfer", but the log still uses

netdev_dpdk_get_port_id().


Thanks for noticing. I will see to it in the next version.


- The usage of the "proxy" port for rte-flow implies that this proxy
port is

attached to OVS, otherwise it is not "started" and creation of flows will

fail.


That's the way it is. If there is no proxy for a given port, then the
original port value will be used for managing flows. For vendors that
don't need the proxy, this will work. For others, it won't. That's OK.



I don't really understand why this can't be done inside dpdk domain (if there

is a proxy, and it is up, use it, otherwise don't).

That's *currently* the way it is. I understand that if dpdk works like this OVS

should align, but maybe you or someone else here knows why dpdk works like
this? (not too late to change, this is experimental...).


Regardless of DPDK, on some NICs, it is possible to insert rules via
unprivileged PFs or VFs, but there are also NICs which cannot do it.

In DPDK, this contradiction has to be resolved somehow.
In example, for NICs that can only manage flows via privileged ports, two
possible solutions exist:

1. Route flow management requests from unprivileged ethdevs
   to the privileged one implicitly, inside the PMD. This
   is transparent to users, but, at the same time, it is
   tricky because the application does not realise that
   flows it manages via an ethdev "B" are in fact
   communicated to the NIC via an ethdev "A".

   Unbeknownst of the implicit scheme, the application may
   detach the privileged ethdev "A" in-between. And, when
   time comes to remove flows, doing so via ethdev "B"
   will fail. This scheme breaks in-app housekeeping.

2. Expose the "proxy" port existence to the application.
   If it knows the truth about the real ethdev that
   handles the transfer flows, it won't attempt to
   detach it in-between. The housekeeping is fine.

Outing the existence of the "proxy" port to users seems like the most
reasonable approach. This is why it was implemented in DPDK like this.
Currently, it's indeed an experimental feature. DPDK PMDs which need it, are
supposed to switch to it during the transition phase.
Thanks very much for the explanation, though IMHO relevant PMDs could 

still hide it and not do this "outing" of their internals.


Sort of yes, they could hide it. But that would mean doing additional
record-keeping internally, in order to return EBUSY when the app asks
to detach the privileged port which still has active flows on it that
have been originally requested via an unprivileged port. Might be
quite error prone. Also, given the fact that quite a few vendors
might need this, isn't it better to make the feature generic?




However, I should stress out that to NICs that support managing transfer
flows on any PFs and VFs, this proxy scheme is a don't care. The
corresponding drivers may not implement the proxy query method at all:

https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithu
b.com%2FDPDK%2Fdpdk%2Fblob%2Fmain%2Flib%2Fethdev%2Frte_flow.c%2
3L1345data=05%7C01%7Celibr%40nvidia.com%7Cf5a80eb00f0342498
63308da495dab8b%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C6
37902963929533013%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMD
AiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C
sdata=ojwUOsPlz09NXtDXfeO8lAT%2BHcgGYWNRdIhxB6f0cy0%3D
mp;reserved=0

The generic part of the API will just return the original port ID to the
application.

Yes, I saw that. Thanks.



You're very welcome.










-Original Message-
From: Ivan Malov 
Sent: Monday, May 30, 2022 5:16 PM
To: d...@openvswitch.org
Cc: Andrew Rybchenko ; Ilya

Maximets

; Ori Kam ; Eli Britstein
; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marc

Re: [ovs-dev] [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-06-08 Thread Ivan Malov

Hi Eli,

On Wed, 8 Jun 2022, Eli Britstein wrote:


Hi Ivan,


-Original Message-
From: Ivan Malov 
Sent: Tuesday, June 7, 2022 11:56 PM
To: Eli Britstein 
Cc: d...@openvswitch.org; Andrew Rybchenko
; Ilya Maximets ;
Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Hi Eli,

On Wed, 1 Jun 2022, Eli Britstein wrote:


- Missing proper handling of the testpmd syntax logging. It changes the used

port according to "transfer", but the log still uses netdev_dpdk_get_port_id().

Thanks for noticing. I will see to it in the next version.


- The usage of the "proxy" port for rte-flow implies that this proxy port is

attached to OVS, otherwise it is not "started" and creation of flows will fail.

That's the way it is. If there is no proxy for a given port, then the original 
port
value will be used for managing flows. For vendors that don't need the proxy,
this will work. For others, it won't. That's OK.



I don't really understand why this can't be done inside dpdk domain (if there 
is a proxy, and it is up, use it, otherwise don't).
That's *currently* the way it is. I understand that if dpdk works like this OVS 
should align, but maybe you or someone else here knows why dpdk works like 
this? (not too late to change, this is experimental...).



Regardless of DPDK, on some NICs, it is possible to insert rules via
unprivileged PFs or VFs, but there are also NICs which cannot do it.

In DPDK, this contradiction has to be resolved somehow.
In example, for NICs that can only manage flows via
privileged ports, two possible solutions exist:

1. Route flow management requests from unprivileged ethdevs
   to the privileged one implicitly, inside the PMD. This
   is transparent to users, but, at the same time, it is
   tricky because the application does not realise that
   flows it manages via an ethdev "B" are in fact
   communicated to the NIC via an ethdev "A".

   Unbeknownst of the implicit scheme, the application may
   detach the privileged ethdev "A" in-between. And, when
   time comes to remove flows, doing so via ethdev "B"
   will fail. This scheme breaks in-app housekeeping.

2. Expose the "proxy" port existence to the application.
   If it knows the truth about the real ethdev that
   handles the transfer flows, it won't attempt to
   detach it in-between. The housekeeping is fine.

Outing the existence of the "proxy" port to users seems
like the most reasonable approach. This is why it was
implemented in DPDK like this. Currently, it's indeed
an experimental feature. DPDK PMDs which need it, are
supposed to switch to it during the transition phase.

However, I should stress out that to NICs that support
managing transfer flows on any PFs and VFs, this proxy
scheme is a don't care. The corresponding drivers may
not implement the proxy query method at all:

https://github.com/DPDK/dpdk/blob/main/lib/ethdev/rte_flow.c#L1345

The generic part of the API will just return
the original port ID to the application.







-Original Message-
From: Ivan Malov 
Sent: Monday, May 30, 2022 5:16 PM
To: d...@openvswitch.org
Cc: Andrew Rybchenko ; Ilya Maximets
; Ori Kam ; Eli Britstein
; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified explicitly,
via the corresponding pattern item.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
lib/netdev-dpdk.c | 73 ---
lib/netdev-dpdk.h |  2 +-
lib/netdev-offload-dpdk.c | 43 ++-
3 files changed, 103 insertions(+), 15 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
45e5d26d2..d0bf4613a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -420,6 +420,7 @@ enum dpdk_hw_ol_features {

struct netdev_dpdk {
PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE,

cacheline0,

+dpdk_port_t flow_transfer_proxy_port_id;
dpdk_port_t port_id;

/* If true, device was attached by rte_eth_dev_attach(). */
@@ -1115,6
+1116,23 @@ dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
  DPDK_PORT_ID_FMT, dev->port_id);
}
}
+
+static void
+dpdk_eth_dev_init_flow_transfer_proxy(struct netdev_dpdk *dev) {
+int ret;
+
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id, 
NULL);
+if (ret == 0)
+return;
+
+/*
+ * 

Re: [ovs-dev] [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-06-07 Thread Ivan Malov

Hi Eli,

On Wed, 1 Jun 2022, Eli Britstein wrote:


- Missing proper handling of the testpmd syntax logging. It changes the used port 
according to "transfer", but the log still uses netdev_dpdk_get_port_id().


Thanks for noticing. I will see to it in the next version.


- The usage of the "proxy" port for rte-flow implies that this proxy port is attached to 
OVS, otherwise it is not "started" and creation of flows will fail.


That's the way it is. If there is no proxy for a given port, then the
original port value will be used for managing flows. For vendors that
don't need the proxy, this will work. For others, it won't. That's OK.




-----Original Message-
From: Ivan Malov 
Sent: Monday, May 30, 2022 5:16 PM
To: d...@openvswitch.org
Cc: Andrew Rybchenko ; Ilya Maximets
; Ori Kam ; Eli Britstein
; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified explicitly, via the
corresponding pattern item.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
lib/netdev-dpdk.c | 73 ---
lib/netdev-dpdk.h |  2 +-
lib/netdev-offload-dpdk.c | 43 ++-
3 files changed, 103 insertions(+), 15 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
45e5d26d2..d0bf4613a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -420,6 +420,7 @@ enum dpdk_hw_ol_features {

struct netdev_dpdk {
PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline0,
+dpdk_port_t flow_transfer_proxy_port_id;
dpdk_port_t port_id;

/* If true, device was attached by rte_eth_dev_attach(). */ @@ -1115,6
+1116,23 @@ dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
  DPDK_PORT_ID_FMT, dev->port_id);
}
}
+
+static void
+dpdk_eth_dev_init_flow_transfer_proxy(struct netdev_dpdk *dev) {
+int ret;
+
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id, 
NULL);
+if (ret == 0)
+return;
+
+/*
+ * The PMD does not indicate the proxy port.
+ * It is OK to assume the proxy is unneeded.
+ */
+dev->flow_transfer_proxy_port_id = dev->port_id; }
#endif /* ALLOW_EXPERIMENTAL_API */

static int
@@ -1141,6 +1159,19 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 * Request delivery of such metadata.
 */
dpdk_eth_dev_init_rx_metadata(dev);
+
+/*
+ * Managing "transfer" flows requires that the user communicate them
+ * via a port which has the privilege to control the embedded switch.
+ * For some vendors, all ports in a given switching domain have
+ * this privilege. For other vendors, it's only one port.
+ *
+ * Get the proxy port ID and remember it for later use.
+ */
+dpdk_eth_dev_init_flow_transfer_proxy(dev);
+#else /* ! ALLOW_EXPERIMENTAL_API */
+/* It is OK to assume the proxy is unneeded. */
+dev->flow_transfer_proxy_port_id = dev->port_id;
#endif /* ALLOW_EXPERIMENTAL_API */

rte_eth_dev_info_get(dev->port_id, );
@@ -5214,13 +5245,15 @@ out:

int
netdev_dpdk_rte_flow_destroy(struct netdev *netdev,
- struct rte_flow *rte_flow,
+ bool transfer, struct rte_flow *rte_flow,
 struct rte_flow_error *error)
{
struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
int ret;

-ret = rte_flow_destroy(dev->port_id, rte_flow, error);
+ret = rte_flow_destroy(transfer ?
+   dev->flow_transfer_proxy_port_id : dev->port_id,
+   rte_flow, error);
return ret;
}

@@ -5234,7 +5267,19 @@ netdev_dpdk_rte_flow_create(struct netdev
*netdev,
struct rte_flow *flow;
struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);

-flow = rte_flow_create(dev->port_id, attr, items, actions, error);
+#ifdef ALLOW_EXPERIMENTAL_API
+if (!attr->transfer) {
+/*
+ * The 1st item in any pattern is a traffic source one.
+ * It is unnecessary in the case of non-transfer rules.
+ */
+++(items);
+}
+#endif /* ALLOW_EXPERIMENTAL_API */
+
+flow = rte_flow_create(attr->transfer ?
+   dev->flow_transfer_proxy_port_id : dev->port_id,
+   attr, items, actions, error);
return flow;
}

@@ -5262,7 +5307,8 @@ netdev_dpdk_rte_flow_query_count(struct netdev
*netdev,
}

dev = netdev_dpdk_cast(netdev);
-ret = rte_flow_query(dev->port_id, rte_flow, actions, query, error);
+ret = rte_flow_query(dev->flow_transfer_proxy_po

Re: [ovs-dev] [PATCH 1/3] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2022-06-07 Thread Ivan Malov

Hi Eli,

Thanks for reviewing the series.

On Wed, 1 Jun 2022, Eli Britstein wrote:


"TUNNEL_ID" is a bad name, but that's how dpdk called it.
There was a discussion about having this knowledge in OVS so we can avoid 
calling rte_flow_get_restore_info(). How else it is used?


As far as I know, avoiding rte_flow_get_restore_info() demands that
tunnel ID be retrieved from mbufs themselves, using a dynamic field.
That proposal does not discard the need to deliver tunnel data from
the NIC to the PMD, and the patch that we discuss is still required.




-Original Message-----
From: Ivan Malov 
Sent: Monday, May 30, 2022 5:16 PM
To: d...@openvswitch.org
Cc: Andrew Rybchenko ; Ilya Maximets
; Ori Kam ; Eli Britstein
; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: [PATCH 1/3] netdev-dpdk: negotiate delivery of per-packet Rx
metadata

External email: Use caution opening links or attachments


This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
lib/netdev-dpdk.c | 44 
1 file changed, 44 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index f9535bfb4..45e5d26d2
100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1085,6 +1085,38 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk
*dev) OVS_REQUIRES(dev->mutex)
}
}

+#ifdef ALLOW_EXPERIMENTAL_API
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev) {
+uint64_t rx_metadata = 0;
+int ret;
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port
"
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port
"
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);
+}
+}
+#endif /* ALLOW_EXPERIMENTAL_API */
+
static int
dpdk_eth_dev_init(struct netdev_dpdk *dev)
OVS_REQUIRES(dev->mutex)
@@ -1099,6 +1131,18 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
 RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
 RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;

+#ifdef ALLOW_EXPERIMENTAL_API
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+#endif /* ALLOW_EXPERIMENTAL_API */
+
rte_eth_dev_info_get(dev->port_id, );

if (strstr(info.driver_name, "vf") != NULL) {
--
2.30.2




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 0/3] Rework the usage of DPDK transfer flow offloads

2022-05-30 Thread Ivan Malov
DPDK has got support for offloads involving assignment of per-packet
Rx metadata (flag, mark, tunnel ID and the likes). However, delivery
of such metadata from the NIC to the PMD might need to be negotiated
in advance. API [1] addresses that problem. Make OvS invoke this API.

Another problem is how flow rules with attribute "transfer" refer to
embedded switch ports by means of their representors. Action PORT_ID
has proved ambiguous: it refers to a DPDK ethdev, but in fact steers
packets to the entity represented by the ethdev, in example, to a VF.
The problem is addressed by [2]. Use the solution in OvS accordingly.

In addition, [2] and [3] address filtering traffic by input ports of
the embedded switch. In the suggested approach, "transfer" rules are
managed via the only ethdev with sufficient privileges, whilst match
criteria include an explicit item to indicate the desired input port.
Revisit OvS support for "transfer" rules to follow the said approach.

The following tests have been considered so far:
- build check with the current dpdk-next-net;
- running "make check" for every patch;
- tunnel offload demo with net/sfc PMD.

[1] http://mails.dpdk.org/archives/dev/2021-October/224291.html
[2] http://mails.dpdk.org/archives/dev/2021-October/224620.html
[3] http://mails.dpdk.org/archives/dev/2021-October/225081.html

Ivan Malov (3):
  netdev-dpdk: negotiate delivery of per-packet Rx metadata
  netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT
  netdev-offload-dpdk: use flow transfer proxy mechanism

 lib/netdev-dpdk.c | 117 +-
 lib/netdev-dpdk.h |   2 +-
 lib/netdev-offload-dpdk.c |  75 +++-
 3 files changed, 179 insertions(+), 15 deletions(-)

-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-05-30 Thread Ivan Malov
Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified
explicitly, via the corresponding pattern item.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
 lib/netdev-dpdk.c | 73 ---
 lib/netdev-dpdk.h |  2 +-
 lib/netdev-offload-dpdk.c | 43 ++-
 3 files changed, 103 insertions(+), 15 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 45e5d26d2..d0bf4613a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -420,6 +420,7 @@ enum dpdk_hw_ol_features {
 
 struct netdev_dpdk {
 PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE, cacheline0,
+dpdk_port_t flow_transfer_proxy_port_id;
 dpdk_port_t port_id;
 
 /* If true, device was attached by rte_eth_dev_attach(). */
@@ -1115,6 +1116,23 @@ dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
   DPDK_PORT_ID_FMT, dev->port_id);
 }
 }
+
+static void
+dpdk_eth_dev_init_flow_transfer_proxy(struct netdev_dpdk *dev)
+{
+int ret;
+
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   >flow_transfer_proxy_port_id, 
NULL);
+if (ret == 0)
+return;
+
+/*
+ * The PMD does not indicate the proxy port.
+ * It is OK to assume the proxy is unneeded.
+ */
+dev->flow_transfer_proxy_port_id = dev->port_id;
+}
 #endif /* ALLOW_EXPERIMENTAL_API */
 
 static int
@@ -1141,6 +1159,19 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  * Request delivery of such metadata.
  */
 dpdk_eth_dev_init_rx_metadata(dev);
+
+/*
+ * Managing "transfer" flows requires that the user communicate them
+ * via a port which has the privilege to control the embedded switch.
+ * For some vendors, all ports in a given switching domain have
+ * this privilege. For other vendors, it's only one port.
+ *
+ * Get the proxy port ID and remember it for later use.
+ */
+dpdk_eth_dev_init_flow_transfer_proxy(dev);
+#else /* ! ALLOW_EXPERIMENTAL_API */
+/* It is OK to assume the proxy is unneeded. */
+dev->flow_transfer_proxy_port_id = dev->port_id;
 #endif /* ALLOW_EXPERIMENTAL_API */
 
 rte_eth_dev_info_get(dev->port_id, );
@@ -5214,13 +5245,15 @@ out:
 
 int
 netdev_dpdk_rte_flow_destroy(struct netdev *netdev,
- struct rte_flow *rte_flow,
+ bool transfer, struct rte_flow *rte_flow,
  struct rte_flow_error *error)
 {
 struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
 int ret;
 
-ret = rte_flow_destroy(dev->port_id, rte_flow, error);
+ret = rte_flow_destroy(transfer ?
+   dev->flow_transfer_proxy_port_id : dev->port_id,
+   rte_flow, error);
 return ret;
 }
 
@@ -5234,7 +5267,19 @@ netdev_dpdk_rte_flow_create(struct netdev *netdev,
 struct rte_flow *flow;
 struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
 
-flow = rte_flow_create(dev->port_id, attr, items, actions, error);
+#ifdef ALLOW_EXPERIMENTAL_API
+if (!attr->transfer) {
+/*
+ * The 1st item in any pattern is a traffic source one.
+ * It is unnecessary in the case of non-transfer rules.
+ */
+++(items);
+}
+#endif /* ALLOW_EXPERIMENTAL_API */
+
+flow = rte_flow_create(attr->transfer ?
+   dev->flow_transfer_proxy_port_id : dev->port_id,
+   attr, items, actions, error);
 return flow;
 }
 
@@ -5262,7 +5307,8 @@ netdev_dpdk_rte_flow_query_count(struct netdev *netdev,
 }
 
 dev = netdev_dpdk_cast(netdev);
-ret = rte_flow_query(dev->port_id, rte_flow, actions, query, error);
+ret = rte_flow_query(dev->flow_transfer_proxy_port_id, rte_flow,
+ actions, query, error);
 return ret;
 }
 
@@ -5284,8 +5330,8 @@ netdev_dpdk_rte_flow_tunnel_decap_set(struct netdev 
*netdev,
 
 dev = netdev_dpdk_cast(netdev);
 ovs_mutex_lock(>mutex);
-ret = rte_flow_tunnel_decap_set(dev->port_id, tunnel, actions,
-num_of_actions, error);
+ret = rte_flow_tunnel_decap_set(dev->flow_transfer_proxy_port_id, tunnel,
+actions, num_of_actions, error);
 ovs_mutex_unlock(>mutex);
 return ret;
 }
@@ -5306,8 +5352,8 @@ netdev_dpdk_rte_flow_tunnel_match(struct netdev *netdev,
 
 dev = netdev_dpdk_cast(netdev);
 ovs_mutex_lock(>mutex);
-ret = rte_flow_tunnel_match(dev->port_id, tunnel, items, num_of_items,
-error);
+ret = rte_flow_tunnel_match(dev->flow_transfer_proxy_port_id, tunnel,
+items, num_of_items, error);
 ovs_mutex_unlock(>mutex);
 return ret;
 }
@@ -5328,7 +537

[ovs-dev] [PATCH 2/3] netdev-offload-dpdk: replace action PORT_ID with REPRESENTED_PORT

2022-05-30 Thread Ivan Malov
Action PORT_ID has been deprecated. Use REPRESENTED_PORT instead.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
 lib/netdev-offload-dpdk.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 12d299603..9cd5a0159 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -733,6 +733,14 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 ds_put_cstr(s, "rss / ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_COUNT) {
 ds_put_cstr(s, "count / ");
+#ifdef ALLOW_EXPERIMENTAL_API
+} else if (actions->type == RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT) {
+const struct rte_flow_action_ethdev *ethdev = actions->conf;
+
+ds_put_cstr(s, "represented_port ");
+if (ethdev) {
+ds_put_format(s, "ethdev_port_id %d ", ethdev->port_id);
+#else /* ! ALLOW_EXPERIMENTAL_API */
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_PORT_ID) {
 const struct rte_flow_action_port_id *port_id = actions->conf;
 
@@ -740,6 +748,7 @@ dump_flow_action(struct ds *s, struct ds *s_extra,
 if (port_id) {
 ds_put_format(s, "original %d id %d ",
   port_id->original, port_id->id);
+#endif /* ALLOW_EXPERIMENTAL_API */
 }
 ds_put_cstr(s, "/ ");
 } else if (actions->type == RTE_FLOW_ACTION_TYPE_DROP) {
@@ -1767,6 +1776,24 @@ add_count_action(struct flow_actions *actions)
 add_flow_action(actions, RTE_FLOW_ACTION_TYPE_COUNT, count);
 }
 
+#ifdef ALLOW_EXPERIMENTAL_API
+static int
+add_represented_port_action(struct flow_actions *actions,
+struct netdev *outdev)
+{
+struct rte_flow_action_ethdev *ethdev;
+int outdev_id;
+
+outdev_id = netdev_dpdk_get_port_id(outdev);
+if (outdev_id < 0) {
+return -1;
+}
+ethdev = xzalloc(sizeof *ethdev);
+ethdev->port_id = outdev_id;
+add_flow_action(actions, RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT, ethdev);
+return 0;
+}
+#else /* ! ALLOW_EXPERIMENTAL_API */
 static int
 add_port_id_action(struct flow_actions *actions,
struct netdev *outdev)
@@ -1783,6 +1810,7 @@ add_port_id_action(struct flow_actions *actions,
 add_flow_action(actions, RTE_FLOW_ACTION_TYPE_PORT_ID, port_id);
 return 0;
 }
+#endif /* ALLOW_EXPERIMENTAL_API */
 
 static int
 add_output_action(struct netdev *netdev,
@@ -1800,7 +1828,11 @@ add_output_action(struct netdev *netdev,
 return -1;
 }
 if (!netdev_flow_api_equals(netdev, outdev) ||
+#ifdef ALLOW_EXPERIMENTAL_API
+add_represented_port_action(actions, outdev)) {
+#else /* ! ALLOW_EXPERIMENTAL_API */
 add_port_id_action(actions, outdev)) {
+#endif /* ALLOW_EXPERIMENTAL_API */
 VLOG_DBG_RL(, "%s: Output to port \'%s\' cannot be offloaded.",
 netdev_get_name(netdev), netdev_get_name(outdev));
 ret = -1;
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/3] netdev-dpdk: negotiate delivery of per-packet Rx metadata

2022-05-30 Thread Ivan Malov
This may be required by some PMDs in offload scenarios.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
 lib/netdev-dpdk.c | 44 
 1 file changed, 44 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f9535bfb4..45e5d26d2 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1085,6 +1085,38 @@ dpdk_eth_flow_ctrl_setup(struct netdev_dpdk *dev) 
OVS_REQUIRES(dev->mutex)
 }
 }
 
+#ifdef ALLOW_EXPERIMENTAL_API
+static void
+dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
+{
+uint64_t rx_metadata = 0;
+int ret;
+
+/* For the fallback offload (non-"transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_USER_MARK;
+/* For the full offload ("transfer" rules) */
+rx_metadata |= RTE_ETH_RX_METADATA_TUNNEL_ID;
+
+ret = rte_eth_rx_metadata_negotiate(dev->port_id, _metadata);
+if (ret == 0) {
+if (!(rx_metadata & RTE_ETH_RX_METADATA_USER_MARK)) {
+VLOG_DBG("The NIC will not provide per-packet USER_MARK on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+if (!(rx_metadata & RTE_ETH_RX_METADATA_TUNNEL_ID)) {
+VLOG_DBG("The NIC will not provide per-packet TUNNEL_ID on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+}
+} else if (ret == -ENOTSUP) {
+VLOG_DBG("Rx metadata negotiate procedure is not supported on port "
+ DPDK_PORT_ID_FMT, dev->port_id);
+} else {
+VLOG_WARN("Cannot negotiate Rx metadata on port "
+  DPDK_PORT_ID_FMT, dev->port_id);
+}
+}
+#endif /* ALLOW_EXPERIMENTAL_API */
+
 static int
 dpdk_eth_dev_init(struct netdev_dpdk *dev)
 OVS_REQUIRES(dev->mutex)
@@ -1099,6 +1131,18 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
  RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
  RTE_ETH_RX_OFFLOAD_IPV4_CKSUM;
 
+#ifdef ALLOW_EXPERIMENTAL_API
+/*
+ * Full tunnel offload requires that tunnel ID metadata be
+ * delivered with "miss" packets from the hardware to the
+ * PMD. The same goes for megaflow mark metadata which is
+ * used in MARK + RSS offload scenario.
+ *
+ * Request delivery of such metadata.
+ */
+dpdk_eth_dev_init_rx_metadata(dev);
+#endif /* ALLOW_EXPERIMENTAL_API */
+
 rte_eth_dev_info_get(dev->port_id, );
 
 if (strstr(info.driver_name, "vf") != NULL) {
-- 
2.30.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V6 00/13] Netdev vxlan-decap offload

2021-04-06 Thread Ivan Malov

Hi Eli,

Thank you for improving the patch series gradually. May I clarify one 
extra aspect (and I'm sorry if it has been discussed already): will OvS 
always insert F1 first and then insert F2? I mean, yes, in the model, F2 
should process packets coming from F1, but is it *technically* possible 
for OvS to insert the rules in reverse order?


On 04/04/2021 12:54, Eli Britstein wrote:

VXLAN decap in OVS-DPDK configuration consists of two flows:
F1: in_port(ens1f0),eth(),ipv4(),udp(), actions:tnl_pop(vxlan_sys_4789)
F2: tunnel(),in_port(vxlan_sys_4789),eth(),ipv4(), actions:ens1f0_0

F1 is a classification flow. It has outer headers matches and it
classifies the packet as a VXLAN packet, and using tnl_pop action the
packet continues processing in F2.
F2 is a flow that has matches on tunnel metadata as well as on the inner
packet headers (as any other flow).

In order to fully offload VXLAN decap path, both F1 and F2 should be
offloaded. As there are more than one flow in HW, it is possible that
F1 is done by HW but F2 is not. Packet is received by SW, and should be
processed starting from F2 as F1 was already done by HW.
Rte_flows are applicable only on physical port IDs. Keeping the original
physical in port on which the packet is received on enables applying
vport flows (e.g. F2) on that physical port.

This patch-set makes use of [1] introduced in DPDK 20.11, that adds API
for tunnel offloads.

Note that MLX5 PMD has a bug that the tnl_pop private actions must be
first. In OVS it is not.
Fixing this issue is scheduled for 21.05 (and stable 20.11.2).
Meanwhile, tests were done with a workaround for it [2].

v2-v1:
- Tracking original in_port, and applying vport on that physical port instead 
of all PFs.
v3-v2:
- Traversing ports using a new API instead of flow_dump.
- Refactor packet state recover logic, with bug fix for error pop_header.
- One ref count for netdev in non-tunnel case.
- Rename variables, comments, rebase.
v4-v3:
- Extract orig_in_port from physdev for flow modify.
- Miss handling fixes.
v5-v4:
- Drop refactor offload rule creation commit.
- Comment about setting in_port in restore.
- Refactor vports flow offload commit.
v6-v5:
- Fixed duplicate netdev ref bug.

Travis:
v1: https://travis-ci.org/github/elibritstein/OVS/builds/756418552
v2: https://travis-ci.org/github/elibritstein/OVS/builds/758382963
v3: https://travis-ci.org/github/elibritstein/OVS/builds/761089087
v4: https://travis-ci.org/github/elibritstein/OVS/builds/763146966
v5: https://travis-ci.org/github/elibritstein/OVS/builds/765271879
v6: https://travis-ci.org/github/elibritstein/OVS/builds/765816800

GitHub Actions:
v1: https://github.com/elibritstein/OVS/actions/runs/515334647
v2: https://github.com/elibritstein/OVS/actions/runs/554986007
v3: https://github.com/elibritstein/OVS/actions/runs/613226225
v4: https://github.com/elibritstein/OVS/actions/runs/658415274
v5: https://github.com/elibritstein/OVS/actions/runs/704357369
v6: https://github.com/elibritstein/OVS/actions/runs/716304028

[1] https://mails.dpdk.org/archives/dev/2020-October/187314.html
[2] https://github.com/elibritstein/dpdk-stable/pull/1


Eli Britstein (10):
   netdev-offload: Add HW miss packet state recover API
   netdev-dpdk: Introduce DPDK tunnel APIs
   netdev-offload: Introduce an API to traverse ports
   netdev-dpdk: Add flow_api support for netdev vxlan vports
   netdev-offload-dpdk: Implement HW miss packet recover for vport
   dpif-netdev: Add HW miss packet state recover logic
   netdev-offload-dpdk: Change log rate limits
   netdev-offload-dpdk: Support tunnel pop action
   netdev-offload-dpdk: Support vports flows offload
   netdev-dpdk-offload: Add vxlan pattern matching function

Ilya Maximets (2):
   netdev-offload: Allow offloading to netdev without ifindex.
   netdev-offload: Disallow offloading to unrelated tunneling vports.

Sriharsha Basavapatna (1):
   dpif-netdev: Provide orig_in_port in metadata for tunneled packets

  Documentation/howto/dpdk.rst  |   1 +
  NEWS  |   2 +
  lib/dpif-netdev.c |  97 +++--
  lib/netdev-dpdk.c | 118 ++
  lib/netdev-dpdk.h | 106 -
  lib/netdev-offload-dpdk.c | 704 +++---
  lib/netdev-offload-provider.h |   5 +
  lib/netdev-offload-tc.c   |   8 +
  lib/netdev-offload.c  |  47 ++-
  lib/netdev-offload.h  |  10 +
  lib/packets.h |   8 +-
  11 files changed, 1022 insertions(+), 84 deletions(-)



--
Ivan M
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V4 00/14] Netdev vxlan-decap offload

2021-03-25 Thread Ivan Malov

Hello,

You know, now you mention it, will action COUNT always go before any 
other actions in OvS ("tunnel set" and "tunnel match" flows)? If this is 
the case, then the byte count in the counter of "tunnel set" flow should 
be "*before* decapsulation" and the byte count in the counter of "tunnel 
match" flow should be "*after* decapsulation". Is my understanding right?


On 25/03/2021 12:10, Eli Britstein wrote:

Hello,

Note that MLX5 PMD has a bug that the tnl_pop private actions must be 
first. In OVS it is not.


Fixing this issue is scheduled for 21.05 (and stable 20.11.2).

Meanwhile, tests were done with a workaround for it.

See https://github.com/elibritstein/dpdk-stable/pull/1


Also, any other comments on this series?


Thanks,

Eli


On 3/17/2021 8:35 AM, Eli Britstein wrote:

VXLAN decap in OVS-DPDK configuration consists of two flows:
F1: in_port(ens1f0),eth(),ipv4(),udp(), actions:tnl_pop(vxlan_sys_4789)
F2: tunnel(),in_port(vxlan_sys_4789),eth(),ipv4(), actions:ens1f0_0

F1 is a classification flow. It has outer headers matches and it
classifies the packet as a VXLAN packet, and using tnl_pop action the
packet continues processing in F2.
F2 is a flow that has matches on tunnel metadata as well as on the inner
packet headers (as any other flow).

In order to fully offload VXLAN decap path, both F1 and F2 should be
offloaded. As there are more than one flow in HW, it is possible that
F1 is done by HW but F2 is not. Packet is received by SW, and should be
processed starting from F2 as F1 was already done by HW.
Rte_flows are applicable only on physical port IDs. Keeping the original
physical in port on which the packet is received on enables applying
vport flows (e.g. F2) on that physical port.

This patch-set makes use of [1] introduced in DPDK 20.11, that adds API
for tunnel offloads.

v2-v1:
- Tracking original in_port, and applying vport on that physical port 
instead of all PFs.

v3-v2:
- Traversing ports using a new API instead of flow_dump.
- Refactor packet state recover logic, with bug fix for error pop_header.
- One ref count for netdev in non-tunnel case.
- Rename variables, comments, rebase.
v4-v3:
- Extract orig_in_port from physdev for flow modify.
- Miss handling fixes.

Travis:
v1: https://travis-ci.org/github/elibritstein/OVS/builds/756418552
v2: https://travis-ci.org/github/elibritstein/OVS/builds/758382963
v3: https://travis-ci.org/github/elibritstein/OVS/builds/761089087
v4: https://travis-ci.org/github/elibritstein/OVS/builds/763146966

GitHub Actions:
v1: https://github.com/elibritstein/OVS/actions/runs/515334647
v2: https://github.com/elibritstein/OVS/actions/runs/554986007
v3: https://github.com/elibritstein/OVS/actions/runs/613226225
v4: https://github.com/elibritstein/OVS/actions/runs/658415274

[1] https://mails.dpdk.org/archives/dev/2020-October/187314.html

Eli Britstein (11):
   netdev-offload: Add HW miss packet state recover API
   netdev-dpdk: Introduce DPDK tunnel APIs
   netdev-offload: Introduce an API to traverse ports
   netdev-dpdk: Add flow_api support for netdev vxlan vports
   netdev-offload-dpdk: Implement HW miss packet recover for vport
   dpif-netdev: Add HW miss packet state recover logic
   netdev-offload-dpdk: Change log rate limits
   netdev-offload-dpdk: Support tunnel pop action
   netdev-offload-dpdk: Refactor offload rule creation
   netdev-offload-dpdk: Support vports flows offload
   netdev-dpdk-offload: Add vxlan pattern matching function

Ilya Maximets (2):
   netdev-offload: Allow offloading to netdev without ifindex.
   netdev-offload: Disallow offloading to unrelated tunneling vports.

Sriharsha Basavapatna (1):
   dpif-netdev: Provide orig_in_port in metadata for tunneled packets

  Documentation/howto/dpdk.rst  |   1 +
  NEWS  |   2 +
  lib/dpif-netdev.c |  97 +++--
  lib/netdev-dpdk.c | 118 +
  lib/netdev-dpdk.h | 106 -
  lib/netdev-offload-dpdk.c | 782 ++
  lib/netdev-offload-provider.h |   5 +
  lib/netdev-offload-tc.c   |   8 +
  lib/netdev-offload.c  |  47 +-
  lib/netdev-offload.h  |  10 +
  lib/packets.h |   8 +-
  11 files changed, 1052 insertions(+), 132 deletions(-)



--
Ivan M
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V4 05/14] netdev-offload-dpdk: Implement HW miss packet recover for vport

2021-03-17 Thread Ivan Malov

Hi Eli,

On 17/03/2021 09:35, Eli Britstein wrote:

+parse_tcp_flags(packet);
+if (vport_netdev->netdev_class->pop_header(packet) == NULL) {


Thank you for revising the patch series. As far as I can see, in the new 
revision (patch [06/14]), parsing TCP flags is done after successful 
miss recovery (which yields a decapsulated packet), and that should be 
fairly correct. However, why also call parse_tcp_flags() over here, 
before popping the header? This invocation doesn't make use of the 
returned value...


(Sorry if I simply misread the code).

--
Ivan M
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V3 05/14] netdev-offload-dpdk: Implement HW miss packet recover for vport

2021-03-15 Thread Ivan Malov

Hi Eli,

On 02/03/2021 14:25, Eli Britstein wrote:
> +/* For tunnel recovery (RTE_FLOW_RESTORE_INFO_TUNNEL), it is 
possible

> + * to have the packet to still be encapsulated, or not
> + * (RTE_FLOW_RESTORE_INFO_ENCAPSULATED).
> + * In the case it is on, the packet is still encapsulated, 
and we do

> + * the pop in SW.
> + * In the case it is off, the packet is already decapsulated 
by HW, and
> + * the tunnel info is provided in the tunnel struct. For 
this case we

> + * take it to OVS metadata.
> + */
> +if (rte_restore_info.flags & 
RTE_FLOW_RESTORE_INFO_ENCAPSULATED) {


The highlighted code is located *after* 
netdev_dpdk_rte_flow_get_restore_info() invocation. The comment says 
that the packet may *still* be encapsulated. Is the packet in fact 
allowed to have encapsulation header in it even *before* 
netdev_dpdk_rte_flow_get_restore_info()? I.e. right from the point where 
the packet is read from an ethdev's Rx queue. Is that legitimate?


What concerns me is that in patch [06/14] TCP flags get parsed before 
the miss recovery:
> +/* Restore the packet if HW processing was terminated before 
completion. */

> +*tcp_flags = parse_tcp_flags(packet);
> +p = pmd_send_port_cache_lookup(pmd, port_no);
> +if (OVS_LIKELY(p)) {
> +int err = netdev_hw_miss_packet_recover(p->port->netdev, 
packet);


So, I wonder, if the packet is indeed allowed to have encapsulation 
header at that point, can parse_tcp_flags() tolerate that? Will the 
flags lookup still be fine?


(I apologise for asking too many questions).

--
Ivan M
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH V2 10/14] netdev-offload-dpdk: Support tunnel pop action

2021-03-02 Thread Ivan Malov

Hi Eli,

From what has been discussed so far it follows that JUMP from group 0 
to the very same group 0 is intentional and corresponds to 
dp_netdev_recirculate. But could you please clarify, is OvS supposed to 
always use the group value of 0 with the tunnel offload model? In other 
words, can PMD expect the OvS to always insert "tunnel set" and "tunnel 
match" rules at group 0?


Can the PMD always expect OvS to use priority level of 0, too?

A brief comment in the code would be useful, too.

--
Ivan M
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev