Re: [ovs-dev] [PATCH 1/2] netdev-offload-dpdk: Fix flushing of a physdev

2023-06-08 Thread Simon Horman
On Wed, Jun 07, 2023 at 12:21:08PM +0200, David Marchand wrote:
> Hello,
> 
> On Tue, Jun 6, 2023 at 7:10 AM Eli Britstein  wrote:
> > >> >
> > >> > Vport's offloads are done on the tracked orig-in-port, but the flow
> > >> > itself is associated in the vport's map.
> > >> >
> > >> > Removing the physdev will flush all the ports that are on its map,
> > >> > but
> > >>
> > >> all the flows*
> > >>
> > >> > not the ones on other netdevs' maps. Since flows take reference
> > >> > count on both their vport and their physdev, the physdev fails to be
> > >removed.
> > >>
> > >> I tested with a simple ping over a vxlan tunnel.
> > >> In my testing, I do manage to remove the physdev port.
> 
> [snip]
> 
> > >I am simply seeing the vport flow expiring.
> > >So I am probably not testing the right way, could you share a reproducer?
> >
> > Sorry. Indeed, the commit msg is misleading.
> > The physport is being able to be detached, while there is a tunnel 
> > offloaded traffic.
> > However, re-adding it (before the flows are aged) encounters an "already in 
> > use" error:
> > $ ovs-vsctl del-port p0
> > $ ovs-vsctl add-port br-phy0 p0 -- set interface p0 type=dpdk 
> > options:dpdk-devargs=":08:00.0,dv_xmeta_en=3"
> > ovs-vsctl: Error detected while setting up 'p0': 'p0' is trying to use 
> > device ':08:00.0,dv_xmeta_en=3' which is already in use by 'p0'.  See 
> > ovs-vswitchd log for details.
> > ovs-vsctl: The default log directory is "/var/log/openvswitch".
> 
> Ok, got it thanks.
> It seems a bit strange to remove and re-add the PF port, but cleaning
> up those flows makes sense on the principle.
> 
> >
> > I can rebase and rephrase the commit msg.
> 
> Maybe rephrase yes, otherwise the fix works and lgtm.

Thanks,

I think it would be best to post a v2 with this change.
Please include patch 2/2 unless you wish to drop it for some reason.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] netdev-offload-dpdk: Fix flushing of a physdev

2023-06-07 Thread David Marchand
Hello,

On Tue, Jun 6, 2023 at 7:10 AM Eli Britstein  wrote:
> >> >
> >> > Vport's offloads are done on the tracked orig-in-port, but the flow
> >> > itself is associated in the vport's map.
> >> >
> >> > Removing the physdev will flush all the ports that are on its map,
> >> > but
> >>
> >> all the flows*
> >>
> >> > not the ones on other netdevs' maps. Since flows take reference
> >> > count on both their vport and their physdev, the physdev fails to be
> >removed.
> >>
> >> I tested with a simple ping over a vxlan tunnel.
> >> In my testing, I do manage to remove the physdev port.

[snip]

> >I am simply seeing the vport flow expiring.
> >So I am probably not testing the right way, could you share a reproducer?
>
> Sorry. Indeed, the commit msg is misleading.
> The physport is being able to be detached, while there is a tunnel offloaded 
> traffic.
> However, re-adding it (before the flows are aged) encounters an "already in 
> use" error:
> $ ovs-vsctl del-port p0
> $ ovs-vsctl add-port br-phy0 p0 -- set interface p0 type=dpdk 
> options:dpdk-devargs=":08:00.0,dv_xmeta_en=3"
> ovs-vsctl: Error detected while setting up 'p0': 'p0' is trying to use device 
> ':08:00.0,dv_xmeta_en=3' which is already in use by 'p0'.  See 
> ovs-vswitchd log for details.
> ovs-vsctl: The default log directory is "/var/log/openvswitch".

Ok, got it thanks.
It seems a bit strange to remove and re-add the PF port, but cleaning
up those flows makes sense on the principle.

>
> I can rebase and rephrase the commit msg.

Maybe rephrase yes, otherwise the fix works and lgtm.


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] netdev-offload-dpdk: Fix flushing of a physdev

2023-06-05 Thread Eli Britstein via dev


>-Original Message-
>From: David Marchand 
>Sent: Friday, 26 May 2023 17:48
>To: Eli Britstein 
>Cc: d...@openvswitch.org; Ilya Maximets ;
>wuxi_...@163.com
>Subject: Re: [ovs-dev] [PATCH 1/2] netdev-offload-dpdk: Fix flushing of a
>physdev
>
>External email: Use caution opening links or attachments
>
>
>On Fri, May 26, 2023 at 4:35 PM David Marchand
> wrote:
>>
>> Hello Eli,
>>
>> On Mon, Sep 5, 2022 at 4:46 PM Eli Britstein via dev
>>  wrote:
>> >
>> > Vport's offloads are done on the tracked orig-in-port, but the flow
>> > itself is associated in the vport's map.
>> >
>> > Removing the physdev will flush all the ports that are on its map,
>> > but
>>
>> all the flows*
>>
>> > not the ones on other netdevs' maps. Since flows take reference
>> > count on both their vport and their physdev, the physdev fails to be
>removed.
>>
>> I tested with a simple ping over a vxlan tunnel.
>> In my testing, I do manage to remove the physdev port.
>> The revalidator later flushes the expired flow (related to the vport),
>> and the offload thread ends up crashing.
>>
>> netdev_dpdk_get_port_id (netdev=netdev@entry=0x17d333600) at
>> ../lib/netdev-dpdk.c:5438
>> 5438if (!is_dpdk_class(netdev->netdev_class)) {
>> (gdb) bt
>> #0  netdev_dpdk_get_port_id (netdev=netdev@entry=0x17d333600) at
>> ../lib/netdev-dpdk.c:5438
>> #1  0x00a34930 in netdev_offload_dpdk_flow_destroy
>> (rte_flow_data=0x7fa51c0104a0) at ../lib/netdev-offload-dpdk.c:2349
>> #2  0x00926f7c in mark_to_flow_disassociate (dp=0x5c93c80,
>> flow=0x7fa4f400d8a0) at ../lib/dpif-netdev.c:2621
>> #3  0x009276f7 in dp_netdev_flow_offload_del
>> (item=0x7fa4fc003660) at ../lib/dpif-netdev.c:2743
>> #4  dp_offload_flow (item=0x7fa4fc003660) at ../lib/dpif-netdev.c:2855
>> #5  dp_netdev_flow_offload_main (arg=0x59aced0) at
>> ../lib/dpif-netdev.c:2918
>> #6  0x009c0635 in ovsthread_wrapper (aux_=) at
>> ../lib/ovs-thread.c:423
>>
>> (gdb) p rte_flow_data->physdev
>> $5 = (struct netdev *) 0x17d333600
>> (gdb) p rte_flow_data->netdev
>> $6 = (struct netdev *) 0x60bb520
>> (gdb) p *rte_flow_data->physdev
>> $7 = {name = 0x0, netdev_class = 0x0, auto_classified = false,
>> ol_flags = 0, mtu_user_config = false, ref_cnt = 0, change_seq = 0,
>> reconfigure_seq = 0x0, last_reconfigure_seq = 0, n_txq = 0, n_rxq = 0,
>> node = 0x0, saved_flags_list = {
>> prev = 0x0, next = 0x0}, flow_api = {p = 0x0}, dpif_type = 0x0,
>> hw_info = {oor = false, miss_api_supported = false, offload_count = 0,
>> pending_count = 0, offload_data = {p = 0x0}}}
>> (gdb) p *rte_flow_data->netdev
>> $8 = {name = 0x60b8eb0 "vxlan0", netdev_class = 0xcc5518
>> , auto_classified = false, ol_flags = 0,
>> mtu_user_config = false, ref_cnt = 8, change_seq = 4, reconfigure_seq
>> = 0x60b9120, last_reconfigure_seq = 1802,
>>   n_txq = 0, n_rxq = 0, node = 0x60b8da0, saved_flags_list = {prev =
>> 0x60bb570, next = 0x60bb570}, flow_api = {p = 0xb7ede0
>> }, dpif_type = 0xb7e42b "netdev", hw_info = {oor
>> = false, miss_api_supported = true,
>> offload_count = 0, pending_count = 0, offload_data = {p =
>> 0x60bc090}}}
>>
>> There is probably something wrong with the physdev refcnt... and it
>> seems I am hitting an issue close but different to yours.
>
>Ah ah.. nvm, the refcnt issue is on the debug log.
>I'll send a fix for this.
>
>But then I am not able to reproduce your issue.
>I am simply seeing the vport flow expiring.
>So I am probably not testing the right way, could you share a reproducer?

Sorry. Indeed, the commit msg is misleading.
The physport is being able to be detached, while there is a tunnel offloaded 
traffic.
However, re-adding it (before the flows are aged) encounters an "already in 
use" error:
$ ovs-vsctl del-port p0
$ ovs-vsctl add-port br-phy0 p0 -- set interface p0 type=dpdk 
options:dpdk-devargs=":08:00.0,dv_xmeta_en=3"
ovs-vsctl: Error detected while setting up 'p0': 'p0' is trying to use device 
':08:00.0,dv_xmeta_en=3' which is already in use by 'p0'.  See ovs-vswitchd 
log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch".

I can rebase and rephrase the commit msg.
>
>Thanks!
>
>
>--
>David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] netdev-offload-dpdk: Fix flushing of a physdev

2023-05-26 Thread David Marchand
On Fri, May 26, 2023 at 4:35 PM David Marchand
 wrote:
>
> Hello Eli,
>
> On Mon, Sep 5, 2022 at 4:46 PM Eli Britstein via dev
>  wrote:
> >
> > Vport's offloads are done on the tracked orig-in-port, but the flow itself
> > is associated in the vport's map.
> >
> > Removing the physdev will flush all the ports that are on its map, but
>
> all the flows*
>
> > not the ones on other netdevs' maps. Since flows take reference count on
> > both their vport and their physdev, the physdev fails to be removed.
>
> I tested with a simple ping over a vxlan tunnel.
> In my testing, I do manage to remove the physdev port.
> The revalidator later flushes the expired flow (related to the vport),
> and the offload thread ends up crashing.
>
> netdev_dpdk_get_port_id (netdev=netdev@entry=0x17d333600) at
> ../lib/netdev-dpdk.c:5438
> 5438if (!is_dpdk_class(netdev->netdev_class)) {
> (gdb) bt
> #0  netdev_dpdk_get_port_id (netdev=netdev@entry=0x17d333600) at
> ../lib/netdev-dpdk.c:5438
> #1  0x00a34930 in netdev_offload_dpdk_flow_destroy
> (rte_flow_data=0x7fa51c0104a0) at ../lib/netdev-offload-dpdk.c:2349
> #2  0x00926f7c in mark_to_flow_disassociate (dp=0x5c93c80,
> flow=0x7fa4f400d8a0) at ../lib/dpif-netdev.c:2621
> #3  0x009276f7 in dp_netdev_flow_offload_del
> (item=0x7fa4fc003660) at ../lib/dpif-netdev.c:2743
> #4  dp_offload_flow (item=0x7fa4fc003660) at ../lib/dpif-netdev.c:2855
> #5  dp_netdev_flow_offload_main (arg=0x59aced0) at ../lib/dpif-netdev.c:2918
> #6  0x009c0635 in ovsthread_wrapper (aux_=) at
> ../lib/ovs-thread.c:423
>
> (gdb) p rte_flow_data->physdev
> $5 = (struct netdev *) 0x17d333600
> (gdb) p rte_flow_data->netdev
> $6 = (struct netdev *) 0x60bb520
> (gdb) p *rte_flow_data->physdev
> $7 = {name = 0x0, netdev_class = 0x0, auto_classified = false,
> ol_flags = 0, mtu_user_config = false, ref_cnt = 0, change_seq = 0,
> reconfigure_seq = 0x0, last_reconfigure_seq = 0, n_txq = 0, n_rxq = 0,
> node = 0x0, saved_flags_list = {
> prev = 0x0, next = 0x0}, flow_api = {p = 0x0}, dpif_type = 0x0,
> hw_info = {oor = false, miss_api_supported = false, offload_count = 0,
> pending_count = 0, offload_data = {p = 0x0}}}
> (gdb) p *rte_flow_data->netdev
> $8 = {name = 0x60b8eb0 "vxlan0", netdev_class = 0xcc5518
> , auto_classified = false, ol_flags = 0,
> mtu_user_config = false, ref_cnt = 8, change_seq = 4, reconfigure_seq
> = 0x60b9120, last_reconfigure_seq = 1802,
>   n_txq = 0, n_rxq = 0, node = 0x60b8da0, saved_flags_list = {prev =
> 0x60bb570, next = 0x60bb570}, flow_api = {p = 0xb7ede0
> }, dpif_type = 0xb7e42b "netdev", hw_info = {oor
> = false, miss_api_supported = true,
> offload_count = 0, pending_count = 0, offload_data = {p = 0x60bc090}}}
>
> There is probably something wrong with the physdev refcnt... and it
> seems I am hitting an issue close but different to yours.

Ah ah.. nvm, the refcnt issue is on the debug log.
I'll send a fix for this.

But then I am not able to reproduce your issue.
I am simply seeing the vport flow expiring.
So I am probably not testing the right way, could you share a reproducer?

Thanks!


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/2] netdev-offload-dpdk: Fix flushing of a physdev

2023-05-26 Thread David Marchand
Hello Eli,

On Mon, Sep 5, 2022 at 4:46 PM Eli Britstein via dev
 wrote:
>
> Vport's offloads are done on the tracked orig-in-port, but the flow itself
> is associated in the vport's map.
>
> Removing the physdev will flush all the ports that are on its map, but

all the flows*

> not the ones on other netdevs' maps. Since flows take reference count on
> both their vport and their physdev, the physdev fails to be removed.

I tested with a simple ping over a vxlan tunnel.
In my testing, I do manage to remove the physdev port.
The revalidator later flushes the expired flow (related to the vport),
and the offload thread ends up crashing.

netdev_dpdk_get_port_id (netdev=netdev@entry=0x17d333600) at
../lib/netdev-dpdk.c:5438
5438if (!is_dpdk_class(netdev->netdev_class)) {
(gdb) bt
#0  netdev_dpdk_get_port_id (netdev=netdev@entry=0x17d333600) at
../lib/netdev-dpdk.c:5438
#1  0x00a34930 in netdev_offload_dpdk_flow_destroy
(rte_flow_data=0x7fa51c0104a0) at ../lib/netdev-offload-dpdk.c:2349
#2  0x00926f7c in mark_to_flow_disassociate (dp=0x5c93c80,
flow=0x7fa4f400d8a0) at ../lib/dpif-netdev.c:2621
#3  0x009276f7 in dp_netdev_flow_offload_del
(item=0x7fa4fc003660) at ../lib/dpif-netdev.c:2743
#4  dp_offload_flow (item=0x7fa4fc003660) at ../lib/dpif-netdev.c:2855
#5  dp_netdev_flow_offload_main (arg=0x59aced0) at ../lib/dpif-netdev.c:2918
#6  0x009c0635 in ovsthread_wrapper (aux_=) at
../lib/ovs-thread.c:423

(gdb) p rte_flow_data->physdev
$5 = (struct netdev *) 0x17d333600
(gdb) p rte_flow_data->netdev
$6 = (struct netdev *) 0x60bb520
(gdb) p *rte_flow_data->physdev
$7 = {name = 0x0, netdev_class = 0x0, auto_classified = false,
ol_flags = 0, mtu_user_config = false, ref_cnt = 0, change_seq = 0,
reconfigure_seq = 0x0, last_reconfigure_seq = 0, n_txq = 0, n_rxq = 0,
node = 0x0, saved_flags_list = {
prev = 0x0, next = 0x0}, flow_api = {p = 0x0}, dpif_type = 0x0,
hw_info = {oor = false, miss_api_supported = false, offload_count = 0,
pending_count = 0, offload_data = {p = 0x0}}}
(gdb) p *rte_flow_data->netdev
$8 = {name = 0x60b8eb0 "vxlan0", netdev_class = 0xcc5518
, auto_classified = false, ol_flags = 0,
mtu_user_config = false, ref_cnt = 8, change_seq = 4, reconfigure_seq
= 0x60b9120, last_reconfigure_seq = 1802,
  n_txq = 0, n_rxq = 0, node = 0x60b8da0, saved_flags_list = {prev =
0x60bb570, next = 0x60bb570}, flow_api = {p = 0xb7ede0
}, dpif_type = 0xb7e42b "netdev", hw_info = {oor
= false, miss_api_supported = true,
offload_count = 0, pending_count = 0, offload_data = {p = 0x60bc090}}}

There is probably something wrong with the physdev refcnt... and it
seems I am hitting an issue close but different to yours.
Can you retest on the current master branch and confirm?


>
> Fix it by flushing the physdev's offload flows in all related netdevs,
> e.g. the netdev itself, or for physical devices, all vports.
>
> Fixes: adbd4301a249 ("netdev-offload-dpdk: Use per-netdev offload metadata.")
> Reported-by: 15895987278 
> Signed-off-by: Eli Britstein 


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 1/2] netdev-offload-dpdk: Fix flushing of a physdev

2022-09-05 Thread Eli Britstein via dev
Vport's offloads are done on the tracked orig-in-port, but the flow itself
is associated in the vport's map.

Removing the physdev will flush all the ports that are on its map, but
not the ones on other netdevs' maps. Since flows take reference count on
both their vport and their physdev, the physdev fails to be removed.

Fix it by flushing the physdev's offload flows in all related netdevs,
e.g. the netdev itself, or for physical devices, all vports.

Fixes: adbd4301a249 ("netdev-offload-dpdk: Use per-netdev offload metadata.")
Reported-by: 15895987278 
Signed-off-by: Eli Britstein 
---
 lib/netdev-offload-dpdk.c | 35 ++-
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index cceefbc50..981897da1 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -2530,15 +2530,15 @@ out:
 return ret;
 }
 
-static int
-netdev_offload_dpdk_flow_flush(struct netdev *netdev)
+static void
+flush_netdev_flows_in_related(struct netdev *netdev, struct netdev *related)
 {
-struct cmap *map = offload_data_map(netdev);
-struct ufid_to_rte_flow_data *data;
 unsigned int tid = netdev_offload_thread_id();
+struct cmap *map = offload_data_map(related);
+struct ufid_to_rte_flow_data *data;
 
 if (!map) {
-return -1;
+return;
 }
 
 CMAP_FOR_EACH (data, node, map) {
@@ -2549,6 +2549,31 @@ netdev_offload_dpdk_flow_flush(struct netdev *netdev)
 netdev_offload_dpdk_flow_destroy(data);
 }
 }
+}
+
+static bool
+flush_in_vport_cb(struct netdev *vport,
+  odp_port_t odp_port OVS_UNUSED,
+  void *aux)
+{
+struct netdev *netdev = aux;
+
+/* Only vports are related to physical devices. */
+if (netdev_vport_is_vport_class(vport->netdev_class)) {
+flush_netdev_flows_in_related(netdev, vport);
+}
+
+return false;
+}
+
+static int
+netdev_offload_dpdk_flow_flush(struct netdev *netdev)
+{
+flush_netdev_flows_in_related(netdev, netdev);
+
+if (!netdev_vport_is_vport_class(netdev->netdev_class)) {
+netdev_ports_traverse(netdev->dpif_type, flush_in_vport_cb, netdev);
+}
 
 return 0;
 }
-- 
2.28.0.2311.g225365fb51

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev