Re: [ovs-dev] [PATCH net-next] net: rename reference+tracking helpers

2022-06-08 Thread Jiri Pirko
Wed, Jun 08, 2022 at 04:58:27PM CEST, k...@kernel.org wrote:
>On Wed, 8 Jun 2022 10:27:15 +0200 Jiri Pirko wrote:
>> Wed, Jun 08, 2022 at 06:39:55AM CEST, k...@kernel.org wrote:
>> >Netdev reference helpers have a dev_ prefix for historic
>> >reasons. Renaming the old helpers would be too much churn  
>> 
>> Hmm, I think it would be great to eventually rename the rest too in
>> order to maintain a unique prefix for netdev things. Why do you think the
>> "churn" would be an issue?
>
>Felt like we're better off moving everyone to the new tracking helpers
>than doing just a pure rename. But I'm not opposed to a pure rename.
>
>> >diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
>> >index 817577e713d7..815738c0e067 100644
>> >--- a/drivers/net/macsec.c
>> >+++ b/drivers/net/macsec.c
>> >@@ -3462,7 +3462,7 @@ static int macsec_dev_init(struct net_device *dev)
>> >memcpy(dev->broadcast, real_dev->broadcast, dev->addr_len);
>> > 
>> >/* Get macsec's reference to real_dev */
>> >-   dev_hold_track(real_dev, >dev_tracker, GFP_KERNEL);
>> >+   netdev_hold(real_dev, >dev_tracker, GFP_KERNEL);  
>> 
>> So if we later decide to rename dev_hold() to obey the netdev_*() naming
>> scheme, we would have a collision.
>
>dev_hold() should not be used in new code; we should use tracking
>everywhere. Given that, we can name the old helpers __netdev_hold().
>
>> Also, it seems odd to me to have:
>> OLDPREFIX_x()
>> and
>> NEWPREFIX_x()
>> be different functions.
>> 
>> For the sake of not making naming mess, could we rather have:
>> netdev_hold_track()
>> or
>> netdev_hold_tr() if the prior is too long
>> ?
>
>See above, one day non-track version should be removed.
>IMO to encourage use of the track-capable API we could keep their names
>short and call the legacy functions __netdev_hold() as I mentioned or
>maybe netdev_hold_notrack().

Okay, that makes sense.
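
For reference, here is a minimal sketch of how a driver would pair the
track-capable helpers, assuming the netdev_hold()/netdev_put() names proposed
in this patch; the foo_* struct and functions are made up purely for
illustration and rely on <linux/netdevice.h>:

struct foo_priv {
        struct net_device *lower_dev;
        netdevice_tracker lower_tracker;        /* one tracker per held reference */
};

static void foo_bind(struct foo_priv *priv, struct net_device *lower)
{
        /* Track-capable hold: the tracker records which owner took this ref. */
        netdev_hold(lower, &priv->lower_tracker, GFP_KERNEL);
        priv->lower_dev = lower;
}

static void foo_unbind(struct foo_priv *priv)
{
        /* Release with the same tracker that was used for the hold. */
        netdev_put(priv->lower_dev, &priv->lower_tracker);
        priv->lower_dev = NULL;
}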



Re: [ovs-dev] [PATCH net] net: openvswitch: fix misuse of the cached connection on tuple changes

2022-06-08 Thread patchwork-bot+netdevbpf
Hello:

This patch was applied to netdev/net.git (master)
by Jakub Kicinski :

On Tue,  7 Jun 2022 00:11:40 +0200 you wrote:
> If packet headers changed, the cached nfct is no longer relevant
> for the packet, and attempting to re-use it leads to incorrect packet
> classification.
> 
> This issue is causing broken connectivity in OpenStack deployments
> with OVS/OVN due to hairpin traffic being unexpectedly dropped.
> 
> [...]

Here is the summary with links:
  - [net] net: openvswitch: fix misuse of the cached connection on tuple changes
https://git.kernel.org/netdev/net/c/2061ecfdf235

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




Re: [ovs-dev] [RFC ovn 0/2] Basic eBPF/XDP support in OVN.

2022-06-08 Thread 刘梦馨
In our profiling, conntrack is the main reason for the performance drop.

And when we need related functions like LB and ACL, we have to carefully check
whether other, unrelated flows are affected, as with this patch for multicast traffic:
http://patchwork.ozlabs.org/project/ovn/patch/20211217141645.9931-1-dce...@redhat.com

For some performance-test PoC scenarios we also have to disable the
conntrack-related functions to get a better result. If XDP/eBPF can help with the
conntrack performance issues, I think it will be a big boost, and we won't
need lots of customization or have to turn to Cilium to replace some functions,
which brings in lots of complexity.

On Thu, 9 Jun 2022 at 11:19, 刘梦馨  wrote:

> > pod -> pod (directly to the other Pod IP) shouldn't go through any load
> balancer related flows though, right?
>
> It didn't match the final vip and ct_lb action. But when the lb rule
> exists, it will first send all packets to conntrack and lead recirculation
> with ovs clone and it hurts the performance.
>
> And I find the initial commit that send all traffic to conntrack here
> https://github.com/ovn-org/ovn/commit/64cc065e2c59c0696edeef738180989d993ceceb
> is to fix a bug.
>
> Even if we bypass the conntrack action in ingress pipeline by a customized
> ovn, we still cannot bypass the conntrack in the egress pipeline. All
> egress packets still need to be sent to conntrack to test if they match a
> nat session.
>
> I cannot find the full performance test data at the moment. What I find is
> that with the patch to bypass ingress conntrack, with lb rules, the latency
> for pod-to-pod qperf test dropped from 118us to 97us. And if no lb rules
> exist, the pod-to-pod latency drops to 88us.
>
> On Thu, 9 Jun 2022 at 01:52, Dan Williams  wrote:
>
>> On Thu, 2022-06-09 at 00:41 +0800, 刘梦馨 wrote:
>> > > Could you tell roughly how many packets were sent in a single test?
>> > > Was
>> > the latency measured for all the UDP packets in average?
>> >
>> > Let me describe my test method more clearly. In fact, we only tested
>> > pod-to-pod performance *not* pod-to-service and then do profile with
>> > flamegraph and find the loadbalancer process took about 30% CPU
>> > usage.
>>
>> pod -> pod (directly to the other Pod IP) shouldn't go through any load
>> balancer related flows though, right? That seems curious to me... It
>> might hit OVN's load balancer stages but (I think!) shouldn't be
>> matching any rules in them, because the packet's destination IP
>> wouldn't be a LB VIP.
>>
>> Did you do an ofproto/trace to see what OVS flows the packet was
>> hitting and if any were OVN LB related?
>>
>> Dan
>>
>> >
>> > Run two Pods in two different node, and one run qperf server the
>> > other run
>> > qperf client to test udp latency and bandwidth performance with
>> > command
>> > `qperf {another Pod IP} -ub -oo msg_size:1 -vu udp_lat udp_bw`.
>> >
>> > In the first test, we use kube-ovn default setup which use ovn
>> > loadbalancer
>> > to replace kube-proxy and got the result latency  25.7us and
>> > bandwidth
>> > 2.8Mb/s
>> >
>> > Then we manually delete all ovn loadbalancer rules bind to the
>> > logical
>> > switch, and got a much better result 18.5us and 6Mb/s
>> >
>> > > Was it clear why the total datapath cannot be offloaded to HW?
>> > The issue we meet with hw-offload is that mellanox cx5/cx6 didn't
>> > support
>> > dp_hash and hash at the moment and these two method are used by
>> > group table to select a backend.
>> > What makes things worse is that when any lb bind to a ls all packet
>> > will go
>> > through the lb pipeline even if it not designate to service. So the
>> > total
>> > ls datapath cannot be offloaded.
>> >
>> > We have a customized path to bypaas the lb pipeline if traffic not
>> > designate to service here
>> >
>> https://github.com/kubeovn/ovn/commit/d26ae4de0ab070f6b602688ba808c8963f69d5c4.patch
>> >
>> > > I am sorry that I am confused by OVN "L2" LB. I think you might
>> > > mean OVN
>> > "L3/L4" LB?
>> > I mean loadbalancers add to ls by ls-lb-add, kube-ovn uses it to
>> > replace
>> > kube-proxy
>> >
>> > >   I am asking because if the packets hit mega flows in the kernel
>> > > cache,
>> > it shouldn't be slower than kube-proxy which also uses conntrack. If
>> > it is
>> > HW offloaded it should be faster.
>> >
>> > In my previous profile it seems unrelated to mega flow cache. The
>> > flame
>> > graph shows that there is extra ovs clone and reprocess compared to
>> > the
>> > flame graph without lb. I have introduced how to profile and optimize
>> > kube-ovn performance before and give more detail about the lb
>> > performance
>> > issue at the beginning of the video in Chinese
>> > https://www.youtube.com/watch?v=eqKHs05NUlg&t=27s hope it can provide
>> > more
>> > help
>> >
>> > On Wed, 8 Jun 2022 at 23:53, Han Zhou  wrote:
>> >
>> > >
>> > >
>> > > On Wed, Jun 8, 2022 at 8:08 AM Numan Siddique 
>> > > wrote:
>> > > >
>> > > > On Wed, Jun 8, 2022 at 6:34 AM 刘梦馨 
>> > > > wrote:
>> > > > >
>> > > > > Just give 

Re: [ovs-dev] [PATCH net-next] net: rename reference+tracking helpers

2022-06-08 Thread Jakub Kicinski
On Wed, 8 Jun 2022 16:58:08 -0600 David Ahern wrote:
> On 6/8/22 8:58 AM, Jakub Kicinski wrote:
> > IMO to encourage use of the track-capable API we could keep their names
> > short and call the legacy functions __netdev_hold() as I mentioned or
> > maybe netdev_hold_notrack().  
> 
> I like that option. Similar to the old nla_parse functions that were
> renamed with _deprecated - makes it easier to catch new uses.

Well, not really a perfect parallel because _deprecated nla has to stay
forever, given it behaves differently, while _notrack would hopefully
die either thru conversion or someone rightly taking an axe to the
cobwebbed code.

Either way, I hope nobody is against merging the current patch.


Re: [ovs-dev] [RFC ovn 0/2] Basic eBPF/XDP support in OVN.

2022-06-08 Thread 刘梦馨
> pod -> pod (directly to the other Pod IP) shouldn't go through any load
balancer related flows though, right?

It didn't match the final VIP and ct_lb action. But when an LB rule
exists, all packets are first sent to conntrack, which leads to recirculation
with an OVS clone, and that hurts performance.

And I found that the initial commit that sends all traffic to conntrack here
https://github.com/ovn-org/ovn/commit/64cc065e2c59c0696edeef738180989d993ceceb
was made to fix a bug.

Even if we bypass the conntrack action in the ingress pipeline with a customized
OVN, we still cannot bypass conntrack in the egress pipeline. All
egress packets still need to be sent to conntrack to test whether they match a
NAT session.

I cannot find the full performance test data at the moment. What I did find is
that with the patch to bypass ingress conntrack, with LB rules, the latency
for the pod-to-pod qperf test dropped from 118us to 97us. And if no LB rules
exist, the pod-to-pod latency drops to 88us.

On Thu, 9 Jun 2022 at 01:52, Dan Williams  wrote:

> On Thu, 2022-06-09 at 00:41 +0800, 刘梦馨 wrote:
> > > Could you tell roughly how many packets were sent in a single test?
> > > Was
> > the latency measured for all the UDP packets in average?
> >
> > Let me describe my test method more clearly. In fact, we only tested
> > pod-to-pod performance *not* pod-to-service and then do profile with
> > flamegraph and find the loadbalancer process took about 30% CPU
> > usage.
>
> pod -> pod (directly to the other Pod IP) shouldn't go through any load
> balancer related flows though, right? That seems curious to me... It
> might hit OVN's load balancer stages but (I think!) shouldn't be
> matching any rules in them, because the packet's destination IP
> wouldn't be a LB VIP.
>
> Did you do an ofproto/trace to see what OVS flows the packet was
> hitting and if any were OVN LB related?
>
> Dan
>
> >
> > Run two Pods in two different node, and one run qperf server the
> > other run
> > qperf client to test udp latency and bandwidth performance with
> > command
> > `qperf {another Pod IP} -ub -oo msg_size:1 -vu udp_lat udp_bw`.
> >
> > In the first test, we use kube-ovn default setup which use ovn
> > loadbalancer
> > to replace kube-proxy and got the result latency  25.7us and
> > bandwidth
> > 2.8Mb/s
> >
> > Then we manually delete all ovn loadbalancer rules bind to the
> > logical
> > switch, and got a much better result 18.5us and 6Mb/s
> >
> > > Was it clear why the total datapath cannot be offloaded to HW?
> > The issue we meet with hw-offload is that mellanox cx5/cx6 didn't
> > support
> > dp_hash and hash at the moment and these two method are used by
> > group table to select a backend.
> > What makes things worse is that when any lb bind to a ls all packet
> > will go
> > through the lb pipeline even if it not designate to service. So the
> > total
> > ls datapath cannot be offloaded.
> >
> > We have a customized path to bypaas the lb pipeline if traffic not
> > designate to service here
> >
> https://github.com/kubeovn/ovn/commit/d26ae4de0ab070f6b602688ba808c8963f69d5c4.patch
> >
> > > I am sorry that I am confused by OVN "L2" LB. I think you might
> > > mean OVN
> > "L3/L4" LB?
> > I mean loadbalancers add to ls by ls-lb-add, kube-ovn uses it to
> > replace
> > kube-proxy
> >
> > >   I am asking because if the packets hit mega flows in the kernel
> > > cache,
> > it shouldn't be slower than kube-proxy which also uses conntrack. If
> > it is
> > HW offloaded it should be faster.
> >
> > In my previous profile it seems unrelated to mega flow cache. The
> > flame
> > graph shows that there is extra ovs clone and reprocess compared to
> > the
> > flame graph without lb. I have introduced how to profile and optimize
> > kube-ovn performance before and give more detail about the lb
> > performance
> > issue at the beginning of the video in Chinese
> > https://www.youtube.com/watch?v=eqKHs05NUlg&t=27s hope it can provide
> > more
> > help
> >
> > On Wed, 8 Jun 2022 at 23:53, Han Zhou  wrote:
> >
> > >
> > >
> > > On Wed, Jun 8, 2022 at 8:08 AM Numan Siddique 
> > > wrote:
> > > >
> > > > On Wed, Jun 8, 2022 at 6:34 AM 刘梦馨 
> > > > wrote:
> > > > >
> > > > > Just give some input about eBPF/XDP support.
> > > > >
> > > > > We used to use OVN L2 LB to replace kube-proxy in Kubernetes,
> > > > > but found
> > > > > that
> > > > > the L2 LB will use conntrack and ovs clone which hurts
> > > > > performance
> > > badly.
> > > > > The latency
> > > > > for 1byte udp packet jumps from 18.5us to 25.7us and bandwidth
> > > > > drop
> > > from
> > > > > 6Mb/s to 2.8Mb/s.
> > > > >
> > > Thanks for the input!
> > > Could you tell roughly how many packets were sent in a single test?
> > > Was
> > > the latency measured for all the UDP packets in average? I am
> > > asking
> > > because if the packets hit mega flows in the kernel cache, it
> > > shouldn't be
> > > slower than kube-proxy which also uses conntrack. If it is HW
> > > offloaded it
> > > should be faster.
> > >

Re: [ovs-dev] [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-06-08 Thread Eli Britstein via dev
Hi Ivan,

>-Original Message-
>From: Ivan Malov 
>Sent: Wednesday, June 8, 2022 10:02 PM
>To: Eli Britstein 
>Cc: d...@openvswitch.org; Andrew Rybchenko
>; Ilya Maximets ;
>Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
>; Stephen Hemminger
>; David Marchand
>; Gaetan Rivet ; Maxime
>Coquelin 
>Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
>mechanism
>
>External email: Use caution opening links or attachments
>
>
>Hi Eli,
>
>On Wed, 8 Jun 2022, Eli Britstein wrote:
>
>> Hi Ivan,
>>
>>> -Original Message-
>>> From: Ivan Malov 
>>> Sent: Wednesday, June 8, 2022 5:46 PM
>>> To: Eli Britstein 
>>> Cc: d...@openvswitch.org; Andrew Rybchenko
>>> ; Ilya Maximets ;
>>> Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
>>> ; Stephen Hemminger
>>> ; David Marchand
>>> ; Gaetan Rivet ; Maxime
>>> Coquelin 
>>> Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
>>> mechanism
>>>
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> Hi Eli,
>>>
>>> On Wed, 8 Jun 2022, Eli Britstein wrote:
>>>
 Hi Ivan,

> -Original Message-
> From: Ivan Malov 
> Sent: Tuesday, June 7, 2022 11:56 PM
> To: Eli Britstein 
> Cc: d...@openvswitch.org; Andrew Rybchenko
> ; Ilya Maximets
> ; Ori Kam ;
> NBU-Contact-Thomas Monjalon (EXTERNAL) ;
> Stephen Hemminger ; David Marchand
> ; Gaetan Rivet ;
>Maxime
> Coquelin 
> Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer
> proxy mechanism
>
> External email: Use caution opening links or attachments
>
>
> Hi Eli,
>
> On Wed, 1 Jun 2022, Eli Britstein wrote:
>
>> - Missing proper handling of the testpmd syntax logging. It
>> changes the used
> port according to "transfer", but the log still uses
>>> netdev_dpdk_get_port_id().
>
> Thanks for noticing. I will see to it in the next version.
>
>> - The usage of the "proxy" port for rte-flow implies that this
>> proxy port is
> attached to OVS, otherwise it is not "started" and creation of
> flows will
>>> fail.
>
> That's the way it is. If there is no proxy for a given port, then
> the original port value will be used for managing flows. For
> vendors that don't need the proxy, this will work. For others, it won't.
>That's OK.
>>>
 I don't really understand why this can't be done inside dpdk domain
 (if there
>>> is a proxy, and it is up, use it, otherwise don't).
 That's *currently* the way it is. I understand that if dpdk works
 like this OVS
>>> should align, but maybe you or someone else here knows why dpdk works
>>> like this? (not too late to change, this is experimental...).
>>>
>>>
>>> Regardless of DPDK, on some NICs, it is possible to insert rules via
>>> unprivileged PFs or VFs, but there are also NICs which cannot do it.
>>>
>>> In DPDK, this contradiction has to be resolved somehow.
>>> In example, for NICs that can only manage flows via privileged ports,
>>> two possible solutions exist:
>>>
>>> 1. Route flow management requests from unprivileged ethdevs
>>>to the privileged one implicitly, inside the PMD. This
>>>is transparent to users, but, at the same time, it is
>>>tricky because the application does not realise that
>>>flows it manages via an ethdev "B" are in fact
>>>communicated to the NIC via an ethdev "A".
>>>
>>>Unbeknownst of the implicit scheme, the application may
>>>detach the privileged ethdev "A" in-between. And, when
>>>time comes to remove flows, doing so via ethdev "B"
>>>will fail. This scheme breaks in-app housekeeping.
>>>
>>> 2. Expose the "proxy" port existence to the application.
>>>If it knows the truth about the real ethdev that
>>>handles the transfer flows, it won't attempt to
>>>detach it in-between. The housekeeping is fine.
>>>
>>> Outing the existence of the "proxy" port to users seems like the most
>>> reasonable approach. This is why it was implemented in DPDK like this.
>>> Currently, it's indeed an experimental feature. DPDK PMDs which need
>>> it, are supposed to switch to it during the transition phase.
>> Thanks very much for the explanation, though IMHO relevant PMDs could
>still hide it and not do this "outing" of their internals.
>
>
>Sort of yes, they could hide it. But that would mean doing additional record-
>keeping internally, in order to return EBUSY when the app asks to detach the
>privileged port which still has active flows on it that have been originally
>requested via an unprivileged port. Might be quite error prone. Also, given the
>fact that quite a few vendors might need this, isn't it better to make the
>feature generic?
Discussing such a scenario, this patch does not handle it. Suppose A is the
privileged port, serving as a proxy for port B.
Now, there are flows applied on port B (but actually on A). Nothing prevents
OVS from detaching port A. The flows applied on port B 

Re: [ovs-dev] [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-06-08 Thread Ivan Malov

Hi Eli,

On Wed, 8 Jun 2022, Eli Britstein wrote:


Hi Ivan,


-Original Message-
From: Ivan Malov 
Sent: Wednesday, June 8, 2022 5:46 PM
To: Eli Britstein 
Cc: d...@openvswitch.org; Andrew Rybchenko
; Ilya Maximets ;
Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Hi Eli,

On Wed, 8 Jun 2022, Eli Britstein wrote:


Hi Ivan,


-Original Message-
From: Ivan Malov 
Sent: Tuesday, June 7, 2022 11:56 PM
To: Eli Britstein 
Cc: d...@openvswitch.org; Andrew Rybchenko
; Ilya Maximets ;
Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Hi Eli,

On Wed, 1 Jun 2022, Eli Britstein wrote:


- Missing proper handling of the testpmd syntax logging. It changes
the used

port according to "transfer", but the log still uses

netdev_dpdk_get_port_id().


Thanks for noticing. I will see to it in the next version.


- The usage of the "proxy" port for rte-flow implies that this proxy
port is

attached to OVS, otherwise it is not "started" and creation of flows will

fail.


That's the way it is. If there is no proxy for a given port, then the
original port value will be used for managing flows. For vendors that
don't need the proxy, this will work. For others, it won't. That's OK.



I don't really understand why this can't be done inside dpdk domain (if there

is a proxy, and it is up, use it, otherwise don't).

That's *currently* the way it is. I understand that if dpdk works like this OVS

should align, but maybe you or someone else here knows why dpdk works like
this? (not too late to change, this is experimental...).


Regardless of DPDK, on some NICs, it is possible to insert rules via
unprivileged PFs or VFs, but there are also NICs which cannot do it.

In DPDK, this contradiction has to be resolved somehow.
In example, for NICs that can only manage flows via privileged ports, two
possible solutions exist:

1. Route flow management requests from unprivileged ethdevs
   to the privileged one implicitly, inside the PMD. This
   is transparent to users, but, at the same time, it is
   tricky because the application does not realise that
   flows it manages via an ethdev "B" are in fact
   communicated to the NIC via an ethdev "A".

   Unbeknownst of the implicit scheme, the application may
   detach the privileged ethdev "A" in-between. And, when
   time comes to remove flows, doing so via ethdev "B"
   will fail. This scheme breaks in-app housekeeping.

2. Expose the "proxy" port existence to the application.
   If it knows the truth about the real ethdev that
   handles the transfer flows, it won't attempt to
   detach it in-between. The housekeeping is fine.

Outing the existence of the "proxy" port to users seems like the most
reasonable approach. This is why it was implemented in DPDK like this.
Currently, it's indeed an experimental feature. DPDK PMDs which need it, are
supposed to switch to it during the transition phase.
Thanks very much for the explanation, though IMHO relevant PMDs could 

still hide it and not do this "outing" of their internals.


Sort of yes, they could hide it. But that would mean doing additional
record-keeping internally, in order to return EBUSY when the app asks
to detach the privileged port which still has active flows on it that
have been originally requested via an unprivileged port. Might be
quite error prone. Also, given the fact that quite a few vendors
might need this, isn't it better to make the feature generic?




However, I should stress out that to NICs that support managing transfer
flows on any PFs and VFs, this proxy scheme is a don't care. The
corresponding drivers may not implement the proxy query method at all:

https://github.com/DPDK/dpdk/blob/main/lib/ethdev/rte_flow.c#L1345

The generic part of the API will just return the original port ID to the
application.

Yes, I saw that. Thanks.



You're very welcome.
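
For context, a minimal sketch of how an application such as OVS could resolve
which port to use for managing "transfer" flows with the experimental
rte_flow_pick_transfer_proxy() API discussed here; pick_flow_mgmt_port() is a
hypothetical helper, and the fallback to the original port ID follows the
behaviour described above for PMDs that do not implement the query:

#include <stdint.h>
#include <rte_flow.h>

static uint16_t
pick_flow_mgmt_port(uint16_t port_id)
{
    uint16_t proxy_port_id = port_id;
    struct rte_flow_error error;

    /* PMDs that need a privileged "proxy" port report it here; the generic
     * layer hands back the original port ID when no proxy is involved. */
    if (rte_flow_pick_transfer_proxy(port_id, &proxy_port_id, &error) != 0) {
        return port_id;    /* no proxy information; keep the original port */
    }
    return proxy_port_id;
}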










-Original Message-
From: Ivan Malov 
Sent: Monday, May 30, 2022 5:16 PM
To: d...@openvswitch.org
Cc: Andrew Rybchenko ; Ilya

Maximets

; Ori Kam ; Eli Britstein
; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ;

Maxime

Coquelin 
Subject: [PATCH 3/3] netdev-offload-dpdk: use flow transfer 

Re: [ovs-dev] [RFC ovn 0/2] Basic eBPF/XDP support in OVN.

2022-06-08 Thread Dan Williams
On Thu, 2022-06-09 at 00:41 +0800, 刘梦馨 wrote:
> > Could you tell roughly how many packets were sent in a single test?
> > Was
> the latency measured for all the UDP packets in average?
> 
> Let me describe my test method more clearly. In fact, we only tested
> pod-to-pod performance *not* pod-to-service and then do profile with
> flamegraph and find the loadbalancer process took about 30% CPU
> usage.

pod -> pod (directly to the other Pod IP) shouldn't go through any load
balancer related flows though, right? That seems curious to me... It
might hit OVN's load balancer stages but (I think!) shouldn't be
matching any rules in them, because the packet's destination IP
wouldn't be a LB VIP.

Did you do an ofproto/trace to see what OVS flows the packet was
hitting and if any were OVN LB related?

Dan

> 
> Run two Pods in two different node, and one run qperf server the
> other run
> qperf client to test udp latency and bandwidth performance with
> command
> `qperf {another Pod IP} -ub -oo msg_size:1 -vu udp_lat udp_bw`.
> 
> In the first test, we use kube-ovn default setup which use ovn
> loadbalancer
> to replace kube-proxy and got the result latency  25.7us and
> bandwidth
> 2.8Mb/s
> 
> Then we manually delete all ovn loadbalancer rules bind to the
> logical
> switch, and got a much better result 18.5us and 6Mb/s
> 
> > Was it clear why the total datapath cannot be offloaded to HW?
> The issue we meet with hw-offload is that mellanox cx5/cx6 didn't
> support
> dp_hash and hash at the moment and these two method are used by
> group table to select a backend.
> What makes things worse is that when any lb bind to a ls all packet
> will go
> through the lb pipeline even if it not designate to service. So the
> total
> ls datapath cannot be offloaded.
> 
> We have a customized path to bypaas the lb pipeline if traffic not
> designate to service here
> https://github.com/kubeovn/ovn/commit/d26ae4de0ab070f6b602688ba808c8963f69d5c4.patch
> 
> > I am sorry that I am confused by OVN "L2" LB. I think you might
> > mean OVN
> "L3/L4" LB?
> I mean loadbalancers add to ls by ls-lb-add, kube-ovn uses it to
> replace
> kube-proxy
> 
> >   I am asking because if the packets hit mega flows in the kernel
> > cache,
> it shouldn't be slower than kube-proxy which also uses conntrack. If
> it is
> HW offloaded it should be faster.
> 
> In my previous profile it seems unrelated to mega flow cache. The
> flame
> graph shows that there is extra ovs clone and reprocess compared to
> the
> flame graph without lb. I have introduced how to profile and optimize
> kube-ovn performance before and give more detail about the lb
> performance
> issue at the beginning of the video in Chinese
> https://www.youtube.com/watch?v=eqKHs05NUlg&t=27s hope it can provide
> more
> help
> 
> On Wed, 8 Jun 2022 at 23:53, Han Zhou  wrote:
> 
> > 
> > 
> > On Wed, Jun 8, 2022 at 8:08 AM Numan Siddique 
> > wrote:
> > > 
> > > On Wed, Jun 8, 2022 at 6:34 AM 刘梦馨 
> > > wrote:
> > > > 
> > > > Just give some input about eBPF/XDP support.
> > > > 
> > > > We used to use OVN L2 LB to replace kube-proxy in Kubernetes,
> > > > but found
> > > > that
> > > > the L2 LB will use conntrack and ovs clone which hurts
> > > > performance
> > badly.
> > > > The latency
> > > > for 1byte udp packet jumps from 18.5us to 25.7us and bandwidth
> > > > drop
> > from
> > > > 6Mb/s to 2.8Mb/s.
> > > > 
> > Thanks for the input!
> > Could you tell roughly how many packets were sent in a single test?
> > Was
> > the latency measured for all the UDP packets in average? I am
> > asking
> > because if the packets hit mega flows in the kernel cache, it
> > shouldn't be
> > slower than kube-proxy which also uses conntrack. If it is HW
> > offloaded it
> > should be faster.
> > 
> > > > Even if the traffic does not target to LB VIPs has the same
> > > > performance
> > > > drop and it also leads to the
> > > > total datapath cannot be offloaded to hardware.
> > > > 
> > 
> > Was it clear why the total datapath cannot be offloaded to HW?
> > There might
> > be problems of supporting HW offloading in earlier version of OVN.
> > There
> > have been improvements to make it more HW offload friendly.
> > 
> > > > And finally we turn to using Cilium's chaining mode to replace
> > > > the OVN
> > L2
> > > > LB to implement kube-proxy to
> > > > resolve the above issues. We hope to see the lb optimization by
> > eBPF/XDP on
> > > > the OVN side.
> > > > 
> > > 
> > > Thanks for your comments and inputs.   I think we should
> > > definitely
> > > explore optimizing this use case
> > > and see if its possible to leverage eBPF/XDP for this.
> > > 
> > 
> > I am sorry that I am confused by OVN "L2" LB. I think you might
> > mean OVN
> > "L3/L4" LB?
> > 
> > Some general thoughts on this is, OVN is primarily to program OVS
> > (or
> > other OpenFlow based datapath) to implement SDN. OVS OpenFlow is a
> > data-driven approach (as mentioned by Ben in several talks). The
> > advantage
> > is that it 

Re: [ovs-dev] [PATCH v2 2/2] ofproto-dpif: avoid unneccesary backer revalidation

2022-06-08 Thread Paolo Valerio
lic121  writes:

> If lldp didn't change, we are not supposed to trigger backer
> revalidation.
> Without this patch, bridge_reconfigure() always triggers the udpif
> revalidator because of lldp.
>
> Signed-off-by: lic121 
> Signed-off-by: Eelco Chaudron 
> Co-authored-by: Eelco Chaudron 
> ---

LGTM,

Acked-by: Paolo Valerio 



Re: [ovs-dev] [PATCH v2 1/2] lldp: fix lldp memory leak

2022-06-08 Thread Paolo Valerio
lic121  writes:

> lldp_create() allocates memory for lldp->lldpd->g_hardware. lldp_unref()
> is supposed to free that memory regardless of hw->h_flags.
>
> Signed-off-by: lic121 
> Acked-by: Eelco Chaudron 
> ---
>  lib/lldp/lldpd.c | 10 +++---
>  1 file changed, 3 insertions(+), 7 deletions(-)
>
> diff --git a/lib/lldp/lldpd.c b/lib/lldp/lldpd.c
> index 403f1f5..4bff7b0 100644
> --- a/lib/lldp/lldpd.c
> +++ b/lib/lldp/lldpd.c
> @@ -140,13 +140,9 @@ lldpd_cleanup(struct lldpd *cfg)
>  VLOG_DBG("cleanup all ports");
>  
>  LIST_FOR_EACH_SAFE (hw, h_entries, >g_hardware) {
> -if (!hw->h_flags) {
> -ovs_list_remove(>h_entries);
> -lldpd_remote_cleanup(hw, NULL, true);
> -lldpd_hardware_cleanup(cfg, hw);
> -} else {
> -lldpd_remote_cleanup(hw, NULL, false);
> -}
> +ovs_list_remove(>h_entries);
> +lldpd_remote_cleanup(hw, NULL, true);
> +lldpd_hardware_cleanup(cfg, hw);
>  }
>  
>  VLOG_DBG("cleanup all chassis");


Acked-by: Paolo Valerio 



Re: [ovs-dev] [RFC ovn 0/2] Basic eBPF/XDP support in OVN.

2022-06-08 Thread 刘梦馨
> Could you tell roughly how many packets were sent in a single test? Was
the latency measured for all the UDP packets in average?

Let me describe my test method more clearly. In fact, we only tested
pod-to-pod performance, *not* pod-to-service, and then did a profile with
flamegraph and found that the load balancer processing took about 30% of the CPU usage.

We run two Pods on two different nodes: one runs a qperf server and the other runs
a qperf client to test UDP latency and bandwidth performance with the command
`qperf {another Pod IP} -ub -oo msg_size:1 -vu udp_lat udp_bw`.

In the first test, we used the kube-ovn default setup, which uses the OVN load
balancer to replace kube-proxy, and got a latency of 25.7us and a bandwidth of
2.8Mb/s.

Then we manually deleted all OVN load balancer rules bound to the logical
switch, and got a much better result: 18.5us and 6Mb/s.

> Was it clear why the total datapath cannot be offloaded to HW?
The issue we meet with hw-offload is that Mellanox CX5/CX6 don't support
dp_hash and hash at the moment, and these two methods are used by the
group table to select a backend.
What makes things worse is that when any LB is bound to an LS, all packets go
through the LB pipeline even if they are not destined to a service. So the total
LS datapath cannot be offloaded.

We have a customized patch to bypass the LB pipeline if traffic is not
destined to a service, here:
https://github.com/kubeovn/ovn/commit/d26ae4de0ab070f6b602688ba808c8963f69d5c4.patch

> I am sorry that I am confused by OVN "L2" LB. I think you might mean OVN
"L3/L4" LB?
I mean the load balancers added to an LS by ls-lb-add; kube-ovn uses them to replace
kube-proxy.

>   I am asking because if the packets hit mega flows in the kernel cache,
it shouldn't be slower than kube-proxy which also uses conntrack. If it is
HW offloaded it should be faster.

In my previous profile it seems unrelated to the megaflow cache. The flame
graph shows that there is an extra OVS clone and reprocessing compared to the
flame graph without LB. I have introduced how to profile and optimize
kube-ovn performance before, and give more detail about the LB performance
issue at the beginning of this video (in Chinese):
https://www.youtube.com/watch?v=eqKHs05NUlg&t=27s
I hope it can provide more help.

On Wed, 8 Jun 2022 at 23:53, Han Zhou  wrote:

>
>
> On Wed, Jun 8, 2022 at 8:08 AM Numan Siddique  wrote:
> >
> > On Wed, Jun 8, 2022 at 6:34 AM 刘梦馨  wrote:
> > >
> > > Just give some input about eBPF/XDP support.
> > >
> > > We used to use OVN L2 LB to replace kube-proxy in Kubernetes, but found
> > > that
> > > the L2 LB will use conntrack and ovs clone which hurts performance
> badly.
> > > The latency
> > > for 1byte udp packet jumps from 18.5us to 25.7us and bandwidth drop
> from
> > > 6Mb/s to 2.8Mb/s.
> > >
> Thanks for the input!
> Could you tell roughly how many packets were sent in a single test? Was
> the latency measured for all the UDP packets in average? I am asking
> because if the packets hit mega flows in the kernel cache, it shouldn't be
> slower than kube-proxy which also uses conntrack. If it is HW offloaded it
> should be faster.
>
> > > Even if the traffic does not target to LB VIPs has the same performance
> > > drop and it also leads to the
> > > total datapath cannot be offloaded to hardware.
> > >
>
> Was it clear why the total datapath cannot be offloaded to HW? There might
> be problems of supporting HW offloading in earlier version of OVN. There
> have been improvements to make it more HW offload friendly.
>
> > > And finally we turn to using Cilium's chaining mode to replace the OVN
> L2
> > > LB to implement kube-proxy to
> > > resolve the above issues. We hope to see the lb optimization by
> eBPF/XDP on
> > > the OVN side.
> > >
> >
> > Thanks for your comments and inputs.   I think we should definitely
> > explore optimizing this use case
> > and see if its possible to leverage eBPF/XDP for this.
> >
>
> I am sorry that I am confused by OVN "L2" LB. I think you might mean OVN
> "L3/L4" LB?
>
> Some general thoughts on this is, OVN is primarily to program OVS (or
> other OpenFlow based datapath) to implement SDN. OVS OpenFlow is a
> data-driven approach (as mentioned by Ben in several talks). The advantage
> is that it uses caches to accelerate datapath, regardless of the number of
> pipeline stages in the forwarding logic; and the disadvantage is of course
> when a packet has a cache miss, it will be slow. So I would think the
> direction of using eBPF/XDP is better to be within OVS itself, instead of
> adding an extra stage that cannot be cached within the OVS framework,
> because even if the extra stage is very fast, it is still extra.
>
> I would consider such an extra eBPF/XDP stage in OVN directly only for the
> cases that we know it is likely to miss the OVS/HW flow caches. One example
> may be DOS attacks that always trigger CT unestablished entries, which is
> not HW offload friendly. (But I don't have concrete use cases/scenarios)
>
> In the case of OVN LB, I don't see a reason why it would miss the 

Re: [ovs-dev] [RFC ovn 0/2] Basic eBPF/XDP support in OVN.

2022-06-08 Thread Han Zhou
On Wed, Jun 8, 2022 at 8:08 AM Numan Siddique  wrote:
>
> On Wed, Jun 8, 2022 at 6:34 AM 刘梦馨  wrote:
> >
> > Just give some input about eBPF/XDP support.
> >
> > We used to use OVN L2 LB to replace kube-proxy in Kubernetes, but found
> > that
> > the L2 LB will use conntrack and ovs clone which hurts performance
badly.
> > The latency
> > for 1byte udp packet jumps from 18.5us to 25.7us and bandwidth drop from
> > 6Mb/s to 2.8Mb/s.
> >
Thanks for the input!
Could you tell roughly how many packets were sent in a single test? Was the
latency measured for all the UDP packets in average? I am asking because if
the packets hit mega flows in the kernel cache, it shouldn't be slower than
kube-proxy which also uses conntrack. If it is HW offloaded it should be
faster.

> > Even if the traffic does not target to LB VIPs has the same performance
> > drop and it also leads to the
> > total datapath cannot be offloaded to hardware.
> >

Was it clear why the total datapath cannot be offloaded to HW? There might
be problems of supporting HW offloading in earlier version of OVN. There
have been improvements to make it more HW offload friendly.

> > And finally we turn to using Cilium's chaining mode to replace the OVN
L2
> > LB to implement kube-proxy to
> > resolve the above issues. We hope to see the lb optimization by
eBPF/XDP on
> > the OVN side.
> >
>
> Thanks for your comments and inputs.   I think we should definitely
> explore optimizing this use case
> and see if its possible to leverage eBPF/XDP for this.
>

I am sorry that I am confused by OVN "L2" LB. I think you might mean OVN
"L3/L4" LB?

Some general thoughts on this is, OVN is primarily to program OVS (or other
OpenFlow based datapath) to implement SDN. OVS OpenFlow is a data-driven
approach (as mentioned by Ben in several talks). The advantage is that it
uses caches to accelerate datapath, regardless of the number of pipeline
stages in the forwarding logic; and the disadvantage is of course when a
packet has a cache miss, it will be slow. So I would think the direction of
using eBPF/XDP is better to be within OVS itself, instead of adding an
extra stage that cannot be cached within the OVS framework, because even if
the extra stage is very fast, it is still extra.

I would consider such an extra eBPF/XDP stage in OVN directly only for the
cases that we know it is likely to miss the OVS/HW flow caches. One example
may be DOS attacks that always trigger CT unestablished entries, which is
not HW offload friendly. (But I don't have concrete use cases/scenarios)

In the case of OVN LB, I don't see a reason why it would miss the cache
except for the first packets. Adding an extra eBPF/XDP stage on top of the
OVS cached pipeline doesn't seem to improve the performance.

> > On Wed, 8 Jun 2022 at 14:43, Han Zhou  wrote:
> >
> > > On Mon, May 30, 2022 at 5:46 PM  wrote:
> > > >
> > > > From: Numan Siddique 
> > > >
> > > > XDP program - ovn_xdp.c added in this RFC patch  series implements
basic
> > > port
> > > > security and drops any packet if the port security check fails.
> > > > There are still few TODOs in the port security checks. Like
> > > >   - Make ovn xdp configurable.
> > > >   - Removing the ingress Openflow rules from table 73 and 74 if
ovn
> > > xdp
> > > > is enabled.
> > > >   - Add IPv6 support.
> > > >   - Enhance the port security xdp program for ARP/IPv6 ND
checks.
> > > >
> > > > This patch adds a basic XDP support in OVN and in future we can
> > > > leverage eBPF/XDP features.
> > > >
> > > > I'm not sure how much value this RFC patch adds to make use of
eBPF/XDP
> > > > just for port security.  Submitting as RFC to get some feedback and
> > > > start some conversation on eBPF/XDP in OVN.
> > > >
> > > Hi Numan,
> > >
> > > This is really cool. It demonstrates how OVN could leverage eBPF/XDP.
> > >
> > > On the other hand, for the port-security feature in XDP, I keep
thinking
> > > about the scenarios and it is still not very clear to me. One
advantage I
> > > can think of is to prevent DOS attacks from VM/Pod when invalid
IP/MAC are
> > > used, XDP may perform better and drop packets with lower CPU cost
> > > (comparing with OVS kernel datapath). However, I am also wondering why
> > > would a attacker use invalid IP/MAC for DOS attacks? Do you have some
more
> > > thoughts about the use cases?
>
> My idea was to demonstrate the use of eBPF/XDP and port security
> checks were easy to do
> before the packet hits the OVS pipeline.
>
Understand. It is indeed a great demonstration.

> If we were to move the port security check to XDP, then the only
> advantage we would be getting
> in my opinion is to remove the corresponding ingress port security
> check related OF rules from ovs-vswitchd, thereby decreasing some
> looks up during
> flow translation.
>
For slow path, it might reduce the lookups in two tables, but considering
that we have tens of tables, this cost may be negligible?
For fast path, there is no impact on 

Re: [ovs-dev] [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-06-08 Thread Eli Britstein via dev
Hi Ivan,

>-Original Message-
>From: Ivan Malov 
>Sent: Wednesday, June 8, 2022 5:46 PM
>To: Eli Britstein 
>Cc: d...@openvswitch.org; Andrew Rybchenko
>; Ilya Maximets ;
>Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
>; Stephen Hemminger
>; David Marchand
>; Gaetan Rivet ; Maxime
>Coquelin 
>Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
>mechanism
>
>External email: Use caution opening links or attachments
>
>
>Hi Eli,
>
>On Wed, 8 Jun 2022, Eli Britstein wrote:
>
>> Hi Ivan,
>>
>>> -Original Message-
>>> From: Ivan Malov 
>>> Sent: Tuesday, June 7, 2022 11:56 PM
>>> To: Eli Britstein 
>>> Cc: d...@openvswitch.org; Andrew Rybchenko
>>> ; Ilya Maximets ;
>>> Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
>>> ; Stephen Hemminger
>>> ; David Marchand
>>> ; Gaetan Rivet ; Maxime
>>> Coquelin 
>>> Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
>>> mechanism
>>>
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> Hi Eli,
>>>
>>> On Wed, 1 Jun 2022, Eli Britstein wrote:
>>>
 - Missing proper handling of the testpmd syntax logging. It changes
 the used
>>> port according to "transfer", but the log still uses
>netdev_dpdk_get_port_id().
>>>
>>> Thanks for noticing. I will see to it in the next version.
>>>
 - The usage of the "proxy" port for rte-flow implies that this proxy
 port is
>>> attached to OVS, otherwise it is not "started" and creation of flows will
>fail.
>>>
>>> That's the way it is. If there is no proxy for a given port, then the
>>> original port value will be used for managing flows. For vendors that
>>> don't need the proxy, this will work. For others, it won't. That's OK.
>
>> I don't really understand why this can't be done inside dpdk domain (if there
>is a proxy, and it is up, use it, otherwise don't).
>> That's *currently* the way it is. I understand that if dpdk works like this 
>> OVS
>should align, but maybe you or someone else here knows why dpdk works like
>this? (not too late to change, this is experimental...).
>
>
>Regardless of DPDK, on some NICs, it is possible to insert rules via
>unprivileged PFs or VFs, but there are also NICs which cannot do it.
>
>In DPDK, this contradiction has to be resolved somehow.
>In example, for NICs that can only manage flows via privileged ports, two
>possible solutions exist:
>
>1. Route flow management requests from unprivileged ethdevs
>to the privileged one implicitly, inside the PMD. This
>is transparent to users, but, at the same time, it is
>tricky because the application does not realise that
>flows it manages via an ethdev "B" are in fact
>communicated to the NIC via an ethdev "A".
>
>Unbeknownst of the implicit scheme, the application may
>detach the privileged ethdev "A" in-between. And, when
>time comes to remove flows, doing so via ethdev "B"
>will fail. This scheme breaks in-app housekeeping.
>
>2. Expose the "proxy" port existence to the application.
>If it knows the truth about the real ethdev that
>handles the transfer flows, it won't attempt to
>detach it in-between. The housekeeping is fine.
>
>Outing the existence of the "proxy" port to users seems like the most
>reasonable approach. This is why it was implemented in DPDK like this.
>Currently, it's indeed an experimental feature. DPDK PMDs which need it, are
>supposed to switch to it during the transition phase.
Thanks very much for the explanation, though IMHO relevant PMDs could still 
hide it and not do this "outing" of their internals.
>
>However, I should stress out that to NICs that support managing transfer
>flows on any PFs and VFs, this proxy scheme is a don't care. The
>corresponding drivers may not implement the proxy query method at all:
>
>https://github.com/DPDK/dpdk/blob/main/lib/ethdev/rte_flow.c#L1345
>
>The generic part of the API will just return the original port ID to the
>application.
Yes, I saw that. Thanks.
>
>
>>>

> -Original Message-
> From: Ivan Malov 
> Sent: Monday, May 30, 2022 5:16 PM
> To: d...@openvswitch.org
> Cc: Andrew Rybchenko ; Ilya
>Maximets
> ; Ori Kam ; Eli Britstein
> ; NBU-Contact-Thomas Monjalon (EXTERNAL)
> ; Stephen Hemminger
> ; David Marchand
> ; Gaetan Rivet ;
>Maxime
> Coquelin 
> Subject: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
> mechanism
>
> External email: Use caution opening links or attachments
>
>
> Manage "transfer" flows via the corresponding mechanism.
> Doing so requires that the 

Re: [ovs-dev] [PATCH ovn v14] Implement RARP activation strategy for ports

2022-06-08 Thread Ihar Hrachyshka
Sorry, should have included changelog for this revision.

===

v14: introduce clear_tracked_data for activated_ports
v14: don't modify activated_ports node data inside handler
v14: remove unnecessary I-P input dependency
v14: make get_activated_ports track if any new entries were added to a
list to avoid unnecessary I-P pflow node activation.
v14: fixed a bug in _run handler for activated_ports where it marked
as UNCHANGED when activated_ports list was not empty.

On Tue, Jun 7, 2022 at 10:06 PM Ihar Hrachyshka  wrote:
>
> When options:activation-strategy is set to "rarp" for LSP, when used in
> combination with multiple chassis names listed in
> options:requested-chassis, additional chassis will install special flows
> that would block all ingress and egress traffic for the port until a
> special activation event happens.
>
> For "rarp" strategy, an observation of a RARP packet sent from the port
> on the additional chassis is such an event. When it occurs, a special
> flow passes control to a controller() action handler that eventually
> removes the installed blocking flows and also marks the port as
> options:additional-chassis-activated in southbound db.
>
> This feature is useful in live migration scenarios where it's not
> advisable to unlock the destination port location prematurely to avoid
> duplicate packets originating from the port.
>
> Signed-off-by: Ihar Hrachyshka 
> ---
>  NEWS|   2 +
>  controller/lport.c  |  22 +++
>  controller/lport.h  |   3 +
>  controller/ovn-controller.c |  97 ++
>  controller/physical.c   |  94 ++
>  controller/pinctrl.c| 155 ++-
>  controller/pinctrl.h|  13 ++
>  include/ovn/actions.h   |   3 +
>  northd/northd.c |  10 +
>  northd/ovn-northd.c |   5 +-
>  ovn-nb.xml  |  11 ++
>  ovn-sb.xml  |  15 ++
>  tests/ovn.at| 365 
>  13 files changed, 792 insertions(+), 3 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 2ee283a56..7c54670ed 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -29,6 +29,8 @@ OVN v22.06.0 - XX XXX 
>- Added support for setting the Next server IP in the DHCP header
>  using the private DHCP option - 253 in native OVN DHCPv4 responder.
>- Support list of chassis for 
> Logical_Switch_Port:options:requested-chassis.
> +  - Support Logical_Switch_Port:options:activation-strategy for live 
> migration
> +scenarios.
>
>  OVN v22.03.0 - 11 Mar 2022
>  --
> diff --git a/controller/lport.c b/controller/lport.c
> index bf55d83f2..add7e91aa 100644
> --- a/controller/lport.c
> +++ b/controller/lport.c
> @@ -197,3 +197,25 @@ get_peer_lport(const struct sbrec_port_binding *pb,
>  peer_name);
>  return (peer && peer->datapath) ? peer : NULL;
>  }
> +
> +bool
> +lport_is_activated_by_activation_strategy(const struct sbrec_port_binding *pb,
> +                                          const struct sbrec_chassis *chassis)
> +{
> +    const char *activated_chassis = smap_get(&pb->options,
> +                                             "additional-chassis-activated");
> +    if (activated_chassis) {
> +        char *save_ptr;
> +        char *tokstr = xstrdup(activated_chassis);
> +        for (const char *chassis_name = strtok_r(tokstr, ",", &save_ptr);
> +             chassis_name != NULL;
> +             chassis_name = strtok_r(NULL, ",", &save_ptr)) {
> +            if (!strcmp(chassis_name, chassis->name)) {
> +                free(tokstr);
> +                return true;
> +            }
> +        }
> +        free(tokstr);
> +    }
> +    return false;
> +}
> diff --git a/controller/lport.h b/controller/lport.h
> index 115881655..644c67255 100644
> --- a/controller/lport.h
> +++ b/controller/lport.h
> @@ -70,4 +70,7 @@ const struct sbrec_port_binding *lport_get_peer(
>  const struct sbrec_port_binding *lport_get_l3gw_peer(
>  const struct sbrec_port_binding *,
>  struct ovsdb_idl_index *sbrec_port_binding_by_name);
> +bool
> +lport_is_activated_by_activation_strategy(const struct sbrec_port_binding *pb,
> +                                          const struct sbrec_chassis *chassis);
>  #endif /* controller/lport.h */
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index 2793c8687..78f58e312 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -1417,6 +1417,100 @@ en_runtime_data_run(struct engine_node *node, void 
> *data)
>  engine_set_node_state(node, EN_UPDATED);
>  }
>
> +struct ed_type_activated_ports {
> +struct ovs_list *activated_ports;
> +};
> +
> +static void *
> +en_activated_ports_init(struct engine_node *node OVS_UNUSED,
> +struct engine_arg *arg OVS_UNUSED)
> +{
> +struct ed_type_activated_ports *data = xzalloc(sizeof *data);
> +data->activated_ports = NULL;
> +return data;

Re: [ovs-dev] [PATCH v1 1/1] datapath-windows: Alg support for ftp and tftp in conntrack

2022-06-08 Thread 0-day Robot
References:  <20220608150339.87756-1-svc.ovs-commun...@vmware.com>
 

Bleep bloop.  Greetings ldejing, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 80 characters long (recommended limit is 79)
#87 FILE: Documentation/intro/install/windows.rst:875:
   > ovs-ofctl add-flow br-test "table=1,priority=1, ct_state=-new+trk+est+rel,\

WARNING: Line is 80 characters long (recommended limit is 79)
#89 FILE: Documentation/intro/install/windows.rst:877:
   > ovs-ofctl add-flow br-test "table=2,priority=1,ip6,ipv6_dst=$Vif38Address,\

WARNING: Line is 80 characters long (recommended limit is 79)
#91 FILE: Documentation/intro/install/windows.rst:879:
   > ovs-ofctl add-flow br-test "table=2,priority=1,ip6,ipv6_dst=$Vif40Address,\

WARNING: Line is 80 characters long (recommended limit is 79)
#160 FILE: Documentation/intro/install/windows.rst:917:
 ipv6_dst=$NatAddress,$Protocol,action=ct(table=2,commit,nat(dst=$Vif42Ip),\

WARNING: Line is 80 characters long (recommended limit is 79)
#205 FILE: Documentation/intro/install/windows.rst:962:
   > ovs-ofctl add-flow br-test "table=2,priority=1,ip6,ipv6_dst=$Vif38Address,\

WARNING: Line is 80 characters long (recommended limit is 79)
#207 FILE: Documentation/intro/install/windows.rst:964:
   > ovs-ofctl add-flow br-test "table=2,priority=1,ip6,ipv6_dst=$Vif40Address,\

WARNING: Line is 80 characters long (recommended limit is 79)
#229 FILE: Documentation/intro/install/windows.rst:986:
   > ovs-ofctl add-flow br-test "table=0,priority=2,ipv6,dl_dst=$NatMacAddress,\

Lines checked: 805, Warnings: 7, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot


Re: [ovs-dev] [RFC ovn 0/2] Basic eBPF/XDP support in OVN.

2022-06-08 Thread Numan Siddique
On Wed, Jun 8, 2022 at 6:34 AM 刘梦馨  wrote:
>
> Just give some input about eBPF/XDP support.
>
> We used to use OVN L2 LB to replace kube-proxy in Kubernetes, but found
> that
> the L2 LB will use conntrack and ovs clone which hurts performance badly.
> The latency
> for 1byte udp packet jumps from 18.5us to 25.7us and bandwidth drop from
> 6Mb/s to 2.8Mb/s.
>
> Even if the traffic does not target to LB VIPs has the same performance
> drop and it also leads to the
> total datapath cannot be offloaded to hardware.
>
> And finally we turn to using Cilium's chaining mode to replace the OVN L2
> LB to implement kube-proxy to
> resolve the above issues. We hope to see the lb optimization by eBPF/XDP on
> the OVN side.
>

Thanks for your comments and inputs.  I think we should definitely
explore optimizing this use case
and see if it's possible to leverage eBPF/XDP for this.

> On Wed, 8 Jun 2022 at 14:43, Han Zhou  wrote:
>
> > On Mon, May 30, 2022 at 5:46 PM  wrote:
> > >
> > > From: Numan Siddique 
> > >
> > > XDP program - ovn_xdp.c added in this RFC patch  series implements basic
> > port
> > > security and drops any packet if the port security check fails.
> > > There are still few TODOs in the port security checks. Like
> > >   - Make ovn xdp configurable.
> > >   - Removing the ingress Openflow rules from table 73 and 74 if ovn
> > xdp
> > > is enabled.
> > >   - Add IPv6 support.
> > >   - Enhance the port security xdp program for ARP/IPv6 ND checks.
> > >
> > > This patch adds a basic XDP support in OVN and in future we can
> > > leverage eBPF/XDP features.
> > >
> > > I'm not sure how much value this RFC patch adds to make use of eBPF/XDP
> > > just for port security.  Submitting as RFC to get some feedback and
> > > start some conversation on eBPF/XDP in OVN.
> > >
> > Hi Numan,
> >
> > This is really cool. It demonstrates how OVN could leverage eBPF/XDP.
> >
> > On the other hand, for the port-security feature in XDP, I keep thinking
> > about the scenarios and it is still not very clear to me. One advantage I
> > can think of is to prevent DOS attacks from VM/Pod when invalid IP/MAC are
> > used, XDP may perform better and drop packets with lower CPU cost
> > (comparing with OVS kernel datapath). However, I am also wondering why
> > would a attacker use invalid IP/MAC for DOS attacks? Do you have some more
> > thoughts about the use cases?

My idea was to demonstrate the use of eBPF/XDP, and port security
checks were easy to do
before the packet hits the OVS pipeline.

If we were to move the port security check to XDP, then the only
advantage we would be getting,
in my opinion, is removing the corresponding ingress port-security-check
related OF rules from ovs-vswitchd, thereby reducing some
lookups during
flow translation.

I'm not sure why an attacker would use invalid IP/MAC for DOS attacks.
But from what I know, ovn-kubernetes does want to restrict each Pod to
its assigned IP/MAC.
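
Roughly what such a check looks like, as a minimal sketch rather than the
actual ovn_xdp.c from the series (the allowed_src_macs map, its layout and
the mac_key struct are assumptions; a real program would also cover the
IP/ARP/ND checks mentioned above):

#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <bpf/bpf_helpers.h>

struct mac_key {
    unsigned char addr[ETH_ALEN];    /* source MAC allowed on this port */
};

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 16);
    __type(key, struct mac_key);
    __type(value, __u8);
} allowed_src_macs SEC(".maps");

SEC("xdp")
int xdp_port_security(struct xdp_md *ctx)
{
    void *data = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;
    struct ethhdr *eth = data;
    struct mac_key key;

    if ((void *)(eth + 1) > data_end) {
        return XDP_DROP;              /* truncated Ethernet header */
    }
    __builtin_memcpy(key.addr, eth->h_source, ETH_ALEN);
    if (!bpf_map_lookup_elem(&allowed_src_macs, &key)) {
        return XDP_DROP;              /* source MAC not bound to this port */
    }
    return XDP_PASS;                  /* hand the packet on to the OVS datapath */
}

char _license[] SEC("license") = "GPL";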

> > And do you have any performance results
> > comparing with the current OVS implementation?

I didn't do any scale/performance related tests.

If we were to move the port security feature to XDP in OVN, then I think we need to:
   - Complete the TODOs, like adding IPv6 and ARP/ND related checks
   - Do some scale testing and see whether it reduces the memory
footprint of ovs-vswitchd and ovn-controller because of the reduction
in OF rules

> >
> > Another question is, would it work with smart NIC HW-offload, where VF
> > representer ports are added to OVS on the smart NIC? I guess XDP doesn't
> > support representer port, right?

I think so. I don't have much experience/knowledge on this.  From what
I understand, if datapath flows are offloaded, and since XDP is not
offloaded, the XDP checks will be totally missed.
So if XDP is to be used, then offloading should be disabled.

Thanks
Numan

> >
> > Thanks,
> > Han
> >
> > > In order to attach and detach xdp programs,  libxdp [1] and libbpf is
> > used.
> > >
> > > To test it out locally, please install libxdp-devel and libbpf-devel
> > > and the compile OVN first and then compile ovn_xdp by running "make
> > > bpf".  Copy ovn_xdp.o to either /usr/share/ovn/ or /usr/local/share/ovn/
> > >
> > >
> > > Numan Siddique (2):
> > >   RFC: Add basic xdp/eBPF support in OVN.
> > >   RFC: ovn-controller: Attach XDP progs to the VIFs of the logical
> > > ports.
> > >
> > >  Makefile.am |   6 +-
> > >  bpf/.gitignore  |   5 +
> > >  bpf/automake.mk |  23 +++
> > >  bpf/ovn_xdp.c   | 156 +++
> > >  configure.ac|   2 +
> > >  controller/automake.mk  |   4 +-
> > >  controller/binding.c|  45 +++--
> > >  controller/binding.h|   7 +
> > >  controller/ovn-controller.c |  79 +++-
> > >  controller/xdp.c| 389 
> > >  controller/xdp.h|  41 
> > >  m4/ovn.m4   |  20 ++
> > >  

Re: [ovs-dev] [PATCH RFC ovn] nb: Remove possibility of disabling logical datapath groups.

2022-06-08 Thread Numan Siddique
On Wed, Jun 8, 2022 at 1:19 AM Han Zhou  wrote:
>
> On Tue, Jun 7, 2022 at 8:20 AM Dumitru Ceara  wrote:
> >
> > In large scale scenarios this option hugely reduces the size of the
> > Southbound database positively affecting end to end performance.  In
> > such scenarios there's no real reason to ever disable datapath groups.
> >
> > In lower scale scenarios any potential overhead due to logical datapath
> > groups is, very likely, negligible.
> >
> > Aside from potential scalability concerns, the
> > NB.NB_Global.options:use_logical_dp_group knob was kept until now to
> > ensure that in case of a bug in the logical datapath groups code a CMS
> > may turn it off and fall back to the mode in which logical flows are not
> > grouped together.  As far as I know, this has never happened until now.
> >
> > Moreover, datapath groups are enabled by default since v21.09.0 (4 stable
> > releases ago), via 90daa7ce18dc ("northd: Enable logical dp groups by
> > default.").
> >
> > From a testing perspective removing this knob will halve the CI matrix.
> > This is desirable, especially in the context of more tests being added,
> > e.g.:
> >
> >
> https://patchwork.ozlabs.org/project/ovn/patch/20220531093318.2270409-1-mh...@redhat.com/
> >
> > Signed-off-by: Dumitru Ceara 
>
> Thanks Dumitru! I haven't reviewed the patch in detail, but the idea sounds
> good to me.
>

+1 from me to move this patch from RFC to a formal one.

Numan

> Han
>
> > ---
> >  NEWS|  1 +
> >  northd/northd.c | 77 +++--
> >  ovn-nb.xml  | 19 +++---
> >  tests/ovn-macros.at | 23 ++--
> >  tests/ovn-northd.at | 92 -
> >  tests/ovs-macros.at |  4 +-
> >  6 files changed, 41 insertions(+), 175 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index e015ae8e7..20b4c5d91 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -4,6 +4,7 @@ Post v22.06.0
> >  "ovn-encap-df_default" to enable or disable tunnel DF flag.
> >- Add option "localnet_learn_fdb" to LSP that will allow localnet
> >  ports to learn MAC addresses and store them in FDB table.
> > +  - Removed possibility of disabling logical datapath groups.
> >
> >  OVN v22.06.0 - XX XXX 
> >  --
> > diff --git a/northd/northd.c b/northd/northd.c
> > index 0207f6ce1..a97a321cd 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -4813,10 +4813,6 @@ ovn_lflow_equal(const struct ovn_lflow *a, const
> struct ovn_datapath *od,
> >  && nullable_string_is_equal(a->ctrl_meter, ctrl_meter));
> >  }
> >
> > -/* If this option is 'true' northd will combine logical flows that
> differ by
> > - * logical datapath only by creating a datapath group. */
> > -static bool use_logical_dp_groups = false;
> > -
> >  enum {
> >  STATE_NULL,   /* parallelization is off */
> >  STATE_INIT_HASH_SIZES,/* parallelization is on; hashes sizing
> needed */
> > @@ -4841,8 +4837,7 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct
> ovn_datapath *od,
> >  lflow->ctrl_meter = ctrl_meter;
> >  lflow->dpg = NULL;
> >  lflow->where = where;
> > -if ((parallelization_state != STATE_NULL)
> > -&& use_logical_dp_groups) {
> > +if (parallelization_state != STATE_NULL) {
> >  ovs_mutex_init(&lflow->odg_lock);
> >  }
> >  }
> > @@ -4852,7 +4847,7 @@ ovn_dp_group_add_with_reference(struct ovn_lflow
> *lflow_ref,
> >  struct ovn_datapath *od)
> >  OVS_NO_THREAD_SAFETY_ANALYSIS
> >  {
> > -if (!use_logical_dp_groups || !lflow_ref) {
> > +if (!lflow_ref) {
> >  return false;
> >  }
> >
> > @@ -4931,13 +4926,11 @@ do_ovn_lflow_add(struct hmap *lflow_map, struct
> ovn_datapath *od,
> >  struct ovn_lflow *old_lflow;
> >  struct ovn_lflow *lflow;
> >
> > -if (use_logical_dp_groups) {
> > -old_lflow = ovn_lflow_find(lflow_map, NULL, stage, priority,
> match,
> > -   actions, ctrl_meter, hash);
> > -if (old_lflow) {
> > -ovn_dp_group_add_with_reference(old_lflow, od);
> > -return old_lflow;
> > -}
> > +old_lflow = ovn_lflow_find(lflow_map, NULL, stage, priority, match,
> > +   actions, ctrl_meter, hash);
> > +if (old_lflow) {
> > +ovn_dp_group_add_with_reference(old_lflow, od);
> > +return old_lflow;
> >  }
> >
> >  lflow = xmalloc(sizeof *lflow);
> > @@ -4993,8 +4986,7 @@ ovn_lflow_add_at_with_hash(struct hmap *lflow_map,
> struct ovn_datapath *od,
> >  struct ovn_lflow *lflow;
> >
> >  ovs_assert(ovn_stage_to_datapath_type(stage) ==
> ovn_datapath_get_type(od));
> > -if (use_logical_dp_groups
> > -&& (parallelization_state == STATE_USE_PARALLELIZATION)) {
> > +if (parallelization_state == STATE_USE_PARALLELIZATION) {
> >  lflow = do_ovn_lflow_add_pd(lflow_map, 

[ovs-dev] [PATCH v1 1/1] datapath-windows: Alg support for ftp and tftp in conntrack

2022-06-08 Thread ldejing via dev
From: ldejing 

This patch mainly finishes some remaining work of the previous
patch (53b75e91) about IPv6 conntrack. TFTP with ALG mainly
parses the TFTP packet (IPv4/IPv6) and creates the related
connection. FTP with ALG mainly fixes some bugs in the
IPv4/IPv6 handling. Additionally, this patch includes some misc
work for IPv6 conntrack:
   1) Fix some bugs in the ICMPv6 error code handling when
  "ct_state=+rel" is used as a match field.
   2) Test flow action like ct(nat([20::1]:40))

Test cases:
1) ftp ipv4/ipv6 use alg field in the normal and nat scenario.
2) tftp ipv4/ipv6 use alg field in the normal and nat scenario.
3) icmpv6 error code scenario, including ICMP6_PACKET_TOO_BIG,
   ICMP6_DST_UNREACH,ICMP6_TIME_EXCEEDED,ICMP6_PARAM_PROB

Signed-off-by: ldejing 
---
 Documentation/intro/install/windows.rst | 187 ++--
 datapath-windows/automake.mk|   1 +
 datapath-windows/ovsext/Actions.c   |   1 -
 datapath-windows/ovsext/Conntrack-ftp.c | 109 ++--
 datapath-windows/ovsext/Conntrack-icmp.c|   6 +-
 datapath-windows/ovsext/Conntrack-related.c |  30 +++-
 datapath-windows/ovsext/Conntrack-tftp.c|  70 
 datapath-windows/ovsext/Conntrack.c |  81 +++--
 datapath-windows/ovsext/Conntrack.h |  19 +-
 datapath-windows/ovsext/ovsext.vcxproj  |   1 +
 include/windows/netinet/in.h|   1 +
 11 files changed, 373 insertions(+), 133 deletions(-)
 create mode 100644 datapath-windows/ovsext/Conntrack-tftp.c

diff --git a/Documentation/intro/install/windows.rst 
b/Documentation/intro/install/windows.rst
index 0a392d781..8759d1263 100644
--- a/Documentation/intro/install/windows.rst
+++ b/Documentation/intro/install/windows.rst
@@ -852,32 +852,33 @@ related state.
 
normal scenario
Vif38(20::1, ofport:2)->Vif40(20:2, ofport:3)
-   Vif38Name="podvif38"
-   Vif40Name="podvif40"
+   Vif38Name="podvif70"
+   Vif40Name="Ethernet1"
Vif38Port=2
-   Vif38Address="20::1"
-   Vif38MacAddressCli="00-15-5D-F0-01-0b"
+   Vif38Address="20::88"
Vif40Port=3
-   Vif40Address="20::2"
-   Vif40MacAddressCli="00-15-5D-F0-01-0C"
+   Vif40Address="20::45"
+   Vif40MacAddressCli="00-50-56-98-9d-97"
+   Vif38MacAddressCli="00-15-5D-F0-01-0B"
Protocol="tcp6"
-   > netsh int ipv6 set neighbors $Vif38Name $Vif40Address \
- $Vif40MacAddressCli
-   > netsh int ipv6 set neighbors $Vif40Name $Vif38Address \
- $Vif38MacAddressCli
-   > ovs-ofctl del-flows br-int --strict "table=0,priority=0"
-   > ovs-ofctl add-flow br-int "table=0,priority=1,$Protocol \
+   > netsh int ipv6 set neighbors $Vif38Name $Vif40Address $Vif40MacAddressCli
+   > netsh int ipv6 set neighbors $Vif42Name $Vif38Ip $Vif38MacAddressCli
+   > ovs-ofctl del-flows br-test --strict "table=0,priority=0"
+   > ovs-ofctl add-flow br-test "table=0,priority=1,$Protocol
  actions=ct(table=1)"
-   > ovs-ofctl add-flow br-int "table=1,priority=1,ct_state=+new+trk-est, \
+   > ovs-ofctl add-flow br-test "table=1,priority=1,tp_dst=21, $Protocol,\
+ actions=ct(commit,table=2,alg=ftp)"
+   > ovs-ofctl add-flow br-test "table=1,priority=1,tp_src=21, $Protocol,\
+ actions=ct(commit,table=2,alg=ftp)"
+   > ovs-ofctl add-flow br-test "table=1,priority=1, ct_state=+new+trk+rel,\
  $Protocol,actions=ct(commit,table=2)"
-   > ovs-ofctl add-flow br-int "table=1,priority=1, \
- ct_state=-new+trk+est-rel, $Protocol,actions=ct(commit,table=2)"
-   > ovs-ofctl add-flow br-int "table=1,priority=1, \
- ct_state=-new+trk+est+rel, $Protocol,actions=ct(commit,table=2)"
-   > ovs-ofctl add-flow br-int "table=2,priority=1,ip6, \
- ipv6_dst=$Vif38Address,$Protocol,actions=output:$Vif38Port"
-   > ovs-ofctl add-flow br-int "table=2,priority=1,ip6, \
- ipv6_dst=$Vif40Address,$Protocol,actions=output:$Vif40Port"
+   > ovs-ofctl add-flow br-test "table=1,priority=1, 
ct_state=-new+trk+est+rel,\
+ $Protocol,actions=ct(commit,table=2)"
+   > ovs-ofctl add-flow br-test 
"table=2,priority=1,ip6,ipv6_dst=$Vif38Address,\
+ $Protocol,actions=output:$Vif38Port"
+   > ovs-ofctl add-flow br-test 
"table=2,priority=1,ip6,ipv6_dst=$Vif40Address,\
+ $Protocol,actions=output:$Vif40Port"
+
 
 ::
 
@@ -885,45 +886,127 @@ related state.
Vif38(20::1, ofport:2) -> nat address(20::9) -> Vif42(21::3, ofport:4)
Due to not construct flow to return neighbor mac address, we set the
neighbor mac address manually
+   Vif38Name="podvif70"
+   Vif42Name="Ethernet1"
+   Vif38Ip="20::88"
Vif38Port=2
-   Vif42Port=4
-   Vif38Name="podvif38"
-   Vif42Name="podvif42"
+   Vif42Port=3
NatAddress="20::9"
NatMacAddress="aa:bb:cc:dd:ee:ff"
NatMacAddressForCli="aa-bb-cc-dd-ee-ff"
Vif42Ip="21::3"
-   Vif38MacAddress="00:15:5D:F0:01:0B"
-   Vif42MacAddress="00:15:5D:F0:01:0D"
+   Vif38MacAddress="00:15:5D:F0:01:14"
+   Vif38MacAddressCli="00-15-5D-F0-01-14"
+   Vif42MacAddress="00:50:56:98:9d:97"
Protocol="tcp6"
-   > netsh int ipv6 set neighbors 

Re: [ovs-dev] [PATCH net-next] net: rename reference+tracking helpers

2022-06-08 Thread Jiri Pirko
Wed, Jun 08, 2022 at 06:39:55AM CEST, k...@kernel.org wrote:
>Netdev reference helpers have a dev_ prefix for historic
>reasons. Renaming the old helpers would be too much churn

Hmm, I think it would be great to eventually rename the rest too in
order to maintain unique prefix for netdev things. Why do you think the
"churn" would be an issue?


>but we can rename the tracking ones which are relatively
>recent and should be the default for new code.
>
>Rename:
> dev_hold_track()-> netdev_hold()
> dev_put_track() -> netdev_put()
> dev_replace_track() -> netdev_ref_replace()

[...]


>diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
>index 817577e713d7..815738c0e067 100644
>--- a/drivers/net/macsec.c
>+++ b/drivers/net/macsec.c
>@@ -3462,7 +3462,7 @@ static int macsec_dev_init(struct net_device *dev)
>   memcpy(dev->broadcast, real_dev->broadcast, dev->addr_len);
> 
>   /* Get macsec's reference to real_dev */
>-  dev_hold_track(real_dev, >dev_tracker, GFP_KERNEL);
>+  netdev_hold(real_dev, >dev_tracker, GFP_KERNEL);

So we later decide to rename dev_hold() to obey the netdev_*() naming
scheme, we would have collision. Also, seems to me odd to have:
OLDPREFIX_x()
and
NEWPREFIX_x()
to be different functions.

For the sake of not making naming mess, could we rather have:
netdev_hold_track()
or
netdev_hold_tr() if the prior is too long
?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH net-next] net: rename reference+tracking helpers

2022-06-08 Thread Jakub Kicinski
On Wed, 8 Jun 2022 10:27:15 +0200 Jiri Pirko wrote:
> Wed, Jun 08, 2022 at 06:39:55AM CEST, k...@kernel.org wrote:
> >Netdev reference helpers have a dev_ prefix for historic
> >reasons. Renaming the old helpers would be too much churn  
> 
> Hmm, I think it would be great to eventually rename the rest too in
> order to maintain unique prefix for netdev things. Why do you think the
> "churn" would be an issue?

Felt like we're better off moving everyone to the new tracking helpers
than doing just a pure rename. But I'm not opposed to a pure rename.

> >diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
> >index 817577e713d7..815738c0e067 100644
> >--- a/drivers/net/macsec.c
> >+++ b/drivers/net/macsec.c
> >@@ -3462,7 +3462,7 @@ static int macsec_dev_init(struct net_device *dev)
> > memcpy(dev->broadcast, real_dev->broadcast, dev->addr_len);
> > 
> > /* Get macsec's reference to real_dev */
> >-dev_hold_track(real_dev, >dev_tracker, GFP_KERNEL);
> >+netdev_hold(real_dev, >dev_tracker, GFP_KERNEL);  
> 
> So we later decide to rename dev_hold() to obey the netdev_*() naming
> scheme, we would have collision.

dev_hold() should not be used in new code, we should use tracking
everywhere. Given that we can name the old helpers __netdev_hold().

> Also, seems to me odd to have:
> OLDPREFIX_x()
> and
> NEWPREFIX_x()
> to be different functions.
> 
> For the sake of not making naming mess, could we rather have:
> netdev_hold_track()
> or
> netdev_hold_tr() if the prior is too long
> ?

See above, one day non-track version should be removed.
IMO to encourage use of the track-capable API we could keep their names
short and call the legacy functions __netdev_hold() as I mentioned or
maybe netdev_hold_notrack().
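
For illustration, a rough sketch of how the renamed, track-capable
helpers read in a driver (the struct and function names below are made
up, not taken from the patch):

    #include <linux/netdevice.h>

    struct foo_priv {
        struct net_device *real_dev;
        netdevice_tracker dev_tracker;
    };

    static void foo_attach(struct foo_priv *priv, struct net_device *real_dev)
    {
        priv->real_dev = real_dev;
        /* Renamed from dev_hold_track(); the tracker lets the kernel
         * point at the exact holder if references are leaked. */
        netdev_hold(real_dev, &priv->dev_tracker, GFP_KERNEL);
    }

    static void foo_detach(struct foo_priv *priv)
    {
        /* Renamed from dev_put_track(); must pair with the hold above
         * using the same tracker. */
        netdev_put(priv->real_dev, &priv->dev_tracker);
    }

The legacy, untracked dev_hold()/dev_put() pair keeps working for
existing code, which is what the __netdev_hold()/netdev_hold_notrack()
naming discussion above is about.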
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-06-08 Thread Ivan Malov

Hi Eli,

On Wed, 8 Jun 2022, Eli Britstein wrote:


Hi Ivan,


-Original Message-
From: Ivan Malov 
Sent: Tuesday, June 7, 2022 11:56 PM
To: Eli Britstein 
Cc: d...@openvswitch.org; Andrew Rybchenko
; Ilya Maximets ;
Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Hi Eli,

On Wed, 1 Jun 2022, Eli Britstein wrote:


- Missing proper handling of the testpmd syntax logging. It changes the used

port according to "transfer", but the log still uses netdev_dpdk_get_port_id().

Thanks for noticing. I will see to it in the next version.


- The usage of the "proxy" port for rte-flow implies that this proxy port is

attached to OVS, otherwise it is not "started" and creation of flows will fail.

That's the way it is. If there is no proxy for a given port, then the original 
port
value will be used for managing flows. For vendors that don't need the proxy,
this will work. For others, it won't. That's OK.



I don't really understand why this can't be done inside dpdk domain (if there 
is a proxy, and it is up, use it, otherwise don't).
That's *currently* the way it is. I understand that if dpdk works like this OVS 
should align, but maybe you or someone else here knows why dpdk works like 
this? (not too late to change, this is experimental...).



Regardless of DPDK, on some NICs, it is possible to insert rules via
unprivileged PFs or VFs, but there are also NICs which cannot do it.

In DPDK, this contradiction has to be resolved somehow.
In example, for NICs that can only manage flows via
privileged ports, two possible solutions exist:

1. Route flow management requests from unprivileged ethdevs
   to the privileged one implicitly, inside the PMD. This
   is transparent to users, but, at the same time, it is
   tricky because the application does not realise that
   flows it manages via an ethdev "B" are in fact
   communicated to the NIC via an ethdev "A".

   Unaware of the implicit scheme, the application may
   detach the privileged ethdev "A" in between. And, when
   the time comes to remove flows, doing so via ethdev "B"
   will fail. This scheme breaks in-app housekeeping.

2. Expose the "proxy" port existence to the application.
   If it knows the truth about the real ethdev that
   handles the transfer flows, it won't attempt to
   detach it in-between. The housekeeping is fine.

Outing the existence of the "proxy" port to users seems
like the most reasonable approach. This is why it was
implemented in DPDK like this. Currently, it's indeed
an experimental feature. DPDK PMDs which need it, are
supposed to switch to it during the transition phase.

However, I should stress out that to NICs that support
managing transfer flows on any PFs and VFs, this proxy
scheme is a don't care. The corresponding drivers may
not implement the proxy query method at all:

https://github.com/DPDK/dpdk/blob/main/lib/ethdev/rte_flow.c#L1345

The generic part of the API will just return
the original port ID to the application.
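
As a rough sketch of how an application consumes this (illustrative
helper names, not the OVS code from this patch; assumes a DPDK build
with the experimental flow API enabled):

    #include <rte_flow.h>

    /* Resolve the port to use for managing "transfer" flows. */
    static uint16_t
    pick_flow_proxy(uint16_t port_id)
    {
        uint16_t proxy_port_id;
        struct rte_flow_error error;

        if (rte_flow_pick_transfer_proxy(port_id, &proxy_port_id, &error) != 0) {
            /* The PMD does not indicate a proxy; assume the port itself
             * is privileged enough to manage its transfer flows. */
            return port_id;
        }
        return proxy_port_id;
    }

    static struct rte_flow *
    create_flow(uint16_t port_id, const struct rte_flow_attr *attr,
                const struct rte_flow_item pattern[],
                const struct rte_flow_action actions[],
                struct rte_flow_error *error)
    {
        /* Only transfer flows go via the proxy; non-transfer flows are
         * still created on the original port. */
        uint16_t target = attr->transfer ? pick_flow_proxy(port_id) : port_id;

        return rte_flow_create(target, attr, pattern, actions, error);
    }

The same proxy ID is then the one to use when destroying those flows,
which is why the patch threads the "transfer" flag down to
netdev_dpdk_rte_flow_destroy() as well.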







-Original Message-
From: Ivan Malov 
Sent: Monday, May 30, 2022 5:16 PM
To: d...@openvswitch.org
Cc: Andrew Rybchenko ; Ilya Maximets
; Ori Kam ; Eli Britstein
; NBU-Contact-Thomas Monjalon (EXTERNAL)
; Stephen Hemminger
; David Marchand
; Gaetan Rivet ; Maxime
Coquelin 
Subject: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
mechanism

External email: Use caution opening links or attachments


Manage "transfer" flows via the corresponding mechanism.
Doing so requires that the traffic source be specified explicitly,
via the corresponding pattern item.

Signed-off-by: Ivan Malov 
Acked-by: Andrew Rybchenko 
---
lib/netdev-dpdk.c | 73 ---
lib/netdev-dpdk.h |  2 +-
lib/netdev-offload-dpdk.c | 43 ++-
3 files changed, 103 insertions(+), 15 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
45e5d26d2..d0bf4613a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -420,6 +420,7 @@ enum dpdk_hw_ol_features {

struct netdev_dpdk {
PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE,

cacheline0,

+dpdk_port_t flow_transfer_proxy_port_id;
dpdk_port_t port_id;

/* If true, device was attached by rte_eth_dev_attach(). */
@@ -1115,6
+1116,23 @@ dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
  DPDK_PORT_ID_FMT, dev->port_id);
}
}
+
+static void
+dpdk_eth_dev_init_flow_transfer_proxy(struct netdev_dpdk *dev) {
+int ret;
+
+ret = rte_flow_pick_transfer_proxy(dev->port_id,
+   &dev->flow_transfer_proxy_port_id, NULL);
+if (ret == 0)
+return;
+
+/*
+ * The PMD does not indicate the proxy port.
+ * It is OK to assume the proxy is unneeded.
+ */
+

Re: [ovs-dev] [PATCH v5 ovn 1/2] Handle re-used pids in pidfile_is_running

2022-06-08 Thread 0-day Robot
Bleep bloop.  Greetings Terry Wilson, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: Line is 135 characters long (recommended limit is 79)
#28 FILE: utilities/ovn-ctl:46:
test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` && 
pid_exists "$pid" && [ -z $cmd -o pid_comm_check "$cmd" "$pid" ]

Lines checked: 34, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5 ovn 2/2] Ensure pid belongs to ovsdb-server in ovn-ctl

2022-06-08 Thread Terry Wilson
When checking if ovsdb-server is running, ensure that the binary
we are going to run matches the one actually running with the
pid that was in our pidfile.

Signed-off-by: Terry Wilson 
---
 utilities/ovn-ctl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
index 14d37a3d6..e2f05915b 100755
--- a/utilities/ovn-ctl
+++ b/utilities/ovn-ctl
@@ -200,7 +200,7 @@ start_ovsdb__() {
 ovn_install_dir "$ovn_etcdir"
 
 # Check and eventually start ovsdb-server for DB
-if pidfile_is_running $db_pid_file; then
+if pidfile_is_running $db_pid_file ovsdb-server; then
 return
 fi
 
-- 
2.34.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5 ovn 1/2] Handle re-used pids in pidfile_is_running

2022-06-08 Thread Terry Wilson
Since pids can be re-used, it is necessary to check that the
process that is running with a pid matches the one that we expect.

This adds the ability to optionally pass a 'binary' argument to
pidfile_is_running and, if it is passed, to match the binary against
/proc/$pid/exe.

Signed-off-by: Terry Wilson 
---
 utilities/ovn-ctl | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
index d733aa42d..14d37a3d6 100755
--- a/utilities/ovn-ctl
+++ b/utilities/ovn-ctl
@@ -42,7 +42,8 @@ ovn_ic_db_conf_file="$ovn_etcdir/ovn-ic-db-params.conf"
 
 pidfile_is_running () {
 pidfile=$1
-test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` && 
pid_exists "$pid"
+cmd=$2
+test -e "$pidfile" && [ -s "$pidfile" ] && pid=`cat "$pidfile"` && 
pid_exists "$pid" && [ -z $cmd -o pid_comm_check "$cmd" "$pid" ]
 } >/dev/null 2>&1
 
 stop_nb_ovsdb() {
-- 
2.34.3

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 7/7] dp-packet: Add _ol_ to functions using OL flags.

2022-06-08 Thread Maxime Coquelin




On 6/3/22 17:15, Mike Pattrick wrote:

From: Flavio Leitner 

This helps to identify when it is about the flags or
the packet itself.

Signed-off-by: Flavio Leitner 
Co-authored-by: Mike Pattrick 
Signed-off-by: Mike Pattrick 
---
  lib/conntrack.c   |  8 
  lib/dp-packet.c   |  2 +-
  lib/dp-packet.h   | 10 +-
  lib/ipf.c |  4 ++--
  lib/netdev-native-tnl.c   |  4 ++--
  lib/netdev-offload-dpdk.c |  2 +-
  lib/netdev.c  |  2 +-
  lib/packets.c |  2 +-
  8 files changed, 17 insertions(+), 17 deletions(-)



Acked-by: Maxime Coquelin 

Thanks,
Maxime

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [RFC ovn 0/2] Basic eBPF/XDP support in OVN.

2022-06-08 Thread 刘梦馨
Just give some input about eBPF/XDP support.

We used to use OVN L2 LB to replace kube-proxy in Kubernetes, but found
that the L2 LB uses conntrack and ovs clone, which hurts performance badly.
The latency for a 1-byte UDP packet jumps from 18.5us to 25.7us and the
bandwidth drops from 6Mb/s to 2.8Mb/s.

Even traffic that does not target the LB VIPs sees the same performance
drop, and it also means the total datapath cannot be offloaded to hardware.

And finally we turned to using Cilium's chaining mode to replace the OVN L2
LB as the kube-proxy implementation to resolve the above issues. We hope to
see LB optimization via eBPF/XDP on the OVN side.

On Wed, 8 Jun 2022 at 14:43, Han Zhou  wrote:

> On Mon, May 30, 2022 at 5:46 PM  wrote:
> >
> > From: Numan Siddique 
> >
> > XDP program - ovn_xdp.c added in this RFC patch  series implements basic
> port
> > security and drops any packet if the port security check fails.
> > There are still few TODOs in the port security checks. Like
> >   - Make ovn xdp configurable.
> >   - Removing the ingress Openflow rules from table 73 and 74 if ovn
> xdp
> > is enabled.
> >   - Add IPv6 support.
> >   - Enhance the port security xdp program for ARP/IPv6 ND checks.
> >
> > This patch adds a basic XDP support in OVN and in future we can
> > leverage eBPF/XDP features.
> >
> > I'm not sure how much value this RFC patch adds to make use of eBPF/XDP
> > just for port security.  Submitting as RFC to get some feedback and
> > start some conversation on eBPF/XDP in OVN.
> >
> Hi Numan,
>
> This is really cool. It demonstrates how OVN could leverage eBPF/XDP.
>
> On the other hand, for the port-security feature in XDP, I keep thinking
> about the scenarios and it is still not very clear to me. One advantage I
> can think of is to prevent DOS attacks from VM/Pod when invalid IP/MAC are
> used, XDP may perform better and drop packets with lower CPU cost
> (comparing with OVS kernel datapath). However, I am also wondering why
> would an attacker use invalid IP/MAC for DOS attacks? Do you have some more
> thoughts about the use cases? And do you have any performance results
> comparing with the current OVS implementation?
>
> Another question is, would it work with smart NIC HW-offload, where VF
> representer ports are added to OVS on the smart NIC? I guess XDP doesn't
> support representer port, right?
>
> Thanks,
> Han
>
> > In order to attach and detach xdp programs,  libxdp [1] and libbpf is
> used.
> >
> > To test it out locally, please install libxdp-devel and libbpf-devel
> > and the compile OVN first and then compile ovn_xdp by running "make
> > bpf".  Copy ovn_xdp.o to either /usr/share/ovn/ or /usr/local/share/ovn/
> >
> >
> > Numan Siddique (2):
> >   RFC: Add basic xdp/eBPF support in OVN.
> >   RFC: ovn-controller: Attach XDP progs to the VIFs of the logical
> > ports.
> >
> >  Makefile.am |   6 +-
> >  bpf/.gitignore  |   5 +
> >  bpf/automake.mk |  23 +++
> >  bpf/ovn_xdp.c   | 156 +++
> >  configure.ac|   2 +
> >  controller/automake.mk  |   4 +-
> >  controller/binding.c|  45 +++--
> >  controller/binding.h|   7 +
> >  controller/ovn-controller.c |  79 +++-
> >  controller/xdp.c| 389 
> >  controller/xdp.h|  41 
> >  m4/ovn.m4   |  20 ++
> >  tests/automake.mk   |   1 +
> >  13 files changed, 753 insertions(+), 25 deletions(-)
> >  create mode 100644 bpf/.gitignore
> >  create mode 100644 bpf/automake.mk
> >  create mode 100644 bpf/ovn_xdp.c
> >  create mode 100644 controller/xdp.c
> >  create mode 100644 controller/xdp.h
> >
> > --
> > 2.35.3
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>


-- 
刘梦馨
Blog: http://oilbeater.com
Weibo: @oilbeater 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v4] dpif-netdev: Allow cross-NUMA polling on selected ports

2022-06-08 Thread Anurag Agarwal
From: Jan Scheurich 

Today dpif-netdev considers PMD threads on a non-local NUMA node for automatic
assignment of the rxqs of a port only if there are no local, non-isolated PMDs.

On typical servers with both physical ports on one NUMA node, this often
leaves the PMDs on the other NUMA node under-utilized, wasting CPU resources.
The alternative, to manually pin the rxqs to PMDs on remote NUMA nodes, also
has drawbacks as it limits OVS' ability to auto load-balance the rxqs.

This patch introduces a new interface configuration option to allow ports to
be automatically polled by PMDs on any NUMA node:

ovs-vsctl set interface  other_config:cross-numa-polling=true

The group assignment algorithm now has the ability to select the lowest loaded
PMD on any NUMA node, and not just the local NUMA node on which the rxq of the
port resides.

If this option is not present or set to false, legacy behaviour applies.

Co-authored-by: Anurag Agarwal 
Signed-off-by: Jan Scheurich 
Signed-off-by: Anurag Agarwal 
---

Changes in this patch:
- Addressed comments from Kevin Traynor

Please refer this thread for an earlier discussion on this topic:
https://mail.openvswitch.org/pipermail/ovs-dev/2022-March/392310.html
---
 Documentation/topics/dpdk/pmd.rst |  23 ++
 lib/dpif-netdev.c | 130 ++
 tests/pmd.at  |  38 +
 vswitchd/vswitch.xml  |  20 +
 4 files changed, 177 insertions(+), 34 deletions(-)

diff --git a/Documentation/topics/dpdk/pmd.rst 
b/Documentation/topics/dpdk/pmd.rst
index b259cc8b3..387f962d1 100644
--- a/Documentation/topics/dpdk/pmd.rst
+++ b/Documentation/topics/dpdk/pmd.rst
@@ -99,6 +99,25 @@ core cycles for each Rx queue::
 
 $ ovs-appctl dpif-netdev/pmd-rxq-show
 
+Normally, Rx queues are assigned to PMD threads automatically.  By default
+OVS only assigns Rx queues to PMD threads executing on the same NUMA
+node in order to avoid unnecessary latency for accessing packet buffers
+across the NUMA boundary.  Typically this overhead is higher for vhostuser
+ports than for physical ports due to the packet copy that is done for all
+rx packets.
+
+On NUMA servers with physical ports only on one NUMA node, the NUMA-local
+polling policy can lead to an under-utilization of the PMD threads on the
+remote NUMA node.  For the overall OVS performance it may in such cases be
+beneficial to utilize the spare capacity and allow polling of a physical
+port's rxqs across NUMA nodes despite the overhead involved.
+The policy can be set per port with the following configuration option::
+
+$ ovs-vsctl set Interface  \
+other_config:cross-numa-polling=true|false
+
+The default value is false.
+
 .. note::
 
A history of one minute is recorded and shown for each Rx queue to allow for
@@ -115,6 +134,10 @@ core cycles for each Rx queue::
A ``overhead`` statistics is shown per PMD: it represents the number of
cycles inherently consumed by the OVS PMD processing loop.
 
+.. versionchanged:: 2.18.0
+
+   Added the interface parameter ``other_config:cross-numa-polling``
+
 Rx queue to PMD assignment takes place whenever there are configuration changes
 or can be triggered by using::
 
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index ff57b3961..86f88964b 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -467,6 +467,7 @@ struct dp_netdev_port {
 char *type; /* Port type as requested by user. */
 char *rxq_affinity_list;/* Requested affinity of rx queues. */
 enum txq_req_mode txq_requested_mode;
+bool cross_numa_polling;/* If true, cross-NUMA polling is enabled. */
 };
 
 static bool dp_netdev_flow_ref(struct dp_netdev_flow *);
@@ -2101,6 +2102,7 @@ port_create(const char *devname, const char *type,
 port->sf = NULL;
 port->emc_enabled = true;
 port->need_reconfigure = true;
+port->cross_numa_polling = false;
ovs_mutex_init(&port->txq_used_mutex);
 
 *portp = port;
@@ -5013,6 +5015,7 @@ dpif_netdev_port_set_config(struct dpif *dpif, odp_port_t 
port_no,
 bool emc_enabled = smap_get_bool(cfg, "emc-enable", true);
 const char *tx_steering_mode = smap_get(cfg, "tx-steering");
 enum txq_req_mode txq_mode;
+bool cross_numa_polling = smap_get_bool(cfg, "cross-numa-polling", false);
 
ovs_rwlock_wrlock(&dp->port_rwlock);
error = get_port_by_number(dp, port_no, &port);
@@ -5020,6 +5023,14 @@ dpif_netdev_port_set_config(struct dpif *dpif, 
odp_port_t port_no,
 goto unlock;
 }
 
+if (cross_numa_polling != port->cross_numa_polling) {
+port->cross_numa_polling = cross_numa_polling;
+VLOG_INFO("%s:cross-numa-polling has been %s.",
+  netdev_get_name(port->netdev),
+  cross_numa_polling? "enabled" : "disabled");
+dp_netdev_request_reconfigure(dp);
+}
+
 if (emc_enabled != port->emc_enabled) {
 struct dp_netdev_pmd_thread *pmd;
 struct ds ds = 

Re: [ovs-dev] [PATCH 6/7] dp-packet: Rename dp_packet_ol l4 functions.

2022-06-08 Thread Maxime Coquelin




On 6/3/22 17:15, Mike Pattrick wrote:

From: Flavio Leitner 

Rename to better represent their flags.

Signed-off-by: Flavio Leitner 
Co-authored-by: Mike Pattrick 
Signed-off-by: Mike Pattrick 
---
  lib/conntrack.c|  4 ++--
  lib/dp-packet.h| 14 +++---
  lib/ipf.c  |  6 +++---
  lib/netdev-linux.c | 14 +++---
  lib/netdev.c   | 16 +++-
  5 files changed, 26 insertions(+), 28 deletions(-)



Acked-by: Maxime Coquelin 

Thanks,
Maxime

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 5/7] dp-packet: Rename dp_packet_ol_tcp_seg

2022-06-08 Thread Maxime Coquelin




On 6/3/22 17:15, Mike Pattrick wrote:

From: Flavio Leitner 

Rename to dp_packet_ol_tcp_seg, because that is less
redundant and allows other protocols.

Signed-off-by: Flavio Leitner 
Co-authored-by: Mike Pattrick 
Signed-off-by: Mike Pattrick 
---
  lib/dp-packet.h| 2 +-
  lib/netdev-linux.c | 2 +-
  lib/netdev.c   | 4 ++--
  3 files changed, 4 insertions(+), 4 deletions(-)



Acked-by: Maxime Coquelin 

Thanks,
Maxime

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] dpif-netdev: Allow cross-NUMA polling on selected ports

2022-06-08 Thread Anurag Agarwal via dev
Hi Kevin,
 Thanks for your feedback. Please find my response inline. 

I will be uploading a new version v4 of the patch with comments addressed. 

Regards,
Anurag

> -Original Message-
> From: Kevin Traynor 
> Sent: Friday, June 3, 2022 9:36 PM
> To: Anurag Agarwal ; ovs-dev@openvswitch.org
> Cc: Jan Scheurich ; Anurag Agarwal
> 
> Subject: Re: [PATCH v3] dpif-netdev: Allow cross-NUMA polling on selected
> ports
> 
> Hi Anurag,
> 
> Thanks for submitting this. Some initial comments on the code below.
> 
> On 03/06/2022 05:25, Anurag Agarwal wrote:
> > From: Jan Scheurich 
> >
> > Today dpif-netdev considers PMD threads on a non-local NUMA node for
> > automatic assignment of the rxqs of a port only if there are no local,non-
> isolated PMDs.
> >
> > On typical servers with both physical ports on one NUMA node, this
> > often leaves the PMDs on the other NUMA node under-utilized, wasting CPU
> resources.
> > The alternative, to manually pin the rxqs to PMDs on remote NUMA
> > nodes, also has drawbacks as it limits OVS' ability to auto load-balance the
> rxqs.
> >
> > This patch introduces a new interface configuration option to allow
> > ports to be automatically polled by PMDs on any NUMA node:
> >
> > ovs-vsctl set interface  other_config:cross-numa-polling=true
> >
> > The group assignment algorithm now has the ability to select lowest
> > loaded PMD on any NUMA, and not just the local NUMA on which the rxq
> > of the port resides
> >
> > If this option is not present or set to false, legacy behaviour applies.
> >
> > Co-authored-by: Anurag Agarwal 
> > Signed-off-by: Jan Scheurich 
> > Signed-off-by: Anurag Agarwal 
> > ---
> 
> It would be good if you could include some links below the cut line here or 
> in a
> cover-letter to the previous discussion on this feature and the possible side-
> effect, so reviewers can be aware of that.
Done

> 
> Also, fine so far as it was minor revisions sent close together, but for 
> future
> revisions, please add a note about what has changed. That will help reviewers 
> to
> know what they need to focus on in the new revision.
> 
Sure, will do.

> >   Documentation/topics/dpdk/pmd.rst |  23 ++
> >   lib/dpif-netdev.c | 123 ++
> >   tests/pmd.at  |  33 
> >   vswitchd/vswitch.xml  |  20 +
> >   4 files changed, 166 insertions(+), 33 deletions(-)
> >
> > diff --git a/Documentation/topics/dpdk/pmd.rst
> > b/Documentation/topics/dpdk/pmd.rst
> > index b259cc8b3..387f962d1 100644
> > --- a/Documentation/topics/dpdk/pmd.rst
> > +++ b/Documentation/topics/dpdk/pmd.rst
> > @@ -99,6 +99,25 @@ core cycles for each Rx queue::
> >
> >   $ ovs-appctl dpif-netdev/pmd-rxq-show
> >
> > +Normally, Rx queues are assigned to PMD threads automatically.  By
> > +default OVS only assigns Rx queues to PMD threads executing on the
> > +same NUMA node in order to avoid unnecessary latency for accessing
> > +packet buffers across the NUMA boundary.  Typically this overhead is
> > +higher for vhostuser ports than for physical ports due to the packet
> > +copy that is done for all rx packets.
> > +
> 
> I don't think it needs double space for the start of each sentence, but that 
> could
> be because I'm comparing with other parts of the documentation that I wrote
> incorrectly without that o_O

Checked the documentation, no extra space added at the start of each line. 

> 
> > +On NUMA servers with physical ports only on one NUMA node, the
> > +NUMA-local polling policy can lead to an under-utilization of the PMD
> > +threads on the remote NUMA node.  For the overall OVS performance it
> > +may in such cases be beneficial to utilize the spare capacity and
> > +allow polling of a physical port's rxqs across NUMA nodes despite the
> overhead involved.
> > +The policy can be set per port with the following configuration option::
> > +
> > +$ ovs-vsctl set Interface  \
> > +other_config:cross-numa-polling=true|false
> > +
> > +The default value is false.
> > +
> >   .. note::
> >
> >  A history of one minute is recorded and shown for each Rx queue
> > to allow for @@ -115,6 +134,10 @@ core cycles for each Rx queue::
> >  A ``overhead`` statistics is shown per PMD: it represents the number of
> >  cycles inherently consumed by the OVS PMD processing loop.
> >
> > +.. versionchanged:: 2.18.0
> > +
> > +   Added the interface parameter ``other_config:cross-numa-polling``
> > +
> >   Rx queue to PMD assignment takes place whenever there are configuration
> changes
> >   or can be triggered by using::
> >
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> > ff57b3961..ace5c1920 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -467,6 +467,7 @@ struct dp_netdev_port {
> >   char *type; /* Port type as requested by user. */
> >   char *rxq_affinity_list;/* Requested affinity of rx queues. */
> >   enum 

Re: [ovs-dev] [PATCH 4/7] dp-packet: Use p for packet and b for batch.

2022-06-08 Thread Maxime Coquelin




On 6/3/22 17:15, Mike Pattrick wrote:

From: Flavio Leitner 

Currently 'p' and 'b' and used for packets, so use
a convention that struct dp_packet is 'p' and
struct dp_packet_batch is 'b'.

Some comments needed new formatting to not pass the
80 column.

Some variables were using 'p' or 'b' were renamed
as well.

There should be no functional change with this patch.

Signed-off-by: Flavio Leitner 
Co-authored-by: Mike Pattrick 
Signed-off-by: Mike Pattrick 
---
  lib/dp-packet.c| 345 +++
  lib/dp-packet.h| 504 ++---
  lib/netdev-dummy.c |   8 +-
  lib/netdev-linux.c |  56 ++---
  4 files changed, 457 insertions(+), 456 deletions(-)



Acked-by: Maxime Coquelin 

Thanks,
Maxime

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/7] Rename dp_packet_hwol to dp_packet_ol.

2022-06-08 Thread Maxime Coquelin




On 6/3/22 17:15, Mike Pattrick wrote:

From: Flavio Leitner 

The name correlates better with the flag names.

Signed-off-by: Flavio Leitner 
Co-authored-by: Mike Pattrick 
Signed-off-by: Mike Pattrick 
---
  lib/conntrack.c|  8 
  lib/dp-packet.h| 28 ++--
  lib/ipf.c  |  6 +++---
  lib/netdev-dpdk.c  | 20 ++--
  lib/netdev-linux.c | 24 
  lib/netdev.c   | 14 +++---
  6 files changed, 50 insertions(+), 50 deletions(-)



Acked-by: Maxime Coquelin 

Thanks,
Maxime

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 2/7] Prefix netdev offload flags with NETDEV_OFFLOAD_.

2022-06-08 Thread Maxime Coquelin




On 6/3/22 17:15, Mike Pattrick wrote:

From: Flavio Leitner 

Use the 'NETDEV_OFFLOAD_' prefix in the flags to indicate
we are talking about hardware offloading capabilities.

Signed-off-by: Flavio Leitner 
Co-authored-by: Mike Pattrick 
Signed-off-by: Mike Pattrick 
---
  lib/netdev-dpdk.c | 20 ++--
  lib/netdev-linux.c| 10 +-
  lib/netdev-provider.h | 10 +-
  lib/netdev.c  |  8 
  4 files changed, 24 insertions(+), 24 deletions(-)



Acked-by: Maxime Coquelin 

Thanks,
Maxime

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 1/7] Rename flags with CKSUM to CSUM.

2022-06-08 Thread Maxime Coquelin




On 6/3/22 17:15, Mike Pattrick wrote:

From: Flavio Leitner 

It seems csum is more common and shorter.

Signed-off-by: Flavio Leitner 
Co-authored-by: Mike Pattrick 
Signed-off-by: Mike Pattrick 
---
  lib/dp-packet.h   | 72 +--
  lib/netdev-dpdk.c | 16 +-
  lib/netdev-linux.c|  8 ++---
  lib/netdev-provider.h |  8 ++---
  lib/netdev.c  |  6 ++--
  5 files changed, 55 insertions(+), 55 deletions(-)



Acked-by: Maxime Coquelin 

Thanks,
Maxime

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2] OVN-CI: Add test cases with monitor-all enabled.

2022-06-08 Thread Dumitru Ceara
On 5/31/22 11:33, Mohammad Heib wrote:
> Currently OVN CI only has one test case with the option
> ovn-monitor-all enabled. This patch will add one more
> execution with the option ovn-monitor-all=true for each test case that
> is wrapped by the OVN_FOR_EACH_NORTHD macro.
> 
> This will more or less double the number of test cases.
> It is possible to select a reduced set of test cases using -k "keywords".
> Keyword such as
> dp-groups=yes
> dp-groups=no
> parallelization=yes
> parallelization=no
> ovn-northd
> ovn-northd-ddlog
> ovn_monitor_all=yes
> can be used to select a range of tests, as the title is searched as well.
> 
> For instance, to run ovn-monitor-all tests, with dp-groups enabled and ddlog 
> disabled:
> make check TESTSUITEFLAGS="-k 
> dp-groups=yes,ovn_monitor_all=yes,\!ovn-northd-ddlog"
> 
> Signed-off-by: Mohammad Heib 
> ---

Hi Mohammad,

Thanks for working on this, we were really missing tests with
conditional monitoring disabled!

>  tests/ovn-macros.at | 19 ---
>  tests/ovs-macros.at |  4 +++-
>  2 files changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
> index c6f0f6251..7484b32c3 100644
> --- a/tests/ovn-macros.at
> +++ b/tests/ovn-macros.at
> @@ -323,6 +323,19 @@ ovn_az_attach() {
>  -- --may-exist add-br br-int \
>  -- set bridge br-int fail-mode=secure 
> other-config:disable-in-band=true \
>  || return 1
> +
> +# currently this is the optimal place to add the ovn-monitor-all=true 
> option,
> +# this can be implemented in a different way by redefining the sim-add 
> function
> +# to add the ovn-related external-ids when we add a new simulated node 
> via sim-add.
> +#
> +# wait one sec to make sure that the ovn notice and update it 
> configuration
> +# according to the new option.
> +#
> +if test X$OVN_MONITOR_ALL = Xyes; then
> +ovs-vsctl set open . external_ids:ovn-monitor-all=true
> +sleep 1
> +fi
> +

I think we can avoid the "sleep 1" if we just do this in the same
transaction as ovn-remote (just above).

>  start_daemon ovn-controller --enable-dummy-vif-plug || return 1
>  }
>  
> @@ -751,19 +764,19 @@ m4_define([OVN_FOR_EACH_NORTHD],
>[m4_foreach([NORTHD_TYPE], [ovn-northd, ovn-northd-ddlog],
>   [m4_foreach([NORTHD_USE_DP_GROUPS], [yes, no],
> [m4_foreach([NORTHD_USE_PARALLELIZATION], [yes, no], [$1
> -])])])])
> +]) m4_foreach([OVN_MONITOR_ALL], [yes], [$1])])])])

Maybe I misunderstood the goal but this doesn't seem to generate
all possible combinations, e.g.:

$ make check TESTSUITEFLAGS="-l" | grep 'IP relocation using GARP request'
 291: ovn.at:4906IP relocation using GARP request -- ovn-northd -- 
dp-groups=yes -- parallelization=yes
 292: ovn.at:4906IP relocation using GARP request -- ovn-northd -- 
dp-groups=yes -- parallelization=no
 293: ovn.at:4906IP relocation using GARP request -- ovn-northd -- 
dp-groups=yes -- ovn_monitor_all=yes
 294: ovn.at:4906IP relocation using GARP request -- ovn-northd -- 
dp-groups=no -- parallelization=yes
 295: ovn.at:4906IP relocation using GARP request -- ovn-northd -- 
dp-groups=no -- parallelization=no
 296: ovn.at:4906IP relocation using GARP request -- ovn-northd -- 
dp-groups=no -- ovn_monitor_all=yes
 297: ovn.at:4906IP relocation using GARP request -- ovn-northd-ddlog 
-- dp-groups=yes -- parallelization=yes
 298: ovn.at:4906IP relocation using GARP request -- ovn-northd-ddlog 
-- dp-groups=yes -- parallelization=no
 299: ovn.at:4906IP relocation using GARP request -- ovn-northd-ddlog 
-- dp-groups=yes -- ovn_monitor_all=yes
 300: ovn.at:4906IP relocation using GARP request -- ovn-northd-ddlog 
-- dp-groups=no -- parallelization=yes
 301: ovn.at:4906IP relocation using GARP request -- ovn-northd-ddlog 
-- dp-groups=no -- parallelization=no
 302: ovn.at:4906IP relocation using GARP request -- ovn-northd-ddlog 
-- dp-groups=no -- ovn_monitor_all=yes

It seems to me we're missing:
- dp-groups=yes, parallelization=yes, ovn_monitor_all=yes

I think this should be:

# Defines a versions of a test with all combinations of northd and
# datapath groups.
m4_define([OVN_FOR_EACH_NORTHD],
  [m4_foreach([NORTHD_TYPE], [ovn-northd, ovn-northd-ddlog],
 [m4_foreach([NORTHD_USE_DP_GROUPS], [yes, no],
   [m4_foreach([NORTHD_USE_PARALLELIZATION], [yes, no],
  [m4_foreach([OVN_MONITOR_ALL], [yes, no], [$1
])])])])])

Also, related, I sent an RFC patch to remove dp_groups=false
altogether.  That will reduce the matrix size.

Speaking of which, I think we can follow up with a patch to split the CI
runs and run subsets of the test suite in "parallel".  This could be
done by relying on the keyword filtering you suggested in the
commit log.  In .github/workflows/test.yml we now use the TESTSUITE env
variable 

Re: [ovs-dev] [PATCH v4 4/5] system-offloads-traffic: Properly initialize offload before testing.

2022-06-08 Thread Eelco Chaudron



On 6 Jun 2022, at 9:18, Roi Dayan wrote:

> On 2022-06-03 11:58 AM, Eelco Chaudron wrote:
>> This patch will properly initialize offload, as it requires the
>> setting to be enabled before starting ovs-vswitchd (or do a
>> restart once configured).
>>
>> Signed-off-by: Eelco Chaudron 
>> ---



> Acked-by: Roi Dayan 

Thanks Roi and Mike for the quick review!!

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy mechanism

2022-06-08 Thread Eli Britstein via dev
Hi Ivan,

>-Original Message-
>From: Ivan Malov 
>Sent: Tuesday, June 7, 2022 11:56 PM
>To: Eli Britstein 
>Cc: d...@openvswitch.org; Andrew Rybchenko
>; Ilya Maximets ;
>Ori Kam ; NBU-Contact-Thomas Monjalon (EXTERNAL)
>; Stephen Hemminger
>; David Marchand
>; Gaetan Rivet ; Maxime
>Coquelin 
>Subject: RE: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
>mechanism
>
>External email: Use caution opening links or attachments
>
>
>Hi Eli,
>
>On Wed, 1 Jun 2022, Eli Britstein wrote:
>
>> - Missing proper handling of the testpmd syntax logging. It changes the used
>port according to "transfer", but the log still uses netdev_dpdk_get_port_id().
>
>Thanks for noticing. I will see to it in the next version.
>
>> - The usage of the "proxy" port for rte-flow implies that this proxy port is
>attached to OVS, otherwise it is not "started" and creation of flows will fail.
>
>That's the way it is. If there is no proxy for a given port, then the original 
>port
>value will be used for managing flows. For vendors that don't need the proxy,
>this will work. For others, it won't. That's OK.
I don't really understand why this can't be done inside dpdk domain (if there 
is a proxy, and it is up, use it, otherwise don't).
That's *currently* the way it is. I understand that if dpdk works like this OVS 
should align, but maybe you or someone else here knows why dpdk works like 
this? (not too late to change, this is experimental...).
>
>>
>>> -Original Message-
>>> From: Ivan Malov 
>>> Sent: Monday, May 30, 2022 5:16 PM
>>> To: d...@openvswitch.org
>>> Cc: Andrew Rybchenko ; Ilya Maximets
>>> ; Ori Kam ; Eli Britstein
>>> ; NBU-Contact-Thomas Monjalon (EXTERNAL)
>>> ; Stephen Hemminger
>>> ; David Marchand
>>> ; Gaetan Rivet ; Maxime
>>> Coquelin 
>>> Subject: [PATCH 3/3] netdev-offload-dpdk: use flow transfer proxy
>>> mechanism
>>>
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> Manage "transfer" flows via the corresponding mechanism.
>>> Doing so requires that the traffic source be specified explicitly,
>>> via the corresponding pattern item.
>>>
>>> Signed-off-by: Ivan Malov 
>>> Acked-by: Andrew Rybchenko 
>>> ---
>>> lib/netdev-dpdk.c | 73 ---
>>> lib/netdev-dpdk.h |  2 +-
>>> lib/netdev-offload-dpdk.c | 43 ++-
>>> 3 files changed, 103 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index
>>> 45e5d26d2..d0bf4613a 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -420,6 +420,7 @@ enum dpdk_hw_ol_features {
>>>
>>> struct netdev_dpdk {
>>> PADDED_MEMBERS_CACHELINE_MARKER(CACHE_LINE_SIZE,
>cacheline0,
>>> +dpdk_port_t flow_transfer_proxy_port_id;
>>> dpdk_port_t port_id;
>>>
>>> /* If true, device was attached by rte_eth_dev_attach(). */
>>> @@ -1115,6
>>> +1116,23 @@ dpdk_eth_dev_init_rx_metadata(struct netdev_dpdk *dev)
>>>   DPDK_PORT_ID_FMT, dev->port_id);
>>> }
>>> }
>>> +
>>> +static void
>>> +dpdk_eth_dev_init_flow_transfer_proxy(struct netdev_dpdk *dev) {
>>> +int ret;
>>> +
>>> +ret = rte_flow_pick_transfer_proxy(dev->port_id,
>>> +   &dev->flow_transfer_proxy_port_id, NULL);
>>> +if (ret == 0)
>>> +return;
>>> +
>>> +/*
>>> + * The PMD does not indicate the proxy port.
>>> + * It is OK to assume the proxy is unneeded.
>>> + */
>>> +dev->flow_transfer_proxy_port_id = dev->port_id; }
>>> #endif /* ALLOW_EXPERIMENTAL_API */
>>>
>>> static int
>>> @@ -1141,6 +1159,19 @@ dpdk_eth_dev_init(struct netdev_dpdk *dev)
>>>  * Request delivery of such metadata.
>>>  */
>>> dpdk_eth_dev_init_rx_metadata(dev);
>>> +
>>> +/*
>>> + * Managing "transfer" flows requires that the user communicate them
>>> + * via a port which has the privilege to control the embedded switch.
>>> + * For some vendors, all ports in a given switching domain have
>>> + * this privilege. For other vendors, it's only one port.
>>> + *
>>> + * Get the proxy port ID and remember it for later use.
>>> + */
>>> +dpdk_eth_dev_init_flow_transfer_proxy(dev);
>>> +#else /* ! ALLOW_EXPERIMENTAL_API */
>>> +/* It is OK to assume the proxy is unneeded. */
>>> +dev->flow_transfer_proxy_port_id = dev->port_id;
>>> #endif /* ALLOW_EXPERIMENTAL_API */
>>>
>>> rte_eth_dev_info_get(dev->port_id, ); @@ -5214,13 +5245,15
>>> @@ out:
>>>
>>> int
>>> netdev_dpdk_rte_flow_destroy(struct netdev *netdev,
>>> - struct rte_flow *rte_flow,
>>> + bool transfer, struct rte_flow
>>> + *rte_flow,
>>>  struct rte_flow_error *error) {
>>> struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>> int ret;
>>>
>>> -ret = rte_flow_destroy(dev->port_id, rte_flow, error);
>>> +ret = rte_flow_destroy(transfer ?
>>> +

Re: [ovs-dev] [RFC ovn 0/2] Basic eBPF/XDP support in OVN.

2022-06-08 Thread Han Zhou
On Mon, May 30, 2022 at 5:46 PM  wrote:
>
> From: Numan Siddique 
>
> XDP program - ovn_xdp.c added in this RFC patch  series implements basic
port
> security and drops any packet if the port security check fails.
> There are still few TODOs in the port security checks. Like
>   - Make ovn xdp configurable.
>   - Removing the ingress Openflow rules from table 73 and 74 if ovn
xdp
> is enabled.
>   - Add IPv6 support.
>   - Enhance the port security xdp program for ARP/IPv6 ND checks.
>
> This patch adds a basic XDP support in OVN and in future we can
> leverage eBPF/XDP features.
>
> I'm not sure how much value this RFC patch adds to make use of eBPF/XDP
> just for port security.  Submitting as RFC to get some feedback and
> start some conversation on eBPF/XDP in OVN.
>
Hi Numan,

This is really cool. It demonstrates how OVN could leverage eBPF/XDP.

On the other hand, for the port-security feature in XDP, I keep thinking
about the scenarios and it is still not very clear to me. One advantage I
can think of is to prevent DOS attacks from VM/Pod when invalid IP/MAC are
used, XDP may perform better and drop packets with lower CPU cost
(comparing with OVS kernel datapath). However, I am also wondering why
would an attacker use invalid IP/MAC for DOS attacks? Do you have some more
thoughts about the use cases? And do you have any performance results
comparing with the current OVS implementation?

Another question is, would it work with smart NIC HW-offload, where VF
representer ports are added to OVS on the smart NIC? I guess XDP doesn't
support representer port, right?

Thanks,
Han

> In order to attach and detach xdp programs,  libxdp [1] and libbpf is
used.
>
> To test it out locally, please install libxdp-devel and libbpf-devel
> and the compile OVN first and then compile ovn_xdp by running "make
> bpf".  Copy ovn_xdp.o to either /usr/share/ovn/ or /usr/local/share/ovn/
>
>
> Numan Siddique (2):
>   RFC: Add basic xdp/eBPF support in OVN.
>   RFC: ovn-controller: Attach XDP progs to the VIFs of the logical
> ports.
>
>  Makefile.am |   6 +-
>  bpf/.gitignore  |   5 +
>  bpf/automake.mk |  23 +++
>  bpf/ovn_xdp.c   | 156 +++
>  configure.ac|   2 +
>  controller/automake.mk  |   4 +-
>  controller/binding.c|  45 +++--
>  controller/binding.h|   7 +
>  controller/ovn-controller.c |  79 +++-
>  controller/xdp.c| 389 
>  controller/xdp.h|  41 
>  m4/ovn.m4   |  20 ++
>  tests/automake.mk   |   1 +
>  13 files changed, 753 insertions(+), 25 deletions(-)
>  create mode 100644 bpf/.gitignore
>  create mode 100644 bpf/automake.mk
>  create mode 100644 bpf/ovn_xdp.c
>  create mode 100644 controller/xdp.c
>  create mode 100644 controller/xdp.h
>
> --
> 2.35.3
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev