Re: [ovs-dev] [PATCH net v2] net: openvswitch: fix race on port output

2023-04-04 Thread Felix Hüttner via dev
On Tue, Apr 4, 2023 at 10:03 AM Eric Dumazet wrote:
> On Tue, Apr 4, 2023 at 9:33 AM Felix Huettner wrote:
> >
> > Assume the following setup on a single machine:
> > 1. An openvswitch instance with one bridge and default flows
> > 2. two network namespaces "server" and "client"
> > 3. two ovs interfaces "server" and "client" on the bridge
> > 4. for each ovs interface a veth pair with a matching name and 32 rx and
> >    tx queues
> > 5. move the ends of the veth pairs to the respective network namespaces
> > 6. assign ip addresses to each of the veth ends in the namespaces (needs
> >    to be the same subnet)
> > 7. start some http server on the server network namespace
> > 8. test if a client in the client namespace can reach the http server
> >
> > When following the actions below, the host has a chance of getting a CPU
> > stuck in an infinite loop:
> > 1. send a large number of parallel requests to the http server (around
> >    3000 curls should work)
> > 2. in parallel delete the network namespace (do not delete interfaces or
> >    stop the server, just kill the namespace)
> >
>
> > Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
> > Co-developed-by: Luca Czesla 
> > Signed-off-by: Luca Czesla 
> > Signed-off-by: Felix Huettner 
> > ---
> > v2:
> >   - replace BUG_ON with DEBUG_NET_WARN_ON_ONCE
> >   - use netif_carrier_ok() instead of checking for NETREG_REGISTERED
> > v1: https://lore.kernel.org/netdev/ZCaXfZTwS9MVk8yZ@kernel-bug-kernel-bug/
> >
> >  net/core/dev.c            | 1 +
> >  net/openvswitch/actions.c | 2 +-
> >  2 files changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 253584777101..37b26017f458 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3199,6 +3199,7 @@ static u16 skb_tx_hash(const struct net_device *dev,
> > }
> >
> > if (skb_rx_queue_recorded(skb)) {
> > +   DEBUG_NET_WARN_ON_ONCE(unlikely(qcount == 0));
>
> No need for unlikely(), it is already done in DEBUG_NET_WARN_ON_ONCE()
>
> Thanks.

Thanks for the feedback. Will include that in v3.
This e-mail may contain confidential content and is intended only for use
by the intended recipient. If you are not the intended recipient, please
notify the sender immediately and delete this e-mail. Information on data
protection can be found here.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] net: openvswitch: fix race on port output

2023-04-04 Thread Felix Hüttner via dev
On Mon, 3 Apr 2023 20:50:00 + Jakub Kicinski wrote:
> On Mon, 3 Apr 2023 11:18:46 + Felix Hüttner wrote:
> > On Sat, 1 Apr 2023 6:25:00 + Jakub Kicinski wrote:
> > > On Fri, 31 Mar 2023 06:25:13 + Felix Hüttner wrote:
> > > > diff --git a/net/core/dev.c b/net/core/dev.c
> > > > index 253584777101..6628323b7bea 100644
> > > > --- a/net/core/dev.c
> > > > +++ b/net/core/dev.c
> > > > @@ -3199,6 +3199,7 @@ static u16 skb_tx_hash(const struct net_device *dev,
> > > > }
> > > >
> > > > if (skb_rx_queue_recorded(skb)) {
> > > > +   BUG_ON(unlikely(qcount == 0));
> > >
> > > DEBUG_NET_WARN_ON()
> > >
> >
> > However, if this condition triggers we will be permanently stuck in the
> > loop below.
> > From my understanding this also means that future calls to
> > `synchronize_net` will never finish (as the packet never finishes
> > processing).
> > So the user will quite probably need to restart their system.
> > I find DEBUG_NET_WARN_ON_ONCE to offer too little visibility, as
> > CONFIG_DEBUG_NET is not necessarily enabled by default.
> > As a user, I would find it helpful to have this information available
> > without additional config flags.
> > I would propose to use WARN_ON_ONCE
>
> skb_tx_hash() may get called a lot, we shouldn't slow it down on
> production systems just to catch buggy drivers, IMO.

Thanks for the clarification.
Will then use DEBUG_NET_WARN_ON_ONCE in v2


Re: [ovs-dev] [PATCH] net: openvswitch: fix race on port output

2023-04-03 Thread Felix Hüttner via dev
Thanks for the review

On Sat, 1 Apr 2023 6:25:00 + Jakub Kicinski wrote:
> On Fri, 31 Mar 2023 06:25:13 + Felix Hüttner wrote:
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 253584777101..6628323b7bea 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3199,6 +3199,7 @@ static u16 skb_tx_hash(const struct net_device *dev,
> > }
> >
> > if (skb_rx_queue_recorded(skb)) {
> > +   BUG_ON(unlikely(qcount == 0));
>
> DEBUG_NET_WARN_ON()
>

However, if this condition triggers we will be permanently stuck in the loop
below.
From my understanding this also means that future calls to `synchronize_net`
will never finish (as the packet never finishes processing).
So the user will quite probably need to restart their system.
I find DEBUG_NET_WARN_ON_ONCE to offer too little visibility, as
CONFIG_DEBUG_NET is not necessarily enabled by default.
As a user, I would find it helpful to have this information available
without additional config flags.
I would propose to use WARN_ON_ONCE

> > hash = skb_get_rx_queue(skb);
> > if (hash >= qoffset)
> > hash -= qoffset;
> > diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> > index ca3ebfdb3023..33b317e5f9a5 100644
> > --- a/net/openvswitch/actions.c
> > +++ b/net/openvswitch/actions.c
> > @@ -913,7 +913,7 @@ static void do_output(struct datapath *dp, struct sk_buff *skb, int out_port,
> >  {
> > struct vport *vport = ovs_vport_rcu(dp, out_port);
> >
> > -   if (likely(vport)) {
> > +   if (likely(vport && vport->dev->reg_state == NETREG_REGISTERED)) {
>
> Without looking too closely netif_carrier_ok() seems like a more
> appropriate check for liveness on the datapath?

Yes, will use that in v2

> > u16 mru = OVS_CB(skb)->mru;
> > u32 cutlen = OVS_CB(skb)->cutlen;
> >
> > --
> > 2.40.0
> >
> > This e-mail may contain confidential content and is intended only for
> > use by the intended recipient. If you are not the intended recipient,
> > please notify the sender immediately and delete this e-mail.
> > Information on data protection can be found here.
>
> You gotta get rid of this to work upstream.

working on it


[ovs-dev] [PATCH] net: openvswitch: fix race on port output

2023-03-31 Thread Felix Hüttner via dev
Assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below, the host has a chance of getting a CPU
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this will cause the kernel "cpu stuck" message
below. If this does not happen, just retry.
Below there is also the output of bpftrace for the functions mentioned
in the output.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this first sets `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion to work in the
   background, as an ovs_lock is needed that we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If, after 7. but before 9., a packet is sent to the ovs vport (which is
not deleted at this point in time), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (if the packet has an rx_queue recorded) that is infinite if
`dev->real_num_tx_queues` is zero.
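
The hang can be illustrated with a small user-space model. This is only a
sketch under assumptions: the helper name `reduce_rx_queue`, the iteration
cap `max_iters` and the exact loop shape (subtract `qcount` until the value
is below it) are illustrative and are not copied from the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space model of the queue reduction (not kernel code).
 * Returns true and stores the reduced value in *out if the loop ends
 * within max_iters subtractions, false if it is still spinning by then.
 * The cap exists only so that the model itself cannot hang.
 */
static bool reduce_rx_queue(uint16_t hash, uint16_t qcount,
                            unsigned int max_iters, uint16_t *out)
{
    unsigned int i;

    assert(out != NULL);
    for (i = 0; hash >= qcount; i++) {
        if (i == max_iters)
            return false;   /* with qcount == 0 the value never shrinks */
        hash -= qcount;
    }
    *out = hash;
    return true;
}
```

With `qcount == 32` the loop ends after a few subtractions. With
`qcount == 0` the comparison `hash >= 0` is always true for an unsigned
value and the subtraction changes nothing, so only the artificial cap
stops the model.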

To prevent this from happening we update `do_output` to handle devices
that are not registered (e.g. unregistering) the same as if the device is
not found (which would be the code path after 9. is done).

Additionally we introduce a `BUG_ON` in `skb_tx_hash` to rather crash
than produce this infinite loop, which cannot be exited anyway.
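
The intent of the `do_output` change can be sketched as a user-space model.
All type and function names below (`model_dev`, `model_vport`,
`model_should_output`) are hypothetical stand-ins. The numeric values 1 and
2 mirror the `reg_state` values printed in the bpftrace output; everything
else is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (not kernel code). */
enum model_reg_state {
    MODEL_UNINITIALIZED = 0,
    MODEL_REGISTERED    = 1,   /* reg_state: 1 in the trace */
    MODEL_UNREGISTERING = 2    /* reg_state: 2 in the trace */
};

struct model_dev {
    enum model_reg_state reg_state;
    unsigned int real_num_tx_queues;
};

struct model_vport {
    struct model_dev *dev;
};

/*
 * Treat a vport whose device is no longer registered exactly like a
 * missing vport: do not hand the packet to the transmit path.
 */
static bool model_should_output(const struct model_vport *vport)
{
    if (vport == NULL)
        return false;
    assert(vport->dev != NULL);
    return vport->dev->reg_state == MODEL_REGISTERED;
}
```

In this model a packet arriving between steps 7 and 9 sees
`MODEL_UNREGISTERING` and is dropped instead of reaching a device with
zero tx queues.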

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, 
skb_addr: 0x9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0x9f0a46d4a000 real_num_tx_queues: 1, 
cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 
21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, 
tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 
21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, 
event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, 
skb_addr: 0x9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, 
skb_addr: 0x9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0x9f0a46d4a000 real_num_tx_queues: 0, 
cpu: 2, pid: 28024, tid: 28024, skb_addr: 0x9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, 
reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 

[ovs-dev] [PATCH ovn v2] northd: add router broadcast option to logical switch

2023-03-20 Thread Felix Hüttner via dev
Assume the following setup:

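+----------------+
| Logical Router |
| lr001          +-+
+----------------+ |
                   |
+----------------+ |
| Logical Router | | +----------------+ +------------------+
| lr002          +-+-+ Logical Switch +-+ Physical Network |
+----------------+ | | ls-ext         | |                  |
                   | +----------------+ +------------------+
       ...         |
                   |
+----------------+ |
| Logical Router | |
| lr300          +-+
+----------------+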

If an arp request for the ip of lr001 is now received on ls-ext, it is
forwarded only to that individual logical router.

If, however, we receive an arp request for an ip not used by any of
lr001-lr300, we try to flood the arp request to all logical ports on ls-ext.
With around 300 routers this causes the arp request to be dropped after
some routers, as we hit the 4096 resubmit limit.
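
To make the arithmetic concrete, here is a minimal sketch. The limit of
4096 is the figure quoted above; the per-router cost
`resubmits_per_router` is a purely hypothetical parameter, since the
message does not say how many resubmits a single router port consumes.

```c
#include <assert.h>

#define RESUBMIT_LIMIT 4096u   /* figure quoted in the message */

/*
 * Hypothetical model: how many of n_routers can be reached by one flooded
 * packet if every router port costs resubmits_per_router resubmits.
 */
static unsigned int routers_reached(unsigned int n_routers,
                                    unsigned int resubmits_per_router)
{
    unsigned int budget;

    assert(resubmits_per_router > 0);
    budget = RESUBMIT_LIMIT / resubmits_per_router;
    return n_routers < budget ? n_routers : budget;
}
```

Under this model any assumed cost of 14 or more resubmits per router is
enough to stop the flood before the 300th router.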

In most cases forwarding the arp requests to the logical routers is
pointless, as we already know all of their ip addresses and they will
therefore not be able to answer the arp requests anyway.
This is not the case only if someone sends garps. Then the request would
need to be flooded to all logical routers.

We can therefore not generally send these arp requests to MC_FLOOD_L2, as
this would break garps. As we also cannot detect garps, we need to leave
the solution to our users.

To do this we introduce the other_config `broadcast-arps-to-all-routers`
on logical switches (which is true by default). If set to false, we add
a logical flow that forwards arp requests for which we do not know a
specific target logical switch port to MC_FLOOD_L2, thereby bypassing
all logical routers.
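
The effect of the new priority-72 flow next to the existing priority-70
flow can be sketched as a tiny decision function. This is a simplified
model, not northd code: the enum and function names are hypothetical, and
higher-priority flows (for example those that send a request directly to
the router owning the address) are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical labels for the multicast groups named in the patch. */
enum mc_group { GROUP_NONE, GROUP_MC_FLOOD, GROUP_MC_FLOOD_L2 };

/*
 * Model of the two flows in S_SWITCH_IN_L2_LKUP discussed here:
 *   priority 72: eth.mcast && (arp.op == 1 || nd_ns) -> MC_FLOOD_L2
 *                (only installed if the option is set to false)
 *   priority 70: eth.mcast                           -> MC_FLOOD
 */
static enum mc_group bmcast_outport(bool broadcast_arps_to_all_routers,
                                    bool eth_mcast,
                                    bool arp_request_or_nd_ns)
{
    if (!eth_mcast)
        return GROUP_NONE;          /* neither flow matches */
    if (!broadcast_arps_to_all_routers && arp_request_or_nd_ns)
        return GROUP_MC_FLOOD_L2;   /* priority 72 wins over 70 */
    return GROUP_MC_FLOOD;          /* priority 70 */
}
```

With the default (`true`) the function never returns `GROUP_MC_FLOOD_L2`,
which matches the statement that the existing behaviour is kept unless the
option is explicitly set to false.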

Signed-off-by: Felix Huettner 
---
 NEWS                    |  5 +++++
 northd/northd.c         |  8 ++++++++
 northd/ovn-northd.8.xml |  7 +++++++
 ovn-nb.xml              | 12 ++++++++++++
 tests/ovn-northd.at     | 31 +++++++++++++++++++++++++++++++
 5 files changed, 63 insertions(+)

diff --git a/NEWS b/NEWS
index 637adcff3..2379f5089 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,11 @@ Post v23.03.0
 -
   - Enhance LSP.options:arp_proxy to support IPv6, configurable MAC
 addresses and CIDRs.
+  - Add LS.other_config:broadcast-arps-to-all-routers. If false then arp
+requests are only send to Logical Routers on that Logical Switch if the
+target mac address matches. Arp requests matching no Logical Router will
+only be forwarded to non-router ports. Default is true which keeps the
+existing behaviour of flooding these arp requests to all attached Ports.

 OVN v23.03.0 - 03 Mar 2023
 --
diff --git a/northd/northd.c b/northd/northd.c
index 5f0b436c2..be6d70d94 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -9030,6 +9030,14 @@ build_lswitch_destination_lookup_bmcast(struct 
ovn_datapath *od,
 }
 }

+
+if (!smap_get_bool(&od->nbs->other_config,
+   "broadcast-arps-to-all-routers", true)) {
+ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 72,
+"eth.mcast && (arp.op == 1 || nd_ns)",
+"outport = \""MC_FLOOD_L2"\"; output;");
+}
+
 ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 70, "eth.mcast",
   "outport = \""MC_FLOOD"\"; output;");
 }
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 5d513e65a..3d5f579fe 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1880,6 +1880,13 @@ output;
 non-router logical ports.
   

+  
+A priority-72 flow that outputs all ARP requests and ND packets with
+an Ethernet broadcast or multicast eth.dst to the
+MC_FLOOD_L2 multicast group if
+other_config:broadcast-arps-to-all-routers=true.
+  
+
   
 A priority-70 flow that outputs all packets with an Ethernet broadcast
 or multicast eth.dst to the MC_FLOOD
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 73f707aa0..d106af8be 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -729,6 +729,18 @@
 localnet ports, fabric traffic that belongs to other tagged networks 
may
 be passed through such a port.
   
+
+  
+Determines whether arp requests and ipv6 neighbor solicitations should
+be send to all routers and other switchports (default) or if it should
+only be send to switchports where the ip/mac address is unknown.
+Setting this to false can significantly reduce the load if the logical
+switch can receive arp requests for ips it does not know about.
+However setting this to false also means that garps are no longer
+forwarded to all routers and therefor the mac bindings of the routers
+are no longer updated.
+  
 

 
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index ef29233db..4bf59f4af 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -6421,6 +6421,37 @@ 

[ovs-dev] ovn: Handling of arp/nd learning on logical switches

2023-03-03 Thread Felix Hüttner via dev
Hello everyone,

We had a discussion on ovs-discuss that we wanted to bring here [1]:

Assume a physical network connected to an OVN Logical_Switch and then multiple
Logical_Routers like so:

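+----------------+
| Logical Router |
| lr001          +-+
+----------------+ |
                   |
+----------------+ |
| Logical Router | | +----------------+ +------------------+
| lr002          +-+-+ Logical Switch +-+ Physical Network |
+----------------+ | | ls-ext         | |                  |
                   | +----------------+ +------------------+
       ...         |
                   |
+----------------+ |
| Logical Router | |
| lr300          +-+
+----------------+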

If a multicast packet now comes in to ls-ext from the physical network (e.g.
in our case an arp request or a neighbor discovery), it is flooded to all
attached lsp's.
The logical routers then try to look up the source of the arp request/nd in
their arp/neighbor table and insert it if it is not found.
For the picture above, that means that a single arp packet can trigger 300
lookup_arp and put_arp actions, and each put_arp would result in a controller
action.
The outcome would be the same for all 300 logical routers: each of them would
insert the mac/ip binding in the MAC_Binding table, and for each of them flows
would be added.
This leads to load on the ovn-controllers and the southbound database. Also,
some of the put_arp actions can easily be lost due to the queue limit to the
ovn-controller.

In the mailing list thread mentioned above ([1]) there was a discussion about
moving this learning process to the Logical Switch.
This would mean that:
1. we only need to handle the learning in one location (and therefore only do
   it once)
2. the MAC_Binding table in the southbound database does not contain
   (roughly) the same entry 300 times
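
The difference between the two approaches can be put into numbers with a
minimal sketch. It assumes the worst case described above (the sender is
not yet known anywhere, so every lookup is followed by a put_arp), and the
struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-packet cost of learning one new neighbor. */
struct learn_cost {
    unsigned int lookups;           /* lookup_arp actions */
    unsigned int put_arps;          /* put_arp actions, one controller action each */
    unsigned int mac_binding_rows;  /* rows added to MAC_Binding */
};

/*
 * Worst-case model: learning either happens once per attached logical
 * router (current behaviour) or once on the logical switch (proposal).
 */
static struct learn_cost learn_cost(unsigned int n_routers, bool learn_on_switch)
{
    unsigned int n = learn_on_switch ? 1 : n_routers;
    struct learn_cost c = { n, n, n };

    assert(n_routers > 0);
    return c;
}
```

For the 300 routers in the picture this gives 300 put_arp actions and 300
rows per new neighbor today, against one of each if learning moved to the
logical switch.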

However, we were unsure if there is anything speaking against such an approach.
From my naive understanding I would propose the following changes:

1. move the whole lr_in_lookup_neighbor (table 1) and lr_in_learn_neighbor 
(table 2) from the logical router pipeline to the logical switch pipeline 
between ls_in_hairpin (table 18) and ls_in_arp_rsp (table 19)

2a. Leave the arp resolve stage in the logical routers: teach 
add_neighbor_flows to not only learn from the Mac_Binding table based on 
logical_ports but also based on the datapaths these ports are connected to

2b. Or move the arp resolve stage to the logical switches:
   1) move lr_in_arp_resolve (table 17) and lr_in_arp_request (table 21) from 
the logical router to the logical switch pipeline
   2) clarify what we would then use as a source address for arp requests that
would then originate from the logical switch pipeline
   3) clarify how we signal to the logical switch that it should actually do 
the arp lookup (as static mac_bindings for individual routers probably still 
need to work).

I would for now tend to just move the learning stage to the logical switches 
while keeping the arp resolution stage on the logical router side.

I guess I have overlooked something important in here as well, so it would be 
great if I could get feedback on your views on this.

Thanks

[1]: https://mail.openvswitch.org/pipermail/ovs-discuss/2023-March/052268.html

--
Felix Huettner



[ovs-dev] [PATCH ovn] northd: add router broadcast option to logical switch

2023-02-24 Thread Felix Hüttner via dev
Assume the following setup:

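+----------------+
| Logical Router |
| lr001          +-+
+----------------+ |
                   |
+----------------+ |
| Logical Router | | +----------------+ +------------------+
| lr002          +-+-+ Logical Switch +-+ Physical Network |
+----------------+ | | ls-ext         | |                  |
                   | +----------------+ +------------------+
       ...         |
                   |
+----------------+ |
| Logical Router | |
| lr300          +-+
+----------------+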

If an arp request for the ip of lr001 is now received on ls-ext, it is
forwarded only to that individual logical router.

If, however, we receive an arp request for an ip not used by any of
lr001-lr300, we try to flood the arp request to all logical ports on ls-ext.
With around 300 routers this causes the arp request to be dropped after
some routers, as we hit the 4096 resubmit limit.

In most cases forwarding the arp requests to the logical routers is
pointless, as we already know all of their ip addresses and they will
therefore not be able to answer the arp requests anyway.
This is not the case only if someone sends garps. Then the request would
need to be flooded to all logical routers.

We can therefore not generally send these arp requests to MC_FLOOD_L2, as
this would break garps. As we also cannot detect garps, we need to leave
the solution to our users.

To do this we introduce the other_config `broadcast-arps-to-all-routers`
on logical switches (which is true by default). If set to false, we add
a logical flow that forwards arp requests for which we do not know a
specific target logical switch port to MC_FLOOD_L2, thereby bypassing
all logical routers.

Note that the testcase is quite flaky in the CI (as it takes very long)
but runs well locally. I'm unsure how best to proceed there.

Signed-off-by: Felix Huettner 
---
 northd/northd.c |   7 +++
 northd/ovn-northd.8.xml |   7 +++
 ovn-nb.xml  |  12 +
 tests/ovn.at| 114 
 4 files changed, 140 insertions(+)

diff --git a/northd/northd.c b/northd/northd.c
index c366b545e..6aff04cc5 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -8964,6 +8964,13 @@ build_lswitch_destination_lookup_bmcast(struct 
ovn_datapath *od,
 }
 }

+
+if (!smap_get_bool(&od->nbs->other_config, "broadcast-arps-to-all-routers", true)) {
+ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 72,
+"eth.mcast && (arp.op == 1 || nd_ns)",
+"outport = \""MC_FLOOD_L2"\"; output;");
+}
+
 ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 70, "eth.mcast",
   "outport = \""MC_FLOOD"\"; output;");
 }
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 2eab2c4ae..033841383 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1873,6 +1873,13 @@ output;
 non-router logical ports.
   

+  
+A priority-72 flow that outputs all ARP requests and ND packets with
+an Ethernet broadcast or multicast eth.dst to the
+MC_FLOOD_L2 multicast group if
+other_config:broadcast-arps-to-all-routers=true.
+  
+
   
 A priority-70 flow that outputs all packets with an Ethernet broadcast
 or multicast eth.dst to the MC_FLOOD
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 8d56d0c6e..a41d5b11f 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -729,6 +729,18 @@
 localnet ports, fabric traffic that belongs to other tagged networks 
may
 be passed through such a port.
   
+
+  
+Determines whether arp requests and ipv6 neighbor solicitations should
+be send to all routers and other switchports (default) or if it should
+only be send to switchports where the ip/mac address is unknown.
+Setting this to false can significantly reduce the load if the logical
+switch can receive arp requests for ips it does not know about.
+However setting this to false also means that garps are no longer
+forwarded to all routers and therefor the mac bindings of the routers
+are no longer updated.
+  
 

 
diff --git a/tests/ovn.at b/tests/ovn.at
index dc5c5df3f..dfef5dacc 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -5048,6 +5048,120 @@ OVN_CLEANUP([hv1], [hv2], [hv3])
 AT_CLEANUP
 ])

+# 1 hypervisor, 1 logical switch with 1 logical port, 300 logical router
+OVN_FOR_EACH_NORTHD([
+AT_SETUP([1 HVs, 1 LS, 2 lports/LS, 300 LR])
+AT_KEYWORDS([slowtest])
+ovn_start
+
+# Logical network:
+#
+# One logical switch ls1.
+# 300 logical routers lr001 - lr299. All connected to ls1
+# with one subnet:
+#lrp001 on ls1 for subnet 192.168.0.1/20
+#lrp002 on ls1 for subnet 192.168.0.2/20
+#...
+#lrp101 on ls1 for subnet 192.168.1.1/20
+#...
+#lrp299 on ls1 for subnet 192.168.2.99/20
+#
+# also 2 VIF on the first hypervisor.
+#

[ovs-dev] [PATCH ovn] northd: fix comments on functions

2023-02-24 Thread Felix Hüttner via dev
The comments referred to table numbers one below the actual table number.

Signed-off-by: Felix Huettner 
---
 northd/northd.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 770a5b50e..c366b545e 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -7856,7 +7856,7 @@ build_lrouter_groups(struct hmap *ports, struct ovs_list 
*lr_list)
 }

 /*
- * Ingress table 24: Flows that flood self originated ARP/RARP/ND packets in
+ * Ingress table 25: Flows that flood self originated ARP/RARP/ND packets in
  * the switching domain.
  */
 static void
@@ -7970,7 +7970,7 @@ lrouter_port_ipv6_reachable(const struct ovn_port *op,
 }

 /*
- * Ingress table 24: Flows that forward ARP/ND requests only to the routers
+ * Ingress table 25: Flows that forward ARP/ND requests only to the routers
  * that own the addresses. Other ARP/ND packets are still flooded in the
  * switching domain as regular broadcast.
  */
@@ -8007,7 +8007,7 @@ build_lswitch_rport_arp_req_flow(const char *ips,
 }

 /*
- * Ingress table 24: Flows that forward ARP/ND requests only to the routers
+ * Ingress table 25: Flows that forward ARP/ND requests only to the routers
  * that own the addresses.
  * Priorities:
  * - 80: self originated GARPs that need to follow regular processing.
@@ -8336,7 +8336,7 @@ build_lswitch_flows(const struct hmap *datapaths,

 struct ovn_datapath *od;

-/* Ingress table 25: Destination lookup for unknown MACs (priority 0). */
+/* Ingress table 25/26: Destination lookup for unknown MACs (priority 0). 
*/
 HMAP_FOR_EACH (od, key_node, datapaths) {
 if (!od->nbs) {
 continue;
@@ -8411,7 +8411,7 @@ build_lswitch_lflows_admission_control(struct 
ovn_datapath *od,
 }
 }

-/* Ingress table 18: ARP/ND responder, skip requests coming from localnet
+/* Ingress table 19: ARP/ND responder, skip requests coming from localnet
  * ports. (priority 100); see ovn-northd.8.xml for the rationale. */

 static void
@@ -8429,7 +8429,7 @@ build_lswitch_arp_nd_responder_skip_local(struct ovn_port 
*op,
 }
 }

-/* Ingress table 18: ARP/ND responder, reply for known IPs.
+/* Ingress table 19: ARP/ND responder, reply for known IPs.
  * (priority 50). */
 static void
 build_lswitch_arp_nd_responder_known_ips(struct ovn_port *op,
@@ -8689,7 +8689,7 @@ build_lswitch_arp_nd_responder_known_ips(struct ovn_port 
*op,
 }
 }

-/* Ingress table 18: ARP/ND responder, by default goto next.
+/* Ingress table 19: ARP/ND responder, by default goto next.
  * (priority 0)*/
 static void
 build_lswitch_arp_nd_responder_default(struct ovn_datapath *od,
@@ -8700,7 +8700,7 @@ build_lswitch_arp_nd_responder_default(struct 
ovn_datapath *od,
 }
 }

-/* Ingress table 18: ARP/ND responder for service monitor source ip.
+/* Ingress table 19: ARP/ND responder for service monitor source ip.
  * (priority 110)*/
 static void
 build_lswitch_arp_nd_service_monitor(struct ovn_northd_lb *lb,
@@ -8769,7 +8769,7 @@ build_lswitch_arp_nd_service_monitor(struct ovn_northd_lb 
*lb,
 }


-/* Logical switch ingress table 19 and 20: DHCP options and response
+/* Logical switch ingress table 20 and 21: DHCP options and response
  * priority 100 flows. */
 static void
 build_lswitch_dhcp_options_and_response(struct ovn_port *op,
@@ -8821,11 +8821,11 @@ build_lswitch_dhcp_options_and_response(struct ovn_port 
*op,
 }
 }

-/* Ingress table 19 and 20: DHCP options and response, by default goto
+/* Ingress table 20 and 21: DHCP options and response, by default goto
  * next. (priority 0).
- * Ingress table 21 and 22: DNS lookup and response, by default goto next.
+ * Ingress table 22 and 23: DNS lookup and response, by default goto next.
  * (priority 0).
- * Ingress table 23 - External port handling, by default goto next.
+ * Ingress table 24 - External port handling, by default goto next.
  * (priority 0). */
 static void
 build_lswitch_dhcp_and_dns_defaults(struct ovn_datapath *od,
@@ -8840,7 +8840,7 @@ build_lswitch_dhcp_and_dns_defaults(struct ovn_datapath 
*od,
 }
 }

-/* Logical switch ingress table 21 and 22: DNS lookup and response
+/* Logical switch ingress table 22 and 23: DNS lookup and response
 * priority 100 flows.
 */
 static void
@@ -8868,7 +8868,7 @@ build_lswitch_dns_lookup_and_response(struct ovn_datapath 
*od,
 }
 }

-/* Table 23: External port. Drop ARP request for router ips from
+/* Table 24: External port. Drop ARP request for router ips from
  * external ports  on chassis not binding those ports.
  * This makes the router pipeline to be run only on the chassis
  * binding the external ports. */
@@ -8885,7 +8885,7 @@ build_lswitch_external_port(struct ovn_port *op,
 }
 }

-/* Ingress table 24: Destination lookup, broadcast and multicast handling
+/* Ingress table 25: Destination lookup, broadcast and multicast handling
  * (priority 70 - 100). */
 static void
 build_lswitch_destination_lookup_bmcast(struct 

[ovs-dev] [PATCH ovn v3 4/4] pinctrl: Send RARPs for external ipv6 interfaces

2022-11-04 Thread Felix Hüttner via dev
Previously garps/rarps were only sent for NAT IPs if these had an
ipv4 address attached. For lsp's on gateway routers that do not have
an ipv4 address assigned (e.g. if they are ipv6 only) no rarps were
sent out.

This causes traffic outages when changing the priority of a gateway
chassis, as the physical switches do not get the information where the
mac address now resides. To fix this, we send out rarps with just the mac
address of the interface and no ip address.
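
What "just the mac address and no ip address" means on the wire can be
sketched as follows. This is a stand-alone illustration using the general
Ethernet/RARP frame layout (EtherType 0x8035, opcode 3 for a reverse
request), not the pinctrl implementation; the function name `build_rarp`
and the choice of a broadcast destination are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN        6
#define RARP_FRAME_LEN 42   /* 14-byte Ethernet header + 28-byte ARP body */

/*
 * Hypothetical helper: fill `frame` with a RARP request that announces
 * `mac` and carries all-zero IPv4 addresses. Returns the frame length.
 */
static size_t build_rarp(const uint8_t mac[MAC_LEN],
                         uint8_t frame[RARP_FRAME_LEN])
{
    uint8_t *p = frame;

    assert(mac != NULL && frame != NULL);
    memset(p, 0xff, MAC_LEN); p += MAC_LEN;   /* destination: broadcast (assumed) */
    memcpy(p, mac, MAC_LEN);  p += MAC_LEN;   /* source MAC */
    *p++ = 0x80; *p++ = 0x35;                 /* EtherType: RARP */
    *p++ = 0x00; *p++ = 0x01;                 /* hardware type: Ethernet */
    *p++ = 0x08; *p++ = 0x00;                 /* protocol type: IPv4 */
    *p++ = MAC_LEN;                           /* hardware address length */
    *p++ = 4;                                 /* protocol address length */
    *p++ = 0x00; *p++ = 0x03;                 /* opcode: reverse request */
    memcpy(p, mac, MAC_LEN);  p += MAC_LEN;   /* sender hardware address */
    memset(p, 0, 4);          p += 4;         /* sender IP: none */
    memcpy(p, mac, MAC_LEN);  p += MAC_LEN;   /* target hardware address */
    memset(p, 0, 4);          p += 4;         /* target IP: none */
    return (size_t)(p - frame);
}
```

A physical switch only needs the source MAC of such a frame to update its
forwarding table, which is why the zeroed IP fields are sufficient for the
failover case described above.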

This change has been tested in an environment with 600 logical routers
on a single ipv6 external network.

Acked-by: Numan Siddique 
Signed-off-by: Felix Huettner 
---
 controller/pinctrl.c | 23 +
 tests/ovn.at | 80 +---
 2 files changed, 99 insertions(+), 4 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 8859cb080..767fa02d8 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -4512,6 +4512,24 @@ send_garp_rarp_update(struct ovsdb_idl_txn 
*ovnsb_idl_txn,
 }
 free(name);
 }
+/*
+ * Send RARPs even if we do not have a ipv4 address as it e.g.
+ * happens on ipv6 only ports.
+ */
+if (laddrs->n_ipv4_addrs == 0) {
+char *name = xasprintf("%s-noip",
+   binding_rec->logical_port);
+garp_rarp = shash_find_data(&send_garp_rarp_data, name);
+if (garp_rarp) {
+garp_rarp->dp_key = binding_rec->datapath->tunnel_key;
+garp_rarp->port_key = binding_rec->tunnel_key;
+} else {
+add_garp_rarp(name, laddrs->ea,
+  0, binding_rec->datapath->tunnel_key,
+  binding_rec->tunnel_key);
+}
+free(name);
+}
 destroy_lport_addresses(laddrs);
 free(laddrs);
 }
@@ -5824,6 +5842,11 @@ consider_nat_address(struct ovsdb_idl_index 
*sbrec_port_binding_by_name,
 sset_add(nat_address_keys, name);
 free(name);
 }
+if (laddrs->n_ipv4_addrs == 0) {
+char *name = xasprintf("%s-noip", pb->logical_port);
+sset_add(nat_address_keys, name);
+free(name);
+}
 shash_add(nat_addresses, pb->logical_port, laddrs);
 }

diff --git a/tests/ovn.at b/tests/ovn.at
index 184fc0fdd..6552681bd 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -9017,6 +9017,76 @@ OVN_CLEANUP([hv1])
 AT_CLEANUP
 ])

+OVN_FOR_EACH_NORTHD([
+AT_SETUP([send reverse arp for router without ipv4 address])
+ovn_start
+# Create logical switch
+ovn-nbctl ls-add ls0
+# Create gateway router
+ovn-nbctl create Logical_Router name=lr0 options:chassis=hv1
+# Add router port to gateway router
+ovn-nbctl lrp-add lr0 lrp0 f0:00:00:00:00:01 fd12:3456:789a:1::1/64
+ovn-nbctl lsp-add ls0 lrp0-rp -- set Logical_Switch_Port lrp0-rp \
+type=router options:router-port=lrp0 addresses='"f0:00:00:00:00:01"'
+# Add nat-address option
+ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router"
+
+net_add n1
+sim_add hv1
+as hv1
+ovs-vsctl \
+-- add-br br-phys \
+-- add-br br-eth0
+
+ovn_attach n1 br-phys fd12:3456:789a:1::1 64
+
+AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=physnet1:br-eth0])
+AT_CHECK([ovs-vsctl add-port br-eth0 snoopvif -- set Interface snoopvif options:tx_pcap=hv1/snoopvif-tx.pcap options:rxq_pcap=hv1/snoopvif-rx.pcap])
+
+# Create a localnet port.
+AT_CHECK([ovn-nbctl lsp-add ls0 ln_port])
+AT_CHECK([ovn-nbctl lsp-set-addresses ln_port unknown])
+AT_CHECK([ovn-nbctl lsp-set-type ln_port localnet])
+AT_CHECK([ovn-nbctl lsp-set-options ln_port network_name=physnet1])
+
+# Wait until the patch ports are created to connect br-int to br-eth0
+OVS_WAIT_UNTIL([test 1 = `ovs-vsctl show | \
+grep "Port patch-br-int-to-ln_port" | wc -l`])
+
+ovn-sbctl list port_binding lrp0-rp
+echo "*"
+ovn-nbctl list logical_switch_port lrp0-rp
+ovn-nbctl list logical_router_port lrp0
+ovn-nbctl show
+# Wait for packet to be received.
+OVS_WAIT_UNTIL([test `wc -c < "hv1/snoopvif-tx.pcap"` -ge 50])
+$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/snoopvif-tx.pcap  | sort | uniq > packets
+expected="f00180350001080006040003f001f001"
+echo $expected > expout
+AT_CHECK([sort packets], [0], [expout])
+
+# Temporarily remove nat-addresses option to avoid race conditions
+# due to GARP backoff
+ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses=""
+
+as hv1 reset_pcap_file snoopvif hv1/snoopvif
+
+# Re-add nat-addresses option
+ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router" exclude-lb-vips-from-garp="true"
+
+# Wait for packets to be received.
+OVS_WAIT_UNTIL([test `wc -c < "hv1/snoopvif-tx.pcap"` -ge 50])
+
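
As a reading aid for the pinctrl change above: the announcement it describes is a RARP frame that carries the port's MAC address and no IP address. The following is a minimal sketch in C of how such a frame could be laid out; it is not the OVN implementation. The helper name build_rarp_announce, the broadcast destination and the choice to repeat the MAC as target hardware address are assumptions made for illustration. The EtherType 0x8035 and opcode 3 are the values that appear in the expected packet of the test case.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ADDR_LEN   6
#define RARP_FRAME_LEN 42   /* 14-byte Ethernet header + 28-byte ARP body */

/* Hypothetical helper (not part of OVN): writes a broadcast RARP
 * "request reverse" frame announcing 'mac' into 'buf'.
 * Returns the frame length, or 0 if 'buf' is too small. */
static size_t
build_rarp_announce(const uint8_t mac[ETH_ADDR_LEN], uint8_t *buf,
                    size_t buf_len)
{
    if (buf_len < RARP_FRAME_LEN) {
        return 0;
    }
    memset(buf, 0, RARP_FRAME_LEN);

    memset(buf, 0xff, ETH_ADDR_LEN);           /* eth.dst: broadcast (assumed) */
    memcpy(buf + 6, mac, ETH_ADDR_LEN);        /* eth.src: announced MAC */
    buf[12] = 0x80; buf[13] = 0x35;            /* eth.type: RARP */

    buf[14] = 0x00; buf[15] = 0x01;            /* hardware type: Ethernet */
    buf[16] = 0x08; buf[17] = 0x00;            /* protocol type: IPv4 */
    buf[18] = ETH_ADDR_LEN;                    /* hardware address length */
    buf[19] = 4;                               /* protocol address length */
    buf[20] = 0x00; buf[21] = 0x03;            /* opcode 3: request reverse */

    memcpy(buf + 22, mac, ETH_ADDR_LEN);       /* sender hardware address */
    /* bytes 28..31: sender protocol address, left as 0.0.0.0 */
    memcpy(buf + 32, mac, ETH_ADDR_LEN);       /* target hardware address (assumed) */
    /* bytes 38..41: target protocol address, left as 0.0.0.0 */

    return RARP_FRAME_LEN;
}
```

A physical switch only needs the Ethernet source address of such a frame to update its forwarding table, which is why the IP fields can stay zero on an IPv6-only port.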

[ovs-dev] [PATCH ovn v3 2/4] northd: handle own rarps like garps

2022-11-04 Thread Felix Hüttner via dev
Previously graceful rarps sent from ovn-controller were handled as
normal packets and flooded to other routers. As the other routers should
already have that information, we can skip flooding (just like it is done
for GARPs already) and thereby mitigate ovs refusing to send the packet
because of too many resubmits.

This change has been tested in combination with the previous one in the
series and works well in environments which contain an external ipv6
network with 600 ovn logical routers.

Acked-by: Numan Siddique 
Signed-off-by: Felix Huettner 
---
 northd/northd.c | 11 ++-
 northd/ovn-northd.8.xml |  4 ++--
 tests/ovn-northd.at | 18 +-
 tests/ovn.at|  8 +++-
 4 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index b7388afc5..e1f3bace8 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -7336,8 +7336,8 @@ build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
 }

 /*
- * Ingress table 24: Flows that flood self originated ARP/ND packets in the
- * switching domain.
+ * Ingress table 24: Flows that flood self originated ARP/RARP/ND packets in
+ * the switching domain.
  */
 static void
 build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
@@ -7369,7 +7369,7 @@ build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
 sset_add(_eth_addrs, nat->external_mac);
 }

-/* Self originated ARP requests/ND need to be flooded to the L2 domain
+/* Self originated ARP requests/RARP/ND need to be flooded to the L2 domain
  * (except on router ports).  Determine that packets are self originated
  * by also matching on source MAC. Matching on ingress port is not
  * reliable in case this is a VLAN-backed network.
@@ -7385,7 +7385,8 @@ build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
 ds_chomp(_src, ',');
 ds_put_cstr(_src, "}");

-ds_put_format(, "eth.src == %s && (arp.op == 1 || nd_ns)",
+ds_put_format(,
+  "eth.src == %s && (arp.op == 1 || rarp.op == 3 || nd_ns)",
   ds_cstr(_src));
 ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, priority, ds_cstr(),
   "outport = \""MC_FLOOD_L2"\"; output;");
@@ -7581,7 +7582,7 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
 lflows, stage_hint);
 }

-/* Self originated ARP requests/ND need to be flooded as usual.
+/* Self originated ARP requests/RARP/ND need to be flooded as usual.
  *
  * However, if the switch doesn't have any non-router ports we shouldn't
  * even try to flood.
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index a70f2e678..051f3dc6e 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1723,8 +1723,8 @@ output;

   
 Priority-75 flows for each port connected to a logical router
-matching self originated ARP request/ND packets.  These packets
-are flooded to the MC_FLOOD_L2 which contains all
+matching self originated ARP request/RARP request/ND packets.  These
+packets are flooded to the MC_FLOOD_L2 which contains all
 non-router logical ports.
   

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 4f399eccb..e849afd85 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -4759,7 +4759,7 @@ AT_CHECK([grep "ls_in_l2_lkup" ls1_lflows | sed 
's/table=../table=??/' | sort],
   table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 
00:00:00:00:01:01), action=(outport = "ls1-ro1"; output;)
   table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 
00:00:00:00:01:02), action=(outport = "vm1"; output;)
   table=??(ls_in_l2_lkup  ), priority=70   , match=(eth.mcast), 
action=(outport = "_MC_flood"; output;)
-  table=??(ls_in_l2_lkup  ), priority=75   , match=(eth.src == 
{00:00:00:00:01:01} && (arp.op == 1 || nd_ns)), action=(outport = 
"_MC_flood_l2"; output;)
+  table=??(ls_in_l2_lkup  ), priority=75   , match=(eth.src == 
{00:00:00:00:01:01} && (arp.op == 1 || rarp.op == 3 || nd_ns)), action=(outport 
= "_MC_flood_l2"; output;)
   table=??(ls_in_l2_lkup  ), priority=80   , match=(flags[[1]] == 0 && 
arp.op == 1 && arp.tpa == 192.168.1.1), action=(clone {outport = "ls1-ro1"; 
output; }; outport = "_MC_flood_l2"; output;)
   table=??(ls_in_l2_lkup  ), priority=80   , match=(flags[[1]] == 0 && 
nd_ns && nd.target == fe80::200:ff:fe00:101), action=(clone {outport = 
"ls1-ro1"; output; }; outport = "_MC_flood_l2"; output;)
 ])
@@ -4771,7 +4771,7 @@ AT_CHECK([grep "ls_in_l2_lkup" ls2_lflows | sed 
's/table=../table=??/' | sort],
   table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 
00:00:00:00:02:01), action=(outport = "ls2-ro2"; output;)
   table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 
00:00:00:00:02:02), action=(outport = "vm2"; output;)
   table=??(ls_in_l2_lkup  ), priority=70   , 
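
To illustrate the match that the northd change above installs, here is a small sketch in C of the predicate "eth.src is one of the router's MACs && (arp.op == 1 || rarp.op == 3 || nd_ns)". It is only a model of the logical-flow match, not code from northd. The struct pkt_fields and the function is_self_orig_flood are hypothetical names, and the packet is assumed to be already parsed into those fields.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, pre-parsed view of a packet (illustration only). */
struct pkt_fields {
    uint8_t eth_src[6];
    uint16_t eth_type;   /* 0x0806 for ARP, 0x8035 for RARP */
    uint16_t arp_op;     /* opcode field, same position for ARP and RARP */
    bool is_nd_ns;       /* true for an IPv6 neighbor solicitation */
};

/* Returns true if the packet would hit the priority-75 flow that floods
 * self-originated ARP requests, RARP requests and ND NS packets to the
 * non-router ports only. */
static bool
is_self_orig_flood(const struct pkt_fields *p, const uint8_t macs[][6],
                   size_t n_macs)
{
    bool src_is_router = false;
    for (size_t i = 0; i < n_macs; i++) {
        if (!memcmp(p->eth_src, macs[i], 6)) {
            src_is_router = true;
            break;
        }
    }

    bool arp_request = p->eth_type == 0x0806 && p->arp_op == 1;
    bool rarp_request = p->eth_type == 0x8035 && p->arp_op == 3;

    return src_is_router && (arp_request || rarp_request || p->is_nd_ns);
}
```

Packets that satisfy this predicate go to MC_FLOOD_L2, which contains only non-router ports, so other routers on the same switch never see them and the resubmit limit is not exhausted.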

[ovs-dev] [PATCH ovn v3 3/4] ovn-macros: support ipv6 in ovn_attach

2022-11-04 Thread Felix Hüttner via dev
To make it easy to add future IPv6 test cases, the common `ovn_attach`
function should also support IPv6 addresses.

Acked-by: Numan Siddique 
Signed-off-by: Felix Huettner 
---
 tests/ovn-macros.at | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
index b234019a9..ee6e09d39 100644
--- a/tests/ovn-macros.at
+++ b/tests/ovn-macros.at
@@ -288,14 +288,19 @@ net_attach () {
 || return 1
 }

-# ovn_az_attach AZ NETWORK BRIDGE IP [MASKLEN]
+# ovn_az_attach AZ NETWORK BRIDGE IP [MASKLEN] [ENCAP]
 ovn_az_attach() {
 local az=$1 net=$2 bridge=$3 ip=$4 masklen=${5-24} encap=${6-geneve,vxlan}
 net_attach $net $bridge || return 1

 mac=`ovs-vsctl get Interface $bridge mac_in_use | sed s/\"//g`
 arp_table="$arp_table $sandbox,$bridge,$ip,$mac"
-ovs-appctl netdev-dummy/ip4addr $bridge $ip/$masklen >/dev/null || return 1
+if test -z $(echo $ip | sed '/:/d'); then
+ipversion="6"
+else
+ipversion="4"
+fi
+ovs-appctl netdev-dummy/ip${ipversion}addr $bridge $ip/$masklen >/dev/null || return 1
 ovs-appctl ovs/route/add $ip/$masklen $bridge >/dev/null || return 1

 local ovn_remote
@@ -329,7 +334,7 @@ ovn_az_attach() {
 start_daemon ovn-controller --enable-dummy-vif-plug || return 1
 }

-# ovn_attach NETWORK BRIDGE IP [MASKLEN]
+# ovn_attach NETWORK BRIDGE IP [MASKLEN] [ENCAP]
 #
 # First, this command attaches BRIDGE to interconnection network NETWORK, just
 # like "net_attach NETWORK BRIDGE".  Second, it configures (simulated) IP
--
2.38.1
This e-mail may contain confidential content and is intended only for
use by the intended recipient. If you are not the intended recipient,
please notify the sender immediately and delete this e-mail.
Information on data protection can be found here.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
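
The address-family check in the ovn-macros patch above can be restated outside the shell. The sketch below, in C, mirrors the same rule: an address that contains a ':' is treated as IPv6, anything else as IPv4. The function names ip_version_of and dummy_addr_command are hypothetical and exist only for this illustration; no address validation is attempted, just as in the shell version.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 6 if 'addr' contains a colon, else 4.
 * This mirrors the sed-based check in the patch and does not validate
 * the address. */
static int
ip_version_of(const char *addr)
{
    return strchr(addr, ':') ? 6 : 4;
}

/* Hypothetical helper: formats the netdev-dummy command name that the
 * macro selects for 'addr'. Returns 1 on success, 0 if 'out' is too
 * small. */
static int
dummy_addr_command(const char *addr, char *out, size_t out_len)
{
    int n = snprintf(out, out_len, "netdev-dummy/ip%daddr",
                     ip_version_of(addr));
    return n > 0 && (size_t) n < out_len;
}
```

Deriving the version from the address keeps the call sites unchanged, which is the simplification that v3 of this patch makes compared with the explicit IPVERSION argument in v2.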


[ovs-dev] [PATCH ovn v3 1/4] logical-fields: add rarp fields

2022-11-04 Thread Felix Hüttner via dev
We need to be able to handle rarp fields in order to ensure we can
handle rarp messages we send ourselves.
This will be used by the next patch in the series.

Acked-by: Numan Siddique 
Signed-off-by: Felix Huettner 
---
 lib/logical-fields.c | 8 
 lib/ovn-util.c   | 2 +-
 ovn-sb.xml   | 2 ++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index ed3ec62e1..fc131791e 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -261,6 +261,14 @@ ovn_init_symtab(struct shash *symtab)
 expr_symtab_add_field(symtab, "arp.tpa", MFF_ARP_TPA, "arp", false);
 expr_symtab_add_field(symtab, "arp.tha", MFF_ARP_THA, "arp", false);

+/* RARPs use the same layout as arp packets -> use the same field_id */
+expr_symtab_add_predicate(symtab, "rarp", "eth.type == 0x8035");
+expr_symtab_add_field(symtab, "rarp.op", MFF_ARP_OP, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.spa", MFF_ARP_SPA, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.sha", MFF_ARP_SHA, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.tpa", MFF_ARP_TPA, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.tha", MFF_ARP_THA, "rarp", false);
+
 expr_symtab_add_predicate(symtab, "nd",
   "icmp6.type == {135, 136} && icmp6.code == 0 && ip.ttl == 255");
 expr_symtab_add_predicate(symtab, "nd_ns",
diff --git a/lib/ovn-util.c b/lib/ovn-util.c
index 5dca72714..597625a29 100644
--- a/lib/ovn-util.c
+++ b/lib/ovn-util.c
@@ -817,7 +817,7 @@ ip_address_and_port_from_lb_key(const char *key, char **ip_address,
  *
  * This value is also used to handle some backward compatibility during
  * upgrading. It should never decrease or rewind. */
-#define OVN_INTERNAL_MINOR_VER 4
+#define OVN_INTERNAL_MINOR_VER 5

 /* Returns the OVN version. The caller must free the returned value. */
 char *
diff --git a/ovn-sb.xml b/ovn-sb.xml
index 315d60853..42e6fa3ee 100644
--- a/ovn-sb.xml
+++ b/ovn-sb.xml
@@ -1052,6 +1052,7 @@
 ip4.src ip4.dst
 ip6.src ip6.dst 
ip6.label
 arp.op arp.spa arp.tpa 
arp.sha arp.tha
+rarp.op rarp.spa rarp.tpa 
rarp.sha rarp.tha
 tcp.src tcp.dst 
tcp.flags
 udp.src udp.dst
 sctp.src sctp.dst
@@ -1115,6 +1116,7 @@
 ip.later_frag expands to ip.frag[1]
 ip.first_frag expands to ip.is_frag  
!ip.later_frag
 arp expands to eth.type == 0x806
+rarp expands to eth.type == 0x8035
 nd expands to icmp6.type == {135, 136} 
 icmp6.code == 0  ip.ttl == 255
 nd_ns expands to icmp6.type == 135  
icmp6.code == 0  ip.ttl == 255
 nd_na expands to icmp6.type == 136  
icmp6.code == 0  ip.ttl == 255
--
2.38.1
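
The logical-fields patch above relies on RARP and ARP sharing one packet layout, so the rarp.* symbols can reuse the ARP field IDs and differ only in the EtherType prerequisite. A minimal sketch in C of that idea follows. It assumes an untagged Ethernet frame, and the names arp_family and read_arp_op are hypothetical and not part of OVN or OVS.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_TYPE_ARP  0x0806
#define ETH_TYPE_RARP 0x8035

enum arp_family { NOT_ARP, IS_ARP, IS_RARP };

/* Hypothetical helper: classifies an untagged Ethernet frame by its
 * EtherType and, for ARP and RARP alike, reads the opcode from the same
 * offset (bytes 20..21). '*op' is only written for ARP/RARP frames. */
static enum arp_family
read_arp_op(const uint8_t *frame, size_t len, uint16_t *op)
{
    if (len < 22) {
        return NOT_ARP;
    }

    uint16_t eth_type = (uint16_t) (frame[12] << 8 | frame[13]);
    if (eth_type != ETH_TYPE_ARP && eth_type != ETH_TYPE_RARP) {
        return NOT_ARP;
    }

    *op = (uint16_t) (frame[20] << 8 | frame[21]);
    return eth_type == ETH_TYPE_ARP ? IS_ARP : IS_RARP;
}
```

Because the offset is identical, a match such as "rarp.op == 3" only needs a different EtherType prerequisite than "arp.op == 1", which is what the added predicate "rarp" provides.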


[ovs-dev] [PATCH ovn v3 0/4] Send Rarps for ipv6 router lsp

2022-11-04 Thread Felix Hüttner via dev
Previously, GARPs/RARPs were only sent for "external" LSPs if these
had an IPv4 address attached. For LSPs on gateway routers that do
not have an IPv4 address assigned (e.g. if they are IPv6-only), no
RARPs were sent out.

This causes traffic outages when changing the priority of a gateway
chassis, as the physical switches do not learn where the MAC address
now resides. To fix this, we send out RARPs with just the MAC address
of the interface and no IP address.

This change has been tested in an environment with 600 logical routers
on a single ipv6 external network.

Additionally we fix the issue that self-created rarp's are flooded
to logical routers even if this is unnecessary (causing ovs to potentially
drop the packet because of too many resubmits).

This change is also available as a PR at https://github.com/ovn-org/ovn/pull/157

Changes since v2:
- simplified the support of ipv6 in ovn_attach
Changes since v1:
- fix documentation
- remove unnecessary ddlog change

Felix Huettner (4):
  logical-fields: add rarp fields
  northd: handle own rarps like garps
  ovn-macros: support ipv6 in ovn_attach
  pinctrl: Send RARPs for external ipv6 interfaces

 controller/pinctrl.c| 23 +++
 lib/logical-fields.c|  8 
 lib/ovn-util.c  |  2 +-
 northd/northd.c | 11 +++---
 northd/ovn-northd.8.xml |  4 +-
 ovn-sb.xml  |  2 +
 tests/ovn-macros.at | 11 --
 tests/ovn-northd.at | 18 -
 tests/ovn.at| 88 ++---
 9 files changed, 142 insertions(+), 25 deletions(-)

--
2.38.1


[ovs-dev] [PATCH ovn v2 2/4] northd: handle own rarps like garps

2022-11-03 Thread Felix Hüttner via dev
Previously graceful rarps sent from ovn-controller were handled as
normal packets and flooded to other routers. As the other routers should
already have that information, we can skip flooding (just like it is done
for GARPs already) and thereby mitigate ovs refusing to send the packet
because of too many resubmits.

This change has been tested in combination with the previous one in the
series and works well in environments which contain an external ipv6
network with 600 ovn logical routers.

Acked-by: Numan Siddique 
Signed-off-by: Felix Huettner 
---
 northd/northd.c | 11 ++-
 northd/ovn-northd.8.xml |  4 ++--
 tests/ovn-northd.at | 18 +-
 tests/ovn.at|  8 +++-
 4 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index b7388afc5..e1f3bace8 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -7336,8 +7336,8 @@ build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
 }

 /*
- * Ingress table 24: Flows that flood self originated ARP/ND packets in the
- * switching domain.
+ * Ingress table 24: Flows that flood self originated ARP/RARP/ND packets in
+ * the switching domain.
  */
 static void
 build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
@@ -7369,7 +7369,7 @@ build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
 sset_add(_eth_addrs, nat->external_mac);
 }

-/* Self originated ARP requests/ND need to be flooded to the L2 domain
+/* Self originated ARP requests/RARP/ND need to be flooded to the L2 domain
  * (except on router ports).  Determine that packets are self originated
  * by also matching on source MAC. Matching on ingress port is not
  * reliable in case this is a VLAN-backed network.
@@ -7385,7 +7385,8 @@ build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
 ds_chomp(_src, ',');
 ds_put_cstr(_src, "}");

-ds_put_format(, "eth.src == %s && (arp.op == 1 || nd_ns)",
+ds_put_format(,
+  "eth.src == %s && (arp.op == 1 || rarp.op == 3 || nd_ns)",
   ds_cstr(_src));
 ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, priority, ds_cstr(),
   "outport = \""MC_FLOOD_L2"\"; output;");
@@ -7581,7 +7582,7 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
 lflows, stage_hint);
 }

-/* Self originated ARP requests/ND need to be flooded as usual.
+/* Self originated ARP requests/RARP/ND need to be flooded as usual.
  *
  * However, if the switch doesn't have any non-router ports we shouldn't
  * even try to flood.
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index a70f2e678..051f3dc6e 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -1723,8 +1723,8 @@ output;

   
 Priority-75 flows for each port connected to a logical router
-matching self originated ARP request/ND packets.  These packets
-are flooded to the MC_FLOOD_L2 which contains all
+matching self originated ARP request/RARP request/ND packets.  These
+packets are flooded to the MC_FLOOD_L2 which contains all
 non-router logical ports.
   

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 4f399eccb..e849afd85 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -4759,7 +4759,7 @@ AT_CHECK([grep "ls_in_l2_lkup" ls1_lflows | sed 
's/table=../table=??/' | sort],
   table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 
00:00:00:00:01:01), action=(outport = "ls1-ro1"; output;)
   table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 
00:00:00:00:01:02), action=(outport = "vm1"; output;)
   table=??(ls_in_l2_lkup  ), priority=70   , match=(eth.mcast), 
action=(outport = "_MC_flood"; output;)
-  table=??(ls_in_l2_lkup  ), priority=75   , match=(eth.src == 
{00:00:00:00:01:01} && (arp.op == 1 || nd_ns)), action=(outport = 
"_MC_flood_l2"; output;)
+  table=??(ls_in_l2_lkup  ), priority=75   , match=(eth.src == 
{00:00:00:00:01:01} && (arp.op == 1 || rarp.op == 3 || nd_ns)), action=(outport 
= "_MC_flood_l2"; output;)
   table=??(ls_in_l2_lkup  ), priority=80   , match=(flags[[1]] == 0 && 
arp.op == 1 && arp.tpa == 192.168.1.1), action=(clone {outport = "ls1-ro1"; 
output; }; outport = "_MC_flood_l2"; output;)
   table=??(ls_in_l2_lkup  ), priority=80   , match=(flags[[1]] == 0 && 
nd_ns && nd.target == fe80::200:ff:fe00:101), action=(clone {outport = 
"ls1-ro1"; output; }; outport = "_MC_flood_l2"; output;)
 ])
@@ -4771,7 +4771,7 @@ AT_CHECK([grep "ls_in_l2_lkup" ls2_lflows | sed 
's/table=../table=??/' | sort],
   table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 
00:00:00:00:02:01), action=(outport = "ls2-ro2"; output;)
   table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 
00:00:00:00:02:02), action=(outport = "vm2"; output;)
   table=??(ls_in_l2_lkup  ), priority=70   , 

[ovs-dev] [PATCH ovn v2 4/4] pinctrl: Send RARPs for external ipv6 interfaces

2022-11-03 Thread Felix Hüttner via dev
Previously, GARPs/RARPs were only sent for NAT IPs if these had an
IPv4 address attached. For LSPs on gateway routers that do not have
an IPv4 address assigned (e.g. if they are IPv6-only), no RARPs were
sent out.

This causes traffic outages when changing the priority of a gateway
chassis, as the physical switches do not learn where the MAC address
now resides. To fix this, we send out RARPs with just the MAC address
of the interface and no IP address.

This change has been tested in an environment with 600 logical routers
on a single ipv6 external network.

Acked-by: Numan Siddique 
Signed-off-by: Felix Huettner 
---
 controller/pinctrl.c | 23 +
 tests/ovn.at | 80 +---
 2 files changed, 99 insertions(+), 4 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 8859cb080..767fa02d8 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -4512,6 +4512,24 @@ send_garp_rarp_update(struct ovsdb_idl_txn *ovnsb_idl_txn,
 }
 free(name);
 }
+/*
+ * Send RARPs even if we do not have a ipv4 address as it e.g.
+ * happens on ipv6 only ports.
+ */
+if (laddrs->n_ipv4_addrs == 0) {
+char *name = xasprintf("%s-noip",
+   binding_rec->logical_port);
+garp_rarp = shash_find_data(_garp_rarp_data, name);
+if (garp_rarp) {
+garp_rarp->dp_key = binding_rec->datapath->tunnel_key;
+garp_rarp->port_key = binding_rec->tunnel_key;
+} else {
+add_garp_rarp(name, laddrs->ea,
+  0, binding_rec->datapath->tunnel_key,
+  binding_rec->tunnel_key);
+}
+free(name);
+}
 destroy_lport_addresses(laddrs);
 free(laddrs);
 }
@@ -5824,6 +5842,11 @@ consider_nat_address(struct ovsdb_idl_index *sbrec_port_binding_by_name,
 sset_add(nat_address_keys, name);
 free(name);
 }
+if (laddrs->n_ipv4_addrs == 0) {
+char *name = xasprintf("%s-noip", pb->logical_port);
+sset_add(nat_address_keys, name);
+free(name);
+}
 shash_add(nat_addresses, pb->logical_port, laddrs);
 }

diff --git a/tests/ovn.at b/tests/ovn.at
index 3d54c9153..6b08ef124 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -9017,6 +9017,76 @@ OVN_CLEANUP([hv1])
 AT_CLEANUP
 ])

+OVN_FOR_EACH_NORTHD([
+AT_SETUP([send reverse arp for router without ipv4 address])
+ovn_start
+# Create logical switch
+ovn-nbctl ls-add ls0
+# Create gateway router
+ovn-nbctl create Logical_Router name=lr0 options:chassis=hv1
+# Add router port to gateway router
+ovn-nbctl lrp-add lr0 lrp0 f0:00:00:00:00:01 fd12:3456:789a:1::1/64
+ovn-nbctl lsp-add ls0 lrp0-rp -- set Logical_Switch_Port lrp0-rp \
+type=router options:router-port=lrp0 addresses='"f0:00:00:00:00:01"'
+# Add nat-address option
+ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router"
+
+net_add n1
+sim_add hv1
+as hv1
+ovs-vsctl \
+-- add-br br-phys \
+-- add-br br-eth0
+
+ovn_attach n1 br-phys fd12:3456:789a:1::1 64 6
+
+AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=physnet1:br-eth0])
+AT_CHECK([ovs-vsctl add-port br-eth0 snoopvif -- set Interface snoopvif options:tx_pcap=hv1/snoopvif-tx.pcap options:rxq_pcap=hv1/snoopvif-rx.pcap])
+
+# Create a localnet port.
+AT_CHECK([ovn-nbctl lsp-add ls0 ln_port])
+AT_CHECK([ovn-nbctl lsp-set-addresses ln_port unknown])
+AT_CHECK([ovn-nbctl lsp-set-type ln_port localnet])
+AT_CHECK([ovn-nbctl lsp-set-options ln_port network_name=physnet1])
+
+# Wait until the patch ports are created to connect br-int to br-eth0
+OVS_WAIT_UNTIL([test 1 = `ovs-vsctl show | \
+grep "Port patch-br-int-to-ln_port" | wc -l`])
+
+ovn-sbctl list port_binding lrp0-rp
+echo "*"
+ovn-nbctl list logical_switch_port lrp0-rp
+ovn-nbctl list logical_router_port lrp0
+ovn-nbctl show
+# Wait for packet to be received.
+OVS_WAIT_UNTIL([test `wc -c < "hv1/snoopvif-tx.pcap"` -ge 50])
+$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/snoopvif-tx.pcap  | sort | uniq > packets
+expected="f00180350001080006040003f001f001"
+echo $expected > expout
+AT_CHECK([sort packets], [0], [expout])
+
+# Temporarily remove nat-addresses option to avoid race conditions
+# due to GARP backoff
+ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses=""
+
+as hv1 reset_pcap_file snoopvif hv1/snoopvif
+
+# Re-add nat-addresses option
+ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router" exclude-lb-vips-from-garp="true"
+
+# Wait for packets to be received.
+OVS_WAIT_UNTIL([test `wc -c < "hv1/snoopvif-tx.pcap"` -ge 50])
+

[ovs-dev] [PATCH ovn v2 3/4] ovn-macros: support ipv6 in ovn_attach

2022-11-03 Thread Felix Hüttner via dev
To make it easy to add future IPv6 test cases, the common `ovn_attach`
function should also support IPv6 addresses.

Acked-by: Numan Siddique 
Signed-off-by: Felix Huettner 
---
 tests/ovn-macros.at |  9 +
 tests/ovn.at| 22 +++---
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
index b234019a9..9c7f60113 100644
--- a/tests/ovn-macros.at
+++ b/tests/ovn-macros.at
@@ -288,14 +288,14 @@ net_attach () {
 || return 1
 }

-# ovn_az_attach AZ NETWORK BRIDGE IP [MASKLEN]
+# ovn_az_attach AZ NETWORK BRIDGE IP [MASKLEN] [IPVERSION] [ENCAP]
 ovn_az_attach() {
-local az=$1 net=$2 bridge=$3 ip=$4 masklen=${5-24} encap=${6-geneve,vxlan}
+local az=$1 net=$2 bridge=$3 ip=$4 masklen=${5-24} ipversion=${6-4} encap=${7-geneve,vxlan}
 net_attach $net $bridge || return 1

 mac=`ovs-vsctl get Interface $bridge mac_in_use | sed s/\"//g`
 arp_table="$arp_table $sandbox,$bridge,$ip,$mac"
-ovs-appctl netdev-dummy/ip4addr $bridge $ip/$masklen >/dev/null || return 1
+ovs-appctl netdev-dummy/ip${ipversion}addr $bridge $ip/$masklen >/dev/null || return 1
 ovs-appctl ovs/route/add $ip/$masklen $bridge >/dev/null || return 1

 local ovn_remote
@@ -329,13 +329,14 @@ ovn_az_attach() {
 start_daemon ovn-controller --enable-dummy-vif-plug || return 1
 }

-# ovn_attach NETWORK BRIDGE IP [MASKLEN]
+# ovn_attach NETWORK BRIDGE IP [MASKLEN] [IPVERSION] [ENCAP]
 #
 # First, this command attaches BRIDGE to interconnection network NETWORK, just
 # like "net_attach NETWORK BRIDGE".  Second, it configures (simulated) IP
 # address IP (with network mask length MASKLEN, which defaults to 24) on
 # BRIDGE.  Finally, it configures the Open vSwitch database to work with OVN
 # and starts ovn-controller.
+# IPVERSION must be set to 6 for ipv6 addresses.
 ovn_attach() {
 ovn_az_attach NONE $@
 }
diff --git a/tests/ovn.at b/tests/ovn.at
index 184fc0fdd..3d54c9153 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -3534,7 +3534,7 @@ for i in 1 2; do
 as hv-$i
 check ovs-vsctl add-br br-phys
 check ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
-ovn_attach net br-phys 192.168.0.$i 24 vxlan
+ovn_attach net br-phys 192.168.0.$i 24 4 vxlan
 done

 check ovn-nbctl ls-add ls
@@ -3983,7 +3983,7 @@ ovn_start
 net_add net
 check ovs-vsctl add-br br-phys
 check ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
-ovn_attach net br-phys 192.168.0.1 24 vxlan
+ovn_attach net br-phys 192.168.0.1 24 4 vxlan
 check ovn-nbctl --wait=sb sync
 OVS_WAIT_UNTIL([ovn-sbctl get chassis main _uuid])

@@ -22432,7 +22432,7 @@ m4_define([DVR_N_S_ARP_HANDLING],
ovs-vsctl add-br br-phys
ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
ovs-vsctl set open . 
external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
-   ovn_attach n1 br-phys 192.168.0.$i 24 $encap
+   ovn_attach n1 br-phys 192.168.0.$i 24 4 $encap

ovs-vsctl add-port br-int vif$i$i -- \
set Interface vif$i$i external-ids:iface-id=lp$i$i \
@@ -22473,14 +22473,14 @@ m4_define([DVR_N_S_ARP_HANDLING],
as hv3 ovs-vsctl add-br br-phys
as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
as hv3 ovs-vsctl set open . 
external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
-   as hv3 ovn_attach n1 br-phys 192.168.0.3 24 $encap
+   as hv3 ovn_attach n1 br-phys 192.168.0.3 24 4 $encap

# Add 4th hypervisor
sim_add hv4
as hv4 ovs-vsctl add-br br-phys
as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
as hv4 ovs-vsctl set open . 
external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
-   as hv4 ovn_attach n1 br-phys 192.168.0.4 24 $encap
+   as hv4 ovn_attach n1 br-phys 192.168.0.4 24 4 $encap

as hv4 ovs-vsctl add-port br-int vif-north -- \
set Interface vif-north external-ids:iface-id=lp-north \
@@ -22704,7 +22704,7 @@ m4_define([DVR_N_S_PING],
ovs-vsctl add-br br-phys
ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
ovs-vsctl set open . 
external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
-   ovn_attach n1 br-phys 192.168.0.$i 24 $encap
+   ovn_attach n1 br-phys 192.168.0.$i 24 4 $encap

ovs-vsctl add-port br-int vif$i$i -- \
set Interface vif$i$i external-ids:iface-id=lp$i$i \
@@ -22745,14 +22745,14 @@ m4_define([DVR_N_S_PING],
as hv3 ovs-vsctl add-br br-phys
as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
as hv3 ovs-vsctl set open . 
external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
-   as hv3 ovn_attach n1 br-phys 192.168.0.3 24 $encap
+   as hv3 ovn_attach n1 br-phys 192.168.0.3 24 4 $encap

# Add 4th hypervisor
sim_add hv4
as hv4 ovs-vsctl add-br br-phys
as hv4 ovs-vsctl set open . 

[ovs-dev] [PATCH ovn v2 0/4] Send Rarps for ipv6 router lsp

2022-11-03 Thread Felix Hüttner via dev
Previously, GARPs/RARPs were only sent for "external" LSPs if these
had an IPv4 address attached. For LSPs on gateway routers that do
not have an IPv4 address assigned (e.g. if they are IPv6-only), no
RARPs were sent out.

This causes traffic outages when changing the priority of a gateway
chassis, as the physical switches do not learn where the MAC address
now resides. To fix this, we send out RARPs with just the MAC address
of the interface and no IP address.

This change has been tested in an environment with 600 logical routers
on a single ipv6 external network.

Additionally we fix the issue that self-created rarp's are flooded
to logical routers even if this is unnecessary (causing ovs to potentially
drop the packet because of too many resubmits).

This change is also available as a PR at https://github.com/ovn-org/ovn/pull/157

Changes since v1:
- fix documentation
- remove unnecessary ddlog change

Felix Huettner (4):
  logical-fields: add rarp fields
  northd: handle own rarps like garps
  ovn-macros: support ipv6 in ovn_attach
  pinctrl: Send RARPs for external ipv6 interfaces

 controller/pinctrl.c|  23 +
 lib/logical-fields.c|   8 +++
 lib/ovn-util.c  |   2 +-
 northd/northd.c |   9 ++--
 northd/ovn-northd.8.xml |   4 +-
 ovn-sb.xml  |   2 +
 tests/ovn-macros.at |   9 ++--
 tests/ovn-northd.at |  18 +++
 tests/ovn.at| 110 ++--
 9 files changed, 149 insertions(+), 36 deletions(-)

--


[ovs-dev] [PATCH ovn v2 1/4] logical-fields: add rarp fields

2022-11-03 Thread Felix Hüttner via dev
We need to be able to handle rarp fields in order to ensure we can
handle rarp messages we send ourselves.
This will be used by the next patch in the series.

Acked-by: Numan Siddique 
Signed-off-by: Felix Huettner 
---
 lib/logical-fields.c | 8 
 lib/ovn-util.c   | 2 +-
 ovn-sb.xml   | 2 ++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index ed3ec62e1..fc131791e 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -261,6 +261,14 @@ ovn_init_symtab(struct shash *symtab)
 expr_symtab_add_field(symtab, "arp.tpa", MFF_ARP_TPA, "arp", false);
 expr_symtab_add_field(symtab, "arp.tha", MFF_ARP_THA, "arp", false);

+/* RARPs use the same layout as arp packets -> use the same field_id */
+expr_symtab_add_predicate(symtab, "rarp", "eth.type == 0x8035");
+expr_symtab_add_field(symtab, "rarp.op", MFF_ARP_OP, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.spa", MFF_ARP_SPA, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.sha", MFF_ARP_SHA, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.tpa", MFF_ARP_TPA, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.tha", MFF_ARP_THA, "rarp", false);
+
 expr_symtab_add_predicate(symtab, "nd",
   "icmp6.type == {135, 136} && icmp6.code == 0 && ip.ttl == 255");
 expr_symtab_add_predicate(symtab, "nd_ns",
diff --git a/lib/ovn-util.c b/lib/ovn-util.c
index 5dca72714..597625a29 100644
--- a/lib/ovn-util.c
+++ b/lib/ovn-util.c
@@ -817,7 +817,7 @@ ip_address_and_port_from_lb_key(const char *key, char **ip_address,
  *
  * This value is also used to handle some backward compatibility during
  * upgrading. It should never decrease or rewind. */
-#define OVN_INTERNAL_MINOR_VER 4
+#define OVN_INTERNAL_MINOR_VER 5

 /* Returns the OVN version. The caller must free the returned value. */
 char *
diff --git a/ovn-sb.xml b/ovn-sb.xml
index 315d60853..42e6fa3ee 100644
--- a/ovn-sb.xml
+++ b/ovn-sb.xml
@@ -1052,6 +1052,7 @@
 ip4.src ip4.dst
 ip6.src ip6.dst 
ip6.label
 arp.op arp.spa arp.tpa 
arp.sha arp.tha
+rarp.op rarp.spa rarp.tpa 
rarp.sha rarp.tha
 tcp.src tcp.dst 
tcp.flags
 udp.src udp.dst
 sctp.src sctp.dst
@@ -1115,6 +1116,7 @@
 ip.later_frag expands to ip.frag[1]
 ip.first_frag expands to ip.is_frag  
!ip.later_frag
 arp expands to eth.type == 0x806
+rarp expands to eth.type == 0x8035
 nd expands to icmp6.type == {135, 136} 
 icmp6.code == 0  ip.ttl == 255
 nd_ns expands to icmp6.type == 135  
icmp6.code == 0  ip.ttl == 255
 nd_na expands to icmp6.type == 136  
icmp6.code == 0  ip.ttl == 255
--
2.38.1


[ovs-dev] [PATCH ovn 4/4] pinctrl: Send RARPs for external ipv6 interfaces

2022-10-24 Thread Felix Hüttner via dev
Previously, GARPs/RARPs were only sent for NAT IPs if these had an
IPv4 address attached. For LSPs on gateway routers that do not have
an IPv4 address assigned (e.g. if they are IPv6-only), no RARPs were
sent out.

This causes traffic outages when changing the priority of a gateway
chassis, as the physical switches do not learn where the MAC address
now resides. To fix this, we send out RARPs with just the MAC
address of the interface and no IP address.

This change has been tested in an environment with 600 logical routers
on a single ipv6 external network.

Signed-off-by: Felix Huettner 
---
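Note for reviewers: a rough standalone sketch of what a MAC-only RARP announcement looks like on the wire, under stated assumptions. It is not code from pinctrl.c. The broadcast destination and the choice to set both the sender and target hardware address to the port MAC are assumptions; the function name `build_rarp_noip` and the constant `RARP_FRAME_LEN` are hypothetical. The EtherType 0x8035 and opcode 3 match the `rarp` predicate and the `rarp.op == 3` match used elsewhere in this series.

```c
#include <stdint.h>
#include <string.h>

#define RARP_FRAME_LEN 42   /* 14-byte Ethernet header + 28-byte ARP body. */

/* Fills 'out' with a broadcast RARP frame that carries only 'mac'.
 * Both protocol (IPv4) address fields are left as 0.0.0.0. */
static void
build_rarp_noip(const uint8_t mac[6], uint8_t out[RARP_FRAME_LEN])
{
    memset(out, 0, RARP_FRAME_LEN);   /* Zeroes spa and tpa as well. */
    memset(out, 0xff, 6);             /* eth.dst: broadcast (assumption). */
    memcpy(out + 6, mac, 6);          /* eth.src. */
    out[12] = 0x80;                   /* eth.type: 0x8035 (RARP). */
    out[13] = 0x35;
    out[15] = 0x01;                   /* Hardware type: Ethernet (1). */
    out[16] = 0x08;                   /* Protocol type: IPv4 (0x0800). */
    out[18] = 6;                      /* Hardware address length. */
    out[19] = 4;                      /* Protocol address length. */
    out[21] = 3;                      /* Opcode 3: reverse request. */
    memcpy(out + 22, mac, 6);         /* Sender hardware address. */
    memcpy(out + 32, mac, 6);         /* Target hardware address. */
}
```

A physical switch only needs the Ethernet source address of such a frame to relearn the port, which is why the IP fields can stay empty.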
 controller/pinctrl.c | 23 +
 tests/ovn.at | 80 +---
 2 files changed, 99 insertions(+), 4 deletions(-)

diff --git a/controller/pinctrl.c b/controller/pinctrl.c
index 8859cb080..767fa02d8 100644
--- a/controller/pinctrl.c
+++ b/controller/pinctrl.c
@@ -4512,6 +4512,24 @@ send_garp_rarp_update(struct ovsdb_idl_txn *ovnsb_idl_txn,
 }
 free(name);
 }
+/*
+ * Send RARPs even if we do not have an IPv4 address, as happens
+ * e.g. on IPv6-only ports.
+ */
+if (laddrs->n_ipv4_addrs == 0) {
+char *name = xasprintf("%s-noip",
+   binding_rec->logical_port);
+garp_rarp = shash_find_data(_garp_rarp_data, name);
+if (garp_rarp) {
+garp_rarp->dp_key = binding_rec->datapath->tunnel_key;
+garp_rarp->port_key = binding_rec->tunnel_key;
+} else {
+add_garp_rarp(name, laddrs->ea,
+  0, binding_rec->datapath->tunnel_key,
+  binding_rec->tunnel_key);
+}
+free(name);
+}
 destroy_lport_addresses(laddrs);
 free(laddrs);
 }
@@ -5824,6 +5842,11 @@ consider_nat_address(struct ovsdb_idl_index *sbrec_port_binding_by_name,
 sset_add(nat_address_keys, name);
 free(name);
 }
+if (laddrs->n_ipv4_addrs == 0) {
+char *name = xasprintf("%s-noip", pb->logical_port);
+sset_add(nat_address_keys, name);
+free(name);
+}
 shash_add(nat_addresses, pb->logical_port, laddrs);
 }

diff --git a/tests/ovn.at b/tests/ovn.at
index 3d54c9153..6b08ef124 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -9017,6 +9017,76 @@ OVN_CLEANUP([hv1])
 AT_CLEANUP
 ])

+OVN_FOR_EACH_NORTHD([
+AT_SETUP([send reverse arp for router without ipv4 address])
+ovn_start
+# Create logical switch
+ovn-nbctl ls-add ls0
+# Create gateway router
+ovn-nbctl create Logical_Router name=lr0 options:chassis=hv1
+# Add router port to gateway router
+ovn-nbctl lrp-add lr0 lrp0 f0:00:00:00:00:01 fd12:3456:789a:1::1/64
+ovn-nbctl lsp-add ls0 lrp0-rp -- set Logical_Switch_Port lrp0-rp \
+type=router options:router-port=lrp0 addresses='"f0:00:00:00:00:01"'
+# Add nat-address option
+ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router"
+
+net_add n1
+sim_add hv1
+as hv1
+ovs-vsctl \
+-- add-br br-phys \
+-- add-br br-eth0
+
+ovn_attach n1 br-phys fd12:3456:789a:1::1 64 6
+
+AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=physnet1:br-eth0])
+AT_CHECK([ovs-vsctl add-port br-eth0 snoopvif -- set Interface snoopvif options:tx_pcap=hv1/snoopvif-tx.pcap options:rxq_pcap=hv1/snoopvif-rx.pcap])
+
+# Create a localnet port.
+AT_CHECK([ovn-nbctl lsp-add ls0 ln_port])
+AT_CHECK([ovn-nbctl lsp-set-addresses ln_port unknown])
+AT_CHECK([ovn-nbctl lsp-set-type ln_port localnet])
+AT_CHECK([ovn-nbctl lsp-set-options ln_port network_name=physnet1])
+
+# Wait until the patch ports are created to connect br-int to br-eth0
+OVS_WAIT_UNTIL([test 1 = `ovs-vsctl show | \
+grep "Port patch-br-int-to-ln_port" | wc -l`])
+
+ovn-sbctl list port_binding lrp0-rp
+echo "*"
+ovn-nbctl list logical_switch_port lrp0-rp
+ovn-nbctl list logical_router_port lrp0
+ovn-nbctl show
+# Wait for packet to be received.
+OVS_WAIT_UNTIL([test `wc -c < "hv1/snoopvif-tx.pcap"` -ge 50])
+$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv1/snoopvif-tx.pcap  | sort | uniq > packets
+expected="f00180350001080006040003f001f001"
+echo $expected > expout
+AT_CHECK([sort packets], [0], [expout])
+
+# Temporarily remove nat-addresses option to avoid race conditions
+# due to GARP backoff
+ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses=""
+
+as hv1 reset_pcap_file snoopvif hv1/snoopvif
+
+# Re-add nat-addresses option
+ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router" exclude-lb-vips-from-garp="true"
+
+# Wait for packets to be received.
+OVS_WAIT_UNTIL([test `wc -c < "hv1/snoopvif-tx.pcap"` -ge 50])
+
+$PYTHON 

[ovs-dev] [PATCH ovn 3/4] ovn-macros: support ipv6 in ovn_attach

2022-10-24 Thread Felix Hüttner via dev
To make it easy to add future IPv6 test cases, the common `ovn_attach`
function should also support IPv6 addresses.

Signed-off-by: Felix Huettner 
---
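Note for reviewers: the macro change selects the dummy-netdev command by splicing the IP version into the command name (`netdev-dummy/ip4addr` or `netdev-dummy/ip6addr`). The same selection is sketched below in C, purely for illustration. The function name `format_ipaddr_cmd` is hypothetical, and the check that rejects versions other than 4 and 6 is an addition of this sketch; the shell macro performs no such validation.

```c
#include <stddef.h>
#include <stdio.h>

/* Writes the appctl command line for 'ipversion' (4 or 6) into 'buf'.
 * Returns 0 on success, -1 for an unsupported version or if 'buf' is
 * too small to hold the whole command. */
static int
format_ipaddr_cmd(char *buf, size_t size, int ipversion,
                  const char *bridge, const char *ip, int masklen)
{
    if (ipversion != 4 && ipversion != 6) {
        return -1;
    }

    int n = snprintf(buf, size, "netdev-dummy/ip%daddr %s %s/%d",
                     ipversion, bridge, ip, masklen);
    return n < 0 || (size_t) n >= size ? -1 : 0;
}
```

Because IPVERSION is inserted before ENCAP in the positional arguments, existing callers that pass an encapsulation type must now also pass the version, which is what the `tests/ovn.at` hunks in this patch do.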
 tests/ovn-macros.at |  9 +
 tests/ovn.at| 22 +++---
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
index b234019a9..9c7f60113 100644
--- a/tests/ovn-macros.at
+++ b/tests/ovn-macros.at
@@ -288,14 +288,14 @@ net_attach () {
 || return 1
 }

-# ovn_az_attach AZ NETWORK BRIDGE IP [MASKLEN]
+# ovn_az_attach AZ NETWORK BRIDGE IP [MASKLEN] [IPVERSION] [ENCAP]
 ovn_az_attach() {
-local az=$1 net=$2 bridge=$3 ip=$4 masklen=${5-24} encap=${6-geneve,vxlan}
+local az=$1 net=$2 bridge=$3 ip=$4 masklen=${5-24} ipversion=${6-4} encap=${7-geneve,vxlan}
 net_attach $net $bridge || return 1

 mac=`ovs-vsctl get Interface $bridge mac_in_use | sed s/\"//g`
 arp_table="$arp_table $sandbox,$bridge,$ip,$mac"
-ovs-appctl netdev-dummy/ip4addr $bridge $ip/$masklen >/dev/null || return 1
+ovs-appctl netdev-dummy/ip${ipversion}addr $bridge $ip/$masklen >/dev/null || return 1
 ovs-appctl ovs/route/add $ip/$masklen $bridge >/dev/null || return 1

 local ovn_remote
@@ -329,13 +329,14 @@ ovn_az_attach() {
 start_daemon ovn-controller --enable-dummy-vif-plug || return 1
 }

-# ovn_attach NETWORK BRIDGE IP [MASKLEN]
+# ovn_attach NETWORK BRIDGE IP [MASKLEN] [IPVERSION] [ENCAP]
 #
 # First, this command attaches BRIDGE to interconnection network NETWORK, just
 # like "net_attach NETWORK BRIDGE".  Second, it configures (simulated) IP
 # address IP (with network mask length MASKLEN, which defaults to 24) on
 # BRIDGE.  Finally, it configures the Open vSwitch database to work with OVN
 # and starts ovn-controller.
+# IPVERSION must be set to 6 for ipv6 addresses.
 ovn_attach() {
 ovn_az_attach NONE $@
 }
diff --git a/tests/ovn.at b/tests/ovn.at
index 184fc0fdd..3d54c9153 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -3534,7 +3534,7 @@ for i in 1 2; do
 as hv-$i
 check ovs-vsctl add-br br-phys
 check ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
-ovn_attach net br-phys 192.168.0.$i 24 vxlan
+ovn_attach net br-phys 192.168.0.$i 24 4 vxlan
 done

 check ovn-nbctl ls-add ls
@@ -3983,7 +3983,7 @@ ovn_start
 net_add net
 check ovs-vsctl add-br br-phys
 check ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
-ovn_attach net br-phys 192.168.0.1 24 vxlan
+ovn_attach net br-phys 192.168.0.1 24 4 vxlan
 check ovn-nbctl --wait=sb sync
 OVS_WAIT_UNTIL([ovn-sbctl get chassis main _uuid])

@@ -22432,7 +22432,7 @@ m4_define([DVR_N_S_ARP_HANDLING],
ovs-vsctl add-br br-phys
ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
-   ovn_attach n1 br-phys 192.168.0.$i 24 $encap
+   ovn_attach n1 br-phys 192.168.0.$i 24 4 $encap

ovs-vsctl add-port br-int vif$i$i -- \
set Interface vif$i$i external-ids:iface-id=lp$i$i \
@@ -22473,14 +22473,14 @@ m4_define([DVR_N_S_ARP_HANDLING],
as hv3 ovs-vsctl add-br br-phys
as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
as hv3 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
-   as hv3 ovn_attach n1 br-phys 192.168.0.3 24 $encap
+   as hv3 ovn_attach n1 br-phys 192.168.0.3 24 4 $encap

# Add 4th hypervisor
sim_add hv4
as hv4 ovs-vsctl add-br br-phys
as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
as hv4 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:44"
-   as hv4 ovn_attach n1 br-phys 192.168.0.4 24 $encap
+   as hv4 ovn_attach n1 br-phys 192.168.0.4 24 4 $encap

as hv4 ovs-vsctl add-port br-int vif-north -- \
set Interface vif-north external-ids:iface-id=lp-north \
@@ -22704,7 +22704,7 @@ m4_define([DVR_N_S_PING],
ovs-vsctl add-br br-phys
ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:$i$i"
-   ovn_attach n1 br-phys 192.168.0.$i 24 $encap
+   ovn_attach n1 br-phys 192.168.0.$i 24 4 $encap

ovs-vsctl add-port br-int vif$i$i -- \
set Interface vif$i$i external-ids:iface-id=lp$i$i \
@@ -22745,14 +22745,14 @@ m4_define([DVR_N_S_PING],
as hv3 ovs-vsctl add-br br-phys
as hv3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
as hv3 ovs-vsctl set open . external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:dd:ee:33"
-   as hv3 ovn_attach n1 br-phys 192.168.0.3 24 $encap
+   as hv3 ovn_attach n1 br-phys 192.168.0.3 24 4 $encap

# Add 4th hypervisor
sim_add hv4
as hv4 ovs-vsctl add-br br-phys
as hv4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys
as 

[ovs-dev] [PATCH ovn 2/4] northd: handle own rarps like garps

2022-10-24 Thread Felix Hüttner via dev
Previously graceful rarps sent from ovn-controller were handled as
normal packets and flooded to other routers. As the other routers should
already have that information, we can skip flooding (just like it is done
for GARPs already) and thereby mitigate ovs refusing to send the packet
because of too many resubmits.

This change has been tested in combination with the previous one in the
series and works well in environments which contain an external ipv6
network with 600 ovn logical routers.

Signed-off-by: Felix Huettner 
---
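Note for reviewers: a small standalone sketch of how the extended match string is put together, assuming the set of source MACs has already been rendered as a brace-enclosed list. The function name `format_self_orig_match` is hypothetical, and the sketch uses plain `snprintf` instead of the dynamic-string helpers used in northd.c; the format string itself is the one from the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats the flood match for a brace-enclosed set of source MACs,
 * e.g. "{00:00:00:00:01:01}".  Returns 0 on success, -1 if 'buf' is
 * too small to hold the whole expression. */
static int
format_self_orig_match(char *buf, size_t size, const char *eth_src_set)
{
    int n = snprintf(buf, size,
                     "eth.src == %s && (arp.op == 1 || rarp.op == 3 || nd_ns)",
                     eth_src_set);
    return n < 0 || (size_t) n >= size ? -1 : 0;
}
```

For the set `{00:00:00:00:01:01}` this yields the same match text that the updated `tests/ovn-northd.at` expectations check for at priority 75.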
 northd/northd.c  |  9 +
 northd/ovn_northd.dl |  2 +-
 tests/ovn-northd.at  | 18 +-
 tests/ovn.at |  8 +++-
 4 files changed, 22 insertions(+), 15 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 6771ccce5..63054a775 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -7345,7 +7345,7 @@ build_lrouter_groups(struct hmap *ports, struct ovs_list *lr_list)
 }

 /*
- * Ingress table 24: Flows that flood self originated ARP/ND packets in the
+ * Ingress table 24: Flows that flood self originated ARP/RARP/ND packets in the
  * switching domain.
  */
 static void
@@ -7378,7 +7378,7 @@ build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
 sset_add(_eth_addrs, nat->external_mac);
 }

-/* Self originated ARP requests/ND need to be flooded to the L2 domain
+/* Self originated ARP requests/RARP/ND need to be flooded to the L2 domain
  * (except on router ports).  Determine that packets are self originated
  * by also matching on source MAC. Matching on ingress port is not
  * reliable in case this is a VLAN-backed network.
@@ -7394,7 +7394,8 @@ build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op,
 ds_chomp(_src, ',');
 ds_put_cstr(_src, "}");

-ds_put_format(, "eth.src == %s && (arp.op == 1 || nd_ns)",
+ds_put_format(,
+  "eth.src == %s && (arp.op == 1 || rarp.op == 3 || nd_ns)",
   ds_cstr(_src));
 ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, priority, ds_cstr(),
   "outport = \""MC_FLOOD_L2"\"; output;");
@@ -7590,7 +7591,7 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op,
 lflows, stage_hint);
 }

-/* Self originated ARP requests/ND need to be flooded as usual.
+/* Self originated ARP requests/RARP/ND need to be flooded as usual.
  *
  * However, if the switch doesn't have any non-router ports we shouldn't
  * even try to flood.
diff --git a/northd/ovn_northd.dl b/northd/ovn_northd.dl
index 2fe73959c..bdaa23d04 100644
--- a/northd/ovn_northd.dl
+++ b/northd/ovn_northd.dl
@@ -4636,7 +4636,7 @@ Flow(.logical_datapath = sw._uuid,
 eth_src_set
 },
 var eth_src = "{" ++ eth_src_set.to_vec().join(", ") ++ "}",
-var __match = i"eth.src == ${eth_src} && (arp.op == 1 || nd_ns)",
+var __match = i"eth.src == ${eth_src} && (arp.op == 1 || rarp.op == 3 || nd_ns)",
 var mc_flood_l2 = json_escape(mC_FLOOD_L2().0),
 var actions = i"outport = ${mc_flood_l2}; output;".

diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 7d879b642..c6e269fba 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -4759,7 +4759,7 @@ AT_CHECK([grep "ls_in_l2_lkup" ls1_lflows | sed 's/table=../table=??/' | sort],
  table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 00:00:00:00:01:01), action=(outport = "ls1-ro1"; output;)
  table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 00:00:00:00:01:02), action=(outport = "vm1"; output;)
  table=??(ls_in_l2_lkup  ), priority=70   , match=(eth.mcast), action=(outport = "_MC_flood"; output;)
-  table=??(ls_in_l2_lkup  ), priority=75   , match=(eth.src == {00:00:00:00:01:01} && (arp.op == 1 || nd_ns)), action=(outport = "_MC_flood_l2"; output;)
+  table=??(ls_in_l2_lkup  ), priority=75   , match=(eth.src == {00:00:00:00:01:01} && (arp.op == 1 || rarp.op == 3 || nd_ns)), action=(outport = "_MC_flood_l2"; output;)
  table=??(ls_in_l2_lkup  ), priority=80   , match=(flags[[1]] == 0 && arp.op == 1 && arp.tpa == 192.168.1.1), action=(clone {outport = "ls1-ro1"; output; }; outport = "_MC_flood_l2"; output;)
  table=??(ls_in_l2_lkup  ), priority=80   , match=(flags[[1]] == 0 && nd_ns && nd.target == fe80::200:ff:fe00:101), action=(clone {outport = "ls1-ro1"; output; }; outport = "_MC_flood_l2"; output;)
 ])
@@ -4771,7 +4771,7 @@ AT_CHECK([grep "ls_in_l2_lkup" ls2_lflows | sed 's/table=../table=??/' | sort],
  table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 00:00:00:00:02:01), action=(outport = "ls2-ro2"; output;)
  table=??(ls_in_l2_lkup  ), priority=50   , match=(eth.dst == 00:00:00:00:02:02), action=(outport = "vm2"; output;)
  table=??(ls_in_l2_lkup  ), priority=70   , match=(eth.mcast), action=(outport = "_MC_flood"; output;)
-  table=??(ls_in_l2_lkup  ), priority=75   , match=(eth.src == 
{00:00:00:00:02:01} && (arp.op 

[ovs-dev] [PATCH ovn 1/4] logical-fields: add rarp fields

2022-10-24 Thread Felix Hüttner via dev
We need to be able to handle rarp fields in order to ensure we can
handle rarp messages we send ourselves.
This will be used by the next patch in the series.

Signed-off-by: Felix Huettner 
---
 lib/logical-fields.c | 8 
 lib/ovn-util.c   | 2 +-
 ovn-sb.xml   | 2 ++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/lib/logical-fields.c b/lib/logical-fields.c
index ed3ec62e1..fc131791e 100644
--- a/lib/logical-fields.c
+++ b/lib/logical-fields.c
@@ -261,6 +261,14 @@ ovn_init_symtab(struct shash *symtab)
 expr_symtab_add_field(symtab, "arp.tpa", MFF_ARP_TPA, "arp", false);
 expr_symtab_add_field(symtab, "arp.tha", MFF_ARP_THA, "arp", false);

+/* RARPs use the same layout as arp packets -> use the same field_id */
+expr_symtab_add_predicate(symtab, "rarp", "eth.type == 0x8035");
+expr_symtab_add_field(symtab, "rarp.op", MFF_ARP_OP, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.spa", MFF_ARP_SPA, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.sha", MFF_ARP_SHA, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.tpa", MFF_ARP_TPA, "rarp", false);
+expr_symtab_add_field(symtab, "rarp.tha", MFF_ARP_THA, "rarp", false);
+
 expr_symtab_add_predicate(symtab, "nd",
   "icmp6.type == {135, 136} && icmp6.code == 0 && ip.ttl == 255");
 expr_symtab_add_predicate(symtab, "nd_ns",
diff --git a/lib/ovn-util.c b/lib/ovn-util.c
index 5dca72714..597625a29 100644
--- a/lib/ovn-util.c
+++ b/lib/ovn-util.c
@@ -817,7 +817,7 @@ ip_address_and_port_from_lb_key(const char *key, char **ip_address,
  *
  * This value is also used to handle some backward compatibility during
  * upgrading. It should never decrease or rewind. */
-#define OVN_INTERNAL_MINOR_VER 4
+#define OVN_INTERNAL_MINOR_VER 5

 /* Returns the OVN version. The caller must free the returned value. */
 char *
diff --git a/ovn-sb.xml b/ovn-sb.xml
index 315d60853..42e6fa3ee 100644
--- a/ovn-sb.xml
+++ b/ovn-sb.xml
@@ -1052,6 +1052,7 @@
 ip4.src ip4.dst
 ip6.src ip6.dst 
ip6.label
 arp.op arp.spa arp.tpa 
arp.sha arp.tha
+rarp.op rarp.spa rarp.tpa 
rarp.sha rarp.tha
 tcp.src tcp.dst 
tcp.flags
 udp.src udp.dst
 sctp.src sctp.dst
@@ -1115,6 +1116,7 @@
 ip.later_frag expands to ip.frag[1]
 ip.first_frag expands to ip.is_frag  
!ip.later_frag
 arp expands to eth.type == 0x806
+rarp expands to eth.type == 0x8035
 nd expands to icmp6.type == {135, 136} 
 icmp6.code == 0  ip.ttl == 255
 nd_ns expands to icmp6.type == 135  
icmp6.code == 0  ip.ttl == 255
 nd_na expands to icmp6.type == 136  
icmp6.code == 0  ip.ttl == 255
--
2.38.0


[ovs-dev] [PATCH ovn 0/4] Send Rarps for ipv6 router lsp

2022-10-24 Thread Felix Hüttner via dev
Previously, GARPs/RARPs were only sent for "external" LSPs if these
had an IPv4 address attached. For LSPs on gateway routers that do
not have an IPv4 address assigned (e.g. if they are IPv6-only), no
RARPs were sent out.

This causes traffic outages when changing the priority of a gateway
chassis, as the physical switches do not learn where the MAC address
now resides. To fix this, we send out RARPs with just the MAC
address of the interface and no IP address.
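As a rough illustration of the behaviour described above (a sketch under stated assumptions, not the code from the series): a port with at least one IPv4 address can be announced with GARPs, while a port without any falls back to a MAC-only RARP that is tracked under a separate key. The names `choose_announcement`, `format_noip_key` and `enum announce_kind` are hypothetical; the `-noip` key suffix is the one used in patch 4/4.

```c
#include <stddef.h>
#include <stdio.h>

enum announce_kind {
    ANNOUNCE_GARP,   /* Gratuitous ARP, needs an IPv4 address. */
    ANNOUNCE_RARP    /* MAC-only RARP, used when no IPv4 address exists. */
};

/* Picks the announcement type for a port with 'n_ipv4_addrs' addresses. */
static enum announce_kind
choose_announcement(size_t n_ipv4_addrs)
{
    return n_ipv4_addrs ? ANNOUNCE_GARP : ANNOUNCE_RARP;
}

/* Builds the bookkeeping key for the MAC-only entry: "<port>-noip".
 * Returns 0 on success, -1 if 'buf' is too small. */
static int
format_noip_key(char *buf, size_t size, const char *logical_port)
{
    int n = snprintf(buf, size, "%s-noip", logical_port);
    return n < 0 || (size_t) n >= size ? -1 : 0;
}
```

Keeping the MAC-only entry under its own key means it can be added and expired independently of any per-address entries for the same port.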

This change has been tested in an environment with 600 logical routers
on a single ipv6 external network.

Additionally we fix the issue that self-created rarp's are flooded
to logical routers even if this is unnecessary (causing ovs to potentially
drop the packet because of too many resubmits).

This change is also available as a PR at https://github.com/ovn-org/ovn/pull/157

Felix Huettner (4):
  logical-fields: add rarp fields
  northd: handle own rarps like garps
  ovn-macros: support ipv6 in ovn_attach
  pinctrl: Send RARPs for external ipv6 interfaces

 controller/pinctrl.c |  23 +
 lib/logical-fields.c |   8 
 lib/ovn-util.c   |   2 +-
 northd/northd.c  |   9 ++--
 northd/ovn_northd.dl |   2 +-
 ovn-sb.xml   |   2 +
 tests/ovn-macros.at  |   9 ++--
 tests/ovn-northd.at  |  18 +++
 tests/ovn.at | 110 ---
 9 files changed, 148 insertions(+), 35 deletions(-)

--
2.38.0