On Fri, Feb 24, 2023 at 6:04 AM Felix Hüttner via dev
<[email protected]> wrote:
>
> Assume the following setup:
>
> +----------------+
> | Logical Router |
> | lr001          +-+
> +----------------+ |
>                    |
> +----------------+ |
> | Logical Router | | +----------------+ +------------------+
> | lr002          +-+-+ Logical Switch +-+ Phyiscal Network |
> +----------------+ | | ls-ext         | |                  |
>                    | +----------------+ +------------------+
>       ...          |
>                    |
> +----------------+ |
> | Logical Router | |
> | lr300          +-+
> +----------------+
>
> If a arp request for the ip of lr001 on ls-ext is now received it is
> only forwarded to that individual logical router.
>
> If we however now receive a arp request for an ip not used by any of
> lr001-lr300 we try to flood the arp request to all logical ports on ls-ext.
> With around 300 routers this causes the arp request to be dropped after
> some routers as we hit the 4096 resubmit limit.
>
> In the most cases forwarding the arp requests to the logical routers is
> pointless as we already know all of their ip addresses and they will
> therefor not be able to answer the arp requests anyway.
> Only if someone sends garps this is not the case. Then the request would
> need to be flooded to all logical routers.
>
> We can therefor not generally send these arp requests to MC_FLOOD_L2 as
> this would break garps. As we can also not detect garps we need to leave
> the solution to our users.
>
> To do this we introduce the other_config `broadcast-arps-to-all-routers`
> on logical switches (which is per default true). If set to false we add
> a logical flow that forwards arp requests where we do not know a
> specific target logical switch port to MC_FLOOD_L2, thereby bypassing
> all logical routers.
>
> Note that the testcase is quite flaky in the ci (as it takes verry long)
> but runs well locally. I'm unsure how to best proceed there.

Thanks for the patch and sorry for the delay in reviews.

I don't see any harm in supporting this option and the patch overall
looks ok to me.

Regarding the test case flakiness,  I don't think we want to introduce
a flaky test.
I'd suggest the following
    - Either find a solution to fix the flakiness in the CI or
   -  Drop this test case and add a test case in ovn-northd.at which
makes sure that when this
     option is set, ovn-northd generates the required logical flow and
when the option is cleared
     it clears the logical flow.  This should be sufficient IMO to
test this patch.


Also please add a NEWS entry for this new option.

Thanks
Numan



>
> Signed-off-by: Felix Huettner <[email protected]>
> ---
>  northd/northd.c         |   7 +++
>  northd/ovn-northd.8.xml |   7 +++
>  ovn-nb.xml              |  12 +++++
>  tests/ovn.at            | 114 ++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 140 insertions(+)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index c366b545e..6aff04cc5 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -8964,6 +8964,13 @@ build_lswitch_destination_lookup_bmcast(struct 
> ovn_datapath *od,
>              }
>          }
>
> +
> +        if (!smap_get_bool(&od->nbs->other_config, 
> "broadcast-arps-to-all-routers", true)) {
> +            ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 72,
> +                        "eth.mcast && (arp.op == 1 || nd_ns)",
> +                        "outport = \""MC_FLOOD_L2"\"; output;");
> +        }
> +
>          ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, 70, "eth.mcast",
>                        "outport = \""MC_FLOOD"\"; output;");
>      }
> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
> index 2eab2c4ae..033841383 100644
> --- a/northd/ovn-northd.8.xml
> +++ b/northd/ovn-northd.8.xml
> @@ -1873,6 +1873,13 @@ output;
>          non-router logical ports.
>        </li>
>
> +      <li>
> +        A priority-72 flow that outputs all ARP requests and ND packets with
> +        an Ethernet broadcast or multicast <code>eth.dst</code> to the
> +        <code>MC_FLOOD_L2</code> multicast group if
> +        <code>other_config:broadcast-arps-to-all-routers=true</code>.
> +      </li>
> +
>        <li>
>          A priority-70 flow that outputs all packets with an Ethernet 
> broadcast
>          or multicast <code>eth.dst</code> to the <code>MC_FLOOD</code>
> diff --git a/ovn-nb.xml b/ovn-nb.xml
> index 8d56d0c6e..a41d5b11f 100644
> --- a/ovn-nb.xml
> +++ b/ovn-nb.xml
> @@ -729,6 +729,18 @@
>          localnet ports, fabric traffic that belongs to other tagged networks 
> may
>          be passed through such a port.
>        </column>
> +
> +      <column name="other_config" key="broadcast-arps-to-all-routers"
> +          type='{"type": "boolean"}'>
> +        Determines whether arp requests and ipv6 neighbor solicitations 
> should
> +        be send to all routers and other switchports (default) or if it 
> should
> +        only be send to switchports where the ip/mac address is unknown.
> +        Setting this to false can significantly reduce the load if the 
> logical
> +        switch can receive arp requests for ips it does not know about.
> +        However setting this to false also means that garps are no longer
> +        forwarded to all routers and therefor the mac bindings of the routers
> +        are no longer updated.
> +      </column>
>      </group>
>
>      <group title="Common Columns">
> diff --git a/tests/ovn.at b/tests/ovn.at
> index dc5c5df3f..dfef5dacc 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -5048,6 +5048,120 @@ OVN_CLEANUP([hv1], [hv2], [hv3])
>  AT_CLEANUP
>  ])
>
> +# 1 hypervisor, 1 logical switch with 1 logical port, 300 logical router
> +OVN_FOR_EACH_NORTHD([
> +AT_SETUP([1 HVs, 1 LS, 2 lports/LS, 300 LR])
> +AT_KEYWORDS([slowtest])
> +ovn_start
> +
> +# Logical network:
> +#
> +# One logical switch ls1.
> +# 300 logical routers lr001 - lr299. All connected to ls1
> +# with one subnet:
> +#    lrp001 on ls1 for subnet 192.168.0.1/20
> +#    lrp002 on ls1 for subnet 192.168.0.2/20
> +#    ...
> +#    lrp101 on ls1 for subnet 192.168.1.1/20
> +#    ...
> +#    lrp299 on ls1 for subnet 192.168.2.99/20
> +#
> +# also 2 VIF on the first hypervisor.
> +#    lp-vif1 on ls1 for subnet 192.168.3.1/20
> +#    lp-vif2 on ls1 without ip
> +ovn-nbctl \
> +    -- ls-add ls1 \
> +    -- lsp-add ls1 lp-vif1 \
> +    -- lsp-set-addresses lp-vif1 \
> +        "f0:00:00:00:00:01 192.168.3.1" \
> +    -- lsp-add ls1 lp-vif2 \
> +    -- lsp-set-addresses lp-vif2 \
> +        "f0:00:00:00:00:02 unknown"
> +
> +for i in 0 1 2; do
> +    for j in `seq 1 99`; do
> +        hex=$(printf '%02x' $j)
> +        ovn-nbctl \
> +            -- lr-add lr$i$j \
> +            -- lrp-add lr$i$j lrp$i$j 00:00:00:00:f$i:$hex 192.168.$i.$j/20 \
> +            -- lsp-add ls1 lrp$i$j-attachment \
> +            -- set Logical_Switch_Port lrp$i$j-attachment type=router \
> +                             options:router-port=lrp$i$j \
> +                             addresses='"00:00:00:00:f'$i':'$hex'"'
> +    done
> +done
> +
> +# Physical network:
> +#
> +# One hypervisors hv1.
> +# lp-vif1 on hv1.
> +# lp-vif2 on hv1.
> +
> +net_add n1
> +sim_add hv1
> +as hv1
> +ovs-vsctl add-br br-phys
> +ovn_attach n1 br-phys 192.168.255.1
> +ovs-vsctl \
> +    -- add-port br-int vif1 \
> +    -- set Interface vif1 \
> +        external-ids:iface-id=lp-vif1 \
> +        options:tx_pcap=hv1/vif1-tx.pcap \
> +        options:rxq_pcap=hv1/vif1-rx.pcap \
> +        ofport-request=1 \
> +    -- add-port br-int vif2 \
> +    -- set Interface vif2 \
> +        external-ids:iface-id=lp-vif2 \
> +        options:tx_pcap=hv1/vif2-tx.pcap \
> +        options:rxq_pcap=hv1/vif2-rx.pcap \
> +        ofport-request=2
> +
> +ovs-vsctl show
> +ovs-vsctl list Interface
> +
> +# This might take longer than normally
> +export OVS_CTL_TIMEOUT=120
> +wait_for_ports_up
> +check ovn-nbctl --wait=hv --timeout=$OVS_CTL_TIMEOUT sync
> +
> +# now back to normal timeouts
> +export OVS_CTL_TIMEOUT=30
> +
> +# Pre-populate the hypervisors' ARP tables so that we don't lose any
> +# packets for ARP resolution (native tunneling doesn't queue packets
> +# for ARP resolution).
> +OVN_POPULATE_ARP
> +
> +send_arp() {
> +    ovs-appctl ofproto/trace br-int \
> +        
> arp,in_port=1,vlan_tci=0x0000,dl_src=f0:00:00:00:00:01,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=192.168.3.1,arp_tpa=1.2.3.4,arp_op=1,arp_sha=f0:00:00:00:00:01,arp_tha=ff:ff:ff:ff:ff:ff
> +}
> +
> +is_arp_dropped_on_recirc() {
> +    send_arp | grep "Too many resubmits"
> +}
> +
> +is_arp_received_by_vif2() {
> +    send_arp | grep "output:2"
> +}
> +
> +# we now send a test arp request that the routers can not answer.
> +# without the later setting this will die because of the recirc limit
> +
> +OVS_WAIT_UNTIL(is_arp_dropped_on_recirc)
> +
> +# now we reconfigure the logical switch to not forward these arp packets to
> +# the routers
> +ovn-nbctl set Logical_Switch ls1 
> other_config:broadcast-arps-to-all-routers=false
> +
> +OVS_WAIT_UNTIL(is_arp_received_by_vif2)
> +
> +# Gracefully terminate daemons
> +OVN_CLEANUP([hv1])
> +
> +AT_CLEANUP
> +])
> +
>  OVN_FOR_EACH_NORTHD([
>  AT_SETUP([IP relocation using GARP request])
>  ovn_start
> --
> 2.39.1
> Diese E Mail enthält möglicherweise vertrauliche Inhalte und ist nur für die 
> Verwertung durch den vorgesehenen Empfänger bestimmt. Sollten Sie nicht der 
> vorgesehene Empfänger sein, setzen Sie den Absender bitte unverzüglich in 
> Kenntnis und löschen diese E Mail. Hinweise zum Datenschutz finden Sie 
> hier<https://www.datenschutz.schwarz>.
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to