On Tue, Apr 12, 2022 at 8:06 PM Ihar Hrachyshka <[email protected]> wrote:
>
> There's a discussion to have on the intended behavior of a port with 2+
> chassis when the switch is attached to localnet.
>
> In general, the patch series makes all chassis switch to tunneling when
> multiple chassis are set. (See patch 11/15 "Clone packets to both port
> chassis" of the series.) This is done to guarantee delivery and to clone
> packets between multiple locations using smart flow rules. But because
> interfaces used to transfer tunneled packets may have different MTU
> properties from localnet, patch 14/15 "Allow to disable tunneling
> enforcement for multi-chassis port" introduces an option to enforce
> using the localnet port to communicate between chassis.
>
> The question arises of the intended behavior when localnet communication
> is enforced. The problem is that each port binding location will
> generate ARP replies (gratuitous or otherwise) for the same MAC address,
> which will then be analyzed by upstream switches to update their ARP
> tables. Because of this, upstream switches will flip from one location
> to another. This is suboptimal, if not outright wrong.
>
> We could designate one chassis as "main" (the first in the
> requested-chassis list) and generate gratuitous ARPs on this chassis
> only. But any packet generated by any other chassis hosting the same
> port binding will still make the upstream switch update its ARP tables,
> still making the port flip.
>
> With that in mind: 1) is there a way, that I am not aware of, to avoid
> the flipping problem while hosting the same MAC address in two
> locations? and 2) if not, do we really want to support localnet
> connectivity for 2+ chassis port bindings [as implemented in this
> patch]?
>
> There is also the option of not supporting 2+ chassis for
> localnet-attached logical switches. This would be unfortunate because in
> environments where the MTU discrepancy is not an issue, the usual
> approach of switching to tunneling works fine.
>
> Thoughts?
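For context, here is roughly how the setup under discussion is expressed on
the NB side, distilled from the test case in the patch quoted below (ls0,
migrator, phys and the hv* chassis names come from that test;
chassis-mirroring-enabled is the knob proposed by patch 14/15):

    # Logical switch attached to a provider network through a localnet port.
    ovn-nbctl ls-add ls0
    ovn-nbctl lsp-add ls0 public
    ovn-nbctl lsp-set-type public localnet
    ovn-nbctl lsp-set-options public network_name=phys

    # A VIF port requested on two chassis at once (e.g. during live migration).
    ovn-nbctl lsp-add ls0 migrator
    ovn-nbctl lsp-set-addresses migrator "00:00:00:00:00:ff 10.0.0.100"
    ovn-nbctl lsp-set-options migrator requested-chassis=hv1,hv2

    # Proposed knob: keep using the localnet port between the two chassis
    # instead of enforcing tunneling, at the cost of the ARP flipping
    # problem described above.
    ovn-nbctl lsp-set-options migrator requested-chassis=hv1,hv2 \
        chassis-mirroring-enabled=false
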
OVN supports the ovn-chassis-mac-mappings option (which allows configuring
a unique MAC for each chassis), and this option is used to support routing
between provider logical switches. Can we make use of it here? I.e., when
the VM sends out an ARP/GARP, ovn-controller would rewrite the source MAC
to the chassis-unique MAC before sending the packet out via the patch port
(a rough sketch of such a configuration is appended at the end of this
mail). I'm not sure whether that is feasible; it is probably worth
exploring.

Meanwhile, I'd suggest not supporting 2+ chassis for localnet-attached
logical switches to start with, and documenting that limitation.

Thanks
Numan

>
> Ihar
>
> On 3/29/22 8:47 PM, Ihar Hrachyshka wrote:
> > When chassis-mirroring-enabled is set, controller will not enforce
> > tunneling for localnet-attached switches. This may be useful when the
> > network used to deliver tunnel packets doesn't have MTU headway to
> > redirect all packets originally sent through localnet port.
> >
> > Signed-off-by: Ihar Hrachyshka <[email protected]>
> > ---
> >  controller/physical.c | 12 ++++++--
> >  ovn-nb.xml            | 11 ++++++++
> >  ovn-sb.xml            | 11 ++++++++
> >  tests/ovn.at          | 66 ++++++++++++++++++++++++++++++++++++++-----
> >  4 files changed, 90 insertions(+), 10 deletions(-)
> >
> > diff --git a/controller/physical.c b/controller/physical.c
> > index 02b771f3b..df5fe8cb3 100644
> > --- a/controller/physical.c
> > +++ b/controller/physical.c
> > @@ -1257,6 +1257,9 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >          }
> >      }
> >
> > +    bool mirroring_enabled = smap_get_bool(
> > +        &binding->options, "chassis-mirroring-enabled", true);
> > +
> >      bool is_ha_remote = false;
> >      const struct chassis_tunnel *tun = NULL;
> >      const struct sbrec_port_binding *localnet_port =
> > @@ -1266,7 +1269,8 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >          is_remote = true;
> >          /* Enforce tunneling while we clone packets to additional chassis b/c
> >           * otherwise upstream switch won't flood the packet to both chassis. */
> > -        if (localnet_port && !binding->additional_chassis) {
> > +        if (localnet_port &&
> > +            (!binding->additional_chassis || !mirroring_enabled)) {
> >              ofport = u16_to_ofp(simap_get(patch_ofports,
> >                                            localnet_port->logical_port));
> >              if (!ofport) {
> > @@ -1316,8 +1320,10 @@ consider_port_binding(struct ovsdb_idl_index *sbrec_port_binding_by_name,
> >      }
> >
> >      /* Clone packets to additional chassis if needed. */
> > -    additional_tuns = get_additional_tunnels(binding, chassis,
> > -                                             chassis_tunnels);
> > +    if (mirroring_enabled) {
> > +        additional_tuns = get_additional_tunnels(binding, chassis,
> > +                                                 chassis_tunnels);
> > +    }
> >
> >      if (!is_remote) {
> >          /* Packets that arrive from a vif can belong to a VM or
> > diff --git a/ovn-nb.xml b/ovn-nb.xml
> > index 3fd6c5140..c3ca7ba36 100644
> > --- a/ovn-nb.xml
> > +++ b/ovn-nb.xml
> > @@ -1022,6 +1022,17 @@
> >        additional chassis that are allowed to bind the same port.
> >      </column>
> >
> > +    <column name="options" key="chassis-mirroring-enabled">
> > +      When multiple chassis are set for the port, and the logical switch is
> > +      connected to an external network through a <code>localnet</code>
> > +      port, tunneling is enforced for the port to guarantee delivery of
> > +      packets directed to the port to all its locations. This has MTU
> > +      implications because the network used for tunneling must have headway
> > +      for additional tunnel headers attached to packets. If the network is
> > +      not capable to deliver packets with additional tunnel headers, then
> > +      tunneling enforcement can be disabled using this option.
> > +    </column>
> > +
> >      <column name="options" key="activation-strategy">
> >        If set and used with <ref column="requested-additional-chassis"/>,
> >        specifies an activation strategy for the additional chassis. By
> > diff --git a/ovn-sb.xml b/ovn-sb.xml
> > index e8f91f351..dc276916a 100644
> > --- a/ovn-sb.xml
> > +++ b/ovn-sb.xml
> > @@ -3285,6 +3285,17 @@ tcp.flags = RST;
> >        chassis that are allowed to bind the same port.
> >      </column>
> >
> > +    <column name="options" key="chassis-mirroring-enabled">
> > +      When multiple chassis are set for the port, and the logical switch is
> > +      connected to an external network through a <code>localnet</code>
> > +      port, tunneling is enforced for the port to guarantee delivery of
> > +      packets directed to the port to all its locations. This has MTU
> > +      implications because the network used for tunneling must have headway
> > +      for additional tunnel headers attached to packets. If the network is
> > +      not capable to deliver packets with additional tunnel headers, then
> > +      tunneling enforcement can be disabled using this option.
> > +    </column>
> > +
> >      <column name="options" key="activation-strategy">
> >        If used with multiple chassis set in <ref column="requested-chassis"/>,
> >        specifies an activation strategy for all additional chassis. By
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index c8992608e..760a4fd58 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -14227,10 +14227,12 @@ check ovn-nbctl lsp-add ls0 first
> >  check ovn-nbctl lsp-add ls0 second
> >  check ovn-nbctl lsp-add ls0 third
> >  check ovn-nbctl lsp-add ls0 migrator
> > +check ovn-nbctl lsp-add ls0 migrator-no-cloning
> >  check ovn-nbctl lsp-set-addresses first "00:00:00:00:00:01 10.0.0.1"
> >  check ovn-nbctl lsp-set-addresses second "00:00:00:00:00:02 10.0.0.2"
> >  check ovn-nbctl lsp-set-addresses third "00:00:00:00:00:03 10.0.0.3"
> >  check ovn-nbctl lsp-set-addresses migrator "00:00:00:00:00:ff 10.0.0.100"
> > +check ovn-nbctl lsp-set-addresses migrator-no-cloning "00:00:00:00:01:ff 10.0.1.100"
> >
> >  check ovn-nbctl lsp-add ls0 public
> >  check ovn-nbctl lsp-set-type public localnet
> > @@ -14242,6 +14244,9 @@ check ovn-nbctl lsp-set-options public network_name=phys
> >  # chassis locations. Connectivity will be checked for resources located at hv1
> >  # (First) and hv2 (Second) as well as for hv3 (Third) that does not take part in port migration.
> > +#
> > +# Also, check that a port with chassis-mirroring-enabled=false doesn't clone
> > +# packets through an enforced tunnel; instead localnet port still used
> >  check ovn-nbctl lsp-set-options first requested-chassis=hv1
> >  check ovn-nbctl lsp-set-options second requested-chassis=hv2
> >  check ovn-nbctl lsp-set-options third requested-chassis=hv3
> > @@ -14265,6 +14270,10 @@ for hv in hv1 hv2; do
> >          set Interface migrator external-ids:iface-id=migrator \
> >          options:tx_pcap=$hv/migrator-tx.pcap \
> >          options:rxq_pcap=$hv/migrator-rx.pcap
> > +    as $hv check ovs-vsctl -- add-port br-int migrator-no-cloning -- \
> > +        set Interface migrator-no-cloning external-ids:iface-id=migrator-no-cloning \
> > +        options:tx_pcap=$hv/migrator-no-cloning-tx.pcap \
> > +        options:rxq_pcap=$hv/migrator-no-cloning-rx.pcap
> >  done
> >
> >  send_arp() {
> > @@ -14298,8 +14307,12 @@ reset_env() {
> >      reset_pcap_file hv3 third hv3/third
> >      reset_pcap_file hv1 migrator hv1/migrator
> >      reset_pcap_file hv2 migrator hv2/migrator
> > +    reset_pcap_file hv1 migrator-no-cloning hv1/migrator-no-cloning
> > +    reset_pcap_file hv2 migrator-no-cloning hv2/migrator-no-cloning
> >
> > -    for port in hv1/migrator hv2/migrator hv1/first hv2/second hv3/third; do
> > +    for port in hv1/migrator hv2/migrator \
> > +                hv1/migrator-no-cloning hv2/migrator-no-cloning \
> > +                hv1/first hv2/second hv3/third; do
> >          : > $port.expected
> >      done
> >  }
> > @@ -14309,12 +14322,15 @@ check_packets() {
> >      # attachment, hence using CONTAIN instead of strict matching
> >      OVN_CHECK_PACKETS_CONTAIN([hv1/migrator-tx.pcap], [hv1/migrator.expected])
> >      OVN_CHECK_PACKETS_CONTAIN([hv2/migrator-tx.pcap], [hv2/migrator.expected])
> > +    OVN_CHECK_PACKETS_CONTAIN([hv1/migrator-no-cloning-tx.pcap], [hv1/migrator-no-cloning.expected])
> > +    OVN_CHECK_PACKETS_CONTAIN([hv2/migrator-no-cloning-tx.pcap], [hv2/migrator-no-cloning.expected])
> >      OVN_CHECK_PACKETS_CONTAIN([hv1/first-tx.pcap], [hv1/first.expected])
> >      OVN_CHECK_PACKETS_CONTAIN([hv2/second-tx.pcap], [hv2/second.expected])
> >      OVN_CHECK_PACKETS_CONTAIN([hv3/third-tx.pcap], [hv3/third.expected])
> >  }
> >
> >  migrator_tpa=$(ip_to_hex 10 0 0 100)
> > +nocloning_tpa=$(ip_to_hex 10 0 1 100)
> >  first_spa=$(ip_to_hex 10 0 0 1)
> >  second_spa=$(ip_to_hex 10 0 0 2)
> >  third_spa=$(ip_to_hex 10 0 0 3)
> > @@ -14336,7 +14352,8 @@ wait_column "" Port_Binding requested_additional_chassis logical_port=migrator
> >  wait_for_ports_up
> >
> >  # advertise location of ports through localnet port
> > -send_garp hv1 migrator 0000000000ff ffffffffffff $migrator_spa $migrator_tpa
> > +send_garp hv1 migrator 0000000000ff ffffffffffff $migrator_tpa $migrator_tpa
> > +send_garp hv1 migrator-no-cloning 0000000001ff ffffffffffff $nocloning_tpa $nocloning_tpa
> >  send_garp hv1 first 000000000001 ffffffffffff $first_spa $first_tpa
> >  send_garp hv2 second 000000000002 ffffffffffff $second_spa $second_tpa
> >  send_garp hv3 third 000000000003 ffffffffffff $third_spa $third_tpa
> > @@ -14404,13 +14421,19 @@ request=$(send_arp hv2 migrator 0000000000ff ffffffffffff $migrator_tpa $first_s
> >  check_packets
> >  reset_env
> >
> > -# Start port migration hv1 -> hv2: both hypervisors are now bound
> > +# Start ports migration hv1 -> hv2: both hypervisors are now bound
> >  check ovn-nbctl lsp-set-options migrator requested-chassis=hv1,hv2
> > +check ovn-nbctl lsp-set-options migrator-no-cloning requested-chassis=hv1,hv2 \
> > +                                chassis-mirroring-enabled=false
> >  wait_for_ports_up
> > -wait_column "$hv1_uuid" Port_Binding chassis logical_port=migrator
> > -wait_column "$hv1_uuid" Port_Binding requested_chassis logical_port=migrator
> > -wait_column "$hv2_uuid" Port_Binding additional_chassis logical_port=migrator
> > -wait_column "$hv2_uuid" Port_Binding requested_additional_chassis logical_port=migrator
> > +for port in migrator migrator-no-cloning; do
> > +    wait_column "$hv1_uuid" Port_Binding chassis logical_port=$port
> > +    wait_column "$hv1_uuid" Port_Binding requested_chassis logical_port=$port
> > +    wait_column "$hv2_uuid" Port_Binding additional_chassis logical_port=$port
> > +    wait_column "$hv2_uuid" Port_Binding requested_additional_chassis logical_port=$port
> > +done
> > +
> > +send_garp hv1 migrator-no-cloning 0000000001ff ffffffffffff $nocloning_tpa $nocloning_tpa
> >
> >  # check that...
> >  # unicast from First arrives to hv1:Migrator
> > @@ -14419,11 +14442,20 @@ request=$(send_arp hv1 first 000000000001 0000000000ff $first_spa $migrator_tpa)
> >  echo $request >> hv1/migrator.expected
> >  echo $request >> hv2/migrator.expected
> >
> > +# unicast from First arrives to hv1:Migrator-no-cloning
> > +# unicast from First doesn't arrive to hv2:Migrator-no-cloning
> > +request=$(send_arp hv1 first 000000000001 0000000001ff $first_spa $nocloning_tpa)
> > +echo $request >> hv1/migrator-no-cloning.expected
> > +
> >  # mcast from First arrives to hv1:Migrator
> >  # mcast from First arrives to hv2:Migrator
> > +# mcast from First arrives to hv1:Migrator-no-cloning
> > +# mcast from First arrives to hv2:Migrator-no-cloning
> >  request=$(send_arp hv1 first 000000000001 ffffffffffff $first_spa $migrator_tpa)
> >  echo $request >> hv1/migrator.expected
> >  echo $request >> hv2/migrator.expected
> > +echo $request >> hv1/migrator-no-cloning.expected
> > +echo $request >> hv2/migrator-no-cloning.expected
> >  echo $request >> hv3/third.expected
> >  echo $request >> hv2/second.expected
> >
> > @@ -14433,11 +14465,20 @@ request=$(send_arp hv2 second 000000000002 0000000000ff $second_spa $migrator_tp
> >  echo $request >> hv1/migrator.expected
> >  echo $request >> hv2/migrator.expected
> >
> > +# unicast from Second doesn't arrive to hv1:Migrator-no-cloning
> > +# unicast from Second arrives to hv2:Migrator-no-cloning
> > +request=$(send_arp hv2 second 000000000002 0000000001ff $second_spa $nocloning_tpa)
> > +echo $request >> hv2/migrator-no-cloning.expected
> > +
> >  # mcast from Second arrives to hv1:Migrator
> >  # mcast from Second arrives to hv2:Migrator
> > +# mcast from Second arrives to hv1:Migrator-no-cloning
> > +# mcast from Second arrives to hv2:Migrator-no-cloning
> >  request=$(send_arp hv2 second 000000000002 ffffffffffff $second_spa $migrator_tpa)
> >  echo $request >> hv1/migrator.expected
> >  echo $request >> hv2/migrator.expected
> > +echo $request >> hv1/migrator-no-cloning.expected
> > +echo $request >> hv2/migrator-no-cloning.expected
> >  echo $request >> hv3/third.expected
> >  echo $request >> hv1/first.expected
> >
> > @@ -14447,11 +14488,20 @@ request=$(send_arp hv3 third 000000000003 0000000000ff $third_spa $migrator_tpa)
> >  echo $request >> hv1/migrator.expected
> >  echo $request >> hv2/migrator.expected
> >
> > +# unicast from Third arrives to hv1:Migrator-no-cloning
> > +# unicast from Third doesn't arrive to hv2:Migrator-no-cloning
> > +request=$(send_arp hv3 third 000000000003 0000000001ff $third_spa $nocloning_tpa)
> > +echo $request >> hv1/migrator-no-cloning.expected
> > +
> >  # mcast from Third arrives to hv1:Migrator
> >  # mcast from Third arrives to hv2:Migrator
> > +# mcast from Third arrives to hv1:Migrator-no-cloning
> > +# mcast from Third arrives to hv2:Migrator-no-cloning
> >  request=$(send_arp hv3 third 000000000003 ffffffffffff $third_spa $migrator_tpa)
> >  echo $request >> hv1/migrator.expected
> >  echo $request >> hv2/migrator.expected
> > +echo $request >> hv1/migrator-no-cloning.expected
> > +echo $request >> hv2/migrator-no-cloning.expected
> >  echo $request >> hv1/first.expected
> >  echo $request >> hv2/second.expected
> >
> > @@ -14476,12 +14526,14 @@ request=$(send_arp hv1 migrator 0000000000ff ffffffffffff $migrator_tpa $first_s
> >  echo $request >> hv1/first.expected
> >  echo $request >> hv2/second.expected
> >  echo $request >> hv3/third.expected
> > +echo $request >> hv1/migrator-no-cloning.expected
> >
> >  # mcast from hv2:Migrator arrives to First, Second, and Third
> >  request=$(send_arp hv2 migrator 0000000000ff ffffffffffff $migrator_tpa $first_spa)
> >  echo $request >> hv1/first.expected
> >  echo $request >> hv2/second.expected
> >  echo $request >> hv3/third.expected
> > +echo $request >> hv1/migrator-no-cloning.expected
> >
> >  check_packets
> >
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
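As a rough sketch of the ovn-chassis-mac-mappings idea suggested above: the
option itself already exists (ovn-controller reads it from the local
Open_vSwitch table), while rewriting the source MAC of ARP/GARP packets for
multi-chassis ports would be new behavior that does not exist today. The MAC
values and the "phys" network name below are purely illustrative:

    # On hv1: give this chassis its own unique MAC on provider network "phys".
    ovs-vsctl set Open_vSwitch . \
        external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:00:00:01"

    # On hv2: a different, unique MAC for the same provider network.
    ovs-vsctl set Open_vSwitch . \
        external-ids:ovn-chassis-mac-mappings="phys:aa:bb:cc:00:00:02"

With per-chassis MACs in place, the upstream switch would learn a distinct,
stable source MAC per chassis instead of seeing the same VIF MAC flip between
two ports; whether the ARP/GARP source-MAC rewrite itself is feasible is
exactly the open question raised above.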
