On Tue, May 23, 2023 at 11:56 PM Ihar Hrachyshka <[email protected]> wrote:
> This series fixes a non-optimal behavior with some multichassis ports. > > Specifically, > > - when a multichassis port belongs to a switch that also has a localnet > port, > - because ingress and egress traffic for the port is funnelled through > tunnels to guarantee delivery of packets to all chassis that host the > port, > - because tunnels add overhead to each funnelled packet, > - leaving less space for actual packets, > > ... the effective MTU of the path for multichassis ports is reduced by > the size that takes the tunnel to wrap a packet around. (This depends on > the type of tunnel, also on IP version of the tunnel.) > > When this happens, the port and its peers are not notified about the > reduced MTU for the path to the port, effectively blackholing some of > the larger packets. > > This patch series makes OVN detect these scenarios and handle them as > follows: > > - for all multichassis ports, 4 flows are added - for inport and > outport matches - to detect "too-big" IP packets; > - for all "too-big" packets, 2 flows are added to produce a ICMP > Fragmentation Needed (for IPv4) or Too Big (for IPv6) packet and > return it to the offending sender. > > The proposed implementation is isolated in ovn-controller. An > alternative would be to implement flows producing ICMP errors via the > existing Logical_Flow action icmp4_error, as is done for router ports. > In this case, the 4 flows that detect "too-big" packets would still have > to reside in ovn-controller because of limitations of the current OVN > port model, namely lack of `mtu` as an attribute of OVN logical port. > > Caveats: this works for IP traffic only. The party receiving the ICMP > error should handle it by temporarily lowering the MTU used to reach the > port. Behavior may depend on protocol implementation of the offending > peer as well as its networking stack configuration. > > This patch series was successfully tested with Linux kernel 6.0.8. > > This patch series is designed to simplify backporting process. It is > split in logical pieces to simplify cherry-picking. I consider the > original behavior described at the start of the cover letter a bug and > worth a backport. > > --- > v1: - initial version. > v2: - more system test cases adjusted to new table numbers for egress > pipeline; > - switch from xxd to coreutils (tr, head) tools to avoid a new > dependency; > - adjusted a comment in the new test case to avoid "migrator" > terminology that makes no sense outside live migration context - > which is not the only potential use case for multichassis ports; > - guard the new test case with HAVE_SCAPY; > - nit: rewrapped some lines in patch 6/7 to avoid unnecessary line > changes; > - added a NEWS entry for patch 6/7 that implements the core of the > feature; > - rebased to latest main. > v3: - expose and reuse lib/actions.c:encode_[start|finish]_controller_op > functions; > - reuse ETH_HEADER_LEN, IP_HEADER_LEN, and IPV6_HEADER_LEN from > ovs:lib/packets.h; > - add a comment for REGBIT_PKT_LARGER that mentions that the > register is also used in OFTABLE_OUTPUT_LARGE_PKT_DETECT table; > - squash the patch that makes if-status-mgr track changes to > interface mtu into the patch that introduces the mtu field in the > manager; > - doc: don't describe path discovery flows in the patch that > introduces new tables (only in the patch that actually implements > the flows); > - nit: use MAX to shave off several lines of code; > - nit: remove redundant if_status_mgr_iface_get_mtu declaration from > a header file; > - rebased to latest main; > - updated acks based on latest review comments. > v4: - fix compilation issue in the middle of the series due to previous > commit rearrangement. > v5: - introduce a new if-status-mgr input to pflow_output, then process > OVS interface updates inside if-status-mgr handlers. > - if-status-mgr: remove set_mtu public function, instead other > modules should call to if_status_mgr_iface_update. This allows the > callers to not care which particular fields if-status-mgr would > like to persist from the OVS record, keeping concerns separated. > v6: - update more tests in ovn-controller.at to reflect new table > numbers. > v7: - remove ovs submodule bump that creeped in by mistake. > --- > > Ihar Hrachyshka (5): > Track ip version of tunnel in chassis_tunnel struct > Track interface MTU in if-status-mgr > if-status: track interfaces for additional chassis > Add new egress tables to accommodate for too-big packets handling > Implement MTU Path Discovery for multichassis ports > > NEWS | 6 + > controller/binding.c | 48 +-- > controller/binding.h | 4 + > controller/if-status.c | 48 ++- > controller/if-status.h | 8 +- > controller/lflow.c | 4 +- > controller/lflow.h | 49 +-- > controller/local_data.c | 2 + > controller/local_data.h | 1 + > controller/ovn-controller.c | 122 +++++++ > controller/ovsport.c | 9 + > controller/ovsport.h | 2 + > controller/physical.c | 337 ++++++++++++++++++-- > controller/physical.h | 2 + > controller/pinctrl.c | 8 +- > include/ovn/actions.h | 3 + > lib/actions.c | 4 +- > lib/ovn-util.h | 7 + > northd/northd.c | 2 + > ovn-architecture.7.xml | 76 ++--- > tests/ovn-controller.at | 298 +++++++++--------- > tests/ovn.at | 611 +++++++++++++++++++++++++++--------- > tests/system-ovn-kmod.at | 2 +- > tests/system-ovn.at | 8 +- > 24 files changed, 1241 insertions(+), 420 deletions(-) > > -- > 2.38.1 > > _______________________________________________ > dev mailing list > [email protected] > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > The whole series looks good to me, thanks. Reviewed-by: Ales Musil <[email protected]> -- Ales Musil Senior Software Engineer - OVN Core Red Hat EMEA <https://www.redhat.com> [email protected] IM: amusil <https://red.ht/sig> _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
