On Tue, Oct 14, 2025 at 03:44:11PM +0200, Max Lamprecht via discuss wrote:
> Hi everyone,
>
> In the last OVN Community Meeting we talked about synchronizing connection
> tracking information.

Hi everyone,

I just wanted to add an idea for how this could be done; then we can see
where it takes us.

>
> Our primary goal is to ensure seamless failover during maintenance
> (preserve stateful connections to reduce disruptions)
> We have identified two key use cases, with the first being our priority:
> - Live Migration of VMs(openstack) with Multi-Chassis Port Bindings in OVN
> - LRP Failover on Gateway Chassis

What would definitely be needed here is some coordinated failover method.
For live migration we already have this with "activation-strategy". For LRP
failovers nothing like this is available. I would therefore focus on the
Live Migration case for now.

The high-level approach would need to be:

1. The CMS tells OVN that an LSP should have a secondary requested chassis,
   adds an activation-strategy and some information to activate conntrack
   syncing.
2. The ovn-controller on the secondary chassis binds the port and adds
   itself to the southbound. In this process it already allocates a
   conntrack zone for the port.
3. The ovn-controller on the secondary chassis would then need to start
   accepting incoming conntrack information in some way (or delegate that
   to some outside tool). That information should be keyed by the LSP UUID
   (or a similar identifier). Incoming information should only be accepted
   from the primary chassis (e.g. via IP filtering).
4. The ovn-controller on the primary chassis would need to start sending
   conntrack information in some way (or delegate it). This needs to be
   reliable in some way, so that we only start sending if the receiving
   side is actually ready (maybe signaled with LSP status).
5. The primary chassis needs to send an initial dump of the conntrack
   information and afterwards send an update for each change in the source
   conntrack zone.
6. At some point on the secondary chassis the activation-strategy triggers
   (e.g. live-migration has finished).
   The ovn-controller there will enable the local port and set itself as
   primary chassis in the Port_Binding. At the same time the secondary
   chassis must stop accepting conntrack information from other chassis.
7. The primary chassis can stop sending conntrack information now.

If we need some kind of communication channel between the two chassis that
we can rely on being available, we could use the existing tunnels. The
tunnel-id 0 is afaik already only used for OVN purposes (e.g. BFD), so we
could send conntrack information this way as well.

>
> As we plan to move to DPDK in the future, the ideal mechanism would support
> the kernel and userspace datapath.
>
> During our conversation we had the following thoughts:
> - external agent such as conntrackd? how does it work with dpdk?
> - some syncing logic in ovs-vswitchd between nodes

In order to support DPDK, ovs-vswitchd must be in some way part of the
solution, since there is no other component that would have the conntrack
information available. ovs-vswitchd could also listen to the kernel
conntrack table, but it would not be necessary there.

My feeling is that trying out an implementation in ovs-vswitchd would be
the most direct approach. There we could also do some kind of "rewrite" for
conntrack zones.

I would probably go with expanding Bridge other_config and add
other_config:ct-zone-replicate in the form of:
`<ZONE-ID>,<UID>,<Type>,<Remote>;...`

Where:
* ZONE-ID: the id of the conntrack zone
* UID: some globally unique id that needs to match on source and destination
* Type: "Send" or "Receive"
* Remote: Name of the tunnel port to send this over

This could then be the interface that ovn-controller fills. ovs-vswitchd
could then use a protocol similar to conntrackd's to sync the information.
It would send the updates via the tunnel port with the tunnel-id 0.

What do people think about that high-level idea?

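To make the proposed other_config:ct-zone-replicate format a bit more
concrete, here is a minimal parsing sketch. It is in Python only for
readability (a real implementation would live in ovs-vswitchd, in C). The
key does not exist in OVS today, and all names below (ZoneReplication,
parse_ct_zone_replicate, the example port names) are hypothetical. The only
thing I assume beyond the format above is that zone ids are 16-bit and that
"Send"/"Receive" are matched case-insensitively.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ZoneReplication:
    """One entry of the proposed (hypothetical) ct-zone-replicate value."""
    zone_id: int  # conntrack zone id; zones are 16-bit values
    uid: str      # globally unique id, must match on source and destination
    mode: str     # normalized to "send" or "receive"
    remote: str   # name of the tunnel port to send over / receive from


def parse_ct_zone_replicate(value: str) -> List[ZoneReplication]:
    """Parse '<ZONE-ID>,<UID>,<Type>,<Remote>;...' into a list of entries.

    Raises ValueError on any malformed entry, so a bad configuration is
    rejected as a whole rather than applied partially.
    """
    entries = []
    for raw in value.split(";"):
        raw = raw.strip()
        if not raw:
            continue  # tolerate a trailing ';' or empty segments
        fields = [f.strip() for f in raw.split(",")]
        if len(fields) != 4:
            raise ValueError("expected 4 fields, got %d: %r" % (len(fields), raw))
        zone, uid, mode, remote = fields
        zone_id = int(zone)  # int() raises ValueError for non-numeric input
        if not 0 <= zone_id <= 0xFFFF:
            raise ValueError("zone id out of range: %d" % zone_id)
        mode = mode.lower()
        if mode not in ("send", "receive"):
            raise ValueError("type must be Send or Receive: %r" % raw)
        if not uid or not remote:
            raise ValueError("empty UID or Remote: %r" % raw)
        entries.append(ZoneReplication(zone_id, uid, mode, remote))
    return entries
```

With this, the primary chassis would carry something like
"12,lsp-a,Send,ovn-hv2-0" and the secondary chassis the matching
"7,lsp-a,Receive,ovn-hv1-0". The zone ids may differ between the two sides;
the UID is what ties them together, which is also where the zone "rewrite"
mentioned above would happen.
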
Thanks,
Felix

>
> Has anyone else encountered this problem or has already a solution for dpdk
> conntrack sync
> Are there any existing or planned OVN/OVS features that might help address
> this?
> What architectural approach or layer do you think would be most suitable
> for this functionality?
>
> We are reaching out to the ovs community to gather more ideas and to hear
> your thoughts on this.
>
> -Max
> _______________________________________________
> discuss mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
