On Thu, Jul 03, 2025 at 08:27:01AM +0200, Ales Musil via dev wrote:
> On Mon, Jun 30, 2025 at 4:28 PM Felix Huettner <felix.huettner@stackit.cloud> wrote:
> > On Tue, Jun 24, 2025 at 04:56:56PM +0200, Ales Musil wrote:
> > > On Mon, Jun 23, 2025 at 11:14 AM Ilya Maximets <i.maxim...@ovn.org> wrote:
> > > > On 6/10/25 2:03 PM, Felix Huettner via dev wrote:
> > > > > Previously the needed garps and rarps were calculated in each loop of
> > > > > the ovn-controller in pinctrl_run. This is quite wasteful as most of the
> > > > > time nothing relevant changes. In large external networks this can have
> > > > > a significant performance impact on the ovn-controller.
> > > > >
> > > > > Previous patches
> > > > > https://patchwork.ozlabs.org/project/ovn/patch/zzxvora5keucn...@sit-sdelap1634.int.lidl.net/
> > > > > extracted just a limited part of the functionality to an engine node.
> > > > > This patchset now limits the logic in pinctrl_run to the absolute minimum.
> > > > > It also addresses some issues with incremental processing changes while
> > > > > the southbound connection is unavailable.
> > > > >
> > > > > v11->v12:
> > > > >   * rebased
> > > > > v10->v11:
> > > > >   * removed merged patches
> > > > >   * fix concurrency bug in updating announcement times
> > > > > v9->v10:
> > > > >   * rebased
> > > > >   * fixed countdown in first patch
> > > > > v8->v9:
> > > > >   * rebased
> > > > >   * rework usage of cmap for garp_rarp data
> > > > > v7->v8:
> > > > >   * rebased
> > > > >   * improved handling of daemon_started_recently
> > > > > v6->v7:
> > > > >   * rebased
> > > > >   * fixed ct-zone inconsistencies that break CI
> > > > > v5->v6: rebased
> > > > >
> > > > > Felix Huettner (1):
> > > > >   controller: Extract garp_rarp to engine node.
> > > >
> > > > Hello there.  Unfortunately, our ovn-heater scale CI reported a significant
> > > > performance regression this week.  Further bisection points to this change
> > > > being the root cause.
> > > >
> > > > What's happening is that after this change, ovn-controller is in a constant
> > > > loop of (forced) recomputes, which is very heavy at scale:
> > > >
> > > > 2025-06-22T16:14:02.292Z|02003|binding|INFO|Claiming lport lp-0-75 for this chassis.
> > > > 2025-06-22T16:14:02.292Z|02004|binding|INFO|lp-0-75: Claiming 22:94:4c:ca:da:9b 16.0.0.76
> > > > 2025-06-22T16:14:03.317Z|02005|inc_proc_eng|INFO|node: lb_data, recompute (forced) took 613ms
> > > > 2025-06-22T16:14:08.939Z|02006|inc_proc_eng|INFO|node: lflow_output, recompute (forced) took 5622ms
> > > > 2025-06-22T16:14:09.322Z|02007|binding|INFO|Setting lport lp-0-75 ovn-installed in OVS
> > > > 2025-06-22T16:14:09.322Z|02008|binding|INFO|Setting lport lp-0-75 up in Southbound
> > > > 2025-06-22T16:14:09.323Z|02009|timeval|WARN|Unreasonably long 6964ms poll interval (5657ms user, 1287ms system)
> > > >
> > > > Before this change, the cluster-density test, for example, didn't have any
> > > > long poll intervals and everything was computed incrementally.  After the
> > > > change it recomputes pretty much on every iteration, and the average
> > > > ovn-installed latency went up from 2.8 to 9.2 seconds.
> > > >
> > > > Could someone, please, take a look?
> > > >
> > > > Best regards, Ilya Maximets.
> > >
> > > Hi Felix and Ilya,
> > >
> > > I have posted a patch to hopefully mitigate the issue
> > > for most workloads [0]; we will need some long term solution.
> > >
> > > [0]
> > > https://patchwork.ozlabs.org/project/ovn/patch/20250624145405.3197685-1-amu...@redhat.com/
> >
> > Hi Ales, Hi Ilya,
>
> Hi Felix,
>
> > sorry for that and thanks for fixing it already.
> >
> > If I got the above and the commit message right, the issue is basically
> > that there is no incremental processing yet, and if recomputes are
> > not allowed this will trigger a recompute of all engine nodes.
>
> That is correct.
>
> > While we could definitely add incremental processing here, I wonder
> > about making this less problematic.
> >
> > From the comments it looks like we disallow recomputes if:
> > 1. we do not have a southbound transaction (e.g. the previous one is
> >    still in flight or we are not connected), or
> > 2. we have a backlog on the vswitchd connection.
> >
> > However, the first is only problematic if we write to southbound, and the
> > second only if we write to vswitchd.
> > For a node like the garp_rarp one, which just processes data for
> > ovn-controller to work with, both of these are irrelevant. We neither
> > write to southbound nor to vswitchd.
> >
> > Would it be an option to make the decision whether recomputes are allowed
> > on a per engine node basis? Then we could make the recompute_allowed
> > decision more granularly and thereby maybe skip global recomputes in
> > some cases.
>
> That is something that we were discussing internally; the garp_rarp
> node is not the only one that was triggering cancellations, but it
> happened more frequently. So this is a global problem, and as you said,
> one of the solutions might be to "mark" nodes as not SB dependent; in
> that case we can allow a recompute when SB is read-only because it
> doesn't affect the data at all. The only problem that we are trying
> to solve before proceeding is how to make this "marking" error-proof
> programmatically, mostly in cases where a node is wrongly considered
> SB independent but still writes into SB. For that we would probably
> need support from the idl layer, but any other ideas on how to
> achieve that are very much welcome.
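The gate described above amounts, in rough terms, to a single yes/no answer for
the whole engine on each iteration. A minimal sketch of that predicate (the
function and parameter names are illustrative stand-ins, not the actual
ovn-controller code):

#include <stdbool.h>

/* Condensed form of the two conditions listed above: one global answer,
 * so even nodes that never write to the southbound DB or to vswitchd
 * are blocked by it. */
static bool
engine_recompute_allowed(bool have_sb_txn, bool vswitchd_backlogged)
{
    /* 1. a southbound transaction is available (the previous one is not
     *    still in flight and we are connected), and
     * 2. there is no backlog on the vswitchd connection. */
    return have_sb_txn && !vswitchd_backlogged;
}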
Hi Ales,

I just had an idea about that. Not sure if it is nice though :)

1. Extend "struct engine_node" with a bool "needs_context"
2. Add a define to set the value during node definition
3. Set this on all nodes that require any txn/client_ctx
4. Require "struct engine_node" to be passed to "engine_get_context"
5. Crash "engine_get_context" if the provided engine node has
   "!needs_context"
6. In "engine_recompute" change the following

-    if (!allowed) {
+    if (!allowed && node->needs_context) {

Thanks a lot,
Felix

> >
> > Thanks a lot,
> > Felix
> >
> > > Thanks,
> > > Ales
>
> Thanks,
> Ales
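A minimal, self-contained model of how the needs_context idea above could fit
together. All names, fields, and the simplified run/recompute functions here
are illustrative stand-ins rather than the actual inc-proc-eng code; the
comments map back to the numbered steps in the proposal:

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the engine context that carries the SB/OVS transactions
 * and the client_ctx. */
struct engine_context {
    void *ovnsb_idl_txn;
};

/* Step 1: each node declares whether it ever needs that context. */
struct engine_node {
    const char *name;
    bool needs_context;                  /* steps 2 + 3: set at definition */
    void (*run)(struct engine_node *);
};

static struct engine_context *current_ctx;

/* Steps 4 + 5: callers must identify themselves, and nodes that never
 * declared the dependency crash immediately instead of silently relying
 * on a transaction that may not exist. */
static struct engine_context *
engine_get_context(struct engine_node *node)
{
    assert(node->needs_context);
    return current_ctx;
}

/* Step 6: the "recomputes allowed" decision becomes per node; a node
 * that does not need the context may still recompute while the SB
 * connection is read-only or vswitchd is backlogged. */
static void
engine_recompute(struct engine_node *node, bool allowed)
{
    if (!allowed && node->needs_context) {
        printf("%s: recompute cancelled (no usable context)\n", node->name);
        return;
    }
    node->run(node);
}

/* garp_rarp only produces data for ovn-controller itself. */
static void
garp_rarp_run(struct engine_node *node)
{
    printf("%s: recomputed without any transaction\n", node->name);
}

/* A node that writes flows has to ask for the context. */
static void
lflow_output_run(struct engine_node *node)
{
    struct engine_context *ctx = engine_get_context(node);
    printf("%s: recomputed, txn=%p\n", node->name,
           ctx ? ctx->ovnsb_idl_txn : NULL);
}

int
main(void)
{
    struct engine_node garp_rarp = { "garp_rarp", false, garp_rarp_run };
    struct engine_node lflow_output = { "lflow_output", true, lflow_output_run };

    /* No SB transaction this iteration: only the context-free node runs. */
    engine_recompute(&garp_rarp, false);
    engine_recompute(&lflow_output, false);
    return 0;
}

With a split along these lines, a read-only southbound iteration would no
longer force context-free nodes such as garp_rarp into cancelled or forced
recomputes.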