On Thu, Jul 3, 2025 at 9:46 AM Felix Huettner <felix.huettner@stackit.cloud>
wrote:

> On Thu, Jul 03, 2025 at 08:48:23AM +0200, Ales Musil wrote:
> > On Thu, Jul 3, 2025 at 8:39 AM Felix Huettner
> <felix.huettner@stackit.cloud>
> > wrote:
> >
> > > On Thu, Jul 03, 2025 at 08:27:01AM +0200, Ales Musil via dev wrote:
> > > > On Mon, Jun 30, 2025 at 4:28 PM Felix Huettner
> > > <felix.huettner@stackit.cloud>
> > > > wrote:
> > > >
> > > > > On Tue, Jun 24, 2025 at 04:56:56PM +0200, Ales Musil wrote:
> > > > > > On Mon, Jun 23, 2025 at 11:14 AM Ilya Maximets <
> i.maxim...@ovn.org>
> > > > > wrote:
> > > > > >
> > > > > > > On 6/10/25 2:03 PM, Felix Huettner via dev wrote:
> > > > > > > > Previously the needed garps and rarps were calculated in each
> > > > > > > > loop of the ovn-controller in pinctrl_run. This is quite
> > > > > > > > wasteful as most of the time nothing relevant changes. In
> > > > > > > > large external networks this can have a significant
> > > > > > > > performance impact on the ovn-controller.
> > > > > > > >
> > > > > > > > Previous patches
> > > > > > > > https://patchwork.ozlabs.org/project/ovn/patch/zzxvora5keucn...@sit-sdelap1634.int.lidl.net/
> > > > > > > > extracted just a limited part of the functionality to an
> > > > > > > > engine node. This patchset now limits the logic in pinctrl_run
> > > > > > > > to the absolute minimum. It also addresses some issues with
> > > > > > > > incremental processing changes while the southbound connection
> > > > > > > > is unavailable.
> > > > > > > >
> > > > > > > > v11->v12:
> > > > > > > >   * rebased
> > > > > > > > v10->v11:
> > > > > > > >   * removed merged patches
> > > > > > > >   * fix concurrency bug in updating announcement times
> > > > > > > > v9->v10:
> > > > > > > >   * rebased
> > > > > > > >   * fixed countdown in first patch
> > > > > > > > v8->v9:
> > > > > > > >   * rebased
> > > > > > > >   * rework usage of cmap for garp_rarp data
> > > > > > > > v7->v8:
> > > > > > > >   * rebased
> > > > > > > >   * improved handling of daemon_started_recently
> > > > > > > > v6->v7:
> > > > > > > >   * rebased
> > > > > > > >   * fixed ct-zone inconsistencies that break CI
> > > > > > > > v5->v6: rebased
> > > > > > > >
> > > > > > > > Felix Huettner (1):
> > > > > > > >   controller: Extract garp_rarp to engine node.
> > > > > > >
> > > > > > >
> > > > > > > Hello there.  Unfortunately, our ovn-heater scale CI reported a
> > > > > > > significant performance regression this week.  Further bisection
> > > > > > > points to this change being the root cause.
> > > > > > >
> > > > > > > What's happening is that after this change, ovn-controller is
> > > > > > > in a constant loop of (forced) recomputes, which is very heavy
> > > > > > > at scale:
> > > > > > >
> > > > > > > 2025-06-22T16:14:02.292Z|02003|binding|INFO|Claiming lport
> > > > > > > lp-0-75 for this chassis.
> > > > > > > 2025-06-22T16:14:02.292Z|02004|binding|INFO|lp-0-75: Claiming
> > > > > > > 22:94:4c:ca:da:9b 16.0.0.76
> > > > > > > 2025-06-22T16:14:03.317Z|02005|inc_proc_eng|INFO|node: lb_data,
> > > > > > > recompute (forced) took 613ms
> > > > > > > 2025-06-22T16:14:08.939Z|02006|inc_proc_eng|INFO|node:
> > > > > > > lflow_output, recompute (forced) took 5622ms
> > > > > > > 2025-06-22T16:14:09.322Z|02007|binding|INFO|Setting lport
> > > > > > > lp-0-75 ovn-installed in OVS
> > > > > > > 2025-06-22T16:14:09.322Z|02008|binding|INFO|Setting lport
> > > > > > > lp-0-75 up in Southbound
> > > > > > > 2025-06-22T16:14:09.323Z|02009|timeval|WARN|Unreasonably long
> > > > > > > 6964ms poll interval (5657ms user, 1287ms system)
> > > > > > >
> > > > > > > Before this change, the cluster-density test, for example,
> > > > > > > didn't have any long poll intervals and everything was computed
> > > > > > > incrementally.  After the change it recomputes pretty much on
> > > > > > > every iteration and average ovn-installed latency went up from
> > > > > > > 2.8 to 9.2 seconds.
> > > > > > >
> > > > > > > Could someone, please, take a look?
> > > > > > >
> > > > > > > Best regards, Ilya Maximets.
> > > > > > >
> > > > > > >
> > > > > > Hi Felix and Ilya,
> > > > > >
> > > > > > I have posted a patch to hopefully mitigate the issue for most
> > > > > > workloads [0], but we will need a long-term solution.
> > > > > >
> > > > > > [0]
> > > > > > https://patchwork.ozlabs.org/project/ovn/patch/20250624145405.3197685-1-amu...@redhat.com/
> > > > >
> > > > > Hi Ales, Hi Ilya,
> > > > >
> > > >
> > > > Hi Felix,
> > > >
> > > >
> > > > >
> > > > > sorry for that and thanks for fixing it already.
> > > > >
> > > > > If I got the above and the commit message right, the issue is
> basically
> > > > > that there is no incremental processing yet and if recomputes are
> > > > > not allowed this will trigger a recompute of all engine nodes.
> > > > >
> > > >
> > > > That is correct.
> > > >
> > > >
> > > > >
> > > > > While we could definitely add incremental processing here I wonder
> > > > > about making this less problematic.
> > > > >
> > > > > From the comments it looks like we disallow recomputes if:
> > > > > 1. We do not have a southbound transaction (e.g. the previous one
> > > > >    is still in flight or we are not connected)
> > > > > 2. OR we have a backlog on the vswitchd connection
> > > > >
> > > > > However the first is only problematic if we write to southbound
> > > > > and the second only if we write to vswitchd.
> > > > > For a node like the garp_rarp one, which just processes data for
> > > > > ovn-controller to work with, both of these are irrelevant. We
> > > > > neither write to southbound nor to vswitchd.
> > > > >
> > > > > Would it be an option to make the decision whether recomputes are
> > > > > allowed on a per-engine-node basis? Then we could make the
> > > > > recompute_allowed decision more granularly and thereby maybe skip
> > > > > global recomputes in some cases.
> > > > >
> > > >
> > > >
> > > > That is something that we were discussing internally. The garp_rarp
> > > > node is not the only one that was triggering cancellations, but it
> > > > happened more frequently. So this is a global problem and, as you
> > > > said, one of the solutions might be to "mark" nodes as not SB
> > > > dependent; in that case we can allow a recompute when SB is read-only
> > > > because it doesn't affect the data at all. The only problem that we
> > > > are trying to solve before proceeding is how to make this "marking"
> > > > error-proof programmatically, mostly in cases where a node is wrongly
> > > > considered SB independent but still writes into SB. For that we would
> > > > probably need support from the idl layer, but any other ideas on how
> > > > to achieve that are very much welcome.
> > >
> > > Hi Ales,
> > >
> >
> > Hi Felix,
>
> Hi Ales,
>
> >
> >
> > >
> > > I just had an idea about that. Not sure if it is nice though :)
> > >
> >
> > It makes sense, but there is one catch, see down below.
> >
> >
> > >
> > > 1. Extend "struct engine_node" with a bool "needs_context"
> > > 2. Add a define to set the value during node definition
> > > 3. Set this on all nodes that require any txn/client_ctx
> > > 4. Require "struct engine_node" to be passed to "engine_get_context"
> > > 5. Crash "engine_get_context" if the provided engine node has
> > >    "!needs_context"
> > > 6. In "engine_recompute" change the following
> > >
> > > -    if (!allowed) {
> > > +    if (!allowed && node->needs_context) {
> > >
> >
> > That would work only for SB calls that need the txn directly.
> > Unfortunately, any SB update of an existing row acquires the txn
> > internally in the idl instead of going through the context.
> >
> > So that part of SB writes wouldn't be error proof.
>
> Thanks a lot, that's good to know.
>
> Would we then need functionality on the OVS side to block the
> transaction in some way?
> But I currently do not see any way the idl supports that.
>

Or an indication that there was an attempt to write; either way
we will need something from the idl layer itself.


>
>
> Thanks,
> Felix
>
> >
> >
> > >
> > > Thanks a lot,
> > > Felix
> > >
> > >
> > Thanks,
> > Ales
> >
> >
> > >
> > > >
> > > >
> > > > > Thanks a lot,
> > > > > Felix
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > > Ales
> > > > >
> > > > >
> > > > >
> > > > Thanks,
> > > > Ales
> > > > _______________________________________________
> > > > dev mailing list
> > > > d...@openvswitch.org
> > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> > >
> > >
>
>