On 3/30/26 8:23 AM, Han Zhou wrote:
> 
> 
> On Fri, Mar 27, 2026 at 3:15 AM Dumitru Ceara via dev 
> <[email protected]> wrote:
>>
>> On 3/25/26 8:51 AM, Ales Musil via dev wrote:
>> > On Tue, Mar 24, 2026 at 9:04 PM Ilya Maximets <[email protected]> wrote:
>> >
>> >> On 3/24/26 6:32 PM, Mark Michelson wrote:
>> >>> Hi Ilya,
>> >>>
>> >>> First off, what a great change! I love seeing this sort of improvement!
>> >>>
>> >>> I had a look through the patches, and I couldn't find any technical
>> >>> faults with them. From that point of view:
>> >>>
>> >>> Acked-by: Mark Michelson <[email protected] 
>> >>> <mailto:[email protected]>>
>> >>
>> >> Thanks!
>> >>
>> >>>
>> >>> When it comes to backporting the change to 26.03, I'm a bit hesitant.
>> >>> My instinct is to say that we should absolutely backport it since it
>> >>> is a tremendous performance improvement. I also share your view that
>> >>> people that follow the LTS would miss out on this for two years. We
>> >>> only just released 26.03.0 less than a week ago. This may also help
>> >>> with deployments that see unreasonably large polling intervals in
>> >>> ovn-controller, since this presumably would be a way to trim that
>> >>> down.
>> >>>
>> >>> However, when I think of times we have made exceptions in the past for
>> >>> performance improvement or new feature backports, this patch series is
>> >>> missing some of the typical elements we would normally see.
>> >>> 1. I mentioned that this could be helpful with unreasonably long
>> >>> polling intervals. However, I don't know that we actually have data to
>> >>> show that handling of conjunctive flows is a common cause of this
>> >>> problem amongst large deployments. Typically, if we are going to
>> >>> backport a performance patch, we need to have some evidence that
>> >>> without the performance improvement, reasonably-sized deployments have
>> >>> difficulty functioning. I don't know that we have that for this patch
>> >>> series.
>> >>
>> >> I'm aware of at least a few real world deployments that would have much
>> >> easier time with these changes.  But I also agree that those setups are
>> >> a little unreasonable in the way the ACLs are configured.  At the same
>> >> time, it is often not particularly easy to persuade users to do things
>> >> differently. :)
>> >>
>> >>> 2. It affects the critical path for ovn-controller's functionality,
>> >>> rather than affecting something supplemental or optional.
>> >>> 3. There is no way to opt in or out of this change.
>> >>> 4. The change adds no new tests. In the context of adding this to
>> >>> "main" this is fine, since we rely on existing tests to cover the
>> >>> changes. We have a long testing period where we can plug gaps in
>> >>> coverage in case this change ends up causing an issue. With a
>> >>> backport, we don't have the same luxury.
>> >>
>> >> This set does technically enable extra testing, since the masking of
>> >> conjunction ids is removed for the I-P vs recompute comparison.
>> >> It's not a dedicated test for sure, but we have pre-existing ones for
>> >> merging the conjunctions.  Just wanted to point out the technicality. :)
>> >>
>> >>>
>> >>> It may seem strange, but maybe adding opt-in semantics to the backport
>> >>> would help here. This way, if it turns out there is some sort of error
>> >>> in the code that we did not see during review, it will only affect
>> >>> those who opt into the performance enhancement, and they can easily
>> >>> disable it without the need for any emergency code fixes. For main
>> >>> (i.e. 26.09 and beyond) we would ignore whatever option is added for
>> >>> 26.03 and always use the optimized code.
>> >>
>> >> It doesn't sound right to duplicate the code just to be able to have
>> >> the old slow way of doing things, so I'd vote for not backporting in
>> >> case backporting means an extra knob.
>> >>
>> >>>
>> >>> What do you think? Does anyone else want to weigh in on this?
>> >>
>> >> As I said originally, it's a tricky one.  And it's an enhancement for
>> >> the most part, so the default answer is "no" for backports.  So, I'm
>> >> leaving this entirely for OVN maintainers to decide if it's worth the
>> >> risk.
>> >>
>> >> The last patch in the set is the largest one and the most complicated.
>> >> So, one more option to consider might be to backport only the first 3.
>> >> This would bring the majority of performance of this set to 26.03.
>> >>
>> >> Best regards, Ilya Maximets.
>> >>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Fri, Mar 13, 2026 at 12:38 PM Ilya Maximets <[email protected]>
>> >> wrote:
>> >>>>
>> >>>> When a user creates a lot of ACLs referencing large overlapping address
>> >>>> sets and port groups, ovn-controller takes a lot of time to process
>> >>>> them.  Recompute times observed in real world setups go up to a minute
>> >>>> and beyond.
>> >>>>
>> >>>> A lot of time is spent in the following places while merging a new
>> >>>> desired flow with an existing one:
>> >>>>
>> >>>> 1. ovn-controller iterates over all the referenced by the existing
>> >>>>    desired flow Sb flows and untracks address sets from them.  This
>> >>>>    has a linear complexity per merge, and touches a lot of different
>> >>>>    dynamically allocated chunks of memory.  Takes a lot of time as a
>> >>>>    number of SB flows per desired flow goes up to hundreds and beyond,
>> >>>>    while in practice after the first merge there are no longer any
>> >>>>    address sets referenced in most cases.
>> >>>>
>> >>>> 2. During the same iteration, ovn-controller checks if the same Sb
>> >>>>    flow is already referenced.  This is fine while we're doing a
>> >>>>    linear walk over the list anyway, but it can be done with a hash
>> >>>>    map or in some other way if we get rid of the linear lookup.
>> >>>>
>> >>>> 3. ovn-controller constructs a hash map from all the existing
>> >>>>    conjunctive actions to later check for duplicates.  This is
>> >>>>    inefficient when there is only one conjunction to add, which is how
>> >>>>    it is always in the current implementation.
>> >>>>
>> >>>> 4. Recompute frequently re-installs all the conjunctive flows in OVS,
>> >>>>    because the order of conjunctions in the actions depends on the
>> >>>>    order in which incremental processing is happening, and recompute
>> >>>>    has it's own order in which it processes Sb flows and merges them
>> >>>>    together into desired flows.
>> >>>>
>> >>>> This set is trying to address all the issues above by converting
>> >>>> 'references' into a hash map, counting the number of references with
>> >>>> address sets and performing the duplicate conjunction checks on the
>> >>>> original ofpacts instead of building ad-hoc hash maps, while keeping
>> >>>> the action sorted solving the problem with recompute re-creating all
>> >>>> of them.
>> >>>>
>> >>>> In a setup with 3.5K ACLs referencing the same-ish 450 IP addresses and
>> >>>> a port group with ~20 ports, this set reduces the average full recompute
>> >>>> time from 48 to 18 seconds, which is about 2.7x speed up.
>> >>>>
>> >>>>
>> >>>> As per usual, this is a performance fix, so it's on the edge between
>> >>>> a bug fix and an enhancement.  It doesn't make sense to backport below
>> >>>> 26.03, as older branches do not have the previous commit:
>> >>>>   9a4035402982 ("ofctrl: Optimize and specialize the
>> >> ofctrl_add_or_append_flow function.")
>> >>>> But if we could consider a backport to 26.03, I think, it might be a
>> >>>> good thing, as people will get much better performance in ACL-heavy
>> >>>> setups in the upcoming LTS.  Otherwise, they may need to wait for
>> >>>> another 2 years, if they are following the LTS releases.
>> >>>>
>> >>>> Patch #4 is not a performance fix, so should be backported regardless.
>> >>>>
>> >>>>
>> >>>> Ilya Maximets (5):
>> >>>>   ofctrl: Track Sb flow references in a hash map.
>> >>>>   ofctrl: Use hash map lookup for duplicate references check.
>> >>>>   ofctrl: Count the number of referenced flows with address sets.
>> >>>>   ofctrl: Fix removing wrong conjunction during the duplicate check.
>> >>>>   ofctrl: Look for duplicated conjunctions directly in the ofpacts.
>> >>>>
>> >>>>  controller/ofctrl.c | 245 +++++++++++++++++++++++++++-----------------
>> >>>>  tests/ovn-macros.at <http://ovn-macros.at> |   1 -
>> >>>>  2 files changed, 152 insertions(+), 94 deletions(-)
>> >>>>
>> >>>> --
>> >>>> 2.53.0
>> >>>>
>> >>>> _______________________________________________
>> >>>> dev mailing list
>> >>>> [email protected] <mailto:[email protected]>
>> >>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
>> >>>>
>> >>>
>> >>
>> >> _______________________________________________
>> >> dev mailing list
>> >> [email protected] <mailto:[email protected]>
>> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev>
>> >
>> >
>> > Thank you Ilya and Mark,
>> >
>>
>> Hi all,
>>
>> > first things first, I applied this to main while we continue the
>> > backport discussion. To be brief, it's true that we usually don't
>> > backport performance improvements. However, two main points lead me
>> > to lean toward backporting to 26.03:
>> >
>> > 1) 26.03 was just released with 26.03.0 so there is still a space to
>> > fix anything that this might broke (not that I believe it would).
>> >
>> > 2) 26.03 is LTS which means many CMSs might stay with this version
>> > and the benefit of having this is undeniable.
>> >
>> > Also, since this is being merged to main we can take the extra step
>> > to perform d/s testing to ensure it's safe to backport. Thoughts?
>> >
>>
>> I definitely vote for backporting this to 26.03 as is.  I had a closer
>> look at the applied patches and they seem safe.  We also most likely
>> have no users on 26.03 yet.
>>
> 
> The change looks safe to me, so I have no objection of backporting to 26.03,
> considering that it was released just this month.
> 
> But I am not sure how critical this is for real world use cases. Is it common
> to have large number of ACLs referencing same address sets? It would be much
> more efficient if the CMS can combine those to a single ACL, if possible, e.g.
> with multiple L4 ports. It would also avoid unlinking AS references triggering
> recomputes. But yeah, it is always hard to predict and make assumptions about
> customer use cases.

Yeah, what I saw in the field with different users is that frequently the
network policies in kubernetes are created in automated fashion with various
software products or in-house developed automation.  It's typically a few
policies per namespace, or per non-overlapping sets of pods based on labels.
These policies include a bunch of the same networks to deny or allow, applied
to different sets of pods.  This translates into ACLs with significantly
overlapping, but not equal, address sets in matches and different port groups.
Those can't really be combined in OVN, they only overlap on the OpenFlow
level.  But it's also hard even fro CMS to do something with these policies
to combine them, as ovn-kubernetes processes each MNP separately and it would
be messy to try and make it combine them.

The real solution here is for the software that generates these policies in
the first place to be converted to use of B/ANP for the large set of ip blocks
used in every MNP, so tiered ACLs can cover all the previously overlapping IPs.
But that's the ask for users to change how they do things or the developer of
those tools to change how they work, which is even harder than asking CMS to
do something.

Note also that ovn-k today never updates ACLs, every time network policy
changes, it deletes old ACLs and creates new ones, so address set I-P is not
really possible there.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to