On 3/7/25 16:51, Aleksandra Rukomoinikova wrote:
> Hi Ilya!
> 
> I’m not entirely clear on your point about the impact of this change on 
> performance.
> Formally, without my changes, two strings are already added to the hash: 
> action and clone,
> which are usually much longer than just the stage name.
> I don’t think the recalculation of flows based on hashes is particularly 
> significant. 
> Could you explain the impact? Is this somehow related to datapath groups?

Each lflow is getting hashed and compared against existing ones right after
creation.  At least in the past, that was a fairly CPU intensive operation.
Especially because in some cases northd generated a large amount of duplicated
logical flows, so the actual number of comparisons was significantly larger
than the total number of logical flows that ended up in the southbound DB.

So, even a smallest of changes in a way lflows are hashed and compared would
result in very significant changes in the overall performance of ovn-northd.

However, over time we optimized different aspects of lflow generation including
optimized lflow generation for datapath groups in places where northd knows
that all the flows will indeed be grouped.  And there are not so many cases
now where duplicates can be generated, so it seems to be less of a concern in
the latest versions of OVN.

> 
> I also thought about an alternative approach where we would know the hashes 
> for each
> stage name in advance and simply add this hash instead of computing it from 
> the string,
> since it’s a stateful value. However, ovn_stage_to_str doesn’t access the 
> database but
> retrieves the string from memory, and this didn’t seem like a significant 
> optimization to me.
> 
> I conducted some research on the impact of this patch on the performance of 
> upstream version.
> In a setup with 6000 logical switches and 15,000 ports, I collected some 
> metrics regarding performance.
> 
> The average utilization metrics for hash maps during recomputation were:
>  • Without the patch:
> hmap_expand: 5928.750/sec (average), 98.8125/sec (overall average)
> 
>  • With the patch:
> hmap_expand: 5921.117/sec (average), 98.6853/sec (overall average)
>

This one is not a very useful metric.  At least for this measurement.

> Statistics for lflows_to_sb:
>  • With the patch: 154.000000 msec
>  • Without the patch: 151.000000 msec
> 
> These results seem to fall within the range of normal deviation.

It seems, you're right.  I did a few tests of my own and I also see an
insignificant performance difference in most cases.  It is a little surprising
to not see much difference, but maybe it shouldn't bee.  All the changes we made
in the last few years indeed decreased the importance of the lflow hashing and
comparisons.

I would like to run a larger set of ovn-heater tests though before concluding 
that
there is actually no real difference.  However, ...

It seems like an OVN_INTERNAL_MINOR_VER wasn't increased for a long time now and
it supposed to be increased every time logical pipeline stages are 
added/modified.
That might be the root of the issue you're trying to solve.

When the internal version changes, that should trigger updates for all the stage
names and hints and the source locators for all the logical flows, see the
'if (ovn_internal_version_changed)' branch in the sync_lflow_to_sb().

I think, we just need to bump that number and maybe add some build assertions,
so we do not forget about it again...

What do you think?  Dumitru, what's your take on this?

Best regards, Ilya Maximets.

> 
> 
>> On 6 Mar 2025, at 16:50, Ilya Maximets <i.maxim...@ovn.org> wrote:
>>
>> On 3/6/25 09:41, Alexandra Rukomoinikova wrote:
>>> When changing the order of stages in the LogicalFlow
>>> table in SB, some rules will be added with an incorrect
>>> stage-name in the external-ids of existing datapath groups.
>>> When adding a stage, new lflows in a new table with the same
>>> match and action that already existed with the same table-id
>>> will remain in the database with unupdated stage names.
>>>
>>> Signed-off-by: Alexandra Rukomoinikova <arukomoinikova@k2.cloud>
>>> ---
>>> lib/ovn-util.c     | 5 ++++-
>>> lib/ovn-util.h     | 2 +-
>>> northd/lflow-mgr.c | 4 +++-
>>> 3 files changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/lib/ovn-util.c b/lib/ovn-util.c
>>> index c65b36bb5..d1572d5ec 100644
>>> --- a/lib/ovn-util.c
>>> +++ b/lib/ovn-util.c
>>> @@ -646,16 +646,19 @@ sbrec_logical_flow_hash(const struct 
>>> sbrec_logical_flow *lf)
>>> {
>>>     return ovn_logical_flow_hash(lf->table_id,
>>>                                  ovn_pipeline_from_name(lf->pipeline),
>>> +                                 smap_get_def(&lf->external_ids,
>>> +                                              "stage-name", ""),
>>>                                  lf->priority, lf->match,
>>>                                  lf->actions);
>>> }
>>
>> Hi, Alexandra.  Thanks for the patch!
>>
>> Though, IIRC, it was an intentional decision to not include the stage
>> name in the hash/comparison to avoid a significant performance impact
>> that this hashmap lookup and the string hashing bring.
>>
>> The issue only affects debugging capabilities, but the performance
>> impact of hashing the name on each database update and comparing extra
>> strings on lflow generation as well as re-writing most of the flows
>> on stage updates is significant and affects every operation in northd.
>>
>> Did you run some scale tests with this change?  What's the impact in
>> current OVN?
>>
>> Best regards, Ilya Maximets.
> 

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to