On 8/24/21 7:36 AM, Anton Ivanov wrote:
> On 23/08/2021 22:36, Ilya Maximets wrote:
>> On 8/23/21 10:37 PM, Anton Ivanov wrote:
>>> On 23/08/2021 21:26, Ilya Maximets wrote:
>>>> On 8/23/21 10:20 PM, Anton Ivanov wrote:
>>>>> Should not be the case.
>>>>>
>>>>> The map is pre-sized for the size from the previous iterations.
>>>>>
>>>>> Line 12861 in my tree, which is probably a few commits out of date:
>>>>>
>>>>> fast_hmap_size_for(&lflows, max_seen_lflow_size);
>>>>>
>>>>> And immediately after building the lflows:
>>>>>
>>>>> if (hmap_count(&lflows) > max_seen_lflow_size) {
>>>>>     max_seen_lflow_size = hmap_count(&lflows);
>>>>> }
>>>>>
>>>>> So the map SHOULD be sized correctly - to the most recent seen lflow
>>>>> count.
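[Editor's note: to make the pre-size-then-fast-insert pattern above concrete, here is a minimal, self-contained toy in C. It is a sketch only: the struct and function names are illustrative stand-ins, not the real OVS hmap API, and the load-factor threshold of 2 is an assumption.]

```c
#include <assert.h>
#include <stdlib.h>

/* Toy separate-chaining hash map; names are illustrative, not the
 * actual OVS hmap API. */
struct node {
    struct node *next;
    unsigned int hash;
};

struct map {
    struct node **buckets;
    size_t mask;   /* n_buckets - 1; n_buckets is a power of 2. */
    size_t count;
};

static void map_init(struct map *m, size_t n_buckets)
{
    m->buckets = calloc(n_buckets, sizeof *m->buckets);
    m->mask = n_buckets - 1;
    m->count = 0;
}

/* "Fast" insert: no resize check.  Cheap, and the mode parallel
 * workers rely on, but the bucket array never grows, so chains get
 * longer and longer if the map was under-sized. */
static void map_insert_fast(struct map *m, struct node *n, unsigned int hash)
{
    struct node **b = &m->buckets[hash & m->mask];
    n->hash = hash;
    n->next = *b;
    *b = n;
    m->count++;
}

/* Auto-resizing insert: doubles the bucket array once the load factor
 * exceeds 2, keeping lookups O(1) on average. */
static void map_insert(struct map *m, struct node *n, unsigned int hash)
{
    if (m->count >= 2 * (m->mask + 1)) {
        size_t new_n = 2 * (m->mask + 1);
        struct node **nb = calloc(new_n, sizeof *nb);
        for (size_t i = 0; i <= m->mask; i++) {
            struct node *p = m->buckets[i], *next;
            for (; p; p = next) {
                next = p->next;
                p->next = nb[p->hash & (new_n - 1)];
                nb[p->hash & (new_n - 1)] = p;
            }
        }
        free(m->buckets);
        m->buckets = nb;
        m->mask = new_n - 1;
    }
    map_insert_fast(m, n, hash);
}
```

With 10000 insertions into a 128-bucket map, map_insert_fast() leaves 128 buckets (average chain length around 78), while map_insert() grows the table to 8192 buckets. That difference is exactly why every lookup degrades when only the fast path is ever used on an under-sized map.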
>>>> Please, re-read the commit message. It's a first run of the loop
>>>> and the 'max_seen_lflow_size' is default 128 at this point.
>>> Ack,
>>>
>>> Not using auto-resizing in single threaded mode is a bug. Thanks for fixing
>>> it.
>>>
>>> From that perspective the patch is a straight +1 from me.
>>>
>>> From the perspective of the use case stated in the commit message - I am
>>> not sure the patch addresses it.
>>>
>>> If northd is in single-threaded mode and is tackling a GIGANTIC database,
>>> it may never complete the first iteration before the
>>> expiration of the timers and everyone deciding that northd is AWOL.
>> Well, how do you suggest to fix that? Obviously, we can always create
>> a database that northd will never be able to process in a reasonable
>> amount of time. And it doesn't matter if it's single- or multi-threaded.
>>
>> In this case NbDB is only 9MB in size, which is very reasonable, and
>> northd on my laptop takes more than 15 minutes to process it (I killed
>> it at this point). With the patch applied it took only 11 seconds.
>> So, for me, this patch pretty much fixes the issue. 11 seconds is not
>> that bad, e.g. ovn-k8s configures inactivity probes for clients to 180
>> seconds. It would be great to reduce this further, but we're not there yet.
>>
>>> In that case, if it is multi-threaded from the start, it should probably
>>> grab the sizing from the lflow table hash in south db. That would be a
>>> good approximation for the first run.
>> This will not work for a case where SbDB is empty for any reason while
>> NbDB is not. And there is still a case where northd initially connects
>> to semi-empty databases and after a few iterations NbDB receives a big
>> transaction and generates a big update for northd.
>
> A partial fix is to resize the hash to its optimum size after lflow processing.
I'm not sure I understand what you mean here, because resizing after
lflow processing will not help. The lflow processing itself is the
very slow part that we're trying to make faster here.
>
> If the sizing was correct - 99.9% of the case this will be a noop.
>
> If the sizing was incorrect, it will be resized so that the DP searches and
> all other ops which were recently added for flow reduction will work
> optimally.
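[Editor's note: a sketch of that partial fix - a one-shot "resize to optimum" pass after the build phase. This is a toy illustration, not the OVS hmap API (the real tree has hmap_expand() for roughly this purpose); the minimum of 128 buckets mirrors the default size mentioned earlier in the thread.]

```c
#include <assert.h>
#include <stdlib.h>

/* Toy separate-chaining map, as a stand-in for the real hmap. */
struct node {
    struct node *next;
    unsigned int hash;
};

struct map {
    struct node **buckets;
    size_t mask;   /* n_buckets - 1; n_buckets is a power of 2. */
    size_t count;
};

static void map_init(struct map *m, size_t n_buckets)
{
    m->buckets = calloc(n_buckets, sizeof *m->buckets);
    m->mask = n_buckets - 1;
    m->count = 0;
}

/* No-resize insert, as used during a parallel build. */
static void map_insert_fast(struct map *m, struct node *n, unsigned int hash)
{
    struct node **b = &m->buckets[hash & m->mask];
    n->hash = hash;
    n->next = *b;
    *b = n;
    m->count++;
}

/* One-shot resize after the build phase: pick the smallest power of 2
 * holding one element per bucket on average, then rehash.  If the map
 * was pre-sized correctly this returns immediately, which is why it is
 * cheap to call unconditionally after processing. */
static void map_resize_to_fit(struct map *m)
{
    size_t want = 128;                 /* assumed default minimum */
    while (want < m->count) {
        want <<= 1;
    }
    if (want == m->mask + 1) {
        return;                        /* already optimal: no-op */
    }
    struct node **nb = calloc(want, sizeof *nb);
    for (size_t i = 0; i <= m->mask; i++) {
        struct node *p = m->buckets[i], *next;
        for (; p; p = next) {
            next = p->next;
            p->next = nb[p->hash & (want - 1)];
            nb[p->hash & (want - 1)] = p;
        }
    }
    free(m->buckets);
    m->buckets = nb;
    m->mask = want - 1;
}
```

Note the limitation raised elsewhere in this thread still applies: this only speeds up lookups performed after the build; it does nothing for the lookups made while the under-sized map is being filled.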
>
> This still does not work for lflow compute with dpgroups + parallel upon
> initial connect and without a SB database to use for size guidance. It will
> work for all other cases.
>
> I will send two separate patches to address the cases which can be easily
> addressed and see what can be done with the dp+parallel upon initial connect
> to an empty sb database.
>
> Brgds,
>
> A
>
>>
>>> A.
>>>
>>>>> A.
>>>>>
>>>>> On 23/08/2021 21:02, Ilya Maximets wrote:
>>>>>> 'lflow_map' is never expanded because northd always uses fast
>>>>>> insertion. This leads to the case where we have a hash map
>>>>>> with only 128 initial buckets and every ovn_lflow_find() ends
>>>>>> up iterating over n_lflows / 128 entries. It's thousands of
>>>>>> logical flows or even more. For example, it takes forever for
>>>>>> ovn-northd to start up with the Northbound Db from the 120 node
>>>>>> density-heavy test from ovn-heater, because every lookup is slower
>>>>>> than the previous one. I aborted the process after 15 minutes of
>>>>>> waiting, because there was no sign that it would converge. With
>>>>>> this change applied the loop completes in only 11 seconds.
>>>>>>
>>>>>> Hash map will be pre-allocated to the maximum seen number of
>>>>>> logical flows on a second iteration, but this doesn't help for
>>>>>> the first iteration when northd first time connects to a big
>>>>>> Northbound database, which is a common case during failover or
>>>>>> cluster upgrade. And there is an even trickier case where a big
>>>>>> NbDB transaction that explodes the number of logical flows is
>>>>>> received after the first run.
>>>>>>
>>>>>> We can't expand the hash map in case of parallel build, so this
>>>>>> should be fixed separately.
>>>>>>
>>>>>> CC: Anton Ivanov <[email protected]>
>>>>>> Fixes: 74daa0607c7f ("ovn-northd: Introduce parallel lflow build")
>>>>>> Signed-off-by: Ilya Maximets <[email protected]>
>>>>>> ---
>>>>>> northd/ovn-northd.c | 6 +++++-
>>>>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>>>>> index 3d8e21a4f..40cf957c0 100644
>>>>>> --- a/northd/ovn-northd.c
>>>>>> +++ b/northd/ovn-northd.c
>>>>>> @@ -4387,7 +4387,11 @@ do_ovn_lflow_add(struct hmap *lflow_map, struct ovn_datapath *od,
>>>>>> nullable_xstrdup(ctrl_meter),
>>>>>> ovn_lflow_hint(stage_hint), where);
>>>>>> hmapx_add(&lflow->od_group, od);
>>>>>> -    hmap_insert_fast(lflow_map, &lflow->hmap_node, hash);
>>>>>> +    if (!use_parallel_build) {
>>>>>> +        hmap_insert(lflow_map, &lflow->hmap_node, hash);
>>>>>> +    } else {
>>>>>> +        hmap_insert_fast(lflow_map, &lflow->hmap_node, hash);
>>>>>> +    }
>>>>>> }
>>>>>> /* Adds a row with the specified contents to the Logical_Flow table. */
>>
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev