On 23/08/2021 21:26, Ilya Maximets wrote:
On 8/23/21 10:20 PM, Anton Ivanov wrote:
Should not be the case.

The map is pre-sized for the size from the previous iterations.

Line 12861 in my tree which is probably a few commits out of date:

     fast_hmap_size_for(&lflows, max_seen_lflow_size);

And immediately after building the lflows:

     if (hmap_count(&lflows) > max_seen_lflow_size) {
         max_seen_lflow_size = hmap_count(&lflows);
     }

So the map SHOULD be sized correctly - to the most recent seen lflow count.
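The pre-sizing idiom above amounts to choosing a bucket count up front from the previous iteration's flow count, so the map never needs to expand while being rebuilt. A minimal sketch of that calculation (an assumed illustration, not the actual fast_hmap_size_for() implementation in ovn-northd, which may use a different load factor):

```c
#include <assert.h>
#include <stddef.h>

/* Round a desired capacity up to a power-of-2 bucket count so that a
 * map pre-sized for 'capacity' elements never needs to expand while
 * being (re)built.  Illustrative only: not the real
 * fast_hmap_size_for() from ovn-northd. */
static size_t
bucket_count_for(size_t capacity)
{
    size_t n = 128;             /* hmap's small default starting size. */
    while (n < capacity) {
        n *= 2;
    }
    return n;
}
```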
Please re-read the commit message.  It's the first run of the loop,
and 'max_seen_lflow_size' is the default 128 at this point.

Ack,

Not using auto-resizing in single-threaded mode is a bug. Thanks for fixing it.

From that perspective the patch is a straight +1 from me.

From the perspective of the use case stated in the commit message, I am not sure it addresses it.

If northd is in single-threaded mode and is tackling a GIGANTIC database, it may never complete the first iteration before the timers expire and everyone decides that northd is AWOL.

In that case, if it is multi-threaded from the start, it should probably grab the sizing from the lflow table hash in south db. That would be a good approximation for the first run.

A.



On 23/08/2021 21:02, Ilya Maximets wrote:
'lflow_map' is never expanded because northd always uses fast
insertion.  This leads to the case where we have a hash map
with only 128 initial buckets and every ovn_lflow_find() ends
up iterating over n_lflows / 128 entries.  It's thousands of
logical flows or even more.  For example, it takes forever for
ovn-northd to start up with the Northbound Db from the 120 node
density-heavy test from ovn-heater, because every lookup is slower
than the previous one.  I aborted the process after 15 minutes of
waiting because there was no sign that it would converge.  With
this change applied the loop completes in only 11 seconds.

Hash map will be pre-allocated to the maximum seen number of
logical flows on the second iteration, but this doesn't help for
the first iteration when northd first connects to a big
Northbound database, which is a common case during failover or
cluster upgrade.  And there is an even trickier case where a big
NbDB transaction explodes the number of logical flows on a run
other than the first.

We can't expand the hash map in case of parallel build, so this
should be fixed separately.

CC: Anton Ivanov <[email protected]>
Fixes: 74daa0607c7f ("ovn-northd: Introduce parallel lflow build")
Signed-off-by: Ilya Maximets <[email protected]>
---
   northd/ovn-northd.c | 6 +++++-
   1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 3d8e21a4f..40cf957c0 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -4387,7 +4387,11 @@ do_ovn_lflow_add(struct hmap *lflow_map, struct ovn_datapath *od,
                      nullable_xstrdup(ctrl_meter),
                      ovn_lflow_hint(stage_hint), where);
       hmapx_add(&lflow->od_group, od);
-    hmap_insert_fast(lflow_map, &lflow->hmap_node, hash);
+    if (!use_parallel_build) {
+        hmap_insert(lflow_map, &lflow->hmap_node, hash);
+    } else {
+        hmap_insert_fast(lflow_map, &lflow->hmap_node, hash);
+    }
   }
     /* Adds a row with the specified contents to the Logical_Flow table. */



--
Anton R. Ivanov
Cambridgegreys Limited. Registered in England. Company Number 10273661
https://www.cambridgegreys.com/

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
