Both ovn-ic and ovn-northd were trying to update the az-name option of
Service_Monitor records, but their updates differed by the _comment.
This resulted in 100% CPU usage by both ovn-northd and ovn-ic.

The test "Service Monitor synchronization" was hitting the issue but still
passed. The test has been updated to detect the issue and to fail if it is
encountered again.

Reported-at: https://issues.redhat.com/browse/FDP-2657
Fixes: f40a5c3c803a ("ic: Implement cross-AZ service monitor synchronization.")
Signed-off-by: Xavier Simonart <[email protected]>
---
 ic/ovn-ic.c     | 21 ++++++++++++---------
 tests/ovn-ic.at |  8 ++++++++
 2 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/ic/ovn-ic.c b/ic/ovn-ic.c
index fd5ecefb3..95d73cb4b 100644
--- a/ic/ovn-ic.c
+++ b/ic/ovn-ic.c
@@ -3027,16 +3027,19 @@ sync_service_monitor(struct ic_context *ctx)
             sbrec_service_monitor_set_ic_learned(sb_rec, true);
         }
 
-        /* Always update options since they may change via
-         * NB configuration. Also update chassis_name if
-         * the port has been reassigned to a different chassis.
-         */
-        if (svc_mon->chassis_name) {
-            sbrec_service_monitor_set_chassis_name(sb_rec,
-                svc_mon->chassis_name);
+        /* Only update if ic owns it */
+        if (sb_rec->ic_learned) {
+            /* Always update options since they may change via
+             * NB configuration. Also update chassis_name if
+             * the port has been reassigned to a different chassis.
+             */
+            if (svc_mon->chassis_name) {
+                sbrec_service_monitor_set_chassis_name(sb_rec,
+                    svc_mon->chassis_name);
+            }
+            sbrec_service_monitor_set_options(sb_rec, &db_rec->options);
+            refresh_sb_record_cache(&sync_data.local_sb_svcs_map, sb_rec);
         }
-        sbrec_service_monitor_set_options(sb_rec, &db_rec->options);
-        refresh_sb_record_cache(&sync_data.local_sb_svcs_map, sb_rec);
     }
 
     /* Delete local created records that are no longer used. */
diff --git a/tests/ovn-ic.at b/tests/ovn-ic.at
index 370a755be..1a826aa1c 100644
--- a/tests/ovn-ic.at
+++ b/tests/ovn-ic.at
@@ -4038,6 +4038,14 @@ check ovn-nbctl lb-del az2_lb1
 check_row_count sb:Service_Monitor 0
 check_row_count ic-sb:Service_Monitor 0
 
+# We expect something around 20 Service_Monitor updates.
+# Make sure there is no fight between ic and northd, which used to cause
+# up to more than 2000 updates.
+for az in az1 az2 az3; do
+    svc_update_count=$(grep -c Service_Monitor $az/ovn-sb/ovn-sb.db)
+    check test $svc_update_count -lt 50
+done
+
 OVN_CLEANUP_IC([az1], [az2], [az3])
 AT_CLEANUP
 ])
-- 
2.47.1
