The garp_rarp node was doing recompute on datapaths that are not
local, this is problematic with ovn-monitor-all enabled as the
whole cluster gets notifications about unrelated port bindings.
This could lead to the node being cancelled on large scale
deployments as there is chance that SB might be readonly when
the port binding recompute is triggered.

To avoid that skip datapaths that are not local and add condition
to recompute when the datapath becomes local. This should prevent
an issue when the port is remote and becomes local as there are
only 2 conditions when that could happen, the datapath is already
local in that case it will be processed by the port binding handler.
In the second case the datapath is remote and becomes local when
the port is bound to local chassis.

It follows the same semantics in the opposite case local->remote,
with one exception, currently we are not removing datapath that
is local during incremental processing.

Fixes: 05527bd6ccdb ("controller: Extract garp_rarp to engine node.")
Reported-by: Ilya Maximets <i.maxim...@ovn.org>
Tested-by: Ilya Maximets <i.maxim...@ovn.org>
Signed-off-by: Ales Musil <amu...@redhat.com>
---
 controller/ovn-controller.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index 1365f3e65..ad2c77485 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -5545,7 +5545,11 @@ garp_rarp_sb_port_binding_handler(struct engine_node 
*node,
         struct local_datapath *ld = get_local_datapath(
             &rt_data->local_datapaths, pb->datapath->tunnel_key);
 
-        if (!ld || ld->localnet_port) {
+        if (!ld) {
+            continue;
+        }
+
+        if (ld->localnet_port) {
             /* XXX: actually handle this incrementally. */
             return EN_UNHANDLED;
         }
@@ -5589,16 +5593,22 @@ garp_rarp_runtime_data_handler(struct engine_node 
*node, void *data OVS_UNUSED)
 
     struct tracked_datapath *tdp;
     HMAP_FOR_EACH (tdp, node, &rt_data->tracked_dp_bindings) {
-        if (tdp->tracked_type == TRACKED_RESOURCE_REMOVED) {
-            /* This is currently not handled incrementally in runtime_data
-             * so it should never happen. Recompute just in case. */
+        if (tdp->tracked_type == TRACKED_RESOURCE_REMOVED ||
+            tdp->tracked_type == TRACKED_RESOURCE_NEW) {
+            /* The removal is currently not handled incrementally in
+             * runtime_data so it should never happen. Addition can happen,
+             * recompute in both cases. */
             return EN_UNHANDLED;
         }
 
         struct local_datapath *ld = get_local_datapath(
             &rt_data->local_datapaths, tdp->dp->tunnel_key);
 
-        if (!ld || ld->localnet_port) {
+        if (!ld) {
+            continue;
+        }
+
+        if (ld->localnet_port) {
             /* XXX: actually handle this incrementally. */
             return EN_UNHANDLED;
         }
-- 
2.49.0

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to