The garp_rarp node was doing recompute on datapaths that are not local, this is problematic with ovn-monitor-all enabled as the whole cluster gets notifications about unrelated port bindings. This could lead to the node being cancelled on large scale deployments as there is chance that SB might be readonly when the port binding recompute is triggered.
To avoid that skip datapaths that are not local and add condition to recompute when the datapath becomes local. This should prevent an issue when the port is remote and becomes local as there are only 2 conditions when that could happen, the datapath is already local in that case it will be processed by the port binding handler. In the second case the datapath is remote and becomes local when the port is bound to local chassis. It follows the same semantics in the opposite case local->remote, with one exception, currently we are not removing datapath that is local during incremental processing. Fixes: 05527bd6ccdb ("controller: Extract garp_rarp to engine node.") Reported-by: Ilya Maximets <i.maxim...@ovn.org> Tested-by: Ilya Maximets <i.maxim...@ovn.org> Signed-off-by: Ales Musil <amu...@redhat.com> --- controller/ovn-controller.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index 1365f3e65..ad2c77485 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -5545,7 +5545,11 @@ garp_rarp_sb_port_binding_handler(struct engine_node *node, struct local_datapath *ld = get_local_datapath( &rt_data->local_datapaths, pb->datapath->tunnel_key); - if (!ld || ld->localnet_port) { + if (!ld) { + continue; + } + + if (ld->localnet_port) { /* XXX: actually handle this incrementally. */ return EN_UNHANDLED; } @@ -5589,16 +5593,22 @@ garp_rarp_runtime_data_handler(struct engine_node *node, void *data OVS_UNUSED) struct tracked_datapath *tdp; HMAP_FOR_EACH (tdp, node, &rt_data->tracked_dp_bindings) { - if (tdp->tracked_type == TRACKED_RESOURCE_REMOVED) { - /* This is currently not handled incrementally in runtime_data - * so it should never happen. Recompute just in case. */ + if (tdp->tracked_type == TRACKED_RESOURCE_REMOVED || + tdp->tracked_type == TRACKED_RESOURCE_NEW) { + /* The removal is currently not handled incrementally in + * runtime_data so it should never happen. Addition can happen, + * recompute in both cases. */ return EN_UNHANDLED; } struct local_datapath *ld = get_local_datapath( &rt_data->local_datapaths, tdp->dp->tunnel_key); - if (!ld || ld->localnet_port) { + if (!ld) { + continue; + } + + if (ld->localnet_port) { /* XXX: actually handle this incrementally. */ return EN_UNHANDLED; } -- 2.49.0 _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev