On 8/15/25 3:42 PM, Dumitru Ceara wrote:
> On 8/15/25 11:17 AM, Dumitru Ceara wrote:
>> On 7/31/25 10:52 PM, Lorenzo Bianconi wrote:
>>> This patch handles the logical switch creation incrementally
>>> in the northd engine node.  The dependent engine nodes - ls_stateful,
>>> lflow and few others still fall back to full recompute, which
>>> will be handled in separate patches.
>>>
>>> Reported-at: https://issues.redhat.com/browse/FDP-754
>>> Co-authored-by: Numan Siddique <num...@ovn.org>
>>> Signed-off-by: Numan Siddique <num...@ovn.org>
>>> Signed-off-by: Lorenzo Bianconi <lorenzo.bianc...@redhat.com>
>>> ---
>>
>> Hi Lorenzo,
>>
>> Not a full review, just one thing I found while debugging the rebase of
>> this series on the current main (and with Jacob's series to do LR
>> creation I-P).
>>
>>>  northd/en-lflow.c        |   4 +
>>>  northd/en-ls-stateful.c  |   4 +
>>>  northd/en-multicast.c    |   4 +
>>>  northd/inc-proc-northd.c |   5 +-
>>>  northd/lb.c              |   2 +-
>>>  northd/lb.h              |   1 +
>>>  northd/lflow-mgr.c       |   8 +-
>>>  northd/northd.c          | 185 +++++++++++++++++++++++++++++++++------
>>>  northd/northd.h          |  22 ++++-
>>>  tests/ovn-northd.at      |  61 +++++++++++++
>>>  10 files changed, 261 insertions(+), 35 deletions(-)
>>>
>>
>> [...]
>>
>>> @@ -4910,6 +4985,54 @@ northd_handle_ls_changes(struct ovsdb_idl_txn 
>>> *ovnsb_idl_txn,
>>>          }
>>>      }
>>>  
>>> +    HMAPX_FOR_EACH (node, &ni->synced_lses->deleted) {
>>> +        const struct ovn_synced_logical_switch *synced = node->data;
>>> +        const struct nbrec_logical_switch *deleted_ls = synced->nb;
>>> +
>>> +        struct ovn_datapath *od = ovn_datapath_find_(
>>> +                                    &nd->ls_datapaths.datapaths,
>>> +                                    &deleted_ls->header_.uuid);
>>> +        if (!od) {
>>> +            static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
>>> +            VLOG_WARN_RL(&rl, "Internal error: a tracked updated LS 
>>> doesn't "
>>> +                         "exist in ls_datapaths: "UUID_FMT,
>>> +                         UUID_ARGS(&deleted_ls->header_.uuid));
>>> +            goto fail;
>>> +        }
>>> +
>>> +        if (deleted_ls->copp || deleted_ls->n_dns_records ||
>>> +            deleted_ls->n_forwarding_groups || deleted_ls->n_qos_rules ||
>>> +            deleted_ls->n_load_balancer || 
>>> deleted_ls->n_load_balancer_group) {
>>> +            goto fail;
>>> +        }
>>> +
>>> +        if (!ls_handle_lsp_changes(ovnsb_idl_txn, deleted_ls,
>>> +                                   ni, nd, od, &trk_data->trk_lsps)) {
>>> +            goto fail;
>>> +        }
>>> +
>>> +        hmap_remove(&nd->ls_datapaths.datapaths, &od->key_node);
>>> +        vector_remove(&nd->ls_datapaths.dps, od->index, NULL);
>>> +        bitmap_set0(nd->ls_datapaths.dps_index_map.map, od->index);
>>> +        nd->ls_datapaths.dps_index_map.n_elems--;
>>> +
>>
>> At this point, all remaining datapaths (after od->index) have wrong
>> indices.  That's because vector removal of an element shifts the
>> remainder of the vector to the left.
>>
>>> +        const struct sbrec_ip_multicast *ip_mcast =
>>> +            ip_mcast_lookup(ni->sbrec_ip_mcast_by_dp, od->sb);
>>> +        if (ip_mcast) {
>>> +            sbrec_ip_multicast_delete(ip_mcast);
>>> +        }
>>> +
>>> +        if (is_ls_acls_changed(deleted_ls)) {
>>> +            hmapx_add(&trk_data->ls_with_changed_acls, od);
>>> +        }
>>> +        hmapx_add(&trk_data->trk_switches.deleted, od);
>>> +    }
>>> +
>>> +    if (!hmapx_is_empty(&trk_data->trk_switches.crupdated) ||
>>> +        !hmapx_is_empty(&trk_data->trk_switches.deleted)) {
>>> +        trk_data->type |= NORTHD_TRACKED_SWITCHES;
>>> +    }
>>> +
>>
>> One potential way of fixing this is to reinitialize _all_ od->index
>> values here, after we performed all removals.
>>
> 
> Thinking more about it, that won't be enough either.  All dependent
> nodes (ls_stateful, lr_stateful, lb_data) they all rely on the
> ovn_datapath index not changing.
> 
> I think a simpler and safer solution is to not call vector_remove()
> above but instead just "clear" the entry in the vector (generating a hole).
> 
> All dependent nodes can still rely on indices of other (not removed)
> datapaths because those don't change.
> 
> One thing to take into account is: this might generate very sparse
> vectors at some point so we should have a heuristic that triggers a
> recompute once we reach, e.g., 50% empty vectors.
> 
> I'm not sure I can get this done, but I'll try to implement this on top
> of your series in the hope that we can still merge it in 25.09, before
> branching today.
> 

I take it back, I won't be able to provide a well tested implementation
today.  I think we should defer this to 25.09.

Regards,
Dumitru

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to