On 7/9/21 3:46 AM, Han Zhou wrote:
> On Thu, Jul 8, 2021 at 1:10 PM Mark Michelson <[email protected]> wrote:
>>
>> Acked-by: Mark Michelson <[email protected]>
>>
> Thanks Dumitru and Mark!
> 

Hi Mark and Han,

Thanks for the reviews!

>>
>> If I'm to extrapolate this a bit more, it sounds like any time a
>> Datapath_Binding is updated, it results in the potential for many
>> Logical_Flows to be re-calculated. In this particular case, it was
>> triggered based on modifying a wide-reaching load balancer. I suppose
>> the same thing would happen if Datapath_Binding external_ids were updated.
>>
> Yes. Luckily it seems external_ids of datapath_binding shouldn't change
> very often. But I agree this is a general problem of the I-P.
> 

I see it in the same way.  I-P doesn't give per column change tracking
information.

>> I wonder if it's worth investigating trying to create a new reference
>> type for the purposes of the IDL. The idea would be that if a column in
>> table A references rows in table B, then if the rows in table B are
>> updated, then any relevant rows in table A will reported as updated as
>> well. However, any columns in tables C, D, E, etc. that reference table
>> A will not result in the rows in tables C, D, E, etc. being reported as
>> updated.
>>
>> Essentially, it's the same as what you've done here, except we can have
>> the IDL fill in the sbrec_load_balancer in the Datapath_Binding record
>> instead of having to look it up when needed.
>>
> It may be helpful to have a new reference type, but deciding when and where
> to use it would be very tricky. If not careful enough, we might be creating
> some dependency in the code without getting proper change notification.
> 
>> Alternatively, would it be worth listing the relevant columns that a
>> particular type "cares" about? For instance, Logical_Datapaths are only
>> affected if the tunnel_key is updated on a Datapath_Binding. The
>> load_balancers and external_ids are irrelevant. Would it be worthwhile
>> to be able to note in the reference the columns that should elicit an
>> update?
> 
> This sounds more reliable. Ideally, if referenced columns are noted, the
> generated IDL structure of A's field B should only contains the columns
> that are interested by A. This way we can make sure it is impossible to
> secretly use a column without being notified of updates that possibly need
> to be handled by I-P. I am not sure how hard this would be.
> 

I think this is part of a larger discussion (there were some started a
while back) about refactoring ovn-controller in general.  As far as I've
seen there are multiple options:

- keep using the IDL and enhance it (and the schema), suggested above
- stop using IDL, use ovsdb-cs directly instead and implement a new type
of incremental processing in ovn-controller.  This can be either ddlog
on top of ovsdb-cs (like ovn-northd-ddlog does), or something else.

>>
>> Is it worth it? Or is this such a niche use case that it's not worth
>> implementing?
> 
> I think we can see if there are more such scenarios that justify the IDL
> change. I am ok with changing the schema for the current use case to
> replace the reference by just raw uuid. However, we should never directly
> use IDL from the ctx and access any table, this would break the I-P
> dependency tracking. Please see more comments inlined.
> 

In the general case I tend to agree but in this specific case
Datapath_Binding doesn't really depend on Load_Balancer, it just uses
those uuids as hints.  Please see my replies inline.

>>
>> On 7/7/21 10:56 AM, Dumitru Ceara wrote:
>>> Whenever a Load_Balancer is updated, e.g., a VIP is added, the following
>>> sequence of events happens:
>>>
>>> 1. The Southbound Load_Balancer record is updated.
>>> 2. The Southbound Datapath_Binding records on which the Load_Balancer is
>>>     applied are updated.
>>> 3. Southbound ovsdb-server sends updates about the Load_Balancer and
>>>     Datapath_Binding records to ovn-controller.
>>> 4. The IDL layer in ovn-controller processes the updates at #3, but
>>>     because of the SB schema references between tables [0] all logical
>>>     flows referencing the updated Datapath_Binding are marked as
>>>     "updated".  The same is true for Logical_DP_Group records
>>>     referencing the Datapath_Binding, and also for all logical flows
>>>     pointing to the new "updated" datapath groups.
>>> 5. ovn-controller ends up recomputing (removing/readding) all flows for
>>>     all these tracked updates.
>>>
>>>  From the SB Schema:
>>>          "Datapath_Binding": {
>>>              "columns": {
>>>                  [...]
>>>                  "load_balancers": {"type": {"key": {"type": "uuid",
>>>                                                     "refTable":
> "Load_Balancer",
>>>                                                     "refType": "weak"},
>>>                                              "min": 0,
>>>                                              "max": "unlimited"}},
>>>          [...]
>>>          "Load_Balancer": {
>>>              "columns": {
>>>                  "datapaths": {
>>>                  [...]
>>>                      "type": {"key": {"type": "uuid",
>>>                                       "refTable": "Datapath_Binding"},
>>>                               "min": 0, "max": "unlimited"}},
>>>          [...]
>>>          "Logical_DP_Group": {
>>>              "columns": {
>>>                  "datapaths":
>>>                      {"type": {"key": {"type": "uuid",
>>>                                        "refTable": "Datapath_Binding",
>>>                                        "refType": "weak"},
>>>                                "min": 0, "max": "unlimited"}}},
>>>          [...]
>>>          "Logical_Flow": {
>>>              "columns": {
>>>                  "logical_datapath":
>>>                      {"type": {"key": {"type": "uuid",
>>>                                        "refTable": "Datapath_Binding"},
>>>                                "min": 0, "max": 1}},
>>>                  "logical_dp_group":
>>>                      {"type": {"key": {"type": "uuid",
>>>                                        "refTable": "Logical_DP_Group"},
>>>
>>> In order to avoid this unnecessary Logical_Flow notification storm we
>>> now remove the explicit reference from Datapath_Binding to
>>> Load_Balancer and instead store raw UUIDs.
>>>
>>> This means that on the ovn-controller side we need to perform a
>>> Load_Balancer table lookup by UUID whenever a new datapath is added,
>>> but that doesn't happen too often and the cost of the lookup is
>>> negligible compared to the huge cost of processing the unnecessary
>>> logical flow updates.
>>>
>>> This change is backwards compatible because the contents stored in the
>>> database are not changed, just that the schema constraints are relaxed a
>>> bit.
>>>
>>> Some performance measurements, on a scale test deployment simulating an
>>> ovn-kubernetes deployment with 120 nodes and a large load balancer
>>> with 16K VIPs associated to each node's logical switch, the event
>>> processing loop time in ovn-controller, when adding a new VIP, is
>>> reduced from ~39 seconds to ~8 seconds.
>>>
>>> There's no need to change the northd DDlog implementation.
>>>
>>> Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1978605
>>> Signed-off-by: Dumitru Ceara <[email protected]>
>>> ---
>>>   controller/lflow.c          |  6 ++++--
>>>   controller/lflow.h          |  2 ++
>>>   controller/ovn-controller.c |  2 ++
>>>   lib/inc-proc-eng.h          |  1 +
>>>   northd/ovn-northd.c         | 14 ++++++--------
>>>   ovn-sb.ovsschema            |  6 ++----
>>>   6 files changed, 17 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/controller/lflow.c b/controller/lflow.c
>>> index 60aa011ff..877bd49b0 100644
>>> --- a/controller/lflow.c
>>> +++ b/controller/lflow.c
>>> @@ -1744,8 +1744,10 @@ lflow_processing_end:
>>>       /* Add load balancer hairpin flows if the datapath has any load
> balancers
>>>        * associated. */
>>>       for (size_t i = 0; i < dp->n_load_balancers; i++) {
>>> -        consider_lb_hairpin_flows(dp->load_balancers[i],
>>> -                                  l_ctx_in->local_datapaths,
>>> +        const struct sbrec_load_balancer *lb =
>>> +            sbrec_load_balancer_get_for_uuid(l_ctx_in->sb_idl,
>>> +                                             &dp->load_balancers[i]);
> 
> We should never directly access DB tables this way within the I-P engine.
> We will lose track of the dependencies. Now that we removed the reference,
> we should add load_balancer table as a new dependency, get the table data
> through EN_OVSDB_GET, and explicitly add a change handler to avoid
> recompute.
> 

There's some confusion here.

Datapath doesn't really depend on Load_Balancer.  From the beginning
there shouldn't have been a reference from Datapath_Binding to
Load_Balancer.

As a matter of fact NB.Load_Balancer used to be completely handled by
ovn-northd and would generate logical flows.  So the real dependency is
"en_lflow_output" depends on SB load balancers.

We already have an I-P handler for that:
lflow_output_sb_load_balancer_handler()

We don't *really* need to store load_balancer references in
Datapath_Binding.  The only reason they were added there when
Load_Balancer flow generation was moved to ovn-controller was to avoid
walking all load balancers whenever a datapath was added.

However, when a load balancer is changed there's no change to be handled
for datapaths themselves.

As a matter of fact, an alternative implementation (that obviously
doesn't scale well) is:

SBREC_LOAD_BALANCER_TABLE_FOR_EACH (lb, l_ctx_in->lb_table) {
   for (size_t i = 0; i < lb->n_datapaths; i++) {
      if (lb->datapaths[i] == dp) {
          consider_lb_hairpin_flows(lb, ...);
          break;
      }
   }
}

To avoid this potentially expensive table walk, we use the load_balancer
uuids stored in the datapath record itself (it's probably best to see
those as hints I guess).

>>> +        consider_lb_hairpin_flows(lb, l_ctx_in->local_datapaths,
>>>                                     l_ctx_out->flow_table);
>>>       }
>>>
>>> diff --git a/controller/lflow.h b/controller/lflow.h
>>> index c17ff6dd4..5ba7af894 100644
>>> --- a/controller/lflow.h
>>> +++ b/controller/lflow.h
>>> @@ -40,6 +40,7 @@
>>>   #include "openvswitch/list.h"
>>>
>>>   struct ovn_extend_table;
>>> +struct ovsdb_idl;
>>>   struct ovsdb_idl_index;
>>>   struct ovn_desired_flow_table;
>>>   struct hmap;
>>> @@ -126,6 +127,7 @@ void lflow_resource_destroy(struct
> lflow_resource_ref *);
>>>   void lflow_resource_clear(struct lflow_resource_ref *);
>>>
>>>   struct lflow_ctx_in {
>>> +    struct ovsdb_idl *sb_idl;
>>>       struct ovsdb_idl_index *sbrec_multicast_group_by_name_datapath;
>>>       struct ovsdb_idl_index *sbrec_logical_flow_by_logical_datapath;
>>>       struct ovsdb_idl_index *sbrec_logical_flow_by_logical_dp_group;
>>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
>>> index 9050380f3..24443bad1 100644
>>> --- a/controller/ovn-controller.c
>>> +++ b/controller/ovn-controller.c
>>> @@ -2016,6 +2016,7 @@ init_lflow_ctx(struct engine_node *node,
>>>           engine_get_input_data("port_groups", node);
>>>       struct shash *port_groups = &pg_data->port_groups_cs_local;
>>>
>>> +    l_ctx_in->sb_idl = engine_get_context()->ovnsb_idl;
>>>       l_ctx_in->sbrec_multicast_group_by_name_datapath =
>>>           sbrec_mc_group_by_name_dp;
>>>       l_ctx_in->sbrec_logical_flow_by_logical_datapath =
>>> @@ -3124,6 +3125,7 @@ main(int argc, char *argv[])
>>>           }
>>>
>>>           struct engine_context eng_ctx = {
>>> +            .ovnsb_idl = ovnsb_idl_loop.idl,
>>>               .ovs_idl_txn = ovs_idl_txn,
>>>               .ovnsb_idl_txn = ovnsb_idl_txn,
>>>               .client_ctx = &ctrl_engine_ctx
>>> diff --git a/lib/inc-proc-eng.h b/lib/inc-proc-eng.h
>>> index 7e9f5bb70..506da51a1 100644
>>> --- a/lib/inc-proc-eng.h
>>> +++ b/lib/inc-proc-eng.h
>>> @@ -64,6 +64,7 @@
>>>   #define ENGINE_MAX_OVSDB_INDEX 256
>>>
>>>   struct engine_context {
>>> +    struct ovsdb_idl *ovnsb_idl;
>>>       struct ovsdb_idl_txn *ovs_idl_txn;
>>>       struct ovsdb_idl_txn *ovnsb_idl_txn;
>>>       void *client_ctx;
>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>> index 570c6a3ef..8649a590b 100644
>>> --- a/northd/ovn-northd.c
>>> +++ b/northd/ovn-northd.c
>>> @@ -3529,19 +3529,17 @@ build_ovn_lbs(struct northd_context *ctx,
> struct hmap *datapaths,
>>>               continue;
>>>           }
>>>
>>> -        const struct sbrec_load_balancer **sbrec_lbs =
>>> -            xmalloc(od->nbs->n_load_balancer * sizeof *sbrec_lbs);
>>> +        struct uuid *lb_uuids =
>>> +            xmalloc(od->nbs->n_load_balancer * sizeof *lb_uuids);
>>>           for (size_t i = 0; i < od->nbs->n_load_balancer; i++) {
>>>               const struct uuid *lb_uuid =
>>>                   &od->nbs->load_balancer[i]->header_.uuid;
>>>               lb = ovn_northd_lb_find(lbs, lb_uuid);
>>> -            sbrec_lbs[i] = lb->slb;
>>> +            lb_uuids[i] = lb->slb->header_.uuid;
>>>           }
>>> -
>>> -        sbrec_datapath_binding_set_load_balancers(
>>> -            od->sb, (struct sbrec_load_balancer **)sbrec_lbs,
>>> -            od->nbs->n_load_balancer);
>>> -        free(sbrec_lbs);
>>> +        sbrec_datapath_binding_set_load_balancers(od->sb, lb_uuids,
>>> +
>  od->nbs->n_load_balancer);
>>> +        free(lb_uuids);
> 
> I believe we need to change the deletion part as well. Now that we don't
> have the weak reference, we need to take care of the LB deletions by
> removing the reference explicitly from table datapath_binding.

This is handling both addition and deletion, it's just storing the list
of load balancers that are currently associated to the logical datapath.

If a load balancer was deleted it won't be in the list.

We already have tests validating this works fine:

https://github.com/ovn-org/ovn/blob/ecc941c70f1675b66f6ecb2cdf09df362f62c97a/tests/ovn-northd.at#L2432

> 
> Thanks,
> Han
> 
> 

Thanks,
Dumitru

>>>       }
>>>   }
>>>
>>> diff --git a/ovn-sb.ovsschema b/ovn-sb.ovsschema
>>> index bbf60781d..57fbc3ca4 100644
>>> --- a/ovn-sb.ovsschema
>>> +++ b/ovn-sb.ovsschema
>>> @@ -1,7 +1,7 @@
>>>   {
>>>       "name": "OVN_Southbound",
>>>       "version": "20.18.0",
>>> -    "cksum": "1816525029 26536",
>>> +    "cksum": "779623696 26386",
>>>       "tables": {
>>>           "SB_Global": {
>>>               "columns": {
>>> @@ -166,9 +166,7 @@
>>>                        "type": {"key": {"type": "integer",
>>>                                         "minInteger": 1,
>>>                                         "maxInteger": 16777215}}},
>>> -                "load_balancers": {"type": {"key": {"type": "uuid",
>>> -                                                   "refTable":
> "Load_Balancer",
>>> -                                                   "refType": "weak"},
>>> +                "load_balancers": {"type": {"key": {"type": "uuid"},
>>>                                               "min": 0,
>>>                                               "max": "unlimited"}},
>>>                   "external_ids": {
>>>
>>
>> _______________________________________________
>> dev mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to