On Tue, Jul 12, 2022 at 11:55 AM Mark Michelson <[email protected]> wrote:
>
> Hi Ihar, I have a few comments below.
>
> On 7/9/22 01:33, Ihar Hrachyshka wrote:
> > When multiple chassis are fighting for the same port (requested-chassis
> > is not set, e.g. for gateway ports), they may produce an unreasonable
> > number of chassis field updates in a very short time frame (hundreds of
> > updates in several seconds). This puts unnecessary load on OVN as well
> > as any db notification consumers trying to keep up with the barrage.
> >
> > This patch throttles port claim attempts so that they don't happen more
> > frequently than once per 0.5 seconds.
> >
> > Reported: https://bugzilla.redhat.com/show_bug.cgi?id=1974898
> > Signed-off-by: Ihar Hrachyshka <[email protected]>
> > ---
> >   controller/binding.c        | 119 +++++++++++++++++++++++++++++++++---
> >   controller/binding.h        |   8 +++
> >   controller/ovn-controller.c |  48 +++++++++++++++
> >   tests/ovn.at                |  41 +++++++++++++
> >   4 files changed, 208 insertions(+), 8 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index 1eeca44a9..133ef1ad0 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -48,6 +48,54 @@ VLOG_DEFINE_THIS_MODULE(binding);
> >
> >   #define OVN_QOS_TYPE "linux-htb"
> >
> > +#define CLAIM_TIME_THRESHOLD_MS 500
> > +
> > +struct claimed_port {
> > +    long long int last_claimed;
> > +};
> > +
> > +static struct shash _claimed_ports = SHASH_INITIALIZER(&_claimed_ports);
>
> One thing I noticed is that claimed ports are never removed from this
> shash until ovn-controller shuts down. If ovn-controller runs for a long
> time and has many ports come and go, this shash can grow very large,
> presumably with long-irrelevant ports in it.
>
> It may be a good idea to add a periodic traversal of _claimed_ports and
> remove any claimed ports whose last_claimed is more than
> CLAIM_TIME_THRESHOLD_MS (or some multiple of CLAIM_TIME_THRESHOLD_MS to
> avoid malloc/free churn). Since you already have a failsafe in
> get_claim_timestamp() that returns 0 if the port is not in
> _claimed_ports, this would still allow the postponement mechanism to
> work as intended.
>
> I don't think you need to schedule a poll wakeup just for this. I think
> this traversal can be performed during binding_run() and
> binding_handle_port_binding_changes() and the claimed ports will
> naturally get cleaned up as ovn-controller continues to run.
>

Good idea, thanks. Adding in the next revision in a sec.

> > +static struct sset _postponed_ports = SSET_INITIALIZER(&_postponed_ports);
> > +
> > +struct sset *
> > +get_postponed_ports(void)
> > +{
> > +    return &_postponed_ports;
> > +}
> > +
> > +static long long int
> > +get_claim_timestamp(const char *port_name)
> > +{
> > +    struct claimed_port *cp = shash_find_data(&_claimed_ports, port_name);
> > +    if (!cp) {
> > +        return 0;
> > +    }
> > +    return cp->last_claimed;
> > +}
> > +
> > +static void
> > +register_claim_timestamp(const char *port_name, long long int t)
> > +{
> > +    struct claimed_port *cp = shash_find_data(&_claimed_ports, port_name);
> > +    if (!cp) {
> > +        cp = xzalloc(sizeof *cp);
> > +        shash_add(&_claimed_ports, port_name, cp);
> > +    }
> > +    cp->last_claimed = t;
> > +}
> > +
> > +void
> > +schedule_postponed_ports(void)
> > +{
> > +    const char *port_name;
> > +    SSET_FOR_EACH_SAFE (port_name, &_postponed_ports) {
> > +        long long int t = get_claim_timestamp(port_name);
> > +        if (t) {
> > +            poll_timer_wait_until(t + CLAIM_TIME_THRESHOLD_MS);
> > +        }
> > +    }
> > +}
> > +
> >   struct qos_queue {
> >       struct hmap_node node;
> >       uint32_t queue_id;
> > @@ -1045,6 +1093,23 @@ remove_additional_chassis(const struct 
> > sbrec_port_binding *pb,
> >       remove_additional_encap_for_chassis(pb, chassis_rec);
> >   }
> >
> > +static bool
> > +lport_maybe_postpone(const char *port_name, long long int now,
> > +                     struct sset *postponed_ports)
> > +{
> > +    long long int last_claimed = get_claim_timestamp(port_name);
> > +    if (now - last_claimed >= CLAIM_TIME_THRESHOLD_MS) {
> > +        return false;
> > +    }
> > +
> > +    sset_add(postponed_ports, port_name);
> > +    VLOG_DBG("Postponed claim on logical port %s.", port_name);
> > +
> > +    /* Schedule the next attempt. */
> > +    poll_timer_wait_until(last_claimed + CLAIM_TIME_THRESHOLD_MS);
> > +    return true;
> > +}
> > +
> >   /* Returns false if lport is not claimed due to 'sb_readonly'.
> >    * Returns true otherwise.
> >    */
> > @@ -1055,14 +1120,16 @@ claim_lport(const struct sbrec_port_binding *pb,
> >               const struct ovsrec_interface *iface_rec,
> >               bool sb_readonly, bool notify_up,
> >               struct hmap *tracked_datapaths,
> > -            struct if_status_mgr *if_mgr)
> > +            struct if_status_mgr *if_mgr,
> > +            struct sset *postponed_ports)
> >   {
> >       if (!sb_readonly) {
> >           claimed_lport_set_up(pb, parent_pb, chassis_rec, notify_up, 
> > if_mgr);
> >       }
> >
> >       enum can_bind can_bind = lport_can_bind_on_this_chassis(chassis_rec, 
> > pb);
> > -    bool update_tracked = false;
> > +    bool claimed = false;
> > +    long long int now = time_msec();
> >
> >       if (can_bind == CAN_BIND_AS_MAIN) {
> >           if (pb->chassis != chassis_rec) {
> > @@ -1070,6 +1137,10 @@ claim_lport(const struct sbrec_port_binding *pb,
> >                   return false;
> >               }
> >
> > +            if (lport_maybe_postpone(pb->logical_port, now, 
> > postponed_ports)) {
> > +                return true;
> > +            }
> > +
> >               if (pb->chassis) {
> >                   VLOG_INFO("Changing chassis for lport %s from %s to %s.",
> >                           pb->logical_port, pb->chassis->name,
> > @@ -1086,7 +1157,7 @@ claim_lport(const struct sbrec_port_binding *pb,
> >               if (is_additional_chassis(pb, chassis_rec)) {
> >                   remove_additional_chassis(pb, chassis_rec);
> >               }
> > -            update_tracked = true;
> > +            claimed = true;
> >           }
> >       } else if (can_bind == CAN_BIND_AS_ADDITIONAL) {
> >           if (!is_additional_chassis(pb, chassis_rec)) {
> > @@ -1094,6 +1165,10 @@ claim_lport(const struct sbrec_port_binding *pb,
> >                   return false;
> >               }
> >
> > +            if (lport_maybe_postpone(pb->logical_port, now, 
> > postponed_ports)) {
> > +                return true;
> > +            }
> > +
> >               VLOG_INFO("Claiming lport %s for this additional chassis.",
> >                         pb->logical_port);
> >               for (size_t i = 0; i < pb->n_mac; i++) {
> > @@ -1104,12 +1179,16 @@ claim_lport(const struct sbrec_port_binding *pb,
> >               if (pb->chassis == chassis_rec) {
> >                   sbrec_port_binding_set_chassis(pb, NULL);
> >               }
> > -            update_tracked = true;
> > +            claimed = true;
> >           }
> >       }
> >
> > -    if (update_tracked && tracked_datapaths) {
> > -        update_lport_tracking(pb, tracked_datapaths, true);
> > +    if (claimed) {
> > +        if (tracked_datapaths) {
> > +            update_lport_tracking(pb, tracked_datapaths, true);
> > +        }
> > +        register_claim_timestamp(pb->logical_port, now);
> > +        sset_find_and_delete(postponed_ports, pb->logical_port);
> >       }
> >
> >       /* Check if the port encap binding, if any, has changed */
> > @@ -1271,7 +1350,8 @@ consider_vif_lport_(const struct sbrec_port_binding 
> > *pb,
> >                                b_lport->lbinding->iface,
> >                                !b_ctx_in->ovnsb_idl_txn,
> >                                !parent_pb, b_ctx_out->tracked_dp_bindings,
> > -                             b_ctx_out->if_mgr)){
> > +                             b_ctx_out->if_mgr,
> > +                             b_ctx_out->postponed_ports)) {
> >                   return false;
> >               }
> >
> > @@ -1567,7 +1647,8 @@ consider_nonvif_lport_(const struct 
> > sbrec_port_binding *pb,
> >           return claim_lport(pb, NULL, b_ctx_in->chassis_rec, NULL,
> >                              !b_ctx_in->ovnsb_idl_txn, false,
> >                              b_ctx_out->tracked_dp_bindings,
> > -                           b_ctx_out->if_mgr);
> > +                           b_ctx_out->if_mgr,
> > +                           b_ctx_out->postponed_ports);
> >       }
> >
> >       if (pb->chassis == b_ctx_in->chassis_rec ||
> > @@ -2772,6 +2853,21 @@ delete_done:
> >           }
> >       }
> >
> > +    /* Also handle any postponed (throttled) ports. */
> > +    const char *port_name;
> > +    SSET_FOR_EACH_SAFE (port_name, b_ctx_out->postponed_ports) {
> > +        pb = lport_lookup_by_name(b_ctx_in->sbrec_port_binding_by_name,
> > +                                  port_name);
> > +        if (!pb) {
> > +            sset_find_and_delete(b_ctx_out->postponed_ports, port_name);
> > +            continue;
> > +        }
> > +        handled = handle_updated_port(b_ctx_in, b_ctx_out, pb, 
> > qos_map_ptr);
> > +        if (!handled) {
> > +            break;
> > +        }
> > +    }
> > +
> >       if (handled && qos_map_ptr && set_noop_qos(b_ctx_in->ovs_idl_txn,
> >                                                  b_ctx_in->port_table,
> >                                                  b_ctx_in->qos_table,
> > @@ -3214,3 +3310,10 @@ ovs_iface_matches_lport_iface_id_ver(const struct 
> > ovsrec_interface *iface,
> >
> >       return true;
> >   }
> > +
> > +void
> > +binding_destroy(void)
> > +{
> > +    shash_destroy_free_data(&_claimed_ports);
> > +    sset_clear(&_postponed_ports);
> > +}
> > diff --git a/controller/binding.h b/controller/binding.h
> > index 1fed06674..161057592 100644
> > --- a/controller/binding.h
> > +++ b/controller/binding.h
> > @@ -103,6 +103,8 @@ struct binding_ctx_out {
> >       struct hmap *tracked_dp_bindings;
> >
> >       struct if_status_mgr *if_mgr;
> > +
> > +    struct sset *postponed_ports;
> >   };
> >
> >   /* Local bindings. binding.c module binds the logical port (represented by
> > @@ -219,4 +221,10 @@ struct binding_lport {
> >       size_t n_port_security;
> >   };
> >
> > +struct sset *get_postponed_ports(void);
> > +void schedule_postponed_ports(void);
> > +
> > +/* Clean up module state. */
> > +void binding_destroy(void);
> > +
> >   #endif /* controller/binding.h */
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index 69615308e..851f53ce0 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -1175,6 +1175,42 @@ en_activated_ports_run(struct engine_node *node, 
> > void *data_)
> >       engine_set_node_state(node, state);
> >   }
> >
> > +struct ed_type_postponed_ports {
> > +    struct sset *postponed_ports;
> > +};
> > +
> > +static void *
> > +en_postponed_ports_init(struct engine_node *node OVS_UNUSED,
> > +                        struct engine_arg *arg OVS_UNUSED)
> > +{
> > +    struct ed_type_postponed_ports *data = xzalloc(sizeof *data);
> > +    data->postponed_ports = get_postponed_ports();
> > +    return data;
> > +}
> > +
> > +static void
> > +en_postponed_ports_cleanup(void *data_)
> > +{
> > +    struct ed_type_postponed_ports *data = data_;
> > +    if (!data->postponed_ports) {
> > +        return;
> > +    }
> > +    data->postponed_ports = NULL;
> > +}
> > +
> > +static void
> > +en_postponed_ports_run(struct engine_node *node, void *data_)
> > +{
> > +    struct ed_type_postponed_ports *data = data_;
> > +    enum engine_node_state state = EN_UNCHANGED;
> > +    data->postponed_ports = get_postponed_ports();
> > +    if (!sset_is_empty(data->postponed_ports)) {
> > +        schedule_postponed_ports();
> > +        state = EN_UPDATED;
> > +    }
> > +    engine_set_node_state(node, state);
> > +}
> > +
> >   struct ed_type_runtime_data {
> >       /* Contains "struct local_datapath" nodes. */
> >       struct hmap local_datapaths;
> > @@ -1205,6 +1241,8 @@ struct ed_type_runtime_data {
> >
> >       struct shash local_active_ports_ipv6_pd;
> >       struct shash local_active_ports_ras;
> > +
> > +    struct sset *postponed_ports;
> >   };
> >
> >   /* struct ed_type_runtime_data has the below members for tracking the
> > @@ -1405,6 +1443,7 @@ init_binding_ctx(struct engine_node *node,
> >       b_ctx_out->egress_ifaces = &rt_data->egress_ifaces;
> >       b_ctx_out->lbinding_data = &rt_data->lbinding_data;
> >       b_ctx_out->local_iface_ids = &rt_data->local_iface_ids;
> > +    b_ctx_out->postponed_ports = rt_data->postponed_ports;
> >       b_ctx_out->tracked_dp_bindings = NULL;
> >       b_ctx_out->if_mgr = ctrl_ctx->if_mgr;
> >   }
> > @@ -1442,6 +1481,10 @@ en_runtime_data_run(struct engine_node *node, void 
> > *data)
> >           local_binding_data_init(&rt_data->lbinding_data);
> >       }
> >
> > +    struct ed_type_postponed_ports *pp_data =
> > +        engine_get_input_data("postponed_ports", node);
> > +    rt_data->postponed_ports = pp_data->postponed_ports;
> > +
> >       struct binding_ctx_in b_ctx_in;
> >       struct binding_ctx_out b_ctx_out;
> >       init_binding_ctx(node, rt_data, &b_ctx_in, &b_ctx_out);
> > @@ -3536,6 +3579,7 @@ main(int argc, char *argv[])
> >       ENGINE_NODE(mff_ovn_geneve, "mff_ovn_geneve");
> >       ENGINE_NODE(ofctrl_is_connected, "ofctrl_is_connected");
> >       ENGINE_NODE_WITH_CLEAR_TRACK_DATA(activated_ports, "activated_ports");
> > +    ENGINE_NODE(postponed_ports, "postponed_ports");
> >       ENGINE_NODE(pflow_output, "physical_flow_output");
> >       ENGINE_NODE_WITH_CLEAR_TRACK_DATA(lflow_output, 
> > "logical_flow_output");
> >       ENGINE_NODE(flow_output, "flow_output");
> > @@ -3675,6 +3719,9 @@ main(int argc, char *argv[])
> >                        runtime_data_sb_datapath_binding_handler);
> >       engine_add_input(&en_runtime_data, &en_sb_port_binding,
> >                        runtime_data_sb_port_binding_handler);
> > +    /* Reuse the same handler for any previously postponed ports. */
> > +    engine_add_input(&en_runtime_data, &en_postponed_ports,
> > +                     runtime_data_sb_port_binding_handler);
> >
> >       /* The OVS interface handler for runtime_data changes MUST be executed
> >        * after the sb_port_binding_handler as port_binding deletes must be
> > @@ -4312,6 +4359,7 @@ loop_done:
> >       lflow_destroy();
> >       ofctrl_destroy();
> >       pinctrl_destroy();
> > +    binding_destroy();
> >       patch_destroy();
> >       if_status_mgr_destroy(if_mgr);
> >       shash_destroy(&vif_plug_deleted_iface_ids);
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index 6e35bb5b8..3411e3a34 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -15392,6 +15392,47 @@ OVN_CLEANUP([hv1],[hv2])
> >   AT_CLEANUP
> >   ])
> >
> > +OVN_FOR_EACH_NORTHD([
> > +AT_SETUP([tug-of-war between two chassis for the same port])
> > +ovn_start
> > +
> > +ovn-nbctl ls-add ls0
> > +ovn-nbctl lsp-add ls0 lsp0
> > +
> > +net_add n1
> > +for i in 1 2; do
> > +    sim_add hv$i
> > +    as hv$i
> > +    ovs-vsctl add-br br-phys
> > +    ovn_attach n1 br-phys 192.168.0.$i
> > +done
> > +
> > +for i in 1 2; do
> > +    as hv$i
> > +    ovs-vsctl -- add-port br-int vif \
> > +              -- set Interface vif external-ids:iface-id=lsp0
> > +done
> > +
> > +# give controllers some time to fight for the port binding
> > +sleep 3
> > +
> > +# calculate the number of port claims registered by each fighting chassis
> > +hv1_claims=$(as hv1 grep -c 'Claiming\|Changing chassis' 
> > hv1/ovn-controller.log)
> > +hv2_claims=$(as hv2 grep -c 'Claiming\|Changing chassis' 
> > hv2/ovn-controller.log)
> > +
> > +echo $hv1_claims
> > +echo $hv2_claims
>
> Nit: For debuggability purposes, these echos should probably have some
> more context, like:
>
> echo "hv1 claimed ${hv1_claims} times"
>
> This makes it a bit easier to read the testsuite logs instead of just
> seeing an arbitrary "7" or whatever.
>

Huh. This was actually a debug message that I haven't meant to push
but now that it's there, I'll keep it by adopting your suggestion. :)

> > +
> > +# check that neither registered an outrageous number of port claims
> > +max_claims=10
> > +AT_CHECK([test "${hv1_claims}" -le "${max_claims}"], [0], [])
> > +AT_CHECK([test "${hv2_claims}" -le "${max_claims}"], [0], [])
> > +
> > +OVN_CLEANUP([hv1],[hv2])
> > +
> > +AT_CLEANUP
> > +])
> > +
> >   OVN_FOR_EACH_NORTHD([
> >   AT_SETUP([options:requested-chassis with hostname])
> >
> >
>

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to