On Thu, Aug 11, 2022 at 2:45 PM Dumitru Ceara <[email protected]> wrote:

> On 8/10/22 18:37, Han Zhou wrote:
> > On Tue, Aug 9, 2022 at 11:22 PM Ales Musil <[email protected]> wrote:
> >>
> >>
> >>
> >> On Wed, Aug 10, 2022 at 8:04 AM Han Zhou <[email protected]> wrote:
> >>
> >> Hi Han,
> >>
> >> thank you for the review, see my reply inline below.
> >>
> >>> On Tue, Aug 9, 2022 at 4:05 AM Ales Musil <[email protected]> wrote:
> >>>>
> >>>> Add MAC binding aging mechanism, that utilizes
> >>>> the timestamp column of MAC_Binding table.
> >>>> When the MAC binding exceeds the threshold it is
> >>>> removed from SB DB, this is postponed only in case
> >>>> we receive update ARP with update to MAC address.
> >>>>
> >>>> The threshold is configurable via option
> >>>> "mac_binding_age_threshold" that can be specified
> >>>> for each logical router. The option is defaulting to
> >>>> 0 which means that by default the aging is disabled
> >>>> and the MAC binding rows will be persisted the same
> >>>> way as before.
> >>>>
> >>>> Reported-at: https://bugzilla.redhat.com/2084668
> >>>> Signed-off-by: Ales Musil <[email protected]>
> >>>> ---
> >>>> v3: Rebase on top of current main.
> >>>>     Update according to the final conclusion.
> >>>> v4: Rebase on top of current main.
> >>>>     Skip one row instead of all if time elapsed is negative.
> >>>>     Add note to NEWS.
> >>>> ---
> >>>>  NEWS                       |   3 +
> >>>>  northd/automake.mk         |   2 +
> >>>>  northd/inc-proc-northd.c   |  13 ++++
> >>>>  northd/mac-binding-aging.c | 151
> +++++++++++++++++++++++++++++++++++++
> >>>>  northd/mac-binding-aging.h |  33 ++++++++
> >>>>  ovn-nb.xml                 |   7 ++
> >>>>  6 files changed, 209 insertions(+)
> >>>>  create mode 100644 northd/mac-binding-aging.c
> >>>>  create mode 100644 northd/mac-binding-aging.h
> >>>>
> >>>> diff --git a/NEWS b/NEWS
> >>>> index 20cea579e..eff8e472a 100644
> >>>> --- a/NEWS
> >>>> +++ b/NEWS
> >>>> @@ -14,6 +14,9 @@ Post v22.06.0
> >>>>      NAT-T UDP encapsulation. Requires OVS support for IPsec custom
> > tunnel
> >>>>      options (which will be available in OVS 2.18).
> >>>>    - Removed possibility of disabling logical datapath groups.
> >>>> +  - Added MAC binding aging mechanism, that is disabled by default.
> >>>> +    It can be enabled per logical router with option
> >>>> +    "mac_binding_age_threshold".
> >>>>
> >>>>  OVN v22.06.0 - 03 Jun 2022
> >>>>  --------------------------
> >>>> diff --git a/northd/automake.mk b/northd/automake.mk
> >>>> index 4862ec7b7..81582867d 100644
> >>>> --- a/northd/automake.mk
> >>>> +++ b/northd/automake.mk
> >>>> @@ -1,6 +1,8 @@
> >>>>  # ovn-northd
> >>>>  bin_PROGRAMS += northd/ovn-northd
> >>>>  northd_ovn_northd_SOURCES = \
> >>>> +       northd/mac-binding-aging.c \
> >>>> +       northd/mac-binding-aging.h \
> >>>>         northd/northd.c \
> >>>>         northd/northd.h \
> >>>>         northd/ovn-northd.c \
> >>>> diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c
> >>>> index 43093cb5a..4a3699106 100644
> >>>> --- a/northd/inc-proc-northd.c
> >>>> +++ b/northd/inc-proc-northd.c
> >>>> @@ -22,9 +22,11 @@
> >>>>  #include "ip-mcast-index.h"
> >>>>  #include "static-mac-binding-index.h"
> >>>>  #include "lib/inc-proc-eng.h"
> >>>> +#include "lib/mac-binding-index.h"
> >>>>  #include "lib/ovn-nb-idl.h"
> >>>>  #include "lib/ovn-sb-idl.h"
> >>>>  #include "mcast-group-index.h"
> >>>> +#include "northd/mac-binding-aging.h"
> >>>>  #include "openvswitch/poll-loop.h"
> >>>>  #include "openvswitch/vlog.h"
> >>>>  #include "inc-proc-northd.h"
> >>>> @@ -149,6 +151,8 @@ enum sb_engine_node {
> >>>>   * avoid sparse errors. */
> >>>>  static ENGINE_NODE(northd, "northd");
> >>>>  static ENGINE_NODE(lflow, "lflow");
> >>>> +static ENGINE_NODE(mac_binding_aging, "mac_binding_aging");
> >>>> +static ENGINE_NODE(mac_binding_aging_waker,
> > "mac_binding_aging_waker");
> >>>>
> >>>>  void inc_proc_northd_init(struct ovsdb_idl_loop *nb,
> >>>>                            struct ovsdb_idl_loop *sb)
> >>>> @@ -211,12 +215,16 @@ void inc_proc_northd_init(struct ovsdb_idl_loop
> > *nb,
> >>>>      engine_add_input(&en_northd, &en_sb_load_balancer, NULL);
> >>>>      engine_add_input(&en_northd, &en_sb_fdb, NULL);
> >>>>      engine_add_input(&en_northd, &en_sb_static_mac_binding, NULL);
> >>>> +    engine_add_input(&en_mac_binding_aging, &en_sb_mac_binding,
> NULL);
> >>>> +    engine_add_input(&en_mac_binding_aging, &en_northd, NULL);
> >>>> +    engine_add_input(&en_mac_binding_aging,
> > &en_mac_binding_aging_waker, NULL);
> >>>>      engine_add_input(&en_lflow, &en_nb_bfd, NULL);
> >>>>      engine_add_input(&en_lflow, &en_sb_bfd, NULL);
> >>>>      engine_add_input(&en_lflow, &en_sb_logical_flow, NULL);
> >>>>      engine_add_input(&en_lflow, &en_sb_multicast_group, NULL);
> >>>>      engine_add_input(&en_lflow, &en_sb_igmp_group, NULL);
> >>>>      engine_add_input(&en_lflow, &en_northd, NULL);
> >>>> +    engine_add_input(&en_lflow, &en_mac_binding_aging, NULL);
> >>>>
> >>>>      struct engine_arg engine_arg = {
> >>>>          .nb_idl = nb->idl,
> >>>> @@ -235,6 +243,8 @@ void inc_proc_northd_init(struct ovsdb_idl_loop
> > *nb,
> >>>>          chassis_hostname_index_create(sb->idl);
> >>>>      struct ovsdb_idl_index *sbrec_static_mac_binding_by_lport_ip
> >>>>          = static_mac_binding_index_create(sb->idl);
> >>>> +    struct ovsdb_idl_index *sbrec_mac_binding_by_datapath
> >>>> +        = mac_binding_by_datapath_index_create(sb->idl);
> >>>>
> >>>>      engine_init(&en_lflow, &engine_arg);
> >>>>
> >>>> @@ -256,6 +266,9 @@ void inc_proc_northd_init(struct ovsdb_idl_loop
> > *nb,
> >>>>      engine_ovsdb_node_add_index(&en_sb_static_mac_binding,
> >>>>
> >  "sbrec_static_mac_binding_by_lport_ip",
> >>>>
> sbrec_static_mac_binding_by_lport_ip);
> >>>> +    engine_ovsdb_node_add_index(&en_sb_mac_binding,
> >>>> +                                "sbrec_mac_binding_by_datapath",
> >>>> +                                sbrec_mac_binding_by_datapath);
> >>>>  }
> >>>>
> >>>>  void inc_proc_northd_run(struct ovsdb_idl_txn *ovnnb_txn,
> >>>> diff --git a/northd/mac-binding-aging.c b/northd/mac-binding-aging.c
> >>>> new file mode 100644
> >>>> index 000000000..e8217e8bc
> >>>> --- /dev/null
> >>>> +++ b/northd/mac-binding-aging.c
> >>>> @@ -0,0 +1,151 @@
> >>>> +/* Copyright (c) 2022, Red Hat, Inc.
> >>>> + *
> >>>> + * Licensed under the Apache License, Version 2.0 (the "License");
> >>>> + * you may not use this file except in compliance with the License.
> >>>> + * You may obtain a copy of the License at:
> >>>> + *
> >>>> + *     http://www.apache.org/licenses/LICENSE-2.0
> >>>> + *
> >>>> + * Unless required by applicable law or agreed to in writing,
> software
> >>>> + * distributed under the License is distributed on an "AS IS" BASIS,
> >>>> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
> > implied.
> >>>> + * See the License for the specific language governing permissions
> and
> >>>> + * limitations under the License.
> >>>> + */
> >>>> +
> >>>> +#include <config.h>
> >>>> +
> >>>> +#include "lib/inc-proc-eng.h"
> >>>> +#include "lib/ovn-nb-idl.h"
> >>>> +#include "lib/ovn-sb-idl.h"
> >>>> +#include "lib/timeval.h"
> >>>> +#include "northd/mac-binding-aging.h"
> >>>> +#include "northd/northd.h"
> >>>> +#include "openvswitch/hmap.h"
> >>>> +#include "openvswitch/poll-loop.h"
> >>>> +#include "openvswitch/util.h"
> >>>> +#include "openvswitch/vlog.h"
> >>>> +
> >>>> +VLOG_DEFINE_THIS_MODULE(mac_binding_aging);
> >>>> +
> >>>> +struct mac_binding_waker {
> >>>> +    bool should_schedule;
> >>>> +    long long next_wake_msec;
> >>>> +};
> >>>> +
> >>>> +static struct mac_binding_waker waker;
> >>>> +
> >>>> +static void
> >>>> +mac_binding_aging_run_for_datapath(const struct
> > sbrec_datapath_binding *dp,
> >>>> +                                   const struct nbrec_logical_router
> > *nbr,
> >>>> +                                   struct ovsdb_idl_index
> > *mb_by_datapath,
> >>>> +                                   int64_t now, int64_t *wake_delay)
> >>>> +{
> >>>> +    uint64_t threshold = smap_get_uint(&nbr->options,
> >>>> +                                         "mac_binding_age_threshold",
> >>>> +                                         0) * 1000;
>
> Nit: indentation.
>
> >>>> +    if (!threshold) {
> >>>> +        return;
> >>>> +    }
> >>>> +
> >>>> +    struct sbrec_mac_binding *mb_index_row =
> >>>> +        sbrec_mac_binding_index_init_row(mb_by_datapath);
> >>>> +    sbrec_mac_binding_index_set_datapath(mb_index_row, dp);
> >>>> +
> >>>> +    const struct sbrec_mac_binding *mb;
> >>>> +    SBREC_MAC_BINDING_FOR_EACH_EQUAL (mb, mb_index_row,
> > mb_by_datapath) {
> >>>> +        int64_t elapsed = now - mb->timestamp;
> >>>> +
> >>>> +        if (elapsed < 0) {
> >>>> +            continue;
> >>>> +        } else if (elapsed >= threshold) {
> >>>> +            sbrec_mac_binding_delete(mb);
> >>>> +        } else {
> >>>> +            *wake_delay = MIN(*wake_delay, threshold - elapsed);
> >>>> +        }
> >>>> +    }
> >>>> +    sbrec_mac_binding_index_destroy_row(mb_index_row);
> >>>> +}
> >>>> +
> >>>> +void
> >>>> +en_mac_binding_aging_run(struct engine_node *node, void *data
> > OVS_UNUSED)
> >>>> +{
> >>>> +    const struct engine_context *eng_ctx = engine_get_context();
> >>>> +
> >>>> +    if (!eng_ctx->ovnsb_idl_txn) {
> >>>> +        return;
> >>>> +    }
> >>>> +
> >>>> +    int64_t next_expire_msec = INT64_MAX;
> >>>> +    int64_t now = time_wall_msec();
> >>>> +    struct northd_data *northd_data = engine_get_input_data("northd",
> > node);
> >>>> +    struct ovsdb_idl_index *sbrec_mac_binding_by_datapath =
> >>>> +
> >  engine_ovsdb_node_get_index(engine_get_input("SB_mac_binding", node),
> >>>> +                                    "sbrec_mac_binding_by_datapath");
> >>>> +
> >>>> +    struct ovn_datapath *od;
> >>>> +    HMAP_FOR_EACH (od, key_node, &northd_data->datapaths) {
> >>>> +        if (od->sb && od->nbr) {
> >>>> +            mac_binding_aging_run_for_datapath(od->sb, od->nbr,
> >>>> +
> > sbrec_mac_binding_by_datapath,
> >>>> +                                               now,
> > &next_expire_msec);
> >>>> +        }
> >>>> +    }
> >>>> +
> >>>> +    if (next_expire_msec < INT64_MAX) {
> >>>> +        waker.should_schedule = true;
> >>>> +        waker.next_wake_msec = time_msec() + next_expire_msec;
> >>>> +        /* Run the engine right after so the waker can reflect on the
> > new wake
> >>>> +         * time. */
> >>>> +        poll_immediate_wake();
> >>>
> >>> Why would we immediately wake up? Wouldn't it cause busy-loop?  Shall
> we
> > use poll_timer_wait_until? Sorry if I missed anything here and please
> > correct me.
> >>
> >>
> >> That's because of the dependency of the nodes, we need to run the waker
> > that will schedule the sleep.
> >> We could probably use poll_timer_wait_until as well and it should have
> > the same effect.
> >>
> > Thanks, I think I did misunderstand a little and now I got the idea of
> the
> > waker node. It is an input node just to trigger the timer, which wakes up
> > the aging processing when needed and at the same time avoids running the
> > aging scanning when the process is woken up for other reasons. So the
> > poll_immediate_wake() here won't cause busy-loop. However, it does wake
> up
> > the process at least one extra time for each run, although the cost may
> be
> > small. I'd still use poll_timer_wait_until() here since we already know
> the
> > next_wake_msec.
> >
> > In addition, it is a little weird from I-P engine's point of view that
> the
> > output node execution updates the input node's data, but for this timer
> use
> > case I think it should work and I haven't come up with anything better.
> It
> > would be helpful to leave some comments for the waker node to explain the
> > motivation and make this part easier to understand.
> >
>
> +1 for both points (comment + poll_timer_wait_until()).
>
> Also, can we at least use engine_get_input_data() to get the input
> node's data?  This will make it more clear that we're using data that's
> related to the waker node.  The 'waker' variable would have to be moved
> inside the en_mac_binding_aging_waker data.
>
> > Thanks,
> > Han
> >
> >>>
> >>>
> >>>> +    } else {
> >>>> +        waker.should_schedule = false;
> >>>> +    }
> >>>> +
> >>>> +    /* This node is part of lflow, but lflow does not depend on it.
> > Setting
> >>>> +     * state as unchanged does not trigger lflow node when it is not
> > needed. */
> >>>
> >>> This should work, but it is like a hack. Ideally we should allow the
> I-P
> > engine to have multiple "root" nodes. This can be a future improvement
> > though. Maybe add a "XXX" comment here (a TODO item).
> >>
> >>
> >> Yes unfortunately this is not the cleanest solution, but I did not come
> > with a better one. I will add a TODO comment here about moving it once we
> > allow multiple root nodes.
> >>
> >>>
> >>>
> >>> Thanks,
> >>> Han
> >>>
> >>
> >> Thanks,
> >> Ales
> >>
> >>>
> >>>
> >>>> +    engine_set_node_state(node, EN_UNCHANGED);
> >>>> +}
> >>>> +
> >>>> +void *
> >>>> +en_mac_binding_aging_init(struct engine_node *node OVS_UNUSED,
> >>>> +                          struct engine_arg *arg OVS_UNUSED)
> >>>> +{
> >>>> +    return NULL;
> >>>> +}
> >>>> +
> >>>> +void
> >>>> +en_mac_binding_aging_cleanup(void *data OVS_UNUSED)
> >>>> +{
> >>>> +}
> >>>> +
> >>>> +void
> >>>> +en_mac_binding_aging_waker_run(struct engine_node *node, void *data
> > OVS_UNUSED)
> >>>> +{
> >>>> +    if (!waker.should_schedule) {
> >>>> +        return;
> >>>> +    }
> >>>> +
> >>>> +    if (time_msec() >= waker.next_wake_msec) {
> >>>> +        waker.should_schedule = false;
> >>>> +        engine_set_node_state(node, EN_UPDATED);
> >>>> +        return;
> >>>> +    }
> >>>> +
> >>>> +    engine_set_node_state(node, EN_UNCHANGED);
> >>>> +    poll_timer_wait_until(waker.next_wake_msec);
> >>>> +}
> >>>> +
> >>>> +void *
> >>>> +en_mac_binding_aging_waker_init(struct engine_node *node OVS_UNUSED,
> >>>> +                                struct engine_arg *arg OVS_UNUSED)
> >>>> +{
> >>>> +    waker.should_schedule = false;
> >>>> +    waker.next_wake_msec = 0;
> >>>> +    return NULL;
> >>>> +}
> >>>> +
> >>>> +void
> >>>> +en_mac_binding_aging_waker_cleanup(void *data OVS_UNUSED)
> >>>> +{
> >>>> +}
> >>>> diff --git a/northd/mac-binding-aging.h b/northd/mac-binding-aging.h
> >>>> new file mode 100644
> >>>> index 000000000..296a7ab38
> >>>> --- /dev/null
> >>>> +++ b/northd/mac-binding-aging.h
> >>>> @@ -0,0 +1,33 @@
> >>>> +/* Copyright (c) 2022, Red Hat, Inc.
> >>>> + *
> >>>> + * Licensed under the Apache License, Version 2.0 (the "License");
> >>>> + * you may not use this file except in compliance with the License.
> >>>> + * You may obtain a copy of the License at:
> >>>> + *
> >>>> + *     http://www.apache.org/licenses/LICENSE-2.0
> >>>> + *
> >>>> + * Unless required by applicable law or agreed to in writing,
> software
> >>>> + * distributed under the License is distributed on an "AS IS" BASIS,
> >>>> + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
> > implied.
> >>>> + * See the License for the specific language governing permissions
> and
> >>>> + * limitations under the License.
> >>>> + */
> >>>> +
> >>>> +#ifndef MAC_BINDING_AGING_H
> >>>> +#define MAC_BINDING_AGING_H 1
> >>>> +
> >>>> +#include "lib/inc-proc-eng.h"
> >>>> +
> >>>> +/* The MAC binding aging node functions. */
> >>>> +void en_mac_binding_aging_run(struct engine_node *node, void *data);
> >>>> +void *en_mac_binding_aging_init(struct engine_node *node,
> >>>> +                                struct engine_arg *arg);
> >>>> +void en_mac_binding_aging_cleanup(void *data);
> >>>> +
> >>>> +/* The MAC binding aging waker node functions. */
> >>>> +void en_mac_binding_aging_waker_run(struct engine_node *node, void
> > *data);
> >>>> +void *en_mac_binding_aging_waker_init(struct engine_node *node,
> >>>> +                                      struct engine_arg *arg);
> >>>> +void en_mac_binding_aging_waker_cleanup(void *data);
> >>>> +
> >>>> +#endif /* northd/mac-binding-aging.h */
> >>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
> >>>> index e26afd83c..d3fba9bdc 100644
> >>>> --- a/ovn-nb.xml
> >>>> +++ b/ovn-nb.xml
> >>>> @@ -2392,6 +2392,13 @@
> >>>>          and other sources. This way, OVN and the other sources can
> > make use of
> >>>>          the same conntrack zone.
> >>>>        </column>
> >>>> +
> >>>> +      <column name="options" key="mac_binding_age_threshold"
> >>>> +              type='{"type": "integer", "minInteger": 0,
> > "maxInteger": 4294967295}'>
> >>>> +        MAC binding aging <code>threshold</code> value in secs. MAC
> > binding
>
> Nit: I'd use 'seconds'.
>
> Thanks,
> Dumitru
>
> >>>> +        exceeding this timeout will be automatically removed. The
> > value
> >>>> +        defaults to 0, which means disabled.
> >>>> +      </column>
> >>>>      </group>
> >>>>
> >>>>      <group title="Common Columns">
> >>>> --
> >>>> 2.37.1
> >>>>
> >>
> >>
> >>
> >> --
> >>
> >> Ales Musil
> >>
> >> Senior Software Engineer - OVN Core
> >>
> >> Red Hat EMEA
> >>
> >> [email protected]    IM: amusil
> >
>
>
Thank you for the reviews.

v5 should address most of the comments except the open one from Numan about
upgrade.

Thanks,
Ales

-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA <https://www.redhat.com>

[email protected]    IM: amusil
<https://red.ht/sig>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to