On Mon, Oct 21, 2024 at 10:48 PM Mark Michelson <[email protected]> wrote:

> When ACL tiers were introduced, the code kept track of the highest ACL
> tier so that when iterating through ACL tiers, we would not attempt to
> advance the current tier past the highest configured tier.
>
> Unfortunately, keeping track of a single max ACL tier doesn't work when
> ACLs are evaluated in three separate places. ACLs can be evaluated on
> ingress before load balancing, on ingress after load balancing, and on
> egress. By only keeping track of a single max ACL tier, it means that we
> could perform superfluous checks if one stage of ACLs has a higher max
> tier than other stages. As an example, if ingress pre-load balancing
> ACLs have a maximum tier of 1, and egress ACLs have a maximum tier of 2,
> then it means that for all stages of ACLs, we will evaluate tiers 0, 1,
> and 2 of ACLs, even though only one stage of ACLs uses tier 2.
>
> From a pure functionality standpoint, this doesn't cause any issues.
> Even if we advance the tier past the highest configured value, it
> results in a no-op and the same net result happens.
>
> However, the addition of sampling into ACLs has caused an unwanted side
> effect. In the example scenario above, let's say the tier 1 ACL in the
> ingress pre-load balancing stage evaluates to "pass". After the
> evaluation, we send a sample for the "pass" result. We then advance
> the tier to 2, then move back to ACL evaluation. There are no tier 2
> ACLs, so we move on to the sampling stage again. We then send a second
> sample for the previous "pass" result from tier 1. The result is
> confusing since we've sent two samples for the same ACL evaluation.
>
> To remedy this, we now track the max ACL tier in each of the stages
> where ACLs are evaluated. Now there are no superfluous ACL evaluations
> and no superfluous samples sent either.
>
> Reported-at: https://issues.redhat.com/browse/FDP-760
> Signed-off-by: Mark Michelson <[email protected]>
> ---
> v1 -> v2:
>  * Add a comment to the memcmp() call to caution in case the
>    acl_tier struct is altered and has padding added.
>  * Switch to using MAX macro when assigning max ACL tiers.
>  * Add a system test to ensure the referenced issue is fixed. For the
>    record, this test fails on the main branch, but passes with this
>    patch.
> ---
>

Hi Mark,

thank you for the v2. The code seems to be fine, but the
new portion of the system test seems to be flaky. Could
you please take a look?


 northd/en-ls-stateful.c | 54 +++++++++++++++++++++++++++++++++++------
>  northd/en-ls-stateful.h |  8 +++++-
>  northd/northd.c         | 47 ++++++++++++++++++++++++++---------
>  tests/ovn-northd.at     | 34 ++++++++++++++++++++++++++
>  tests/system-ovn.at     | 51 ++++++++++++++++++++++++++++++++++++++
>  5 files changed, 174 insertions(+), 20 deletions(-)
>
> diff --git a/northd/en-ls-stateful.c b/northd/en-ls-stateful.c
> index 44f74ea08..11e97aa9b 100644
> --- a/northd/en-ls-stateful.c
> +++ b/northd/en-ls-stateful.c
> @@ -204,16 +204,22 @@ ls_stateful_port_group_handler(struct engine_node
> *node, void *data_)
>              ovn_datapaths_find_by_index(input_data.ls_datapaths,
>                                          ls_stateful_rec->ls_index);
>          bool had_stateful_acl = ls_stateful_rec->has_stateful_acl;
> -        uint64_t max_acl_tier = ls_stateful_rec->max_acl_tier;
> +        struct acl_tier old_max = ls_stateful_rec->max_acl_tier;
>          bool had_acls = ls_stateful_rec->has_acls;
>          bool modified = false;
>
>          ls_stateful_record_reinit(ls_stateful_rec, od, ls_pg,
>                                    input_data.ls_port_groups);
>
> +        struct acl_tier new_max = ls_stateful_rec->max_acl_tier;
> +
> +        /* Using memcmp for struct acl_tier is fine since there is no
> padding
> +         * in the struct. However, if the structure is changed, the memcmp
> +         * may need to be updated to compare individual struct fields.
> +         */
>          if ((had_stateful_acl != ls_stateful_rec->has_stateful_acl)
>              || (had_acls != ls_stateful_rec->has_acls)
> -            || max_acl_tier != ls_stateful_rec->max_acl_tier) {
> +            || memcmp(&old_max, &new_max, sizeof(old_max))) {
>              modified = true;
>          }
>
> @@ -365,7 +371,8 @@ ls_stateful_record_set_acl_flags(struct
> ls_stateful_record *ls_stateful_rec,
>                                   const struct ls_port_group_table *ls_pgs)
>  {
>      ls_stateful_rec->has_stateful_acl = false;
> -    ls_stateful_rec->max_acl_tier = 0;
> +    memset(&ls_stateful_rec->max_acl_tier, 0,
> +           sizeof ls_stateful_rec->max_acl_tier);
>      ls_stateful_rec->has_acls = false;
>
>      if (ls_stateful_record_set_acl_flags_(ls_stateful_rec, od->nbs->acls,
> @@ -391,6 +398,38 @@ ls_stateful_record_set_acl_flags(struct
> ls_stateful_record *ls_stateful_rec,
>      }
>  }
>
> +static void
> +update_ls_max_acl_tier(struct ls_stateful_record *ls_stateful_rec,
> +                       const struct nbrec_acl *acl)
> +{
> +    if (!acl->tier) {
> +        return;
> +    }
> +
> +    uint64_t *tier;
> +
> +    if (!strcmp(acl->direction, "from-lport")) {
> +        if (smap_get_bool(&acl->options, "apply-after-lb", false)) {
> +            tier = &ls_stateful_rec->max_acl_tier.ingress_post_lb;
> +        } else {
> +            tier = &ls_stateful_rec->max_acl_tier.ingress_pre_lb;
> +        }
> +    } else {
> +        tier = &ls_stateful_rec->max_acl_tier.egress;
> +    }
> +
> +    *tier = MAX(*tier, acl->tier);
> +}
> +
> +static bool
> +ls_acl_tiers_are_maxed_out(struct acl_tier *acl_tier,
> +                           uint64_t max_allowed_acl_tier)
> +{
> +    return acl_tier->ingress_post_lb == max_allowed_acl_tier &&
> +           acl_tier->ingress_pre_lb == max_allowed_acl_tier &&
> +           acl_tier->egress == max_allowed_acl_tier;
> +}
> +
>  static bool
>  ls_stateful_record_set_acl_flags_(struct ls_stateful_record
> *ls_stateful_rec,
>                                    struct nbrec_acl **acls,
> @@ -408,16 +447,15 @@ ls_stateful_record_set_acl_flags_(struct
> ls_stateful_record *ls_stateful_rec,
>      ls_stateful_rec->has_acls = true;
>      for (size_t i = 0; i < n_acls; i++) {
>          const struct nbrec_acl *acl = acls[i];
> -        if (acl->tier > ls_stateful_rec->max_acl_tier) {
> -            ls_stateful_rec->max_acl_tier = acl->tier;
> -        }
> +        update_ls_max_acl_tier(ls_stateful_rec, acl);
>          if (!ls_stateful_rec->has_stateful_acl
>                  && !strcmp(acl->action, "allow-related")) {
>              ls_stateful_rec->has_stateful_acl = true;
>          }
>          if (ls_stateful_rec->has_stateful_acl &&
> -            ls_stateful_rec->max_acl_tier ==
> -                nbrec_acl_col_tier.type.value.integer.max) {
> +            ls_acl_tiers_are_maxed_out(
> +                &ls_stateful_rec->max_acl_tier,
> +                nbrec_acl_col_tier.type.value.integer.max)) {
>              return true;
>          }
>      }
> diff --git a/northd/en-ls-stateful.h b/northd/en-ls-stateful.h
> index eae4b08e2..18a7398a6 100644
> --- a/northd/en-ls-stateful.h
> +++ b/northd/en-ls-stateful.h
> @@ -33,6 +33,12 @@
>
>  struct lflow_ref;
>
> +struct acl_tier {
> +    uint64_t ingress_pre_lb;
> +    uint64_t ingress_post_lb;
> +    uint64_t egress;
> +};
> +
>  struct ls_stateful_record {
>      struct hmap_node key_node;
>
> @@ -46,7 +52,7 @@ struct ls_stateful_record {
>      bool has_stateful_acl;
>      bool has_lb_vip;
>      bool has_acls;
> -    uint64_t max_acl_tier;
> +    struct acl_tier max_acl_tier;
>
>      /* 'lflow_ref' is used to reference logical flows generated for
>       * this ls_stateful record.
> diff --git a/northd/northd.c b/northd/northd.c
> index a267cd5f8..9d3edee32 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -7321,28 +7321,34 @@ build_acl_action_lflows(const struct
> ls_stateful_record *ls_stateful_rec,
>          S_SWITCH_OUT_ACL_EVAL,
>      };
>
> +    uint64_t max_acl_tiers[] = {
> +        ls_stateful_rec->max_acl_tier.ingress_pre_lb,
> +        ls_stateful_rec->max_acl_tier.ingress_post_lb,
> +        ls_stateful_rec->max_acl_tier.egress,
> +    };
> +
>      ds_clear(actions);
>      ds_put_cstr(actions, REGBIT_ACL_VERDICT_ALLOW " = 0; "
>                          REGBIT_ACL_VERDICT_DROP " = 0; "
>                          REGBIT_ACL_VERDICT_REJECT " = 0; ");
> -    if (ls_stateful_rec->max_acl_tier) {
> -        ds_put_cstr(actions, REG_ACL_TIER " = 0; ");
> -    }
>
>      size_t verdict_len = actions->length;
> -
>      for (size_t i = 0; i < ARRAY_SIZE(stages); i++) {
>          enum ovn_stage stage = stages[i];
> +        if (max_acl_tiers[i]) {
> +            ds_put_cstr(actions, REG_ACL_TIER " = 0; ");
> +        }
> +        size_t verdict_tier_len = actions->length;
>          if (!ls_stateful_rec->has_acls) {
>              ovn_lflow_add(lflows, od, stage, 0, "1", "next;", lflow_ref);
>              continue;
>          }
> -        ds_truncate(actions, verdict_len);
> +        ds_truncate(actions, verdict_tier_len);
>          ds_put_cstr(actions, "next;");
>          ovn_lflow_add(lflows, od, stage, 1000,
>                        REGBIT_ACL_VERDICT_ALLOW " == 1", ds_cstr(actions),
>                        lflow_ref);
> -        ds_truncate(actions, verdict_len);
> +        ds_truncate(actions, verdict_tier_len);
>          ds_put_cstr(actions, debug_implicit_drop_action());
>          ovn_lflow_add(lflows, od, stage, 1000,
>                        REGBIT_ACL_VERDICT_DROP " == 1",
> @@ -7350,7 +7356,7 @@ build_acl_action_lflows(const struct
> ls_stateful_record *ls_stateful_rec,
>                        lflow_ref);
>          bool ingress = ovn_stage_get_pipeline(stage) == P_IN;
>
> -        ds_truncate(actions, verdict_len);
> +        ds_truncate(actions, verdict_tier_len);
>          build_acl_reject_action(actions, ingress);
>
>          ovn_lflow_metered(lflows, od, stage, 1000,
> @@ -7358,12 +7364,12 @@ build_acl_action_lflows(const struct
> ls_stateful_record *ls_stateful_rec,
>                            copp_meter_get(COPP_REJECT, od->nbs->copp,
>                            meter_groups), lflow_ref);
>
> -        ds_truncate(actions, verdict_len);
> +        ds_truncate(actions, verdict_tier_len);
>          ds_put_cstr(actions, default_acl_action);
>          ovn_lflow_add(lflows, od, stage, 0, "1", ds_cstr(actions),
> lflow_ref);
>
>          struct ds tier_actions = DS_EMPTY_INITIALIZER;
> -        for (size_t j = 0; j < ls_stateful_rec->max_acl_tier; j++) {
> +        for (size_t j = 0; j < max_acl_tiers[i]; j++) {
>              ds_clear(match);
>              ds_put_format(match, REG_ACL_TIER " == %"PRIuSIZE, j);
>              ds_clear(&tier_actions);
> @@ -7375,6 +7381,7 @@ build_acl_action_lflows(const struct
> ls_stateful_record *ls_stateful_rec,
>                           ds_cstr(&tier_actions), lflow_ref);
>          }
>          ds_destroy(&tier_actions);
> +        ds_truncate(actions, verdict_len);
>      }
>  }
>
> @@ -7443,6 +7450,21 @@ build_acl_log_related_flows(const struct
> ovn_datapath *od,
>                              &acl->header_, lflow_ref);
>  }
>
> +static uint64_t
> +choose_max_acl_tier(const struct ls_stateful_record *ls_stateful_rec,
> +                    const struct nbrec_acl *acl)
> +{
> +    if (!strcmp(acl->direction, "from-lport")) {
> +        if (smap_get_bool(&acl->options, "apply-after-lb", false)) {
> +            return ls_stateful_rec->max_acl_tier.ingress_post_lb;
> +        } else {
> +            return ls_stateful_rec->max_acl_tier.ingress_pre_lb;
> +        }
> +    } else {
> +        return ls_stateful_rec->max_acl_tier.egress;
> +    }
> +}
> +
>  static void
>  build_acls(const struct ls_stateful_record *ls_stateful_rec,
>             const struct ovn_datapath *od,
> @@ -7639,8 +7661,9 @@ build_acls(const struct ls_stateful_record
> *ls_stateful_rec,
>          build_acl_log_related_flows(od, lflows, acl, has_stateful,
>                                      meter_groups, &match, &actions,
>                                      lflow_ref);
> +        uint64_t max_acl_tier = choose_max_acl_tier(ls_stateful_rec, acl);
>          consider_acl(lflows, od, acl, has_stateful,
> -                     meter_groups, ls_stateful_rec->max_acl_tier,
> +                     meter_groups, max_acl_tier,
>                       &match, &actions, lflow_ref);
>          build_acl_sample_flows(ls_stateful_rec, od, lflows, acl,
>                                 &match, &actions, sampling_apps,
> @@ -7658,8 +7681,10 @@ build_acls(const struct ls_stateful_record
> *ls_stateful_rec,
>                  build_acl_log_related_flows(od, lflows, acl, has_stateful,
>                                              meter_groups, &match,
> &actions,
>                                              lflow_ref);
> +                uint64_t max_acl_tier =
> choose_max_acl_tier(ls_stateful_rec,
> +                                                            acl);
>                  consider_acl(lflows, od, acl, has_stateful,
> -                             meter_groups, ls_stateful_rec->max_acl_tier,
> +                             meter_groups, max_acl_tier,
>                               &match, &actions, lflow_ref);
>                  build_acl_sample_flows(ls_stateful_rec, od, lflows, acl,
>                                         &match, &actions, sampling_apps,
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index dcc3dbbc3..dd2ad2330 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -13864,3 +13864,37 @@ check_no_redirect
>
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD_NO_HV([
> +AT_SETUP([ACL mismatched tiers])
> +ovn_start
> +
> +check ovn-nbctl ls-add S1
> +
> +# Ingress pre-lb ACLs have only a tier 1 ACL configured.
> +# Ingress post-lb ACLs have tier up to 3 configured.
> +# Egress ACLs have up to tier 2 configured.
> +check ovn-nbctl --tier=1 acl-add S1 from-lport 1001 "tcp" allow
> +check ovn-nbctl --tier=3 --apply-after-lb acl-add  S1 from-lport 1001
> "tcp" allow
> +check ovn-nbctl --tier=2 acl-add S1 to-lport 1001 "tcp" allow
> +
> +# Ingress pre-lb ACLs should only ever increment the tier to 1.
> +AT_CHECK([ovn-sbctl lflow-list S1 | grep ls_in_acl_action | grep
> priority=500 | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_acl_action   ), priority=500  , match=(reg8[[30..31]] ==
> 0), action=(reg8[[30..31]] = 1; next(pipeline=ingress,table=??);)
> +])
> +
> +# Ingress post-lb ACLs should increment the tier to 3.
> +AT_CHECK([ovn-sbctl lflow-list S1 | grep ls_in_acl_after_lb_action | grep
> priority=500 | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_in_acl_after_lb_action), priority=500  ,
> match=(reg8[[30..31]] == 0), action=(reg8[[30..31]] = 1;
> next(pipeline=ingress,table=??);)
> +  table=??(ls_in_acl_after_lb_action), priority=500  ,
> match=(reg8[[30..31]] == 1), action=(reg8[[30..31]] = 2;
> next(pipeline=ingress,table=??);)
> +  table=??(ls_in_acl_after_lb_action), priority=500  ,
> match=(reg8[[30..31]] == 2), action=(reg8[[30..31]] = 3;
> next(pipeline=ingress,table=??);)
> +])
> +
> +# Egress ACLs should increment the tier to 2.
> +AT_CHECK([ovn-sbctl lflow-list S1 | grep ls_out_acl_action | grep
> priority=500 | ovn_strip_lflows], [0], [dnl
> +  table=??(ls_out_acl_action  ), priority=500  , match=(reg8[[30..31]] ==
> 0), action=(reg8[[30..31]] = 1; next(pipeline=egress,table=??);)
> +  table=??(ls_out_acl_action  ), priority=500  , match=(reg8[[30..31]] ==
> 1), action=(reg8[[30..31]] = 2; next(pipeline=egress,table=??);)
> +])
> +
> +AT_CLEANUP
> +])
> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
> index b585e10da..7572c7223 100644
> --- a/tests/system-ovn.at
> +++ b/tests/system-ovn.at
> @@ -13384,6 +13384,57 @@ AT_CHECK([for f in $(ls -1 nfcapd.*); do nfdump
> -o json -r $f; done | grep obser
>  "observationPointID" : 2002,
>  ])
>
> +dnl This next part of the test may seem odd, but it's to test a specific
> +dnl issue that was reported.
> +dnl
> +dnl For this issue to be triggered, we need the following:
> +dnl 1) The final ingress ACL tier gives a "pass" verdict.
> +dnl 2) An egress ACL exists with a higher tier than the final ingress ACL
> tier.
> +dnl
> +dnl We need to ensure that the ingress ACL only generates a single sample
> +dnl for the pass verdict that was matched.
> +
> +dnl Delete the tier 1 ACL, so that the only ACL that matches is the tier 0
> +dnl "pass" ACL.
> +check ovn-nbctl --tier=1 acl-del ls from-lport 1 "inport == \"vm1\" &&
> udp.dst == 1000"
> +
> +dnl Double check that there is only one ACL now.
> +check_row_count nb:ACL 1
> +
> +dnl Now create an unrelated egress ACL that is at tier 1.
> +check ovn-nbctl --tier=1 acl-add ls to-lport 1 "tcp" allow
> +
> +dnl Restart the IPFIX collector.
> +DAEMONIZE([nfcapd -B 1024000 -w . -p 4242 2> collector.err],
> [collector.pid])
> +
> +dnl Wait for the collector to be up.
> +OVS_WAIT_UNTIL([grep -q 'Startup nfcapd.' collector.err])
> +
> +dnl Send traffic to the UDP server (hits tier 0 on ingress, nothing on
> egress).
> +NS_CHECK_EXEC([vm1], [echo a | nc --send-only -u 42.42.42.3 1000])
> +
> +dnl Only one packet should be sampled. Since we caught 4 samples before,
> the
> +dnl total number of samples is now 5.
> +OVS_WAIT_UNTIL([ovs-ofctl dump-ipfix-flow br-int | grep -q 'sampled
> pkts=5'])
> +
> +dnl Check the IPFIX samples.
> +kill $(cat collector.pid)
> +OVS_WAIT_WHILE([kill -0 $(cat collector.pid) 2>/dev/null])
> +
> +dnl Can't match on observation domain ID due to the followig fix not being
> +dnl available in any released version of nfdump:
> +dnl https://github.com/phaag/nfdump/issues/544
> +dnl
> +dnl Only match on the point ID.
> +dnl The four from before are present, plus the new one that was captured.
> +AT_CHECK([for f in $(ls -1 nfcapd.*); do nfdump -o json -r $f; done |
> grep observationPointID | awk '{$1=$1;print}' | sort], [0], [dnl
> +"observationPointID" : 1001,
> +"observationPointID" : 1001,
> +"observationPointID" : 2001,
> +"observationPointID" : 2002,
> +"observationPointID" : 2002,
> +])
> +
>  OVS_APP_EXIT_AND_WAIT([ovn-controller])
>
>  as ovn-sb
> --
> 2.45.2
>
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
>
Thanks,
Ales
-- 

Ales Musil

Senior Software Engineer - OVN Core

Red Hat EMEA <https://www.redhat.com>

[email protected]
<https://red.ht/sig>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to