On 16 May 2025, at 23:25, Ilya Maximets wrote:

> The thread fence in the classifier is supposed to ensure that when the
> subtable->trie_plen is updated, the actual prefix tree is ready to be
> used.  On the write side in trie_init(), the fence is between the
> tree configuration and the 'trie_plen' update.  On the reader's side
> however, the fence is at the beginning of the classifier_lookup__(),
> and both reads of the 'trie_plen' and the accesses to the tree itself
> are happening afterwards.  And since both types of reads are on
> the same side of the fence, the fence is kind of pointless and doesn't
> guarantee any memory ordering.  So, readers can be accessing partially
> initialized prefix trees.
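
For other reviewers, this is how I visualized the broken pairing described
above (my own condensed view of the pre-patch code, not a literal excerpt):

    /* Writer side, old trie_init(). */
    trie_insert(trie, head->cls_rule, plen);       /* Build the tree.   */
    atomic_thread_fence(memory_order_release);
    subtable->trie_plen[trie_idx] = plen;          /* "Tree is ready."  */

    /* Reader side, old classifier_lookup__(). */
    atomic_thread_fence(memory_order_acquire);     /* Fence is here...  */
    /* ...but the read of subtable->trie_plen[] and the walk of the tree
     * nodes both happen after it, so nothing orders them against each
     * other, and a reader can see a non-zero plen while the tree is
     * still half-built. */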

Thanks for this patch, and for the excellent description and code comments;
they made the review a lot easier, especially since I'm not too familiar with
the classification/trie code. Some of the code had me puzzled, but it was all
clear in the end (I guess I should have started with the espresso sooner). I
didn't do any real testing, as the issue is hard to reproduce. Hopefully Numan
has some time to look into that.

One small question, and a nit below. Other than that

Acked-by: Eelco Chaudron <echau...@redhat.com>

> Another problem with the configuration is that cls->n_tries is updated
> without any synchronization as well.  The comment on the fence says
> that it also synchronizes for the cls->n_tries, but that doesn't make
> a lot of sense.  In practice, cls->n_tries is read multiple times
> throughout the classifier_lookup__() and each of these reads may give
> a different value if there is a concurrent update, causing the reader
> to access trees that are not initialized or in the middle of being
> destroyed, leading to OVS crashes while the user updates the flow
> table prefixes.
>
> First thing that needs to be fixed here is to only read cls->n_tries
> once to avoid obvious crashes with access to uninitialized trie_ctx[]
> entries.
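
Just to spell out this first hazard for myself (condensed from the pre-patch
classifier_lookup__(), not a literal excerpt):

    for (int i = 0; i < cls->n_tries; i++) {        /* 1st read, may see 1. */
        trie_ctx_init(&trie_ctx[i], &cls->tries[i]);
    }
    ...
    match = find_match_wc(subtable, version, flow, trie_ctx,
                          cls->n_tries, wc);        /* 2nd read, may see 3. */
    /* With a concurrent classifier_set_prefix_fields(), find_match_wc() can
     * now walk trie_ctx[1] and trie_ctx[2], which were never initialized in
     * the loop above. */
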
>
> The second thing is that we need a proper memory synchronization that
> will guarantee that our prefix trees are fully initialized when
> readers access them.  In the current logic we would need to issue
> a thread fence after every read of a subtable->trie_plen value, i.e.,
> we'd need a fence per subtable lookup.  This would be very expensive
> and wasteful, considering the prefix tree configuration normally
> happens only once somewhere at startup.
>
> What we can do instead is to convert cls->n_tries into an atomic and use
> it as a synchronization point:
>
>   Writer (classifier_set_prefix_fields):
>
>   1. Before making any changes, set cls->n_tries to zero.  Relaxed
>      memory order can be used here, because we'll have a full memory
>      barrier at the next step.
>   2. ovsrcu_synchronize() to wait for all threads to stop using tries.
>   3. Update tries while nobody is using them.
>   4. Set cls->n_tries to a new value with memory_order_release.
>
>   Reader (classifier_lookup):
>
>   1. Read the cls->n_tries with the memory_order_acquire.
>   2. Use that once-read value throughout.
>
> RCU in this scenario will ensure that every thread no longer uses the
> prefix trees when we're about to change them.  The acquire-release
> semantics on the cls->n_tries just saves us from calling the
> ovsrcu_synchronize() the second time once we're done with the whole
> reconfiguration.  We're just updating the number and making all the
> previous changes visible on CPUs that acquire it.
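
Condensing the scheme above into a sketch helped me review it; this is my own
paraphrase of the patch using the OVS atomic wrappers it already uses
(simplified, not the literal hunks):

    /* Writer, classifier_set_prefix_fields(). */
    atomic_store_relaxed(&cls->n_tries, 0);          /* 1: disable tries.    */
    ovsrcu_synchronize();                            /* 2: wait for readers. */
    /* 3: rebuild cls->tries[] while nobody is using them. */
    atomic_store_explicit(&cls->n_tries, n_tries,
                          memory_order_release);     /* 4: publish.          */

    /* Reader, classifier_lookup__(). */
    uint32_t n_tries;
    atomic_read_explicit(&cls->n_tries, &n_tries, memory_order_acquire);
    for (uint32_t i = 0; i < n_tries; i++) {         /* Use only the value   */
        trie_ctx_init(&trie_ctx[i], &cls->tries[i]); /* read once above.     */
    }
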
>
> Alternative solution might be to go full RCU and make the array of
> trees itself RCU-protected.  This way we would not need to do any
> extra RCU synchronization or managing the memory ordering.  However,
> that would mean having multiple layers of RCU with trees and rules
> in them potentially surviving multiple grace periods, which I would
> like to avoid, if possible.
>
> Previous code was also trying to be smart and not disable prefix tree
> lookups for prefixes that are not changing.  We're sacrificing this
> functionality in the name of simpler code.  An attempt to make that work
> would either require a full conversion to RCU or a per-subtable
> synchronization.  Lookups can be done without the prefix match
> optimizations for a brief period of time.  This doesn't affect
> correctness of the resulting datapath flows.

Just out of curiosity, will this take effect immediately during the next 
revalidator run triggered by the table change? Or only when a full revalidation 
is required?

> In the actual implementation instead of dropping cls->n_tries to zero
> at step one, we can keep the access to the first N tries that are not

I'd suggest dropping the word "can" before "keep"; it made me think this
wasn't actually implemented, even though it is.

> going to change by setting the cls->n_tries to the index of the first
> trie that will be updated.  So, we'll not be disabling all the prefix
> match optimizations completely.
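
To double-check my reading of this part with a made-up example: if the
configured tries were {nw_src, nw_dst, ipv6_src} and a new configuration only
replaces ipv6_src, then first_changed == 2, so lookups keep using tries[0]
and tries[1] across the grace period:

    atomic_store_relaxed(&cls->n_tries, first_changed);    /* == 2 here. */
    ovsrcu_synchronize();
    trie_destroy(&cls->tries[2]);
    trie_init(cls, 2, new_fields[2]);
    atomic_store_explicit(&cls->n_tries, 3, memory_order_release);
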
>
> There was an attempt to solve this problem already in commit:
>   a6117059904b ("classifier: Prevent tries vs n_tries race leading to NULL dereference.")
> But it was focused on one particular crash and didn't take into account
> a wider issue with the memory ordering on these trees in general.  The
> changes made in that commit are mostly reverted as not needed anymore.
>
> Fixes: f358a2cb2e54 ("lib/classifier: RCUify prefix trie code.")
> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2025-April/422765.html
> Reported-by: Numan Siddique <num...@ovn.org>
> Signed-off-by: Ilya Maximets <i.maxim...@ovn.org>
> ---
>  lib/classifier.c        | 196 ++++++++++++++++++++--------------------
>  lib/classifier.h        |  10 +-
>  tests/test-classifier.c |  14 +--
>  3 files changed, 112 insertions(+), 108 deletions(-)
>
> diff --git a/lib/classifier.c b/lib/classifier.c
> index 55f23b976..7efa15022 100644
> --- a/lib/classifier.c
> +++ b/lib/classifier.c
> @@ -115,7 +115,7 @@ static const struct cls_subtable *find_match_wc(const struct cls_subtable *,
>                                               ovs_version_t version,
>                                               const struct flow *,
>                                               struct trie_ctx *,
> -                                             unsigned int n_tries,
> +                                             uint32_t n_tries,
>                                               struct flow_wildcards *);
>  static struct cls_match *find_equal(const struct cls_subtable *,
>                                      const struct miniflow *, uint32_t hash);
> @@ -149,7 +149,7 @@ static unsigned int trie_lookup(const struct cls_trie *, const struct flow *,
>  static unsigned int trie_lookup_value(const rcu_trie_ptr *,
>                                        const ovs_be32 value[], ovs_be32 plens[],
>                                        unsigned int value_bits);
> -static void trie_destroy(rcu_trie_ptr *);
> +static void trie_destroy(struct cls_trie *);
>  static void trie_insert(struct cls_trie *, const struct cls_rule *, int mlen);
>  static void trie_insert_prefix(rcu_trie_ptr *, const ovs_be32 *prefix,
>                                 int mlen);
> @@ -334,10 +334,8 @@ classifier_init(struct classifier *cls, const uint8_t *flow_segments)
>              cls->flow_segments[cls->n_flow_segments++] = *flow_segments++;
>          }
>      }
> -    cls->n_tries = 0;
> -    for (int i = 0; i < CLS_MAX_TRIES; i++) {
> -        trie_init(cls, i, NULL);
> -    }
> +    memset(cls->tries, 0, sizeof cls->tries);
> +    atomic_store_explicit(&cls->n_tries, 0, memory_order_release);
>      cls->publish = true;
>  }
>
> @@ -349,10 +347,11 @@ classifier_destroy(struct classifier *cls)
>  {
>      if (cls) {
>          struct cls_subtable *subtable;
> -        int i;
> +        uint32_t i, n_tries;
>
> -        for (i = 0; i < cls->n_tries; i++) {
> -            trie_destroy(&cls->tries[i].root);
> +        atomic_read_relaxed(&cls->n_tries, &n_tries);
> +        for (i = 0; i < n_tries; i++) {
> +            trie_destroy(&cls->tries[i]);
>          }
>
>          CMAP_FOR_EACH (subtable, cmap_node, &cls->subtables_map) {
> @@ -370,11 +369,14 @@ classifier_set_prefix_fields(struct classifier *cls,
>                               const enum mf_field_id *trie_fields,
>                               unsigned int n_fields)
>  {
> -    const struct mf_field * new_fields[CLS_MAX_TRIES];
> +    const struct mf_field *new_fields[CLS_MAX_TRIES];
>      struct mf_bitmap fields = MF_BITMAP_INITIALIZER;
> -    int i, n_tries = 0;
> +    uint32_t i, n_tries = 0, old_n_tries;
> +    uint32_t first_changed = 0;
>      bool changed = false;
>
> +    atomic_read_relaxed(&cls->n_tries, &old_n_tries);
> +
>      for (i = 0; i < n_fields && n_tries < CLS_MAX_TRIES; i++) {
>          const struct mf_field *field = mf_from_id(trie_fields[i]);
>          if (field->flow_be32ofs < 0 || field->n_bits % 32) {
> @@ -393,52 +395,47 @@ classifier_set_prefix_fields(struct classifier *cls,
>          bitmap_set1(fields.bm, trie_fields[i]);
>
>          new_fields[n_tries] = NULL;
> -        const struct mf_field *cls_field
> -            = ovsrcu_get(struct mf_field *, &cls->tries[n_tries].field);
> -        if (n_tries >= cls->n_tries || field != cls_field) {
> +        if (n_tries >= old_n_tries || field != cls->tries[n_tries].field) {
>              new_fields[n_tries] = field;
> +            if (!changed) {
> +                first_changed = n_tries;
> +            }
>              changed = true;
>          }
>          n_tries++;
>      }
>
> -    if (changed || n_tries < cls->n_tries) {
> -        struct cls_subtable *subtable;
> -
> -        /* Trie configuration needs to change.  Disable trie lookups
> -         * for the tries that are changing and wait all the current readers
> -         * with the old configuration to be done. */
> -        changed = false;
> -        CMAP_FOR_EACH (subtable, cmap_node, &cls->subtables_map) {
> -            for (i = 0; i < cls->n_tries; i++) {
> -                if ((i < n_tries && new_fields[i]) || i >= n_tries) {
> -                    if (subtable->trie_plen[i]) {
> -                        subtable->trie_plen[i] = 0;
> -                        changed = true;
> -                    }
> -                }
> -            }
> -        }
> -        /* Synchronize if any readers were using tries.  The readers may
> -         * temporarily function without the trie lookup based optimizations. */
> -        if (changed) {
> -            /* ovsrcu_synchronize() functions as a memory barrier, so it does
> -             * not matter that subtable->trie_plen is not atomic. */
> -            ovsrcu_synchronize();
> +    if (changed || n_tries < old_n_tries) {
> +        if (!changed) {
> +            /* There are no new or changed fields, only removing a few. */
> +            first_changed = n_tries;
>          }
> +        /* Trie configuration needs to change.  Disable trie lookups and wait
> +         * for all the current readers to be done with the old configuration.
> +         * The readers may temporarily function without the trie lookup based
> +         * optimizations.  Keeping the few first entries that didn't change
> +         * accessible.
> +         *
> +         * This store can be relaxed because ovsrcu_synchronize() functions as
> +         * a memory barrier. */
> +        atomic_store_relaxed(&cls->n_tries, first_changed);
> +        ovsrcu_synchronize();
>
> -        /* Now set up the tries. */
> -        for (i = 0; i < n_tries; i++) {
> +        /* Now set up the tries for new and changed fields. */
> +        for (i = first_changed; i < n_tries; i++) {
>              if (new_fields[i]) {
> +                trie_destroy(&cls->tries[i]);
>                  trie_init(cls, i, new_fields[i]);
>              }
>          }
>          /* Destroy the rest, if any. */
> -        for (; i < cls->n_tries; i++) {
> -            trie_init(cls, i, NULL);
> +        for (; i < old_n_tries; i++) {
> +            trie_destroy(&cls->tries[i]);
>          }
>
> -        cls->n_tries = n_tries;
> +        /* Re-enable trie lookups.  Using release memory order, so all the
> +         * previous stores are visible in the classifier_lookup(). */
> +        atomic_store_explicit(&cls->n_tries, n_tries, memory_order_release);
>          return true;
>      }
>
> @@ -451,18 +448,17 @@ trie_init(struct classifier *cls, int trie_idx, const struct mf_field *field)
>      struct cls_trie *trie = &cls->tries[trie_idx];
>      struct cls_subtable *subtable;
>
> -    if (trie_idx < cls->n_tries) {
> -        trie_destroy(&trie->root);
> -    } else {
> -        ovsrcu_set_hidden(&trie->root, NULL);
> -    }
> -    ovsrcu_set_hidden(&trie->field, CONST_CAST(struct mf_field *, field));
> +    ovs_assert(field);
> +    ovs_assert(!trie->field);
> +
> +    trie->field = field;
> +    ovsrcu_set_hidden(&trie->root, NULL);
>
>      /* Add existing rules to the new trie. */
>      CMAP_FOR_EACH (subtable, cmap_node, &cls->subtables_map) {
>          unsigned int plen;
>
> -        plen = field ? minimask_get_prefix_len(&subtable->mask, field) : 0;
> +        plen = minimask_get_prefix_len(&subtable->mask, field);
>          if (plen) {
>              struct cls_match *head;
>
> @@ -470,9 +466,7 @@ trie_init(struct classifier *cls, int trie_idx, const struct mf_field *field)
>                  trie_insert(trie, head->cls_rule, plen);
>              }
>          }
> -        /* Initialize subtable's prefix length on this field.  This will
> -         * allow readers to use the trie. */
> -        atomic_thread_fence(memory_order_release);
> +        /* Initialize subtable's prefix length on this field. */
>          subtable->trie_plen[trie_idx] = plen;
>      }
>  }
> @@ -527,10 +521,10 @@ classifier_replace(struct classifier *cls, const struct cls_rule *rule,
>      struct cls_match *head;
>      unsigned int mask_offset;
>      size_t n_rules = 0;
> +    uint32_t i, n_tries;
>      uint8_t n_indices;
>      uint32_t basis;
>      uint32_t hash;
> -    unsigned int i;
>
>      /* 'new' is initially invisible to lookups. */
>      new = cls_match_alloc(rule, version, conjs, n_conjs);
> @@ -558,7 +552,8 @@ classifier_replace(struct classifier *cls, const struct cls_rule *rule,
>           *
>           * Concurrent readers might miss seeing the rule until this update,
>           * which might require being fixed up by revalidation later. */
> -        for (i = 0; i < cls->n_tries; i++) {
> +        atomic_read_relaxed(&cls->n_tries, &n_tries);
> +        for (i = 0; i < n_tries; i++) {
>              if (subtable->trie_plen[i]) {
>                  trie_insert(&cls->tries[i], rule, subtable->trie_plen[i]);
>              }
> @@ -715,9 +710,9 @@ classifier_remove(struct classifier *cls, const struct cls_rule *cls_rule)
>      struct cls_subtable *subtable;
>      uint32_t basis = 0, hash, ihash[CLS_MAX_INDICES];
>      unsigned int mask_offset;
> +    uint32_t i, n_tries;
>      uint8_t n_indices;
>      size_t n_rules;
> -    unsigned int i;
>
>      rule = get_cls_match_protected(cls_rule);
>      if (!rule) {
> @@ -780,7 +775,8 @@ classifier_remove(struct classifier *cls, const struct cls_rule *cls_rule)
>          trie_remove_prefix(&subtable->ports_trie,
>                             &masked_ports, subtable->ports_mask_len);
>      }
> -    for (i = 0; i < cls->n_tries; i++) {
> +    atomic_read_relaxed(&cls->n_tries, &n_tries);
> +    for (i = 0; i < n_tries; i++) {
>          if (subtable->trie_plen[i]) {
>              trie_remove(&cls->tries[i], cls_rule, subtable->trie_plen[i]);
>          }
> @@ -845,6 +841,7 @@ classifier_remove_assert(struct classifier *cls,
>  struct trie_ctx {
>      const struct cls_trie *trie;
>      bool lookup_done;        /* Status of the lookup. */
> +    uint8_t be32ofs;         /* U32 offset of the field in question. */
>      unsigned int maskbits;   /* Prefix length needed to avoid false matches. */
>      union trie_prefix match_plens;  /* Bitmask of prefix lengths with possible
>                                       * matches. */
> @@ -854,6 +851,7 @@ static void
>  trie_ctx_init(struct trie_ctx *ctx, const struct cls_trie *trie)
>  {
>      ctx->trie = trie;
> +    ctx->be32ofs = trie->field->flow_be32ofs;
>      ctx->lookup_done = false;
>  }
>
> @@ -988,13 +986,18 @@ classifier_lookup__(const struct classifier *cls, ovs_version_t version,
>      size_t n_soft = 0, allocated_soft = ARRAY_SIZE(soft_stub);
>      int soft_pri = INT_MIN;    /* n_soft ? MAX(soft[*]->priority) : INT_MIN. */
>
> -    /* Synchronize for cls->n_tries and subtable->trie_plen.  They can change
> -     * when table configuration changes, which happens typically only on
> -     * startup. */
> -    atomic_thread_fence(memory_order_acquire);
> +    uint32_t n_tries;
> +
> +    /* Using memory_order_acquire on cls->n_tries to make sure that all the
> +     * configuration changes for these tries are fully visible after the read.
> +     *
> +     * Trie configuration changes typically happen on startup, but can also
> +     * happen in runtime. */
> +    atomic_read_explicit(&CONST_CAST(struct classifier *, cls)->n_tries,
> +                         &n_tries, memory_order_acquire);
>
>      /* Initialize trie contexts for find_match_wc(). */
> -    for (int i = 0; i < cls->n_tries; i++) {
> +    for (uint32_t i = 0; i < n_tries; i++) {
>          trie_ctx_init(&trie_ctx[i], &cls->tries[i]);
>      }
>
> @@ -1006,8 +1009,7 @@ classifier_lookup__(const struct classifier *cls, ovs_version_t version,
>
>          /* Skip subtables with no match, or where the match is lower-priority
>           * than some certain match we've already found. */
> -        match = find_match_wc(subtable, version, flow, trie_ctx, cls->n_tries,
> -                              wc);
> +        match = find_match_wc(subtable, version, flow, trie_ctx, n_tries, wc);
>          if (!match || match->priority <= hard_pri) {
>              continue;
>          }
> @@ -1535,8 +1537,8 @@ static struct cls_subtable *
>  insert_subtable(struct classifier *cls, const struct minimask *mask)
>  {
>      uint32_t hash = minimask_hash(mask, 0);
> +    uint32_t i, n_tries, index = 0;
>      struct cls_subtable *subtable;
> -    int i, index = 0;
>      struct flowmap stage_map;
>      uint8_t prev;
>      size_t count = miniflow_n_values(&mask->masks);
> @@ -1574,11 +1576,10 @@ insert_subtable(struct classifier *cls, const struct minimask *mask)
>      }
>      *CONST_CAST(uint8_t *, &subtable->n_indices) = index;
>
> -    for (i = 0; i < cls->n_tries; i++) {
> -        const struct mf_field *field
> -            = ovsrcu_get(struct mf_field *, &cls->tries[i].field);
> -        subtable->trie_plen[i]
> -            = field ? minimask_get_prefix_len(mask, field) : 0;
> +    atomic_read_relaxed(&cls->n_tries, &n_tries);
> +    for (i = 0; i < n_tries; i++) {
> +        subtable->trie_plen[i] = minimask_get_prefix_len(mask,
> +                                                          cls->tries[i].field);
>      }
>
>      /* Ports trie. */
> @@ -1610,29 +1611,22 @@ static unsigned int be_get_bit_at(const ovs_be32 value[], unsigned int ofs);
>  /* Return 'true' if can skip rest of the subtable based on the prefix trie
>   * lookup results. */
>  static inline bool
> -check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES], unsigned int n_tries,
> +check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES], uint32_t n_tries,
>              const unsigned int field_plen[CLS_MAX_TRIES],
>              const struct flowmap range_map, const struct flow *flow,
>              struct flow_wildcards *wc)
>  {
> -    int j;
> +    uint32_t j;
>
>      /* Check if we could avoid fully unwildcarding the next level of
>       * fields using the prefix tries.  The trie checks are done only as
>       * needed to avoid folding in additional bits to the wildcards mask. */
>      for (j = 0; j < n_tries; j++) {
> -        /* Is the trie field relevant for this subtable? */
> -        if (field_plen[j]) {
> -            struct trie_ctx *ctx = &trie_ctx[j];
> -            const struct mf_field *ctx_field
> -                = ovsrcu_get(struct mf_field *, &ctx->trie->field);
> -
> -            /* Is the trie field within the current range of fields? */
> -            if (!ctx_field
> -                || !flowmap_is_set(&range_map, ctx_field->flow_be32ofs / 2)) {
> -                continue;
> -            }
> +        struct trie_ctx *ctx = &trie_ctx[j];
>
> +        /* Is the trie field relevant for this subtable, and
> +         * is the trie field within the current range of fields? */
> +        if (field_plen[j] && flowmap_is_set(&range_map, ctx->be32ofs / 2)) {
>              /* On-demand trie lookup. */
>              if (!ctx->lookup_done) {
>                  memset(&ctx->match_plens, 0, sizeof ctx->match_plens);
> @@ -1653,16 +1647,14 @@ check_tries(struct trie_ctx trie_ctx[CLS_MAX_TRIES], unsigned int n_tries,
>                   * than this subtable would otherwise. */
>                  if (ctx->maskbits <= field_plen[j]) {
>                      /* Unwildcard the bits and skip the rest. */
> -                    mask_set_prefix_bits(wc, ctx_field->flow_be32ofs,
> -                                         ctx->maskbits);
> +                    mask_set_prefix_bits(wc, ctx->be32ofs, ctx->maskbits);
>                      /* Note: Prerequisite already unwildcarded, as the only
>                       * prerequisite of the supported trie lookup fields is
>                       * the ethertype, which is always unwildcarded. */
>                      return true;
>                  }
>                  /* Can skip if the field is already unwildcarded. */
> -                if (mask_prefix_bits_set(wc, ctx_field->flow_be32ofs,
> -                                         ctx->maskbits)) {
> +                if (mask_prefix_bits_set(wc, ctx->be32ofs, ctx->maskbits)) {
>                      return true;
>                  }
>              }
> @@ -1727,7 +1719,7 @@ find_match(const struct cls_subtable *subtable, ovs_version_t version,
>  static const struct cls_match *
>  find_match_wc(const struct cls_subtable *subtable, ovs_version_t version,
>                const struct flow *flow, struct trie_ctx *trie_ctx,
> -              unsigned int n_tries, struct flow_wildcards *wc)
> +              uint32_t n_tries, struct flow_wildcards *wc)
>  {
>      if (OVS_UNLIKELY(!wc)) {
>          return find_match(subtable, version, flow,
> @@ -1740,7 +1732,7 @@ find_match_wc(const struct cls_subtable *subtable, ovs_version_t version,
>      unsigned int mask_offset = 0;
>      bool adjust_ports_mask = false;
>      ovs_be32 ports_mask;
> -    int i;
> +    uint32_t i;
>
>      /* Try to finish early by checking fields in segments. */
>      for (i = 0; i < subtable->n_indices; i++) {
> @@ -1950,18 +1942,29 @@ trie_node_rcu_realloc(const struct trie_node *node)
>  }
>
>  static void
> -trie_destroy(rcu_trie_ptr *trie)
> +trie_destroy__(rcu_trie_ptr *trie)
>  {
>      struct trie_node *node = ovsrcu_get_protected(struct trie_node *, trie);
>
>      if (node) {
>          ovsrcu_set_hidden(trie, NULL);
> -        trie_destroy(&node->edges[0]);
> -        trie_destroy(&node->edges[1]);
> +        trie_destroy__(&node->edges[0]);
> +        trie_destroy__(&node->edges[1]);
>          trie_node_destroy(node);
>      }
>  }
>
> +static void
> +trie_destroy(struct cls_trie *trie)
> +{
> +    if (!trie) {
> +        return;
> +    }
> +
> +    trie_destroy__(&trie->root);
> +    trie->field = NULL;
> +}
> +
>  static bool
>  trie_is_leaf(const struct trie_node *trie)
>  {
> @@ -2070,12 +2073,12 @@ static unsigned int
>  trie_lookup(const struct cls_trie *trie, const struct flow *flow,
>              union trie_prefix *plens)
>  {
> -    const struct mf_field *mf = ovsrcu_get(struct mf_field *, &trie->field);
> +    const struct mf_field *mf = trie->field;
>
>      /* Check that current flow matches the prerequisites for the trie
>       * field.  Some match fields are used for multiple purposes, so we
>       * must check that the trie is relevant for this flow. */
> -    if (mf && mf_are_prereqs_ok(mf, flow, NULL)) {
> +    if (mf_are_prereqs_ok(mf, flow, NULL)) {
>          return trie_lookup_value(&trie->root,
>                                   &((ovs_be32 *)flow)[mf->flow_be32ofs],
>                                   &plens->be32, mf->n_bits);
> @@ -2122,9 +2125,8 @@ minimask_get_prefix_len(const struct minimask *minimask,
>   * happened to be zeros.
>   */
>  static const ovs_be32 *
> -minimatch_get_prefix(const struct minimatch *match, rcu_field_ptr *field)
> +minimatch_get_prefix(const struct minimatch *match, const struct mf_field *mf)
>  {
> -    struct mf_field *mf = ovsrcu_get_protected(struct mf_field *, field);
>      size_t u64_ofs = mf->flow_be32ofs / 2;
>
>      return (OVS_FORCE const ovs_be32 *)miniflow_get__(match->flow, u64_ofs)
> @@ -2138,7 +2140,7 @@ static void
>  trie_insert(struct cls_trie *trie, const struct cls_rule *rule, int mlen)
>  {
>      trie_insert_prefix(&trie->root,
> -                       minimatch_get_prefix(&rule->match, &trie->field), mlen);
> +                       minimatch_get_prefix(&rule->match, trie->field), mlen);
>  }
>
>  static void
> @@ -2193,7 +2195,7 @@ static void
>  trie_remove(struct cls_trie *trie, const struct cls_rule *rule, int mlen)
>  {
>      trie_remove_prefix(&trie->root,
> -                       minimatch_get_prefix(&rule->match, &trie->field), mlen);
> +                       minimatch_get_prefix(&rule->match, trie->field), mlen);
>  }
>
>  /* 'mlen' must be the (non-zero) CIDR prefix length of the 'trie->field' mask
> diff --git a/lib/classifier.h b/lib/classifier.h
> index 7928601e0..77a3ae5cf 100644
> --- a/lib/classifier.h
> +++ b/lib/classifier.h
> @@ -314,15 +314,13 @@ extern "C" {
>  struct cls_subtable;
>  struct cls_match;
>
> -struct mf_field;
> -typedef OVSRCU_TYPE(struct mf_field *) rcu_field_ptr;
>  struct trie_node;
>  typedef OVSRCU_TYPE(struct trie_node *) rcu_trie_ptr;
>
>  /* Prefix trie for a 'field' */
>  struct cls_trie {
> -    rcu_field_ptr field;   /* Trie field, or NULL. */
> -    rcu_trie_ptr root;     /* NULL if none. */
> +    const struct mf_field *field; /* Trie field, or NULL. */
> +    rcu_trie_ptr root;            /* NULL if none. */
>  };
>
>  enum {
> @@ -340,7 +338,9 @@ struct classifier {
>      struct pvector subtables;
>      struct cmap partitions;         /* Contains "struct cls_partition"s. */
>      struct cls_trie tries[CLS_MAX_TRIES]; /* Prefix tries. */
> -    unsigned int n_tries;
> +    atomic_uint32_t n_tries;        /* Number of tries.  Also serves as a
> +                                     * memory synchronization point for trie
> +                                     * configuration. */
>      bool publish;                   /* Make changes visible to lookups? */
>  };
>
> diff --git a/tests/test-classifier.c b/tests/test-classifier.c
> index ee022ba1e..7f6ba50ee 100644
> --- a/tests/test-classifier.c
> +++ b/tests/test-classifier.c
> @@ -418,6 +418,7 @@ compare_classifiers(struct classifier *cls, size_t n_invisible_rules,
>          const struct cls_rule *cr0, *cr1, *cr2;
>          struct flow flow;
>          struct flow_wildcards wc;
> +        uint32_t n_tries;
>          unsigned int x;
>
>          flow_wildcards_init_catchall(&wc);
> @@ -439,7 +440,8 @@ compare_classifiers(struct classifier *cls, size_t n_invisible_rules,
>          flow.nw_tos = nw_dscp_values[get_value(&x, N_NW_DSCP_VALUES)];
>
>          /* This assertion is here to suppress a GCC 4.9 array-bounds warning */
> -        ovs_assert(cls->n_tries <= CLS_MAX_TRIES);
> +        atomic_read_relaxed(&cls->n_tries, &n_tries);
> +        ovs_assert(n_tries <= CLS_MAX_TRIES);
>
>          cr0 = classifier_lookup(cls, version, &flow, &wc, NULL);
>          cr1 = tcls_lookup(tcls, &flow);
> @@ -510,12 +512,12 @@ verify_tries(struct classifier *cls)
>      OVS_NO_THREAD_SAFETY_ANALYSIS
>  {
>      unsigned int n_rules;
> -    int i;
> +    uint32_t i, n_tries;
>
> -    for (i = 0; i < cls->n_tries; i++) {
> -        const struct mf_field * cls_field
> -            = ovsrcu_get(struct mf_field *, &cls->tries[i].field);
> -        n_rules = trie_verify(&cls->tries[i].root, 0, cls_field->n_bits);
> +    atomic_read_explicit(&cls->n_tries, &n_tries, memory_order_acquire);
> +    for (i = 0; i < n_tries; i++) {
> +        n_rules = trie_verify(&cls->tries[i].root, 0,
> +                              cls->tries[i].field->n_bits);
>          assert(n_rules <= cls->n_rules);
>      }
>  }
> -- 
> 2.49.0
>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev