Re: [ovs-dev] [PATCH v7] ofproto-dpif-xlate: Don't reinstall removed XC_LEARN rule

Mike Pattrick Fri, 01 Sep 2023 09:07:18 -0700

On Fri, Sep 1, 2023 at 9:14 AM Ilya Maximets <[email protected]> wrote:
>
> On 9/1/23 07:06, Mike Pattrick wrote:
> > When the a revalidator thread is updating statistics for an XC_LEARN
> > xcache entry in xlate_push_stats_entry it uses ofproto_flow_mod_learn.
> > The revalidator will update stats for rules even if they are in a
> > removed state or marked as invisible. However, ofproto_flow_mod_learn
> > will detect if a flow has been removed and re-add it in that case. This
> > can result in an old learn action replacing the new learn action that
> > had replaced it in the first place.
> >
> > This change adds a new last_used parameter to ofproto_flow_mod_learn
> > allowing the caller to provide a timestamp that will be fed into the
> > learned rule's modified time. The provided timestamp should be the time
> > of the last packet activity. If last_used is not set then the current
> > time is used, as is the current behaviour.
> >
> > This change also adds a check when replacing a learned rule to favour the
> > newest rule.
> >
> > Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2213892
> > Signed-off-by: Mike Pattrick <[email protected]>
> > ---
> >  v2: Added additional checks if rule is removed
> >  v3: v2 patch was corrupted in transit
> >  v4: Added check against dpif flow stats
> >  v5: Fixed typos, updated commit message
> >      Changed timestamps to use datapath timestamp more consistently
> >  v6: Added a unit test, changed some variable names and comments,
> >      zero is no longer an acceptable value
> >  v7: Simplified unit test, updated comments.
> >
> > ---
> >  ofproto/ofproto-dpif-xlate-cache.c |   2 +-
> >  ofproto/ofproto-dpif-xlate.c       |  10 ++-
> >  ofproto/ofproto-dpif.c             |   2 +-
> >  ofproto/ofproto-provider.h         |   6 +-
> >  ofproto/ofproto.c                  |  60 +++++++++++--
> >  tests/learn.at                     | 132 +++++++++++++++++++++++++++++
> >  6 files changed, 198 insertions(+), 14 deletions(-)
> >
> > diff --git a/ofproto/ofproto-dpif-xlate-cache.c 
> > b/ofproto/ofproto-dpif-xlate-cache.c
> > index 9224ee2e6..2e1fcb3a6 100644
> > --- a/ofproto/ofproto-dpif-xlate-cache.c
> > +++ b/ofproto/ofproto-dpif-xlate-cache.c
> > @@ -125,7 +125,7 @@ xlate_push_stats_entry(struct xc_entry *entry,
> >      case XC_LEARN: {
> >          enum ofperr error;
> >          error = ofproto_flow_mod_learn(entry->learn.ofm, true,
> > -                                       entry->learn.limit, NULL);
> > +                                       entry->learn.limit, NULL, 
> > stats->used);
> >          if (error) {
> >              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> >              VLOG_WARN_RL(&rl, "xcache LEARN action execution failed.");
> > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > index 47ea0f47e..d608a5f25 100644
> > --- a/ofproto/ofproto-dpif-xlate.c
> > +++ b/ofproto/ofproto-dpif-xlate.c
> > @@ -5700,8 +5700,16 @@ xlate_learn_action(struct xlate_ctx *ctx, const 
> > struct ofpact_learn *learn)
> >          if (!error) {
> >              bool success = true;
> >              if (ctx->xin->allow_side_effects) {
> > +                long long int last_used;
> > +
> > +                if (ctx->xin->resubmit_stats) {
> > +                    last_used = ctx->xin->resubmit_stats->used;
> > +                } else {
> > +                    last_used = time_msec();
> > +                }
> >                  error = ofproto_flow_mod_learn(ofm, ctx->xin->xcache != 
> > NULL,
> > -                                               learn->limit, &success);
> > +                                               learn->limit, &success,
> > +                                               last_used);
> >              } else if (learn->limit) {
> >                  if (!ofm->temp_rule
> >                      || ofm->temp_rule->state != RULE_INSERTED) {
> > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> > index e22ca757a..ba5706f6a 100644
> > --- a/ofproto/ofproto-dpif.c
> > +++ b/ofproto/ofproto-dpif.c
> > @@ -4880,7 +4880,7 @@ packet_xlate(struct ofproto *ofproto_, struct 
> > ofproto_packet_out *opo)
> >              if (entry->type == XC_LEARN) {
> >                  struct ofproto_flow_mod *ofm = entry->learn.ofm;
> >
> > -                error = ofproto_flow_mod_learn_refresh(ofm);
> > +                error = ofproto_flow_mod_learn_refresh(ofm, time_msec());
> >                  if (error) {
> >                      goto error_out;
> >                  }
> > diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
> > index 143ded690..9f7b8b6e8 100644
> > --- a/ofproto/ofproto-provider.h
> > +++ b/ofproto/ofproto-provider.h
> > @@ -2027,9 +2027,11 @@ enum ofperr ofproto_flow_mod_init_for_learn(struct 
> > ofproto *,
> >                                              struct ofproto_flow_mod *)
> >      OVS_EXCLUDED(ofproto_mutex);
> >  enum ofperr ofproto_flow_mod_learn(struct ofproto_flow_mod *, bool 
> > keep_ref,
> > -                                   unsigned limit, bool *below_limit)
> > +                                   unsigned limit, bool *below_limit,
> > +                                   long long int last_used)
> >      OVS_EXCLUDED(ofproto_mutex);
> > -enum ofperr ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm);
> > +enum ofperr ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm,
> > +                                           long long int last_used);
> >  enum ofperr ofproto_flow_mod_learn_start(struct ofproto_flow_mod *ofm)
> >      OVS_REQUIRES(ofproto_mutex);
> >  void ofproto_flow_mod_learn_revert(struct ofproto_flow_mod *ofm)
> > diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
> > index dbf4958bc..7d3b3b6d1 100644
> > --- a/ofproto/ofproto.c
> > +++ b/ofproto/ofproto.c
> > @@ -5472,7 +5472,8 @@ ofproto_flow_mod_init_for_learn(struct ofproto 
> > *ofproto,
> >  }
> >
> >  enum ofperr
> > -ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm)
> > +ofproto_flow_mod_learn_refresh(struct ofproto_flow_mod *ofm,
> > +                               long long int last_used)
> >  {
> >      enum ofperr error = 0;
> >
> > @@ -5493,9 +5494,37 @@ ofproto_flow_mod_learn_refresh(struct 
> > ofproto_flow_mod *ofm)
> >       * this function is executed the rule will be reinstated. */
> >      if (rule->state == RULE_REMOVED) {
> >          struct cls_rule cr;
> > +        struct oftable *table = &rule->ofproto->tables[rule->table_id];
> > +        ovs_version_t tables_version = rule->ofproto->tables_version;
> > +
> > +        if (!cls_rule_visible_in_version(&rule->cr, tables_version)) {
> > +            const struct cls_rule *curr_cls_rule;
> > +
> > +            /* Only check for matching classifier rules and their modified
> > +             * time, instead of also checking all rule metadata, with the 
> > goal
> > +             * of suprressing a learn action update that would replace a 
> > more
>
> s/ suprressing / suppressing /
>
> > +             * recent rule in the classifier. */
> > +            curr_cls_rule = classifier_find_rule_exactly(&table->cls,
> > +                                                         &rule->cr,
> > +                                                         tables_version);
> > +            if (curr_cls_rule) {
> > +                struct rule *curr_rule = rule_from_cls_rule(curr_cls_rule);
> > +                long long int curr_last_used;
> > +
> > +                ovs_mutex_lock(&curr_rule->mutex);
> > +                curr_last_used = curr_rule->modified;
> > +                ovs_mutex_unlock(&curr_rule->mutex);
> > +
> > +                if (curr_last_used > last_used) {
> > +                    /* In the case of a newer visible rule, don't recreate 
> > the
> > +                     *  current rule. */
> > +                    return 0;
> > +                }
> > +            }
> > +        }
> >
> > -        cls_rule_clone(&cr, &rule->cr);
> >          ovs_mutex_lock(&rule->mutex);
> > +        cls_rule_clone(&cr, &rule->cr);
> >          error = ofproto_rule_create(rule->ofproto, &cr, rule->table_id,
> >                                      rule->flow_cookie,
> >                                      rule->idle_timeout,
> > @@ -5506,6 +5535,7 @@ ofproto_flow_mod_learn_refresh(struct 
> > ofproto_flow_mod *ofm)
> >                                      rule->match_tlv_bitmap,
> >                                      rule->ofpacts_tlv_bitmap,
> >                                      &ofm->temp_rule);
> > +        ofm->temp_rule->modified = last_used;
> >          ovs_mutex_unlock(&rule->mutex);
> >          if (!error) {
> >              ofproto_rule_unref(rule);   /* Release old reference. */
> > @@ -5513,7 +5543,7 @@ ofproto_flow_mod_learn_refresh(struct 
> > ofproto_flow_mod *ofm)
> >      } else {
> >          /* Refresh the existing rule. */
> >          ovs_mutex_lock(&rule->mutex);
> > -        rule->modified = time_msec();
> > +        rule->modified = last_used;
> >          ovs_mutex_unlock(&rule->mutex);
> >      }
> >      return error;
> > @@ -5565,10 +5595,16 @@ ofproto_flow_mod_learn_finish(struct 
> > ofproto_flow_mod *ofm,
> >
> >  /* Refresh 'ofm->temp_rule', for which the caller holds a reference, if 
> > already
> >   * in the classifier, insert it otherwise.  If the rule has already been
> > - * removed from the classifier, a new rule is created using 
> > 'ofm->temp_rule' as
> > - * a template and the reference to the old 'ofm->temp_rule' is freed.  If
> > - * 'keep_ref' is true, then a reference to the current rule is held, 
> > otherwise
> > - * it is released and 'ofm->temp_rule' is set to NULL.
> > + * removed from the classifier and replaced by another rule, the 
> > 'last_used'
> > + * parameter is used to determine whether the newer rule is replaced or 
> > kept.
> > + * If 'last_used' is greater than the last modified time of an identical 
> > rule
> > + * in the classifier, then a new rule is created using 'ofm->temp_rule' as 
> > a
> > + * template and the reference to the old 'ofm->temp_rule' is freed.  If the
> > + * rule has been removed but another identical rule doesn't exist in the
> > + * classifier, then it will be recreated.  If the rule hasn't been removed
> > + * from the classifier, then 'last_used' is used to update the rules 
> > modified
> > + * time.  If 'keep_ref' is true, then a reference to the current rule is 
> > held,
> > + * otherwise it is released and 'ofm->temp_rule' is set to NULL.
> >   *
> >   * If 'limit' != 0, insertion will fail if there are more than 'limit' 
> > rules
> >   * in the same table with the same cookie.  If insertion succeeds,
> > @@ -5579,10 +5615,11 @@ ofproto_flow_mod_learn_finish(struct 
> > ofproto_flow_mod *ofm,
> >   * during the call. */
> >  enum ofperr
> >  ofproto_flow_mod_learn(struct ofproto_flow_mod *ofm, bool keep_ref,
> > -                       unsigned limit, bool *below_limitp)
> > +                       unsigned limit, bool *below_limitp,
> > +                       long long int last_used)
> >      OVS_EXCLUDED(ofproto_mutex)
> >  {
> > -    enum ofperr error = ofproto_flow_mod_learn_refresh(ofm);
> > +    enum ofperr error = ofproto_flow_mod_learn_refresh(ofm, last_used);
> >      struct rule *rule = ofm->temp_rule;
> >      bool below_limit = true;
> >
> > @@ -5615,6 +5652,11 @@ ofproto_flow_mod_learn(struct ofproto_flow_mod *ofm, 
> > bool keep_ref,
> >
> >              error = ofproto_flow_mod_learn_start(ofm);
> >              if (!error) {
> > +                /* ofproto_flow_mod_learn_start may have overwritten
> > +                 * modified with current time. */
> > +                ovs_mutex_lock(&ofm->temp_rule->mutex);
> > +                ofm->temp_rule->modified = last_used;
> > +                ovs_mutex_unlock(&ofm->temp_rule->mutex);
> >                  error = ofproto_flow_mod_learn_finish(ofm, NULL);
> >              }
> >          } else {
> > diff --git a/tests/learn.at b/tests/learn.at
> > index d127fed34..8e5054c08 100644
> > --- a/tests/learn.at
> > +++ b/tests/learn.at
> > @@ -836,3 +836,135 @@ AT_CHECK([ovs-vsctl add-br br1 -- set b br1 
> > datapath_type=dummy])
> >
> >  OVS_VSWITCHD_STOP
> >  AT_CLEANUP
> > +
> > +AT_SETUP([learning action - flapping learn rule])
> > +OVS_VSWITCHD_START
> > +add_of_ports br0 1 2 3
> > +
> > +ovs-appctl time/stop
> > +AT_CHECK([[ovs-ofctl add-flow br0 
> > 'table=0,priority=2,in_port=1,actions=resubmit(,2)']])
> > +AT_CHECK([[ovs-ofctl add-flow br0 
> > 'table=0,priority=2,in_port=3,actions=resubmit(,2)']])
> > +AT_CHECK([[ovs-ofctl add-flow br0 'table=0,in_port=2,actions=normal']])
> > +AT_CHECK([[ovs-ofctl add-flow br0 'table=0,priority=4,actions=drop']])
> > +AT_CHECK([[ovs-ofctl add-flow br0 
> > 'table=2,actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x0001020304050607,NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],output:OXM_OF_IN_PORT[]),output:2']])
> > +
> > +packet1='in_port(1),eth(src=50:54:00:00:00:06,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
> > +packet2='in_port(3),eth(src=50:54:00:00:00:06,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
> > +
> > +ovs-appctl revalidator/pause
> > +ovs-appctl netdev-dummy/receive p3 $packet2
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p1 $packet1
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p3 $packet2
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p1 $packet1
> > +ovs-appctl time/warp 75
> > +ovs-appctl revalidator/resume
> > +ovs-appctl revalidator/wait
> > +
> > +AT_CHECK([ovs-ofctl --no-stats dump-flows br0 | ofctl_strip | sed -e 
> > 's/n_packets=[[0-9]]*, n_bytes=[[0-9]]*, //' | sort], [0], [dnl
> > + cookie=0x1020304050607, hard_timeout=3, 
> > priority=1,vlan_tci=0x0000/0x0fff,dl_dst=50:54:00:00:00:06 actions=output:1
> > + in_port=2 actions=NORMAL
> > + priority=2 actions=drop
> > + priority=2,in_port=1 actions=resubmit(,2)
> > + priority=2,in_port=3 actions=resubmit(,2)
> > + table=2, 
> > actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x1020304050607,NXM_OF_ETH_DST[[]]=NXM_OF_ETH_SRC[[]],output:OXM_OF_IN_PORT[[]]),output:2
> > +NXST_FLOW reply:
> > +])
> > +
> > +ovs-appctl revalidator/pause
> > +ovs-appctl netdev-dummy/receive p1 $packet1
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p3 $packet2
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p1 $packet1
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p3 $packet2
> > +ovs-appctl time/warp 75
> > +
> > +ovs-appctl revalidator/resume
> > +ovs-appctl revalidator/wait
> > +
> > +AT_CHECK([ovs-ofctl --no-stats dump-flows br0 | ofctl_strip | sort], [0], 
> > [dnl
> > + cookie=0x1020304050607, hard_timeout=3, 
> > priority=1,vlan_tci=0x0000/0x0fff,dl_dst=50:54:00:00:00:06 actions=output:3
> > + in_port=2 actions=NORMAL
> > + priority=2 actions=drop
> > + priority=2,in_port=1 actions=resubmit(,2)
> > + priority=2,in_port=3 actions=resubmit(,2)
> > + table=2, 
> > actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x1020304050607,NXM_OF_ETH_DST[[]]=NXM_OF_ETH_SRC[[]],output:OXM_OF_IN_PORT[[]]),output:2
> > +NXST_FLOW reply:
> > +])
> > +
> > +ovs-appctl time/warp 1000 3001
> > +
> > +AT_CHECK([ovs-ofctl dump-flows br0 | ofctl_strip | sort], [0], [dnl
> > + in_port=2 actions=NORMAL
> > + priority=2 actions=drop
> > + priority=2,in_port=1 actions=resubmit(,2)
> > + priority=2,in_port=3 actions=resubmit(,2)
> > + table=2, 
> > actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x1020304050607,NXM_OF_ETH_DST[[]]=NXM_OF_ETH_SRC[[]],output:OXM_OF_IN_PORT[[]]),output:2
> > +NXST_FLOW reply:
> > +])
> > +
> > +OVS_VSWITCHD_STOP
> > +AT_CLEANUP
>
> The test above looks like a debug version of a test below and always fails.
> Please, remove.
>
> > +
> > +AT_SETUP([learning action - flapping learn rule])
> > +OVS_VSWITCHD_START
> > +add_of_ports br0 1 2 3
> > +
> > +ovs-appctl time/stop
> > +AT_CHECK([[ovs-ofctl add-flow br0 
> > 'table=0,priority=2,in_port=1,actions=resubmit(,2)']])
> > +AT_CHECK([[ovs-ofctl add-flow br0 
> > 'table=0,priority=2,in_port=2,actions=resubmit(,2)']])
> > +AT_CHECK([[ovs-ofctl add-flow br0 
> > 'table=2,actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x123,NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],output:OXM_OF_IN_PORT[]),output:3']])
> > +
> > +packet1='eth(src=50:54:00:00:00:06,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
> > +packet2='eth(src=50:54:00:00:00:06,dst=50:54:00:00:00:07),eth_type(0x0800),ipv4(src=192.168.0.1,dst=192.168.0.2,proto=6,tos=0,ttl=64,frag=no),tcp(src=8,dst=9)'
> > +
>
> Why two variables for the same packet?
> Might make sense to also use m4_define for it, so the value is visible
> in the test log.


Hello Ilya,

Thanks for the feedback, I'll resubmit with many changes incorporated.
However, I don't think using m4_define like that would be consistent
with any other tests.


Cheers,
M

>
> Are we protected from hashing and flow dump order differences without
> a loop that we had in previous version?
>
> > +ovs-appctl revalidator/pause
> > +ovs-appctl netdev-dummy/receive p2 $packet2
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p1 $packet1
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p2 $packet2
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p1 $packet1
> > +ovs-appctl time/warp 75
> > +ovs-appctl revalidator/resume
> > +ovs-appctl revalidator/wait
>
> Please, wrap all the ovs-appctl commands into AT_CHECK, so they are visible
> in the test log.  To save some LOCs, time warps can be written as:
>
>   AT_CHECK([ovs-appctl time/warp 75], [0], [ignore])
>
> > +
> > +ovs-ofctl --no-stats dump-flows br0
>
> Seems unnecessary.
>
> > +
> > +AT_CHECK([ovs-ofctl --no-stats dump-flows br0 | ofctl_strip | sort | grep 
> > 0x123], [0], [dnl
> > + cookie=0x123, hard_timeout=3, priority=1,dl_dst=50:54:00:00:00:06 
> > actions=output:1
> > + table=2, 
> > actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x123,NXM_OF_ETH_DST[[]]=NXM_OF_ETH_SRC[[]],output:OXM_OF_IN_PORT[[]]),output:3
> > +])
> > +
> > +ovs-appctl revalidator/pause
> > +ovs-appctl netdev-dummy/receive p1 $packet1
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p2 $packet2
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p1 $packet1
> > +ovs-appctl time/warp 75
> > +ovs-appctl netdev-dummy/receive p2 $packet2
> > +ovs-appctl time/warp 75
> > +
> > +ovs-appctl revalidator/resume
> > +ovs-appctl revalidator/wait
>
> Ditto.
>
> > +
> > +AT_CHECK([ovs-ofctl --no-stats dump-flows br0 | ofctl_strip | sort | grep 
> > 0x123], [0], [dnl
> > + cookie=0x123, hard_timeout=3, priority=1,dl_dst=50:54:00:00:00:06 
> > actions=output:2
> > + table=2, 
> > actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x123,NXM_OF_ETH_DST[[]]=NXM_OF_ETH_SRC[[]],output:OXM_OF_IN_PORT[[]]),output:3
> > +])
> > +
> > +dnl Wait and check for learned rule eviction
>
> Period at the end of a sentence.  Also, might make sense to point out
> that eviction is happening due to hard timeout.
>
> > +ovs-appctl time/warp 3200
> > +
> > +AT_CHECK([ovs-ofctl --no-stats dump-flows br0 | ofctl_strip | grep 0x123], 
> > [0], [dnl
> > + table=2, 
> > actions=learn(table=0,hard_timeout=3,priority=1,cookie=0x123,NXM_OF_ETH_DST[[]]=NXM_OF_ETH_SRC[[]],output:OXM_OF_IN_PORT[[]]),output:3
> > +])
> > +
> > +OVS_VSWITCHD_STOP
> > +AT_CLEANUP
> > --
> > 2.39.3
>

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v7] ofproto-dpif-xlate: Don't reinstall removed XC_LEARN rule

Reply via email to