Sure,
will send a v2. Thanks for the review.

Ilya Maximets <[email protected]> 于2021年7月26日周一 上午7:22写道:

>
> Re-adding the list and people from CC.
>
> On 6/27/21 8:16 AM, 贺鹏 wrote:
> > Hi, Ilya,
> >
> > sorry for replying so late, being really busy this month.
>
> Busy time for everyone.
> BTW, I'm taking some time off next week.  Maybe someone else will be able
> to take a look at the changes while I'm not here.  My replies inline.

Have a good rest.

>
> > I've read through your comments. I personally prefer to keep the
> > number of megaflows as few as possible, as megaflow is becoming
> > expensive as the size of flow struct is large.
> > below are my experiments.
> >
> > Ilya Maximets <[email protected]> 于2021年5月22日周六 上午3:50写道:
> >>
> >> On 5/21/21 5:00 AM, 贺鹏 wrote:
> >>> Hi, Ilya
> >>>
> >>>
> >>>
> >>> Ilya Maximets <[email protected]> 于2021年5月19日周三 下午8:50写道:
> >>>>
> >>>> On 2/27/21 10:34 AM, Peng He wrote:
> >>>>> CT zone could be set from a field that is not included in frozen
> >>>>> metadata. Consider the example rules which are typically seen in
> >>>>> OpenStack security group rules:
> >>>>>
> >>>>> priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0)
> >>>>> priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_ZONE[]),2
> >>>>>
> >>>>> The zone is set from the first rule's ct action. These two rules will
> >>>>> generate two megaflows: the first one uses zone=5 to query the CT 
> >>>>> module,
> >>>>> the second one sets the zone-id from the first megaflow and commit to 
> >>>>> CT.
> >>>>>
> >>>>> The current implementation will generate a megaflow that does not use
> >>>>> ct_zone=5 as a match, but directly commit into the ct using zone=5, as 
> >>>>> zone is
> >>>>> set by an Imm not a field.
> >>>>>
> >>>>> Consider a situation that one changes the zone id (for example to 15)
> >>>>> in the first rule, however, still keep the second rule unchanged. During
> >>>>> this change, there is traffic hitting the two generated megaflows, the
> >>>>> revaldiator would revalidate all megaflows, however, the revalidator 
> >>>>> will
> >>>>> not change the second megaflow, because zone=5 is recorded in the
> >>>>> megaflow, so the xlate will still translate the commit action into 
> >>>>> zone=5,
> >>>>> and the new traffic will still commit to CT as zone=5, not zone=15,
> >>>>> resulting in taffic drops and other issues.
> >>>>>
> >>>>> Just like OVS set-field convention, if a field X is set by Y
> >>>>> (Y is a variable not an Imm), we should also mask Y as a match
> >>>>> in the generated megaflow. An exception is that if the zone-id is
> >>>>> set by the field that is included in the frozen state (i.e. regs) and 
> >>>>> this
> >>>>> upcall is a resume of a thawed xlate, the un-wildcarding can be skipped,
> >>>>> as the recirc_id is a hash of the values in these fields, and it will 
> >>>>> change
> >>>>> following the changes of these fields. When the recirc_id changes,
> >>>>> all megaflows with the old recirc id will be invalid later.
> >>>>>
> >>>>> Fixes: 07659514c3 ("Add support for connection tracking.")
> >>>>> Reported-by: Sai Su <[email protected]>
> >>>>> Signed-off-by: Peng He <[email protected]>
> >>>>> ---
> >>>>>  include/openvswitch/meta-flow.h |  1 +
> >>>>>  lib/meta-flow.c                 | 13 ++++++++++
> >>>>>  ofproto/ofproto-dpif-xlate.c    | 12 +++++++++
> >>>>>  tests/system-traffic.at         | 45 +++++++++++++++++++++++++++++++++
> >>>>>  4 files changed, 71 insertions(+)
> >>>>>
> >>>>> diff --git a/include/openvswitch/meta-flow.h 
> >>>>> b/include/openvswitch/meta-flow.h
> >>>>> index 95e52e358..045dce8f5 100644
> >>>>> --- a/include/openvswitch/meta-flow.h
> >>>>> +++ b/include/openvswitch/meta-flow.h
> >>>>> @@ -2305,6 +2305,7 @@ void mf_set_flow_value_masked(const struct 
> >>>>> mf_field *,
> >>>>>                                const union mf_value *mask,
> >>>>>                                struct flow *);
> >>>>>  bool mf_is_tun_metadata(const struct mf_field *);
> >>>>> +bool mf_is_frozen_metadata(const struct mf_field *);
> >>>>>  bool mf_is_pipeline_field(const struct mf_field *);
> >>>>>  bool mf_is_set(const struct mf_field *, const struct flow *);
> >>>>>  void mf_mask_field(const struct mf_field *, struct flow_wildcards *);
> >>>>> diff --git a/lib/meta-flow.c b/lib/meta-flow.c
> >>>>> index c808d205d..e03cd8d0c 100644
> >>>>> --- a/lib/meta-flow.c
> >>>>> +++ b/lib/meta-flow.c
> >>>>> @@ -1788,6 +1788,19 @@ mf_is_tun_metadata(const struct mf_field *mf)
> >>>>>             mf->id < MFF_TUN_METADATA0 + TUN_METADATA_NUM_OPTS;
> >>>>>  }
> >>>>>
> >>>>> +bool
> >>>>> +mf_is_frozen_metadata(const struct mf_field *mf)
> >>>>> +{
> >>>>> +    if (mf->id >= MFF_TUN_ID && mf->id <= MFF_IN_PORT_OXM) {
> >>>>> +        return true;
> >>>>> +    }
> >>>>> +
> >>>>> +    if (mf->id >= MFF_REG0 && mf->id < MFF_ETH_SRC) {
> >>>>> +        return true;
> >>>>> +    }
> >>>>> +    return false;
> >>>>> +}
> >>>>> +
> >>>>>  bool
> >>>>>  mf_is_pipeline_field(const struct mf_field *mf)
> >>>>>  {
> >>>>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> >>>>> index 7108c8a30..14d00db1e 100644
> >>>>> --- a/ofproto/ofproto-dpif-xlate.c
> >>>>> +++ b/ofproto/ofproto-dpif-xlate.c
> >>>>> @@ -6195,6 +6195,18 @@ compose_conntrack_action(struct xlate_ctx *ctx, 
> >>>>> struct ofpact_conntrack *ofc,
> >>>>>
> >>>>>      if (ofc->zone_src.field) {
> >>>>>          zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow);
> >>>>> +        if (ctx->xin->frozen_state) {
> >>>>> +            /* If the upcall is a resume of a recirculation, we only 
> >>>>> need to
> >>>>> +             * unwildcard the fields that are not in the 
> >>>>> frozen_metadata, as
> >>>>> +             * when the rules update, OVS will generate a new 
> >>>>> recirc_id,
> >>>>> +             * which will invalidate the megaflow with old the 
> >>>>> recirc_id.
> >>>>> +             */
> >>>>> +            if (!mf_is_frozen_metadata(ofc->zone_src.field)) {
> >>>>> +                mf_mask_field(ofc->zone_src.field, ctx->wc);
> >>>>> +            }
> >>>>> +        } else {
> >>>>> +            mf_mask_field(ofc->zone_src.field, ctx->wc);
> >>>>> +        }
> >>>>
> >>>> IIUC, in current code above block is equal to just a single line:
> >>>>
> >>>>     mf_mask_field(ofc->zone_src.field, ctx->wc);
> >>>>
> >>>> That is because zone is not part of a frozen metadata, right?
> >>>
> >>> Yes, basically this is the reason.
> >>> I think the original code only considers the case that the zone id is
> >>> set by IMM, not
> >>> by a reg.
> >>>
> >>>>
> >>>> Can we just remove all this extra logic and unwildcard unconditionally?
> >>>> I understand that this code might save us one match in case someday
> >>>> zone will be part of a frozen metadata, but is that really important?
> >>>> Is there any harm in unwildcarding this field unconditionally?
> >>>
> >>> I think the code can reduce the number of megaflows in case that the rule
> >>> sets the zone-id using IMM. But I think unconditionally un-wildcarding is 
> >>> also
> >>> OK.
> >>
> >> From what I managed to test, if value is taken from any register
> >> populated by Imm, there are no extra matches in datapath flows.
> >> Maybe I'm missing some cases, though.
> >>
> >> I tried to experiment a bit with this code and the test case.
> >> First thing that I found is that we, probably, need to use
> >> mf_write_subfield_flow() instead of mf_mask_field(), because
> >> we should not unwildcard the whole field if only part of it
> >> used.
> >>
> >> Second thing is that I tried to write following set of flows
> >> that might not make much sense, but should probably work fine:
> >>
> >> priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0,commit,exec(load:0xffff0005->NXM_NX_CT_LABEL[[0..31]]))
> >> priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_LABEL[[0..15]]),2
> >>
> >> And what I see is that in resulted datapath flows there is no
> >> match on ct_label.  This happens because masks of ct_label and
> >> ct_mark gets saved at the beginning of the function and restored
> >> later, i.e. our un-wildcarding is lost.
> >
> > Yes, good catch!
> >
> > Yes, I am using the belowing code:
> >
> >      if (ofc->zone_src.field) {
> >          zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow);
> > +        union mf_subvalue value;
> > +        memset(&value, 0xff, sizeof(value));
> >          if (ctx->xin->frozen_state) {
> >              /* If the upcall is a resume of a recirculation, we only need 
> > to
> >               * unwildcard the fields that are not in the frozen_metadata, 
> > as
> > @@ -6202,15 +6189,30 @@ compose_conntrack_action(struct xlate_ctx
> > *ctx, struct ofpact_conntrack *ofc,
> >               * which will invalidate the megaflow with old the recirc_id.
> >               */
> >              if (!mf_is_frozen_metadata(ofc->zone_src.field)) {
> > -                mf_mask_field(ofc->zone_src.field, ctx->wc);
> > +                mf_write_subfield_flow(&ofc->zone_src, &value,
> > &ctx->wc->masks);
> >              }
> >          } else {
> > -            mf_mask_field(ofc->zone_src.field, ctx->wc);
> > +            mf_write_subfield_flow(&ofc->zone_src, &value, 
> > &ctx->wc->masks);
> >          }
> >      } else {
> >          zone = ofc->zone_imm;
> >      }
> >
> > Not sure if it fits the idiom usage of mf APIs.
>
> I'm not an expert in this part of the code, but this seems fine to me.
> I would re-order the definition part though, oterwise it's hard to read.
> Something like:
>
> if (ofc->zone_src.field) {
>     union mf_subvalue value;
>     memset(&value, 0xff, sizeof(value));
>
>     zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow);
>     if (ctx->xin->frozen_state) {
>         ...
>
>
> >
> >>
> >> The third thing is that if I'm un-wildcarding before saving or
> >> after restoring, I see a bit weird datapath flow after modification
> >> of the first flow to:
> >>
> >> priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=15,table=0,commit,exec(load:0xffff000f->NXM_NX_CT_LABEL[[0..31]]))
> >>
> >> There is a new datapath flow with a correct ct_label match, but
> >> the old flow gets updated to have ct_label match on 0.  This might
> >> make some sense for this set of flows, but I'm not sure.
> >>
> >> Before flow-mod:
> >>
> >> recirc_id(0),in_port(ovs-p0),ct_state(-trk),eth(),eth_type(0x0800),ipv4(proto=6,frag=no),
> >>     actions:ct(commit,zone=5,label=0xffff0005/0xffffffff),recirc(0x1)
> >>
> >> recirc_id(0x1),in_port(ovs-p0),ct_state(+trk),ct_label(0x5/0xffff),eth(),eth_type(0x0800),ipv4(proto=6,frag=no),
> >>     actions:ct(commit,zone=5,label=0x5/0xffff),ovs-p1
> >>
> >> After:
> >>
> >> recirc_id(0),in_port(ovs-p0),ct_state(-trk),eth(),eth_type(0x0800),ipv4(proto=6,frag=no),
> >>     actions:ct(commit,zone=15,label=0xffff000f/0xffffffff),recirc(0x1)
> >>
> >> recirc_id(0x1),in_port(ovs-p0),ct_state(+trk),ct_label(0/0xffff),eth(),eth_type(0x0800),ipv4(proto=6,frag=no),
> >>     actions:ct(commit,label=0/0xffff),ovs-p1
> >>
> >> recirc_id(0x1),in_port(ovs-p0),ct_state(+trk),ct_label(0xf/0xffff),eth(),eth_type(0x0800),ipv4(proto=6,frag=no),
> >>     actions:ct(commit,zone=15,label=0xf/0xffff),ovs-p1
> >>
> >
> > I do not encounter such an issue, my changes are here:
>
> Hmm.  Maybe I did something wrong.
> Could you prepare v2, so we can continue discussion from there?
>
> Best regards, Ilya Maximets.
>
> >
> >  compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack 
> > *ofc,
> >                           bool is_last_action)
> >  {
> > -    ovs_u128 old_ct_label_mask = ctx->wc->masks.ct_label;
> > -    uint32_t old_ct_mark_mask = ctx->wc->masks.ct_mark;
> > -    size_t ct_offset;
> >      uint16_t zone;
> > -
> >      uint16_t zone;
> > -
> > -    /* Ensure that any prior actions are applied before composing the new
> > -     * conntrack action. */
> > -    xlate_commit_actions(ctx);
> > -
> > -    /* Process nested actions first, to populate the key. */
> > -    ctx->ct_nat_action = NULL;
> > -    ctx->wc->masks.ct_mark = 0;
> > -    ctx->wc->masks.ct_label = OVS_U128_ZERO;
> > -    do_xlate_actions(ofc->actions, ofpact_ct_get_action_len(ofc), ctx,
> > -                     is_last_action, false);
> > -
> >      if (ofc->zone_src.field) {
> >          zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow);
> > +        union mf_subvalue value;
> > +        memset(&value, 0xff, sizeof(value));
> >          if (ctx->xin->frozen_state) {
> >              /* If the upcall is a resume of a recirculation, we only need 
> > to
> >               * unwildcard the fields that are not in the frozen_metadata, 
> > as
> > @@ -6202,15 +6189,30 @@ compose_conntrack_action(struct xlate_ctx
> > *ctx, struct ofpact_conntrack *ofc,
> >               * which will invalidate the megaflow with old the recirc_id.
> >               */
> >              if (!mf_is_frozen_metadata(ofc->zone_src.field)) {
> > -                mf_mask_field(ofc->zone_src.field, ctx->wc);
> > +                mf_write_subfield_flow(&ofc->zone_src, &value,
> > &ctx->wc->masks);
> >              }
> >          } else {
> > -            mf_mask_field(ofc->zone_src.field, ctx->wc);
> > +            mf_write_subfield_flow(&ofc->zone_src, &value, 
> > &ctx->wc->masks);
> >          }
> >      } else {
> >          zone = ofc->zone_imm;
> >      }
> >
> > +    size_t ct_offset;
> > +    ovs_u128 old_ct_label_mask = ctx->wc->masks.ct_label;
> > +    uint32_t old_ct_mark_mask = ctx->wc->masks.ct_mark;
> > +    /* Ensure that any prior actions are applied before composing the new
> > +     * conntrack action. */
> > +    xlate_commit_actions(ctx);
> > +
> > +    /* Process nested actions first, to populate the key. */
> > +    ctx->ct_nat_action = NULL;
> > +    ctx->wc->masks.ct_mark = 0;
> > +    ctx->wc->masks.ct_label = OVS_U128_ZERO;
> > +    do_xlate_actions(ofc->actions, ofpact_ct_get_action_len(ofc), ctx,
> > +                     is_last_action, false);
> > +
> > +
> >
> > and my test is OK.
> > root@debian:~/ovs# ovs-appctl dpctl/dump-flows | grep label
> > ct_state(+trk),ct_label(0xffff0005/0xffff),recirc_id(0x2),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=6,frag=no),
> > packets:5, bytes:465, used:1.323s, flags:FP.,
> > actions:ct(commit,zone=5),3
> > ct_state(-trk),recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=6,frag=no),
> > packets:11, bytes:1004, used:1.293s, flags:SFP.,
> > actions:ct(commit,zone=15,label=0xffff000f/0xffffffff),recirc(0x2)
> > ct_state(+trk),ct_label(0xffff000f/0xffff),recirc_id(0x2),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=6,frag=no),
> > packets:5, bytes:465, used:1.293s, flags:FP.,
> > actions:ct(commit,zone=15),3
> >
> >
> >
> >> Best regards, Ilya Maximets.
> >
> >
> >
>


--
hepeng
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to