Sure, will send a v2. Thanks for the review.
Ilya Maximets <[email protected]> 于2021年7月26日周一 上午7:22写道: > > Re-adding the list and people from CC. > > On 6/27/21 8:16 AM, 贺鹏 wrote: > > Hi, Ilya, > > > > sorry for replying so late, being really busy this month. > > Busy time for everyone. > BTW, I'm taking some time off next week. Maybe someone else will be able > to take a look at the changes while I'm not here. My replies inline. Have a good rest. > > > I've read through your comments. I personally prefer to keep the > > number of megaflows as few as possible, as megaflow is becoming > > expensive as the size of flow struct is large. > > below are my experiments. > > > > Ilya Maximets <[email protected]> 于2021年5月22日周六 上午3:50写道: > >> > >> On 5/21/21 5:00 AM, 贺鹏 wrote: > >>> Hi, Ilya > >>> > >>> > >>> > >>> Ilya Maximets <[email protected]> 于2021年5月19日周三 下午8:50写道: > >>>> > >>>> On 2/27/21 10:34 AM, Peng He wrote: > >>>>> CT zone could be set from a field that is not included in frozen > >>>>> metadata. Consider the example rules which are typically seen in > >>>>> OpenStack security group rules: > >>>>> > >>>>> priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0) > >>>>> priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_ZONE[]),2 > >>>>> > >>>>> The zone is set from the first rule's ct action. These two rules will > >>>>> generate two megaflows: the first one uses zone=5 to query the CT > >>>>> module, > >>>>> the second one sets the zone-id from the first megaflow and commit to > >>>>> CT. > >>>>> > >>>>> The current implementation will generate a megaflow that does not use > >>>>> ct_zone=5 as a match, but directly commit into the ct using zone=5, as > >>>>> zone is > >>>>> set by an Imm not a field. > >>>>> > >>>>> Consider a situation that one changes the zone id (for example to 15) > >>>>> in the first rule, however, still keep the second rule unchanged. During > >>>>> this change, there is traffic hitting the two generated megaflows, the > >>>>> revaldiator would revalidate all megaflows, however, the revalidator > >>>>> will > >>>>> not change the second megaflow, because zone=5 is recorded in the > >>>>> megaflow, so the xlate will still translate the commit action into > >>>>> zone=5, > >>>>> and the new traffic will still commit to CT as zone=5, not zone=15, > >>>>> resulting in taffic drops and other issues. > >>>>> > >>>>> Just like OVS set-field convention, if a field X is set by Y > >>>>> (Y is a variable not an Imm), we should also mask Y as a match > >>>>> in the generated megaflow. An exception is that if the zone-id is > >>>>> set by the field that is included in the frozen state (i.e. regs) and > >>>>> this > >>>>> upcall is a resume of a thawed xlate, the un-wildcarding can be skipped, > >>>>> as the recirc_id is a hash of the values in these fields, and it will > >>>>> change > >>>>> following the changes of these fields. When the recirc_id changes, > >>>>> all megaflows with the old recirc id will be invalid later. > >>>>> > >>>>> Fixes: 07659514c3 ("Add support for connection tracking.") > >>>>> Reported-by: Sai Su <[email protected]> > >>>>> Signed-off-by: Peng He <[email protected]> > >>>>> --- > >>>>> include/openvswitch/meta-flow.h | 1 + > >>>>> lib/meta-flow.c | 13 ++++++++++ > >>>>> ofproto/ofproto-dpif-xlate.c | 12 +++++++++ > >>>>> tests/system-traffic.at | 45 +++++++++++++++++++++++++++++++++ > >>>>> 4 files changed, 71 insertions(+) > >>>>> > >>>>> diff --git a/include/openvswitch/meta-flow.h > >>>>> b/include/openvswitch/meta-flow.h > >>>>> index 95e52e358..045dce8f5 100644 > >>>>> --- a/include/openvswitch/meta-flow.h > >>>>> +++ b/include/openvswitch/meta-flow.h > >>>>> @@ -2305,6 +2305,7 @@ void mf_set_flow_value_masked(const struct > >>>>> mf_field *, > >>>>> const union mf_value *mask, > >>>>> struct flow *); > >>>>> bool mf_is_tun_metadata(const struct mf_field *); > >>>>> +bool mf_is_frozen_metadata(const struct mf_field *); > >>>>> bool mf_is_pipeline_field(const struct mf_field *); > >>>>> bool mf_is_set(const struct mf_field *, const struct flow *); > >>>>> void mf_mask_field(const struct mf_field *, struct flow_wildcards *); > >>>>> diff --git a/lib/meta-flow.c b/lib/meta-flow.c > >>>>> index c808d205d..e03cd8d0c 100644 > >>>>> --- a/lib/meta-flow.c > >>>>> +++ b/lib/meta-flow.c > >>>>> @@ -1788,6 +1788,19 @@ mf_is_tun_metadata(const struct mf_field *mf) > >>>>> mf->id < MFF_TUN_METADATA0 + TUN_METADATA_NUM_OPTS; > >>>>> } > >>>>> > >>>>> +bool > >>>>> +mf_is_frozen_metadata(const struct mf_field *mf) > >>>>> +{ > >>>>> + if (mf->id >= MFF_TUN_ID && mf->id <= MFF_IN_PORT_OXM) { > >>>>> + return true; > >>>>> + } > >>>>> + > >>>>> + if (mf->id >= MFF_REG0 && mf->id < MFF_ETH_SRC) { > >>>>> + return true; > >>>>> + } > >>>>> + return false; > >>>>> +} > >>>>> + > >>>>> bool > >>>>> mf_is_pipeline_field(const struct mf_field *mf) > >>>>> { > >>>>> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > >>>>> index 7108c8a30..14d00db1e 100644 > >>>>> --- a/ofproto/ofproto-dpif-xlate.c > >>>>> +++ b/ofproto/ofproto-dpif-xlate.c > >>>>> @@ -6195,6 +6195,18 @@ compose_conntrack_action(struct xlate_ctx *ctx, > >>>>> struct ofpact_conntrack *ofc, > >>>>> > >>>>> if (ofc->zone_src.field) { > >>>>> zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow); > >>>>> + if (ctx->xin->frozen_state) { > >>>>> + /* If the upcall is a resume of a recirculation, we only > >>>>> need to > >>>>> + * unwildcard the fields that are not in the > >>>>> frozen_metadata, as > >>>>> + * when the rules update, OVS will generate a new > >>>>> recirc_id, > >>>>> + * which will invalidate the megaflow with old the > >>>>> recirc_id. > >>>>> + */ > >>>>> + if (!mf_is_frozen_metadata(ofc->zone_src.field)) { > >>>>> + mf_mask_field(ofc->zone_src.field, ctx->wc); > >>>>> + } > >>>>> + } else { > >>>>> + mf_mask_field(ofc->zone_src.field, ctx->wc); > >>>>> + } > >>>> > >>>> IIUC, in current code above block is equal to just a single line: > >>>> > >>>> mf_mask_field(ofc->zone_src.field, ctx->wc); > >>>> > >>>> That is because zone is not part of a frozen metadata, right? > >>> > >>> Yes, basically this is the reason. > >>> I think the original code only considers the case that the zone id is > >>> set by IMM, not > >>> by a reg. > >>> > >>>> > >>>> Can we just remove all this extra logic and unwildcard unconditionally? > >>>> I understand that this code might save us one match in case someday > >>>> zone will be part of a frozen metadata, but is that really important? > >>>> Is there any harm in unwildcarding this field unconditionally? > >>> > >>> I think the code can reduce the number of megaflows in case that the rule > >>> sets the zone-id using IMM. But I think unconditionally un-wildcarding is > >>> also > >>> OK. > >> > >> From what I managed to test, if value is taken from any register > >> populated by Imm, there are no extra matches in datapath flows. > >> Maybe I'm missing some cases, though. > >> > >> I tried to experiment a bit with this code and the test case. > >> First thing that I found is that we, probably, need to use > >> mf_write_subfield_flow() instead of mf_mask_field(), because > >> we should not unwildcard the whole field if only part of it > >> used. > >> > >> Second thing is that I tried to write following set of flows > >> that might not make much sense, but should probably work fine: > >> > >> priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0,commit,exec(load:0xffff0005->NXM_NX_CT_LABEL[[0..31]])) > >> priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_LABEL[[0..15]]),2 > >> > >> And what I see is that in resulted datapath flows there is no > >> match on ct_label. This happens because masks of ct_label and > >> ct_mark gets saved at the beginning of the function and restored > >> later, i.e. our un-wildcarding is lost. > > > > Yes, good catch! > > > > Yes, I am using the belowing code: > > > > if (ofc->zone_src.field) { > > zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow); > > + union mf_subvalue value; > > + memset(&value, 0xff, sizeof(value)); > > if (ctx->xin->frozen_state) { > > /* If the upcall is a resume of a recirculation, we only need > > to > > * unwildcard the fields that are not in the frozen_metadata, > > as > > @@ -6202,15 +6189,30 @@ compose_conntrack_action(struct xlate_ctx > > *ctx, struct ofpact_conntrack *ofc, > > * which will invalidate the megaflow with old the recirc_id. > > */ > > if (!mf_is_frozen_metadata(ofc->zone_src.field)) { > > - mf_mask_field(ofc->zone_src.field, ctx->wc); > > + mf_write_subfield_flow(&ofc->zone_src, &value, > > &ctx->wc->masks); > > } > > } else { > > - mf_mask_field(ofc->zone_src.field, ctx->wc); > > + mf_write_subfield_flow(&ofc->zone_src, &value, > > &ctx->wc->masks); > > } > > } else { > > zone = ofc->zone_imm; > > } > > > > Not sure if it fits the idiom usage of mf APIs. > > I'm not an expert in this part of the code, but this seems fine to me. > I would re-order the definition part though, oterwise it's hard to read. > Something like: > > if (ofc->zone_src.field) { > union mf_subvalue value; > memset(&value, 0xff, sizeof(value)); > > zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow); > if (ctx->xin->frozen_state) { > ... > > > > > >> > >> The third thing is that if I'm un-wildcarding before saving or > >> after restoring, I see a bit weird datapath flow after modification > >> of the first flow to: > >> > >> priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=15,table=0,commit,exec(load:0xffff000f->NXM_NX_CT_LABEL[[0..31]])) > >> > >> There is a new datapath flow with a correct ct_label match, but > >> the old flow gets updated to have ct_label match on 0. This might > >> make some sense for this set of flows, but I'm not sure. > >> > >> Before flow-mod: > >> > >> recirc_id(0),in_port(ovs-p0),ct_state(-trk),eth(),eth_type(0x0800),ipv4(proto=6,frag=no), > >> actions:ct(commit,zone=5,label=0xffff0005/0xffffffff),recirc(0x1) > >> > >> recirc_id(0x1),in_port(ovs-p0),ct_state(+trk),ct_label(0x5/0xffff),eth(),eth_type(0x0800),ipv4(proto=6,frag=no), > >> actions:ct(commit,zone=5,label=0x5/0xffff),ovs-p1 > >> > >> After: > >> > >> recirc_id(0),in_port(ovs-p0),ct_state(-trk),eth(),eth_type(0x0800),ipv4(proto=6,frag=no), > >> actions:ct(commit,zone=15,label=0xffff000f/0xffffffff),recirc(0x1) > >> > >> recirc_id(0x1),in_port(ovs-p0),ct_state(+trk),ct_label(0/0xffff),eth(),eth_type(0x0800),ipv4(proto=6,frag=no), > >> actions:ct(commit,label=0/0xffff),ovs-p1 > >> > >> recirc_id(0x1),in_port(ovs-p0),ct_state(+trk),ct_label(0xf/0xffff),eth(),eth_type(0x0800),ipv4(proto=6,frag=no), > >> actions:ct(commit,zone=15,label=0xf/0xffff),ovs-p1 > >> > > > > I do not encounter such an issue, my changes are here: > > Hmm. Maybe I did something wrong. > Could you prepare v2, so we can continue discussion from there? > > Best regards, Ilya Maximets. > > > > > compose_conntrack_action(struct xlate_ctx *ctx, struct ofpact_conntrack > > *ofc, > > bool is_last_action) > > { > > - ovs_u128 old_ct_label_mask = ctx->wc->masks.ct_label; > > - uint32_t old_ct_mark_mask = ctx->wc->masks.ct_mark; > > - size_t ct_offset; > > uint16_t zone; > > - > > uint16_t zone; > > - > > - /* Ensure that any prior actions are applied before composing the new > > - * conntrack action. */ > > - xlate_commit_actions(ctx); > > - > > - /* Process nested actions first, to populate the key. */ > > - ctx->ct_nat_action = NULL; > > - ctx->wc->masks.ct_mark = 0; > > - ctx->wc->masks.ct_label = OVS_U128_ZERO; > > - do_xlate_actions(ofc->actions, ofpact_ct_get_action_len(ofc), ctx, > > - is_last_action, false); > > - > > if (ofc->zone_src.field) { > > zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow); > > + union mf_subvalue value; > > + memset(&value, 0xff, sizeof(value)); > > if (ctx->xin->frozen_state) { > > /* If the upcall is a resume of a recirculation, we only need > > to > > * unwildcard the fields that are not in the frozen_metadata, > > as > > @@ -6202,15 +6189,30 @@ compose_conntrack_action(struct xlate_ctx > > *ctx, struct ofpact_conntrack *ofc, > > * which will invalidate the megaflow with old the recirc_id. > > */ > > if (!mf_is_frozen_metadata(ofc->zone_src.field)) { > > - mf_mask_field(ofc->zone_src.field, ctx->wc); > > + mf_write_subfield_flow(&ofc->zone_src, &value, > > &ctx->wc->masks); > > } > > } else { > > - mf_mask_field(ofc->zone_src.field, ctx->wc); > > + mf_write_subfield_flow(&ofc->zone_src, &value, > > &ctx->wc->masks); > > } > > } else { > > zone = ofc->zone_imm; > > } > > > > + size_t ct_offset; > > + ovs_u128 old_ct_label_mask = ctx->wc->masks.ct_label; > > + uint32_t old_ct_mark_mask = ctx->wc->masks.ct_mark; > > + /* Ensure that any prior actions are applied before composing the new > > + * conntrack action. */ > > + xlate_commit_actions(ctx); > > + > > + /* Process nested actions first, to populate the key. */ > > + ctx->ct_nat_action = NULL; > > + ctx->wc->masks.ct_mark = 0; > > + ctx->wc->masks.ct_label = OVS_U128_ZERO; > > + do_xlate_actions(ofc->actions, ofpact_ct_get_action_len(ofc), ctx, > > + is_last_action, false); > > + > > + > > > > and my test is OK. > > root@debian:~/ovs# ovs-appctl dpctl/dump-flows | grep label > > ct_state(+trk),ct_label(0xffff0005/0xffff),recirc_id(0x2),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=6,frag=no), > > packets:5, bytes:465, used:1.323s, flags:FP., > > actions:ct(commit,zone=5),3 > > ct_state(-trk),recirc_id(0),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=6,frag=no), > > packets:11, bytes:1004, used:1.293s, flags:SFP., > > actions:ct(commit,zone=15,label=0xffff000f/0xffffffff),recirc(0x2) > > ct_state(+trk),ct_label(0xffff000f/0xffff),recirc_id(0x2),in_port(2),packet_type(ns=0,id=0),eth_type(0x0800),ipv4(proto=6,frag=no), > > packets:5, bytes:465, used:1.293s, flags:FP., > > actions:ct(commit,zone=15),3 > > > > > > > >> Best regards, Ilya Maximets. > > > > > > > -- hepeng _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
