Hi, Sent a new version of the patch.
贺鹏 <xnhp0...@gmail.com> 于2021年2月20日周六 上午10:21写道: > > Hi, Mark, > > thanks for the review! > > Mark Gray <mark.d.g...@redhat.com> 于2021年2月19日周五 下午9:07写道: > > > > On 18/02/2021 09:43, Peng He wrote: > > > > One small comment below. By the way, good catch .. i found it difficult > > enough to review it, let alone find it! > > Thanks. > This is found by exploring our controller's bug, which changes zone-id > every time it restarts. > > > > > > CT zone could be set from a field that is not included in frozen > > > metadata. Consider the example rules which are typically seen in > > > OpenStack security group rules: > > > > > > priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0) > > > priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_ZONE[]),2 > > > > > > The zone is set from the first rule's ct action. These two rules will > > > generate two megaflows: the first one uses zone=5 to query the CT module, > > > the second one sets the zone-id from the first megaflow and commit to CT. > > > > > > The current implementation will generate a megaflow that does not use > > > ct_zone=5 as a match, but directly commit into the ct using zone=5, as > > > zone is > > > set by an Imm not a field. > > > > > > Consider a situation that one changes the zone id (for example to 15) > > > in the first rule, however, still keep the second rule unchanged. During > > > this change, there is traffic hitting the two generated megaflows, the > > > revaldiator would revalidate all megaflows, however, the revalidator will > > > not change the second megaflow, because zone=5 is recorded in the > > > megaflow, so the xlate will still translate the commit action into zone=5, > > > and the new traffic will still commit to CT as zone=5, not zone=15, > > > resulting in taffic drops and other issues. > > > > > > Just like OVS set-field convention, if a field X is set by Y > > > (Y is a variable not an Imm), we should also mask Y as a match > > > in the generated megaflow. An exception is that if the zone-id is > > > set by the field that is included in the frozen state (i.e. regs) and this > > > upcall is a resume of a thawed xlate, the un-wildcarding can be skipped, > > > as the recirc_id is a hash of the values in these fields, and it will > > > change > > > following the changes of these fields. When the recirc_id changes, > > > all megaflows with the old recirc id will be invalid later. > > > > > > Fixes: 07659514c3 ("Add support for connection tracking.") > > > Reported-by: Sai Su <susai...@bytedance.com> > > > Signed-off-by: Peng He <hepeng.0...@bytedance.com> > > > --- > > > include/openvswitch/meta-flow.h | 1 + > > > lib/meta-flow.c | 13 ++++++++++ > > > ofproto/ofproto-dpif-xlate.c | 12 ++++++++++ > > > tests/system-traffic.at | 42 +++++++++++++++++++++++++++++++++ > > > 4 files changed, 68 insertions(+) > > > > > > diff --git a/include/openvswitch/meta-flow.h > > > b/include/openvswitch/meta-flow.h > > > index 95e52e358..045dce8f5 100644 > > > --- a/include/openvswitch/meta-flow.h > > > +++ b/include/openvswitch/meta-flow.h > > > @@ -2305,6 +2305,7 @@ void mf_set_flow_value_masked(const struct mf_field > > > *, > > > const union mf_value *mask, > > > struct flow *); > > > bool mf_is_tun_metadata(const struct mf_field *); > > > +bool mf_is_frozen_metadata(const struct mf_field *); > > > bool mf_is_pipeline_field(const struct mf_field *); > > > bool mf_is_set(const struct mf_field *, const struct flow *); > > > void mf_mask_field(const struct mf_field *, struct flow_wildcards *); > > > diff --git a/lib/meta-flow.c b/lib/meta-flow.c > > > index c808d205d..e03cd8d0c 100644 > > > --- a/lib/meta-flow.c > > > +++ b/lib/meta-flow.c > > > @@ -1788,6 +1788,19 @@ mf_is_tun_metadata(const struct mf_field *mf) > > > mf->id < MFF_TUN_METADATA0 + TUN_METADATA_NUM_OPTS; > > > } > > > > > > +bool > > > +mf_is_frozen_metadata(const struct mf_field *mf) > > > +{ > > > + if (mf->id >= MFF_TUN_ID && mf->id <= MFF_IN_PORT_OXM) { > > > + return true; > > > + } > > > + > > > + if (mf->id >= MFF_REG0 && mf->id < MFF_ETH_SRC) { > > > + return true; > > > + } > > > + return false; > > > +} > > > + > > > bool > > > mf_is_pipeline_field(const struct mf_field *mf) > > > { > > > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c > > > index 7108c8a30..14d00db1e 100644 > > > --- a/ofproto/ofproto-dpif-xlate.c > > > +++ b/ofproto/ofproto-dpif-xlate.c > > > @@ -6195,6 +6195,18 @@ compose_conntrack_action(struct xlate_ctx *ctx, > > > struct ofpact_conntrack *ofc, > > > > > > if (ofc->zone_src.field) { > > > zone = mf_get_subfield(&ofc->zone_src, &ctx->xin->flow); > > > + if (ctx->xin->frozen_state) { > > > + /* If the upcall is a resume of a recirculation, we only > > > need to > > > + * unwildcard the fields that are not in the > > > frozen_metadata, as > > > + * when the rules update, OVS will generate a new recirc_id, > > > + * which will invalidate the megaflow with old the recirc_id. > > > + */ > > > + if (!mf_is_frozen_metadata(ofc->zone_src.field)) { > > > + mf_mask_field(ofc->zone_src.field, ctx->wc); > > > + } > > > + } else { > > > + mf_mask_field(ofc->zone_src.field, ctx->wc); > > > + } > > > } else { > > > zone = ofc->zone_imm; > > > } > > > diff --git a/tests/system-traffic.at b/tests/system-traffic.at > > > index fb5b9a36d..84a9759ea 100644 > > > --- a/tests/system-traffic.at > > > +++ b/tests/system-traffic.at > > > @@ -1927,6 +1927,48 @@ > > > tcp,orig=(src=10.1.1.3,dst=10.1.1.4,sport=<cleared>,dport=<cleared>),reply=(src= > > > OVS_TRAFFIC_VSWITCHD_STOP > > > AT_CLEANUP > > > > > > +AT_SETUP([conntrack - zones from other field]) > > > +CHECK_CONNTRACK() > > > +OVS_TRAFFIC_VSWITCHD_START() > > > + > > > +ADD_NAMESPACES(at_ns0, at_ns1) > > > + > > > +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24") > > > +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24") > > > + > > > +dnl Allow any traffic from ns0->ns1. Only allow nd, return traffic from > > > ns1->ns0. > > > +AT_DATA([flows.txt], [dnl > > > +priority=1,action=drop > > > +priority=10,arp,action=normal > > > +priority=10,icmp,action=normal > > > +priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0) > > > +priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_ZONE[]),2 > > > +priority=100,in_port=2,ct_state=-trk,tcp,action=ct(table=0,zone=5) > > > +priority=100,in_port=2,ct_state=+trk,ct_zone=5,tcp,action=1 > > > +]) > > > + > > > +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt]) > > > + > > > +OVS_START_L7([at_ns1], [http]) > > > + > > > +dnl HTTP requests from p0->p1 should work fine. > > > +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v > > > -o wget0.log]) > > > + > > > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | FORMAT_CT(10.1.1.2)], [0], > > > [dnl > > > +tcp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=<cleared>,dport=<cleared>),reply=(src=10.1.1.2,dst=10.1.1.1,sport=<cleared>,dport=<cleared>),zone=5,protoinfo=(state=<cleared>) > > > +]) > > > + > > > +AT_CHECK([ovs-appctl dpctl/dump-flows --names filter=in_port=ovs-p0 | > > > grep "+trk" | grep -q "ct_zone(0x5)" ], [0], []) > > > + > > > +dnl changes zone-id > > > > Could you add a comment here explaining what you are doing and why? I > > realise it is in the commit message but not everyone will look there. > > Sure, will add it in the next version. > > > > +AT_CHECK([ovs-ofctl mod-flows br0 > > > 'priority=100,ct_state=-trk,tcp,in_port="ovs-p0" > > > actions=ct(table=0,zone=15)']) > > > +NS_CHECK_EXEC([at_ns0], [wget 10.1.1.2 -t 3 -T 1 --retry-connrefused -v > > > -o wget0.log]) > > > +AT_CHECK([ovs-appctl dpctl/dump-flows --names filter=in_port=ovs-p0 | > > > grep "+trk" | grep -q "ct_zone(0xf)" ], [0], []) > > > > One thing that I noticed is that the ct_zone(0x5) flow is still in the > > fast path. I think this is ok though as it should expire and I cant > > think of any unwanted side-effects that this would cause? wdyt? > > I think this will not have side-effects as the ct_zone(0x5) is in a > megaflow that has recirc_id. The rule that > queries CT with a special zone-id has changed into 0xf, however, since > CT_ZONE is not included in the frozen_state metadata, > it will not change the recirc_id's hash value, i.e. the action part of > the megaflow that queries CT (the action is with the form of > "ct(zone=15), recirc(ID)") will > keep using the same recirc_id, so both ct_zone(0x5) and ct_zone(0xf) > will have the same recirc_id(ID) in their matches and only one of them > will get matched, > the old one will expire later since no packets will hit this megaflow, > and all sessions in CT with the old ct_zone will expire too later. > > > > > + > > > +OVS_TRAFFIC_VSWITCHD_STOP > > > +AT_CLEANUP > > > + > > > + > > > AT_SETUP([conntrack - multiple bridges]) > > > CHECK_CONNTRACK() > > > OVS_TRAFFIC_VSWITCHD_START( > > > > > > > > -- > hepeng -- hepeng _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev