date:20210305

Gaëtan Rivet  于2021年3月3日周三 上午4:01写道：
>
> On Sun, Nov 29, 2020, at 03:32, hepeng.0320 wrote:
> > From: hepeng 
> >
> > This patch tries to use ovs_spin_lock as an alternative to reduce the
> > size of each conn. The ovs_mutex_lock consumes 48 bytes while the
> > ovs_spin_lock only uses 16 bytes.
>
> Using a spin-lock means that the thread won't yield. It might be fine for 
> userland datapath threads, but those are not the only one locking the 
> connections. The ct_clean thread will do, and will execute on randomly 
> assigned CPUs, potentially blocking other threads from being scheduled there. 
> As many conn critical sections are not short, this could impair other modules 
> beyond conntrack.

I guess one way to check if you should use spinlock is to check if
your datapath is polling base or not, the spinlock might be harmful
for AF_XDP userspace datapath, I guess.

since PMD are normally tied into some cores, so the execute time of
the critical section is limited and predictable (as normally some
special optimization like putting isolcpu in the GRUB), so we can
expect that the CT clean thread will not be blocked that long?
I see in DPDK code the spinlock is used in many places, I guess in
certain cases, that we can switch to use spinlock.

Maybe uses pthread_mutex as a default setting, but provide compile
macros as in this patch.

>
> Gaining some bytes is good, but it needs more thought and opinions. There 
> might be other venues available, less risky, to reduce the size.
>
> >Also, we remove 
> > the alg info into an
> > extra space as alg is not common in the real world userspace ct use case.
> > (mostly used as security group and nat in the cloud).
>
> Can you submit this as a separate commit?
>
> >
> > The size of conn_tcp: 312 -> 240.
> >
> > Signed-off-by: Peng He 
>
> Regards,
> --
> Gaetan Rivet
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

-- 
hepeng
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 3/4] conntrack: refactoring alg handle code path

Gaëtan Rivet  于2021年3月3日周三 上午3:49写道：
>
> On Sun, Nov 29, 2020, at 03:32, hepeng.0320 wrote:
> > From: hepeng 
> >
> > Just like the previous patch, split the handle_ftp_ctl into
> > handle_ftp_interest and handle_ftp_other, since we can infer
> > which one to call from the called positions.
> >
> > We also introduce a simple framework which unifies the normal
> > conn handle path and alg conn handle path by using two hooks:
> > *before_conn_update_hook* and *after_conn_update_hook*.
> >
>
> Reading this 'also', I thought initially the commit should be split and the 
> two changes, dividing the ALG CTL in two and introducing the framework were 
> unrelated. Reading further, it's clearer now that this is one logical change.
> I think the link between the two should be expanded in the commit log.
>
> The motivation I get for this commit is to "unify the normal and ALG conn 
> handle path". However, only the FTP ALG benefits from the hooks introduced. 
> The code is streamlined and a single-purpose function is removed, but it 
> seems flimsy as a motivation to introduce a new callback system.

The userspace CT only supports very few ALGs, and even for FTP ALG,
the code can just work for the normal traffic,
any retrans TCP will confuse FTP ALG since the seq adjust account both
normal and retrans packets.

the mini framework is just to unify the code handle path, as for alg
process, you have to do something before conn_udpate
and after, it's better to have a hook framework better than hard
coded. I am not sure if future ALG handlers can benefit from
this framework, but it seems it's better than the current implementation.

The other part I agree with you.

>
> Can you expand on the goal of the framework, and which ALG it could help to 
> implement?
> It is hard to judge whether the framework is useful and properly defined 
> without having a full picture.
>
> > Signed-off-by: Peng He 
> > ---
> >  lib/conntrack-private.h |  16 +-
> >  lib/conntrack.c | 477 
> >  2 files changed, 307 insertions(+), 186 deletions(-)
> >
> > diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
> > index 6bec43d3f..2aa828674 100644
> > --- a/lib/conntrack-private.h
> > +++ b/lib/conntrack-private.h
> > @@ -83,7 +83,22 @@ struct alg_exp_node {
> >  bool nat_rpl_dst;
> >  };
> >
> > +enum ct_alg_ctl_type {
> > +CT_ALG_CTL_NONE,
> > +CT_ALG_CTL_FTP,
> > +CT_ALG_CTL_TFTP,
> > +/* SIP is not enabled through Openflow and presently only used as
> > + * an example of an alg that allows a wildcard src ip. */
> > +CT_ALG_CTL_SIP,
> > +CT_ALG_CTL_MAX,
>
> I don't see this CT_ALG_CTL_MAX used afterward in this commit?
> It is introduced by this change.
>
> > +};
> > +
> >  #define CONN_FLAG_NAT_MASK 0xf
> > +#define CONN_FLAG_CTL_FTP  (CT_ALG_CTL_FTP  << 4)
> > +#define CONN_FLAG_CTL_TFTP (CT_ALG_CTL_TFTP << 4)
> > +#define CONN_FLAG_CTL_SIP  (CT_ALG_CTL_SIP  << 4)
> > +/* currently only 3 algs supported */
> > +#define CONN_FLAG_ALG_MASK 0x70
>
> The enum values are not flags.
> I guess they should instead be defined as
>
> #define CONN_FLAG_CTL_FTP (UINT64_C(1) << (4 + CT_ALG_CTL_FTP))
>
> In the code this should have no effect as SIP is not implemented, so only ALG 
> were 1 and 2, which are valid flags.
>
> >
> >  struct conn_dir {
> >  struct cmap_node cm_node;
> > @@ -99,7 +114,6 @@ struct conn {
> >  struct conn_key parent_key; /* Only used for orig_tuple support. */
> >  struct ovs_list exp_node;
> >  uint64_t conn_flags;
> > -char *alg;
> >
> >  /* Mutable data. */
> >  struct ovs_mutex lock; /* Guards all mutable fields. */
> > diff --git a/lib/conntrack.c b/lib/conntrack.c
> > index a22252a63..fffc617fb 100644
> > --- a/lib/conntrack.c
> > +++ b/lib/conntrack.c
> > @@ -69,15 +69,6 @@ enum ct_alg_mode {
> >  CT_TFTP_MODE,
> >  };
> >
> > -enum ct_alg_ctl_type {
> > -CT_ALG_CTL_NONE,
> > -CT_ALG_CTL_FTP,
> > -CT_ALG_CTL_TFTP,
> > -/* SIP is not enabled through Openflow and presently only used as
> > - * an example of an alg that allows a wildcard src ip. */
> > -CT_ALG_CTL_SIP,
> > -};
> > -
> >  struct zone_limit {
> >  struct hmap_node node;
> >  struct conntrack_zone_limit czl;
> > @@ -153,30 +144,145 @@ static struct ct_l4_proto *l4_protos[] = {
> >  };
> >
> >  static void
> > -handle_ftp_ctl(struct conntrack *ct, const struct conn_lookup_ctx *ctx,
> > -   struct dp_packet *pkt, struct conn *ec, long long now,
> > -   enum ftp_ctl_pkt ftp_ctl, bool nat);
> > -
> > +handle_ftp_ctl_other(struct conntrack *ct, const struct
> > conn_lookup_ctx *ctx,
> > + struct dp_packet *pkt, struct conn *ec, long long
> > now,
> > + bool nat);
> > +static void
> > +handle_ftp_ctl_interest(struct conntrack *ct, const struct
> > conn_lookup_ctx *ctx,
> > +struct dp_packet *pkt, struct conn *ec, long
> > long

Re: [ovs-dev] [PATCH 1/4] conntrack: remove nat_conn

Gaëtan Rivet  于2021年3月3日周三 上午1:46写道：
>
> On Sat, Feb 27, 2021, at 09:49, 贺鹏 wrote:
> > Hi,
> >
> > Thanks for the detailed reviews.
> > These patches are like RFC to see if the work is interesting enough.
> >
> > Aaron Conole  于2021年2月25日周四 上午1:37写道：
> > >
> > > "hepeng.0320"  writes:
> > >
> > > > From: hepeng 
> > > >
> > > > Currently, when doing NAT, the userspace conntrack will use an extra
> > > > conn for the two directions in a flow. However, each conn has actually
> > > > the two keys for both orig and rev flow directions. This patch
> > > > introduces a conn_dir member in the conn and it consists of both rev and
> > > > orig cmap_node for hash lookup. This saves extra allocation for
> > > > nat_conn, and makes userspace code much cleaner.
> > >
> > > If I'm understanding this correctly, you still re-insert the conn into
> > > the conn list?
> >
> >
> > Yes. This is to insert the rev key into the cmap also. So for the nat 
> > traffic,
> > both orig and rev key are in the cmap.
> >
> > >
> > > > We also introduces a conn_flags member to reduce the memory footprint 
> > > > of a
> > > > conn.
> > >
> > > We should split this out to a separate patch.
> > >
> > > > Signed-off-by: Peng He 
> > > > ---
> > > >  lib/conntrack-private.h |  20 +--
> > > >  lib/conntrack.c | 264 
> > > >  2 files changed, 121 insertions(+), 163 deletions(-)
> > > >
> > > > diff --git a/lib/conntrack-private.h b/lib/conntrack-private.h
> > > > index 789af82ff..6bec43d3f 100644
> > > > --- a/lib/conntrack-private.h
> > > > +++ b/lib/conntrack-private.h
> > > > @@ -83,21 +83,23 @@ struct alg_exp_node {
> > > >  bool nat_rpl_dst;
> > > >  };
> > > >
> > > > -enum OVS_PACKED_ENUM ct_conn_type {
> > > > -CT_CONN_TYPE_DEFAULT,
> > > > -CT_CONN_TYPE_UN_NAT,
> > > > +#define CONN_FLAG_NAT_MASK 0xf
> > > > +
> > > > +struct conn_dir {
> > > > +struct cmap_node cm_node;
> > > > +bool orig;
> > >
> > > This naming is confusing.  We can have 'conn->orig->orig'.  Consider
> > > renaming one of these fields to distinguish what is going on.  I would
> > > prefer seeing 'fwd' used (since it's the forward direction).
> >
> > Yes, agree.
> >
> > >
> > > >  };
> > > >
> > > >  struct conn {
> > > >  /* Immutable data. */
> > > >  struct conn_key key;
> > > > +struct conn_dir orig;
> > > >  struct conn_key rev_key;
> > > > +struct conn_dir rev;
> > >
> > > I think this might be better if setup like:
> > > +enum ct_direction {
> > > +CT_DIRECTION_FWD,
> > > +CT_DIRECTION_REV,
> > > +CT_DIRECTIONS
> > > +}
> > > +
> > > +struct conn_data {
> > > +struct conn_key key;
> > > +struct cmap_node cm_node;
> > >  };
> > >
> > >  struct conn {
> > > -struct conn_key key;
> > > -struct conn_key rev_key;
> > > +struct conn_data cn_data[CT_DIRECTIONS];
> > > ...
> > >
> > > Then in the code, we can always get orig and rev information:
> > >
> > >   conn->cn_data[CT_DIRECTION_FWD]
> > >   conn->cn_data[CT_DIRECTION_REV]
> > >
> > > Did I miss something?
> >
> > Since both origin and rev are in the cmap, when you lookup the hash table,
> > you get a pointer to the  'middle data structure', in the patch, it is
> > the conn_dir.
> >
> > However, you are not sure what you get is the origin dir or the rev dir. 
> > The key
> > is to use a variable stored in the conn_dir, like 'fwd' or 'orig'.
> > Then you know the dir,
> > by knowing the dir, you know the offset between your pointer and the 
> > pointer to
> > the conn, just like the belowing code:
> >
> > static inline struct conn * conn_from_conndir(struct conn_dir *conndir) {
> > return conndir->orig ? CONTAINER_OF(conndir, struct conn, orig) : \
> >CONTAINER_OF(conndir, struct conn, rev);
> > }
> >
> > In your case:
> >
> > static inline struct conn * conn_from_conndir(struct conn_dir *conndir) {
> > return conndir->orig ? CONTAINER_OF(conndir, struct conn, cn_data[FWD]) 
> > : \
> >CONTAINER_OF(conndir, struct conn, cn_data[REV]);
> > }
> >
> > In the following patches I've changed the code of conn_dir as a part
> > of conn_key
>
> I haven't seen it in the following patches, I see in the last commit still 
> struct conn_dir on its own, and not as part of conn_key. Maybe I 
> misunderstood what you meant?
>
> Also, logically I'd consider the conn_key as part of the conn_dir, not the 
> inverse. I would expect conn_key to contain the whole key and only the key of 
> the conn to compute its hash, no other elements such as a dir mark or a cmap 
> node.
>
> would it make sense to have something like
>
> struct conn_dir {
> enum ct_direction dir;
> struct conn_key key;
> struct cmap_node cm_node;
> };
>
> struct conn {
> [...]
> struct conn_dir dir[CT_DIRECTIONS];
> };
>
> So the same as what Aaron proposed, but with 'conn_data' renamed as 
> 'conn_dir' and the enum ct_directions used to mark each conn_dir.

Re: [ovs-dev] conntrack: code refactoring userspace ct.

Hi,

I've read all your suggestions and appreciate that.

Just have been through a very busy week, and probably will have a
very busy month, so the work and reply might be slow.

Will do a rebase in the next version.

I've seen some of your patches related to CT. do you have some plans
for CT optimization? is it a part of plans related to CT offload?


Gaëtan Rivet  于2021年3月2日周二 下午10:47写道：
>
> On Sun, Nov 29, 2020, at 03:32, hepeng.0320 wrote:
> > This series of patch refactor the code of userspace CT, which makes
> > the code more simple and more structured. We mainly refactor the code
> > path for NAT and ALG, using conn_flags instead of bool value and use a
> > light weight lock to reduce the size of the conn. The size of conn has
> > been reduced from 312 to 240 bytes.
> >
> >
>
> Hello,
>
> This series is interesting, but it needs some work.
>
> Commits should be divided into smaller elements (as Aaron pointed out), and 
> some changes are
> barely described, or bunched with other changes without enough justification.
>
> Could you rebase and rework the series to have a more organized patchset?
> That would make the review easier.
>
> Regards,
> --
> Gaetan
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev



-- 
hepeng
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3] ofproto-dpif-xlate: fix zone set from non-frozen-metadata fields

Mark Gray  于2021年3月5日周五 下午7:54写道：
>
> On 27/02/2021 09:34, Peng He wrote:
> > CT zone could be set from a field that is not included in frozen
> > metadata. Consider the example rules which are typically seen in
> > OpenStack security group rules:
> >
> > priority=100,in_port=1,tcp,ct_state=-trk,action=ct(zone=5,table=0)
> > priority=100,in_port=1,tcp,ct_state=+trk,action=ct(commit,zone=NXM_NX_CT_ZONE[]),2
> >
> > The zone is set from the first rule's ct action. These two rules will
> > generate two megaflows: the first one uses zone=5 to query the CT module,
> > the second one sets the zone-id from the first megaflow and commit to CT.
> >
> > The current implementation will generate a megaflow that does not use
> > ct_zone=5 as a match, but directly commit into the ct using zone=5, as zone 
> > is
> > set by an Imm not a field.
> >
> > Consider a situation that one changes the zone id (for example to 15)
> > in the first rule, however, still keep the second rule unchanged. During
> > this change, there is traffic hitting the two generated megaflows, the
> > revaldiator would revalidate all megaflows, however, the revalidator will
> > not change the second megaflow, because zone=5 is recorded in the
> > megaflow, so the xlate will still translate the commit action into zone=5,
> > and the new traffic will still commit to CT as zone=5, not zone=15,
> > resulting in taffic drops and other issues.
> >
> > Just like OVS set-field convention, if a field X is set by Y
> > (Y is a variable not an Imm), we should also mask Y as a match
> > in the generated megaflow. An exception is that if the zone-id is
> > set by the field that is included in the frozen state (i.e. regs) and this
> > upcall is a resume of a thawed xlate, the un-wildcarding can be skipped,
> > as the recirc_id is a hash of the values in these fields, and it will change
> > following the changes of these fields. When the recirc_id changes,
> > all megaflows with the old recirc id will be invalid later.
>
> This looks good to me and all the unit-tests pass.
>
> There is some trailing whitespace. You can run
> "./utilities/checkpatch.py" when submitting to catch them before the
> 0-day robot does. Its a bit of a nit and I don't know if this won't get
> committed because of it. That's a decision for a maintainer.

thanks for your kind suggestion
.
better send a new version, but before that maybe there are
some more suggestions from maintainers, and it's better
to wait and then resend.

> >
> > Fixes: 07659514c3 ("Add support for connection tracking.")
> > Reported-by: Sai Su 
> > Signed-off-by: Peng He 
> > ---
> >  include/openvswitch/meta-flow.h |  1 +
> >  lib/meta-flow.c | 13 ++
> >  ofproto/ofproto-dpif-xlate.c| 12 +
> >  tests/system-traffic.at | 45 +
> >  4 files changed, 71 insertions(+)
> >
> > diff --git a/include/openvswitch/meta-flow.h 
> > b/include/openvswitch/meta-flow.h
> > index 95e52e358..045dce8f5 100644
> > --- a/include/openvswitch/meta-flow.h
> > +++ b/include/openvswitch/meta-flow.h
> > @@ -2305,6 +2305,7 @@ void mf_set_flow_value_masked(const struct mf_field *,
> >const union mf_value *mask,
> >struct flow *);
> >  bool mf_is_tun_metadata(const struct mf_field *);
> > +bool mf_is_frozen_metadata(const struct mf_field *);
> >  bool mf_is_pipeline_field(const struct mf_field *);
> >  bool mf_is_set(const struct mf_field *, const struct flow *);
> >  void mf_mask_field(const struct mf_field *, struct flow_wildcards *);
> > diff --git a/lib/meta-flow.c b/lib/meta-flow.c
> > index c808d205d..e03cd8d0c 100644
> > --- a/lib/meta-flow.c
> > +++ b/lib/meta-flow.c
> > @@ -1788,6 +1788,19 @@ mf_is_tun_metadata(const struct mf_field *mf)
> > mf->id < MFF_TUN_METADATA0 + TUN_METADATA_NUM_OPTS;
> >  }
> >
> > +bool
> > +mf_is_frozen_metadata(const struct mf_field *mf)
> > +{
> > +if (mf->id >= MFF_TUN_ID && mf->id <= MFF_IN_PORT_OXM) {
> > +return true;
> > +}
> > +
> > +if (mf->id >= MFF_REG0 && mf->id < MFF_ETH_SRC) {
> > +return true;
> > +}
> > +return false;
> > +}
> > +
> >  bool
> >  mf_is_pipeline_field(const struct mf_field *mf)
> >  {
> > diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> > index 7108c8a30..14d00db1e 100644
> > --- a/ofproto/ofproto-dpif-xlate.c
> > +++ b/ofproto/ofproto-dpif-xlate.c
> > @@ -6195,6 +6195,18 @@ compose_conntrack_action(struct xlate_ctx *ctx, 
> > struct ofpact_conntrack *ofc,
> >
> >  if (ofc->zone_src.field) {
> >  zone = mf_get_subfield(>zone_src, >xin->flow);
> > +if (ctx->xin->frozen_state) {
> > +/* If the upcall is a resume of a recirculation, we only need 
> > to
> > + * unwildcard the fields that are not in the frozen_metadata, 
> > as
> > + * when the rules update, OVS will generate a new recirc_id,
> > + * which

[ovs-dev] [PATCH ovn] ovn-northd-ddlog: Fix startup race updating nb_cfg_timestamp.

The DDlog code used by ovn-northd-ddlog relies on always having a value
(exactly one of them) in the NbCfgTimestamp relation.  The C code
inserts and updates this value.  It's supposed to insert the initial
value the first time that it processes any update from the northbound
database.

However, it actually only did this if the first update it processed was
from the northbound database--if it was from the southbound database,
it forgot to do so even when the first northbound update showed up.
The DDlog code would then go kind of berzerk creating and deleting the
NB_Global record until something actually happened to increment nb_cfg
(such as "ovn-nbctl --wait=sb sync").

This fixes the problem by only recording that a value was inserted to
NbCfgTimestamp when the first update from the northbound database is
received, same as originally intended.

Signed-off-by: Ben Pfaff 
---
 northd/ovn-northd-ddlog.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/northd/ovn-northd-ddlog.c b/northd/ovn-northd-ddlog.c
index aa0ea73e401d..f373bb129a89 100644
--- a/northd/ovn-northd-ddlog.c
+++ b/northd/ovn-northd-ddlog.c
@@ -445,8 +445,10 @@ northd_parse_updates(struct northd_ctx *ctx, struct 
ovs_list *updates)
 if (ddlog_commit(ctx->ddlog)) {
 goto error;
 }
-old_nb_cfg = new_nb_cfg;
-old_nb_cfg_timestamp = new_nb_cfg_timestamp;
+if (ctx->has_timestamp_columns) {
+old_nb_cfg = new_nb_cfg;
+old_nb_cfg_timestamp = new_nb_cfg_timestamp;
+}
 
 /* This update may have implications for the other side, so
  * immediately wake to check for more changes to be applied. */
-- 
2.29.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn] ovn-sbctl: Sort "dump-flows" output by actions if all else fails.

The Logical_Flow table can have rows that differ only in actions.  Such
rows would have identical matches and priorities and be in the same
datapath.  At first glance, any these rows would be a bug, since they'd
match the same packets and there'd be no way to choose between them.

In practice, though, actions can have prerequisites.  If two different
flows have mutually exclusive prerequisites, the otherwise identical
flows are actually valid.

This comes up in practice in DNS response flows.  There are two of
them, in the same table and pipeline and with the same match expression
"udp.dst == 53 && reg0[4]", but one of them includes the action
"ip4.src <-> ip4.dst;" and the other "ip6.src <-> ip6.dst;".  Because
the first has an IPv4 prerequisite and the latter an IPv6 prerequisite,
their matches don't really overlap.

Anyway, regardless of whether they're valid, it's still good to sort
consistently, so that's all a digression.

Signed-off-by: Ben Pfaff 
---
 utilities/ovn-sbctl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/utilities/ovn-sbctl.c b/utilities/ovn-sbctl.c
index c38e8ec3bdbd..94e33d2bdfe6 100644
--- a/utilities/ovn-sbctl.c
+++ b/utilities/ovn-sbctl.c
@@ -754,13 +754,14 @@ sbctl_lflow_cmp(const void *a_, const void *b_)
 
 int a_pipeline = pipeline_encode(a->pipeline);
 int b_pipeline = pipeline_encode(b->pipeline);
-return (a_pipeline > b_pipeline ? 1
+cmp = (a_pipeline > b_pipeline ? 1
 : a_pipeline < b_pipeline ? -1
 : a->table_id > b->table_id ? 1
 : a->table_id < b->table_id ? -1
 : a->priority > b->priority ? -1
 : a->priority < b->priority ? 1
 : strcmp(a->match, b->match));
+return cmp ? cmp : strcmp(a->actions, b->actions);
 }
 
 static char *
-- 
2.29.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn 1/2] tests: Fix bad syntax in "test" for waiting.

'test "$a"' is always true if $abc is nonempty.
Spaces are necessary to test for equality, e.g. 'test $a = $b'.

Signed-off-by: Ben Pfaff 
---
 tests/ovn-controller.at | 4 ++--
 tests/ovn.at| 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
index 2cd3e261f2eb..300872aa1852 100644
--- a/tests/ovn-controller.at
+++ b/tests/ovn-controller.at
@@ -165,11 +165,11 @@ OVS_WAIT_UNTIL([check_datapath_type foo])
 # Change the br-int's datapath type to bar.
 # It should be reset to foo since ovn-bridge-datapath-type is configured.
 ovs-vsctl set Bridge br-int datapath-type=bar
-OVS_WAIT_UNTIL([test foo=`ovs-vsctl get Bridge br-int datapath-type`])
+OVS_WAIT_UNTIL([test foo = `ovs-vsctl get Bridge br-int datapath-type`])
 OVS_WAIT_UNTIL([check_datapath_type foo])
 
 ovs-vsctl set Open_vSwitch . external_ids:ovn-bridge-datapath-type=foobar
-OVS_WAIT_UNTIL([test foobar=`ovs-vsctl get Bridge br-int datapath-type`])
+OVS_WAIT_UNTIL([test foobar = `ovs-vsctl get Bridge br-int datapath-type`])
 OVS_WAIT_UNTIL([check_datapath_type foobar])
 
 expected_iface_types=$(ovs-vsctl get Open_vSwitch . iface_types | tr -d '[[]] 
""')
diff --git a/tests/ovn.at b/tests/ovn.at
index 39edb6b85526..87579e59d7a5 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -11084,7 +11084,7 @@ 
sent_garp="0101020381020806000108000604000101010203c0a80
 
 OVN_CHECK_PACKETS([hv1/vif1-tx.pcap], [foo1.expout])
 # Wait until we receive atleast 1 packet
-OVS_WAIT_UNTIL([test 1=`$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" 
hv2/br-ex_n2-tx.pcap | wc -l`])
+OVS_WAIT_UNTIL([test 1 = `$PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" 
hv2/br-ex_n2-tx.pcap | wc -l`])
 $PYTHON "$ovs_srcdir/utilities/ovs-pcap.in" hv2/br-ex_n2-tx.pcap | head -1 > 
packets
 echo $sent_garp > expout
 AT_CHECK([cat packets], [0], [expout])
-- 
2.29.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn 2/2] tests: Add missing sync in "controller I-P handling when lrp added last".

Signed-off-by: Ben Pfaff 
---
 tests/ovn.at | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index 87579e59d7a5..b751d6db2e4f 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -22235,7 +22235,8 @@ ovn-nbctl lsp-set-addresses sw0-lr0 00:00:00:00:00:01
 ovn-nbctl lr-add lr0
 ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:00:01 192.168.1.1/24 
aef0:0:0:0:0:0:0:1/64
 
-OVS_WAIT_UNTIL([test x$(ovn-nbctl lsp-get-up sw0-p1) = xup])
+wait_for_ports_up sw0-p1
+ovn-nbctl --wait=sb sync
 
 sw0_dpkey=$(ovn-sbctl  --bare --columns tunnel_key list datapath_binding sw0)
 lr0_dpkey=$(ovn-sbctl  --bare --columns tunnel_key list datapath_binding lr0)
-- 
2.29.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn] binding: Fix potential NULL dereference of lbinding.

2021-03-05 Thread Dumitru Ceara

There is a valid code path that can lead to 'lbinding' being NULL in
handle_deleted_vif_lport().  Make sure we check for it.

Found by code inspection.

Fixes: 68cf9fdceba8 ("binding: Fix container port removal from local bindings.")
Signed-off-by: Dumitru Ceara 
---
 controller/binding.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/controller/binding.c b/controller/binding.c
index 2b19fd0..4e6c756 100644
--- a/controller/binding.c
+++ b/controller/binding.c
@@ -2153,10 +2153,12 @@ handle_deleted_vif_lport(const struct 
sbrec_port_binding *pb,
 /* If the container port is removed we should also remove it from
  * its parent's children set.
  */
-if (lbinding->parent) {
-local_binding_delete_child(lbinding->parent, lbinding);
+if (lbinding) {
+if (lbinding->parent) {
+local_binding_delete_child(lbinding->parent, lbinding);
+}
+local_binding_destroy(lbinding);
 }
-local_binding_destroy(lbinding);
 }
 
 handle_deleted_lport(pb, b_ctx_in, b_ctx_out);
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] RFC: netdev-afxdp: Support for XDP metadata HW hints.

2021-03-05 Thread William Tu

Thanks Björn and Jesper for feedbacks.

On Fri, Mar 5, 2021 at 7:34 AM Björn Töpel  wrote:
>
> On 2021-03-05 16:07, Jesper Dangaard Brouer wrote:
> > On Fri, 5 Mar 2021 14:56:02 +0100
> > Björn Töpel  wrote:
> >
> >> On 2021-03-05 11:13, Jesper Dangaard Brouer wrote:
>
>
> [...]
> >>
> >> You'd like to use the length of metadata to express if it's available or
> >> not. I'm stating that we don't need that for AF_XDP. In XDP if we're
> >> accessing the field, we need to validate access due to the verifier. But
> >> it's not to know if it's actually there. When we attach a certain XDP to
> >> a device, we can query what offload it supports, and the the application
> >> knows that meta data is structured in a certain way.
> >>
> >> Some application has simple metadata:
> >> struct tstamp { u64 rx; };
> >>
> >> That application will unconditionally look at (struct tstamp
> >> *)(pkt-sizeof(struct tstamp)->rx;.
> >
> > I imagine that NIC hardware will/can provide packets with different
> > metadata.  Your example with timestamps are a good example, as some
> > drivers/HW only support timestamping PTP packets.  Thus, I don't want
> > to provide metadata info on tstamp if HW didn't provide it (and I also
> > don't want to spend CPU-cycles on clearing the u64 field with zero).
> > Another example is vlan IDs is sometimes provided by hardware.
> >
> > Thus, on a per packet basis the metadata can change.
> >
Is this future work?

Looking at Saeed's patch, I thought we query the netdev through
BTF netlink API, and we get a BTF ID and C-struct per-netdev.
And this will be done when OVS control plane, when a user attaches
a device to its bridge, we query its metadata structure.

>
> Yes, I agree. For a certain set of packets, there can be different
> metadata per packet that, as you pointed out, can be dispatched do
> different handlers.
>
> >> Some might be more flexible and have:
> >> struct metadata { char storage[64]; int btf_id; };
> >
> > I really like the idea of placing the btf_id as the last entry, that
> > way we know its location.  I can get the btf_id via packet minus-offset
> > sizeof(int).
> >
[...]

> >
> > BTF sort-of make the struct names identifiers and API.  And gives *BPF*
> > programs the ability to adjust offsets, when loading the BPF-prog into
> > the kernel.
> >
> > Driver-1 choose:
> >
> >   struct xdp_md_desc {
> >   uint32_t flow_mark;
> >   uint32_t hash32;
> >   } __attribute__((preserve_access_index));
> >
> > Driver-2 choose:
> >
> >   struct xdp_md_desc {
> >   uint32_t hash32;
> >   uint32_t flow_mark;
> >   } __attribute__((preserve_access_index));
> >
> > The BPF code can access the members, and kernel will remap-struct
> > access, and internally in the kernel create two different BPF-programs
> > with adjusted offsets in the byte-code.
> >
> > How do we handle this in the OVS userspace C-code?
> >

BTW, do we assume all vendors use the same field name
for same purpose? For example:
When OVS query Intel or mlx, if rxhash is available, they should
use the same name as "hash32".
Otherwise, I don't know how to associate it into OVS code logic.

If that's the case, I can do (in userspace C code, not BPF program)
1) When attaching a device, query its MD struct and MD's size
using BTF info.
2) From the BTF info returned, check if string "hash32" exists.
if yes, then calculate its offset from the beginning of the MD.
3) OVS cat set it by calling "dp_packet_set_rss(packet, md[rxhash_offset]);"

>
> Longer-term, I think extending the linkers to support struct member
> relocations would be a good thing! I.e. simply support
> __attribute__((preserve_access_index)) in userland. That would enable
> very powerful userland/kernel interface, like what we're discussing now.
>
> But I wonder if it's too big of an initial undertaking to do that as a
> first step? Maybe non-relocatable struct could be a start.
>
Having relocatable struct would be great. So I can just use 'md->hash32',
instead of offset like above.

> I definitely agree that the goal should be relocatable structs!
>
> > Maybe/hopefully you have a better idea than mine, which is simply to
> > compile two C-programs (or program for-each know BTF-id) and dispatch
> > based on the BTF-id.
> >

Thanks
William
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn v2] northd: Fix the missing force_snat_for_lb flows when router_ip is configured.

On Fri, Mar 05, 2021 at 02:53:47PM +0530, num...@ovn.org wrote:
> From: Numan Siddique 
> 
> The commit c6e21a23bd8 which supported the option 'lb_force_snat_ip=router_ip'
> on a gateway router, missed out on
>   - updating the flows in 'lr_in_dnat' to set 'flags.force_snat_for_lb = 1'.
>   - removing the flow to drop if ip.dst == router port IP in 'lr_in_ip_input'
> stage.
> 
> This patch fixes these issue and adds a system test to cover the
> hairpin load balancer traffic for the gateway routers.
> 
> Fixes: c6e21a23bd8("northd: Provide the Gateway router option 
> 'lb_force_snat_ip' to take router port ips.")
> 
> CC: Ben Pfaff 
> Signed-off-by: Numan Siddique 

Thanks!

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] rhel: increase timeout of ovsdb-server.service to 300s

On Fri, Mar 05, 2021 at 12:47:05PM +0100, Timothy Redaelli wrote:
> In some scenarios starting ovsdb-server takes more than 90 seconds and
> so ovsdb-server.service is marked as failed.
> 
> Signed-off-by: Timothy Redaelli 

I'm surprised it takes so long.  What leads to that?  Is it a bug in
ovsdb-server?
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH ovn] northd: Remove the usage of 'ct.inv' in logical flows.

2021-03-05 Thread Numan Siddique

On Fri, Mar 5, 2021 at 11:53 AM Han Zhou  wrote:
>
> On Thu, Mar 4, 2021 at 5:25 PM Numan Siddique  wrote:
> >
> > On Fri, Mar 5, 2021 at 4:22 AM Han Zhou  wrote:
> > >
> > > On Mon, Mar 1, 2021 at 5:40 AM Numan Siddique  wrote:
> > > >
> > > > On Fri, Feb 26, 2021 at 1:15 AM Han Zhou  wrote:
> > > > >
> > > > > On Thu, Feb 25, 2021 at 1:25 AM Numan Siddique 
> wrote:
> > > > > >
> > > > > > On Thu, Feb 25, 2021 at 1:12 PM Han Zhou  wrote:
> > > > > > >
> > > > > > > On Wed, Feb 24, 2021 at 5:27 AM  wrote:
> > > > > > > >
> > > > > > > > From: Numan Siddique 
> > > > > > > >
> > > > > > > > Presently we add 65535 priority lflows in the stages -
> > > > > > > > 'ls_in_acl' and 'ls_out_acl' to drop packets which
> > > > > > > > match on 'ct.inv'.
> > > > > > > >
> > > > > > > > As per the 'ovs-fields' man page, this
> > > > > > > > ct state field can be used to identify problems such as:
> > > > > > > >  •  L3/L4 protocol handler is not loaded/unavailable.
> > > > > > > >
> > > > > > > >  •  L3/L4 protocol handler determines that the packet is
> > > > > > > > malformed.
> > > > > > > >
> > > > > > > >  •  Packets are unexpected length for protocol.
> > > > > > > >
> > > > > > > > This patch removes the usage of this field for the following
> > > > > > > > reasons:
> > > > > > > >
> > > > > > > >  • Some of the smart NICs which support offloading datapath
> > > > > > > >flows don't support this field.
> > > > > > >
> > > > > > > What do you mean by "don't support this field"? Do you mean the
> NIC
> > > > > > > offloading supports connection tracking, but cannot detect if a
> > > packet
> > > > > is
> > > > > > > invalid and always populate the ct.inv as 0?
> > > > > >
> > > > > > I think so. From what I understand, the kernel conntrack feature
> is
> > > used
> > > > > > for the actual connection tracking. So NIC can't tell if the
> packet is
> > > > > > invalid or not
> > > > > > (say due to out-of-window tcp errors).
> > > > > >
> > > > > I know some NICs support CT offloading and some doesn't.
> > > > > So here what you are referring are the NICs that doesn't support CT
> > > > > offloading, which falls back to kernel datapath when CT is used, is
> it?
> > > If
> > > > > this is the case, then even without ct.inv it still couldn't support
> > > > > ct.est, etc. right?
> > > > > Or, do you mean this is specifically for NICs that support CT
> offloading
> > > > > but not ct.inv, i.e. it can do regular conntrack in NIC but just
> can't
> > > > > identify out-of-window packets, and that's why it supports ct.est
> but
> > > not
> > > > > ct.inv?
> > > > > I am still quite confused. Could clarify a little more, which types
> of
> > > NICs
> > > > > would benefit from this, and how?
> > > >
> > > > I'm not sure if I can explain the issue well.  Can you please look
> > > > into this bugzilla -
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=1921946
> > > > We can discuss further if you have further questions or comments.
> > > >
> > > Unfortunately this one seems to require access permission.
> >
> > Ok. Let me try to share in some other way.
> >
> > >
> > > >
> > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >  • A recent commit in kernel ovs datapath sets the committed
> > > > > > > >connection tracking entry to be liberal for out-of-window
> > > > > > > >tcp packets (nf_ct_set_tcp_be_liberal()).  Such TCP
> > > > > > > >packets will not be marked as invalid.
> > > > > > > >
> > > > > > >
> > > > > > > Could you share a link to this commit?
> > > > > >
> > > > > > Sure.
> > > > > >
> > > > >
> > >
> https://kernel.googlesource.com/pub/scm/linux/kernel/git/netdev/net-next/+/e2ef5203c817a60bfb591343ffd851b6537370ff
> > > > > >
> > > > > Thanks for sharing. So OVS is not capable of detecting a
> out-of-window
> > > > > packet now. Could you explain more about the motivation? I couldn't
> get
> > > the
> > > > > full picture from commit message of that patch. Do you have a link
> that
> > > > > discusses more details?
> > > >
> > > > Let me share with you the patch which I first submitted to handle this
> > > issue.
> > > > During the review, @Florian Westphal suggested being liberal for
> > > out-of-window
> > > > packets to solve this issue.
> > > >
> > > > I think the patch discussions have enough information. Please let me
> know
> > > > if you have further questions or comments and we can discuss further.
> > > >
> > > > Initial approach taken by me -
> > > >
> > >
> https://patchwork.ozlabs.org/project/netdev/patch/20201006083355.121018-1-nusid...@redhat.com/
> > > > Final approach taken -
> > > >
> > >
> https://patchwork.ozlabs.org/project/netdev/patch/20201109072930.14048-1-nusid...@redhat.com/
> > > >
> > > Thanks for the pointers. Now I have a better context. It seems all these
> > > work was to deal with (optimize) the LB (stateful) + stateless ACL use
> > > cases.
> > > 1. we don't want to track packets coming from VIF (because there is no
> > > stateful ACL)
> > > 2. but

[ovs-dev] PMD auto load balance experimental tag

2021-03-05 Thread Kevin Traynor

Hi All,

PMD auto load balance (ALB) is an experimental feature since OVS 2.11.

In short, RxQs are reassigned to PMDs based on load during a reconfig.

However, there might not be a reconfig for a long time so ALB runs
periodically and checks PMD thread load and the load variance between
PMD threads. If load is high and it's estimated that it will be improved
by a certain amount after reassigning RxQs to PMDs, a reassignment is
triggered.

ALB calls the reassignment code similar to if there was a reconfig or if
the rebalance command is used. There are some ideas for how to improve
RxQ to PMD assignments in general, but that's a slightly different topic
and can be a later optimization if needed.

ALB is documented more here:
https://docs.openvswitch.org/en/latest/topics/dpdk/pmd/?highlight=rebalance#automatic-assignment-of-port-rx-queue-to-pmd-threads-experimental

I would like to progress it so the experimental tag on ALB can be
removed. There are a few things I think are needed:

- Need to restructure the code so that ALB reassignment dry run uses the
same code path as the reassignment code it will call

- ALB operation should be understandable through logs and stats
e.g. Should be able to confirm setting state/parameters in logs and see
if ALB is taking some action. Should be able to get more debug if needed
etc.

- Unit tests
Some patches here:
http://patchwork.ozlabs.org/project/openvswitch/patch/20210218104802.293740-2-ktray...@redhat.com/

- Iron out if the current ALB enable/disable user interface logging
based partly min resource requirements is right or we want to it to run
regardless
i.e.
https://mail.openvswitch.org/pipermail/ovs-dev/2021-February/380386.html

- Review/Improve operation documentation

- Further testing

Thoughts?

thanks,
Kevin.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH branch-2.11 0/2] Release patches for v2.11.7.

On 3/5/21 4:50 PM, Stokes, Ian wrote:
>> These patches are for the *final* stable release of OVS 2.11.
>> Includes a couple of a latest bug fixes and validated with the
>> final stable DPDK 18.11 release.
>>
>> Since 2.5 reached it's EOL, we're no longer supporting branches
>> below 2.12.  This release for 2.11 is mostly to close all the
>> remaining documentation gaps.
>>
>> 2.13 is our current LTS release and 2.15 is our latest stable.
>> And according to our release documentation, while 2.12 and 2.14
>> are not formally maintained we're providing releases for these
>> branches once in a while, however these branches may not include
>> all the fixes that LTS or latest stable does.
>>
>> There is still 'Prepare for 2.11.8' patch, but it's here just to
>> follow the release process and not break my scripts. :)
>>
>> Ilya Maximets (2):
>>   Set release date for 2.11.7.
>>   Prepare for 2.11.8.
>>
>>  NEWS | 6 +-
>>  configure.ac | 2 +-
>>  debian/changelog | 8 +++-
>>  3 files changed, 13 insertions(+), 3 deletions(-)
>>
>> --
>> 2.26.2
> 
> Acked for the series.
> 
> BR
> Ian Stokes
> 

Thanks!
Applied to branch-2.11 and tagged.

Best regards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH branch-2.11 0/2] Release patches for v2.11.7.

2021-03-05 Thread Stokes, Ian

> These patches are for the *final* stable release of OVS 2.11.
> Includes a couple of a latest bug fixes and validated with the
> final stable DPDK 18.11 release.
> 
> Since 2.5 reached it's EOL, we're no longer supporting branches
> below 2.12.  This release for 2.11 is mostly to close all the
> remaining documentation gaps.
> 
> 2.13 is our current LTS release and 2.15 is our latest stable.
> And according to our release documentation, while 2.12 and 2.14
> are not formally maintained we're providing releases for these
> branches once in a while, however these branches may not include
> all the fixes that LTS or latest stable does.
> 
> There is still 'Prepare for 2.11.8' patch, but it's here just to
> follow the release process and not break my scripts. :)
> 
> Ilya Maximets (2):
>   Set release date for 2.11.7.
>   Prepare for 2.11.8.
> 
>  NEWS | 6 +-
>  configure.ac | 2 +-
>  debian/changelog | 8 +++-
>  3 files changed, 13 insertions(+), 3 deletions(-)
> 
> --
> 2.26.2

Acked for the series.

BR
Ian Stokes

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH branch-2.11 2/2] Prepare for 2.11.8.

Signed-off-by: Ilya Maximets 
---
 NEWS | 3 +++
 configure.ac | 2 +-
 debian/changelog | 6 ++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/NEWS b/NEWS
index afda9198b..b8510e01e 100644
--- a/NEWS
+++ b/NEWS
@@ -1,3 +1,6 @@
+v2.11.8 - xx xxx 
+-
+
 v2.11.7 - 05 Mar 2021
 -
- Bug fixes
diff --git a/configure.ac b/configure.ac
index c627448a7..107429d41 100644
--- a/configure.ac
+++ b/configure.ac
@@ -13,7 +13,7 @@
 # limitations under the License.
 
 AC_PREREQ(2.63)
-AC_INIT(openvswitch, 2.11.7, b...@openvswitch.org)
+AC_INIT(openvswitch, 2.11.8, b...@openvswitch.org)
 AC_CONFIG_SRCDIR([datapath/datapath.c])
 AC_CONFIG_MACRO_DIR([m4])
 AC_CONFIG_AUX_DIR([build-aux])
diff --git a/debian/changelog b/debian/changelog
index b3322f0a6..5b1d53da3 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+openvswitch (2.11.8-1) unstable; urgency=low
+   [ Open vSwitch team ]
+   * New upstream version
+
+ -- Open vSwitch team   Fri, 05 Mar 2021 16:27:47 +0100
+
 openvswitch (2.11.7-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH branch-2.11 0/2] Release patches for v2.11.7.

These patches are for the *final* stable release of OVS 2.11.
Includes a couple of a latest bug fixes and validated with the
final stable DPDK 18.11 release.

Since 2.5 reached it's EOL, we're no longer supporting branches
below 2.12.  This release for 2.11 is mostly to close all the
remaining documentation gaps.

2.13 is our current LTS release and 2.15 is our latest stable.
And according to our release documentation, while 2.12 and 2.14
are not formally maintained we're providing releases for these
branches once in a while, however these branches may not include
all the fixes that LTS or latest stable does.

There is still 'Prepare for 2.11.8' patch, but it's here just to
follow the release process and not break my scripts. :)

Ilya Maximets (2):
  Set release date for 2.11.7.
  Prepare for 2.11.8.

 NEWS | 6 +-
 configure.ac | 2 +-
 debian/changelog | 8 +++-
 3 files changed, 13 insertions(+), 3 deletions(-)

-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH branch-2.11 1/2] Set release date for 2.11.7.

Signed-off-by: Ilya Maximets 
---
 NEWS | 3 ++-
 debian/changelog | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/NEWS b/NEWS
index 11a7b40fc..afda9198b 100644
--- a/NEWS
+++ b/NEWS
@@ -1,5 +1,6 @@
-v2.11.7 - xx xxx 
+v2.11.7 - 05 Mar 2021
 -
+   - Bug fixes
- DPDK
  * OVS validated with DPDK 18.11.11. Due to this being the final release
in the DPDK 18.11 series it is recommended to be used.
diff --git a/debian/changelog b/debian/changelog
index d1e4505c4..b3322f0a6 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,7 +2,7 @@ openvswitch (2.11.7-1) unstable; urgency=low
[ Open vSwitch team ]
* New upstream version
 
- -- Open vSwitch team   Wed, 10 Feb 2021 16:07:06 +0100
+ -- Open vSwitch team   Fri, 05 Mar 2021 16:27:47 +0100
 
 openvswitch (2.11.6-1) unstable; urgency=low
[ Open vSwitch team ]
-- 
2.26.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] RFC: netdev-afxdp: Support for XDP metadata HW hints.

2021-03-05 Thread Björn Töpel


On 2021-03-05 16:07, Jesper Dangaard Brouer wrote:

On Fri, 5 Mar 2021 14:56:02 +0100
Björn Töpel  wrote:


On 2021-03-05 11:13, Jesper Dangaard Brouer wrote:



[...]


We also want this sizeof-offset to be dynamic.
And I hope that AF_XDP could provide this info, on the size of
data_meta area.  I need some help from Magnus and Bjørn here?

Look at the kernel-side code of AF_XDP/XSK I don't see this being
transferred? Help is this true? Can we add the info?
  


XDP sockets make sure that the meta data is propagated to userland.
However, the meta data *size* is not passed via the AF_XDP descriptors.

This was per-design; The metadata is a contract between the XDP program
and the XDP socket, and in the future maybe the driver as well.

So, we're simply stating that "your AF_XDP application should know if
there's meta data or not, and how it's structured".

I guess we could explore starting to use the options bits of struct
xdp_desc for this, but only if we're *really* seeing a need for it.

If the XDP program would like to pass it's length, it can do that via
the metadata area as well.

struct meta {
int len;


I'm not interested in having the len part of metadata.   The btf_id
will basically/indirectly give us the length.



Ok!



...
...
};




   

   index = addr >> FRAME_SHIFT;
   xpacket = >xpool.array[index];
   packet = >packet;
@@ -868,6 +881,12 @@ netdev_afxdp_rxq_recv(struct netdev_rxq *rxq_, struct 
dp_packet_batch *batch,
   OVS_XDP_HEADROOM);
   dp_packet_set_size(packet, len);
   
+/* FIXME: This should be done by detecting whether

+ * XDP MD is enabled or not. Ex:
+ * $ bpftool net xdp set dev enp2s0f0np0 md_btf on
+ */
+dp_packet_set_rss_hash(packet, md->hash32);


The general idea to make this code more dynamic is to let AF_XDP
provide the meta-data-len and then use the (proposed) btf_id to know
what "callback" C-code to call.
  


Hmm, again, here I might have a bit of a different view. Personally I
think it's better that all the "NIC/XDP/socket" does the negotiation in
the control/setup phase, and keep the fast-path lean. Instead of:

recv(void *data)
{
 len = get_meta_data_len(data);
 if (len) {
// check this, check that, if-else-if-else
 }
 ...
}

You'd like to use the length of metadata to express if it's available or
not. I'm stating that we don't need that for AF_XDP. In XDP if we're
accessing the field, we need to validate access due to the verifier. But
it's not to know if it's actually there. When we attach a certain XDP to
a device, we can query what offload it supports, and the the application
knows that meta data is structured in a certain way.

Some application has simple metadata:
struct tstamp { u64 rx; };

That application will unconditionally look at (struct tstamp
*)(pkt-sizeof(struct tstamp)->rx;.


I imagine that NIC hardware will/can provide packets with different
metadata.  Your example with timestamps are a good example, as some
drivers/HW only support timestamping PTP packets.  Thus, I don't want
to provide metadata info on tstamp if HW didn't provide it (and I also
don't want to spend CPU-cycles on clearing the u64 field with zero).
Another example is vlan IDs is sometimes provided by hardware.

Thus, on a per packet basis the metadata can change.



Yes, I agree. For a certain set of packets, there can be different
metadata per packet that, as you pointed out, can be dispatched do
different handlers.


Some might be more flexible and have:
struct metadata { char storage[64]; int btf_id; };


I really like the idea of placing the btf_id as the last entry, that
way we know its location.  I can get the btf_id via packet minus-offset
sizeof(int).



If you need an "availability check", then you can solve that.

int xdp_prog(struct xdp_md *ctx)
{
struct metadata *m = (void *)(long)ctx->data_meta;
void *data = (void *)(unsigned long)ctx->data;

  if ((m + 1) > data) {
ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(*));
if (ret < 0)
return XDP_ABORTED;
data = (void *)(unsigned long)ctx->data;
m = (void *)(long)ctx->data_meta;
if ((m + 1) > data)
return XDP_ABORTED;
m->btf_id = -1; // not valid
}
// ... bpf_redirect_map() to socket
}

Would you agree? Or maybe expand a bit why you need the length.


With the btf_id as the last entry, I basically don't need the length.



Ok! Good! :-)


The dispatcher function, will based on the btf_id know the size of the
struct.  The C-code will basically type-cast the struct at the pointer
offset and get access to the metadata at the known offsets.



Make sense, and inline with my view!





Using a btf_id make this independent of the NIC driver.


E.g. we don't need to dereference the NIC driver struct to find what

Re: [ovs-dev] [PATCH] RFC: netdev-afxdp: Support for XDP metadata HW hints.

2021-03-05 Thread Jesper Dangaard Brouer

On Fri, 5 Mar 2021 14:56:02 +0100
Björn Töpel  wrote:

> On 2021-03-05 11:13, Jesper Dangaard Brouer wrote:
> > 
> > Bjørn and Magnus please take a look at my questions inlined below.
> > 
> > On Thu,  4 Mar 2021 10:27:05 -0800
> > William Tu  wrote:
> >   
> >> One big problem of netdev-afxdp is that there is no metadata support
> >> from the hardware at all.  For example, OVS netdev-afxdp has to do rxhash,
> >> or TCP checksum in software, resulting in high performance overhead.
> >>
> >> A generic meta data type for XDP frame using BTF is proposed[1] and
> >> there is sample implementation[2][3].  This patch experiments enabling
> >> the XDP metadata, or called HW hints, and shows the potential performance
> >> improvement.  The patch uses only the rxhash value provided from HW,
> >> so avoiding at the calculation of hash at lib/dpif-netdev.c:
> >>  if (!dp_packet_rss_valid(execute->packet)) {
> >>  dp_packet_set_rss_hash(execute->packet,
> >> flow_hash_5tuple(execute->flow, 0));
> >>  }
> >>
> >> Using '$ ovs-appctl dpif-netdev/pmd-stats-show', the 'avg processing
> >> cycles per packet' drops from 402 to 272.  More details below
> >>
> >> Reference:
> >> --
> >> [1] https://www.kernel.org/doc/html/latest/bpf/btf.html
> >> [2] 
> >> https://netdevconf.info/0x14/pub/slides/54/[1]%20XDP%20meta%20data%20acceleration.pdf
> >> [3] 
> >> https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/log/?h=topic/xdp_metadata4
> >>
> >> Testbed:
> >> 
> >> Two Xeon E5-2620 v3 2.4GHz connected back-to-back using Mellanox
> >> ConnectX-6Dx 25GbE. Before starting OVS, enable the MD by:
> >> $ bpftool net xdp show
> >> xdp:
> >> enp2s0f0np0(4) md_btf_id(1) md_btf_enabled(0)
> >> enp2s0f1np1(5) md_btf_id(2) md_btf_enabled(0)
> >> $ bpftool net xdp set dev enp2s0f0np0 md_btf on
> >> $ bpftool net xdp
> >> xdp:
> >> enp2s0f0np0(4) md_btf_id(1) md_btf_enabled(1)
> >>
> >> Limitations/TODO:
> >> -
> >> 1. Support only AF_XDP native mode, not zero-copy mode.
> >> 2. Currently only three fields: vlan, hash, and flow_mark, and only receive
> >> side supports XDP metadata.
> >> 3. Control plane, how to enable and probe the structure, not upstream yet.
> >>
> >> OVS rxdrop without HW hints:
> >> ---
> >> Drop rate: 4.8Mpps
> >>
> >> pmd thread numa_id 0 core_id 3:
> >>packets received: 196592006
> >>packet recirculations: 0
> >>avg. datapath passes per packet: 1.00
> >>emc hits: 196592006
> >>smc hits: 0
> >>megaflow hits: 0
> >>avg. subtable lookups per megaflow hit: 0.00
> >>miss with success upcall: 0
> >>miss with failed upcall: 0
> >>avg. packets per output batch: 0.00
> >>idle cycles: 56009063835 (41.43%)
> >>processing cycles: 79164971931 (58.57%)
> >>avg cycles per packet: 687.59 (135174035766/196592006)
> >>avg processing cycles per packet: 402.69 (79164971931/196592006)
> >>
> >> pmd thread numa_id 0 core_id 3:
> >>Iterations:   339607649  (0.23 us/it)
> >>- Used TSC cycles: 188620512777  ( 99.9 % of total cycles)
> >>- idle iterations:330697002  ( 40.3 % of used cycles)
> >>- busy iterations:  8910647  ( 59.7 % of used cycles)
> >>Rx packets:   285140031  (3624 Kpps, 395 cycles/pkt)
> >>Datapath passes:  285140031  (1.00 passes/pkt)
> >>- EMC hits:   28513  (100.0 %)
> >>- SMC hits:   0  (  0.0 %)
> >>- Megaflow hits:  0  (  0.0 %, 0.00 subtbl lookups/hit)
> >>- Upcalls:0  (  0.0 %, 0.0 us/upcall)
> >>- Lost upcalls:   0  (  0.0 %)
> >>Tx packets:   0
> >>
> >> Perf report:
> >>17.56%  pmd-c03/id:11  ovs-vswitchd[.] netdev_afxdp_rxq_recv
> >>14.39%  pmd-c03/id:11  ovs-vswitchd[.] 
> >> dp_netdev_process_rxq_port
> >>14.17%  pmd-c03/id:11  ovs-vswitchd[.] pmd_thread_main
> >>10.86%  pmd-c03/id:11  [vdso]  [.] __vdso_clock_gettime
> >>10.19%  pmd-c03/id:11  ovs-vswitchd[.] pmd_perf_end_iteration
> >> 7.71%  pmd-c03/id:11  ovs-vswitchd[.] time_timespec__
> >> 5.64%  pmd-c03/id:11  ovs-vswitchd[.] time_usec
> >> 3.88%  pmd-c03/id:11  ovs-vswitchd[.] netdev_get_class
> >> 2.95%  pmd-c03/id:11  ovs-vswitchd[.] netdev_rxq_recv
> >> 2.78%  pmd-c03/id:11  libbpf.so.0.2.0 [.] xsk_socket__fd
> >> 2.74%  pmd-c03/id:11  ovs-vswitchd[.] pmd_perf_start_iteration
> >> 2.11%  pmd-c03/id:11  libc-2.27.so[.] __clock_gettime
> >> 1.32%  pmd-c03/id:11  ovs-vswitchd[.] xsk_socket__fd@plt
> >>
> >> OVS rxdrop with HW hints:
> >> -
> >> rxdrop rate: 4.73Mpps
> >>
> >> pmd thread numa_id 0 core_id 7:
> >>packets received: 13686880
> >>packet recirculations: 0
> >>avg. datapath passes per packet: 1.00
> >>emc hits:

Re: [ovs-dev] [PATCH] RFC: netdev-afxdp: Support for XDP metadata HW hints.

2021-03-05 Thread Björn Töpel


On 2021-03-05 11:13, Jesper Dangaard Brouer wrote:


Bjørn and Magnus please take a look at my questions inlined below.

On Thu,  4 Mar 2021 10:27:05 -0800
William Tu  wrote:


One big problem of netdev-afxdp is that there is no metadata support
from the hardware at all.  For example, OVS netdev-afxdp has to do rxhash,
or TCP checksum in software, resulting in high performance overhead.

A generic meta data type for XDP frame using BTF is proposed[1] and
there is sample implementation[2][3].  This patch experiments enabling
the XDP metadata, or called HW hints, and shows the potential performance
improvement.  The patch uses only the rxhash value provided from HW,
so avoiding at the calculation of hash at lib/dpif-netdev.c:
 if (!dp_packet_rss_valid(execute->packet)) {
 dp_packet_set_rss_hash(execute->packet,
flow_hash_5tuple(execute->flow, 0));
 }

Using '$ ovs-appctl dpif-netdev/pmd-stats-show', the 'avg processing
cycles per packet' drops from 402 to 272.  More details below

Reference:
--
[1] https://www.kernel.org/doc/html/latest/bpf/btf.html
[2] 
https://netdevconf.info/0x14/pub/slides/54/[1]%20XDP%20meta%20data%20acceleration.pdf
[3] 
https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/log/?h=topic/xdp_metadata4

Testbed:

Two Xeon E5-2620 v3 2.4GHz connected back-to-back using Mellanox
ConnectX-6Dx 25GbE. Before starting OVS, enable the MD by:
$ bpftool net xdp show
xdp:
enp2s0f0np0(4) md_btf_id(1) md_btf_enabled(0)
enp2s0f1np1(5) md_btf_id(2) md_btf_enabled(0)
$ bpftool net xdp set dev enp2s0f0np0 md_btf on
$ bpftool net xdp
xdp:
enp2s0f0np0(4) md_btf_id(1) md_btf_enabled(1)

Limitations/TODO:
-
1. Support only AF_XDP native mode, not zero-copy mode.
2. Currently only three fields: vlan, hash, and flow_mark, and only receive
side supports XDP metadata.
3. Control plane, how to enable and probe the structure, not upstream yet.

OVS rxdrop without HW hints:
---
Drop rate: 4.8Mpps

pmd thread numa_id 0 core_id 3:
   packets received: 196592006
   packet recirculations: 0
   avg. datapath passes per packet: 1.00
   emc hits: 196592006
   smc hits: 0
   megaflow hits: 0
   avg. subtable lookups per megaflow hit: 0.00
   miss with success upcall: 0
   miss with failed upcall: 0
   avg. packets per output batch: 0.00
   idle cycles: 56009063835 (41.43%)
   processing cycles: 79164971931 (58.57%)
   avg cycles per packet: 687.59 (135174035766/196592006)
   avg processing cycles per packet: 402.69 (79164971931/196592006)

pmd thread numa_id 0 core_id 3:
   Iterations:   339607649  (0.23 us/it)
   - Used TSC cycles: 188620512777  ( 99.9 % of total cycles)
   - idle iterations:330697002  ( 40.3 % of used cycles)
   - busy iterations:  8910647  ( 59.7 % of used cycles)
   Rx packets:   285140031  (3624 Kpps, 395 cycles/pkt)
   Datapath passes:  285140031  (1.00 passes/pkt)
   - EMC hits:   28513  (100.0 %)
   - SMC hits:   0  (  0.0 %)
   - Megaflow hits:  0  (  0.0 %, 0.00 subtbl lookups/hit)
   - Upcalls:0  (  0.0 %, 0.0 us/upcall)
   - Lost upcalls:   0  (  0.0 %)
   Tx packets:   0

Perf report:
   17.56%  pmd-c03/id:11  ovs-vswitchd[.] netdev_afxdp_rxq_recv
   14.39%  pmd-c03/id:11  ovs-vswitchd[.] dp_netdev_process_rxq_port
   14.17%  pmd-c03/id:11  ovs-vswitchd[.] pmd_thread_main
   10.86%  pmd-c03/id:11  [vdso]  [.] __vdso_clock_gettime
   10.19%  pmd-c03/id:11  ovs-vswitchd[.] pmd_perf_end_iteration
7.71%  pmd-c03/id:11  ovs-vswitchd[.] time_timespec__
5.64%  pmd-c03/id:11  ovs-vswitchd[.] time_usec
3.88%  pmd-c03/id:11  ovs-vswitchd[.] netdev_get_class
2.95%  pmd-c03/id:11  ovs-vswitchd[.] netdev_rxq_recv
2.78%  pmd-c03/id:11  libbpf.so.0.2.0 [.] xsk_socket__fd
2.74%  pmd-c03/id:11  ovs-vswitchd[.] pmd_perf_start_iteration
2.11%  pmd-c03/id:11  libc-2.27.so[.] __clock_gettime
1.32%  pmd-c03/id:11  ovs-vswitchd[.] xsk_socket__fd@plt

OVS rxdrop with HW hints:
-
rxdrop rate: 4.73Mpps

pmd thread numa_id 0 core_id 7:
   packets received: 13686880
   packet recirculations: 0
   avg. datapath passes per packet: 1.00
   emc hits: 13686880
   smc hits: 0
   megaflow hits: 0
   avg. subtable lookups per megaflow hit: 0.00
   miss with success upcall: 0
   miss with failed upcall: 0
   avg. packets per output batch: 0.00
   idle cycles: 3182105544 (46.02%)
   processing cycles: 3732023844 (53.98%)
   avg cycles per packet: 505.16 (6914129388/13686880)
   avg processing cycles per packet: 272.67 (3732023844/13686880)

pmd thread numa_id 0 core_id 7:

   Iterations:   392909539  (0.18 us/it)
   - Used TSC cycles: 167697342678  ( 99.9 % of total cycles)
   - idle iterations:382539861  ( 46.0 % of used cycles)

[ovs-dev] [PATCH ovn v2 9/9] tests: Test with SSL and RBAC for controller by default

To help ourself to not forget updating RBAC rules when we land
changes to existing functionality and new features we must enable
SSL+RBAC on the `ovn-controller` <-> SB DB connection for builds
with OpenSSL enabled.

Signed-off-by: Frode Nordahl 
---
 tests/automake.mk   |  9 +++--
 tests/ofproto-macros.at | 12 
 tests/ovn-macros.at | 38 --
 tests/ovn-northd.at |  6 +++---
 4 files changed, 58 insertions(+), 7 deletions(-)

diff --git a/tests/automake.mk b/tests/automake.mk
index 771dddea2..ba8567da4 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -237,7 +237,10 @@ FLAKE8_PYFILES += $(CHECK_PYFILES)
 
 if HAVE_OPENSSL
 OVS_PKI_DIR = $(CURDIR)/tests/pki
-TESTPKI_CNS = test test2
+# NOTE: Certificate generation has to be done serially, and each one adds a few
+# seconds to the test run. Please try to re-use one of the many CNs already
+# used in the existing tests.
+TESTPKI_CNS = test test2 main hv hv-foo hv1 hv2 hv3 hv4 hv5 hv6 hv7 hv8 hv9 
hv10 hv-1 hv-2 hv-10-1 hv-10-2 hv-20-1 hv-20-2 vtep hv_gw pbr-hv gw1 gw2 gw3 
gw4 gw5 ext1
 TESTPKI_FILES = $(shell \
for cn in $(TESTPKI_CNS); do \
echo tests/testpki-$$cn-cert.pem ; \
@@ -262,9 +265,11 @@ tests/pki/stamp:
$(AM_V_at)rm -f tests/pki/stamp
$(AM_V_at)rm -rf tests/pki
$(AM_V_GEN)$(OVS_PKI) init && \
+   cd tests/pki && \
for cn in $(TESTPKI_CNS); do \
-   $(OVS_PKI) req+sign tests/pki/$$cn; \
+   $(OVS_PKI) -u req+sign $$cn; \
done && \
+   cd ../../ && \
: > tests/pki/stamp
 CLEANFILES += tests/ovs-pki.log
 
diff --git a/tests/ofproto-macros.at b/tests/ofproto-macros.at
index 3d7ac08b3..23d793a95 100644
--- a/tests/ofproto-macros.at
+++ b/tests/ofproto-macros.at
@@ -104,6 +104,7 @@ start_daemon () {
 #
 # sim_add hv0
 # as hv0 ovs-vsctl add-br br0
+PKIDIR="$(cd $abs_top_builddir/tests && pwd)"
 sims=
 sim_add () {
echo "adding simulator '$1'"
@@ -126,6 +127,17 @@ sim_add () {
# Start ovs-vswitchd
as $1 start_daemon ovs-vswitchd --enable-dummy=system -vvconn 
-vofproto_dpif -vunixctl
as $1 ovs-appctl vlog/disable-rate-limit vconn
+   if test X$HAVE_OPENSSL = Xyes; then
+  if test -f $PKIDIR/testpki-$1-privkey.pem; then
+ as $1 ovs-vsctl set-ssl \
+$PKIDIR/testpki-$1-privkey.pem \
+$PKIDIR/testpki-$1-cert.pem \
+$PKIDIR/testpki-cacert.pem \
+|| return 1
+  else
+ echo "WARNING: No certificate created for sim '$1', check TESTPKI_CNS 
variable in tests/automake.mk"
+  fi
+   fi
 }
 
 # "as $1" sets the OVS_*DIR environment variables to point to $ovs_base/$1.
diff --git a/tests/ovn-macros.at b/tests/ovn-macros.at
index be8114de2..25f3dbe34 100644
--- a/tests/ovn-macros.at
+++ b/tests/ovn-macros.at
@@ -124,7 +124,18 @@ ovn_init_db () {
 mkdir "$d" || return 1
 : > "$d"/.$1.db.~lock~
 as $as_d ovsdb-tool create "$d"/$1.db "$abs_top_srcdir"/$1.ovsschema
-as $as_d start_daemon ovsdb-server -vjsonrpc --remote=punix:"$d"/$1.sock 
"$d"/$1.db
+
+local remote_in_db=
+if test X$HAVE_OPENSSL = Xyes -a X"$1" = X"ovn-sb"; then
+remote_in_db="--remote=db:OVN_Southbound,SB_Global,connections 
--private-key=$PKIDIR/testpki-test-privkey.pem 
--certificate=$PKIDIR/testpki-test-cert.pem 
--ca-cert=$PKIDIR/testpki-cacert.pem"
+fi
+
+as $as_d start_daemon ovsdb-server \
+-vjsonrpc \
+--remote=punix:"$d"/$1.sock \
+$remote_in_db \
+"$d"/$1.db
+
 local var=`echo $1_db | tr a-z- A-Z_`
 AS_VAR_SET([$var], [unix:"$d"/$1.sock]); export $var
 }
@@ -193,6 +204,24 @@ ovn_start () {
 ovn_start_northd backup $AZ
 fi
 
+if test X$HAVE_OPENSSL = Xyes; then
+# Create the SB DB pssl+RBAC connection. Ideally we could pre-create
+# SB_Global and Connection with ovsdb-tool transact at DB creation
+# time, but unfortunately that does not work, northd-ddlog will replace
+# the SB_Global record on startup.
+ovn-sbctl \
+-- --id=@c create connection \
+target=\"pssl:0:127.0.0.1\" role=ovn-controller \
+-- add SB_Global . connections @c
+local d=$ovs_base
+if test -n "$AZ"; then
+d=$d/$AZ
+fi
+PARSE_LISTENING_PORT([$d/ovn-sb/ovsdb-server.log], [TCP_PORT])
+var="SSL_OVN_SB_DB"
+AS_VAR_SET([$var], [ssl:127.0.0.1:$TCP_PORT]); export $var
+fi
+
 if test -n "$AZ"; then
 ovn-nbctl --wait=sb sync || exit $?
 
@@ -257,11 +286,16 @@ ovn_az_attach() {
 
 local ovn_remote
 if test X"$az" = XNONE; then
-ovn_remote=unix:$ovs_base/ovn-sb/ovn-sb.sock
+if test X$HAVE_OPENSSL = Xyes; then
+ovn_remote=$SSL_OVN_SB_DB
+else
+ovn_remote=unix:$ovs_base/ovn-sb/ovn-sb.sock
+fi
 else
 ovn_remote=unix:$ovs_base/$az/ovn-sb/ovn-sb.sock

[ovs-dev] [PATCH ovn v2 8/9] tests: Make certificate generation extendable

In preparation for enabling testing with SSL and RBAC enabled by
default, rework the certificate generation so that we can easily
add generation of more certificates/CN on demand.

A side erffect of the change is a more generic naming scheme for
the certificate files so the patch also contains an update to
existing tests so that they use the new filenames.

Signed-off-by: Frode Nordahl 
---
 tests/automake.mk | 48 ++-
 tests/ovn.at  | 48 +++
 2 files changed, 46 insertions(+), 50 deletions(-)

diff --git a/tests/automake.mk b/tests/automake.mk
index df6d0a2a9..771dddea2 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -236,39 +236,35 @@ PYCOV_CLEAN_FILES += $(CHECK_PYFILES:.py=.py,cover) 
.coverage
 FLAKE8_PYFILES += $(CHECK_PYFILES)
 
 if HAVE_OPENSSL
-TESTPKI_FILES = \
-   tests/testpki-cacert.pem \
-   tests/testpki-cert.pem \
-   tests/testpki-privkey.pem \
-   tests/testpki-req.pem \
-   tests/testpki-cert2.pem \
-   tests/testpki-privkey2.pem \
-   tests/testpki-req2.pem
+OVS_PKI_DIR = $(CURDIR)/tests/pki
+TESTPKI_CNS = test test2
+TESTPKI_FILES = $(shell \
+   for cn in $(TESTPKI_CNS); do \
+   echo tests/testpki-$$cn-cert.pem ; \
+   echo tests/testpki-$$cn-privkey.pem ; \
+   echo tests/testpki-$$cn-req.pem ; \
+   done)
+
+tests/testpki-cacert.pem: tests/pki/stamp
+   $(AM_V_GEN)cp $(OVS_PKI_DIR)/switchca/cacert.pem $@
+
+$(TESTPKI_FILES): tests/pki/stamp
+   $(AM_V_GEN)cp $(OVS_PKI_DIR)/$(notdir $(subst testpki-,,$@)) $@
+
+check_DATA += tests/testpki-cacert.pem
 check_DATA += $(TESTPKI_FILES)
+CLEANFILES += tests/testpki-cacert.pem
 CLEANFILES += $(TESTPKI_FILES)
 
-tests/testpki-cacert.pem: tests/pki/stamp
-   $(AM_V_GEN)cp tests/pki/switchca/cacert.pem $@
-tests/testpki-cert.pem: tests/pki/stamp
-   $(AM_V_GEN)cp tests/pki/test-cert.pem $@
-tests/testpki-req.pem: tests/pki/stamp
-   $(AM_V_GEN)cp tests/pki/test-req.pem $@
-tests/testpki-privkey.pem: tests/pki/stamp
-   $(AM_V_GEN)cp tests/pki/test-privkey.pem $@
-tests/testpki-cert2.pem: tests/pki/stamp
-   $(AM_V_GEN)cp tests/pki/test2-cert.pem $@
-tests/testpki-req2.pem: tests/pki/stamp
-   $(AM_V_GEN)cp tests/pki/test2-req.pem $@
-tests/testpki-privkey2.pem: tests/pki/stamp
-   $(AM_V_GEN)cp tests/pki/test2-privkey.pem $@
-
-OVS_PKI = $(SHELL) $(ovs_srcdir)/utilities/ovs-pki.in --dir=tests/pki 
--log=tests/ovs-pki.log
+
+OVS_PKI = $(SHELL) $(ovs_srcdir)/utilities/ovs-pki.in --dir=$(OVS_PKI_DIR) 
--log=tests/ovs-pki.log
 tests/pki/stamp:
$(AM_V_at)rm -f tests/pki/stamp
$(AM_V_at)rm -rf tests/pki
$(AM_V_GEN)$(OVS_PKI) init && \
-   $(OVS_PKI) req+sign tests/pki/test && \
-   $(OVS_PKI) req+sign tests/pki/test2 && \
+   for cn in $(TESTPKI_CNS); do \
+   $(OVS_PKI) req+sign tests/pki/$$cn; \
+   done && \
: > tests/pki/stamp
 CLEANFILES += tests/ovs-pki.log
 
diff --git a/tests/ovn.at b/tests/ovn.at
index ca9623fee..5cd8b34d7 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -8810,8 +8810,8 @@ AT_CHECK(
 
 start_daemon ovsdb-server --remote=punix:ovn-sb.sock \
   --remote=db:OVN_Southbound,SB_Global,connections \
-  --private-key="$PKIDIR/testpki-privkey2.pem" \
-  --certificate="$PKIDIR/testpki-cert2.pem" \
+  --private-key="$PKIDIR/testpki-test2-privkey.pem" \
+  --certificate="$PKIDIR/testpki-test2-cert.pem" \
   --ca-cert="$PKIDIR/testpki-cacert.pem" \
   ovn-sb.db
 
@@ -8819,20 +8819,20 @@ PARSE_LISTENING_PORT([ovsdb-server.log], [TCP_PORT])
 
 # read-only accesses should succeed
 AT_CHECK([ovn-sbctl --db=ssl:127.0.0.1:$TCP_PORT \
---private-key=$PKIDIR/testpki-privkey.pem \
---certificate=$PKIDIR/testpki-cert.pem \
+--private-key=$PKIDIR/testpki-test-privkey.pem \
+--certificate=$PKIDIR/testpki-test-cert.pem \
 --ca-cert=$PKIDIR/testpki-cacert.pem \
 list SB_Global], [0], [stdout], [ignore])
 AT_CHECK([ovn-sbctl --db=ssl:127.0.0.1:$TCP_PORT \
---private-key=$PKIDIR/testpki-privkey.pem \
---certificate=$PKIDIR/testpki-cert.pem \
+--private-key=$PKIDIR/testpki-test-privkey.pem \
+--certificate=$PKIDIR/testpki-test-cert.pem \
 --ca-cert=$PKIDIR/testpki-cacert.pem \
 list Connection], [0], [stdout], [ignore])
 
 # write access should fail
 AT_CHECK([ovn-sbctl --db=ssl:127.0.0.1:$TCP_PORT \
---private-key=$PKIDIR/testpki-privkey.pem \
---certificate=$PKIDIR/testpki-cert.pem \
+

[ovs-dev] [PATCH ovn v2 6/9] tests: Amend release stale port binding test for RBAC

The current version of the test attempts to simulate chassis
registration prior to starting `ovn-controller`, however it does
not set the `hostname` field.

The RBAC role for `ovn-controller` does not allow for a chassis to
change its own name or hostname, which makes sense as this is used
for authentication.

Update the test to set the `hostname` field when simulating chassis
registration so that `ovn-controller` does not attempt to update it
and subsequently make the test fail.

Fixes b6b3823d4 ("ovn-controller: Fix I-P for SB Port_Binding and OVS 
Interface")

Signed-off-by: Frode Nordahl 
---
 tests/ovn.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index bec593dcc..ca9623fee 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -21572,7 +21572,7 @@ ovn-nbctl --wait=sb lsp-add ls1 lsp1
 
 # Simulate the fact that lsp1 had been previously bound on hv1.
 ovn-sbctl --id=@e create encap chassis_name=hv1 ip="192.168.0.1" type="geneve" 
\
--- --id=@c create chassis name=hv1 encaps=@e \
+-- --id=@c create chassis hostname=hv1 name=hv1 encaps=@e \
 -- set Port_Binding lsp1 chassis=@c
 
 as hv1
-- 
2.30.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn v2 2/9] northd: Add missing RBAC rules for FDB table

The recently added FDB table did not get its RBAC rules which
would prohibit a `ovn-controller` from updating it with RBAC
enabled.

Fixes: 6ec3b1259 ("MAC learning: Add a new FDB table in southbound db")
Signed-off-by: Frode Nordahl 
---
 northd/ovn-northd.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index bafcb51e9..bb8f3032c 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -13248,6 +13248,11 @@ static const char *rbac_encap_auth[] =
 static const char *rbac_encap_update[] =
 {"type", "options", "ip"};
 
+static const char *rbac_fdb_auth[] =
+{""};
+static const char *rbac_fdb_update[] =
+{"dp_key", "mac", "port_key"};
+
 static const char *rbac_port_binding_auth[] =
 {""};
 static const char *rbac_port_binding_update[] =
@@ -13300,6 +13305,14 @@ static struct rbac_perm_cfg {
 .update = rbac_encap_update,
 .n_update = ARRAY_SIZE(rbac_encap_update),
 .row = NULL
+},{
+.table = "FDB",
+.auth = rbac_fdb_auth,
+.n_auth = ARRAY_SIZE(rbac_fdb_auth),
+.insdel = true,
+.update = rbac_fdb_update,
+.n_update = ARRAY_SIZE(rbac_fdb_update),
+.row = NULL
 },{
 .table = "Port_Binding",
 .auth = rbac_port_binding_auth,
-- 
2.30.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn v2 7/9] tests: Use ovn_start in tests/ovn-controller.at

The current version of the tests only initializes the SB DB and
instruments it directly. This does not work with SSL+RBAC as
northd must run to program the RBAC rules into the SB DB.

Run tests both for C and ddlog version of northd.

Add workaround for ovn-controller not re-reading certificates to
'ovn-controller - Chassis other_config' test.

Signed-off-by: Frode Nordahl 
---
 tests/ovn-controller.at | 50 -
 1 file changed, 44 insertions(+), 6 deletions(-)

diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
index 2cd3e261f..1dd1553cd 100644
--- a/tests/ovn-controller.at
+++ b/tests/ovn-controller.at
@@ -1,8 +1,9 @@
 AT_BANNER([ovn-controller])
 
+OVN_FOR_EACH_NORTHD([
 AT_SETUP([ovn-controller - ovn-bridge-mappings])
 AT_KEYWORDS([ovn])
-ovn_init_db ovn-sb
+ovn_start
 net_add n1
 sim_add hv
 as hv
@@ -54,6 +55,14 @@ check_bridge_mappings () {
 OVS_WAIT_UNTIL([test x"${local_mappings}" = x$(ovn-sbctl get Chassis 
${sysid} other_config:ovn-bridge-mappings | sed -e 's/\"//g')])
 }
 
+# NOTE: This test originally ran with only the SB-DB and no northd. For the
+# test to be successfull with SSL+RBAC we need to initially run northd to get
+# the RBAC rules programmed into the SB-DB. The test instruments the SB-DB
+# directly and we need to stop northd to avoid overwriting the instrumentation.
+kill `cat northd/ovn-northd.pid`
+kill `cat northd-backup/ovn-northd.pid`
+kill `cat ovn-nb/ovsdb-server.pid`
+
 # Initially there should be no patch ports.
 check_patches
 
@@ -116,12 +125,14 @@ as ovn-sb
 OVS_APP_EXIT_AND_WAIT([ovsdb-server])
 
 AT_CLEANUP
+])
 
 # Checks that ovn-controller populates datapath-type and iface-types
 # correctly in the Chassis other_config column.
+OVN_FOR_EACH_NORTHD([
 AT_SETUP([ovn-controller - Chassis other_config])
 AT_KEYWORDS([ovn])
-ovn_init_db ovn-sb
+ovn_start
 
 net_add n1
 sim_add hv
@@ -192,7 +203,21 @@ OVS_WAIT_UNTIL([
 # chassis_private records.  Until that happens ovn-controller fails to
 # create the records due to constraint violation on the Encap table.
 sysid=${sysid}-foo
-ovs-vsctl set Open_vSwitch . external-ids:system-id="${sysid}"
+current_remote=`ovs-vsctl get Open_vSwitch . external-ids:ovn-remote`
+if test X$HAVE_OPENSSL = Xyes; then
+# To change chassis name we need to change certificate with matching CN
+ovs-vsctl set-ssl \
+$PKIDIR/testpki-${sysid}-privkey.pem \
+$PKIDIR/testpki-${sysid}-cert.pem \
+$PKIDIR/testpki-cacert.pem
+# force reconnect which makes OVN controller read the new certificates
+# TODO implement check for change of certificates in ovn-controller
+# and remove this workaround.
+ovs-vsctl set Open_vSwitch . external-ids:ovn-remote=unix:/dev/null
+fi
+ovs-vsctl -- set Open_vSwitch . external-ids:hostname="${sysid}" \
+  -- set Open_vSwitch . external-ids:system-id="${sysid}" \
+  -- set Open_vSwitch . external-ids:ovn-remote="${current_remote}"
 
 OVS_WAIT_UNTIL([
 grep -q 'Transaction causes multiple rows in \\"Encap\\" table to have 
identical values' hv/ovn-controller.log
@@ -216,12 +241,14 @@ as ovn-sb
 OVS_APP_EXIT_AND_WAIT([ovsdb-server])
 
 AT_CLEANUP
+])
 
 # Checks that ovn-controller correctly maintains the mapping from the Encap
 # table in the Southbound database to OVS in the face of changes on both sides
+OVN_FOR_EACH_NORTHD([
 AT_SETUP([ovn-controller - change Encap properties])
 AT_KEYWORDS([ovn])
-ovn_init_db ovn-sb
+ovn_start
 
 net_add n1
 sim_add hv
@@ -271,11 +298,13 @@ as ovn-sb
 OVS_APP_EXIT_AND_WAIT([ovsdb-server])
 
 AT_CLEANUP
+])
 
 # Check ovn-controller connection status to Southbound database
+OVN_FOR_EACH_NORTHD([
 AT_SETUP([ovn-controller - check sbdb connection])
 AT_KEYWORDS([ovn])
-ovn_init_db ovn-sb
+ovn_start
 
 net_add n1
 sim_add hv
@@ -305,11 +334,13 @@ as ovn-sb
 OVS_APP_EXIT_AND_WAIT([ovsdb-server])
 
 AT_CLEANUP
+])
 
 # Checks that ovn-controller recreates its chassis record when deleted 
externally.
+OVN_FOR_EACH_NORTHD([
 AT_SETUP([ovn-controller - Chassis self record])
 AT_KEYWORDS([ovn])
-ovn_init_db ovn-sb
+ovn_start
 
 net_add n1
 sim_add hv
@@ -360,8 +391,10 @@ OVS_WAIT_UNTIL([test x0 = x`ovn-sbctl --columns nb_cfg 
--bare find chassis`])
 
 OVN_CLEANUP([hv])
 AT_CLEANUP
+])
 
 # Test unix command: debug/delay-nb-cfg-report
+OVN_FOR_EACH_NORTHD([
 AT_SETUP([ovn-controller - debug/delay-nb-cfg-report])
 AT_KEYWORDS([ovn])
 ovn_start
@@ -393,7 +426,9 @@ AT_CHECK([ovn-nbctl --timeout=1 --wait=hv sync])
 
 OVN_CLEANUP([hv])
 AT_CLEANUP
+])
 
+OVN_FOR_EACH_NORTHD([
 AT_SETUP([ovn -- nb_cfg sync to OVS])
 ovn_start
 
@@ -414,7 +449,9 @@ OVS_WAIT_UNTIL([ovs-vsctl get Bridge br-int 
external_ids:ovn-nb-cfg], [0], [1])
 
 OVN_CLEANUP([hv1])
 AT_CLEANUP
+])
 
+OVN_FOR_EACH_NORTHD([
 AT_SETUP([ovn -- features])
 AT_KEYWORDS([features])
 ovn_start
@@ -431,3 +468,4 @@ OVS_WAIT_UNTIL([
 
 OVN_CLEANUP([hv1])
 AT_CLEANUP
+])
-- 
2.30.0

___

[ovs-dev] [PATCH ovn v2 1/9] northd: Amend RBAC rules for Port_Binding table

When `ovn-controller` claims a virtual lport it will update the
Port_Binding table with which chassis currently has claimed the
port as well as recording information about the virtual parent
lport [0].

When `ovn-controller` claims a lport it will also update the
encap field of the Port_Binding table if set and an update is
needed.

The current RBAC rules does not allow for these updates.

0: 
https://github.com/ovn-org/ovn/blob/b7b0fbdab03ce8b39d5bdc114876e6b0d0683892/controller/pinctrl.c#L6150
Fixes: 054f4c85c ("Add a new logical switch port type - 'virtual'")
Fixes: 6c8b9a132 (" ovn-controller: Store the local port bindings in the 
runtime data I-P state")
Reported-At: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/1917475
Signed-off-by: Frode Nordahl 
---
 northd/ovn-northd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index ac872aade..bafcb51e9 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -13251,7 +13251,7 @@ static const char *rbac_encap_update[] =
 static const char *rbac_port_binding_auth[] =
 {""};
 static const char *rbac_port_binding_update[] =
-{"chassis", "up"};
+{"chassis", "encap", "up", "virtual_parent"};
 
 static const char *rbac_mac_binding_auth[] =
 {""};
-- 
2.30.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn v2 4/9] northd: Add Controller_Event RBAC rules

The use of the Controller_Event table does currently not work
when RBAC is enabled.

Fixes: be1eeb09d ("OVN: introduce Controller_Event table")
Signed-off-by: Frode Nordahl 
---
 northd/ovn-northd.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index f85a3dcff..c4a3f2383 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -13248,6 +13248,12 @@ static const char *rbac_encap_auth[] =
 static const char *rbac_encap_update[] =
 {"type", "options", "ip"};
 
+static const char *rbac_controller_event_auth[] =
+{""};
+static const char *rbac_controller_event_update[] =
+{"chassis", "event_info", "event_type", "seq_num"};
+
+
 static const char *rbac_fdb_auth[] =
 {""};
 static const char *rbac_fdb_update[] =
@@ -13297,6 +13303,14 @@ static struct rbac_perm_cfg {
 .update = rbac_chassis_private_update,
 .n_update = ARRAY_SIZE(rbac_chassis_private_update),
 .row = NULL
+},{
+.table = "Controller_Event",
+.auth = rbac_controller_event_auth,
+.n_auth = ARRAY_SIZE(rbac_controller_event_auth),
+.insdel = true,
+.update = rbac_controller_event_update,
+.n_update = ARRAY_SIZE(rbac_controller_event_update),
+.row = NULL
 },{
 .table = "Encap",
 .auth = rbac_encap_auth,
-- 
2.30.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn v2 0/9] Fix missing RBAC rules and enable testing

To help ourself to not forget updating RBAC rules when we land
changes to existing functionality and new features we must enable
SSL+RBAC on the `ovn-controller` <-> SB DB connection for builds
with OpenSSL enabled.

The series is structured with one commit per table where RBAC
rules are fixed for the C version and one summary commit to 
update the northd-ddlog implementation. Then there are a few fixes
to existing tests before finally enabling SSL+RBAC for all tests.

This should allow for easier backports back to series where the
respective tables / features first appeared.

A successful testrun can be viewed at [0], in addittion I have done
local testing with ovn-northd-ddlog.

0: https://github.com/fnordahl/ovn/actions/runs/624324890

Frode Nordahl (9):
  northd: Amend RBAC rules for Port_Binding table
  northd: Add missing RBAC rules for FDB table
  northd: Amend Chassis RBAC rules
  northd: Add Controller_Event RBAC rules
  northd-ddlog: Update RBAC rules
  tests: Amend release stale port binding test for RBAC
  tests: Use ovn_start in tests/ovn-controller.at
  tests: Make certificate generation extendable
  tests: Test with SSL and RBAC for controller by default

 northd/ovn-northd.c | 31 ++--
 northd/ovn_northd.dl| 24 +--
 tests/automake.mk   | 53 +
 tests/ofproto-macros.at | 12 ++
 tests/ovn-controller.at | 50 +-
 tests/ovn-macros.at | 38 +++--
 tests/ovn-northd.at |  6 ++---
 tests/ovn.at| 50 +++---
 8 files changed, 198 insertions(+), 66 deletions(-)

-- 
2.30.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn v2 3/9] northd: Amend Chassis RBAC rules

The Transport Zones support does currently not work when RBAC is
enabled.

Fixes: 07d0d258d ("OVN: Add support for Transport Zones")
Signed-off-by: Frode Nordahl 
---
 northd/ovn-northd.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index bb8f3032c..f85a3dcff 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -13236,7 +13236,7 @@ static const char *rbac_chassis_auth[] =
 {"name"};
 static const char *rbac_chassis_update[] =
 {"nb_cfg", "external_ids", "encaps", "vtep_logical_switches",
- "other_config"};
+ "other_config", "transport_zones"};
 
 static const char *rbac_chassis_private_auth[] =
 {"name"};
-- 
2.30.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH ovn v2 5/9] northd-ddlog: Update RBAC rules