Re: [ovs-dev] [External] Re: [PATCH v4] utilities/ofctl: add-meters for save and restore

2023-03-06 Thread Wan Junjie
Hi Simon Horman,

Thanks for your review.


On Mon, Mar 6, 2023 at 11:23 PM Simon Horman  wrote:
>
> On Sat, Mar 04, 2023 at 11:23:50AM +0800, Wan Junjie wrote:
> > Hi Simon,
> >
> > Thanks for the review.
>
> Hi Wan Junjie,
>
> Thanks for responding.
>
> > On Fri, Mar 3, 2023 at 11:06 PM Simon Horman  
> > wrote:
> > > On Wed, Mar 01, 2023 at 04:05:17PM +0800, Wan Junjie wrote:
>
> ...
>
> > > > diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in
> > > > index 0a611b2ee..c44091906 100644
> > > > --- a/utilities/ovs-ofctl.8.in
> > > > +++ b/utilities/ovs-ofctl.8.in
> > > > @@ -492,23 +492,35 @@ the \fBBridge\fR table).  For more information, 
> > > > see ``Q: What versions
> > > >  of OpenFlow does Open vSwitch support?'' in the Open vSwitch FAQ.
> > > >  .
> > > >  .IP "\fBadd\-meter \fIswitch meter\fR"
> > > > +.IQ "\fBadd\-meter \fIswitch - < file\fR"
> > > >  Add a meter entry to \fIswitch\fR's tables. The \fImeter\fR syntax is
> > > >  described in section \fBMeter Syntax\fR, below.
> > > >  .
> > > > +.IP "\fBadd\-meters \fIswitch file\fR"
> > > > +Add each meter entry to \fIswitch\fR's tables. Each meter specification
> > > > +(each line in file) may start with \fBadd\fI, \fBmodify\fI or 
> > > > \fBdelete\fI
> > > > +keyword to specify whether a meter is to be added, modified, or 
> > > > deleted.
> > > > +For backwards compatibility a meter specification without one of these 
> > > > keywords
> > > > +is treated as a meter add. The \fImeter\fR syntax is described in 
> > > > section
> > > > +\fBMeter Syntax\fR, below.
> > > > +.
> > > >  .IP "\fBmod\-meter \fIswitch meter\fR"
> > > > +.IQ "\fBmod\-meter \fIswitch - < file\fR"
> > > >  Modify an existing meter.
> > > >  .
> > > >  .IP "\fBdel\-meters \fIswitch\fR [\fImeter\fR]"
> > > > +.IQ "\fBdel\-meters \fIswitch\fR - < file"
> > > >  Delete entries from \fIswitch\fR's meter table.  To delete only a
> > > > -specific meter, specify its number as \fImeter\fR.  Otherwise, if
> > > > +specific meter, specify a meter syntax \fBmeter=\fIid\fR.  Otherwise, 
> > > > if
> > >
> > > Are there backwards compatibility concerns here...
> > >
> > In fact, old docs are not clear on meter syntax, it means meter=id.
>
> Ok, so the documentation is being fixed to match the current implementation?
> If so I think that should be a separate patch.
>
Yes, and it is pointed out by Adrián Moreno in the first reply to this patch.
I will do the split in a later patch.

> > > >  \fImeter\fR is omitted, or if it is specified as \fBall\fR, then all
> > > >  meters are deleted.
> > > >  .
> > > > -.IP "\fBdump\-meters \fIswitch\fR [\fImeter\fR]"
> > > > +.IP "[\fB\-\-oneline\fR] \fBdump\-meters \fIswitch\fR [\fImeter\fR]"
> > > >  Print entries from \fIswitch\fR's meter table.  To print only a
> > > > -specific meter, specify its number as \fImeter\fR.  Otherwise, if
> > > > +specific meter, specify a meter syntax \fBmeter=\fIid\fR.  Otherwise, 
> > > > if
> > >
> > > ... and here?
> > >
> > > >  \fImeter\fR is omitted, or if it is specified as \fBall\fR, then all
> > > > -meters are printed.
> > > > +meters are printed.  With \fB\-\-oneline\fR, band information will be
> > > > +combined into one line.
> > > >  .
> > > >  .IP "\fBmeter\-stats \fIswitch\fR [\fImeter\fR]"
> > > >  Print meter statistics.  \fImeter\fR can specify a single meter with
> > > > @@ -1342,6 +1354,11 @@ includes flow duration, packet and byte counts, 
> > > > and idle and hard age
> > > >  in its output.  With \fB\-\-no\-stats\fR, it omits all of these, as
> > > >  well as cookie values and table IDs if they are zero.
> > > >  .
> > > > +.IP "\fB\-\-oneline\fR"
> > > > +The \fBdump\-meters\fR command prints each band in a separate line by
> > > > +default. With \fB\-\-oneline\fR, all bands will be printed in a single
> > > > +line. This is useful for dumping meters to a file and restoring them.
> > > > +.
> > > >  .IP "\fB\-\-read-only\fR"
> > > >  Do not execute read/write commands.
> > > >  .
>
> ...
>
> > > > +static void
> > > > +ofctl_dump_meters(struct ovs_cmdl_context *ctx)
> > > > +{
> > > > +if (!oneline) {
> > > > +ofctl_dump_meters__(ctx);
> > > > +} else {
> > > > +struct ofputil_meter_mod mm;
> > > > +struct vconn *vconn;
> > > > +enum ofputil_protocol protocol;
> > > > +enum ofp_version version;
> > > > +
> > > > +memset(, 0, sizeof mm);
> > > > +const char *bridge = ctx->argv[1];
> > > > +const char *str = ctx->argc > 2 ? ctx->argv[2] : NULL;
> > > > +
> > > > +vconn = prepare_dump_meters(bridge, str, , );
> > > > +version = ofputil_protocol_to_ofp_version(protocol);
> > > > +struct ofpbuf *request = ofputil_encode_meter_request(version,
> > > > +OFPUTIL_METER_CONFIG, mm.meter.meter_id);
> > >
> > > The logic in the two 'paragraphs' above seems to largely duplicate
> > > ofctl_dump_meters__() => ofctl_meter_request__().
> > >
> > > Could those functions 

Re: [ovs-dev] [PATCH v2] treewide: Remove uses of ATOMIC_VAR_INIT

2023-03-06 Thread Ilya Maximets
On 3/1/23 03:30, Fangrui Song via dev wrote:
> ATOMIC_VAR_INIT has a trivial definition
> `#define ATOMIC_VAR_INIT(value) (value)`,
> is deprecated in C17/C++20, and will be removed in newer standards in
> newer GCC/Clang (e.g. https://reviews.llvm.org/D144196).
> 
> Signed-off-by: Fangrui Song 
> ---
> Changes from v1:
> 
> * remove ATOMIC_VAR_INIT from lib/ovs-atomic*.h files
> * update lib/ovs-atomic.h

Applied.  Thanks!

Best regards, Ilya Maximets.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpdk: Allow retaining CAP_SYS_RAWIO privileges

2023-03-06 Thread Aaron Conole
Simon Horman  writes:

> On Fri, Mar 03, 2023 at 10:16:00AM -0500, Aaron Conole wrote:
>> Open vSwitch generally tries to let the underlying operating system
>> managed the low level details of hardware, for example DMA mapping,
>> bus arbitration, etc.  However, when using DPDK, the underlying
>> operating system yields control of many of these details to userspace
>> for management.
>> 
>> In the case of some DPDK port drivers, configuring rte_flow or even
>> allocating resources may require access to iopl/ioperm calls, which
>> are guarded by the CAP_SYS_RAWIO privilege on linux systems.  These
>> calls are dangerous, and can allow a process to completely compromise
>> a system.  However, they are needed in the case of some userspace
>> driver code which manages the hardware (for example, the mlx
>> implementation of backend support for rte_flow).
>> 
>> Here, we create an opt-in flag passed to the command line to allow
>> this access.  We need to do this before ever accessing the database,
>> because we want to drop all privileges asap, and cannot wait for
>> a connection to the database to be established and functional before
>> dropping.  There may be distribution specific ways to do capability
>> management as well (using for example, systemd), but they are not
>> as universal to the vswitchd as a flag.
>> 
>> Signed-off-by: Aaron Conole 
>
> Fun times.

:)

> Minor nit below not withstanding this looks good to me,
> within the context of the description above.
>
>
> Reviewed-by: Simon Horman 

Thanks for the review, Simon!

> ...
>
>> @@ -827,6 +829,17 @@ daemon_become_new_user_linux(bool access_datapath 
>> OVS_UNUSED)
>>  ret = capng_update(CAPNG_ADD, cap_sets, CAP_NET_ADMIN)
>>|| capng_update(CAPNG_ADD, cap_sets, CAP_NET_RAW)
>>|| capng_update(CAPNG_ADD, cap_sets, 
>> CAP_NET_BROADCAST);
>> +#ifdef DPDK_NETDEV
>> +if (access_hardware_ports && !ret) {
>> +ret = capng_update(CAPNG_ADD, cap_sets, CAP_SYS_RAWIO);
>> +VLOG_INFO("CAP_SYS_RAWIO enabled.");
>> +}
>> +#else
>> +;
>
> nit: is the line above needed?

d'oh.  I removed it when I reviewed, but forgot to generate a new
patch.  Thanks for looking!

v2 incoming.

>> +if (access_hardware_ports) {
>> +VLOG_WARN("Dropped CAP_SYS_RAWIO request (no 
>> drivers).");
>> +}
>> +#endif
>>  }
>>  } else {
>>  ret = -1;

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-dpdk: Check rx/tx descriptor sizes for device.

2023-03-06 Thread David Marchand
On Mon, Mar 6, 2023 at 5:05 PM David Marchand  wrote:
> > diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> > index fb0dd43f7..e901f857e 100644
> > --- a/lib/netdev-dpdk.c
> > +++ b/lib/netdev-dpdk.c
> > @@ -1008,4 +1008,13 @@ dpdk_watchdog(void *dummy OVS_UNUSED)
> >  }
> >
> > +static int
> > +dpdk_limit_desc_size(int desc_size, struct rte_eth_desc_lim *lim)
> > +{
> > +desc_size = ROUND_UP(desc_size, lim->nb_align);
>
> Based on a quick grep, it seems possible a driver exposes nb_align == 0.
> Like for example:
> https://git.dpdk.org/dpdk/tree/drivers/net/bnxt/bnxt_ethdev.c#n1020

Ok... you can scratch that, ethdev provides a 1 default value.

https://git.dpdk.org/dpdk/tree/lib/ethdev/rte_ethdev.h#n3275
https://git.dpdk.org/dpdk/tree/lib/ethdev/rte_ethdev.c#n3577


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] netdev-dpdk: Check rx/tx descriptor sizes for device.

2023-03-06 Thread David Marchand
On Thu, Feb 9, 2023 at 5:29 PM Kevin Traynor  wrote:
>
> By default OVS configures 2048 descriptors for tx and rx queues
> on DPDK devices. It also allows the user to configure those values.
>
> If the values used are not acceptable to the device than queue setup
> will fail.
>
> The device exposes it's max/min/alignment requirements, so use those
> to ensure that an acceptable value is used during queue setup.
>
> If the default or user value is not acceptable, adjust to a suitable
> value.
>
> Reported-at: https://bugzilla.redhat.com/2119876
> Signed-off-by: Kevin Traynor 
> ---
>  lib/netdev-dpdk.c | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index fb0dd43f7..e901f857e 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1008,4 +1008,13 @@ dpdk_watchdog(void *dummy OVS_UNUSED)
>  }
>
> +static int
> +dpdk_limit_desc_size(int desc_size, struct rte_eth_desc_lim *lim)
> +{
> +desc_size = ROUND_UP(desc_size, lim->nb_align);

Based on a quick grep, it seems possible a driver exposes nb_align == 0.
Like for example:
https://git.dpdk.org/dpdk/tree/drivers/net/bnxt/bnxt_ethdev.c#n1020

> +desc_size = MIN(desc_size, lim->nb_max);
> +desc_size = MAX(desc_size, lim->nb_min);
> +return desc_size;

Automatically adjusting a default value seems normal to me.
But in the case of user input, I think it would help if there was some
log indicating that OVS adjusted the config.


> +}
> +
>  static int
>  dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq)
> @@ -1056,4 +1065,7 @@ dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int 
> n_rxq, int n_txq)
>  }
>
> +dev->rxq_size = dpdk_limit_desc_size(dev->rxq_size, _desc_lim);
> +dev->txq_size = dpdk_limit_desc_size(dev->txq_size, _desc_lim);
> +
>  /* A device may report more queues than it makes available (this has
>   * been observed for Intel xl710, which reserves some of them for
> --
> 2.39.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>


-- 
David Marchand

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [External] Re: [PATCH v4] utilities/ofctl: add-meters for save and restore

2023-03-06 Thread Simon Horman
On Sat, Mar 04, 2023 at 11:23:50AM +0800, Wan Junjie wrote:
> Hi Simon,
> 
> Thanks for the review.

Hi Wan Junjie,

Thanks for responding.

> On Fri, Mar 3, 2023 at 11:06 PM Simon Horman  
> wrote:
> > On Wed, Mar 01, 2023 at 04:05:17PM +0800, Wan Junjie wrote:

...

> > > diff --git a/utilities/ovs-ofctl.8.in b/utilities/ovs-ofctl.8.in
> > > index 0a611b2ee..c44091906 100644
> > > --- a/utilities/ovs-ofctl.8.in
> > > +++ b/utilities/ovs-ofctl.8.in
> > > @@ -492,23 +492,35 @@ the \fBBridge\fR table).  For more information, see 
> > > ``Q: What versions
> > >  of OpenFlow does Open vSwitch support?'' in the Open vSwitch FAQ.
> > >  .
> > >  .IP "\fBadd\-meter \fIswitch meter\fR"
> > > +.IQ "\fBadd\-meter \fIswitch - < file\fR"
> > >  Add a meter entry to \fIswitch\fR's tables. The \fImeter\fR syntax is
> > >  described in section \fBMeter Syntax\fR, below.
> > >  .
> > > +.IP "\fBadd\-meters \fIswitch file\fR"
> > > +Add each meter entry to \fIswitch\fR's tables. Each meter specification
> > > +(each line in file) may start with \fBadd\fI, \fBmodify\fI or 
> > > \fBdelete\fI
> > > +keyword to specify whether a meter is to be added, modified, or deleted.
> > > +For backwards compatibility a meter specification without one of these 
> > > keywords
> > > +is treated as a meter add. The \fImeter\fR syntax is described in section
> > > +\fBMeter Syntax\fR, below.
> > > +.
> > >  .IP "\fBmod\-meter \fIswitch meter\fR"
> > > +.IQ "\fBmod\-meter \fIswitch - < file\fR"
> > >  Modify an existing meter.
> > >  .
> > >  .IP "\fBdel\-meters \fIswitch\fR [\fImeter\fR]"
> > > +.IQ "\fBdel\-meters \fIswitch\fR - < file"
> > >  Delete entries from \fIswitch\fR's meter table.  To delete only a
> > > -specific meter, specify its number as \fImeter\fR.  Otherwise, if
> > > +specific meter, specify a meter syntax \fBmeter=\fIid\fR.  Otherwise, if
> >
> > Are there backwards compatibility concerns here...
> >
> In fact, old docs are not clear on meter syntax, it means meter=id.

Ok, so the documentation is being fixed to match the current implementation?
If so I think that should be a separate patch.

> > >  \fImeter\fR is omitted, or if it is specified as \fBall\fR, then all
> > >  meters are deleted.
> > >  .
> > > -.IP "\fBdump\-meters \fIswitch\fR [\fImeter\fR]"
> > > +.IP "[\fB\-\-oneline\fR] \fBdump\-meters \fIswitch\fR [\fImeter\fR]"
> > >  Print entries from \fIswitch\fR's meter table.  To print only a
> > > -specific meter, specify its number as \fImeter\fR.  Otherwise, if
> > > +specific meter, specify a meter syntax \fBmeter=\fIid\fR.  Otherwise, if
> >
> > ... and here?
> >
> > >  \fImeter\fR is omitted, or if it is specified as \fBall\fR, then all
> > > -meters are printed.
> > > +meters are printed.  With \fB\-\-oneline\fR, band information will be
> > > +combined into one line.
> > >  .
> > >  .IP "\fBmeter\-stats \fIswitch\fR [\fImeter\fR]"
> > >  Print meter statistics.  \fImeter\fR can specify a single meter with
> > > @@ -1342,6 +1354,11 @@ includes flow duration, packet and byte counts, 
> > > and idle and hard age
> > >  in its output.  With \fB\-\-no\-stats\fR, it omits all of these, as
> > >  well as cookie values and table IDs if they are zero.
> > >  .
> > > +.IP "\fB\-\-oneline\fR"
> > > +The \fBdump\-meters\fR command prints each band in a separate line by
> > > +default. With \fB\-\-oneline\fR, all bands will be printed in a single
> > > +line. This is useful for dumping meters to a file and restoring them.
> > > +.
> > >  .IP "\fB\-\-read-only\fR"
> > >  Do not execute read/write commands.
> > >  .

...

> > > +static void
> > > +ofctl_dump_meters(struct ovs_cmdl_context *ctx)
> > > +{
> > > +if (!oneline) {
> > > +ofctl_dump_meters__(ctx);
> > > +} else {
> > > +struct ofputil_meter_mod mm;
> > > +struct vconn *vconn;
> > > +enum ofputil_protocol protocol;
> > > +enum ofp_version version;
> > > +
> > > +memset(, 0, sizeof mm);
> > > +const char *bridge = ctx->argv[1];
> > > +const char *str = ctx->argc > 2 ? ctx->argv[2] : NULL;
> > > +
> > > +vconn = prepare_dump_meters(bridge, str, , );
> > > +version = ofputil_protocol_to_ofp_version(protocol);
> > > +struct ofpbuf *request = ofputil_encode_meter_request(version,
> > > +OFPUTIL_METER_CONFIG, mm.meter.meter_id);
> >
> > The logic in the two 'paragraphs' above seems to largely duplicate
> > ofctl_dump_meters__() => ofctl_meter_request__().
> >
> > Could those functions be parameterised over oneline?
> >
> ofctl_meter_request__() will handle stats, features requests. We can
> put 'oneline' to
> parameters, but that will make other functions a little confused about
> this parameter.
> A combined function will be a little more complicated since it will
> have to judge the
> request type, changes won't save us any code.
> 
> > > +
> > > +struct ofputil_meter_config *mcs;
> > > +size_t n_mtrs;
> > > +

Re: [ovs-dev] [PATCH] dpdk: Allow retaining CAP_SYS_RAWIO privileges

2023-03-06 Thread Simon Horman
On Fri, Mar 03, 2023 at 10:16:00AM -0500, Aaron Conole wrote:
> Open vSwitch generally tries to let the underlying operating system
> managed the low level details of hardware, for example DMA mapping,
> bus arbitration, etc.  However, when using DPDK, the underlying
> operating system yields control of many of these details to userspace
> for management.
> 
> In the case of some DPDK port drivers, configuring rte_flow or even
> allocating resources may require access to iopl/ioperm calls, which
> are guarded by the CAP_SYS_RAWIO privilege on linux systems.  These
> calls are dangerous, and can allow a process to completely compromise
> a system.  However, they are needed in the case of some userspace
> driver code which manages the hardware (for example, the mlx
> implementation of backend support for rte_flow).
> 
> Here, we create an opt-in flag passed to the command line to allow
> this access.  We need to do this before ever accessing the database,
> because we want to drop all privileges asap, and cannot wait for
> a connection to the database to be established and functional before
> dropping.  There may be distribution specific ways to do capability
> management as well (using for example, systemd), but they are not
> as universal to the vswitchd as a flag.
> 
> Signed-off-by: Aaron Conole 

Fun times.

Minor nit below not withstanding this looks good to me,
within the context of the description above.


Reviewed-by: Simon Horman 

...

> @@ -827,6 +829,17 @@ daemon_become_new_user_linux(bool access_datapath 
> OVS_UNUSED)
>  ret = capng_update(CAPNG_ADD, cap_sets, CAP_NET_ADMIN)
>|| capng_update(CAPNG_ADD, cap_sets, CAP_NET_RAW)
>|| capng_update(CAPNG_ADD, cap_sets, 
> CAP_NET_BROADCAST);
> +#ifdef DPDK_NETDEV
> +if (access_hardware_ports && !ret) {
> +ret = capng_update(CAPNG_ADD, cap_sets, CAP_SYS_RAWIO);
> +VLOG_INFO("CAP_SYS_RAWIO enabled.");
> +}
> +#else
> +;

nit: is the line above needed?

> +if (access_hardware_ports) {
> +VLOG_WARN("Dropped CAP_SYS_RAWIO request (no drivers).");
> +}
> +#endif
>  }
>  } else {
>  ret = -1;
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn v2 1/2] ovn-controller: fixed ovn-installed not always properly added or removed.

2023-03-06 Thread Xavier Simonart
Hi Dumitru

Thanks for the review and the comments

On Wed, Feb 15, 2023 at 2:15 PM Dumitru Ceara  wrote:

> Hi Xavier,
>
> Thanks for this new version and sorry for the time it took to review it!
>  I think it mostly looks good; I only have a few minor remarks/questions
> below.
>
> On 1/4/23 09:32, Xavier Simonart wrote:
> > OVN checks whether ovn-installed is already present (in OVS) before
> updating it.
> > This might cause ovn-installed related issues in the following case:
> > - (1) ovn-installed is present
> > - (2) we claim the interface
> > - (3) we update ovs, removing ovn-installed and start installing flows
> > - (4) (next loop), after flows installed, we check if ovn-installed is
> absent,
> >   and if yes, we update OVS with ovn-installed.
> > However, in step4, if OVS is still busy from step 3, ovn-installed is
> read as
> > present; hence OVN controller does not update it and move to INSTALLED
> state.
> >
> > ovn-installed was also not properly deleted on port or binding removal.
> >
> > Note that port->up is not properly removed on external_ids:iface-id
> removal when
> > sb is read-only. This will be fixed in a further patch.
> >
> > Fixes: a7c7d4519e50 ("controller: avoid recomputes triggered by SBDB
> Port_Binding updates.")
> >
> > Signed-off-by: Xavier Simonart 
> >
>
> Nit: there should be a "---" here.  Like that the comments below don't
> get included in the actual commit.  We can probably fix this at apply
> time so there's no need to send a v3 for now.
>
> Oops - thanks, good catch.

> > v2: - handled Dumitru's comments
> > - rebased on main
> > - updated state machine diagram as commented in v1 commit message
> > - remove ovn-installed on port deletion or external_ids:iface-id
> removal.
> > - added test case
> > ---
> >  controller/binding.c   |  59 +-
> >  controller/binding.h   |   6 +
> >  controller/if-status.c | 113 +-
> >  tests/ovn.at   | 257 +
> >  4 files changed, 403 insertions(+), 32 deletions(-)
> >
> > diff --git a/controller/binding.c b/controller/binding.c
> > index 5df62baef..d85a17730 100644
> > --- a/controller/binding.c
> > +++ b/controller/binding.c
> > @@ -746,6 +746,19 @@ local_binding_get_lport_ofport(const struct shash
> *local_bindings,
> >  u16_to_ofp(lbinding->iface->ofport[0]) : 0;
> >  }
> >
> > +bool
> > +local_binding_is_ovn_installed(struct shash *local_bindings,
> > +   const char *pb_name)
> > +{
> > +struct local_binding *lbinding =
> > +local_binding_find(local_bindings, pb_name);
> > +if (lbinding && lbinding->iface) {
> > +return smap_get_bool(>iface->external_ids,
> > + OVN_INSTALLED_EXT_ID, false);
> > +}
> > +return false;
> > +}
> > +
> >  bool
> >  local_binding_is_up(struct shash *local_bindings, const char *pb_name,
> >  const struct sbrec_chassis *chassis_rec)
> > @@ -783,6 +796,7 @@ local_binding_is_down(struct shash *local_bindings,
> const char *pb_name,
> >  } else if (b_lport->pb->chassis) {
> >  VLOG_DBG("lport %s already claimed by other chassis",
> >   b_lport->pb->logical_port);
> > +return true;
> >  }
> >  }
> >
> > @@ -834,6 +848,33 @@ local_binding_set_up(struct shash *local_bindings,
> const char *pb_name,
> >  }
> >  }
> >
> > +static void
> > +remove_ovn_installed(struct local_binding *lbinding, const char
> *pb_name)
> > +{
> > +if (lbinding && lbinding->iface &&
> > +smap_get_bool(>iface->external_ids,
> > + OVN_INSTALLED_EXT_ID, false)) {
> > +/* If iface has been deleted, do not try to delete a key from
> it */
> > +if (!ovsrec_interface_is_deleted(lbinding->iface)) {
>
> Not really a problem, it just felt a bit weird to read this knowing that
> the function can be called even when not processing tracked changes.
> Still, it should be fine.
>
> I think this is really due to the fact that, when we handle tracked
changes and a db is ro, we have two choices
- fail as we were doing before, and recompute
- do nothing in the current loop, and have if-status "reconcile", but hence
not based on tracked changes

> > +VLOG_INFO("Removing lport %s ovn-installed in OVS",
> pb_name);
> > +ovsrec_interface_update_external_ids_delkey(lbinding->iface,
> > +
> OVN_INSTALLED_EXT_ID);
> > +}
> > +}
> > +}
> > +
> > +void
> > +local_binding_remove_ovn_installed(struct shash *local_bindings,
> > +   const char *pb_name, bool
> ovs_readonly)
> > +{
> > +if (ovs_readonly) {
> > +return;
> > +}
> > +struct local_binding *lbinding =
> > +local_binding_find(local_bindings, pb_name);
> > +remove_ovn_installed(lbinding, pb_name);
> > +}
> > +
> >  void
> >  local_binding_set_down(struct shash *local_bindings, 

[ovs-dev] [PATCH v2 1/2] dpctl: Fix flush-conntrack with datapath as argument

2023-03-06 Thread Ales Musil
Specifying datapath with "dpctl/flush-conntrack" didn't
work as expected and caused error:
ovs-dpctl: field system@ovs-system missing value (Invalid argument)

To prevent that check if we have datapath as first argument
and use it accordingly.

Signed-off-by: Ales Musil 
---
v2: Add test case.
---
 lib/dpctl.c | 17 -
 tests/system-traffic.at |  2 ++
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/lib/dpctl.c b/lib/dpctl.c
index c501a0cd7..a7a4d8ee8 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1717,10 +1717,16 @@ dpctl_flush_conntrack(int argc, const char *argv[],
 uint16_t zone, *pzone = NULL;
 int error;
 int args = argc - 1;
+int zone_pos = 1;
+
+if (dp_arg_exists(argc, argv)) {
+args--;
+zone_pos = 2;
+}
 
 /* Parse zone. */
-if (args && !strncmp(argv[1], "zone=", 5)) {
-if (!ovs_scan(argv[1], "zone=%"SCNu16, )) {
+if (args && !strncmp(argv[zone_pos], "zone=", 5)) {
+if (!ovs_scan(argv[zone_pos], "zone=%"SCNu16, )) {
 ds_put_cstr(, "failed to parse zone");
 error = EINVAL;
 goto error;
@@ -1747,13 +1753,6 @@ dpctl_flush_conntrack(int argc, const char *argv[],
 args--;
 }
 
-/* Report error if there is more than one unparsed argument. */
-if (args > 1) {
-ds_put_cstr(, "invalid arguments");
-error = EINVAL;
-goto error;
-}
-
 error = opt_dpif_open(argc, argv, dpctl_p, 5, );
 if (error) {
 dpctl_error(dpctl_p, error, "Cannot open dpif");
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 380372430..c6ec61dba 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -2360,8 +2360,10 @@ priority=100,in_port=2,icmp,action=ct(zone=5,commit),1
 ])
 
 AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
+dp=$(ovs-appctl dpctl/dump-dps)
 
 m4_foreach([FLUSH_CMD], [[ovs-appctl dpctl/flush-conntrack],
+ [ovs-appctl dpctl/flush-conntrack $dp],
  [ovs-ofctl ct-flush br0]], [
 AS_BOX([Testing with FLUSH_CMD])
 
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2 2/2] vswitch: Add missing documentation for "ct_flush" capability

2023-03-06 Thread Ales Musil
Signed-off-by: Ales Musil 
---
 vswitchd/vswitch.xml | 5 +
 1 file changed, 5 insertions(+)

diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 12708a313..15e4f97b7 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -6301,6 +6301,11 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch 
options:peer=p1 \
 translated to an ephemeral port. If there is no collision, no SNAT
 is performed.
   
+  
+True if the datapath supports CT flush extension. The extensions
+allows to flush CT entries based on specified parameters.
+  
 
 
 
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpctl: Fix flush-conntrack with datapath as argument

2023-03-06 Thread 0-day Robot
Bleep bloop.  Greetings Ales Musil, I am a robot and I have tried out your 
patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
ERROR: Inappropriate bracing around statement
#28 FILE: lib/dpctl.c:1722:
if (dp_arg_exists(argc, argv) ) {

Lines checked: 58, Warnings: 0, Errors: 1


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] dpctl: Fix flush-conntrack with datapath as argument

2023-03-06 Thread Ilya Maximets
On 3/6/23 13:44, Ales Musil wrote:
> Specifying datapath with "dpctl/flush-conntrack" didn't
> work as expected and caused error:
> ovs-dpctl: field system@ovs-system missing value (Invalid argument)
> 
> To prevent that check if we have datapath as first argument
> and use it accordingly.
> 
> Signed-off-by: Ales Musil 
> ---

Hi, Ales.  Could you, please, add a test for this use-case?
Thanks.

BTW, I noticed that we're missing documentation for the
ct_flush datapath capability flag in vswitch/vswitch.xml.
Could you also fix that?  Should be a separate change.

Best regards, Ilya Maximets.


>  lib/dpctl.c | 17 -
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/lib/dpctl.c b/lib/dpctl.c
> index c501a0cd7..3539bed77 100644
> --- a/lib/dpctl.c
> +++ b/lib/dpctl.c
> @@ -1717,10 +1717,16 @@ dpctl_flush_conntrack(int argc, const char *argv[],
>  uint16_t zone, *pzone = NULL;
>  int error;
>  int args = argc - 1;
> +int zone_pos = 1;
> +
> +if (dp_arg_exists(argc, argv) ) {
> +args--;
> +zone_pos = 2;
> +}
>  
>  /* Parse zone. */
> -if (args && !strncmp(argv[1], "zone=", 5)) {
> -if (!ovs_scan(argv[1], "zone=%"SCNu16, )) {
> +if (args && !strncmp(argv[zone_pos], "zone=", 5)) {
> +if (!ovs_scan(argv[zone_pos], "zone=%"SCNu16, )) {
>  ds_put_cstr(, "failed to parse zone");
>  error = EINVAL;
>  goto error;
> @@ -1747,13 +1753,6 @@ dpctl_flush_conntrack(int argc, const char *argv[],
>  args--;
>  }
>  
> -/* Report error if there is more than one unparsed argument. */
> -if (args > 1) {
> -ds_put_cstr(, "invalid arguments");
> -error = EINVAL;
> -goto error;
> -}
> -
>  error = opt_dpif_open(argc, argv, dpctl_p, 5, );
>  if (error) {
>  dpctl_error(dpctl_p, error, "Cannot open dpif");

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 1/2] ofp, dpif: Allow CT flush based on partial match

2023-03-06 Thread Ales Musil
On Fri, Mar 3, 2023 at 4:00 PM Roi Dayan  wrote:

>
>
> On 01/03/2023 14:01, Ales Musil wrote:
> > On Wed, Mar 1, 2023 at 12:34 PM Roi Dayan  wrote:
> >
> >>
> >>
> >> On 01/03/2023 12:53, Roi Dayan via dev wrote:
> >>>
> >>>
> >>> On 16/01/2023 13:45, Ales Musil wrote:
>  Currently, the CT can be flushed by dpctl only by specifying
>  the whole 5-tuple. This is not very convenient when there are
>  only some fields known to the user of CT flush. Add new struct
>  ofputil_ct_match which represents the generic filtering that can
>  be done for CT flush. The match is done only on fields that are
>  non-zero with exception to the icmp fields.
> 
>  This allows the filtering just within dpctl, however
>  it is a preparation for OpenFlow extension.
> 
>  Reported-at:
> >>
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.redhat.com%2F2120546=05%7C01%7Croid%40nvidia.com%7Cbc4c95a2472a4dfbb0a408db1a434035%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C638132648384230198%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C=wBA5yCdijnN%2FkcwxncDe1xxHzsqzGQgCRqNRHIMQUHQ%3D=0
>  Signed-off-by: Ales Musil 
>  ---
> >>
> >> ...
> >>
>  @@ -1707,37 +1708,55 @@ dpctl_flush_conntrack(int argc, const char
> >> *argv[],
> struct dpctl_params *dpctl_p)
>   {
>   struct dpif *dpif = NULL;
>  -struct ct_dpif_tuple tuple, *ptuple = NULL;
>  +struct ofp_ct_match match = {0};
>   struct ds ds = DS_EMPTY_INITIALIZER;
>   uint16_t zone, *pzone = NULL;
>   int error;
>   int args = argc - 1;
> 
>  -/* Parse ct tuple */
>  -if (args && ct_dpif_parse_tuple(, argv[args], )) {
>  -ptuple = 
>  +/* Parse zone. */
>  +if (args && !strncmp(argv[1], "zone=", 5)) {
>  +if (!ovs_scan(argv[1], "zone=%"SCNu16, )) {
>  +ds_put_cstr(, "failed to parse zone");
>  +error = EINVAL;
>  +goto error;
>  +}
>  +pzone = 
>   args--;
>   }
> 
>  -/* Parse zone */
>  -if (args && ovs_scan(argv[args], "zone=%"SCNu16, )) {
>  -pzone = 
>  +/* Parse ct tuples. */
>  +for (int i = 0; i < 2; i++) {
>  +if (!args) {
>  +break;
>  +}
>  +
>  +struct ofp_ct_tuple *tuple =
>  +i ? _reply : _orig;
>  +const char *arg = argv[argc - args];
>  +
>  +if (arg[0] && !ofp_ct_tuple_parse(tuple, arg, ,
> >> _proto,
>  +  _type)) {
>  +error = EINVAL;
>  +goto error;
>  +}
>   args--;
>   }
> >>>
> >>> Hi,
> >>>
> >>> There is a problem with the change here that you cannot specify now dp.
> >>>
> >>> # ovs-appctl dpctl/flush-conntrack system@ovs-system
> >>> ovs-dpctl: field system@ovs-system missing value (Invalid argument)
> >>>
> >>> As I understand it the old logic was parsing from the end and if we
> left
> >>> with 1 arg it's considered the dp.
> >>>
> >>> The new logic starts from first argument and fails on first failure
> >>> so we actually never reach the following check if args > 1.
> >>>
> >>> A quick fix I tried is to goto error only if args > 1.
> >>> but this leaves an opening that specifying wrong key fails on the dp
> arg.
> >>>
> >>> # ovs-dpctl flush-conntrack  system@ovs-system key=val
> >>> ovs-dpctl: field system@ovs-system missing value (Invalid argument)
> >>>
> >>> I ended up with parsing backwards and checking if the error is on last
> >>> arg or not.
> >>>
> >>> I'll send a fixup for review.
> >>>
> >>> Thanks,
> >>> Roi
> >>>
> >>
> >> Well it worked for manual run I tested but I fail the ct flush
> testsuite.
> >>
> >>  53: conntrack - ct flushFAILED (
> >> system-traffic.at:2364)
> >>
> >> I'm having some trouble following the log maybe you can help and do the
> >> correct fix.
> >> i'll also try to look but reporting this for now.
> >>
> >> The change I tested is the following:
> >>
> >> --- a/lib/dpctl.c
> >> +++ b/lib/dpctl.c
> >> @@ -1737,12 +1737,15 @@ dpctl_flush_conntrack(int argc, const char
> *argv[],
> >>
> >>  struct ofp_ct_tuple *tuple =
> >>  i ? _reply : _orig;
> >> -const char *arg = argv[argc - args];
> >> +const char *arg = argv[args];
> >>
> >>  if (arg[0] && !ofp_ct_tuple_parse(tuple, arg, ,
> >> _proto,
> >>_type)) {
> >> -error = EINVAL;
> >> -goto error;
> >> +if (args > 1) {
> >> +error = EINVAL;
> >> +goto error;
> >> +}
> >> +break;
> >>  }
> >>  args--;
> >>  }
> >>
> >
> >
> > 

[ovs-dev] [PATCH] dpctl: Fix flush-conntrack with datapath as argument

2023-03-06 Thread Ales Musil
Specifying datapath with "dpctl/flush-conntrack" didn't
work as expected and caused error:
ovs-dpctl: field system@ovs-system missing value (Invalid argument)

To prevent that check if we have datapath as first argument
and use it accordingly.

Signed-off-by: Ales Musil 
---
 lib/dpctl.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/lib/dpctl.c b/lib/dpctl.c
index c501a0cd7..3539bed77 100644
--- a/lib/dpctl.c
+++ b/lib/dpctl.c
@@ -1717,10 +1717,16 @@ dpctl_flush_conntrack(int argc, const char *argv[],
 uint16_t zone, *pzone = NULL;
 int error;
 int args = argc - 1;
+int zone_pos = 1;
+
+if (dp_arg_exists(argc, argv) ) {
+args--;
+zone_pos = 2;
+}
 
 /* Parse zone. */
-if (args && !strncmp(argv[1], "zone=", 5)) {
-if (!ovs_scan(argv[1], "zone=%"SCNu16, )) {
+if (args && !strncmp(argv[zone_pos], "zone=", 5)) {
+if (!ovs_scan(argv[zone_pos], "zone=%"SCNu16, )) {
 ds_put_cstr(, "failed to parse zone");
 error = EINVAL;
 goto error;
@@ -1747,13 +1753,6 @@ dpctl_flush_conntrack(int argc, const char *argv[],
 args--;
 }
 
-/* Report error if there is more than one unparsed argument. */
-if (args > 1) {
-ds_put_cstr(, "invalid arguments");
-error = EINVAL;
-goto error;
-}
-
 error = opt_dpif_open(argc, argv, dpctl_p, 5, );
 if (error) {
 dpctl_error(dpctl_p, error, "Cannot open dpif");
-- 
2.39.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] ovn: Handling of arp/nd learning on logical switches

2023-03-06 Thread Ilya Maximets
On 3/3/23 11:20, Olaf Seibert via dev wrote:
> Tangentially related to this: I was trying to detect when the problem with 
> the "too many resubmits" occurs, for alerting purposes.
> Some light examination of the source of ovs suggested that the coverage 
> counter "drop_action_too_many_resubmit" corresponds to the error 
> "XLATE_TOO_MANY_RESUBMITS".
> 
> So I saw this very recent log message in 
> /var/log/openvswitch/ovs-vswitchd.log (I censored some addresses):
> 
> 2023-03-03T10:06:51.453Z|00518|ofproto_dpif_xlate(handler34)|WARN|over 4096 
> resubmit actions on bridge br-int while processing 
> arp,in_port=1,vlan_tci=0x,dl_src=zz:zz:zz:zz:zz:zz,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=xxx.xxx.xxx.xxx,arp_tpa=yyy.yyy.yyy.yyy,arp_op=1,arp_sha=vv:vv:vv:vv:vv:vv,arp_tha=00:00:00:00:00:00
> 
> but the coverage counter seems to ignore this:
> 
> # ovs-appctl coverage/read-counter drop_action_too_many_resubmit
> 0
> 
> Is this a bug or am I looking at the wrong thing(s)?

Coverage counters are collected only for actions executed in userspace.
In your case, if XLATE_TOO_MANY_RESUBMITS happens, the 'drop' flow
will be installed into kernel datapath and the actual packet drops will
happen in the kernel.  Coverage counters can't count these.
So, drop_action_* counters are only useful for userspace datapath at
the moment.

Best rgards, Ilya Maximets.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v1 1/1] netdev-windows: Add checking when creating netdev with system type on Windows

2023-03-06 Thread Alin Serdean
Applied on master and backported until 2.9

Thank you!

> On 9 Nov 2022, at 04:32, Wilson Peng  wrote:
> 
> From: Wilson Peng 
> 
> In the recent Antrea project testing, some port could not be created
> on Windows.
> 
> When doing debug, our team found there is one case happening when multiple
> ports are waiting for be created with correct port number.
> 
> Some system type port will be created netdev successfully and it will cause
> conflict as in the dpif side it will be internal type. So finally the port
> will be created failed and it could not be easily recovered.
> 
> With the patch, on Windows the netdev creating will be blocked for system
> type when the ovs_tyep got on dpif is internal. More detailed case description
> is in the reported issue No.262 with link below.
> 
> Reported-at:https://github.com/openvswitch/ovs-issues/issues/262
> Signed-off-by: Wilson Peng 
> ---
> lib/netdev-windows.c | 11 +++
> 1 file changed, 11 insertions(+)
> 
> diff --git a/lib/netdev-windows.c b/lib/netdev-windows.c
> index 4ad45ffa1..3fad501e3 100644
> --- a/lib/netdev-windows.c
> +++ b/lib/netdev-windows.c
> @@ -156,6 +156,7 @@ netdev_windows_system_construct(struct netdev *netdev_)
> struct netdev_windows_netdev_info info;
> struct ofpbuf *buf;
> int ret;
> +const char*type = NULL;
> 
> /* Query the attributes and runtime status of the netdev. */
> ret = query_netdev(netdev_get_name(>up), , );
> @@ -167,6 +168,16 @@ netdev_windows_system_construct(struct netdev *netdev_)
> }
> ofpbuf_delete(buf);
> 
> +/* Don't create netdev if ovs-type is "internal"
> + * but the type of netdev->up is "system". */
> +type = netdev_get_type(>up);
> +if (type && !strcmp(type, "system") &&
> +(info.ovs_type == OVS_VPORT_TYPE_INTERNAL)) {
> +VLOG_DBG("construct device %s, ovs_type: %u failed",
> + netdev_get_name(>up), info.ovs_type);
> +return 1;
> +}
> +
> netdev->change_seq = 1;
> netdev->dev_type = info.ovs_type;
> netdev->port_no = info.port_no;
> -- 
> 2.32.1 (Apple Git-133)
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn v2] northd: prevents sending packet to conntrack for router ports

2023-03-06 Thread Xavier Simonart
As commented in northd.c, we should not use ct() for router ports.
When there are no stateful_acl, this patch prevents sending packet to conntrack
for router ports.
The patch does this by issuing ct_clear in ls_out_pre_lb stage so that hints
are not set in ls_out_acl_hint and ls_out_acl stages.

Note that ct_clear is not added for ingress for router ports as already done
for patch ports (no change by this patch on this aspect).

Also, this patch does not change the behavior for ACLs such as allow-related:
packets are still sent to conntrack, even for router ports. While this does
not work if router ports are distributed, allow-related ACLs work today on
router ports when those ports are handled on the same chassis for ingress and
egress traffic. This patch does not change that behavior.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=2062431

Signed-off-by: Xavier Simonart 

---
v2: - handled Dumitru's comments
- handled Ales' comments
- added change to xml documentation
- do not do ct_clear for ingress as already done
---
 northd/northd.c |  24 +++---
 northd/ovn-northd.8.xml |  11 +++
 tests/ovn-northd.at |  11 +--
 tests/system-ovn.at | 166 
 4 files changed, 197 insertions(+), 15 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 7ad4cdfad..b356faf64 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -5779,20 +5779,24 @@ skip_port_from_conntrack(struct ovn_datapath *od, 
struct ovn_port *op,
  * know about the connection, as the icmp request went through the logical
  * router on hostA, not hostB. This would only work with distributed
  * conntrack state across all chassis. */
-struct ds match_in = DS_EMPTY_INITIALIZER;
-struct ds match_out = DS_EMPTY_INITIALIZER;
 
-ds_put_format(_in, "ip && inport == %s", op->json_key);
-ds_put_format(_out, "ip && outport == %s", op->json_key);
+const char *ingress_action = "next;";
+const char *egress_action = od->has_stateful_acl
+? "next;"
+: "ct_clear; next;";
+
+char *ingress_match = xasprintf("ip && inport == %s", op->json_key);
+char *egress_match = xasprintf("ip && outport == %s", op->json_key);
+
 ovn_lflow_add_with_lport_and_hint(lflows, od, in_stage, priority,
-  ds_cstr(_in), "next;", op->key,
-  >nbsp->header_);
+  ingress_match, ingress_action,
+  op->key, >nbsp->header_);
 ovn_lflow_add_with_lport_and_hint(lflows, od, out_stage, priority,
-  ds_cstr(_out), "next;", op->key,
-  >nbsp->header_);
+  egress_match, egress_action,
+  op->key, >nbsp->header_);
 
-ds_destroy(_in);
-ds_destroy(_out);
+free(ingress_match);
+free(egress_match);
 }
 
 static void
diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml
index 2eab2c4ae..efadfe808 100644
--- a/northd/ovn-northd.8.xml
+++ b/northd/ovn-northd.8.xml
@@ -2056,6 +2056,17 @@ output;
   db="OVN_Northbound"/> table.
 
 
+
+  This table also has a priority-110 flow with the match
+  outport == I for all logical switch
+  datapaths to move traffic to the next table, and, if there are no
+  stateful_acl, clear the ct_state. Where I
+  is the peer of a logical router port. This flow is added to
+  skip the connection tracking of packets which will be entering
+  logical router datapath from logical switch datapath for routing.
+
+
+
 Egress Table 2: Pre-stateful
 
 
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 3fa02d2b3..d0f6893e9 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -4111,6 +4111,7 @@ check ovn-nbctl lsp-set-options sw0-lr0 
router-port=lr0-sw0
 check ovn-nbctl --wait=sb sync
 
 check_stateful_flows() {
+action=$1
 ovn-sbctl dump-flows sw0 > sw0flows
 AT_CAPTURE_FILE([sw0flows])
 
@@ -4144,12 +4145,12 @@ check_stateful_flows() {
   table=??(ls_in_stateful ), priority=100  , match=(reg0[[1]] == 1 && 
reg0[[13]] == 1), action=(ct_commit { ct_mark.blocked = 0; ct_label.label = 
reg3; }; next;)
 ])
 
-AT_CHECK([grep "ls_out_pre_lb" sw0flows | sort], [0], [dnl
+AT_CHECK_UNQUOTED([grep "ls_out_pre_lb" sw0flows | sort], [0], [dnl
   table=1 (ls_out_pre_lb  ), priority=0, match=(1), action=(next;)
   table=1 (ls_out_pre_lb  ), priority=100  , match=(ip), action=(reg0[[2]] 
= 1; next;)
   table=1 (ls_out_pre_lb  ), priority=110  , match=(eth.mcast), 
action=(next;)
-  table=1 (ls_out_pre_lb  ), priority=110  , match=(eth.src == 
$svc_monitor_mac), action=(next;)
-  table=1 (ls_out_pre_lb  ), priority=110  , match=(ip && outport == 
"sw0-lr0"), action=(next;)
+  

Re: [ovs-dev] [PATCH v22 4/8] netdev-offload-tc: Add sFlow offload API for TC

2023-03-06 Thread Chris Mi via dev

On 3/6/2023 4:51 PM, Eelco Chaudron wrote:


On 6 Mar 2023, at 9:19, Chris Mi wrote:


On 3/1/2023 9:23 PM, Ilya Maximets wrote:

On 3/1/23 14:21, Ilya Maximets wrote:

On 3/1/23 14:08, Chris Mi wrote:

On 3/1/2023 8:44 PM, Ilya Maximets wrote:

On 2/28/23 14:05, Chris Mi wrote:

On 2/24/2023 4:16 AM, Ilya Maximets wrote:

On 2/23/23 12:26, Chris Mi wrote:

Initialize psample socket. Add sFlow recv API to receive sampled
packets from psample socket. Add sFow recv wait API to add psample
socket fd to poll list.

Signed-off-by: Chris Mi
Reviewed-by: Roi Dayan
---
   lib/dpif.h|   7 ++
   lib/netdev-offload-provider.h |  11 ++
   lib/netdev-offload-tc.c   | 188 ++
   3 files changed, 206 insertions(+)





+{
+int read_tries = 0;
+
+if (!psample_sock) {
+return ENOENT;
+}
+
+for (;;) {
+struct offload_psample psample;
+struct offload_sflow sflow;
+int error;
+
+if (++read_tries > 50) {
+return EAGAIN;
+}
+
+error = nl_sock_recv(psample_sock, buf, NULL, false);
+if (error == ENOBUFS) {
+continue;
+}
+
+if (error) {
+if (error == EAGAIN) {
+break;
+}
+return error;
+}
+
+error = psample_from_ofpbuf(, buf);
+if (!error) {
+return psample_parse_packet(, , upcall);
+} else if (error) {

Condition here is always true.

I copied from dpif_netlink_recv_cpu_dispatch(). And I also think it is ok.

The 'if' condition in dpif_netlink_recv_cpu_dispatch() is different
and so the 'else if (error)' is not always true there.  But it is
always true here, hence makes no practical sense.

OK, I see. I'll change the code like this:

      error = psample_from_ofpbuf(, buf);
      if (!error) {
      return psample_parse_packet(, upcall);
      } else {
      return error;
      }

The 'else' condition is unnecessary, since we do always return
at this point.  In the dpif-netlink code there is a case where
we continue, but here there is no such case.

So, I'd suggest:

  error = psample_from_ofpbuf(, buf);
  if (error) {
  return error;
  }

  return psample_parse_packet(, upcall);

Or even just a ternary operator:

  error = psample_from_ofpbuf(, buf);

  return error ? error : psample_parse_packet(, upcall);

Just in case, please wait for a proper review on v24 from someone
before sending a new version.  There is no point to re-spin the
set just for this.

OK.

Eelco,

You reviewed my previous versions and acked most of them. But since Ilya 
suggested
a new design, the code was changed greatly. So I removed the Acked-by for you. 
I'm wondering
if you have time to review again recently?

Yes, it’s on my todo for this week, but might slip into next week…

//Eelco

No problem.

Thanks,
Chris
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v22 4/8] netdev-offload-tc: Add sFlow offload API for TC

2023-03-06 Thread Eelco Chaudron


On 6 Mar 2023, at 9:19, Chris Mi wrote:

> On 3/1/2023 9:23 PM, Ilya Maximets wrote:
>> On 3/1/23 14:21, Ilya Maximets wrote:
>>> On 3/1/23 14:08, Chris Mi wrote:
 On 3/1/2023 8:44 PM, Ilya Maximets wrote:
> On 2/28/23 14:05, Chris Mi wrote:
>> On 2/24/2023 4:16 AM, Ilya Maximets wrote:
>>> On 2/23/23 12:26, Chris Mi wrote:
 Initialize psample socket. Add sFlow recv API to receive sampled
 packets from psample socket. Add sFow recv wait API to add psample
 socket fd to poll list.

 Signed-off-by: Chris Mi
 Reviewed-by: Roi Dayan
 ---
   lib/dpif.h|   7 ++
   lib/netdev-offload-provider.h |  11 ++
   lib/netdev-offload-tc.c   | 188 
 ++
   3 files changed, 206 insertions(+)

> 
>
 +{
 +int read_tries = 0;
 +
 +if (!psample_sock) {
 +return ENOENT;
 +}
 +
 +for (;;) {
 +struct offload_psample psample;
 +struct offload_sflow sflow;
 +int error;
 +
 +if (++read_tries > 50) {
 +return EAGAIN;
 +}
 +
 +error = nl_sock_recv(psample_sock, buf, NULL, false);
 +if (error == ENOBUFS) {
 +continue;
 +}
 +
 +if (error) {
 +if (error == EAGAIN) {
 +break;
 +}
 +return error;
 +}
 +
 +error = psample_from_ofpbuf(, buf);
 +if (!error) {
 +return psample_parse_packet(, , upcall);
 +} else if (error) {
>>> Condition here is always true.
>> I copied from dpif_netlink_recv_cpu_dispatch(). And I also think it is 
>> ok.
> The 'if' condition in dpif_netlink_recv_cpu_dispatch() is different
> and so the 'else if (error)' is not always true there.  But it is
> always true here, hence makes no practical sense.
 OK, I see. I'll change the code like this:

      error = psample_from_ofpbuf(, buf);
      if (!error) {
      return psample_parse_packet(, upcall);
      } else {
      return error;
      }
>>> The 'else' condition is unnecessary, since we do always return
>>> at this point.  In the dpif-netlink code there is a case where
>>> we continue, but here there is no such case.
>>>
>>> So, I'd suggest:
>>>
>>>  error = psample_from_ofpbuf(, buf);
>>>  if (error) {
>>>  return error;
>>>  }
>>>
>>>  return psample_parse_packet(, upcall);
>>>
>>> Or even just a ternary operator:
>>>
>>>  error = psample_from_ofpbuf(, buf);
>>>
>>>  return error ? error : psample_parse_packet(, upcall);
>> Just in case, please wait for a proper review on v24 from someone
>> before sending a new version.  There is no point to re-spin the
>> set just for this.
> OK.
>
> Eelco,
>
> You reviewed my previous versions and acked most of them. But since Ilya 
> suggested
> a new design, the code was changed greatly. So I removed the Acked-by for 
> you. I'm wondering
> if you have time to review again recently?

Yes, it’s on my todo for this week, but might slip into next week…

//Eelco

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v22 4/8] netdev-offload-tc: Add sFlow offload API for TC

2023-03-06 Thread Chris Mi via dev

On 3/1/2023 9:23 PM, Ilya Maximets wrote:

On 3/1/23 14:21, Ilya Maximets wrote:

On 3/1/23 14:08, Chris Mi wrote:

On 3/1/2023 8:44 PM, Ilya Maximets wrote:

On 2/28/23 14:05, Chris Mi wrote:

On 2/24/2023 4:16 AM, Ilya Maximets wrote:

On 2/23/23 12:26, Chris Mi wrote:

Initialize psample socket. Add sFlow recv API to receive sampled
packets from psample socket. Add sFow recv wait API to add psample
socket fd to poll list.

Signed-off-by: Chris Mi
Reviewed-by: Roi Dayan
---
  lib/dpif.h|   7 ++
  lib/netdev-offload-provider.h |  11 ++
  lib/netdev-offload-tc.c   | 188 ++
  3 files changed, 206 insertions(+)





+{
+int read_tries = 0;
+
+if (!psample_sock) {
+return ENOENT;
+}
+
+for (;;) {
+struct offload_psample psample;
+struct offload_sflow sflow;
+int error;
+
+if (++read_tries > 50) {
+return EAGAIN;
+}
+
+error = nl_sock_recv(psample_sock, buf, NULL, false);
+if (error == ENOBUFS) {
+continue;
+}
+
+if (error) {
+if (error == EAGAIN) {
+break;
+}
+return error;
+}
+
+error = psample_from_ofpbuf(, buf);
+if (!error) {
+return psample_parse_packet(, , upcall);
+} else if (error) {

Condition here is always true.

I copied from dpif_netlink_recv_cpu_dispatch(). And I also think it is ok.

The 'if' condition in dpif_netlink_recv_cpu_dispatch() is different
and so the 'else if (error)' is not always true there.  But it is
always true here, hence makes no practical sense.

OK, I see. I'll change the code like this:

     error = psample_from_ofpbuf(, buf);
     if (!error) {
     return psample_parse_packet(, upcall);
     } else {
     return error;
     }

The 'else' condition is unnecessary, since we do always return
at this point.  In the dpif-netlink code there is a case where
we continue, but here there is no such case.

So, I'd suggest:

 error = psample_from_ofpbuf(, buf);
 if (error) {
 return error;
 }

 return psample_parse_packet(, upcall);

Or even just a ternary operator:

 error = psample_from_ofpbuf(, buf);

 return error ? error : psample_parse_packet(, upcall);

Just in case, please wait for a proper review on v24 from someone
before sending a new version.  There is no point to re-spin the
set just for this.

OK.

Eelco,

You reviewed my previous versions and acked most of them. But since Ilya 
suggested
a new design, the code was changed greatly. So I removed the Acked-by 
for you. I'm wondering

if you have time to review again recently?

Thanks,
Chris



Best regards, Ilya Maximets

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev