Re: [ovs-dev] [PATCH ovn] northd: Fix an issue wrt mac binding aging.

2024-05-15 Thread Ales Musil
On Thu, May 16, 2024 at 7:24 AM Indrajitt Valsaraj <
indrajitt.valsa...@nutanix.com> wrote:

> Issue:
> In case of a Logical_Router without mac_binding_age_threshold set or a
> Logical_Router with an incorrectly formatted mac_binding_threshold option,
> entries were not being purged from the Mac Binding table in SouthBound.
>
> This was because in the function `en_mac_binding_aging_run` in case of
> an invalid mac_binding_threshold entry or if mac_binding_threshold is
> not set we are returning from the loop instead of iterating through the
> remaining LRs. As a result, subsequent runs of the aging_waker node are
> also not scehduled and we end up not purging any MAC Bindings.
>
> Fix:
> This patch fixes this issue by changing the return to a continue so that
> we skip the current LR but continue processing for the remaining LRs.
>
> Fixes: 78851b6ffb58 ("northd: Support CIDR-based MAC binding aging
> threshold.")
> Signed-off-by: Indrajitt Valsaraj 
> Acked-by: Naveen Yerramneni 
> ---
>

Hi Indrajitt,

thank you for the fix I have one comment down below.


>  northd/aging.c |   2 +-
>  tests/ovn.at   | 111 -
>  2 files changed, 74 insertions(+), 39 deletions(-)
>
> diff --git a/northd/aging.c b/northd/aging.c
> index b76963a2d..9685044e7 100644
> --- a/northd/aging.c
> +++ b/northd/aging.c
> @@ -421,7 +421,7 @@ en_mac_binding_aging_run(struct engine_node *node,
> void *data OVS_UNUSED)
>  if (!parse_aging_threshold(smap_get(>nbr->options,
>  "mac_binding_age_threshold"),
> _config)) {
> -return;
> +continue;
>  }
>
>  aging_context_set_threshold(, _config);
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 486680649..28a62d03d 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -34407,32 +34407,40 @@ ovn_start
>  net_add n1
>
>  AT_CHECK([ovn-nbctl ls-add public])
> -AT_CHECK([ovn-nbctl ls-add internal])
> +AT_CHECK([ovn-nbctl ls-add internal-1])
>

Why do we need to rename internal to internal-1? There is only 1 anyway. So
it can very much remain just internal.


>
>  AT_CHECK([ovn-nbctl lsp-add public ln_port])
>  AT_CHECK([ovn-nbctl lsp-set-addresses ln_port unknown])
>  AT_CHECK([ovn-nbctl lsp-set-type ln_port localnet])
>  AT_CHECK([ovn-nbctl lsp-set-options ln_port network_name=physnet1])
>
> -AT_CHECK([ovn-nbctl lsp-add public public-gw])
> -AT_CHECK([ovn-nbctl lsp-set-type public-gw router])
> -AT_CHECK([ovn-nbctl lsp-set-addresses public-gw 00:00:00:00:10:00 router])
> -AT_CHECK([ovn-nbctl lsp-set-options public-gw router-port=gw-public])
> +AT_CHECK([ovn-nbctl lsp-add public public-gw-1])
> +AT_CHECK([ovn-nbctl lsp-set-type public-gw-1 router])
> +AT_CHECK([ovn-nbctl lsp-set-addresses public-gw-1 00:00:00:00:10:00
> router])
> +AT_CHECK([ovn-nbctl lsp-set-options public-gw-1 router-port=gw-1-public])
>
> -AT_CHECK([ovn-nbctl lsp-add internal internal-gw])
> -AT_CHECK([ovn-nbctl lsp-set-type internal-gw router])
> -AT_CHECK([ovn-nbctl lsp-set-addresses internal-gw 00:00:00:00:20:00
> router])
> -AT_CHECK([ovn-nbctl lsp-set-options internal-gw router-port=gw-internal])
> +AT_CHECK([ovn-nbctl lsp-add public public-gw-2])
> +AT_CHECK([ovn-nbctl lsp-set-type public-gw-2 router])
> +AT_CHECK([ovn-nbctl lsp-set-addresses public-gw-2 00:00:00:00:30:00
> router])
> +AT_CHECK([ovn-nbctl lsp-set-options public-gw-2 router-port=gw-2-public])
>
> -AT_CHECK([ovn-nbctl lsp-add internal vif1])
> +AT_CHECK([ovn-nbctl lsp-add internal-1 internal-gw-1])
> +AT_CHECK([ovn-nbctl lsp-set-type internal-gw-1 router])
> +AT_CHECK([ovn-nbctl lsp-set-addresses internal-gw-1 00:00:00:00:20:00
> router])
> +AT_CHECK([ovn-nbctl lsp-set-options internal-gw-1
> router-port=gw-internal-1])
> +
> +AT_CHECK([ovn-nbctl lsp-add internal-1 vif1])
>  AT_CHECK([ovn-nbctl lsp-set-addresses vif1 "00:00:00:00:20:10
> 192.168.20.10"])
>
> -AT_CHECK([ovn-nbctl lsp-add internal vif2])
> +AT_CHECK([ovn-nbctl lsp-add internal-1 vif2])
>  AT_CHECK([ovn-nbctl lsp-set-addresses vif2 "00:00:00:00:20:20
> 192.168.20.20"])
>
> -AT_CHECK([ovn-nbctl lr-add gw])
> -AT_CHECK([ovn-nbctl lrp-add gw gw-public 00:00:00:00:10:00
> 192.168.10.1/24])
> -AT_CHECK([ovn-nbctl lrp-add gw gw-internal 00:00:00:00:20:00
> 192.168.20.1/24])
> +AT_CHECK([ovn-nbctl lr-add gw-1])
> +AT_CHECK([ovn-nbctl lrp-add gw-1 gw-1-public 00:00:00:00:10:00
> 192.168.10.1/24])
> +AT_CHECK([ovn-nbctl lrp-add gw-1 gw-internal-1 00:00:00:00:20:00
> 192.168.20.1/24])
> +
> +AT_CHECK([ovn-nbctl lr-add gw-2])
> +AT_CHECK([ovn-nbctl lrp-add gw-2 gw-2-public 00:00:00:00:30:00
> 192.168.10.2/24])
>
>  sim_add hv1
>  as hv1
> @@ -34500,21 +34508,27 @@ send_udp() {
>  as $hv ovs-appctl netdev-dummy/receive $dev $packet
>  }
>  # Check if the option is not present by default
> -AT_CHECK([fetch_column nb:logical_router options name="gw" | grep -q
> mac_binding_age_threshold], [1])
> +AT_CHECK([fetch_column 

[ovs-dev] [PATCH ovn] northd: Fix an issue wrt mac binding aging.

2024-05-15 Thread Indrajitt Valsaraj
Issue:
In case of a Logical_Router without mac_binding_age_threshold set or a
Logical_Router with an incorrectly formatted mac_binding_threshold option,
entries were not being purged from the Mac Binding table in SouthBound.

This was because in the function `en_mac_binding_aging_run` in case of
an invalid mac_binding_threshold entry or if mac_binding_threshold is
not set we are returning from the loop instead of iterating through the
remaining LRs. As a result, subsequent runs of the aging_waker node are
also not scehduled and we end up not purging any MAC Bindings.

Fix:
This patch fixes this issue by changing the return to a continue so that
we skip the current LR but continue processing for the remaining LRs.

Fixes: 78851b6ffb58 ("northd: Support CIDR-based MAC binding aging threshold.")
Signed-off-by: Indrajitt Valsaraj 
Acked-by: Naveen Yerramneni 
---
 northd/aging.c |   2 +-
 tests/ovn.at   | 111 -
 2 files changed, 74 insertions(+), 39 deletions(-)

diff --git a/northd/aging.c b/northd/aging.c
index b76963a2d..9685044e7 100644
--- a/northd/aging.c
+++ b/northd/aging.c
@@ -421,7 +421,7 @@ en_mac_binding_aging_run(struct engine_node *node, void 
*data OVS_UNUSED)
 if (!parse_aging_threshold(smap_get(>nbr->options,
 "mac_binding_age_threshold"),
_config)) {
-return;
+continue;
 }

 aging_context_set_threshold(, _config);
diff --git a/tests/ovn.at b/tests/ovn.at
index 486680649..28a62d03d 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -34407,32 +34407,40 @@ ovn_start
 net_add n1

 AT_CHECK([ovn-nbctl ls-add public])
-AT_CHECK([ovn-nbctl ls-add internal])
+AT_CHECK([ovn-nbctl ls-add internal-1])

 AT_CHECK([ovn-nbctl lsp-add public ln_port])
 AT_CHECK([ovn-nbctl lsp-set-addresses ln_port unknown])
 AT_CHECK([ovn-nbctl lsp-set-type ln_port localnet])
 AT_CHECK([ovn-nbctl lsp-set-options ln_port network_name=physnet1])

-AT_CHECK([ovn-nbctl lsp-add public public-gw])
-AT_CHECK([ovn-nbctl lsp-set-type public-gw router])
-AT_CHECK([ovn-nbctl lsp-set-addresses public-gw 00:00:00:00:10:00 router])
-AT_CHECK([ovn-nbctl lsp-set-options public-gw router-port=gw-public])
+AT_CHECK([ovn-nbctl lsp-add public public-gw-1])
+AT_CHECK([ovn-nbctl lsp-set-type public-gw-1 router])
+AT_CHECK([ovn-nbctl lsp-set-addresses public-gw-1 00:00:00:00:10:00 router])
+AT_CHECK([ovn-nbctl lsp-set-options public-gw-1 router-port=gw-1-public])

-AT_CHECK([ovn-nbctl lsp-add internal internal-gw])
-AT_CHECK([ovn-nbctl lsp-set-type internal-gw router])
-AT_CHECK([ovn-nbctl lsp-set-addresses internal-gw 00:00:00:00:20:00 router])
-AT_CHECK([ovn-nbctl lsp-set-options internal-gw router-port=gw-internal])
+AT_CHECK([ovn-nbctl lsp-add public public-gw-2])
+AT_CHECK([ovn-nbctl lsp-set-type public-gw-2 router])
+AT_CHECK([ovn-nbctl lsp-set-addresses public-gw-2 00:00:00:00:30:00 router])
+AT_CHECK([ovn-nbctl lsp-set-options public-gw-2 router-port=gw-2-public])

-AT_CHECK([ovn-nbctl lsp-add internal vif1])
+AT_CHECK([ovn-nbctl lsp-add internal-1 internal-gw-1])
+AT_CHECK([ovn-nbctl lsp-set-type internal-gw-1 router])
+AT_CHECK([ovn-nbctl lsp-set-addresses internal-gw-1 00:00:00:00:20:00 router])
+AT_CHECK([ovn-nbctl lsp-set-options internal-gw-1 router-port=gw-internal-1])
+
+AT_CHECK([ovn-nbctl lsp-add internal-1 vif1])
 AT_CHECK([ovn-nbctl lsp-set-addresses vif1 "00:00:00:00:20:10 192.168.20.10"])

-AT_CHECK([ovn-nbctl lsp-add internal vif2])
+AT_CHECK([ovn-nbctl lsp-add internal-1 vif2])
 AT_CHECK([ovn-nbctl lsp-set-addresses vif2 "00:00:00:00:20:20 192.168.20.20"])

-AT_CHECK([ovn-nbctl lr-add gw])
-AT_CHECK([ovn-nbctl lrp-add gw gw-public 00:00:00:00:10:00 192.168.10.1/24])
-AT_CHECK([ovn-nbctl lrp-add gw gw-internal 00:00:00:00:20:00 192.168.20.1/24])
+AT_CHECK([ovn-nbctl lr-add gw-1])
+AT_CHECK([ovn-nbctl lrp-add gw-1 gw-1-public 00:00:00:00:10:00 
192.168.10.1/24])
+AT_CHECK([ovn-nbctl lrp-add gw-1 gw-internal-1 00:00:00:00:20:00 
192.168.20.1/24])
+
+AT_CHECK([ovn-nbctl lr-add gw-2])
+AT_CHECK([ovn-nbctl lrp-add gw-2 gw-2-public 00:00:00:00:30:00 
192.168.10.2/24])

 sim_add hv1
 as hv1
@@ -34500,21 +34508,27 @@ send_udp() {
 as $hv ovs-appctl netdev-dummy/receive $dev $packet
 }
 # Check if the option is not present by default
-AT_CHECK([fetch_column nb:logical_router options name="gw" | grep -q 
mac_binding_age_threshold], [1])
+AT_CHECK([fetch_column nb:logical_router options name="gw-1" | grep -q 
mac_binding_age_threshold], [1])
+AT_CHECK([fetch_column nb:logical_router options name="gw-2" | grep -q 
mac_binding_age_threshold], [1])

 # Send GARP to populate MAC binding table records
 send_garp hv1 ext1 10
 send_garp hv2 ext2 20

-wait_row_count mac_binding 1 ip="192.168.10.10"
-wait_row_count mac_binding 1 ip="192.168.10.20"
+# Two rows present for each IP, one corresponding to each logical_port
+wait_row_count mac_binding 2 ip="192.168.10.10"

Re: [ovs-dev] [PATCH ovn] ovn-ctl: Support for --config-file ovsdb-server option.

2024-05-15 Thread Vladislav Odintsov
Thanks Numan!

regards,
Vladislav Odintsov

> On 15 May 2024, at 23:55, Numan Siddique  wrote:
> 
> On Fri, May 3, 2024 at 2:05 AM Ales Musil  wrote:
>> 
>>> On Tue, Apr 23, 2024 at 6:43 PM Vladislav Odintsov 
>>> wrote:
>>> 
>>> Since OVS 3.3.0 ovsdb-server accepts databases and remotes configuration
>>> via JSON text file.  This patch adds support for such option.
>>> 
>>> Signed-off-by: Vladislav Odintsov 
> 
> Thanks for the patch.
> 
> I applied this with the below changes
> 
> -
> diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
> index fd1ae12567..a4f410e4f7 100755
> --- a/utilities/ovn-ctl
> +++ b/utilities/ovn-ctl
> @@ -1242,8 +1242,7 @@ File location options:
>   --db-sb-relay-sock=SOCKET  OVN_IC_Northbound db socket (default:
> $DB_SB_RELAY_SOCK)
>   --db-sb-relay-pidfile=FILE OVN_Southbound relay db pidfile
> (default: $DB_SB_RELAY_CTRL_PIDFILE)
>   --db-sb-relay-ctrl-sock=SOCKET OVN_Southbound relay db control
> socket (default: $DB_SB_RELAY_CTRL_SOCK)
> -  --db-sb-relay-config-file=FILE OVN_IC_Northbound ovsdb-server
> configuration file
> - Mutually exclusive with
> --db-ic-nb-use-remote-in-db=yes.
> +  --db-sb-relay-config-file=FILE OVN_Southbound relay ovsdb-server
> configuration file.

Oops, copy-paste typo.

>   --ovn-sb-relay-db-ssl-key=KEY OVN_Southbound DB relay SSL private key file
>   --ovn-sb-relay-db-ssl-cert=CERT OVN_Southbound DB relay SSL certificate file
>   --ovn-sb-relay-db-ssl-ca-cert=CERT OVN OVN_Southbound DB relay SSL
> CA certificate file
> diff --git a/utilities/ovn-ctl.8.xml b/utilities/ovn-ctl.8.xml
> index c0fbb0792d..4f21ba4ea3 100644
> --- a/utilities/ovn-ctl.8.xml
> +++ b/utilities/ovn-ctl.8.xml
> @@ -86,6 +86,11 @@
> --db-ic-sb-schema=FILE
> --db-ic-sb-create-insecure-remote=yes|no
> --db-ic-nb-create-insecure-remote=yes|no
> +--db-nb-config-file=FILE
> +--db-sb-config-file=FILE
> +--db-ic-nb-config-file=FILE
> +--db-ic-sb-config-file=FILE
> +--db-sb-relay-config-file=FILE
> --ovn-controller-ssl-key=KEY
> --ovn-controller-ssl-cert=CERT
> --ovn-controller-ssl-ca-cert=CERT
> 
> -
> 
> 
> Numan
> 
>>> ---
>>> NEWS  |  1 +
>>> utilities/ovn-ctl | 39 +++
>>> 2 files changed, 36 insertions(+), 4 deletions(-)
>>> 
>>> diff --git a/NEWS b/NEWS
>>> index 9adf6a31c..39ea88d78 100644
>>> --- a/NEWS
>>> +++ b/NEWS
>>> @@ -16,6 +16,7 @@ Post v24.03.0
>>>   - Remove "ovn-set-local-ip" config option from vswitchd
>>> external-ids, the option is no longer needed as it became effectively
>>> "true" for all scenarios.
>>> +  - Add support for ovsdb-server `--config-file` option in ovn-ctl.
>>> 
>>> OVN v24.03.0 - 01 Mar 2024
>>> --
>>> diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
>>> index dae5e22f4..fd1ae1256 100755
>>> --- a/utilities/ovn-ctl
>>> +++ b/utilities/ovn-ctl
>>> @@ -169,6 +169,7 @@ start_ovsdb__() {
>>> local sync_from_port
>>> local file
>>> local schema
>>> +local config_file
>>> local logfile
>>> local log
>>> local sock
>>> @@ -199,6 +200,7 @@ start_ovsdb__() {
>>> eval sync_from_port=\$DB_${DB}_SYNC_FROM_PORT
>>> eval file=\$DB_${DB}_FILE
>>> eval schema=\$DB_${DB}_SCHEMA
>>> +eval config_file=\$DB_${DB}_CONFIG_FILE
>>> eval logfile=\$OVN_${DB}_LOGFILE
>>> eval log=\$OVN_${DB}_LOG
>>> eval sock=\$DB_${DB}_SOCK
>>> @@ -281,7 +283,12 @@ $cluster_remote_port
>>> 
>>> set ovsdb-server
>>> set "$@" $log --log-file=$logfile
>>> -set "$@" --remote=punix:$sock --pidfile=$db_pid_file
>>> +set "$@" --pidfile=$db_pid_file
>>> +if test X"$config_file" == X; then
>>> +set "$@" --remote=punix:$sock
>>> +else
>>> +set "$@" --config-file=$config_file
>>> +fi
>>> set "$@" --unixctl=$ctrl_sock
>>> 
>>> [ "$OVN_USER" != "" ] && set "$@" --user "$OVN_USER"
>>> @@ -297,7 +304,7 @@ $cluster_remote_port
>>> set exec "$@"
>>> fi
>>> 
>>> -if test X"$use_remote_in_db" != Xno; then
>>> +if test X"$use_remote_in_db" != Xno && test X"$config_file" == X; then
>>> set "$@" --remote=db:$schema_name,$table_name,connections
>>> fi
>>> 
>>> @@ -343,6 +350,11 @@ $cluster_remote_port
>>> 
>>> local run_ovsdb_in_bg="no"
>>> local process_id=
>>> +
>>> +if test X$config_file = X; then
>>> +set "$@" "$file"
>>> +fi
>>> +
>>> if test X$detach = Xno && test $mode = cluster && test -z
>>> "$cluster_remote_addr" ; then
>>> # When detach is no (for run_nb_ovsdb/run_sb_ovsdb commands)
>>> # we want to run ovsdb-server in background rather than running
>>> it in
>>> @@ -351,10 +363,10 @@ $cluster_remote_port
>>> # Note: We run only the ovsdb-server in backgroud which created
>>> the
>>> # cluster (i.e cluster_remote_addr is not 

Re: [ovs-dev] [PATCH ovn] docs: List supported rolling upgrade paths.

2024-05-15 Thread Numan Siddique
On Fri, May 3, 2024 at 1:43 AM Ales Musil  wrote:
>
> On Fri, Apr 26, 2024 at 10:49 PM Ihar Hrachyshka 
> wrote:
>
> > The wording above is not completely clear without these scenarios
> > listed. A confused reader may incorrectly read it as:
> >
> > ```
> > Only LTS-to-LTS is supported for rolling upgrades.
> > ```
> >
> > which is wrong.
> >
> > Signed-off-by: Ihar Hrachyshka 
> > ---
> >  Documentation/intro/install/ovn-upgrades.rst | 5 +
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/Documentation/intro/install/ovn-upgrades.rst
> > b/Documentation/intro/install/ovn-upgrades.rst
> > index 1f99a86ec..f3dea07dc 100644
> > --- a/Documentation/intro/install/ovn-upgrades.rst
> > +++ b/Documentation/intro/install/ovn-upgrades.rst
> > @@ -74,6 +74,11 @@ To avoid buildup of complexity and technical debt we
> > limit the span of versions
> >  supported for a rolling upgrade on :ref:`long-term-support` (LTS), and it
> >  should always be possible to upgrade from the previous LTS version to the
> > next.
> >
> > +The following rolling upgrade paths are supported:
> > +
> > +1. LTS to the very next LTS release, or to any non-LTS in between the two.
> > +2. Any non-LTS to the very next LTS release.
> > +
> >  The first LTS version of OVN was 22.03.  If you want to upgrade between
> > other
> >  versions, you can use the `Fail-safe upgrade`_ procedure.
> >
> > --
> > 2.41.0
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> >
> Looks good to me, thanks.
>
> Acked-by: Ales Musil 

Thanks.  Applied to main.

Numan


>
> --
>
> Ales Musil
>
> Senior Software Engineer - OVN Core
>
> Red Hat EMEA 
>
> amu...@redhat.com
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH ovn] docs: Explain nature of ovs dependency.

2024-05-15 Thread Numan Siddique
On Fri, May 3, 2024 at 1:43 AM Ales Musil  wrote:
>
> On Fri, Apr 26, 2024 at 10:35 PM Ihar Hrachyshka 
> wrote:
>
> > The dependency is during build time, not runtime.
> >
> > Signed-off-by: Ihar Hrachyshka 
> > ---
> >  Documentation/intro/install/ovn-upgrades.rst | 18 --
> >  1 file changed, 12 insertions(+), 6 deletions(-)
> >
> > diff --git a/Documentation/intro/install/ovn-upgrades.rst
> > b/Documentation/intro/install/ovn-upgrades.rst
> > index bb387e2f8..1f99a86ec 100644
> > --- a/Documentation/intro/install/ovn-upgrades.rst
> > +++ b/Documentation/intro/install/ovn-upgrades.rst
> > @@ -40,13 +40,19 @@ Release Notes
> >  You should always check the OVS and OVN release notes (NEWS file) for any
> >  release specific notes on upgrades.
> >
> > -OVS
> > 
> > +Open vSwitch
> > +
> >
> > -OVN depends on and is included with OVS.  It's expected that OVS and OVN
> > are
> > -upgraded together, partly for convenience.  OVN is included in OVS
> > releases
> > -so it's easiest to upgrade them together.  OVN may also make use of new
> > -features of OVS only available in that release.
> > +OVN compiles with a particular version of Open vSwitch.  This is a
> > build-time
> > +dependency.
> > +
> > +In runtime, OVN should be able to work with any reasonably fresh version
> > of
> > +Open vSwitch (not necessarily the version that it was compiled against.)
> > +
> > +OVN may make use of new runtime features of Open vSwitch that are only
> > +available in a particular release. OVN is expected to test for an Open
> > vSwitch
> > +feature presence before using it, and gracefully handle scenarios where
> > Open
> > +vSwitch doesn't support a particular optional feature, yet.
> >
> >  Upgrade procedures
> >  --
> > --
> > 2.41.0
> >
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> >
> >
> Looks good to me, thanks.
>
> Acked-by: Ales Musil 

Thanks.  Applied to main.

Numan

>
> --
>
> Ales Musil
>
> Senior Software Engineer - OVN Core
>
> Red Hat EMEA 
>
> amu...@redhat.com
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v29 0/8] Add offload support for sFlow

2024-05-15 Thread Ilya Maximets
On 3/26/24 03:30, Chris Mi wrote:
> This patch set adds offload support for sFlow.
> 
> Psample is a genetlink channel for packet sampling. TC action act_sample
> uses psample to send sampled packets to userspace.
> 
> When offloading sample action to TC, userspace creates a unique ID to
> map sFlow action and tunnel info and passes this ID to kernel instead
> of the sFlow info. psample will send this ID and sampled packet to
> userspace. Using the ID, userspace can recover the sFlow info and send
> sampled packet to the right sFlow monitoring host.

Hi, Chris, others.

I've been looking through the set and a psample in general for the past
few days.  There are few fairly minor issues in the set like UBsan crash
because of incorrect OVS_RETURNS_NONNULL annotation and a few style and
formatting issues, which are not hard to fix.  However, I encountered
an issue with a way tunnel metadata is recovered that I'm not sure we
can actually fix.

The algorithm of work in this patch set is described in the cover letter
above.  The key point is then packet-1 goes though MISS upcall we install
a new datapath flow into TC and we remember the tunnel metadata of the
packet-1 associated with a sample ID.  When the packet-2 hits the datapath
flow (tc actions), it gets sent to psample and ovs-vswitchd reads this
packet from psample netlink socket and presents it as an ACTION upcall.
Doing that it finds tunnel metadata using the sample ID and uses it as a
tunnel metadata for packet-2.  The key of the problem is that this is
metadata from packet-1 and it can be different from metadata of packet-2.

An example of this will be having two separate TCP streams going through
the same VxLAN tunnel.  For load balancing reasons most UDP tunnels
choose source port of the outer UDP header based on RSS or 5-tuple hash
of the inner packet.  This means that two TCP streams going through the
same tunnel will have different UDP source port in the outer header.

When a packet from a first stream hits a MISS upcall, datapath flow will
be installed and the sample ID will be associated with the metadata of
the first stream.  Chances are this datapath flow will not have exact
match on the source port of the tunnel metadata.  That means that a
packet from the second stream will successfully match on the installed
datapath flow and will go to psample with the ID of the first stream.
OVS will read it from the psample socket and send to sFlow collector
with metadata it found by the ID, i.e. with a tunnel metadata that
contains tunnel source port of the first TCP stream, which is incorrect.

The test below reproduces the issue.  It's not a real test, it always fails
on purpose and not very pretty, but what it does is that it creates a
tunnel, then starts netcat server on one side and sends two files to it
from the other side.  These two transfers are separate TCP connections, so
they use different source ports, that is causing the tunnel to choose
different UDP source ports for them.  After the test failure we can see in
the sflow.log that all the values for 'tunnel4_in_src_port' are the same
for both streams and some exchanges prior to that.  However, if offload
is disabled, the values will be different as they should.

So, unless we can ensure that packets from different streams do not match
on the same datapath flow, this schema doesn't work.

The only way to ensure that packets from different streams within the same
tunnel do not match on the same datapath flow is to always have an exact
match on the whole tunnel metadata.  But I don't think that is a good idea,
because this way we'll have a separate datapath flow per TCP stream, which
will be very bad for performance.  And it will be bad for hardware offload
since hardware NICs normally have a more limited amount of available
resources.

This leads us to conclusion that only non-tunneling traffic can be offloaded
this way.  For this to work we'll have to add an explicit match on tunnel
metadata being empty (can that be expressed in TC?)

But at this point a question arises if this even worth implementing, taking
into account that it requires a lot of infrastructure changes throughout all
the layers of OVS code just for sampling of non-tunnel packets, i.e. a very
narrow use case.

Also, my understanding was that there is a plan to offload other types of
userspace() action in the future, such as controller() action using the same
psample infrastructure.  But that will not be possible for the same reason.
In addition to tunnel metadata we will also have to provide connection tracking
information, which is even harder, because conntrack state is more dynamic and
is only actually known to the datapath.  psample can't supply this information
to the userspace.

Thoughts?

Best regards, Ilya Maximets.

P.S.  The test that reproduces the issue:
---
diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at
index ee4817e8e..51abc2782 100644
--- a/tests/system-offloads-traffic.at

Re: [ovs-dev] [PATCH ovn] ovn-ctl: Support for --config-file ovsdb-server option.

2024-05-15 Thread Numan Siddique
On Fri, May 3, 2024 at 2:05 AM Ales Musil  wrote:
>
> On Tue, Apr 23, 2024 at 6:43 PM Vladislav Odintsov 
> wrote:
>
> > Since OVS 3.3.0 ovsdb-server accepts databases and remotes configuration
> > via JSON text file.  This patch adds support for such option.
> >
> > Signed-off-by: Vladislav Odintsov 

Thanks for the patch.

I applied this with the below changes

-
diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
index fd1ae12567..a4f410e4f7 100755
--- a/utilities/ovn-ctl
+++ b/utilities/ovn-ctl
@@ -1242,8 +1242,7 @@ File location options:
   --db-sb-relay-sock=SOCKET  OVN_IC_Northbound db socket (default:
$DB_SB_RELAY_SOCK)
   --db-sb-relay-pidfile=FILE OVN_Southbound relay db pidfile
(default: $DB_SB_RELAY_CTRL_PIDFILE)
   --db-sb-relay-ctrl-sock=SOCKET OVN_Southbound relay db control
socket (default: $DB_SB_RELAY_CTRL_SOCK)
-  --db-sb-relay-config-file=FILE OVN_IC_Northbound ovsdb-server
configuration file
- Mutually exclusive with
--db-ic-nb-use-remote-in-db=yes.
+  --db-sb-relay-config-file=FILE OVN_Southbound relay ovsdb-server
configuration file.
   --ovn-sb-relay-db-ssl-key=KEY OVN_Southbound DB relay SSL private key file
   --ovn-sb-relay-db-ssl-cert=CERT OVN_Southbound DB relay SSL certificate file
   --ovn-sb-relay-db-ssl-ca-cert=CERT OVN OVN_Southbound DB relay SSL
CA certificate file
diff --git a/utilities/ovn-ctl.8.xml b/utilities/ovn-ctl.8.xml
index c0fbb0792d..4f21ba4ea3 100644
--- a/utilities/ovn-ctl.8.xml
+++ b/utilities/ovn-ctl.8.xml
@@ -86,6 +86,11 @@
 --db-ic-sb-schema=FILE
 --db-ic-sb-create-insecure-remote=yes|no
 --db-ic-nb-create-insecure-remote=yes|no
+--db-nb-config-file=FILE
+--db-sb-config-file=FILE
+--db-ic-nb-config-file=FILE
+--db-ic-sb-config-file=FILE
+--db-sb-relay-config-file=FILE
 --ovn-controller-ssl-key=KEY
 --ovn-controller-ssl-cert=CERT
 --ovn-controller-ssl-ca-cert=CERT

-


Numan

> > ---
> >  NEWS  |  1 +
> >  utilities/ovn-ctl | 39 +++
> >  2 files changed, 36 insertions(+), 4 deletions(-)
> >
> > diff --git a/NEWS b/NEWS
> > index 9adf6a31c..39ea88d78 100644
> > --- a/NEWS
> > +++ b/NEWS
> > @@ -16,6 +16,7 @@ Post v24.03.0
> >- Remove "ovn-set-local-ip" config option from vswitchd
> >  external-ids, the option is no longer needed as it became effectively
> >  "true" for all scenarios.
> > +  - Add support for ovsdb-server `--config-file` option in ovn-ctl.
> >
> >  OVN v24.03.0 - 01 Mar 2024
> >  --
> > diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl
> > index dae5e22f4..fd1ae1256 100755
> > --- a/utilities/ovn-ctl
> > +++ b/utilities/ovn-ctl
> > @@ -169,6 +169,7 @@ start_ovsdb__() {
> >  local sync_from_port
> >  local file
> >  local schema
> > +local config_file
> >  local logfile
> >  local log
> >  local sock
> > @@ -199,6 +200,7 @@ start_ovsdb__() {
> >  eval sync_from_port=\$DB_${DB}_SYNC_FROM_PORT
> >  eval file=\$DB_${DB}_FILE
> >  eval schema=\$DB_${DB}_SCHEMA
> > +eval config_file=\$DB_${DB}_CONFIG_FILE
> >  eval logfile=\$OVN_${DB}_LOGFILE
> >  eval log=\$OVN_${DB}_LOG
> >  eval sock=\$DB_${DB}_SOCK
> > @@ -281,7 +283,12 @@ $cluster_remote_port
> >
> >  set ovsdb-server
> >  set "$@" $log --log-file=$logfile
> > -set "$@" --remote=punix:$sock --pidfile=$db_pid_file
> > +set "$@" --pidfile=$db_pid_file
> > +if test X"$config_file" == X; then
> > +set "$@" --remote=punix:$sock
> > +else
> > +set "$@" --config-file=$config_file
> > +fi
> >  set "$@" --unixctl=$ctrl_sock
> >
> >  [ "$OVN_USER" != "" ] && set "$@" --user "$OVN_USER"
> > @@ -297,7 +304,7 @@ $cluster_remote_port
> >  set exec "$@"
> >  fi
> >
> > -if test X"$use_remote_in_db" != Xno; then
> > +if test X"$use_remote_in_db" != Xno && test X"$config_file" == X; then
> >  set "$@" --remote=db:$schema_name,$table_name,connections
> >  fi
> >
> > @@ -343,6 +350,11 @@ $cluster_remote_port
> >
> >  local run_ovsdb_in_bg="no"
> >  local process_id=
> > +
> > +if test X$config_file = X; then
> > +set "$@" "$file"
> > +fi
> > +
> >  if test X$detach = Xno && test $mode = cluster && test -z
> > "$cluster_remote_addr" ; then
> >  # When detach is no (for run_nb_ovsdb/run_sb_ovsdb commands)
> >  # we want to run ovsdb-server in background rather than running
> > it in
> > @@ -351,10 +363,10 @@ $cluster_remote_port
> >  # Note: We run only the ovsdb-server in backgroud which created
> > the
> >  # cluster (i.e cluster_remote_addr is not set.).
> >  run_ovsdb_in_bg="yes"
> > -"$@" $file &
> > +"$@" &
> >  process_id=$!
> >  else
> > -start_wrapped_daemon "$wrapper" ovsdb-$db "" "$@" 

[ovs-dev] [PATCH] ipf: Only add fragments to batch of same dl_type.

2024-05-15 Thread Mike Pattrick
When conntrack is reassembling packet fragments, the same reassembly
context can be shared across multiple threads handling different packets
simultaneously. Once a full packet is assembled, it is added to a packet
batch for processing, this is most likely the batch that added it in the
first place, but that isn't a guarantee.

The packets in these batches should be segregated by network protocol
versuib (ipv4 vs ipv6) for conntrack defragmentation to function
appropriately. However, there are conditions where we would add a
reassembled packet of one type to a batch of another.

This change introduces checks to make sure that reassembled or expired
fragments are only added to packet batches of the same type. It also
makes a best effort attempt to make sure the defragmented packet is
inserted into the current batch.

Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
Reported-at: https://issues.redhat.com/browse/FDP-560
Signed-off-by: Mike Pattrick 
---
Note: This solution is far from perfect, ipf.c can still insert packets
into more or less arbitrary batches but this bug fix is needed to avoid a
memory overrun and should insert packets into the proper batch in the
common case. I'm working on a more correct solution but it changes how
fragments are fundimentally handled, and couldn't be considered a bug fix.
---
 lib/ipf.c | 40 +++-
 1 file changed, 31 insertions(+), 9 deletions(-)

diff --git a/lib/ipf.c b/lib/ipf.c
index 7d74e2c13..90c819d63 100644
--- a/lib/ipf.c
+++ b/lib/ipf.c
@@ -506,13 +506,15 @@ ipf_reassemble_v6_frags(struct ipf_list *ipf_list)
 }
 
 /* Called when a frag list state transitions to another state. This is
- * triggered by new fragment for the list being received.*/
-static void
+* triggered by new fragment for the list being received. Returns a reassembled
+* packet if this fragment has completed one. */
+static struct reassembled_pkt *
 ipf_list_state_transition(struct ipf *ipf, struct ipf_list *ipf_list,
   bool ff, bool lf, bool v6)
 OVS_REQUIRES(ipf->ipf_lock)
 {
 enum ipf_list_state curr_state = ipf_list->state;
+struct reassembled_pkt *ret = NULL;
 enum ipf_list_state next_state;
 switch (curr_state) {
 case IPF_LIST_STATE_UNUSED:
@@ -562,12 +564,15 @@ ipf_list_state_transition(struct ipf *ipf, struct 
ipf_list *ipf_list,
 ipf_reassembled_list_add(>reassembled_pkt_list, rp);
 ipf_expiry_list_remove(ipf_list);
 next_state = IPF_LIST_STATE_COMPLETED;
+ret = rp;
 } else {
 next_state = IPF_LIST_STATE_REASS_FAIL;
 }
 }
 }
 ipf_list->state = next_state;
+
+return ret;
 }
 
 /* Some sanity checks are redundant, but prudent, in case code paths for
@@ -799,7 +804,8 @@ ipf_is_frag_duped(const struct ipf_frag *frag_list, int 
last_inuse_idx,
 static bool
 ipf_process_frag(struct ipf *ipf, struct ipf_list *ipf_list,
  struct dp_packet *pkt, uint16_t start_data_byte,
- uint16_t end_data_byte, bool ff, bool lf, bool v6)
+ uint16_t end_data_byte, bool ff, bool lf, bool v6,
+ struct reassembled_pkt **rp)
 OVS_REQUIRES(ipf->ipf_lock)
 {
 bool duped_frag = ipf_is_frag_duped(ipf_list->frag_list,
@@ -820,13 +826,14 @@ ipf_process_frag(struct ipf *ipf, struct ipf_list 
*ipf_list,
 ipf_list->last_inuse_idx++;
 atomic_count_inc(>nfrag);
 ipf_count(ipf, v6, IPF_NFRAGS_ACCEPTED);
-ipf_list_state_transition(ipf, ipf_list, ff, lf, v6);
+*rp = ipf_list_state_transition(ipf, ipf_list, ff, lf, v6);
 } else {
 OVS_NOT_REACHED();
 }
 } else {
 ipf_count(ipf, v6, IPF_NFRAGS_OVERLAP);
 pkt->md.ct_state = CS_INVALID;
+*rp = NULL;
 return false;
 }
 return true;
@@ -853,7 +860,8 @@ ipf_list_init(struct ipf_list *ipf_list, struct 
ipf_list_key *key,
  * to a list of fragemnts. */
 static bool
 ipf_handle_frag(struct ipf *ipf, struct dp_packet *pkt, ovs_be16 dl_type,
-uint16_t zone, long long now, uint32_t hash_basis)
+uint16_t zone, long long now, uint32_t hash_basis,
+struct reassembled_pkt **rp)
 OVS_REQUIRES(ipf->ipf_lock)
 {
 struct ipf_list_key key;
@@ -922,7 +930,7 @@ ipf_handle_frag(struct ipf *ipf, struct dp_packet *pkt, 
ovs_be16 dl_type,
 }
 
 return ipf_process_frag(ipf, ipf_list, pkt, start_data_byte,
-end_data_byte, ff, lf, v6);
+end_data_byte, ff, lf, v6, rp);
 }
 
 /* Filters out fragments from a batch of fragments and adjust the batch. */
@@ -933,6 +941,7 @@ ipf_extract_frags_from_batch(struct ipf *ipf, struct 
dp_packet_batch *pb,
 {
 const size_t pb_cnt = dp_packet_batch_size(pb);
 int pb_idx; /* Index in a packet batch. */
+struct 

Re: [ovs-dev] vlog: Destroy async_append first then close log_fd.

2024-05-15 Thread Simon Horman
On Wed, May 15, 2024 at 11:28:21AM +0800, hepeng via dev wrote:
> From: Peng He 
> 
> async_append stores log_fd, it should be destructed before log_fd
> is closed.
> 
> Signed-off-by: Peng He 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] table: Fix freeing global variable.

2024-05-15 Thread Simon Horman
On Wed, May 15, 2024 at 02:14:13PM +0800, Yunjian Wang via dev wrote:
> From: Pengfei Sun 
> 
> In function shash_replace_nocopy, argument to free() is the address
> of a global variable (argument passed by function table_print_json__),
> which is not memory allocated by malloc().
> 
> ovsdb-client -f json monitor Open_vSwitch --timestamp
> 
> ASan reports:
> =
> ==1443083==ERROR: AddressSanitizer: attempting free on address
> which was not malloc()-ed: 0x00535980 in thread T0
> #0 0xaf1c9eac in __interceptor_free (/usr/lib64/libasan.so.6+
> 0xa4eac)
> #1 0x4826e4 in json_destroy_object lib/json.c:445
> #2 0x4826e4 in json_destroy__ lib/json.c:403
> #3 0x4cc4e4 in table_print lib/table.c:633
> #4 0x410650 in monitor_print_table ovsdb/ovsdb-client.c:1019
> #5 0x410650 in monitor_print ovsdb/ovsdb-client.c:1040
> #6 0x4110cc in monitor_print ovsdb/ovsdb-client.c:1030
> #7 0x4110cc in do_monitor__ ovsdb/ovsdb-client.c:1503
> #8 0x40743c in main ovsdb/ovsdb-client.c:283
> #9 0xaeb50038  (/usr/lib64/libc.so.6+0x2b038)
> #10 0xaeb50110 in __libc_start_main (/usr/lib64/libc.so.6+
> 0x2b110)
> #11 0x40906c in _start (/usr/local/bin/ovsdb-client+0x40906c)
> 
> Fixes: cb139fa8b3a1 ("table: New function table_format() for formatting a 
> table as a string.")
> Signed-off-by: Pengfei Sun 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 4/6] netdev-dpdk: Refactor TSO request code.

2024-05-15 Thread Kevin Traynor
On 13/05/2024 08:13, David Marchand wrote:
> Hello Kevin,
> 
> Thanks for reviewing.
> 
> On Fri, May 10, 2024 at 11:50 PM Kevin Traynor  wrote:
>>
>> On 19/04/2024 15:06, David Marchand wrote:
>>> Replace check on th == NULL with an assert() because dp_packet_l4(pkt)
>>> is priorly used to compute (outer) L3 length.
>>>
>>> Besides, filling l4_len and tso_segsz only matters to TSO, so there is
>>> no need to check for other L4 checksum offloading requests.
>>>
>>> Signed-off-by: David Marchand 
>>> ---
>>>  lib/netdev-dpdk.c | 36 +++-
>>>  1 file changed, 11 insertions(+), 25 deletions(-)
>>>
>>> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>>> index 8b6a3ed189..661269e4b6 100644
>>> --- a/lib/netdev-dpdk.c
>>> +++ b/lib/netdev-dpdk.c
>>> @@ -2584,7 +2584,6 @@ static bool
>>>  netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf 
>>> *mbuf)
>>>  {
>>>  struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
>>> -struct tcp_header *th;
>>>
>>>  const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
>>>   RTE_MBUF_F_TX_L4_MASK |
>>> @@ -2614,6 +2613,8 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, 
>>> struct rte_mbuf *mbuf)
>>>  return true;
>>>  }
>>>
>>> +ovs_assert(dp_packet_l4(pkt));
>>
>> I'm not clear why you want to change this from a warning/return
>> fail/drop to an assert ?
> 
> From this point in the function, there is at least one request for
> checksum offloading pending.
> Any L3 (or higher) checksum requested by OVS means that the packet has
> been parsed/composed as either IP or IPv6 and packet->l4_ofs was set
> to point after the l3 header (with miniflow_extract / *_compose()
> helpers).
> 
> So getting a NULL pointer for l4 here indicates a bug in OVS.
> An assert seems better than a warn/return that probably nobody notice(d).
> 
> Did I miss a case where l4_ofs can be unset?
> 

I don't have a reason why it wouldn't be set. I was just wondering why
the change from drop/warn.

>>
>> Nit: should this be in the previous patch instead ? and I see it is
>> removed in a later patch.
> 
> It is not supposed to be removed in the series.
> The last patch moves it later in the function.
> 

ok, I hadn't reviewed full series when i made that comment.

> 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 6/6] netdev-dpdk: Refactor tunnel checksum offloading.

2024-05-15 Thread Kevin Traynor
On 19/04/2024 15:06, David Marchand wrote:
> All informations required for checksum offloading can be deduced by
> already tracked dp_packet l3_ofs, l4_ofs, inner_l3_ofs and inner_l4_ofs
> fields.
> Remove DPDK specific l[2-4]_len from generic OVS code.
> 
> netdev-dpdk code then fills mbuf specifics step by step:
> - outer_l2_len and outer_l3_len are needed for tunneling (and below
>   features),
> - l2_len and l3_len are needed for IP and L4 checksum (and below features),
> - l4_len and tso_segsz are needed when doing TSO,
> 
> Signed-off-by: David Marchand 
> ---
>  lib/dp-packet.h | 37 --
>  lib/netdev-dpdk.c   | 35 ++---
>  lib/netdev-native-tnl.c | 50 +
>  3 files changed, 27 insertions(+), 95 deletions(-)
> 
> diff --git a/lib/dp-packet.h b/lib/dp-packet.h
> index 3622764c47..a75b1c5cdb 100644
> --- a/lib/dp-packet.h
> +++ b/lib/dp-packet.h
> @@ -604,25 +604,6 @@ dp_packet_get_nd_payload(const struct dp_packet *b)
>  }
>  
>  #ifdef DPDK_NETDEV
> -static inline void
> -dp_packet_set_l2_len(struct dp_packet *b, size_t l2_len)
> -{
> -b->mbuf.l2_len = l2_len;
> -}
> -
> -static inline void
> -dp_packet_set_l3_len(struct dp_packet *b, size_t l3_len)
> -{
> -b->mbuf.l3_len = l3_len;
> -}
> -
> -static inline void
> -dp_packet_set_l4_len(struct dp_packet *b, size_t l4_len)
> -{
> -b->mbuf.l4_len = l4_len;
> -}
> -
> -
>  static inline uint64_t *
>  dp_packet_ol_flags_ptr(const struct dp_packet *b)
>  {
> @@ -642,24 +623,6 @@ dp_packet_flow_mark_ptr(const struct dp_packet *b)
>  }
>  
>  #else
> -static inline void
> -dp_packet_set_l2_len(struct dp_packet *b OVS_UNUSED, size_t l2_len 
> OVS_UNUSED)
> -{
> -/* There is no implementation. */
> -}
> -
> -static inline void
> -dp_packet_set_l3_len(struct dp_packet *b OVS_UNUSED, size_t l3_len 
> OVS_UNUSED)
> -{
> -/* There is no implementation. */
> -}
> -
> -static inline void
> -dp_packet_set_l4_len(struct dp_packet *b OVS_UNUSED, size_t l4_len 
> OVS_UNUSED)
> -{
> -/* There is no implementation. */
> -}
> -
>  static inline uint32_t *
>  dp_packet_ol_flags_ptr(const struct dp_packet *b)
>  {
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 1dad2ef833..31dd6f1d5a 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2584,6 +2584,9 @@ static bool
>  netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, struct rte_mbuf *mbuf)
>  {
>  struct dp_packet *pkt = CONTAINER_OF(mbuf, struct dp_packet, mbuf);
> +void *l2;
> +void *l3;
> +void *l4;
>  
>  const uint64_t all_inner_requests = (RTE_MBUF_F_TX_IP_CKSUM |
>   RTE_MBUF_F_TX_L4_MASK |
> @@ -2613,11 +2616,6 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, 
> struct rte_mbuf *mbuf)
>  return true;
>  }
>  
> -ovs_assert(dp_packet_l4(pkt));
> -
> -/* If packet is vxlan or geneve tunnel packet, calculate outer
> - * l2 len and outer l3 len. Inner l2/l3/l4 len are calculated
> - * before. */
>  const uint64_t tunnel_type = mbuf->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK;
>  if (OVS_UNLIKELY(tunnel_type &&
>   tunnel_type != RTE_MBUF_F_TX_TUNNEL_GENEVE &&
> @@ -2635,6 +2633,11 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, 
> struct rte_mbuf *mbuf)
>   (char *) dp_packet_eth(pkt);
>  mbuf->outer_l3_len = (char *) dp_packet_l4(pkt) -
>   (char *) dp_packet_l3(pkt);
> +

> +/* Inner L2 length must account for the tunnel header length. */
> +l2 = dp_packet_l4(pkt);

Code looks ok to me, but it's tricky and the L2 settings with inner
requests are a bit unintuitive without a notepad and thinking from the
driver perspective backwards. Not sure there is much can be done to
mitigate that here, other than the comment you added.

Did you manage to test to confirm they're working as expected ?

> +l3 = dp_packet_inner_l3(pkt);
> +l4 = dp_packet_inner_l4(pkt);

see below

>  } else {
>  /* If no outer offloading is requested, clear outer marks. */
>  mbuf->ol_flags &= ~all_outer_marks;
> @@ -2642,8 +2645,9 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, 
> struct rte_mbuf *mbuf)
>  mbuf->outer_l3_len = 0;
>  
>  /* Skip outer headers. */
> -mbuf->l2_len += (char *) dp_packet_l4(pkt) -
> -(char *) dp_packet_eth(pkt);
> +l2 = dp_packet_eth(pkt);

> +l3 = dp_packet_inner_l3(pkt);
> +l4 = dp_packet_inner_l4(pkt);

You could move these outside the inner (pardon the pun) if else, but I
could understand if you prefer to set l2/l3/l4 together for better
readability ?

>  }
>  } else {
>  if (tunnel_type) {
> @@ -2663,22 +2667,27 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk 

[ovs-dev] [PATCH v4] lib: Assert for incorrect packet.

2024-05-15 Thread Amit Prakash Shukla
Packet that are not encapsulated but metadata of the packet contains
a offload flag set, will call dp_packet_inner_l4 to get TCP, UDP, SCTP
header pointers. dp_packet_inner_l4 for such packets would return NULL
as the inner offsets by-default are configured as UINT16_MAX. On
derefrencing such pointers, segfault is observed.

Add assert check for packets with incorrect header or incorrect
offload flag set.

Signed-off-by: Amit Prakash Shukla 
---
v2:
- Added Fixes tag and updated commit message.

v3:
- Resolved review comment - added assert.
- Updated patch subject and commit message.

v4:
- Fixed checkpatch warning.

 lib/packets.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/packets.c b/lib/packets.c
index 5803d26f4..ebf516d67 100644
--- a/lib/packets.c
+++ b/lib/packets.c
@@ -2011,6 +2011,9 @@ packet_tcp_complete_csum(struct dp_packet *p, bool inner)
 tcp_sz = dp_packet_l4_size(p);
 }
 
+ovs_assert(tcp);
+ovs_assert(ip_hdr);
+
 if (!inner && dp_packet_hwol_is_outer_ipv6(p)) {
 is_v4 = false;
 } else if (!inner && dp_packet_hwol_is_outer_ipv4(p)) {
@@ -2057,6 +2060,9 @@ packet_udp_complete_csum(struct dp_packet *p, bool inner)
 udp_sz = dp_packet_l4_size(p);
 }
 
+ovs_assert(udp);
+ovs_assert(ip_hdr);
+
 /* Skip csum calculation if the udp_csum is zero. */
 if (!udp->udp_csum) {
 return;
@@ -2109,6 +2115,8 @@ packet_sctp_complete_csum(struct dp_packet *p, bool inner)
 tp_len = dp_packet_l4_size(p);
 }
 
+ovs_assert(sh);
+
 put_16aligned_be32(>sctp_csum, 0);
 csum = crc32c((void *) sh, tp_len);
 put_16aligned_be32(>sctp_csum, csum);
-- 
2.34.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3 5/6] netdev-dpdk: Use guest TSO segmentation size hint.

2024-05-15 Thread Kevin Traynor
On 19/04/2024 15:06, David Marchand wrote:
> In a typical setup like:
> guest A <-virtio-> OVS A <-vxlan-> OVS B <-virtio-> guest B
> 
> TSO packets from guest A are segmented against the OVS A physical port
> mtu adjusted by the vxlan tunnel header size, regardless of guest A
> interface mtu.
> 
> As an example, let's say guest A and guest B mtu are set to 1500 bytes.
> OVS A and OVS B physical ports mtu are set to 1600 bytes.
> Guest A will request TCP segmentation for 1448 bytes segments.
> On the other hand, OVS A will request 1498 bytes segments to the HW.
> This results in OVS B dropping packets because decapsulated packets
> are larger than the vhost-user port (serving guest B) mtu.
> 
> 2024-04-17T14:13:01.239Z|2|netdev_dpdk(pmd-c03/id:7)|WARN|vhost0:
>   Too big size 1564 max_packet_len 1518
> 
> vhost-user ports expose a guest mtu by filling mbuf->tso_segsz.
> Use it as a hint.
> 
> This may result in segments (on the wire) slightly shorter than the
> optimal size.
> 
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/321
> Signed-off-by: David Marchand 
> ---
> Note:
> As we trust the guest with this change, should we put a lower limit on
> mbuf->tso_segsz?
> 

There are some checks I looked at (e.g [0]), but it could be checked
here for an earlier drop i suppose...additional comment below

[0]
https://git.dpdk.org/dpdk/tree/drivers/net/ice/ice_rxtx.c#n3754

> ---
>  lib/netdev-dpdk.c | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 661269e4b6..1dad2ef833 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2671,14 +2671,19 @@ netdev_dpdk_prep_hwol_packet(struct netdev_dpdk *dev, 
> struct rte_mbuf *mbuf)
>  
>  if (mbuf->ol_flags & RTE_MBUF_F_TX_TCP_SEG) {
>  struct tcp_header *th = dp_packet_l4(pkt);
> +uint16_t link_tso_segsz;
>  int hdr_len;
>  
>  if (tunnel_type) {
> -mbuf->tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> -  mbuf->l4_len - mbuf->outer_l3_len;
> +link_tso_segsz = dev->mtu - mbuf->l2_len - mbuf->l3_len -
> + mbuf->l4_len - mbuf->outer_l3_len;
>  } else {
>  mbuf->l4_len = TCP_OFFSET(th->tcp_ctl) * 4;
> -mbuf->tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> +link_tso_segsz = dev->mtu - mbuf->l3_len - mbuf->l4_len;
> +}
> +
> +if (!mbuf->tso_segsz || mbuf->tso_segsz > link_tso_segsz) {

It seems like something is not right if the flag is set but tso_segsz is
0. It is set by vhost lib from gso_size, but I don't see a validation
there either.

So it could be checked and/or adjusted here for obviously incorrect
values. The only question would be is it the right layer to check these,
or is it more appropriate for it to be done in the drivers.

> +mbuf->tso_segsz = link_tso_segsz;
>  }
>  
>  hdr_len = mbuf->l2_len + mbuf->l3_len + mbuf->l4_len;

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v3] lib: assert for incorrect packet.

2024-05-15 Thread 0-day Robot
Bleep bloop.  Greetings Amit Prakash Shukla, I am a robot and I have tried out 
your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.


checkpatch:
WARNING: The subject summary should start with a capital.
Subject: lib: assert for incorrect packet.
Lines checked: 56, Warnings: 1, Errors: 0


Please check this out.  If you feel there has been an error, please email 
acon...@redhat.com

Thanks,
0-day Robot
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] lib: assert for incorrect packet.

2024-05-15 Thread Amit Prakash Shukla
Packet that are not encapsulated but metadata of the packet contains
a offload flag set, will call dp_packet_inner_l4 to get TCP, UDP, SCTP
header pointers. dp_packet_inner_l4 for such packets would return NULL
as the inner offsets by-default are configured as UINT16_MAX. On
derefrencing such pointers, segfault is observed.

Add assert check for packets with incorrect header or incorrect
offload flag set.

Signed-off-by: Amit Prakash Shukla 
---
v2:
- Added Fixes tag and updated commit message.

v3:
- Resolved review comment - added assert.
- Updated patch subject and commit message.

 lib/packets.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/lib/packets.c b/lib/packets.c
index 5803d26f4..ebf516d67 100644
--- a/lib/packets.c
+++ b/lib/packets.c
@@ -2011,6 +2011,9 @@ packet_tcp_complete_csum(struct dp_packet *p, bool inner)
 tcp_sz = dp_packet_l4_size(p);
 }
 
+ovs_assert(tcp);
+ovs_assert(ip_hdr);
+
 if (!inner && dp_packet_hwol_is_outer_ipv6(p)) {
 is_v4 = false;
 } else if (!inner && dp_packet_hwol_is_outer_ipv4(p)) {
@@ -2057,6 +2060,9 @@ packet_udp_complete_csum(struct dp_packet *p, bool inner)
 udp_sz = dp_packet_l4_size(p);
 }
 
+ovs_assert(udp);
+ovs_assert(ip_hdr);
+
 /* Skip csum calculation if the udp_csum is zero. */
 if (!udp->udp_csum) {
 return;
@@ -2109,6 +2115,8 @@ packet_sctp_complete_csum(struct dp_packet *p, bool inner)
 tp_len = dp_packet_l4_size(p);
 }
 
+ovs_assert(sh);
+
 put_16aligned_be32(>sctp_csum, 0);
 csum = crc32c((void *) sh, tp_len);
 put_16aligned_be32(>sctp_csum, csum);
-- 
2.34.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [v2] odp-execute: Fix AVX checksum calculation.

2024-05-15 Thread Eelco Chaudron


On 14 May 2024, at 15:48, Emma Finn wrote:

> The AVX implementation for calcualting checksums was not
> handling carry-over addition correctly in some cases.
> This patch adds an additional shuffle to add 16-bit padding
> to the final part of the calculation to handle such cases.
> This commit also adds a unit test to fuzz test the actions
> autovalidator.
>
> Signed-off-by: Emma Finn 
> Reported-by: Eelco Chaudron 

Hi Emma,

Thanks for also fixing the IPv6 case, however, the test you added does not seem 
to catch the issue. See notes below.

Cheers,

Eelco

> ---
>  lib/odp-execute-avx512.c |  5 +
>  tests/dpif-netdev.at | 26 ++
>  2 files changed, 31 insertions(+)
>
> diff --git a/lib/odp-execute-avx512.c b/lib/odp-execute-avx512.c
> index 50c48bfd4..a74a85dc1 100644
> --- a/lib/odp-execute-avx512.c
> +++ b/lib/odp-execute-avx512.c
> @@ -366,6 +366,8 @@ avx512_get_delta(__m256i old_header, __m256i new_header)
>0xF, 0xF, 0xF, 0xF);
>  v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
>
> +v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
> +v_delta = _mm256_shuffle_epi8(v_delta, v_swap16a);
>  v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
>  v_delta = _mm256_hadd_epi16(v_delta, v_zeros);
>
> @@ -575,6 +577,9 @@ avx512_ipv6_sum_header(__m512i ip6_header)
>0xF, 0xF, 0xF, 0xF);
>
>  v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta);
> +
> +v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
> +v_delta = _mm256_shuffle_epi8(v_delta, v_swap16a);
>  v_delta = _mm256_hadd_epi32(v_delta, v_zeros);
>  v_delta = _mm256_hadd_epi16(v_delta, v_zeros);
>
> diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at
> index 790b5a43a..4db6a99e1 100644
> --- a/tests/dpif-netdev.at
> +++ b/tests/dpif-netdev.at
> @@ -1091,3 +1091,29 @@ OVS_VSWITCHD_STOP(["dnl
>  /Error: unknown miniflow extract implementation superstudy./d
>  /Error: invalid study_pkt_cnt value: -pmd./d"])
>  AT_CLEANUP
> +
> +AT_SETUP([datapath - Actions Autovalidator Fuzzy])

This is not a Fuzzy test, but a normal Actions Autovalidator.

However, the main problem with this test is that it does not find the problem. 
Even without the C code changes, it’s passing the test.

Maybe it will be better to add a specific test to capture checksum wrapping for 
IPv4 and 6. In addition, you should also make sure the received packet is ok. 
You can use options:pcap=p1.pcap for this, see other test cases.

> +AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], [])
> +AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 2000 > packets])
> +
> +OVS_VSWITCHD_START(add-port br0 p0 -- set Interface p0 type=dummy \
> +   -- add-port br0 p1 -- set Interface p1 type=dummy)
> +
> +AT_CHECK([ovs-appctl odp-execute/action-impl-set autovalidator], [0], [dnl
> +Action implementation set to autovalidator.
> +])
> +
> +AT_DATA([flows.txt], [dnl
> +  in_port=p0,ip,actions=mod_nw_src=10.1.1.1,p1
> +  in_port=p0,ipv6,actions=set_field:fc00::100->ipv6_src,p1
> +])
> +
> +AT_CHECK([ovs-ofctl del-flows br0])
> +AT_CHECK([ovs-ofctl -Oopenflow13 add-flows br0 flows.txt])
> +
> +cat packets | while read line; do
> +  AT_CHECK([ovs-appctl netdev-dummy/receive p0 $line], [0], [ignore])
> +done
> +
> +OVS_TRAFFIC_VSWITCHD_STOP
> +AT_CLEANUP
> -- 
> 2.25.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] table: Fix freeing global variable.

2024-05-15 Thread Eelco Chaudron



On 15 May 2024, at 8:14, Yunjian Wang via dev wrote:

> From: Pengfei Sun 
>
> In function shash_replace_nocopy, argument to free() is the address
> of a global variable (argument passed by function table_print_json__),
> which is not memory allocated by malloc().
>
> ovsdb-client -f json monitor Open_vSwitch --timestamp
>
> ASan reports:
> =
> ==1443083==ERROR: AddressSanitizer: attempting free on address
> which was not malloc()-ed: 0x00535980 in thread T0
> #0 0xaf1c9eac in __interceptor_free (/usr/lib64/libasan.so.6+
> 0xa4eac)
> #1 0x4826e4 in json_destroy_object lib/json.c:445
> #2 0x4826e4 in json_destroy__ lib/json.c:403
> #3 0x4cc4e4 in table_print lib/table.c:633
> #4 0x410650 in monitor_print_table ovsdb/ovsdb-client.c:1019
> #5 0x410650 in monitor_print ovsdb/ovsdb-client.c:1040
> #6 0x4110cc in monitor_print ovsdb/ovsdb-client.c:1030
> #7 0x4110cc in do_monitor__ ovsdb/ovsdb-client.c:1503
> #8 0x40743c in main ovsdb/ovsdb-client.c:283
> #9 0xaeb50038  (/usr/lib64/libc.so.6+0x2b038)
> #10 0xaeb50110 in __libc_start_main (/usr/lib64/libc.so.6+
> 0x2b110)
> #11 0x40906c in _start (/usr/local/bin/ovsdb-client+0x40906c)
>
> Fixes: cb139fa8b3a1 ("table: New function table_format() for formatting a 
> table as a string.")
> Signed-off-by: Pengfei Sun 

Thanks for the patch. This change looks good to me.

Acked-by: Eelco Chaudron 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] table: Fix freeing global variable.

2024-05-15 Thread Yunjian Wang via dev
From: Pengfei Sun 

In function shash_replace_nocopy, argument to free() is the address
of a global variable (argument passed by function table_print_json__),
which is not memory allocated by malloc().

ovsdb-client -f json monitor Open_vSwitch --timestamp

ASan reports:
=
==1443083==ERROR: AddressSanitizer: attempting free on address
which was not malloc()-ed: 0x00535980 in thread T0
#0 0xaf1c9eac in __interceptor_free (/usr/lib64/libasan.so.6+
0xa4eac)
#1 0x4826e4 in json_destroy_object lib/json.c:445
#2 0x4826e4 in json_destroy__ lib/json.c:403
#3 0x4cc4e4 in table_print lib/table.c:633
#4 0x410650 in monitor_print_table ovsdb/ovsdb-client.c:1019
#5 0x410650 in monitor_print ovsdb/ovsdb-client.c:1040
#6 0x4110cc in monitor_print ovsdb/ovsdb-client.c:1030
#7 0x4110cc in do_monitor__ ovsdb/ovsdb-client.c:1503
#8 0x40743c in main ovsdb/ovsdb-client.c:283
#9 0xaeb50038  (/usr/lib64/libc.so.6+0x2b038)
#10 0xaeb50110 in __libc_start_main (/usr/lib64/libc.so.6+
0x2b110)
#11 0x40906c in _start (/usr/local/bin/ovsdb-client+0x40906c)

Fixes: cb139fa8b3a1 ("table: New function table_format() for formatting a table 
as a string.")
Signed-off-by: Pengfei Sun 
---
V2:
Use json_object_put instead of json_object_put_nocopy
by Ilya Maximets
---
 lib/table.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/table.c b/lib/table.c
index 48d18b6..b7addbf 100644
--- a/lib/table.c
+++ b/lib/table.c
@@ -522,7 +522,7 @@ table_print_json__(const struct table *table, const struct 
table_style *style,
 json_object_put_string(json, "caption", table->caption);
 }
 if (table->timestamp) {
-json_object_put_nocopy(
+json_object_put(
 json, "time",
 json_string_create_nocopy(table_format_timestamp__()));
 }
-- 
2.33.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev