[ovs-dev] [PATCH ovn] ofctrl: Wait at S_WAIT_BEFORE_CLEAR only once.

2024-03-04 Thread Han Zhou
The ovn-ofctrl-wait-before-clear setting is designed to minimize
downtime during the initial start-up of the ovn-controller. For this
purpose, the ovn-controller should wait only once upon entering the
S_WAIT_BEFORE_CLEAR state for the first time. Subsequent reconnections
to the OVS, such as those occurring during an OVS restart/upgrade,
should not trigger this wait. However, the current implemention always
waits for the configured time in the S_WAIT_BEFORE_CLEAR state, which
can inadvertently delay flow installations during OVS restart/upgrade,
potentially causing more harm than good. (The extent of the impact
varies based on the method used to restart OVS, including whether flow
save/restore tools and the flow-restore-wait feature are employed.)

This patch avoids the unnecessary wait after the initial one.

Fixes: 896adfd2d8b3 ("ofctrl: Support ovn-ofctrl-wait-before-clear to reduce 
down time during upgrade.")
Signed-off-by: Han Zhou 
---
 controller/ofctrl.c | 1 -
 tests/ovn-controller.at | 9 +++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/controller/ofctrl.c b/controller/ofctrl.c
index f14cd79a8dbb..0d72ecbaa167 100644
--- a/controller/ofctrl.c
+++ b/controller/ofctrl.c
@@ -634,7 +634,6 @@ run_S_WAIT_BEFORE_CLEAR(void)
 if (!wait_before_clear_time ||
 (wait_before_clear_expire &&
  time_msec() >= wait_before_clear_expire)) {
-wait_before_clear_expire = 0;
 state = S_CLEAR_FLOWS;
 return;
 }
diff --git a/tests/ovn-controller.at b/tests/ovn-controller.at
index 37f1ded1bd26..b65e11722cbb 100644
--- a/tests/ovn-controller.at
+++ b/tests/ovn-controller.at
@@ -2284,10 +2284,15 @@ lflow_run_2=$(ovn-appctl -t ovn-controller 
coverage/read-counter lflow_run)
 AT_CHECK_UNQUOTED([echo $lflow_run_1], [0], [$lflow_run_2
 ])
 
-# Restart OVS this time, and wait until flows are reinstalled
+# Restart OVS this time. Flows should be reinstalled without waiting.
 OVS_APP_EXIT_AND_WAIT([ovs-vswitchd])
 start_daemon ovs-vswitchd --enable-dummy=system -vvconn -vofproto_dpif 
-vunixctl
-OVS_WAIT_UNTIL([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF 
2.2.2.2])
+
+# Sleep for 3s, which is long enough for the flows to be installed, but
+# shorter than the wait-before-clear (5s), to make sure the flows are installed
+# without waiting.
+sleep 3
+AT_CHECK([ovs-ofctl dump-flows br-int | grep -F 10.1.2.4 | grep -vF 2.2.2.2], 
[0], [ignore])
 
 check ovn-nbctl --wait=hv lb-add lb3 3.3.3.3 10.1.2.5 \
 -- ls-lb-add ls1 lb3
-- 
2.38.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] utilities: Make database connection optional for ovn-detrace.

2024-03-04 Thread Ales Musil
The ovn-detrace required both connections (SB and NB) to provide some
output. However, it is a valid scenario to have only one database
available, usually the SB. With that in mind make the connection
optional and print warning into stderr that the output will not
contain information from DB that is not available.

Reported-at: https://issues.redhat.com/browse/FDP-68
Signed-off-by: Ales Musil 
---
 utilities/ovn_detrace.py.in | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/utilities/ovn_detrace.py.in b/utilities/ovn_detrace.py.in
index 364f11a71..12df6e52a 100755
--- a/utilities/ovn_detrace.py.in
+++ b/utilities/ovn_detrace.py.in
@@ -37,6 +37,9 @@ except Exception:
 
 argv0 = sys.argv[0]
 version = "@VERSION@"
+DB_CONNECTION_ERR = ('The connection to {0} DB is not available,'
+ ' {0} information will be missing from the detrace.')
+
 
 def usage():
 print("""\
@@ -77,6 +80,11 @@ def chassis_str(chassis):
 ch = chassis[0]
 return 'chassis-name "%s", chassis-str "%s"' % (ch.name, ch.hostname)
 
+
+class ConnectionException(Exception):
+pass
+
+
 class OVSDB(object):
 STREAM_TIMEOUT_MS = 1000
 
@@ -111,7 +119,7 @@ class OVSDB(object):
 os.strerror(error)))
 strm = None
 if not strm:
-raise Exception('Unable to connect to %s' % self.remote)
+raise ConnectionException()
 
 rpc = jsonrpc.Connection(strm)
 req = jsonrpc.Message.create_request('get_schema', [schema_name])
@@ -167,6 +175,8 @@ class CookieHandlerByUUUID(CookieHandler):
 super(CookieHandlerByUUUID, self).__init__(db, table, printer)
 
 def get_records(self, cookie):
+if not self._db:
+return []
 # Adjust cookie to include leading zeroes if needed.
 cookie = cookie.zfill(8)
 return self._db.find_rows_by_partial_uuid(self._table, cookie)
@@ -488,8 +498,19 @@ def main():
 if ovs and not ovs_db:
 ovs_db = 'unix:%s/db.sock' % ovs_rundir
 
-ovsdb_ovnsb = OVSDB(ovnsb_db, 'OVN_Southbound', leader_only=leader_only)
-ovsdb_ovnnb = OVSDB(ovnnb_db, 'OVN_Northbound', leader_only=leader_only)
+try:
+ovsdb_ovnsb = OVSDB(ovnsb_db, 'OVN_Southbound',
+leader_only=leader_only)
+except ConnectionException:
+print(DB_CONNECTION_ERR.format('SB'), file=sys.stderr)
+ovsdb_ovnsb = None
+
+try:
+ovsdb_ovnnb = OVSDB(ovnnb_db, 'OVN_Northbound',
+leader_only=leader_only)
+except ConnectionException:
+print(DB_CONNECTION_ERR.format('NB'), file=sys.stderr)
+ovsdb_ovnnb = None
 
 printer = Printer()
 cookie_handlers = get_cookie_handlers(ovsdb_ovnnb, ovsdb_ovnsb, printer)
-- 
2.44.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH ovn] Fix broken link for LTS release.

2024-03-04 Thread Igor Zhukov
I found the broken link at 
https://docs.ovn.org/en/latest/internals/release-process.html.

I believe the correct link is https://www.ovn.org/en/releases/#long-term-support

Signed-off-by: Igor Zhukov 
---
 Documentation/internals/release-process.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/internals/release-process.rst 
b/Documentation/internals/release-process.rst
index 26d3f8d4d..988257975 100644
--- a/Documentation/internals/release-process.rst
+++ b/Documentation/internals/release-process.rst
@@ -203,5 +203,5 @@ Contact
 
 Use d...@openvswitch.org to discuss the OVN development and release process.
 
-__ https://www.ovn.org/en/releases/long_term_support/
+__ https://www.ovn.org/en/releases/#long-term-support
 __ https://www.ovn.org
-- 
2.43.0.windows.1
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 4/4] appveyor: Build with OpenSSL 3.0.

2024-03-04 Thread Alin Serdean
That’s understandable, I think we should be fine for now and if someone asks 
for a backport we can address it then.

Thank you again for all the work you put into this!

Alin

> 
> On 4 Mar 2024, at 23:44, Ilya Maximets  wrote:
> 
> On 3/1/24 22:29, Alin Serdean wrote:
>> Lgtm.  Thank you so much Ilya
>> 
>> Acked-by: Alin-Gabriel Serdean 
> 
> Thanks, Alin and Simon!
> 
> I applied the set now.
> 
> Also backported patches 2 and 3 to branch-3.3, since they are bug fixes
> and 3.3 is planned to be our next LTS.  We may backport further, but
> I'm a little hesitant to do so since we don't have CI for stable branches
> and no-one complained so far.  Let me know if you think we should backport
> further, I can do that.
> 
> Best regards, Ilya Maximets.
> 
>> 
>>> 
 On 1 Mar 2024, at 22:10, Ilya Maximets  wrote:
>>> 
>>> OpenSSL 1.0.2u is long deprecated and not available for download.
>>> So, our CI never actually downloads it and uses whatever is in the
>>> OpenSSL-Win64 folder provided by AppVeyor.  Luckily, it happens to
>>> be OpenSSL 1.0.2u today.
>>> 
>>> The oldest supported version of OpenSSL upstream today is 3.0.
>>> And it is an LTS version.  3.1 and 3.2 are not LTS.
>>> 
>>> Use OpenSSL 3.0 for testing instead.
>>> 
>>> This commit does a few things to achieve that:
>>> 
>>> 1. Removes the folder provided by AppVeyor.  This way we will fail
>>>  the build if something goes wrong instead of silently using
>>>  OpenSSL version provided by AppVeyor.
>>> 
>>> 2. Obtains the JSON description of available releases and downloads
>>>  the latest minor version of OpenSSL 3.0 64-bit.  With this approach
>>>  we should not need to update the download link that frequently.
>>>  New minor releases will be picked up automatically.  They should
>>>  not have any breaking changes, so should be fine to use in CI.
>>>  OpenSSL 3.0 is supported until at least Sep 2026.
>>> 
>>>  The JSON file is an official file referenced on the:
>>>   https://slproweb.com/products/Win32OpenSSL.html
>>>  So, it should be safe to use.
>>> 
>>> 3. Executes the downloaded installer with 'Start-Process -Wait' to
>>>  properly wait for installation to finish instead of just sleeping
>>>  for 30 seconds.
>>> 
>>> 4. Caches the downloaded installer, so we're not downloading 300 MB
>>>  on each CI run as that is not nice to do.  We know the hash of the
>>>  latest version, so we will re-download only when the binary changes,
>>>  i.e. on a new minor release.
>>> 
>>>  For the cache to work we need to introduce the 'install' phase,
>>>  because caches are populated after 'init', but before 'install'.
>>>  Alternatively, we could have just renamed 'init' to 'install',
>>>  but I think it's a little nicer to have separate phases, and we
>>>  can also move 'windows-prepare.sh' to the install phase.
>>> 
>>>  Cache is also invalidated whenever appveyor.yml changes.
>>> 
>>> Acked-by: Simon Horman 
>>> Signed-off-by: Ilya Maximets 
>>> ---
>>> appveyor.yml | 52 ++--
>>> 1 file changed, 42 insertions(+), 10 deletions(-)
>>> 
>>> diff --git a/appveyor.yml b/appveyor.yml
>>> index 373f01a43..29cc44d6c 100644
>>> --- a/appveyor.yml
>>> +++ b/appveyor.yml
>>> @@ -8,28 +8,60 @@ configuration:
>>>  - Release
>>> clone_folder: C:\openvswitch_compile
>>> shallow_clone: true
>>> +
>>> init:
>>> - ps: $env:PATH ="C:\Python312-x64;"+$env:PATH
>>> - ps: New-Item -Type HardLink -Path  "C:\Python312-x64\python3.exe"
>>>  -Value "C:\Python312-x64\python.exe"
>>> +
>>> +cache:
>>> +- C:\ovs-build-downloads -> appveyor.yml
>>> +
>>> +install:
>>> - ps: |
>>> -mkdir C:\ovs-build-downloads
>>> +Remove-Item -Recurse -Force -Path C:/OpenSSL-Win64
>>> +New-Item -ItemType Directory -Force -Path C:\ovs-build-downloads
>>> +
>>> +# Find and download the latest stable OpenSSl 3.0.
>>> +$URL = 
>>> "https://raw.githubusercontent.com/slproweb/opensslhashes/master/win32_openssl_hashes.json;
>>> +$webData = (Invoke-WebRequest -Uri $URL).content | ConvertFrom-Json
>>> +$source = ($webData.files.PSObject.Properties | Where-Object {
>>> +$_.Value.basever   -match "3.0.*" -and
>>> +$_.Value.bits  -eq"64"-and
>>> +$_.Value.arch  -eq"INTEL" -and
>>> +$_.Value.installer -eq"exe"   -and
>>> +-not $_.Value.light
>>> +} | Select-Object Value).PSObject.Properties.Value
>>> +
>>> +Write-Host "Latest OpenSSL 3.0:" ($source | Format-List | Out-String)
>>> +
>>> +$destination = "C:\ovs-build-downloads\Win64OpenSSL.exe"
>>> +if (Test-Path $destination) {
>>> +$fileHash = (Get-FileHash $destination -Algorithm 
>>> SHA256).Hash.ToLower()
>>> +if ($fileHash -ne $source.sha256) {
>>> +Write-Host "Cache miss:" $fileHash "!=" $source.sha256
>>> +Remove-Item -Path $destination
>>> +}
>>> +}
>>> 
>>> -$source = 

Re: [ovs-dev] [PATCH v2 4/4] appveyor: Build with OpenSSL 3.0.

2024-03-04 Thread Ilya Maximets
On 3/1/24 22:29, Alin Serdean wrote:
> Lgtm.  Thank you so much Ilya
> 
> Acked-by: Alin-Gabriel Serdean 

Thanks, Alin and Simon!

I applied the set now.

Also backported patches 2 and 3 to branch-3.3, since they are bug fixes
and 3.3 is planned to be our next LTS.  We may backport further, but
I'm a little hesitant to do so since we don't have CI for stable branches
and no-one complained so far.  Let me know if you think we should backport
further, I can do that.

Best regards, Ilya Maximets.

> 
>>
>> On 1 Mar 2024, at 22:10, Ilya Maximets  wrote:
>>
>> OpenSSL 1.0.2u is long deprecated and not available for download.
>> So, our CI never actually downloads it and uses whatever is in the
>> OpenSSL-Win64 folder provided by AppVeyor.  Luckily, it happens to
>> be OpenSSL 1.0.2u today.
>>
>> The oldest supported version of OpenSSL upstream today is 3.0.
>> And it is an LTS version.  3.1 and 3.2 are not LTS.
>>
>> Use OpenSSL 3.0 for testing instead.
>>
>> This commit does a few things to achieve that:
>>
>> 1. Removes the folder provided by AppVeyor.  This way we will fail
>>   the build if something goes wrong instead of silently using
>>   OpenSSL version provided by AppVeyor.
>>
>> 2. Obtains the JSON description of available releases and downloads
>>   the latest minor version of OpenSSL 3.0 64-bit.  With this approach
>>   we should not need to update the download link that frequently.
>>   New minor releases will be picked up automatically.  They should
>>   not have any breaking changes, so should be fine to use in CI.
>>   OpenSSL 3.0 is supported until at least Sep 2026.
>>
>>   The JSON file is an official file referenced on the:
>>https://slproweb.com/products/Win32OpenSSL.html
>>   So, it should be safe to use.
>>
>> 3. Executes the downloaded installer with 'Start-Process -Wait' to
>>   properly wait for installation to finish instead of just sleeping
>>   for 30 seconds.
>>
>> 4. Caches the downloaded installer, so we're not downloading 300 MB
>>   on each CI run as that is not nice to do.  We know the hash of the
>>   latest version, so we will re-download only when the binary changes,
>>   i.e. on a new minor release.
>>
>>   For the cache to work we need to introduce the 'install' phase,
>>   because caches are populated after 'init', but before 'install'.
>>   Alternatively, we could have just renamed 'init' to 'install',
>>   but I think it's a little nicer to have separate phases, and we
>>   can also move 'windows-prepare.sh' to the install phase.
>>
>>   Cache is also invalidated whenever appveyor.yml changes.
>>
>> Acked-by: Simon Horman 
>> Signed-off-by: Ilya Maximets 
>> ---
>> appveyor.yml | 52 ++--
>> 1 file changed, 42 insertions(+), 10 deletions(-)
>>
>> diff --git a/appveyor.yml b/appveyor.yml
>> index 373f01a43..29cc44d6c 100644
>> --- a/appveyor.yml
>> +++ b/appveyor.yml
>> @@ -8,28 +8,60 @@ configuration:
>>   - Release
>> clone_folder: C:\openvswitch_compile
>> shallow_clone: true
>> +
>> init:
>> - ps: $env:PATH ="C:\Python312-x64;"+$env:PATH
>> - ps: New-Item -Type HardLink -Path  "C:\Python312-x64\python3.exe"
>>   -Value "C:\Python312-x64\python.exe"
>> +
>> +cache:
>> +- C:\ovs-build-downloads -> appveyor.yml
>> +
>> +install:
>> - ps: |
>> -mkdir C:\ovs-build-downloads
>> +Remove-Item -Recurse -Force -Path C:/OpenSSL-Win64
>> +New-Item -ItemType Directory -Force -Path C:\ovs-build-downloads
>> +
>> +# Find and download the latest stable OpenSSl 3.0.
>> +$URL = 
>> "https://raw.githubusercontent.com/slproweb/opensslhashes/master/win32_openssl_hashes.json;
>> +$webData = (Invoke-WebRequest -Uri $URL).content | ConvertFrom-Json
>> +$source = ($webData.files.PSObject.Properties | Where-Object {
>> +$_.Value.basever   -match "3.0.*" -and
>> +$_.Value.bits  -eq"64"-and
>> +$_.Value.arch  -eq"INTEL" -and
>> +$_.Value.installer -eq"exe"   -and
>> +-not $_.Value.light
>> +} | Select-Object Value).PSObject.Properties.Value
>> +
>> +Write-Host "Latest OpenSSL 3.0:" ($source | Format-List | Out-String)
>> +
>> +$destination = "C:\ovs-build-downloads\Win64OpenSSL.exe"
>> +if (Test-Path $destination) {
>> +$fileHash = (Get-FileHash $destination -Algorithm 
>> SHA256).Hash.ToLower()
>> +if ($fileHash -ne $source.sha256) {
>> +Write-Host "Cache miss:" $fileHash "!=" $source.sha256
>> +Remove-Item -Path $destination
>> +}
>> +}
>>
>> -$source = "https://slproweb.com/download/Win64OpenSSL-1_0_2u.exe;
>> -$destination = "C:\ovs-build-downloads\Win64OpenSSL-1_0_2u.exe"
>> -Invoke-WebRequest $source -OutFile $destination
>> +if (Test-Path $destination) {
>> +Write-Host "Using cached:" $destination
>> +} else {
>> +Write-Host "Downloading:" $source.url
>> +Invoke-WebRequest $source.url -OutFile 

Re: [ovs-dev] [PATCH v3] conntrack: Fix flush not flushing all elements.

2024-03-04 Thread Mike Pattrick
On Mon, Mar 4, 2024 at 10:22 AM Xavier Simonart  wrote:
>
> On netdev datapath, when a ct element was cleaned, the cmap
> could be shrinked, potentially causing some elements to be skipped
> in the flush iteration.
>
> Fixes: 967bb5c5cd90 ("conntrack: Add rcu support.")
> Signed-off-by: Xavier Simonart 

Acked-by: Mike Pattrick 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v9 1/2] revalidator: Add a USDT probe during flow deletion with purge reason.

2024-03-04 Thread Eelco Chaudron


On 4 Mar 2024, at 16:46, Aaron Conole wrote:

> Eelco Chaudron  writes:
>
>> On 20 Feb 2024, at 22:47, Aaron Conole wrote:
>>
>>> From: Kevin Sprague 
>>>
>>> During normal operations, it is useful to understand when a particular flow
>>> gets removed from the system. This can be useful when debugging performance
>>> issues tied to ofproto flow changes, trying to determine deployed traffic
>>> patterns, or while debugging dynamic systems where ports come and go.
>>>
>>> Prior to this change, there was a lack of visibility around flow expiration.
>>> The existing debugging infrastructure could tell us when a flow was added to
>>> the datapath, but not when it was removed or why.
>>>
>>> This change introduces a USDT probe at the point where the revalidator
>>> determines that the flow should be removed.  Additionally, we track the
>>> reason for the flow eviction and provide that information as well.  With
>>> this change, we can track the complete flow lifecycle for the netlink
>>> datapath by hooking the upcall tracepoint in kernel, the flow put USDT, and
>>> the revalidator USDT, letting us watch as flows are added and removed from
>>> the kernel datapath.
>>>
>>> This change only enables this information via USDT probe, so it won't be
>>> possible to access this information any other way (see:
>>> Documentation/topics/usdt-probes.rst).
>>>
>>> Also included is a script (utilities/usdt-scripts/flow_reval_monitor.py)
>>> which serves as a demonstration of how the new USDT probe might be used
>>> going forward.
>>>
>>> Acked-by: Han Zhou 
>>> Signed-off-by: Kevin Sprague 
>>> Co-authored-by: Aaron Conole 
>>> Signed-off-by: Aaron Conole 
>>
>> Thanks for doing the v9, some small comments remain below.
>>
>> Cheers,
>>
>> Eelco
>>
>>> ---
>>> v8 -> v9: Reorganized the flow delete reasons enum
>>>   Updated flow_reval_monitor to use pahole to extract fields
>>>   Added the purge reason with a proper USDT point
>>>   Updated documentation
>>>   Dropped all the outstanding ACKs
>>>
>>>  Documentation/topics/usdt-probes.rst |  43 +
>>>  ofproto/ofproto-dpif-upcall.c|  48 +-
>>>  utilities/automake.mk|   3 +
>>>  utilities/usdt-scripts/flow_reval_monitor.py | 997 +++
>>>  4 files changed, 1085 insertions(+), 6 deletions(-)
>>>  create mode 100755 utilities/usdt-scripts/flow_reval_monitor.py
>>>
>>> diff --git a/Documentation/topics/usdt-probes.rst 
>>> b/Documentation/topics/usdt-probes.rst
>>> index e527f43bab..015614a6b8 100644
>>> --- a/Documentation/topics/usdt-probes.rst
>>> +++ b/Documentation/topics/usdt-probes.rst
>>> @@ -214,8 +214,10 @@ Available probes in ``ovs_vswitchd``:
>>>  - dpif_recv:recv_upcall
>>>  - main:poll_block
>>>  - main:run_start
>>> +- revalidate:flow_result
>>>  - revalidate_ukey\_\_:entry
>>>  - revalidate_ukey\_\_:exit
>>> +- revalidator_sweep\_\_:flow_result
>>>  - udpif_revalidator:start_dump
>>>  - udpif_revalidator:sweep_done
>>>
>>> @@ -443,6 +445,47 @@ sweep phase was completed.
>>>  - ``utilities/usdt-scripts/reval_monitor.py``
>>>
>>>
>>> +probe revalidate:flow_result
>>> +
>>> +
>>> +**Description**:
>>> +This probe is triggered when the revalidator has executed on a particular
>>> +flow key to make a determination whether to evict a flow, and the cause
>>> +for eviction.  The revalidator runs periodically, and this probe will only
>>> +be triggered when a flow is flagged for revalidation.
>>> +
>>> +**Arguments**:
>>> +
>>> +- *arg0*: ``(enum reval_result) result``
>>> +- *arg1*: ``(enum flow_del_reason) reason``
>>
>> nit: variable name changed, so should be del_reason.
>
> Good catch, I'll update.
>
>>> +- *arg2*: ``(struct udpif *) udpif``
>>> +- *arg3*: ``(struct udpif_key *) ukey``
>>> +
>>
>> I think you missed my previous comment on re-ordering the arguments to
>> be more inline with existing probes, i.e.:
>>
>> +OVS_USDT_PROBE(revalidator_sweep__, flow_result, udpif, 
>> ukey,
>> +   result, del_reason);
>
> Guess so.  I'll fix it.
>
>>> +**Script references**:
>>> +
>>> +- ``utilities/usdt-scripts/flow_reval_monitor.py``
>>> +
>>> +
>>> +probe revalidator_sweep\_\_:flow_result
>>> +~~~
>>> +
>>> +**Description**:
>>> +This probe is placed in the path of the revalidator sweep, and is executed
>>> +under the condition that a flow entry is in an unexpected state, or the
>>> +flows were asked to be purged due to a user action.
>>> +
>>> +**Arguments**:
>>> +
>>> +- *arg0*: ``(enum reval_result) result``
>>> +- *arg1*: ``(enum flow_del_reason) reason``
>>
>> nit: variable name changed, so should be del_reason.
>
> Okay.
>
>>> +- *arg2*: ``(struct udpif *) udpif``
>>> +- *arg3*: ``(struct udpif_key *) ukey``
>>
>> See comments above on argument ordering.
>>
>>> +
>>> +**Script references**:
>>> +
>>> +- ``utilities/usdt-scripts/flow_reval_monitor.py``
>>> +
>>> +

Re: [ovs-dev] [PATCH v9 1/2] revalidator: Add a USDT probe during flow deletion with purge reason.

2024-03-04 Thread Aaron Conole
Eelco Chaudron  writes:

> On 20 Feb 2024, at 22:47, Aaron Conole wrote:
>
>> From: Kevin Sprague 
>>
>> During normal operations, it is useful to understand when a particular flow
>> gets removed from the system. This can be useful when debugging performance
>> issues tied to ofproto flow changes, trying to determine deployed traffic
>> patterns, or while debugging dynamic systems where ports come and go.
>>
>> Prior to this change, there was a lack of visibility around flow expiration.
>> The existing debugging infrastructure could tell us when a flow was added to
>> the datapath, but not when it was removed or why.
>>
>> This change introduces a USDT probe at the point where the revalidator
>> determines that the flow should be removed.  Additionally, we track the
>> reason for the flow eviction and provide that information as well.  With
>> this change, we can track the complete flow lifecycle for the netlink
>> datapath by hooking the upcall tracepoint in kernel, the flow put USDT, and
>> the revalidator USDT, letting us watch as flows are added and removed from
>> the kernel datapath.
>>
>> This change only enables this information via USDT probe, so it won't be
>> possible to access this information any other way (see:
>> Documentation/topics/usdt-probes.rst).
>>
>> Also included is a script (utilities/usdt-scripts/flow_reval_monitor.py)
>> which serves as a demonstration of how the new USDT probe might be used
>> going forward.
>>
>> Acked-by: Han Zhou 
>> Signed-off-by: Kevin Sprague 
>> Co-authored-by: Aaron Conole 
>> Signed-off-by: Aaron Conole 
>
> Thanks for doing the v9, some small comments remain below.
>
> Cheers,
>
> Eelco
>
>> ---
>> v8 -> v9: Reorganized the flow delete reasons enum
>>   Updated flow_reval_monitor to use pahole to extract fields
>>   Added the purge reason with a proper USDT point
>>   Updated documentation
>>   Dropped all the outstanding ACKs
>>
>>  Documentation/topics/usdt-probes.rst |  43 +
>>  ofproto/ofproto-dpif-upcall.c|  48 +-
>>  utilities/automake.mk|   3 +
>>  utilities/usdt-scripts/flow_reval_monitor.py | 997 +++
>>  4 files changed, 1085 insertions(+), 6 deletions(-)
>>  create mode 100755 utilities/usdt-scripts/flow_reval_monitor.py
>>
>> diff --git a/Documentation/topics/usdt-probes.rst 
>> b/Documentation/topics/usdt-probes.rst
>> index e527f43bab..015614a6b8 100644
>> --- a/Documentation/topics/usdt-probes.rst
>> +++ b/Documentation/topics/usdt-probes.rst
>> @@ -214,8 +214,10 @@ Available probes in ``ovs_vswitchd``:
>>  - dpif_recv:recv_upcall
>>  - main:poll_block
>>  - main:run_start
>> +- revalidate:flow_result
>>  - revalidate_ukey\_\_:entry
>>  - revalidate_ukey\_\_:exit
>> +- revalidator_sweep\_\_:flow_result
>>  - udpif_revalidator:start_dump
>>  - udpif_revalidator:sweep_done
>>
>> @@ -443,6 +445,47 @@ sweep phase was completed.
>>  - ``utilities/usdt-scripts/reval_monitor.py``
>>
>>
>> +probe revalidate:flow_result
>> +
>> +
>> +**Description**:
>> +This probe is triggered when the revalidator has executed on a particular
>> +flow key to make a determination whether to evict a flow, and the cause
>> +for eviction.  The revalidator runs periodically, and this probe will only
>> +be triggered when a flow is flagged for revalidation.
>> +
>> +**Arguments**:
>> +
>> +- *arg0*: ``(enum reval_result) result``
>> +- *arg1*: ``(enum flow_del_reason) reason``
>
> nit: variable name changed, so should be del_reason.

Good catch, I'll update.

>> +- *arg2*: ``(struct udpif *) udpif``
>> +- *arg3*: ``(struct udpif_key *) ukey``
>> +
>
> I think you missed my previous comment on re-ordering the arguments to
> be more inline with existing probes, i.e.:
>
> +OVS_USDT_PROBE(revalidator_sweep__, flow_result, udpif, ukey,
> +   result, del_reason);

Guess so.  I'll fix it.

>> +**Script references**:
>> +
>> +- ``utilities/usdt-scripts/flow_reval_monitor.py``
>> +
>> +
>> +probe revalidator_sweep\_\_:flow_result
>> +~~~
>> +
>> +**Description**:
>> +This probe is placed in the path of the revalidator sweep, and is executed
>> +under the condition that a flow entry is in an unexpected state, or the
>> +flows were asked to be purged due to a user action.
>> +
>> +**Arguments**:
>> +
>> +- *arg0*: ``(enum reval_result) result``
>> +- *arg1*: ``(enum flow_del_reason) reason``
>
> nit: variable name changed, so should be del_reason.

Okay.

>> +- *arg2*: ``(struct udpif *) udpif``
>> +- *arg3*: ``(struct udpif_key *) ukey``
>
> See comments above on argument ordering.
>
>> +
>> +**Script references**:
>> +
>> +- ``utilities/usdt-scripts/flow_reval_monitor.py``
>> +
>> +
>>  Adding your own probes
>>  --
>>
>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>> index b5cbeed878..fbc7858690 100644
>> --- 

Re: [ovs-dev] [PATCH v2 3/4] m4: Fix linking with OpenSSL 1.1.0+ and 3+ on Windows.

2024-03-04 Thread Simon Horman
On Fri, Mar 01, 2024 at 10:10:39PM +0100, Ilya Maximets wrote:
> OpenSSL 1.1.0 changed the library names from libeay32 and ssleay32 to
> standard libssl and libcrypto.  All the versions of OpenSSL that used
> old names reached their official EoL, so it should be safe to just
> migrate to new names.  They can still be supported via premium support
> option, but I don't think that is important for us.
> 
> Also, OpenSSL installers for older versions had the following folder
> structure:
> 
>   C:\OPENSSL-WIN64\
>   +---bin
>   +---include
>   |   +---openssl
>   +---lib
>   |   libeay32.lib
>   |   ssleay32.lib
>   +---VC
>   libeay32MD.lib
>   libeay32MDd.lib
>   libeay32MT.lib
>   libeay32MTd.lib
>   ssleay32MD.lib
>   ssleay32MDd.lib
>   ssleay32MT.lib
>   ssleay32MTd.lib
> 
> With newer OpenSSL 3+ the structure is different:
> 
>   C:\OPENSSL-WIN64
>   +---bin
>   +---include
>   |   +---openssl
>   +---lib
>   +---VC
>   +---x64
>   +---MD
>   |   libcrypto.lib
>   |   libssl.lib
>   +---MDd
>   |   libcrypto.lib
>   |   libssl.lib
>   +---MT
>   |   libcrypto.lib
>   |   libssl.lib
>   +---MTd
>   libcrypto.lib
>   libssl.lib
> 
> Basically, instead of one generic library in the lib folder and a bunch
> of differently named versions of it for different type of linkage, we
> now have multiple instances of the library located in different folders
> based on the linkage type.  So, we have to provide an exact path in
> order to find the library.
> 
> 'lib/VC/x64/MT' was chosen in this patch since it is a way used for
> building in build-aux/ccl.
> MD stands for dynamic linking, MT is static, 'd' stands for debug
> versions of the libraries.
> 
> While at it, fixing documentation examples to point to Win64 default
> installation folder.
> 
> Signed-off-by: Ilya Maximets 

Acked-by: Simon Horman 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v3] conntrack: Fix flush not flushing all elements.

2024-03-04 Thread Xavier Simonart
On netdev datapath, when a ct element was cleaned, the cmap
could be shrinked, potentially causing some elements to be skipped
in the flush iteration.

Fixes: 967bb5c5cd90 ("conntrack: Add rcu support.")
Signed-off-by: Xavier Simonart 
---
v2: - Updated commit message.
- Use compose-packet instead of hex packet content.
- Use dnl for comments.
- Remove unnecessary errors in OVS_TRAFFIC_VSWITCHD_STOP.
- Rebased on origin/master.
v3: - removed cm_pos.
- Use zone=1 in flush-conntrack and dump-conntrack.
---
 lib/conntrack.c | 14 
 lib/conntrack.h |  2 +-
 tests/system-traffic.at | 47 +
 3 files changed, 52 insertions(+), 11 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 8a7056bac..5786424f6 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -2651,25 +2651,19 @@ conntrack_dump_start(struct conntrack *ct, struct 
conntrack_dump *dump,
 
 dump->ct = ct;
 *ptot_bkts = 1; /* Need to clean up the callers. */
+dump->cursor = cmap_cursor_start(>conns);
 return 0;
 }
 
 int
 conntrack_dump_next(struct conntrack_dump *dump, struct ct_dpif_entry *entry)
 {
-struct conntrack *ct = dump->ct;
 long long now = time_msec();
 
-for (;;) {
-struct cmap_node *cm_node = cmap_next_position(>conns,
-   >cm_pos);
-if (!cm_node) {
-break;
-}
-struct conn_key_node *keyn;
-struct conn *conn;
+struct conn_key_node *keyn;
+struct conn *conn;
 
-INIT_CONTAINER(keyn, cm_node, cm_node);
+CMAP_CURSOR_FOR_EACH_CONTINUE (keyn, cm_node, >cursor) {
 if (keyn->dir != CT_DIR_FWD) {
 continue;
 }
diff --git a/lib/conntrack.h b/lib/conntrack.h
index ee7da099e..8ab8b0017 100644
--- a/lib/conntrack.h
+++ b/lib/conntrack.h
@@ -107,8 +107,8 @@ struct conntrack_dump {
 struct conntrack *ct;
 unsigned bucket;
 union {
-struct cmap_position cm_pos;
 struct hmap_position hmap_pos;
+struct cmap_cursor cursor;
 };
 bool filter_zone;
 uint16_t zone;
diff --git a/tests/system-traffic.at b/tests/system-traffic.at
index 98e494abf..891a1de0a 100644
--- a/tests/system-traffic.at
+++ b/tests/system-traffic.at
@@ -8389,6 +8389,53 @@ AT_CHECK([ovs-pcap client.pcap | grep 
20102000], [0], [dnl
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([conntrack - Flush many conntrack entries by port])
+CHECK_CONNTRACK()
+OVS_TRAFFIC_VSWITCHD_START()
+
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
+ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
+
+AT_DATA([flows.txt], [dnl
+priority=100,in_port=1,udp,action=ct(zone=1,commit),2
+])
+
+AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
+
+dnl 20 packets from port 1 and 1 packet from port 2.
+flow_l3="\
+eth_src=50:54:00:00:00:09,eth_dst=50:54:00:00:00:0a,dl_type=0x0800,\
+nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_proto=17,nw_ttl=64,nw_frag=no"
+
+for i in $(seq 1 20); do
+frame=$(ovs-ofctl compose-packet --bare "$flow_l3, udp_src=1,udp_dst=$i")
+AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=1 packet=$frame 
actions=resubmit(,0)"])
+done
+frame=$(ovs-ofctl compose-packet --bare "$flow_l3, udp_src=2,udp_dst=1")
+AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=1 packet=$frame 
actions=resubmit(,0)"])
+
+: > conntrack
+
+for i in $(seq 1 20); do
+echo 
"udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=${i}),reply=(src=10.1.1.2,dst=10.1.1.1,sport=${i},dport=1),zone=1"
 >> conntrack
+done
+echo 
"udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=2,dport=1),reply=(src=10.1.1.2,dst=10.1.1.1,sport=1,dport=2),zone=1"
 >> conntrack
+
+sort conntrack > expout
+
+AT_CHECK([ovs-appctl dpctl/dump-conntrack zone=1 | grep -F "src=10.1.1.1," | 
sort ], [0], [expout])
+
+dnl Check that flushing conntrack by port 1 flush all ct for port 1 but keeps 
ct for port 2.
+AT_CHECK([ovs-appctl dpctl/flush-conntrack zone=1 
'ct_nw_proto=17,ct_tp_src=1'])
+AT_CHECK([ovs-appctl dpctl/dump-conntrack zone=1 | grep -F "src=10.1.1.1," | 
sort ], [0], [dnl
+udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=2,dport=1),reply=(src=10.1.1.2,dst=10.1.1.1,sport=1,dport=2),zone=1
+])
+
+OVS_TRAFFIC_VSWITCHD_STOP
+AT_CLEANUP
+
 AT_BANNER([IGMP])
 
 AT_SETUP([IGMP - flood under normal action])
-- 
2.41.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] conntrack: Fix flush not flushing all elements.

2024-03-04 Thread Xavier Simonart
Thanks Mike and Ilya.
Sending v3 to handle comments from Mike and Ilya.

Thanks
Xavier

On Fri, Mar 1, 2024 at 7:48 PM Ilya Maximets  wrote:

> On 2/26/24 11:49, Xavier Simonart wrote:
> > On netdev datapath, when a ct element was cleaned, the cmap
> > could be shrinked, potentially causing some elements to be skipped
> > in the flush iteration.
> >
> > Fixes: 967bb5c5cd90 ("conntrack: Add rcu support.")
> > Signed-off-by: Xavier Simonart 
> > ---
> > v2: - Updated commit message.
> > - Use compose-packet instead of hex packet content.
> > - Use dnl for comments.
> > - Remove unnecessary errors in OVS_TRAFFIC_VSWITCHD_STOP.
> > - Rebased on origin/master.
>
> Thanks, Xavier!
>
> Beside the comment from Mike I have a couple nits for the test.
> See below.  Otherwise, the change looks good.
>
> Best regards, Ilya Maximets.
>
> > ---
> >  lib/conntrack.c | 14 
> >  lib/conntrack.h |  1 +
> >  tests/system-traffic.at | 47 +
> >  3 files changed, 52 insertions(+), 10 deletions(-)
> >
> > diff --git a/lib/conntrack.c b/lib/conntrack.c
> > index 8a7056bac..5786424f6 100644
> > --- a/lib/conntrack.c
> > +++ b/lib/conntrack.c
> > @@ -2651,25 +2651,19 @@ conntrack_dump_start(struct conntrack *ct,
> struct conntrack_dump *dump,
> >
> >  dump->ct = ct;
> >  *ptot_bkts = 1; /* Need to clean up the callers. */
> > +dump->cursor = cmap_cursor_start(>conns);
> >  return 0;
> >  }
> >
> >  int
> >  conntrack_dump_next(struct conntrack_dump *dump, struct ct_dpif_entry
> *entry)
> >  {
> > -struct conntrack *ct = dump->ct;
> >  long long now = time_msec();
> >
> > -for (;;) {
> > -struct cmap_node *cm_node = cmap_next_position(>conns,
> > -   >cm_pos);
> > -if (!cm_node) {
> > -break;
> > -}
> > -struct conn_key_node *keyn;
> > -struct conn *conn;
> > +struct conn_key_node *keyn;
> > +struct conn *conn;
> >
> > -INIT_CONTAINER(keyn, cm_node, cm_node);
> > +CMAP_CURSOR_FOR_EACH_CONTINUE (keyn, cm_node, >cursor) {
> >  if (keyn->dir != CT_DIR_FWD) {
> >  continue;
> >  }
> > diff --git a/lib/conntrack.h b/lib/conntrack.h
> > index ee7da099e..aa12a1847 100644
> > --- a/lib/conntrack.h
> > +++ b/lib/conntrack.h
> > @@ -109,6 +109,7 @@ struct conntrack_dump {
> >  union {
> >  struct cmap_position cm_pos;
> >  struct hmap_position hmap_pos;
> > +struct cmap_cursor cursor;
> >  };
> >  bool filter_zone;
> >  uint16_t zone;
> > diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> > index 98e494abf..34f93b2e5 100644
> > --- a/tests/system-traffic.at
> > +++ b/tests/system-traffic.at
> > @@ -8389,6 +8389,53 @@ AT_CHECK([ovs-pcap client.pcap | grep
> 20102000], [0], [dnl
> >  OVS_TRAFFIC_VSWITCHD_STOP
> >  AT_CLEANUP
> >
> > +AT_SETUP([conntrack - Flush many conntrack entries by port])
> > +CHECK_CONNTRACK()
> > +OVS_TRAFFIC_VSWITCHD_START()
> > +
> > +ADD_NAMESPACES(at_ns0, at_ns1)
> > +
> > +ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
> > +ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
> > +
> > +AT_DATA([flows.txt], [dnl
> > +priority=100,in_port=1,udp,action=ct(zone=1,commit),2
> > +])
> > +
> > +AT_CHECK([ovs-ofctl --bundle add-flows br0 flows.txt])
> > +
> > +dnl 20 packets from port 1 and 1 packet from port 2.
> > +flow_l3="\
> > +eth_src=50:54:00:00:00:09,eth_dst=50:54:00:00:00:0a,dl_type=0x0800,\
> > +nw_src=10.1.1.1,nw_dst=10.1.1.2,nw_proto=17,nw_ttl=64,nw_frag=no"
> > +
> > +for i in $(seq 1 20); do
> > +frame=$(ovs-ofctl compose-packet --bare "$flow_l3,
> udp_src=1,udp_dst=$i")
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=1
> packet=$frame actions=resubmit(,0)"])
> > +done
> > +frame=$(ovs-ofctl compose-packet --bare "$flow_l3, udp_src=2,udp_dst=1")
> > +AT_CHECK([ovs-ofctl -O OpenFlow13 packet-out br0 "in_port=1
> packet=$frame actions=resubmit(,0)"])
> > +
> > +: > conntrack
> > +
> > +for i in $(seq 1 20); do
> > +echo
> "udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=1,dport=${i}),reply=(src=10.1.1.2,dst=10.1.1.1,sport=${i},dport=1),zone=1"
> >> conntrack
> > +done
> > +echo
> "udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=2,dport=1),reply=(src=10.1.1.2,dst=10.1.1.1,sport=1,dport=2),zone=1"
> >> conntrack
> > +
> > +sort conntrack > expout
> > +
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep -F "src=10.1.1.1," |
> sort ], [0], [expout])
> > +
> > +dnl Check that flushing conntrack by port 1 flush all ct for port 1 but
> keeps ct for port 2.
> > +AT_CHECK([ovs-appctl dpctl/flush-conntrack
> 'ct_nw_proto=17,ct_tp_src=1'])
> > +AT_CHECK([ovs-appctl dpctl/dump-conntrack | grep -F "src=10.1.1.1," |
> sort ], [0], [dnl
> >
> +udp,orig=(src=10.1.1.1,dst=10.1.1.2,sport=2,dport=1),reply=(src=10.1.1.2,dst=10.1.1.1,sport=1,dport=2),zone=1
> > +])
>
> Can we add zone=1 filters 

[ovs-dev] [Patch ovn 2/2] northd.c: Fix direct access to SNAT network on DR.

2024-03-04 Thread Martin Kalcok
This change fixes bug that breaks ability of machines from external
networks to communicate with machines in SNATed networks (specifically
when using a Distributed router).

Currently when a machine (S1) on an external network tries to talk
over TCP with a machine (A1) in a network that has enabled SNAT, the
connection is established successfully. However after the three-way
handshake, any packets that come from the A1 machine will have their
source address translated by the Distributed router, breaking the
communication.

Existing rule in `build_lrouter_out_snat_flow` that decides which
packets should be SNATed already tries to avoid SNATing packets in
reply direction with `(!ct.trk || !ct.rpl)`. However, previous stages
in the distributed LR egress pipeline do not initiate the CT state.

Additionally we need to commit new connections that originate from
external networks into CT, so that the packets in the reply direction
can be properly identified.

Rationale:

In my original RFC [0], there were questions about the motivation for
fixing this issue. I'll try to summarize why I think this is a bug
that should be fixed.

1. Current implementation for Distributed router already tries to
   avoid SNATing packets in the reply direction, it's just missing
   initialized CT states to make proper decisions.

2. This same scenario works with Gateway Router. I tested with
   following setup:

foo -- R1 -- join - R3 -- alice
  |
bar --R2

R1 is a Distributed router with SNAT for foo. R2 is a Gateway
router with SNAT for bar. R3 is a Gateway router with no SNAT.
Using 'alice1' as a client I was able to talk over TCP with
'bar1' but connection with 'foo1' failed.

3. Regarding security and "leaking" of internal IPs. Reading through
   RFC 4787 [1], 5382 [2] and their update in 7857 [3], the
   specifications do not seem to mandate that SNAT implementations
   must filter incoming traffic destined directly to the internal
   network. Sections like "5. Filtering Behavior" in 4787 and
   "4.3 Externally Initiated Connections" in 5382 describe only
   behavior for traffic destined to external IP/Port associated
   with NAT on the device that performs NAT.

   Besides, with the current implementation, it's already possible
   to scan the internal network with pings and TCP syn scanning.

   If an additional security is needed, ACLs can be used to filter
   out incomming traffic from external networks.

4. We do have customers/clouds that depend on this functionality.
   This is a scenario that used to work in Openstack with ML2/OVS
   and migrating those clouds to ML2/OVN would break it.

[0]https://mail.openvswitch.org/pipermail/ovs-dev/2024-February/411670.html
[1]https://datatracker.ietf.org/doc/html/rfc4787
[2]https://datatracker.ietf.org/doc/html/rfc5382
[3]https://datatracker.ietf.org/doc/html/rfc7857

Signed-off-by: Martin Kalcok 
---
 northd/northd.c | 82 -
 northd/ovn-northd.8.xml | 29 +++
 tests/ovn-northd.at | 15 +++-
 tests/system-ovn.at | 69 ++
 4 files changed, 185 insertions(+), 10 deletions(-)

diff --git a/northd/northd.c b/northd/northd.c
index 2c3560ce2..4b79b357c 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -14437,21 +14437,28 @@ build_lrouter_out_is_dnat_local(struct lflow_table 
*lflows,
 }
 
 static void
-build_lrouter_out_snat_match(struct lflow_table *lflows,
- const struct ovn_datapath *od,
- const struct nbrec_nat *nat, struct ds *match,
- bool distributed_nat, int cidr_bits, bool is_v6,
- struct ovn_port *l3dgw_port,
- struct lflow_ref *lflow_ref)
+build_lrouter_out_snat_direction_match__(struct lflow_table *lflows,
+ const struct ovn_datapath *od,
+ const struct nbrec_nat *nat,
+ struct ds *match,
+ bool distributed_nat, int cidr_bits,
+ bool is_v6,
+ struct ovn_port *l3dgw_port,
+ struct lflow_ref *lflow_ref,
+ bool is_reverse)
 {
 ds_clear(match);
 
-ds_put_format(match, "ip && ip%c.src == %s", is_v6 ? '6' : '4',
+ds_put_format(match, "ip && ip%c.%s == %s",
+  is_v6 ? '6' : '4',
+  is_reverse ? "dst" : "src",
   nat->logical_ip);
 
 if (!od->is_gw_router) {
 /* Distributed router. */
-ds_put_format(match, " && outport == %s", l3dgw_port->json_key);
+ds_put_format(match, " && %s == %s",
+  is_reverse ? "inport" : "outport",
+  l3dgw_port->json_key);

[ovs-dev] [Patch ovn 1/2] actions.c/h: Enable specifying zone for ct_commit.

2024-03-04 Thread Martin Kalcok
Action `ct_commit` currently does not allow specifying zone into
which connection is committed. For example, in LR datapath, the `ct_commit`
will always use the DNAT zone.

This change adds option to use `ct_commit(snat)` or `ct_commit(dnat)` to
explicitly specify the zone into which the connection will be committed.

Original behavior of `ct_commit` without the arguments remains unchanged.

Signed-off-by: Martin Kalcok 
---
 include/ovn/actions.h |  9 +
 lib/actions.c | 20 +++-
 ovn-sb.xml|  9 +
 3 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/include/ovn/actions.h b/include/ovn/actions.h
index 49fb96fc6..ce9597cf5 100644
--- a/include/ovn/actions.h
+++ b/include/ovn/actions.h
@@ -259,11 +259,20 @@ struct ovnact_ct_next {
 uint8_t ltable;/* Logical table ID of next table. */
 };
 
+/* Conntrack zone to be used for commiting CT entries by the action.
+ * DEFAULT uses default zone for given datapath. */
+enum ovnact_ct_zone {
+OVNACT_CT_ZONE_DEFAULT,
+OVNACT_CT_ZONE_SNAT,
+OVNACT_CT_ZONE_DNAT,
+};
+
 /* OVNACT_CT_COMMIT_V1. */
 struct ovnact_ct_commit_v1 {
 struct ovnact ovnact;
 uint32_t ct_mark, ct_mark_mask;
 ovs_be128 ct_label, ct_label_mask;
+enum ovnact_ct_zone zone;
 };
 
 /* Type of NAT used for the particular action.
diff --git a/lib/actions.c b/lib/actions.c
index a45874dfb..319e65b6f 100644
--- a/lib/actions.c
+++ b/lib/actions.c
@@ -707,6 +707,7 @@ static void
 parse_ct_commit_v1_arg(struct action_context *ctx,
struct ovnact_ct_commit_v1 *cc)
 {
+cc->zone = OVNACT_CT_ZONE_DEFAULT;
 if (lexer_match_id(ctx->lexer, "ct_mark")) {
 if (!lexer_force_match(ctx->lexer, LEX_T_EQUALS)) {
 return;
@@ -737,6 +738,10 @@ parse_ct_commit_v1_arg(struct action_context *ctx,
 return;
 }
 lexer_get(ctx->lexer);
+} else if (lexer_match_id(ctx->lexer, "snat")) {
+cc->zone = OVNACT_CT_ZONE_SNAT;
+} else if (lexer_match_id(ctx->lexer, "dnat")) {
+cc->zone = OVNACT_CT_ZONE_DNAT;
 } else {
 lexer_syntax_error(ctx->lexer, NULL);
 }
@@ -814,7 +819,20 @@ encode_CT_COMMIT_V1(const struct ovnact_ct_commit_v1 *cc,
 struct ofpact_conntrack *ct = ofpact_put_CT(ofpacts);
 ct->flags = NX_CT_F_COMMIT;
 ct->recirc_table = NX_CT_RECIRC_NONE;
-ct->zone_src.field = mf_from_id(MFF_LOG_CT_ZONE);
+switch (cc->zone) {
+case OVNACT_CT_ZONE_SNAT:
+ct->zone_src.field = mf_from_id(MFF_LOG_SNAT_ZONE);
+break;
+
+case OVNACT_CT_ZONE_DNAT:
+ct->zone_src.field = mf_from_id(MFF_LOG_DNAT_ZONE);
+break;
+
+case OVNACT_CT_ZONE_DEFAULT:
+default:
+ct->zone_src.field = mf_from_id(MFF_LOG_CT_ZONE);
+break;
+}
 ct->zone_src.ofs = 0;
 ct->zone_src.n_bits = 16;
 
diff --git a/ovn-sb.xml b/ovn-sb.xml
index ac4e585f2..66cb9747d 100644
--- a/ovn-sb.xml
+++ b/ovn-sb.xml
@@ -1405,6 +1405,8 @@
 ct_commit { ct_mark=value[/mask]; };
 ct_commit { ct_label=value[/mask]; };
 ct_commit { ct_mark=value[/mask]; 
ct_label=value[/mask]; };
+ct_commit(snat);
+ct_commit(dnat);
 
   
 Commit the flow to the connection tracking entry associated with it
@@ -1421,6 +1423,13 @@
 in order to have specific bits set.
   
 
+  
+Parameters ct_commit(snat) or ct_commit(dnat)
+ can be used to explicitly specify into which zone should be
+connection committed. Without this parameter, the connection is
+committed to the default zone for the Datapath.
+  
+
   
 Note that if you want processing to continue in the next table,
 you must execute the next action after
-- 
2.40.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v6 2/2] netlink-conntrack: Optimize flushing ct zone.

2024-03-04 Thread Felix Huettner via dev
Previously the kernel did not provide a netlink interface to flush/list
only conntrack entries matching a specific zone. With [1] and [2] it is now
possible to flush and list conntrack entries filtered by zone. Older
kernels not yet supporting this feature will ignore the filter.
For the list request that means just returning all entries (which we can
then filter in userspace as before).
For the flush request that means deleting all conntrack entries.

The implementation is now identical to the windows one, so we combine
them.

These significantly improves the performance of flushing conntrack zones
when the conntrack table is large. Since flushing a conntrack zone is
normally triggered via an openflow command it blocks the main ovs thread
and thereby also blocks new flows from being applied. Using this new
feature we can reduce the flushing time for zones by around 93%.

In combination with OVN the creation of a Logical_Router (which causes
the flushing of a ct zone) could block other operations, e.g. the
failover of Logical_Routers (as they cause new flows to be created).
This is visible from a user perspective as a ovn-controller that is idle
(as it waits for vswitchd) and vswitchd reporting:
"blocked 1000 ms waiting for main to quiesce" (potentially with ever
increasing times).

The following performance tests where run in a qemu vm with 500.000
conntrack entries distributed evenly over 500 ct zones using `ovstest
test-netlink-conntrack flush zone=`.

  |  flush zone with 1000 entries  |   flush zone with no entry |
  +-+--+-+--|
  |   with the patch| without  |   with the patch| without  |
  +--+--+--+--+--+--|
  | v6.8-rc4 |  v6.7.1  | v6.8-rc4 | v6.8-rc4 |  v6.7.1  | v6.8-rc4 |
+-+--+--+--+--+--+--|
| Min |  0.260   |  3.946   |  3.497   |  0.228   |  3.462   |  3.212   |
| Median  |  0.319   |  4.237   |  4.349   |  0.298   |  4.460   |  4.010   |
| 90%ile  |  0.335   |  4.367   |  4.522   |  0.325   |  4.662   |  4.572   |
| 99%ile  |  0.348   |  4.495   |  4.773   |  0.340   |  4.931   |  6.003   |
| Max |  0.362   |  4.543   |  5.054   |  0.348   |  5.390   |  6.396   |
| Mean|  0.320   |  4.236   |  4.331   |  0.296   |  4.430   |  4.071   |
| Total   |  80.02   |  1058|  1082|  73.93   |  1107|  1017|

[1]: 
https://github.com/torvalds/linux/commit/eff3c558bb7e61c41b53e4c8130e514a5a4df9ba
[2]: 
https://github.com/torvalds/linux/commit/fa173a1b4e3fd1ab5451cbc57de6fc624c824b0a

Co-Authored-By: Luca Czesla 
Signed-off-by: Luca Czesla 
Co-Authored-By: Max Lamprecht 
Signed-off-by: Max Lamprecht 
Signed-off-by: Felix Huettner 
---
v5->v6: none
v4->v5: none
v3->v4:
- combine the flush logic with windows implementation
v2->v3:
- update description to include upstream fix (Thanks to Ilya for finding
  that issue)
v1->v2:
- fixed wrong signed-off-by

 lib/netlink-conntrack.c | 52 -
 tests/system-traffic.at |  8 +++
 2 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/lib/netlink-conntrack.c b/lib/netlink-conntrack.c
index 492bfcffb..d9015d9f0 100644
--- a/lib/netlink-conntrack.c
+++ b/lib/netlink-conntrack.c
@@ -141,6 +141,9 @@ nl_ct_dump_start(struct nl_ct_dump_state **statep, const 
uint16_t *zone,
 
 nl_msg_put_nfgenmsg(>buf, 0, AF_UNSPEC, NFNL_SUBSYS_CTNETLINK,
 IPCTNL_MSG_CT_GET, NLM_F_REQUEST);
+if (zone) {
+nl_msg_put_be16(>buf, CTA_ZONE, htons(*zone));
+}
 nl_dump_start(>dump, NETLINK_NETFILTER, >buf);
 ofpbuf_clear(>buf);
 
@@ -263,11 +266,9 @@ out:
 return err;
 }
 
-#ifdef _WIN32
-int
-nl_ct_flush_zone(uint16_t flush_zone)
+static int
+nl_ct_flush_zone_with_cta_zone(uint16_t flush_zone)
 {
-/* Windows can flush a specific zone */
 struct ofpbuf buf;
 int err;
 
@@ -282,24 +283,63 @@ nl_ct_flush_zone(uint16_t flush_zone)
 
 return err;
 }
+
+#ifdef _WIN32
+int
+nl_ct_flush_zone(uint16_t flush_zone)
+{
+return nl_ct_flush_zone_with_cta_zone(flush_zone);
+}
 #else
+
+static bool
+netlink_flush_supports_zone(void)
+{
+static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
+static bool supported = false;
+
+if (ovsthread_once_start()) {
+if (ovs_kernel_is_version_or_newer(6, 8)) {
+supported = true;
+} else {
+VLOG_INFO("disabling conntrack flush by zone. "
+  "Not supported in Linux kernel");
+}
+ovsthread_once_done();
+}
+return supported;
+}
+
 int
 nl_ct_flush_zone(uint16_t flush_zone)
 {
-/* Apparently, there's no netlink interface to flush a specific zone.
+/* In older kernels, there was no netlink interface to flush a specific
+ * conntrack zone.
  * This code dumps every connection, checks the zone and 

[ovs-dev] [PATCH v6 1/2] util: Support checking for kernel versions.

2024-03-04 Thread Felix Huettner via dev
Extract checking for a given kernel version to a separate function.
It will be used also in the next patch.

Signed-off-by: Felix Huettner 
---
v5->v6:
- fix ovs_kernel_is_version_or_newer returning false if major and minor
  are equal (thanks Mike)
v4->v5:
- fix wrong ifdef that broke on macos
- fix ovs_kernel_is_version_or_newer working in reverse than desired
- ovs_kernel_is_version_or_newer now always returns false if uname
  errors (Thanks Eelco)
v4:
- extract function to check kernel version

 lib/netdev-linux.c | 15 +++
 lib/util.c | 27 +++
 lib/util.h |  4 
 3 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index bf91ef462..a8424ed82 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -40,7 +40,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -6427,18 +6426,10 @@ getqdisc_is_safe(void)
 static bool safe = false;
 
 if (ovsthread_once_start()) {
-struct utsname utsname;
-int major, minor;
-
-if (uname() == -1) {
-VLOG_WARN("uname failed (%s)", ovs_strerror(errno));
-} else if (!ovs_scan(utsname.release, "%d.%d", , )) {
-VLOG_WARN("uname reported bad OS release (%s)", utsname.release);
-} else if (major < 2 || (major == 2 && minor < 35)) {
-VLOG_INFO("disabling unsafe RTM_GETQDISC in Linux kernel %s",
-  utsname.release);
-} else {
+if (ovs_kernel_is_version_or_newer(2, 35)) {
 safe = true;
+} else {
+VLOG_INFO("disabling unsafe RTM_GETQDISC in Linux kernel");
 }
 ovsthread_once_done();
 }
diff --git a/lib/util.c b/lib/util.c
index 3fb3a4b40..5c31d983a 100644
--- a/lib/util.c
+++ b/lib/util.c
@@ -27,6 +27,7 @@
 #include 
 #ifdef __linux__
 #include 
+#include 
 #endif
 #include 
 #include 
@@ -2500,3 +2501,29 @@ OVS_CONSTRUCTOR(winsock_start) {
}
 }
 #endif
+
+#ifdef __linux__
+bool
+ovs_kernel_is_version_or_newer(int target_major, int target_minor)
+{
+static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
+static int current_major, current_minor = -1;
+
+if (ovsthread_once_start()) {
+struct utsname utsname;
+
+if (uname() == -1) {
+VLOG_WARN("uname failed (%s)", ovs_strerror(errno));
+} else if (!ovs_scan(utsname.release, "%d.%d",
+_major, _minor)) {
+VLOG_WARN("uname reported bad OS release (%s)", utsname.release);
+}
+ovsthread_once_done();
+}
+if (current_major == -1 || current_minor == -1) {
+return false;
+}
+return current_major > target_major || (
+current_major == target_major && current_minor >= target_minor);
+}
+#endif
diff --git a/lib/util.h b/lib/util.h
index f2d45bcac..55718fd87 100644
--- a/lib/util.h
+++ b/lib/util.h
@@ -611,4 +611,8 @@ int ftruncate(int fd, off_t length);
 }
 #endif
 
+#ifdef __linux__
+bool ovs_kernel_is_version_or_newer(int target_major, int target_minor);
+#endif
+
 #endif /* util.h */

base-commit: 436aba68d52891fb5775ec7651282ccf9d04176b
-- 
2.43.2

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev