Re: [ovs-dev] [v3] odp-execute: Fix AVX checksum calculation.
On 5/23/24 13:03, Van Haaren, Harry wrote: >> On 21 May 2024, at 16:13, Emma Finn wrote: >>> The AVX implementation for calcualting checksums was not >>> handling carry-over addition correctly in some cases. >>> This patch adds an additional shuffle to add 16-bit padding >>> to the final part of the calculation to handle such cases. >>> This commit also adds a unit test to fuzz test the actions >>> autovalidator. >>> >>> Signed-off-by: Emma Finn >>> Reported-by: Eelco Chaudron > >>> +AT_SETUP([datapath - Actions Autovalidator Fuzzy]) >>> +AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], []) >>> +AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 1 fuzzy > packets]) >> >> Hi Emma, > > Hi Eelco, Ilya & Emma (&OVS community) > > As you know checksums are a very "input sensitive" workload; its kind of > why they're useful (hash ~uniquely represents inputset ideally) > Internally, computing a hash always involves carry-bits and (expected/good) > overflows, however this patch is fixing an unintended/bad overflow. > Testing these corner-cases & boundary conditions is hard; for me the only > way to be confident in its correctness is autovaldation and fuzzing. Hi. Fuzzing is indeed a valid testing approach and in case of checksums it may be one of the most effective. However, as already mentioned, it has a few disadvantages. The main one is unreliability - the test doesn't reproduce the issue every time it runs, and also time - it takes about 3 minutes to run the test. Fuzzing tests are useful, but they should not be part of the unit testing, especially if they are not guaranteed to reproduce issues they are written for. They should be ran separately and continuously. Once the issue is found through fuzzing, a new unit test for the found problem can be added to the unit test suite, so we have a guaranteed failure in case the same issue is re-introduced. Fuzzing can't guarantee that. This is the approach we use with oss-fuzz. oss-fuzz project is continuously fuzzing parts of OVS. Once they find a bug, we fix it and add a regression test either for this exact case or for a class of issues, if possible. > >> As Ilya explained in the v2 reply, we do not like to have fuzzy tests in >> the test suite as they are hard to reproduce. >> (link >> https://patchwork.ozlabs.org/project/openvswitch/patch/20240514134815.2576245-1-emma.f...@intel.com/#3313143) > > I understand that any randomness in CI is not desired, and understand that > random tests can be hard to reproduce. The autovalidator tries to mitigate > the "reproduce" issue by dumping the packet contents on > failure, providing the erroring case to easily allow reproduction. > This test *should* be 100% reliably passing, so no false-positives are > expected. > If not, we should fix the test to be 100% reliable. This test can't produce false failures, I agree. However, it does produce false successes fairly regularly. For example, for me it doesn't fail IPv4 check in about 2 out of 5 runs. Making it more reliably reproduce the issue would mean increasing the number of samples for the already very long test. > >> So my request is to send out a v4 with a test modeled around the 'userspace >> offload - ip csum offload' test case we have in tests/dpif-netdev.at, as per >> Ilya’s request. > > I'll suggest 3 paths forward, and give my pro/cons list for each; whatever > option is preferred by OVS community we can send as a v4; > > 1) Accept "SIMD code fix" part of patch only (no unit test, just fixes issue >with local fuzz-testing during development) > 2) Build "specific-to-this-issue unit-test" only (regression tests against >this specific issue, but nothing else; low value in my opinion, as it >leaves a large input space untested) Large input space will remain untested anyway. Fuzzy test only covers a tiny fraction at a time. There is also an advantage to the specific regression test - we can check the expected output. In this case we can capture the resulted packet and check that the checksum is not only the same between the action implementations, but also that it is correct. This adds value to the test and makes it useful even if some implementations are not available on a testing system. We can take the string representation of the packet, replace the IP, re-generate the hex and compare with what we got on the other side in pcap. > 3) Accept "v3 patch as is, with fuzzing" (best test-coverage, confidence in >current & future code changes. CI runtime on fuzz-tests, and if failures >occur, they must be flagged & fixed, but are *actual real issues* that >should be fixed.) > > I think 2) is what is being asked above, but wanted to make sure the tradeoffs > are understood. As above, please indicate which is the preferred approach and > a v4 will be on the way. > Option 4: Build a specific unit test that that will test for regression for this particular issue. Send the patch out. Then move all the existing fuzzing tes
Re: [ovs-dev] [PATCH v2 1/6] netdev-offload: Fix null pointer dereference' warnings on dump creation.
Bleep bloop. Greetings Mike Pattrick, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: The subject, ': ', is over 70 characters, i.e., 72. Subject: netdev-offload: Fix null pointer dereference' warnings on dump creation. Lines checked: 58, Warnings: 1, Errors: 0 Please check this out. If you feel there has been an error, please email acon...@redhat.com Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2 5/6] netdev-linux: Return an error if device feature names are empty.
When retrieving a list of features supported by a network card, return with an error code if the request completed without an error but the list contains zero entries. In practice this should never happen, but it does contribute to a detection in Clang's static analyzer. Fixes: 6c59c195266c ("netdev-linux: Use ethtool to detect offload support.") Signed-off-by: Mike Pattrick --- lib/netdev-linux.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c index 25349c605..8b855bdc4 100644 --- a/lib/netdev-linux.c +++ b/lib/netdev-linux.c @@ -2439,7 +2439,9 @@ netdev_linux_read_definitions(struct netdev_linux *netdev, int error = 0; error = netdev_linux_read_stringset_info(netdev, &len); -if (error || !len) { +if (!len) { +return -EOPNOTSUPP; +} else if (error) { return error; } strings = xzalloc(sizeof *strings + len * ETH_GSTRING_LEN); -- 2.39.3 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2 4/6] socket: Fix uninitialized values in inet_parse_ functions.
Clang's static analyzer will complain about uninitialized value dns_failure because we weren't setting a value for dns_failure in all code paths. Now we initialize this in the error conditions of inet_parse_passive and inet_parse_active. Fixes: 08e9e5337383 ("ovsdb: raft: Fix inability to read the database with DNS host names.") Fixes: 5f219af8b3c7 ("ovsdb-server: Fix handling of DNS name for listener configuration.") Signed-off-by: Mike Pattrick --- lib/socket-util.c | 9 + 1 file changed, 9 insertions(+) diff --git a/lib/socket-util.c b/lib/socket-util.c index 2d89fce85..c569b7d16 100644 --- a/lib/socket-util.c +++ b/lib/socket-util.c @@ -546,9 +546,15 @@ inet_parse_active(const char *target_, int default_port, if (!host) { VLOG_ERR("%s: host must be specified", target_); ok = false; +if (dns_failure) { +*dns_failure = false; +} } else if (!port && default_port < 0) { VLOG_ERR("%s: port must be specified", target_); ok = false; +if (dns_failure) { +*dns_failure = false; +} } else { ok = parse_sockaddr_components(ss, host, port, default_port, target_, resolve_host, dns_failure); @@ -671,6 +677,9 @@ inet_parse_passive(const char *target_, int default_port, if (!port && default_port < 0) { VLOG_ERR("%s: port must be specified", target_); ok = false; +if (dns_failure) { +*dns_failure = false; +} } else { ok = parse_sockaddr_components(ss, host, port, default_port, target_, resolve_host, dns_failure); -- 2.39.3 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2 6/6] netdev-linux: Initialize link speed in error conditions.
Clang's static analyzer noted that the output from netdev_linux_get_speed_locked can be checked even if this function doesn't set any values. Now we always set those values to a sane default in all cases. Fixes: 6240c0b4c80e ("netdev: Add netdev_get_speed() to netdev API.") Signed-off-by: Mike Pattrick --- lib/netdev-linux.c | 5 + 1 file changed, 5 insertions(+) diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c index 8b855bdc4..83c19618c 100644 --- a/lib/netdev-linux.c +++ b/lib/netdev-linux.c @@ -2726,6 +2726,8 @@ netdev_linux_get_speed_locked(struct netdev_linux *netdev, uint32_t *current, uint32_t *max) { if (netdev_linux_netnsid_is_remote(netdev)) { +*current = 0; +*max = 0; return EOPNOTSUPP; } @@ -2735,6 +2737,9 @@ netdev_linux_get_speed_locked(struct netdev_linux *netdev, ? 0 : netdev->current_speed; *max = MIN(UINT32_MAX, netdev_features_to_bps(netdev->supported, 0) / 100ULL); +} else { +*current = 0; +*max = 0; } return netdev->get_features_error; } -- 2.39.3 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2 3/6] dpctl: Fix uninitialized value when deleting flows.
Clang's static analyzer will complain about an uninitialized value because we weren't setting a value for ufid_generated in all code paths. Now we initialize this on declaration. This patch also corrects the reverse x-mass of variable declaration. Fixes: bbe2e3928747 ("dpctl: Fix broken flow deletion via ovs-dpctl due to missing ufid.") Signed-off-by: Mike Pattrick --- lib/dpctl.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/lib/dpctl.c b/lib/dpctl.c index 3c555a559..a70df5342 100644 --- a/lib/dpctl.c +++ b/lib/dpctl.c @@ -1359,19 +1359,17 @@ static int dpctl_del_flow_dpif(struct dpif *dpif, const char *key_s, struct dpctl_params *dpctl_p) { +struct dpif_port_dump port_dump; struct dpif_flow_stats stats; +bool ufid_generated = false; struct dpif_port dpif_port; -struct dpif_port_dump port_dump; -struct ofpbuf key; +bool ufid_present = false; +struct simap port_names; struct ofpbuf mask; /* To be ignored. */ - +struct ofpbuf key; ovs_u128 ufid; -bool ufid_generated; -bool ufid_present; -struct simap port_names; int n, error; -ufid_present = false; n = odp_ufid_from_string(key_s, &ufid); if (n < 0) { dpctl_error(dpctl_p, -n, "parsing flow ufid"); -- 2.39.3 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2 2/6] netdev-native-tnl: Fix use of uninitialized offset on SRv6 header pop.
Clang's static analyzer will complain about uninitialized value 'hlen' because we weren't properly checking the error code from a function that would have initialized the value. Instead, add a check for that return code. Signed-off-by: Mike Pattrick Fixes: 03fc1ad78521 ("userspace: Add SRv6 tunnel support.") --- lib/netdev-native-tnl.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c index dee9ab344..6bcc00d8c 100644 --- a/lib/netdev-native-tnl.c +++ b/lib/netdev-native-tnl.c @@ -1068,7 +1068,10 @@ netdev_srv6_pop_header(struct dp_packet *packet) } pkt_metadata_init_tnl(md); -netdev_tnl_ip_extract_tnl_md(packet, tnl, &hlen); +if (!netdev_tnl_ip_extract_tnl_md(packet, tnl, &hlen)) { +goto err; +} + dp_packet_reset_packet(packet, hlen); return packet; -- 2.39.3 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH v2 1/6] netdev-offload: Fix null pointer dereference' warnings on dump creation.
Clang's static analyzer will complain about a null pointer dereference because dumps can be set to null and then there is a loop where it could have been written to. This is a false positive, but only because the netdev dpif type won't change during this loop. Instead, return early from the netdev_ports_flow_dump_create function if dumps is NULL. Signed-off-by: Mike Pattrick --- lib/netdev-offload.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/lib/netdev-offload.c b/lib/netdev-offload.c index 931d634e1..8a9d36555 100644 --- a/lib/netdev-offload.c +++ b/lib/netdev-offload.c @@ -626,8 +626,8 @@ netdev_ports_traverse(const char *dpif_type, struct netdev_flow_dump ** netdev_ports_flow_dump_create(const char *dpif_type, int *ports, bool terse) { +struct netdev_flow_dump **dumps = NULL; struct port_to_netdev_data *data; -struct netdev_flow_dump **dumps; int count = 0; int i = 0; @@ -638,7 +638,11 @@ netdev_ports_flow_dump_create(const char *dpif_type, int *ports, bool terse) } } -dumps = count ? xzalloc(sizeof *dumps * count) : NULL; +if (!count) { +goto unlock; +} + +dumps = xzalloc(sizeof *dumps * count); HMAP_FOR_EACH (data, portno_node, &port_to_netdev) { if (netdev_get_dpif_type(data->netdev) == dpif_type) { @@ -650,6 +654,8 @@ netdev_ports_flow_dump_create(const char *dpif_type, int *ports, bool terse) i++; } } + +unlock: ovs_rwlock_unlock(&port_to_netdev_rwlock); *ports = i; -- 2.39.3 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] dpdk: Use DPDK 23.11.1 release.
Update the CI and docs to use DPDK 23.11.1. Signed-off-by: Kevin Traynor --- .github/workflows/build-and-test.yml | 4 ++-- Documentation/faq/releases.rst | 10 +- Documentation/intro/install/dpdk.rst | 8 NEWS | 2 ++ 4 files changed, 13 insertions(+), 11 deletions(-) diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml index 424dbab6c..9d3a13ca1 100644 --- a/.github/workflows/build-and-test.yml +++ b/.github/workflows/build-and-test.yml @@ -11,6 +11,6 @@ jobs: dependencies: gcc libbpf-dev libnuma-dev libpcap-dev ninja-build pkgconf CC: gcc - DPDK_GIT: https://dpdk.org/git/dpdk - DPDK_VER: 23.11 + DPDK_GIT: https://dpdk.org/git/dpdk-stable + DPDK_VER: 23.11.1 name: dpdk gcc outputs: diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst index 26973c2ad..70219d717 100644 --- a/Documentation/faq/releases.rst +++ b/Documentation/faq/releases.rst @@ -217,9 +217,9 @@ Q: What DPDK version does each Open vSwitch release work with? 2.15.x 20.11.6 2.16.x 20.11.6 -2.17.x 21.11.6 -3.0.x21.11.6 -3.1.x22.11.4 -3.2.x22.11.4 -3.3.x23.11 +2.17.x 21.11.7 +3.0.x21.11.7 +3.1.x22.11.5 +3.2.x22.11.5 +3.3.x23.11.1 diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index f1646322c..63a978f0e 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -43,5 +43,5 @@ In addition to the requirements described in :doc:`general`, building Open vSwitch with DPDK will require the following: -- DPDK 23.11 +- DPDK 23.11.1 - A `DPDK supported NIC`_ @@ -74,7 +74,7 @@ Install DPDK $ cd /usr/src/ - $ wget https://fast.dpdk.org/rel/dpdk-23.11.tar.xz - $ tar xf dpdk-23.11.tar.xz - $ export DPDK_DIR=/usr/src/dpdk-23.11 + $ wget https://fast.dpdk.org/rel/dpdk-23.11.1.tar.xz + $ tar xf dpdk-23.11.1.tar.xz + $ export DPDK_DIR=/usr/src/dpdk-stable-23.11.1 $ cd $DPDK_DIR diff --git a/NEWS b/NEWS index b92cec532..f7e8a1372 100644 --- a/NEWS +++ b/NEWS @@ -8,4 +8,6 @@ Post-v3.3.0 The OVS tree remains hosted on GitHub. https://github.com/openvswitch/ovs.git + - DPDK: + * Add support for DPDK 23.11.1. -- 2.44.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH branch-3.3] dpdk: Use DPDK 23.11.1 release for OVS 3.3.
Update the CI and docs to use DPDK 23.11.1. Signed-off-by: Kevin Traynor --- .github/workflows/build-and-test.yml | 4 ++-- Documentation/faq/releases.rst | 10 +- Documentation/intro/install/dpdk.rst | 8 NEWS | 3 +++ 4 files changed, 14 insertions(+), 11 deletions(-) diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml index 44491db3e..4a012efd9 100644 --- a/.github/workflows/build-and-test.yml +++ b/.github/workflows/build-and-test.yml @@ -11,6 +11,6 @@ jobs: dependencies: gcc libbpf-dev libnuma-dev libpcap-dev ninja-build pkgconf CC: gcc - DPDK_GIT: https://dpdk.org/git/dpdk - DPDK_VER: 23.11 + DPDK_GIT: https://dpdk.org/git/dpdk-stable + DPDK_VER: 23.11.1 name: dpdk gcc outputs: diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst index 49b987b61..95ce0bf41 100644 --- a/Documentation/faq/releases.rst +++ b/Documentation/faq/releases.rst @@ -217,9 +217,9 @@ Q: What DPDK version does each Open vSwitch release work with? 2.15.x 20.11.6 2.16.x 20.11.6 -2.17.x 21.11.6 -3.0.x21.11.6 -3.1.x22.11.4 -3.2.x22.11.4 -3.3.x23.11 +2.17.x 21.11.7 +3.0.x21.11.7 +3.1.x22.11.5 +3.2.x22.11.5 +3.3.x23.11.1 diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index ad9bdf22c..93c4a12e7 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -43,5 +43,5 @@ In addition to the requirements described in :doc:`general`, building Open vSwitch with DPDK will require the following: -- DPDK 23.11 +- DPDK 23.11.1 - A `DPDK supported NIC`_ @@ -74,7 +74,7 @@ Install DPDK $ cd /usr/src/ - $ wget https://fast.dpdk.org/rel/dpdk-23.11.tar.xz - $ tar xf dpdk-23.11.tar.xz - $ export DPDK_DIR=/usr/src/dpdk-23.11 + $ wget https://fast.dpdk.org/rel/dpdk-23.11.1.tar.xz + $ tar xf dpdk-23.11.1.tar.xz + $ export DPDK_DIR=/usr/src/dpdk-stable-23.11.1 $ cd $DPDK_DIR diff --git a/NEWS b/NEWS index 4bfb341cf..1c93d2944 100644 --- a/NEWS +++ b/NEWS @@ -1,4 +1,7 @@ v3.3.1 - xx xxx + - DPDK: + * Add support for DPDK 23.11.1. + v3.3.0 - 16 Feb 2024 -- 2.44.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH branch-3.2] dpdk: Use DPDK 22.11.5 release for OVS 3.2.
Update the CI and docs to use DPDK 22.11.5. Signed-off-by: Kevin Traynor --- .github/workflows/build-and-test.yml | 2 +- Documentation/faq/releases.rst | 8 Documentation/intro/install/dpdk.rst | 8 NEWS | 2 ++ 4 files changed, 11 insertions(+), 9 deletions(-) diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml index 565058cec..951e7e321 100644 --- a/.github/workflows/build-and-test.yml +++ b/.github/workflows/build-and-test.yml @@ -12,5 +12,5 @@ jobs: CC: gcc DPDK_GIT: https://dpdk.org/git/dpdk-stable - DPDK_VER: 22.11.4 + DPDK_VER: 22.11.5 name: dpdk gcc outputs: diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst index f47d40836..2be9b501c 100644 --- a/Documentation/faq/releases.rst +++ b/Documentation/faq/releases.rst @@ -216,8 +216,8 @@ Q: What DPDK version does each Open vSwitch release work with? 2.15.x 20.11.6 2.16.x 20.11.6 -2.17.x 21.11.6 -3.0.x21.11.6 -3.1.x22.11.4 -3.2.x22.11.4 +2.17.x 21.11.7 +3.0.x21.11.7 +3.1.x22.11.5 +3.2.x22.11.5 diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index 27df48493..a9921ae42 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -43,5 +43,5 @@ In addition to the requirements described in :doc:`general`, building Open vSwitch with DPDK will require the following: -- DPDK 22.11.4 +- DPDK 22.11.5 - A `DPDK supported NIC`_ @@ -74,7 +74,7 @@ Install DPDK $ cd /usr/src/ - $ wget https://fast.dpdk.org/rel/dpdk-22.11.4.tar.xz - $ tar xf dpdk-22.11.4.tar.xz - $ export DPDK_DIR=/usr/src/dpdk-stable-22.11.4 + $ wget https://fast.dpdk.org/rel/dpdk-22.11.5.tar.xz + $ tar xf dpdk-22.11.5.tar.xz + $ export DPDK_DIR=/usr/src/dpdk-stable-22.11.5 $ cd $DPDK_DIR diff --git a/NEWS b/NEWS index baeecae04..8baf3d2aa 100644 --- a/NEWS +++ b/NEWS @@ -1,4 +1,6 @@ v3.2.3 - xx xxx + - DPDK: + * OVS validated with DPDK 22.11.5. v3.2.2 - 08 Feb 2024 -- 2.44.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH branch-3.1] dpdk: Use DPDK 22.11.5 release for OVS 3.1.
Update the CI and docs to use DPDK 22.11.5. Signed-off-by: Kevin Traynor --- .github/workflows/build-and-test.yml | 2 +- Documentation/faq/releases.rst | 6 +++--- Documentation/intro/install/dpdk.rst | 8 NEWS | 2 ++ 4 files changed, 10 insertions(+), 8 deletions(-) diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml index 22dc81252..6659c1025 100644 --- a/.github/workflows/build-and-test.yml +++ b/.github/workflows/build-and-test.yml @@ -12,5 +12,5 @@ jobs: CC: gcc DPDK_GIT: https://dpdk.org/git/dpdk-stable - DPDK_VER: 22.11.4 + DPDK_VER: 22.11.5 name: dpdk gcc outputs: diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst index a393f78a8..a2625062d 100644 --- a/Documentation/faq/releases.rst +++ b/Documentation/faq/releases.rst @@ -215,7 +215,7 @@ Q: What DPDK version does each Open vSwitch release work with? 2.15.x 20.11.6 2.16.x 20.11.6 -2.17.x 21.11.6 -3.0.x21.11.6 -3.1.x22.11.4 +2.17.x 21.11.7 +3.0.x21.11.7 +3.1.x22.11.5 diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index 27df48493..a9921ae42 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -43,5 +43,5 @@ In addition to the requirements described in :doc:`general`, building Open vSwitch with DPDK will require the following: -- DPDK 22.11.4 +- DPDK 22.11.5 - A `DPDK supported NIC`_ @@ -74,7 +74,7 @@ Install DPDK $ cd /usr/src/ - $ wget https://fast.dpdk.org/rel/dpdk-22.11.4.tar.xz - $ tar xf dpdk-22.11.4.tar.xz - $ export DPDK_DIR=/usr/src/dpdk-stable-22.11.4 + $ wget https://fast.dpdk.org/rel/dpdk-22.11.5.tar.xz + $ tar xf dpdk-22.11.5.tar.xz + $ export DPDK_DIR=/usr/src/dpdk-stable-22.11.5 $ cd $DPDK_DIR diff --git a/NEWS b/NEWS index aaddb77e8..44a3e6247 100644 --- a/NEWS +++ b/NEWS @@ -1,4 +1,6 @@ v3.1.5 - xx xxx + - DPDK: + * OVS validated with DPDK 22.11.5. v3.1.4 - 08 Feb 2024 -- 2.44.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH branch-3.0] dpdk: Use DPDK 21.11.7 release for OVS 3.0.
Update the CI and docs to use DPDK 21.11.7. Signed-off-by: Kevin Traynor --- .ci/linux-build.sh | 2 +- Documentation/faq/releases.rst | 4 ++-- Documentation/intro/install/dpdk.rst | 8 NEWS | 2 ++ 4 files changed, 9 insertions(+), 7 deletions(-) diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh index 7cb183dde..9d44c80d7 100755 --- a/.ci/linux-build.sh +++ b/.ci/linux-build.sh @@ -229,5 +229,5 @@ fi if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then if [ -z "$DPDK_VER" ]; then -DPDK_VER="21.11.6" +DPDK_VER="21.11.7" fi install_dpdk $DPDK_VER diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst index 0dd2ec865..10156c209 100644 --- a/Documentation/faq/releases.rst +++ b/Documentation/faq/releases.rst @@ -214,6 +214,6 @@ Q: What DPDK version does each Open vSwitch release work with? 2.15.x 20.11.6 2.16.x 20.11.6 -2.17.x 21.11.6 -3.0.x21.11.6 +2.17.x 21.11.7 +3.0.x21.11.7 diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index fff49007e..3d011ecb3 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -43,5 +43,5 @@ In addition to the requirements described in :doc:`general`, building Open vSwitch with DPDK will require the following: -- DPDK 21.11.6 +- DPDK 21.11.7 - A `DPDK supported NIC`_ @@ -74,7 +74,7 @@ Install DPDK $ cd /usr/src/ - $ wget https://fast.dpdk.org/rel/dpdk-21.11.6.tar.xz - $ tar xf dpdk-21.11.6.tar.xz - $ export DPDK_DIR=/usr/src/dpdk-stable-21.11.6 + $ wget https://fast.dpdk.org/rel/dpdk-21.11.7.tar.xz + $ tar xf dpdk-21.11.7.tar.xz + $ export DPDK_DIR=/usr/src/dpdk-stable-21.11.7 $ cd $DPDK_DIR diff --git a/NEWS b/NEWS index 0bb28638c..8550e8e89 100644 --- a/NEWS +++ b/NEWS @@ -1,4 +1,6 @@ v3.0.7 - xx xxx + - DPDK: + * OVS validated with DPDK 21.11.7. v3.0.6 - 08 Feb 2024 -- 2.44.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH branch-2.17] dpdk: Use DPDK 21.11.7 release for OVS 2.17.
Update the CI and docs to use DPDK 21.11.7. Signed-off-by: Kevin Traynor --- .ci/linux-build.sh | 2 +- Documentation/faq/releases.rst | 2 +- Documentation/intro/install/dpdk.rst | 8 NEWS | 2 ++ 4 files changed, 8 insertions(+), 6 deletions(-) diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh index 2e1bf2588..0f5263709 100755 --- a/.ci/linux-build.sh +++ b/.ci/linux-build.sh @@ -221,5 +221,5 @@ fi if [ "$DPDK" ] || [ "$DPDK_SHARED" ]; then if [ -z "$DPDK_VER" ]; then -DPDK_VER="21.11.6" +DPDK_VER="21.11.7" fi install_dpdk $DPDK_VER diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst index a9045a2d5..1e896add5 100644 --- a/Documentation/faq/releases.rst +++ b/Documentation/faq/releases.rst @@ -211,5 +211,5 @@ Q: What DPDK version does each Open vSwitch release work with? 2.15.x 20.11.6 2.16.x 20.11.6 -2.17.x 21.11.6 +2.17.x 21.11.7 diff --git a/Documentation/intro/install/dpdk.rst b/Documentation/intro/install/dpdk.rst index fff49007e..3d011ecb3 100644 --- a/Documentation/intro/install/dpdk.rst +++ b/Documentation/intro/install/dpdk.rst @@ -43,5 +43,5 @@ In addition to the requirements described in :doc:`general`, building Open vSwitch with DPDK will require the following: -- DPDK 21.11.6 +- DPDK 21.11.7 - A `DPDK supported NIC`_ @@ -74,7 +74,7 @@ Install DPDK $ cd /usr/src/ - $ wget https://fast.dpdk.org/rel/dpdk-21.11.6.tar.xz - $ tar xf dpdk-21.11.6.tar.xz - $ export DPDK_DIR=/usr/src/dpdk-stable-21.11.6 + $ wget https://fast.dpdk.org/rel/dpdk-21.11.7.tar.xz + $ tar xf dpdk-21.11.7.tar.xz + $ export DPDK_DIR=/usr/src/dpdk-stable-21.11.7 $ cd $DPDK_DIR diff --git a/NEWS b/NEWS index ca111ae3a..e91072dd6 100644 --- a/NEWS +++ b/NEWS @@ -1,4 +1,6 @@ v2.17.10 - xx xxx -- + - DPDK: + * OVS validated with DPDK 21.11.7. v2.17.9 - 08 Feb 2024 -- 2.44.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [RFC 00/11] Add psample support to NXAST_SAMPLE action.
On Wed, May 22, 2024 at 03:26:05PM GMT, Aaron Conole wrote: > Ilya Maximets writes: > > > On 5/16/24 19:03, Adrian Moreno wrote: > >> > >> > >> On 4/24/24 9:53 PM, Adrian Moreno wrote: > >>> This is the userspace counterpart of the work being done in the kernel > >>> [1]. Sending it as RFC to get some early feedback on the overall > >>> solution. > >>> > >>> ** Problem description ** > >>> Currently, OVS supports several observability features, such as > >>> per-bridge IPFIX, per-flow IPFIX and sFlow. However, given the > >>> complexity of OVN-generated pipelines, a single sample is typically not > >>> enough to troubleshoot what is OVN/OVS is doing with the packet. We need > >>> highler level metadata alongside the packet sample. > >>> > >>> This can be achieved by the means of per-flow IPFIX sampling and > >>> NXAST_SAMPLE action, through which OVN can add two arbitrary 32-bit > >>> values (observation_domain_id and observation_point_id) that are sent > >>> alongside the packet information in the IPFIX frames. > >>> > >>> However, there is a fundamental limitation of the existing sampling > >>> infrastructure, specially in the kernel datapath: samples are handled by > >>> ovs-vswitchd, forcing the need for an upcall (userspace action). The > >>> fact that samples share the same netlink sockets and handler thread cpu > >>> time with actual packet misse, can easily cause packet drops making this > >>> solution unfit for use in highly loaded production systems. > >>> > >>> Users are left with little option than guessing what sampling rate will > >>> be OK for their traffic pattern and dealing with the lost accuracy. > >>> > >>> ** Feature Description ** > >>> In order to solve this situation and enable this feature to be safely > >>> turned on in production, this RFC uses the psample support proposed in > >>> [1] to send NXAST_SAMPLE samples to psample adding the observability > >>> domain and point information as metadata. > >>> > >>> ~~ API ~~ > >>> The API is simply a new field called "psample_group" in the > >>> Flow_Sample_Collector_Set (FSCS) table. Configuring this value is > >>> orthogonal to also associating the FSCS entry to an entry in the IPFIX > >>> table. > >>> > >>> Apart from that configuration, the controller needs to add NXAST_SAMPLE > >>> actions that refer the entry's id. > >>> > >>> ~~ HW Offload ~~ > >>> psample is already supported by tc using the act_sample action. The > >>> metadata is taken from the actions' cookie. > >>> This series also adds support for offloading the odp sample action, > >>> only when it includes psample information but not nested actions. > >>> > >>> A bit of special care has to be taken in the tc layer to not store the > >>> ufid in the sample's cookie as it's used to carry action-specific data. > >>> > >>> ~~ Datapath support ~~ > >>> This new behavior of the datapth sample action is only supported on the > >>> kernel datapath. A more detailed analysis is needed (and planned) to > >>> find a way to also optimize the userspace datapath. However, although > >>> potentially inefficient, there is not that much risk of dropping packets > >>> with the existing IPFIX infrastructure. > >>> > >>> ~~ Testing ~~ > >>> The series includes an utility program called "ovs-psample" that listens > >>> to packets multicasted by the psample module and prints them (also > >>> printing the obs_domain and obs_point ids). In addition the kernel > >>> series includes a tracepoint for easy testing and troubleshooting. > >>> > >>> [1] https://patchwork.kernel.org/project/netdevbpf/list/?series=847473 > >>> > >>> > >> > >> Hi all, > >> > >> I had an offline meeting with Eelco, Ilya and Aaron about this topic and > >> wanted > >> to put out what was discussed and hopefully get more feedback. > >> > >> In general, 3 options were considered: > >> > >> Option A: Reusing the sample action > >> === > >> This is essentially what this proposal (and it's kernel counterpart) > >> consists > >> in. The datapath action would look like this: > >> > >> sample(probability=50%, actions(...), group=10, cookie=0x123) > >> > >> Pros > >> > >> - In userspace, it fits nicely with the existing per-flow sampling > >> infrastructure as this RFC exemplifies. > >> > >> - The probability is present in the context of sending the packet to > >> psample so > >> it's easily carried to psample's rate, making the consumer aware of the > >> accuracy > >> of the sample. > >> > >> - Relatively easy to implement in tc as a act_sample exists with similar > >> semantics. > >> > >> Cons > >> > >> - It breaks the original design of the "sample" action. The "sample" action > >> means: The packet is sampled (a probability is calculated) and, if the > >> result is > >> positive, a set of nested actions is executed. This follows the > >> "building-blocks" approach of datapath action. This option breaks this > >> assumption by adding more semantics and behavior to the "sample" act
Re: [ovs-dev] [PATCH ovn 4/4] controller, northd: Add support for CT zone limits.
Bleep bloop. Greetings Ales Musil, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Line is 84 characters long (recommended limit is 79) #497 FILE: ovn-nb.xml:726: type='{"type": "integer", "minInteger": 0, "maxInteger": 4294967295}'> WARNING: Line is 86 characters long (recommended limit is 79) #514 FILE: ovn-nb.xml:1137: type='{"type": "integer", "minInteger": 0, "maxInteger": 4294967295}'> WARNING: Line is 84 characters long (recommended limit is 79) #532 FILE: ovn-nb.xml:2811: type='{"type": "integer", "minInteger": 0, "maxInteger": 4294967295}'> Lines checked: 651, Warnings: 3, Errors: 0 Please check this out. If you feel there has been an error, please email acon...@redhat.com Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn 1/4] controller: Move CT zone handling into separate module.
Bleep bloop. Greetings Ales Musil, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. checkpatch: WARNING: Comment with 'xxx' marker #190 FILE: controller/ct-zone.c:146: /* XXX Add method to limit zone assignment to logical router WARNING: Comment with 'xxx' marker #268 FILE: controller/ct-zone.c:224: /* xxx This is wasteful to assign a zone to each port--even if no WARNING: Comment with 'xxx' marker #269 FILE: controller/ct-zone.c:225: * xxx security policy is applied. */ Lines checked: 1109, Warnings: 3, Errors: 0 Please check this out. If you feel there has been an error, please email acon...@redhat.com Thanks, 0-day Robot ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH ovn 4/4] controller, northd: Add support for CT zone limits.
Add support for limitng the CT zone usage per Ls, LR or LSP. When the limit is configured on logical switch it will also implicitly set limits for all ports in that logical switch. The port configuration can be overwritten individually and has priority over the whole logical switch configuration. The value 0 means unlimited, when the value is not specified it is derived from OvS default CT limit specified for given OvS datapath. Reported-at: https://bugzilla.redhat.com/2189924 Signed-off-by: Ales Musil --- NEWS| 3 + controller/ct-zone.c| 170 controller/ct-zone.h| 12 ++- controller/ovn-controller.c | 25 +- lib/ovn-util.c | 17 lib/ovn-util.h | 3 + northd/northd.c | 8 ++ ovn-nb.xml | 29 ++ tests/ovn-controller.at | 99 + 9 files changed, 345 insertions(+), 21 deletions(-) diff --git a/NEWS b/NEWS index 81c958f9a..e0465a34f 100644 --- a/NEWS +++ b/NEWS @@ -21,6 +21,9 @@ Post v24.03.0 MAC addresses configured on the LSP with "unknown", are learnt via the OVN native FDB. - Add support for ovsdb-server `--config-file` option in ovn-ctl. + - Add support for CT zone limit that can be specified per LR +(options:ct-zone-limit), LS (other_config:ct-zone-limit) or LSP +(options:ct-zone-limit). OVN v24.03.0 - 01 Mar 2024 -- diff --git a/controller/ct-zone.c b/controller/ct-zone.c index 6065cbfe6..259d423bb 100644 --- a/controller/ct-zone.c +++ b/controller/ct-zone.c @@ -34,6 +34,17 @@ static bool ct_zone_assign_unused(struct ct_zone_ctx *ctx, static bool ct_zone_remove(struct ct_zone_ctx *ctx, const char *name); static void ct_zone_add(struct ct_zone_ctx *ctx, const char *name, uint16_t zone, bool set_pending); +static void ct_zone_limits_sync_per_dp(struct ct_zone_ctx *ctx, + const struct sbrec_datapath_binding *dp, + const char *name, + struct ovsdb_idl_index *pb_by_dp); +static void ct_zone_limit_sync(struct ct_zone_ctx *ctx, const char *name, + int64_t limit); +static int64_t ct_zone_get_dp_limit(const struct sbrec_datapath_binding *dp); +static int64_t ct_zone_get_pb_limit(const struct sbrec_port_binding *pb); +static int64_t ct_zone_limit_normalize(int64_t limit); +static struct ovsrec_ct_zone * +ct_zone_find_ovsrec(const struct ovsrec_datapath *dp, uint16_t zone_id); void ct_zones_restore(struct ct_zone_ctx *ctx, @@ -195,11 +206,14 @@ ct_zones_update(const struct sset *local_lports, void ct_zones_commit(const struct ovsrec_bridge *br_int, +const struct ovsrec_datapath *ovs_dp, +struct ovsdb_idl_txn *ovs_idl_txn, struct shash *pending_ct_zones) { struct shash_node *iter; SHASH_FOR_EACH (iter, pending_ct_zones) { struct ct_zone_pending_entry *ctzpe = iter->data; +struct ct_zone *ct_zone = &ctzpe->ct_zone; /* The transaction is open, so any pending entries in the * CT_ZONE_DB_QUEUED must be sent and any in CT_ZONE_DB_QUEUED @@ -211,7 +225,7 @@ ct_zones_commit(const struct ovsrec_bridge *br_int, char *user_str = xasprintf("ct-zone-%s", iter->name); if (ctzpe->add) { -char *zone_str = xasprintf("%"PRIu16, ctzpe->ct_zone.zone); +char *zone_str = xasprintf("%"PRIu16, ct_zone->zone); struct smap_node *node = smap_get_node(&br_int->external_ids, user_str); if (!node || strcmp(node->value, zone_str)) { @@ -226,6 +240,19 @@ ct_zones_commit(const struct ovsrec_bridge *br_int, } free(user_str); +struct ovsrec_ct_zone *ovs_zone = +ct_zone_find_ovsrec(ovs_dp, ct_zone->zone); +if ((!ctzpe->add || ct_zone->limit < 0) && ovs_zone) { +ovsrec_datapath_update_ct_zones_delkey(ovs_dp, ct_zone->zone); +} else if (ctzpe->add && ct_zone->limit >= 0) { +if (!ovs_zone) { +ovs_zone = ovsrec_ct_zone_insert(ovs_idl_txn); +ovsrec_datapath_update_ct_zones_setkey(ovs_dp, ct_zone->zone, + ovs_zone); +} +ovsrec_ct_zone_set_limit(ovs_zone, &ct_zone->limit, 1); +} + ctzpe->state = CT_ZONE_DB_SENT; } } @@ -246,8 +273,19 @@ ct_zones_pending_clear_commited(struct shash *pending) /* Returns "true" when there is no need for full recompute. */ bool ct_zone_handle_dp_update(struct ct_zone_ctx *ctx, - const struct sbrec_datapath_binding *dp) + const struct sbrec_datapath_binding *dp, + struct ovsdb_idl_index *pb_by_dp) { +const char *name = smap_get(&dp->external_ids,
[ovs-dev] [PATCH ovn 1/4] controller: Move CT zone handling into separate module.
Move the CT zone handling specific bits into it's own module. This allows for easier changes done within the module and separates the logic that is unrelated from ovn-controller. Signed-off-by: Ales Musil --- controller/automake.mk | 4 +- controller/ct-zone.c| 377 ++ controller/ct-zone.h| 74 +++ controller/ofctrl.c | 3 +- controller/ovn-controller.c | 392 +++- controller/ovn-controller.h | 21 +- controller/pinctrl.c| 2 +- tests/ovn.at| 4 +- 8 files changed, 485 insertions(+), 392 deletions(-) create mode 100644 controller/ct-zone.c create mode 100644 controller/ct-zone.h diff --git a/controller/automake.mk b/controller/automake.mk index 1b1b3aeb1..ed93cfb3c 100644 --- a/controller/automake.mk +++ b/controller/automake.mk @@ -47,7 +47,9 @@ controller_ovn_controller_SOURCES = \ controller/mac-cache.h \ controller/mac-cache.c \ controller/statctrl.h \ - controller/statctrl.c + controller/statctrl.c \ + controller/ct-zone.h \ + controller/ct-zone.c controller_ovn_controller_LDADD = lib/libovn.la $(OVS_LIBDIR)/libopenvswitch.la man_MANS += controller/ovn-controller.8 diff --git a/controller/ct-zone.c b/controller/ct-zone.c new file mode 100644 index 0..96084fd9e --- /dev/null +++ b/controller/ct-zone.c @@ -0,0 +1,377 @@ +/* Copyright (c) 2024, Red Hat, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include + +#include "ct-zone.h" +#include "local_data.h" +#include "openvswitch/vlog.h" + +VLOG_DEFINE_THIS_MODULE(ct_zone); + +static void +ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table, +struct ct_zone_ctx *ctx, const char *name, int zone); +static void ct_zone_add_pending(struct shash *pending_ct_zones, +enum ct_zone_pending_state state, +int zone, bool add, const char *name); + +void +ct_zones_restore(struct ct_zone_ctx *ctx, + const struct ovsrec_open_vswitch_table *ovs_table, + const struct sbrec_datapath_binding_table *dp_table, + const struct ovsrec_bridge *br_int) +{ +memset(ctx->bitmap, 0, sizeof ctx->bitmap); +bitmap_set1(ctx->bitmap, 0); /* Zone 0 is reserved. */ + +struct shash_node *pending_node; +SHASH_FOR_EACH (pending_node, &ctx->pending) { +struct ct_zone_pending_entry *ctpe = pending_node->data; + +if (ctpe->add) { +ct_zone_restore(dp_table, ctx, pending_node->name, ctpe->zone); +} +} + +const struct ovsrec_open_vswitch *cfg; +cfg = ovsrec_open_vswitch_table_first(ovs_table); +if (!cfg) { +return; +} + +if (!br_int) { +/* If the integration bridge hasn't been defined, assume that + * any existing ct-zone definitions aren't valid. */ +return; +} + +struct smap_node *node; +SMAP_FOR_EACH (node, &br_int->external_ids) { +if (strncmp(node->key, "ct-zone-", 8)) { +continue; +} + +const char *user = node->key + 8; +if (!user[0]) { +continue; +} + +if (shash_find(&ctx->pending, user)) { +continue; +} + +unsigned int zone; +if (!str_to_uint(node->value, 10, &zone)) { +continue; +} + +ct_zone_restore(dp_table, ctx, user, zone); +} +} + +bool +ct_zone_assign_unused(struct ct_zone_ctx *ctx, const char *zone_name, + int *scan_start) +{ +/* We assume that there are 64K zones and that we own them all. */ +int zone = bitmap_scan(ctx->bitmap, 0, *scan_start, MAX_CT_ZONES + 1); +if (zone == MAX_CT_ZONES + 1) { +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); +VLOG_WARN_RL(&rl, "exhausted all ct zones"); +return false; +} + +*scan_start = zone + 1; + +ct_zone_add_pending(&ctx->pending, CT_ZONE_OF_QUEUED, +zone, true, zone_name); + +bitmap_set1(ctx->bitmap, zone); +simap_put(&ctx->current, zone_name, zone); +return true; +} + +bool +ct_zone_remove(struct ct_zone_ctx *ctx, const char *name) +{ +struct simap_node *ct_zone = simap_find(&ctx->current, name); +if (!ct_zone) { +return false; +} + +VLOG_DBG("removing ct zone %"PRId32" for '%s'", ct_zo
[ovs-dev] [PATCH ovn 3/4] controller: Prepare structure around CT zone limiting.
In order to be able to store CT limits for specified zone, store the zone inside separate struct instead of simap. This allows to add the addition of limit without chaning the whole infrastructure again. This is a preparation step for the CT zone limits. Signed-off-by: Ales Musil --- controller/ct-zone.c| 171 +--- controller/ct-zone.h| 13 ++- controller/ofctrl.c | 2 +- controller/ovn-controller.c | 17 ++-- controller/physical.c | 17 ++-- controller/physical.h | 2 +- 6 files changed, 129 insertions(+), 93 deletions(-) diff --git a/controller/ct-zone.c b/controller/ct-zone.c index 16452bc2d..6065cbfe6 100644 --- a/controller/ct-zone.c +++ b/controller/ct-zone.c @@ -26,12 +26,14 @@ ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table, struct ct_zone_ctx *ctx, const char *name, int zone); static void ct_zone_add_pending(struct shash *pending_ct_zones, enum ct_zone_pending_state state, -int zone, bool add, const char *name); +struct ct_zone *zone, bool add, +const char *name); static int ct_zone_get_snat(const struct sbrec_datapath_binding *dp); static bool ct_zone_assign_unused(struct ct_zone_ctx *ctx, const char *zone_name, int *scan_start); -static bool ct_zone_remove(struct ct_zone_ctx *ctx, - struct simap_node *ct_zone); +static bool ct_zone_remove(struct ct_zone_ctx *ctx, const char *name); +static void ct_zone_add(struct ct_zone_ctx *ctx, const char *name, +uint16_t zone, bool set_pending); void ct_zones_restore(struct ct_zone_ctx *ctx, @@ -47,7 +49,8 @@ ct_zones_restore(struct ct_zone_ctx *ctx, struct ct_zone_pending_entry *ctpe = pending_node->data; if (ctpe->add) { -ct_zone_restore(dp_table, ctx, pending_node->name, ctpe->zone); +ct_zone_restore(dp_table, ctx, pending_node->name, +ctpe->ct_zone.zone); } } @@ -91,7 +94,6 @@ void ct_zones_update(const struct sset *local_lports, const struct hmap *local_datapaths, struct ct_zone_ctx *ctx) { -struct simap_node *ct_zone; int scan_start = 1; const char *user; struct sset all_users = SSET_INITIALIZER(&all_users); @@ -132,12 +134,14 @@ ct_zones_update(const struct sset *local_lports, } /* Delete zones that do not exist in above sset. */ -SIMAP_FOR_EACH_SAFE (ct_zone, &ctx->current) { -if (!sset_contains(&all_users, ct_zone->name)) { -ct_zone_remove(ctx, ct_zone); -} else if (!simap_find(&req_snat_zones, ct_zone->name)) { -bitmap_set1(unreq_snat_zones_map, ct_zone->data); -simap_put(&unreq_snat_zones, ct_zone->name, ct_zone->data); +struct shash_node *node; +SHASH_FOR_EACH_SAFE (node, &ctx->current) { +struct ct_zone *ct_zone = node->data; +if (!sset_contains(&all_users, node->name)) { +ct_zone_remove(ctx, node->name); +} else if (!simap_find(&req_snat_zones, node->name)) { +bitmap_set1(unreq_snat_zones_map, ct_zone->zone); +simap_put(&unreq_snat_zones, node->name, ct_zone->zone); } } @@ -152,7 +156,7 @@ ct_zones_update(const struct sset *local_lports, struct simap_node *unreq_node; SIMAP_FOR_EACH_SAFE (unreq_node, &unreq_snat_zones) { if (unreq_node->data == snat_req_node->data) { -simap_find_and_delete(&ctx->current, unreq_node->name); +ct_zone_remove(ctx, unreq_node->name); simap_delete(&unreq_snat_zones, unreq_node); } } @@ -163,26 +167,12 @@ ct_zones_update(const struct sset *local_lports, bitmap_set0(unreq_snat_zones_map, snat_req_node->data); } -struct simap_node *node = simap_find(&ctx->current, - snat_req_node->name); -if (node) { -if (node->data != snat_req_node->data) { -/* Zone request has changed for this node. delete old entry and - * create new one*/ -ct_zone_add_pending(&ctx->pending, CT_ZONE_OF_QUEUED, -snat_req_node->data, true, -snat_req_node->name); -bitmap_set0(ctx->bitmap, node->data); -} -bitmap_set1(ctx->bitmap, snat_req_node->data); -node->data = snat_req_node->data; -} else { -ct_zone_add_pending(&ctx->pending, CT_ZONE_OF_QUEUED, -snat_req_node->data, true, -snat_req_node->name); -bitmap_set1(ctx->bitmap, snat_req_node->data); -
[ovs-dev] [PATCH ovn 2/4] controller: Further encapsulate the CT zone handling.
Move more code into the new ct-zone module and encapsulate functionality that is strictly related to CT zone handling. Signed-off-by: Ales Musil --- controller/ct-zone.c| 156 +--- controller/ct-zone.h| 8 +- controller/ovn-controller.c | 49 ++- 3 files changed, 118 insertions(+), 95 deletions(-) diff --git a/controller/ct-zone.c b/controller/ct-zone.c index 96084fd9e..16452bc2d 100644 --- a/controller/ct-zone.c +++ b/controller/ct-zone.c @@ -27,6 +27,11 @@ ct_zone_restore(const struct sbrec_datapath_binding_table *dp_table, static void ct_zone_add_pending(struct shash *pending_ct_zones, enum ct_zone_pending_state state, int zone, bool add, const char *name); +static int ct_zone_get_snat(const struct sbrec_datapath_binding *dp); +static bool ct_zone_assign_unused(struct ct_zone_ctx *ctx, + const char *zone_name, int *scan_start); +static bool ct_zone_remove(struct ct_zone_ctx *ctx, + struct simap_node *ct_zone); void ct_zones_restore(struct ct_zone_ctx *ctx, @@ -82,47 +87,6 @@ ct_zones_restore(struct ct_zone_ctx *ctx, } } -bool -ct_zone_assign_unused(struct ct_zone_ctx *ctx, const char *zone_name, - int *scan_start) -{ -/* We assume that there are 64K zones and that we own them all. */ -int zone = bitmap_scan(ctx->bitmap, 0, *scan_start, MAX_CT_ZONES + 1); -if (zone == MAX_CT_ZONES + 1) { -static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); -VLOG_WARN_RL(&rl, "exhausted all ct zones"); -return false; -} - -*scan_start = zone + 1; - -ct_zone_add_pending(&ctx->pending, CT_ZONE_OF_QUEUED, -zone, true, zone_name); - -bitmap_set1(ctx->bitmap, zone); -simap_put(&ctx->current, zone_name, zone); -return true; -} - -bool -ct_zone_remove(struct ct_zone_ctx *ctx, const char *name) -{ -struct simap_node *ct_zone = simap_find(&ctx->current, name); -if (!ct_zone) { -return false; -} - -VLOG_DBG("removing ct zone %"PRId32" for '%s'", ct_zone->data, - ct_zone->name); - -ct_zone_add_pending(&ctx->pending, CT_ZONE_OF_QUEUED, -ct_zone->data, false, ct_zone->name); -bitmap_set0(ctx->bitmap, ct_zone->data); -simap_delete(&ctx->current, ct_zone); - -return true; -} - void ct_zones_update(const struct sset *local_lports, const struct hmap *local_datapaths, struct ct_zone_ctx *ctx) @@ -170,7 +134,7 @@ ct_zones_update(const struct sset *local_lports, /* Delete zones that do not exist in above sset. */ SIMAP_FOR_EACH_SAFE (ct_zone, &ctx->current) { if (!sset_contains(&all_users, ct_zone->name)) { -ct_zone_remove(ctx, ct_zone->name); +ct_zone_remove(ctx, ct_zone); } else if (!simap_find(&req_snat_zones, ct_zone->name)) { bitmap_set1(unreq_snat_zones_map, ct_zone->data); simap_put(&unreq_snat_zones, ct_zone->name, ct_zone->data); @@ -276,12 +240,6 @@ ct_zones_commit(const struct ovsrec_bridge *br_int, } } -int -ct_zone_get_snat(const struct sbrec_datapath_binding *dp) -{ -return smap_get_int(&dp->external_ids, "snat-ct-zone", -1); -} - void ct_zones_pending_clear_commited(struct shash *pending) { @@ -295,6 +253,108 @@ ct_zones_pending_clear_commited(struct shash *pending) } } +/* Returns "true" when there is no need for full recompute. */ +bool +ct_zone_handle_dp_update(struct ct_zone_ctx *ctx, + const struct sbrec_datapath_binding *dp) +{ +int req_snat_zone = ct_zone_get_snat(dp); +if (req_snat_zone == -1) { +/* datapath snat ct zone is not set. This condition will also hit + * when CMS clears the snat-ct-zone for the logical router. + * In this case there is no harm in using the previosly specified + * snat ct zone for this datapath. Also it is hard to know + * if this option was cleared or if this option is never set. */ +return true; +} + +const char *name = smap_get(&dp->external_ids, "name"); +if (!name) { +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 1); +VLOG_ERR_RL(&rl, "Missing name for datapath '"UUID_FMT"' skipping" +"zone check.", UUID_ARGS(&dp->header_.uuid)); +return true; +} + +/* Check if the requested snat zone has changed for the datapath + * or not. If so, then fall back to full recompute of + * ct_zone engine. */ +char *snat_dp_zone_key = alloc_nat_zone_key(name, "snat"); +struct simap_node *simap_node = +simap_find(&ctx->current, snat_dp_zone_key); +free(snat_dp_zone_key); +if (!simap_node || simap_node->data != req_snat_zone) { +/* There is no entry yet or the requested snat zone has changed. + *
[ovs-dev] [PATCH ovn 0/4] Add ability to limit CT entries per LS/LR/LSP
Add ability that allows to set CT limits per logical switch, logical router or logical switch port. When the limit is applied to logical switch it will be implicitly set for all logical ports in the logical switch. This can be overwritten individually per port. To achieve this there is a small refactor of the CT zone handling logic which allows us to get the zone limiting more easily. Ales Musil (4): controller: Move CT zone handling into separate module. controller: Further encapsulate the CT zone handling. controller: Prepare structure around CT zone limiting. controller, northd: Add support for CT zone limits. NEWS| 3 + controller/automake.mk | 4 +- controller/ct-zone.c| 604 controller/ct-zone.h| 89 ++ controller/ofctrl.c | 5 +- controller/ovn-controller.c | 451 +++ controller/ovn-controller.h | 21 +- controller/physical.c | 17 +- controller/physical.h | 2 +- controller/pinctrl.c| 2 +- lib/ovn-util.c | 17 + lib/ovn-util.h | 3 + northd/northd.c | 8 + ovn-nb.xml | 29 ++ tests/ovn-controller.at | 99 ++ tests/ovn.at| 4 +- 16 files changed, 917 insertions(+), 441 deletions(-) create mode 100644 controller/ct-zone.c create mode 100644 controller/ct-zone.h -- 2.45.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH] srv6: Fix misaligned writes to segment list.
On 5/20/24 03:54, Nobuhiro MIKI wrote: > On 2024/05/18 3:33, Ilya Maximets wrote: >> Segments list in SRv6 header is 16-bit aligned as most of other fields >> in packet headers. A little counter-intuitively, compilers are allowed >> to make alignment assumptions based on the pointer type passed to >> memcpy(), so they can use copy instructions that require 32-bit alignment >> in case of struct in6_addr pointer. Reported by UBsan in Clang 18: >> >> lib/netdev-native-tnl.c:985:16: runtime error: store to misaligned >>address 0x7fd9e97351ce for type 'struct in6_addr *', which >>requires 4 byte alignment >> 0x7fd9e97351ce: note: pointer points here >> 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >> ^ >>0 0xc1de38 in netdev_srv6_build_header lib/netdev-native-tnl.c:985:9 >>1 0x6e794b in tnl_port_build_header ofproto/tunnel.c:751:11 >>2 0x6c9c0a in native_tunnel_output ofproto/ofproto-dpif-xlate.c:3887:11 >>3 0x6c9c0a in compose_output_action__ ofproto/ofproto-dpif-xlate.c:4502:13 >>4 0x6b6646 in compose_output_action ofproto/ofproto-dpif-xlate.c:4564:5 >>5 0x6b6646 in xlate_output_action ofproto/ofproto-dpif-xlate.c:5517:13 >>6 0x68cfee in do_xlate_actions ofproto/ofproto-dpif-xlate.c:7288:13 >>7 0x67fed0 in xlate_actions ofproto/ofproto-dpif-xlate.c:8314:13 >>8 0x6468bd in ofproto_trace__ ofproto/ofproto-dpif-trace.c:782:30 >>9 0x64484a in ofproto_trace ofproto/ofproto-dpif-trace.c:851:5 >> 10 0x647469 in ofproto_unixctl_trace ofproto/ofproto-dpif-trace.c:490:9 >> 11 0xc33771 in process_command lib/unixctl.c:310:13 >> 12 0xc33771 in run_connection lib/unixctl.c:344:17 >> 13 0xc33771 in unixctl_server_run lib/unixctl.c:395:21 >> 14 0x53e6ef in main vswitchd/ovs-vswitchd.c:131:9 >> 15 0x7f61c7 in __libc_start_call_main (/lib64/libc.so.6+0x2a1c7) >> 16 0x7f628a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a28a) >> 17 0x42ca24 in _start (vswitchd/ovs-vswitchd+0x42ca24) >> >> SUMMARY: UndefinedBehaviorSanitizer: >> undefined-behavior lib/netdev-native-tnl.c:985:16 >> >> Having misaligned pointers is also generally not allowed in C, let >> alone accessing memory through them. >> >> Fix that by using an appropriate ovs_16aligned_in6_addr pointer instead. >> >> Fixes: 7381fd440a88 ("odp: Add SRv6 tunnel actions.") >> Fixes: 03fc1ad78521 ("userspace: Add SRv6 tunnel support.") >> Signed-off-by: Ilya Maximets >> --- >> lib/netdev-native-tnl.c | 5 ++--- >> lib/odp-util.c | 4 ++-- >> 2 files changed, 4 insertions(+), 5 deletions(-) > > Hi Ilya, > > Thanks for the fix! > > Reviewed-by: Nobuhiro MIKI Thanks! Applied and backported down to 3.2. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH 2/2] atlocal: Replace deprecated pkg_resources.
On 5/21/24 08:32, Eelco Chaudron wrote: > > > On 17 May 2024, at 20:47, Ilya Maximets wrote: > >> 'pkg_resources' module is deprecated and no longer available in newer >> versions of python, so pytest tests are skipped: >> >> DeprecationWarning: pkg_resources is deprecated as an API. >> See https://setuptools.pypa.io/en/latest/pkg_resources.html >> >> Unfortunately, there is no direct replacement for it and functionality >> is scattered between different packages. >> >> Using a new standard library importlib.metadata to find installed >> packages and their versions. Using packaging.requirements to parse >> lines from the requirements file and compare versions. This covers >> all we need. >> >> The 'packaging' is a project used by pip and a dependency for many >> other libraries, so should be available for any supported verison of >> python. 'importlib' was introduced in python 3.8. Since we support >> older versions of python and 'packaging' is not part of the standard >> library, checking that import is possible and falling back to >> 'pkg_resources' if needed. We may remove the fallback when we stop >> supporting python below 3.8. >> >> Even though 'packaging' is a common dependency, added to the test >> requirements so it will not be missed in CI. >> >> Signed-off-by: Ilya Maximets > > Thanks for fixing this. The changes look good to me. > > Acked-by: Eelco Chaudron > Thanks! Applied and backported down to 3.0. Best regards, Ilya Maximets. ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] northd: Add bfd and bfd_consumer nodes to I-P engine.
On 5/16/24 15:08, Lorenzo Bianconi wrote: >> On Thu, May 2, 2024 at 8:22 AM Lorenzo Bianconi >> wrote: >>> >>> Introduce bfd and bfd_consumer nodes to northd I-P engine to track bfd >>> connections and northd static_route/policy_route changes. >>> >>> Reported-at: https://issues.redhat.com/browse/FDP-600 >>> Signed-off-by: Lorenzo Bianconi >> >> Hi Lorenzo, > > Hi Han, > > Thx for the review. > >> >> Thanks for the patch. Could you explain in more detail about the motivation >> and background information of this patch? I followed the link in >> Reported-at but still not clear. >> I assume it is for performance reasons, so could you explain what scenarios >> would benefit from this? > > The main motivation for this patch is to address a request from Dumitru [0] > for FDP-56 [1]. In particular, in the new ECMP_nexthop monitor [2], we need to > recalculate the static routes map, so he requested to compute this hashmap in > advance and reuse it in the lflow codebase and in the new ECMP_nexthop monitor > codebase. Since we will need even the BFD info there, I moved the code out of > lflow generation codebase and introduced two dedicated nodes (bfd and > bfd-consumer) in the Northd IP engine. > > @Dumitru: is my explanation correct? > I think that's accurate. >> >> For the patch, I haven't looked into details yet, but it seems you didn't >> add any change handler for the two new nodes. So will there be follow up >> patches for the real performance improvement? > > for BFD node the logic is already in en_bfd_run() since we do not always > return EN_UPDATED but if there are no valid changes, we will return > EN_UNCHANGED. Am I missing something? > For bfd-consumer node I think we actually do not need it. Am I wrong? > >> But even for the current patch, with this: engine_add_input(&en_bfd, >> &en_northd, NULL); does it mean any en_northd change would trigger a lflow >> recompute because of the dependency en_northd -> en_bfd -> en_lflow. Would >> this be a performance regression? If not, why? > > I guess no, otherwise the IP ovn tests will fail, right? > Like Lorenzo mentioned above, en_bfd will be marked as UPDATED only if the BFD state actually changed. Without this patch we were already triggering an lflow recompute: engine_add_input(&en_lflow, &en_sb_bfd, NULL); Which makes me wonder, Lorenzo, why do we still have this dependency? >> >> For the noop handler used, please provide a detailed explanation to justify >> it: why the node depends on the input but doesn't need to process when the >> input changes? +1, a detailed comment should be added, I agree with Han, it's not always obvious. > > I think for the bfd-consumer node we have just an order dependency with northd > one, right? I will review this part. > > Regards, > Lorenzo > >> >> Thanks, >> Han Regards, Dumitru > > [0] > https://patchwork.ozlabs.org/project/ovn/patch/f31fb6630039c9b310b7fa489247d354f8a46719.1710257650.git.lorenzo.bianc...@redhat.com/ > [1] https://issues.redhat.com/browse/FDP-56 > [2] > https://patchwork.ozlabs.org/project/ovn/cover/cover.1710257649.git.lorenzo.bianc...@redhat.com/ > >> >>> --- >>> northd/en-lflow.c| 19 +-- >>> northd/en-northd.c | 92 + >>> northd/en-northd.h | 8 ++ >>> northd/inc-proc-northd.c | 21 ++- >>> northd/northd.c | 274 --- >>> northd/northd.h | 48 ++- >>> 6 files changed, 344 insertions(+), 118 deletions(-) >>> >>> diff --git a/northd/en-lflow.c b/northd/en-lflow.c >>> index c4b927fb8..9efbd5117 100644 >>> --- a/northd/en-lflow.c >>> +++ b/northd/en-lflow.c >>> @@ -41,6 +41,9 @@ lflow_get_input_data(struct engine_node *node, >>> struct lflow_input *lflow_input) >>> { >>> struct northd_data *northd_data = engine_get_input_data("northd", >> node); >>> +struct bfd_data *bfd_data = engine_get_input_data("bfd", node); >>> +struct bfd_consumer_data *bfd_consumer_data = >>> +engine_get_input_data("bfd_consumer", node); >>> struct port_group_data *pg_data = >>> engine_get_input_data("port_group", node); >>> struct sync_meters_data *sync_meters_data = >>> @@ -78,7 +81,10 @@ lflow_get_input_data(struct engine_node *node, >>> lflow_input->meter_groups = &sync_meters_data->meter_groups; >>> lflow_input->lb_datapaths_map = &northd_data->lb_datapaths_map; >>> lflow_input->svc_monitor_map = &northd_data->svc_monitor_map; >>> -lflow_input->bfd_connections = NULL; >>> +lflow_input->bfd_connections = &bfd_data->bfd_connections; >>> +lflow_input->parsed_routes = &bfd_consumer_data->parsed_routes; >>> +lflow_input->route_tables = &bfd_consumer_data->route_tables; >>> +lflow_input->route_policies = &bfd_consumer_data->route_policies; >>> >>> struct ed_type_global_config *global_config = >>> engine_get_input_data("global_config", node); >>> @@ -95,25 +101,14 @@ void en_lflow_run(struct engine
Re: [ovs-dev] [v3] odp-execute: Fix AVX checksum calculation.
> On 21 May 2024, at 16:13, Emma Finn wrote: > > The AVX implementation for calcualting checksums was not > > handling carry-over addition correctly in some cases. > > This patch adds an additional shuffle to add 16-bit padding > > to the final part of the calculation to handle such cases. > > This commit also adds a unit test to fuzz test the actions > > autovalidator. > > > > Signed-off-by: Emma Finn > > Reported-by: Eelco Chaudron > > +AT_SETUP([datapath - Actions Autovalidator Fuzzy]) > > +AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], []) > > +AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 1 fuzzy > packets]) > > Hi Emma, Hi Eelco, Ilya & Emma (&OVS community) As you know checksums are a very "input sensitive" workload; its kind of why they're useful (hash ~uniquely represents inputset ideally) Internally, computing a hash always involves carry-bits and (expected/good) overflows, however this patch is fixing an unintended/bad overflow. Testing these corner-cases & boundary conditions is hard; for me the only way to be confident in its correctness is autovaldation and fuzzing. > As Ilya explained in the v2 reply, we do not like to have fuzzy tests in the > test suite as they are hard to reproduce. (link https://patchwork.ozlabs.org/project/openvswitch/patch/20240514134815.2576245-1-emma.f...@intel.com/#3313143) I understand that any randomness in CI is not desired, and understand that random tests can be hard to reproduce. The autovalidator tries to mitigate the "reproduce" issue by dumping the packet contents on failure, providing the erroring case to easily allow reproduction. This test *should* be 100% reliably passing, so no false-positives are expected. If not, we should fix the test to be 100% reliable. > So my request is to send out a v4 with a test modeled around the 'userspace > offload - ip csum offload' test case we have in tests/dpif-netdev.at, as per > Ilya’s request. I'll suggest 3 paths forward, and give my pro/cons list for each; whatever option is preferred by OVS community we can send as a v4; 1) Accept "SIMD code fix" part of patch only (no unit test, just fixes issue with local fuzz-testing during development) 2) Build "specific-to-this-issue unit-test" only (regression tests against this specific issue, but nothing else; low value in my opinion, as it leaves a large input space untested) 3) Accept "v3 patch as is, with fuzzing" (best test-coverage, confidence in current & future code changes. CI runtime on fuzz-tests, and if failures occur, they must be flagged & fixed, but are *actual real issues* that should be fixed.) I think 2) is what is being asked above, but wanted to make sure the tradeoffs are understood. As above, please indicate which is the preferred approach and a v4 will be on the way. > Thanks, > > Eelco Thanks & Regards, -Harry ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Question about dp_netdev_flow ref
Hi, all I have a question regarding the usage of the reference count in the dp_netdev_flow structure. When the revalidate thread ages out and deletes dpflow A, the reference count of dpflow A is decremented, initiating the RCU release process. Now, if another thread concurrently performs a table lookup and hits dpflow A, is it problematic if that thread continues to use dpflow A? Moreover, I noticed the following note in the dp_netdev_flow structure: * A flow 'flow' may be accessed without a risk of being freed during an RCU * grace period. Code that needs to hold onto a flow for a while * should try incrementing 'flow->ref_cnt' with dp_netdev_flow_ref(). In fast_path_processing, we acquire dpflow as follows: c复制代码 flow = dp_netdev_flow_cast(rules[i]); However, dp_netdev_flow_ref() is not called. Why is that? Is this a potential issue? BR ke.liu ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] Question about dp_netdev_flow ref
Hi, all I have a question regarding the usage of the reference count in the dp_netdev_flow structure. When the revalidate thread ages out and deletes dpflow A, the reference count of dpflow A is decremented, initiating the RCU release process. Now, if another thread concurrently performs a table lookup and hits dpflow A, is it problematic if that thread continues to use dpflow A? Moreover, I noticed the following note in the dp_netdev_flow structure: * A flow 'flow' may be accessed without a risk of being freed during an RCU * grace period. Code that needs to hold onto a flow for a while * should try incrementing 'flow->ref_cnt' with dp_netdev_flow_ref(). In fast_path_processing, we acquire dpflow as follows: c复制代码 flow = dp_netdev_flow_cast(rules[i]); However, dp_netdev_flow_ref() is not called. Why is that? Is this a potential issue? BR ke.liu ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH 1/1] ofproto-dpif-upcall: Avoid stale ukeys leaks.
It is observed in some environments that there are much more ukeys than actual DP flows. For example: $ ovs-appctl upcall/show system@ovs-system: flows : (current 7) (avg 6) (max 117) (limit 2125) offloaded flows : 525 dump duration : 1063ms ufid enabled : true 23: (keys 3612) 24: (keys 3625) 25: (keys 3485) The revalidator threads are busy revalidating the stale ukeys leading to high CPU and long dump duration. There are some possible situations that may result in stale ukeys that have no corresponding DP flows. In revalidator, push_dp_ops() doesn't check error if the op type is not DEL. It is possible that a PUT(MODIFY) fails, especially for tc offload case, where the old flow is deleted first and then the new one is created. If the creation fails, the ukey will be stale (no corresponding DP flow). This patch adds a warning in such case. Another possible scenario is in handle_upcalls() if a PUT operation did not succeed and op->error attribute was not set correctly it can lead to stale ukey in operational state. This patch adds checks in the sweep phase for such ukeys and move them to DELETE so that they can be cleared eventually. Co-authored-by: Han Zhou Signed-off-by: Han Zhou Signed-off-by: Roi Dayan --- ofproto/ofproto-dpif-upcall.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 83609ec62b63..e9520ebdf910 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -57,6 +57,7 @@ COVERAGE_DEFINE(dumped_inconsistent_flow); COVERAGE_DEFINE(dumped_new_flow); COVERAGE_DEFINE(handler_duplicate_upcall); COVERAGE_DEFINE(revalidate_missed_dp_flow); +COVERAGE_DEFINE(revalidate_missed_dp_flow_del); COVERAGE_DEFINE(ukey_dp_change); COVERAGE_DEFINE(ukey_invalid_stat_reset); COVERAGE_DEFINE(ukey_replace_contention); @@ -278,6 +279,7 @@ enum flow_del_reason { FDR_BAD_ODP_FIT,/* Bad ODP flow fit. */ FDR_FLOW_IDLE, /* Flow idle timeout. */ FDR_FLOW_LIMIT, /* Kill all flows condition reached. */ +FDR_FLOW_STALE, /* Flow stale detected. */ FDR_FLOW_WILDCARDED,/* Flow needs a narrower wildcard mask. */ FDR_NO_OFPROTO, /* Bridge not found. */ FDR_PURGE, /* User requested flow deletion. */ @@ -2557,6 +2559,10 @@ push_dp_ops(struct udpif *udpif, struct ukey_op *ops, size_t n_ops) if (op->dop.type != DPIF_OP_FLOW_DEL) { /* Only deleted flows need their stats pushed. */ +if (op->dop.error) { +VLOG_WARN_RL(&rl, "push_dp_ops: error %d in op type %d, ukey " + "%p", op->dop.error, op->dop.type, op->ukey); +} continue; } @@ -3027,6 +3033,15 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) del_reason = purge ? FDR_PURGE : FDR_UPDATE_FAIL; } else if (!seq_mismatch) { result = UKEY_KEEP; +} else if (!ukey->stats.used && + udpif_flow_time_delta(udpif, ukey) * 1000 > + ofproto_max_idle) { +COVERAGE_INC(revalidate_missed_dp_flow_del); +VLOG_WARN_RL(&rl, "revalidator_sweep__: Remove stale ukey " + "%p delta %llus", ukey, + udpif_flow_time_delta(udpif, ukey)); +result = UKEY_DELETE; +del_reason = FDR_FLOW_STALE; } else { struct dpif_flow_stats stats; COVERAGE_INC(revalidate_missed_dp_flow); -- 2.21.0 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Re: [ovs-dev] [PATCH ovn] Do not reply on unicast arps for targets.
On Thu, May 23, 2024 at 1:51 PM Vasyl Saienko wrote: > > Thanks for feedback, > > I'm not an ipv6 expert, I did some research. There is no ARP in ipv6, it's replaced with Neighbor Discovery. The ARP resolution is done via Neighbor Solicitation and Neighbour Advertisement messages. > Neighbor Solicitation is an analog of ARP request and is sent to either > > solicited-node multicast address 33:33:ff:XX:XX:XX where XX:XX:XX is the last 3 bytes of the target ipv6 address > or target node address > > So in our case we can update the flows to match only the solicited-node multicast address 33:33:ff:XX:XX:XX as follows. From > > table=??(ls_in_arp_rsp ), priority=100 , match=(nd_ns && ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10 && inport == "S1-vm1"), action=(next;) > table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns && ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na { eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };) > > to > > table=??(ls_in_arp_rsp ), priority=100 , match=(nd_ns && eth.dst == {33:33:ff:00:00:10, 33:33:ff:ff:00:10} && ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10 && inport == "S1-vm1"), action=(next;) > table=??(ls_in_arp_rsp ), priority=50 , match=(nd_ns && eth.dst == {33:33:ff:00:00:10, 33:33:ff:ff:00:10} && ip6.dst == {fd00::10, ff02::1:ff00:10} && nd.target == fd00::10), action=(nd_na { eth.src = 50:54:00:00:00:10; ip6.src = fd00::10; nd.target = fd00::10; nd.tll = 50:54:00:00:00:10; outport = inport; flags.loopback = 1; output; };) > > Will this be okay? Maybe we can do this in a separate patch? > I am not an IPv6 expert either. I even forgot that IPv6 NS uses multicast instead of broadcast MAC address, and was assuming we could simply append the eth.dst == ff:ff:ff:ff:ff:ff to the IPv6 NS responder flows, which is obviously wrong. Please ignore my feedback for the current patch and we can address IPv6 in a separate patch if it is required. Thanks, Han > On Wed, May 22, 2024 at 12:16 AM Han Zhou wrote: >> >> >> >> On Tue, May 21, 2024 at 9:56 PM Numan Siddique wrote: >> > >> > On Mon, May 20, 2024 at 1:47 AM Vasyl Saienko wrote: >> > > >> > > Thanks for feedback Numan. >> > > >> > > On Wed, May 15, 2024 at 12:04 AM Numan Siddique wrote: >> > > >> > > > On Mon, May 13, 2024 at 9:00 AM Dumitru Ceara wrote: >> > > > > >> > > > > On 5/4/24 11:38, Vasyl Saienko wrote: >> > > > > > Hello Numan >> > > > > > >> > > > > > Thanks for review and feedback. >> > > > > > >> > > > > > On Fri, May 3, 2024 at 7:14 PM Numan Siddique wrote: >> > > > > > >> > > > > >> On Mon, Apr 22, 2024 at 2:54 AM Vasyl Saienko < vsaie...@mirantis.com> >> > > > > >> wrote: >> > > > > >>> >> > > > > >>> Reply only if target ethernet address is broadcast, if >> > > > > >>> address is specified explicitly do noting to let target >> > > > > >>> reply by itself. This technique allows to monitor target >> > > > > >>> aliveness with arping. >> > > > > >> >> > > > > >> Thanks for the patch. >> > > > > >> >> > > > > >> This patch does change the behavior of OVN. Having ARP responder >> > > > > >> flows avoids tunnelling the request packet if the destination is on a >> > > > > >> different compute node. >> > > > > >> But I don't think its a big deal. >> > > > > >> >> > > > > >> You are totally correct that the ARP responder allows limiting ARP >> > > > > > broadcast traffic between nodes. Normally ARP requests are sent to >> > > > > > broadcast ff:ff:ff:ff:ff:ff ethernet, as the protocol is designed to >> > > > > > identify ethernet addresses for known IP addresses. In this case since >> > > > > > traffic is broadcast it will be flooded among all nodes if ARP >> > > > responder is >> > > > > > not applied. At the same time the client may specify the target >> > > > > > ethernet address as unicast (in this case a broadcast storm will not >> > > > > > happen, as the destination ethernet address is unicast). >> > > > > > >> > > > > > In OpenStack we use unicast ARP requests as a way to monitor VM >> > > > aliveness >> > > > > > from remote locations to make sure there are no issues with >> > > > > > flows/underlying networking infra between nodes. We use ARPs due to >> > > > several >> > > > > > reasons: >> > > > > > 1. This protocol is mandatory and can't be disabled on the target >> > > > machine >> > > > > > (which guarantees that the target VM will always reply to ARPs if it is >> > > > > > alive) >> > > > > > 2. This protocol is not filtered by firewalls/security groups as it is >> > > > > > mandatory for network workability >> > > > > > 3. We can't use upper layers protocols such as ICMP as they will >> > > > require >> > > > > > specific firewall rules inside VMs, and we do not control VMs in cloud >> > > > > > cases, but still need to monitor infrastructure aliveness. >> > > > > > >> > > > > > If ARP responder replies to
[ovs-dev] [BUG] [ovs-2.17.1] [ovs-vswitchd] Miss setting port number in openflow while delete port and add port
Hi all, I found a bug in OVS-DPDK(ovs-2.17.1, dpdk-21.11). Which is Miss setting port number in openflow while delete port and add port. 1. The process of problem is: ``` add bridge; add port; add openflow, use port as in_port and output; delete port; openflow is error in port filed; add port; openflow is error in port filed; ``` 2. How to produce: ``` ### add bridge and port [root@bogon ~]# ovs-vsctl show 9f76671a-87a2-46f0-92a3-6c83f7b0ab86 Bridge br-int datapath_type: netdev Port pf0hpf Interface pf0hpf type: dpdk options: {dpdk-devargs=":01:00.0,representor=[65535]"} Port bond0 Interface bond0 type: dpdk options: {dpdk-devargs=net_bonding-bond0} Port br-int Interface br-int type: internal ovs_version: "2.17.2" ### add openflow [root@bogon ~]# ovs-ofctl dump-flows br-int cookie=0x0, duration=6.338s, table=0, n_packets=0, n_bytes=0, in_port=pf0hpf actions=output:bond0 cookie=0x0, duration=5.765s, table=0, n_packets=6, n_bytes=744, in_port=bond0 actions=output:pf0hpf ### delete port bond ovs-vsctl del-port br-int bond0 [root@bogon ~]# ovs-ofctl dump-flows br-int cookie=0x0, duration=28.889s, table=0, n_packets=0, n_bytes=0, in_port=pf0hpf actions=output:2 cookie=0x0, duration=28.316s, table=0, n_packets=25, n_bytes=3355, in_port=2 actions=output:pf0hpf [root@bogon ~]# ovs-ofctl show br-int OFPT_FEATURES_REPLY (xid=0x2): dpid:2aa3ffcabe42 n_tables:254, n_buffers:0 capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst 1(pf0hpf): addr:02:0a:35:4c:1b:2c config: 0 state: 0 speed: 0 Mbps now, 0 Mbps max LOCAL(br-int): addr:2a:a3:ff:ca:be:42 config: PORT_DOWN state: LINK_DOWN current:10MB-FD COPPER speed: 10 Mbps now, 0 Mbps max OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0 BUG: In openflow, the "output:bond0" change to "output:2". Why not delete the openflow related of bond0? ### add bond0 back ovs-vsctl add-port br-int bond0 -- set interface bond0 type=dpdk options:dpdk-devargs=net_bonding-bond0 [root@bogon ~]# ovs-ofctl dump-flows br-int cookie=0x0, duration=275.262s, table=0, n_packets=0, n_bytes=0, in_port=pf0hpf actions=output:2 cookie=0x0, duration=274.689s, table=0, n_packets=25, n_bytes=3355, in_port=2 actions=output:pf0hpf (BUG: output:2 should change back to output:bond0, why NOT?) [root@bogon ~]# ovs-ofctl show br-int OFPT_FEATURES_REPLY (xid=0x2): dpid:2aa3ffcabe42 n_tables:254, n_buffers:0 capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst 1(pf0hpf): addr:02:0a:35:4c:1b:2c config: 0 state: 0 speed: 0 Mbps now, 0 Mbps max 3(bond0): addr:a6:29:0d:b8:5b:3a config: 0 state: 0 current:AUTO_NEG speed: 0 Mbps now, 0 Mbps max LOCAL(br-int): addr:2a:a3:ff:ca:be:42 config: PORT_DOWN state: LINK_DOWN current:10MB-FD COPPER speed: 10 Mbps now, 0 Mbps max OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0 (BUG: Why bond0 NOT use 2, but use 3) ### for "delete pf0hpf ", same result. [root@bogon ~]# ovs-vsctl del-port br-int pf0hpf [root@bogon ~]# ovs-ofctl dump-flows br-int cookie=0x0, duration=404.519s, table=0, n_packets=0, n_bytes=0, in_port=1 actions=output:2 cookie=0x0, duration=403.946s, table=0, n_packets=25, n_bytes=3355, in_port=2 actions=output:1 ovs-vsctl add-port br-int pf0hpf -- set interface pf0hpf type=dpdk options:dpdk-devargs=:01:00.0,representor=[65535] [root@bogon ~]# ovs-ofctl dump-flows br-int cookie=0x0, duration=510.358s, table=0, n_packets=0, n_bytes=0, in_port=1 actions=output:2 cookie=0x0, duration=509.785s, table=0, n_packets=25, n_bytes=3355, in_port=2 actions=output:1 [root@bogon ~]# ovs-ofctl show br-int OFPT_FEATURES_REPLY (xid=0x2): dpid:2aa3ffcabe42 n_tables:254, n_buffers:0 capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst 3(bond0): addr:a6:29:0d:b8:5b:3a config: 0 state: 0 current:AUTO_NEG speed: 0 Mbps now, 0 Mbps max 4(pf0hpf): addr:02:0a:35:4c:1b:2c config: 0 state: 0 speed: 0 Mbps now, 0 Mbps max LOCAL(br-int): addr:2a:a3:ff:ca:be:42 config: PORT_DOWN state: LINK_DOWN current:10MB-FD COPPER speed: 10 Mbps now, 0 Mbps max OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0 ``` 3. What's effect of this BUG ``` Because of t
Re: [ovs-dev] [v3] odp-execute: Fix AVX checksum calculation.
On 21 May 2024, at 16:13, Emma Finn wrote: > The AVX implementation for calcualting checksums was not > handling carry-over addition correctly in some cases. > This patch adds an additional shuffle to add 16-bit padding > to the final part of the calculation to handle such cases. > This commit also adds a unit test to fuzz test the actions > autovalidator. > > Signed-off-by: Emma Finn > Reported-by: Eelco Chaudron > --- > lib/odp-execute-avx512.c | 5 + > tests/dpif-netdev.at | 26 ++ > 2 files changed, 31 insertions(+) > > diff --git a/lib/odp-execute-avx512.c b/lib/odp-execute-avx512.c > index 50c48bfd4..a74a85dc1 100644 > --- a/lib/odp-execute-avx512.c > +++ b/lib/odp-execute-avx512.c > @@ -366,6 +366,8 @@ avx512_get_delta(__m256i old_header, __m256i new_header) >0xF, 0xF, 0xF, 0xF); > v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta); > > +v_delta = _mm256_hadd_epi32(v_delta, v_zeros); > +v_delta = _mm256_shuffle_epi8(v_delta, v_swap16a); > v_delta = _mm256_hadd_epi32(v_delta, v_zeros); > v_delta = _mm256_hadd_epi16(v_delta, v_zeros); > > @@ -575,6 +577,9 @@ avx512_ipv6_sum_header(__m512i ip6_header) >0xF, 0xF, 0xF, 0xF); > > v_delta = _mm256_permutexvar_epi32(v_swap32a, v_delta); > + > +v_delta = _mm256_hadd_epi32(v_delta, v_zeros); > +v_delta = _mm256_shuffle_epi8(v_delta, v_swap16a); > v_delta = _mm256_hadd_epi32(v_delta, v_zeros); > v_delta = _mm256_hadd_epi16(v_delta, v_zeros); > > diff --git a/tests/dpif-netdev.at b/tests/dpif-netdev.at > index 790b5a43a..48cb900ad 100644 > --- a/tests/dpif-netdev.at > +++ b/tests/dpif-netdev.at > @@ -1091,3 +1091,29 @@ OVS_VSWITCHD_STOP(["dnl > /Error: unknown miniflow extract implementation superstudy./d > /Error: invalid study_pkt_cnt value: -pmd./d"]) > AT_CLEANUP > + > +AT_SETUP([datapath - Actions Autovalidator Fuzzy]) > +AT_SKIP_IF([! $PYTHON3 -c "import scapy"], [], []) > +AT_SKIP_IF([! $PYTHON3 $srcdir/genpkts.py 1 fuzzy > packets]) Hi Emma, As Ilya explained in the v2 reply, we do not like to have fuzzy tests in the test suite as they are hard to reproduce. So my request is to send out a v4 with a test modeled around the 'userspace offload - ip csum offload' test case we have in tests/dpif-netdev.at, as per Ilya’s request. Thanks, Eelco > + > +OVS_VSWITCHD_START(add-port br0 p0 -- set Interface p0 type=dummy \ > + -- add-port br0 p1 -- set Interface p1 type=dummy) > + > +AT_CHECK([ovs-appctl odp-execute/action-impl-set autovalidator], [0], [dnl > +Action implementation set to autovalidator. > +]) > + > +AT_DATA([flows.txt], [dnl > + in_port=p0,ip,actions=mod_nw_src=10.1.1.1,p1 > + in_port=p0,ipv6,actions=set_field:fc00::100->ipv6_src,p1 > +]) > + > +AT_CHECK([ovs-ofctl del-flows br0]) > +AT_CHECK([ovs-ofctl -Oopenflow13 add-flows br0 flows.txt]) > + > +cat packets | while read line; do > + AT_CHECK([ovs-appctl netdev-dummy/receive p0 $line], [0], [ignore]) > +done > + > +OVS_VSWITCHD_STOP > +AT_CLEANUP > -- > 2.34.1 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH] dpif-netdev: Resolved flow table reference issue.
From: Shaohua Wu description: The statistics for the cached DP flow table are updated in packet_batch_per_flow_execute. There might be a timing gap where the dp flow in the batch reaches its aging time and is prematurely aged. If actions on the dp flows are executed at this time,and the memory for the DP flow has been released, it can lead to a crash. 0 raise () from /lib64/libc.so.6 1 abort () from /lib64/libc.so.6 2 odp_execute_actions at lib/odp-execute.c:1166 3 dp_netdev_execute_actions at lib/dpif-netdev.c:11339 4 packet_batch_per_flow_execute at lib/dpif-netdev.c:8537 5 dp_netdev_input__ at lib/dpif-netdev.c:10722 6 dp_netdev_input at lib/dpif-netdev.c:10731 7 dp_netdev_process_rxq_port at lib/dpif-netdev.c:6332 8 dpif_netdev_run at lib/dpif-netdev.c:7343 9 dpif_run at lib/dpif.c:479 10 type_run at ofproto/ofproto-dpif.c:370 11 ofproto_type_run at ofproto/ofproto.c:1789 12 bridge_run__ at vswitchd/bridge.c:3245 13 bridge_run at vswitchd/bridge.c:3310 14 main at vswitchd/ovs-vswitchd.c:127 (gdb) f 4 (gdb) p flow->ref_cnt $4 = {count = 0} (gdb) p flow->dead $5 = true Signed-off-by: Shaohua Wu --- lib/dpif-netdev.c | 23 ++- 1 file changed, 14 insertions(+), 9 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index c7f9e1490..8f96efde0 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -8255,12 +8255,13 @@ static inline void packet_batch_per_flow_init(struct packet_batch_per_flow *batch, struct dp_netdev_flow *flow) { -flow->batch = batch; - -batch->flow = flow; -dp_packet_batch_init(&batch->array); -batch->byte_count = 0; -batch->tcp_flags = 0; +if (dp_netdev_flow_ref(flow)) { +flow->batch = batch; +batch->flow = flow; +dp_packet_batch_init(&batch->array); +batch->byte_count = 0; +batch->tcp_flags = 0; +} } static inline void @@ -8322,9 +8323,11 @@ packet_enqueue_to_flow_map(struct dp_packet *packet, size_t index) { struct dp_packet_flow_map *map = &flow_map[index]; -map->flow = flow; -map->packet = packet; -map->tcp_flags = tcp_flags; +if (dp_netdev_flow_ref(flow)) { +map->flow = flow; +map->packet = packet; +map->tcp_flags = tcp_flags; +} } /* SMC lookup function for a batch of packets. @@ -,6 +8891,7 @@ dp_netdev_input__(struct dp_netdev_pmd_thread *pmd, } dp_netdev_queue_batches(map->packet, map->flow, map->tcp_flags, batches, &n_batches); +dp_netdev_flow_unref(map->flow); } /* All the flow batches need to be reset before any call to @@ -8905,6 +8909,7 @@ dp_netdev_input__(struct dp_netdev_pmd_thread *pmd, for (i = 0; i < n_batches; i++) { packet_batch_per_flow_execute(&batches[i], pmd); +dp_netdev_flow_unref(batches[i].flow); } } -- 2.30.2 ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev
[ovs-dev] [PATCH]ipf: Fix ovs ipf crash.
Description: when 1) The fragment timeout is between 15 seconds and 25 seconds; 2) The ipf_list currently has received more than 32 fragments, and there are other fragments of same big packet that have not been received. When the above two scenario conditions are met, due to exceeding the capacity of the packet batch(here is 32), ipf_dp_packet_batch_add returns false, and ipf_list will not be cleared. However, the 32 fragments packets added to the packet batch will be processed normally. When receiving the subsequent fragments of the ipf_list, because the first 32 fragments have been processed, when processing subsequent fragment packets, relevant information about the processed fragment packets will be read,therefore will occur carsh. One solution is do not forward timeout fragment packets from the above scenarios, that is, do not add them to the packet batch, and handle other scenarios according to the original logic. Signed-off-by: laixiangwu <15310488...@163.com> --- lib/ipf.c | 10 -- 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/lib/ipf.c b/lib/ipf.c index d45266374..9258173ab 100644 --- a/lib/ipf.c +++ b/lib/ipf.c @@ -1011,7 +1011,7 @@ ipf_purge_list_check(struct ipf *ipf, struct ipf_list *ipf_list, } /* Does the packet batch management and common accounting work associated - * with 'ipf_send_completed_frags()' and 'ipf_send_expired_frags()'. */ + * with 'ipf_send_completed_frags()'. */ static bool ipf_send_frags_in_list(struct ipf *ipf, struct ipf_list *ipf_list, struct dp_packet_batch *pb, @@ -1076,8 +1076,7 @@ ipf_send_completed_frags(struct ipf *ipf, struct dp_packet_batch *pb, * a packet batch to be processed by the calling application, typically * conntrack. Also cleans up the list context when it is empty.*/ static void -ipf_send_expired_frags(struct ipf *ipf, struct dp_packet_batch *pb, - long long now, bool v6) +ipf_clean_expired_frags(struct ipf *ipf, long long now) { enum { /* Very conservative, due to DOS probability. */ @@ -1099,8 +1098,7 @@ ipf_send_expired_frags(struct ipf *ipf, struct dp_packet_batch *pb, break; } -if (ipf_send_frags_in_list(ipf, ipf_list, pb, IPF_FRAG_EXPIRY_LIST, - v6, now)) { +if (ipf_purge_list_check(ipf, ipf_list, now)) { ipf_expiry_list_clean(&ipf->frag_lists, ipf_list); lists_removed++; } else { @@ -1249,7 +1247,7 @@ ipf_postprocess_conntrack(struct ipf *ipf, struct dp_packet_batch *pb, bool v6 = dl_type == htons(ETH_TYPE_IPV6); ipf_post_execute_reass_pkts(ipf, pb, v6); ipf_send_completed_frags(ipf, pb, now, v6); -ipf_send_expired_frags(ipf, pb, now, v6); +ipf_clean_expired_frags(ipf, now); } } -- ___ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev