Ilya Maximets <[email protected]> writes:

> On 1/15/26 10:19 AM, Eelco Chaudron wrote:
>> This patch series introduces a major architectural
>> refactoring of Open vSwitch's hardware offload
>> infrastructure. It replaces the tightly coupled
>> `netdev-offload` implementation with a new, modular
>> `dpif-offload-provider` framework.
>> 
>> MOTIVATION
>> -------------------------------------------------------------
>> The existing `netdev-offload` API tightly couples datapath
>> implementations (like `dpif-netdev`) with specific offload
>> technologies (DPDK's rte_flow). This design has several
>> limitations:
>> 
>> - Rigid Architecture: It creates complex dependencies,
>>   making the code difficult to maintain and extend.
>> 
>> - Limited Flexibility: Supporting multiple offload backends
>>   simultaneously or adding new ones is cumbersome.
>> 
>> - Inconsistent APIs: The logic for handling different
>>   offload types is scattered, leading to an inconsistent
>>   and hard-to-follow API surface.
>> 
>> This refactoring aims to resolve these issues by creating a
>> clean separation of concerns, improving modularity, and
>> establishing a clear path for future hardware offload
>> integrations.
>> 
>> PROPOSED SOLUTION: THE `DPIF-OFFLOAD-PROVIDER` FRAMEWORK
>> -------------------------------------------------------------
>> This series introduces the `dpif-offload-provider`
>> framework, which functions similarly to the existing
>> `dpif-provider` pattern. It treats hardware offload as a
>> distinct layer with multiple, dynamically selectable
>> backends.
>> 
>> Key features of the new framework include:
>> 
>> 1. Modular Architecture: A clean separation between the
>>    generic datapath interface and specific offload
>>    provider implementations (e.g., `dpif-offload-tc`,
>>    `dpif-offload-dpdk`). `dpif` layers are now generic
>>    clients of the offload API.
>> 
>> 2. Provider-based System: Allows multiple offload backends
>>    to coexist.
>> 
>> 3. Unified and Asynchronous API: Establishes a consistent
>>    API across all offload providers. For userspace
>>    datapaths, the API is extended to support asynchronous
>>    flow operations with callbacks, making `dpif-netdev` a
>>    more efficient client.
>> 
>> 4. Enhanced Configuration: Provides granular control over
>>    offload provider selection through a global and per-port
>>    priority system (`hw-offload-priority`), allowing
>>    fine-tuned policies for different hardware.
>> 
>> 5. Improved Testing: Includes a new test framework
>>    specifically for validating DPDKs rte_flow offloads,
>>    enhancing long-term maintainability.
>> 
>> PATCH SERIES ORGANIZATION
>> -------------------------------------------------------------
>> This large series is organized logically to facilitate
>> review:
>> 
>> 1. Framework Foundation: The initial patches establish the
>>    core `dpif-offload-provider` framework, including the
>>    necessary APIs for port management, flow mark
>>    allocation, configuration, and a dummy provider for
>>    testing.
>> 
>> 2. Provider Implementation: These patches introduce the new
>>    `dpif-offload-tc` and `dpif-offload-dpdk` providers,
>>    building out their specific implementations on top of
>>    the new framework.
>> 
>> 3. API Migration and Decoupling: The bulk of the series
>>    systematically migrates functionality from the legacy
>>    `netdev-offload` layer to the new providers. Key
>>    commits here decouple `dpif-netdev` and, crucially,
>>    `dpif-netlink` from their hardware offload
>>    entanglements.
>> 
>> 4. Cleanup: The final patches remove the now-redundant
>>    global APIs and structures from `netdev-offload`,
>>    completing the transition.
>> 
>> BACKWARD COMPATIBILITY
>> -------------------------------------------------------------
>> This refactoring maintains full API compatibility from a
>> user's perspective. All existing `ovs-vsctl` and
>> `ovs-appctl` commands continue to function as before. The
>> changes are primarily internal architectural improvements
>> designed to make OVS more robust and extensible.
>> 
>> REQUEST FOR COMMENTS
>> -------------------------------------------------------------
>> This is a significant architectural change that affects
>> core OVS infrastructure. We welcome feedback on:
>> 
>> - The overall architectural approach and the
>>   `dpif-offload-provider` concept.
>> - The API design, particularly the new asynchronous model
>>   for `dpif-netdev`.
>> - The migration strategy and any potential backward
>>   compatibility concerns.
>> - Performance implications of the new framework.
>> 
>> -------------------------------------------------------------
>> Changes from v1:
>>   - Fixed issues reported by Aaron and my AI experiments.
>>   - See individual patches for specific changes.
>> 
>> Changes from v2:
>>   - See individual patches for specific changes.
>> 
>> Changes from v3:
>>   - In patch 7, removed leftover netdev_close(port->netdev)
>>     causing netdev reference count issues.
>>   - Changed naming for dpif_offload_impl_type enum entries.
>>   - Merged previous patch 36, 'dpif-netdev: Add full name
>>     to the dp_netdev structure', with the next patch.
>> 
>> Changes from v4:
>>   - Applied review comments from Aaron, see individual patches.
>>   - Added Eli's ACK to the individual patches.
>> 
>> Changes from v5:
>>   - Addressed Ilya's comments, see upstream discussion for
>>     details.
>> 
>> Eelco Chaudron (40):
>>   dpif-offload-provider: Add dpif-offload-provider implementation.
>>   dpif-offload: Add provider for tc offload.
>>   dpif-offload: Add provider for dpdk (rte_flow).
>>   dpif-offload: Allow configuration of offload provider priority.
>>   dpif-offload: Move hw-offload configuration to dpif-offload.
>>   dpif-offload: Add offload provider set_config API.
>>   dpif-offload: Add port registration and management APIs.
>>   dpif-offload-tc: Add port management framework.
>>   dpif-offload-dpdk: Add port management framework.
>>   dpif-offload: Validate mandatory port class callbacks on registration.
>>   dpif-offload: Allow per-port offload provider priority config.
>>   dpif-offload: Introduce provider debug information API.
>>   dpif-offload: Call flow-flush netdev-offload APIs via dpif-offload.
>>   dpif-offload: Call meter netdev-offload APIs via dpif-offload.
>>   dpif-offload: Move the flow_get_n_flows() netdev-offload API to dpif.
>>   dpif-offload: Move hw_post_process netdev API to dpif.
>>   dpif-offload: Add flow dump APIs to dpif-offload.
>>   dpif-offload: Move the tc flow dump netdev APIs to dpif-offload.
>>   dpif-netlink: Remove netlink-offload integration.
>>   dpif-netlink: Add API to get offloaded netdev from port_id.
>>   dpif-offload: Add API to find offload implementation type.
>>   dpif-offload: Add operate implementation to dpif-offload.
>>   netdev-offload: Temporarily move thread-related APIs to dpif-netdev.
>>   dpif-offload: Add port dump APIs to dpif-offload.
>>   dpif-netdev: Remove indirect DPDK netdev offload API calls.
>>   dpif: Add dpif_get_features() API.
>>   dpif-offload: Add flow operations to dpif-offload-tc.
>>   dpif-netlink: Remove entangled hardware offload.
>>   dpif-offload-tc: Remove netdev-offload dependency.
>>   netdev_dummy: Remove hardware offload override.
>>   dpif-offload: Move the netdev_any_oor() API to dpif-offload.
>>   netdev-offload: Remove the global netdev-offload API.
>>   dpif-offload: Add inline flow APIs for userspace datapaths.
>>   dpif_netdev: Fix nullable memcpy in queue_netdev_flow_put().
>>   dpif-offload: Move offload_stats_get() API to dpif-offload.
>>   dpif-offload-dpdk: Abstract rte_flow implementation from dpif-netdev.
>>   dpif-offload-dummy: Add flow add/del/get APIs.
>>   netdev-offload: Fold netdev-offload APIs and files into dpif-offload.
>>   tests: Fix NSH decap header test for real Ethernet devices.
>>   tests: Add a simple DPDK rte_flow test framework.
>> 
>>  Documentation/topics/testing.rst              |   19 +
>>  include/openvswitch/json.h                    |    1 +
>>  include/openvswitch/netdev.h                  |    1 +
>>  include/openvswitch/thread.h                  |    6 +
>>  lib/automake.mk                               |   17 +-
>>  lib/dp-packet.h                               |    1 +
>>  lib/dpctl.c                                   |   50 +-
>>  lib/dpdk.c                                    |    2 -
>>  lib/dpif-netdev-avx512.c                      |    4 +-
>>  lib/dpif-netdev-private-flow.h                |    9 +-
>>  lib/dpif-netdev.c                             | 1243 +++---------
>>  lib/dpif-netlink.c                            |  557 +-----
>>  ...load-dpdk.c => dpif-offload-dpdk-netdev.c} |  592 ++++--
>>  lib/dpif-offload-dpdk-private.h               |   73 +
>>  lib/dpif-offload-dpdk.c                       | 1114 +++++++++++
>>  lib/dpif-offload-dummy.c                      |  890 +++++++++
>>  lib/dpif-offload-provider.h                   |  412 ++++
>>  ...-offload-tc.c => dpif-offload-tc-netdev.c} |  229 ++-
>>  lib/dpif-offload-tc-private.h                 |   73 +
>>  lib/dpif-offload-tc.c                         |  810 ++++++++
>>  lib/dpif-offload.c                            | 1773 +++++++++++++++++
>>  lib/dpif-offload.h                            |  227 +++
>>  lib/dpif-provider.h                           |   66 +-
>>  lib/dpif.c                                    |  164 +-
>>  lib/dpif.h                                    |   14 +-
>>  lib/dummy.h                                   |    9 +
>>  lib/json.c                                    |    7 +
>>  lib/netdev-dpdk.c                             |    9 +-
>>  lib/netdev-dpdk.h                             |    2 +-
>>  lib/netdev-dummy.c                            |  199 +-
>>  lib/netdev-linux.c                            |    3 +-
>>  lib/netdev-offload-provider.h                 |  148 --
>>  lib/netdev-offload.c                          |  910 ---------
>>  lib/netdev-offload.h                          |  169 --
>>  lib/netdev-provider.h                         |   10 +-
>>  lib/netdev.c                                  |   71 +-
>>  lib/netdev.h                                  |   22 +
>>  lib/tc.c                                      |    2 +-
>>  ofproto/ofproto-dpif-upcall.c                 |   50 +-
>>  ofproto/ofproto-dpif.c                        |   82 +-
>>  tests/.gitignore                              |    3 +
>>  tests/automake.mk                             |   24 +
>>  tests/dpif-netdev.at                          |   37 +-
>>  tests/ofproto-dpif.at                         |  170 ++
>>  tests/ofproto-macros.at                       |   17 +-
>>  tests/sendpkt.py                              |   12 +-
>>  tests/system-dpdk-offloads-macros.at          |  236 +++
>>  tests/system-dpdk-offloads-testsuite.at       |   28 +
>>  tests/system-dpdk-offloads.at                 |  223 +++
>>  tests/system-dpdk.at                          |   35 +
>>  tests/system-kmod-macros.at                   |    5 +
>>  tests/system-offloads-testsuite-macros.at     |    5 +
>>  tests/system-offloads-traffic.at              |   46 +
>>  tests/system-traffic.at                       |    9 +-
>>  tests/system-userspace-macros.at              |    5 +
>>  vswitchd/bridge.c                             |    7 +-
>>  vswitchd/vswitch.xml                          |   44 +
>>  57 files changed, 7601 insertions(+), 3345 deletions(-)
>>  rename lib/{netdev-offload-dpdk.c => dpif-offload-dpdk-netdev.c} (83%)
>>  create mode 100644 lib/dpif-offload-dpdk-private.h
>>  create mode 100644 lib/dpif-offload-dpdk.c
>>  create mode 100644 lib/dpif-offload-dummy.c
>>  create mode 100644 lib/dpif-offload-provider.h
>>  rename lib/{netdev-offload-tc.c => dpif-offload-tc-netdev.c} (95%)
>>  create mode 100644 lib/dpif-offload-tc-private.h
>>  create mode 100644 lib/dpif-offload-tc.c
>>  create mode 100644 lib/dpif-offload.c
>>  create mode 100644 lib/dpif-offload.h
>>  delete mode 100644 lib/netdev-offload-provider.h
>>  delete mode 100644 lib/netdev-offload.c
>>  delete mode 100644 lib/netdev-offload.h
>>  create mode 100644 tests/system-dpdk-offloads-macros.at
>>  create mode 100644 tests/system-dpdk-offloads-testsuite.at
>>  create mode 100644 tests/system-dpdk-offloads.at
>> 
>
> Thanks, Eelco for the update!  I read through the set again and it
> reads much easier after the renames.  There are still a few small
> style issues throughout the set, but they are all minor like an
> extra empty line at the EOF of dpif-offload.c in patch 7, a few
> more places with a missing double space or underscore or the thread
> safety annotations not on a separate line.  There is also a missed
> rename of dpif_offload_flow_get_n_offloaded() into something like
> dpif_ofload_flow_count(), since the underlying callback was renamed.
> And the provider_collection_add() should return a positive errno
> after all, since it's passed directly into ovs_strerror().
>
> As discussed off-list, instead of me writing all of these nits down
> in the emails and then you fixing them, it's simpler if I just
> handle the merge and fix the nits on commit.

Thanks for the work on that.

> Also, found a couple more issues that we should address:
>
> 1. In the first patch, the dpif_offload_dump_next() function is
>    describing a case that should be impossible, but still tries
>    to handle it.  We shouldn't do that.  And instead we need to
>    just take the collection reference in dpif_offload_dump_start()
>    to make sure the collection doesn't go away during the dump.
>    And then we can fully rely on the LIST_FOR_EACH_CONTINUE.
>
> 2. The per-port priority configuration should be per-interface
>    instead as datapath doesn't deal with ports, it only works
>    with interfaces.  And some ports, like bonds can have multiple
>    interfaces.  It may not be a big deal as in a normal case
>    I don't think we would need different offload for interfaces
>    in the same port, but it's not really correct to configure
>    ports, when it is applied on the interface.
>
> And there are two things that I flagged in v5, but we decided to
> handle separately:
>
> 3. Mass-rename of offload provider structures and functions to
>    make them coherent and shorter than they are.  Most of them
>    are not exported and don't need extensive prefixes.  And these
>    prefixes need unification after moving stuff between netdev*
>    and dpif* modules.
>
> 4. Move of the port manager instances from provider-specific
>    structures to the common struct dpif_offload.  This should
>    eliminate some duplication and make the module boundaries
>    more clear.  This should also eliminate the need for the
>    port dump API or at least significantly simplify it, e.g.
>    by just iterating over port cmap with a cmap_cursor.
>
>
> So, the situation is the following: We have branching planned for
> tomorrow.  Full CI run takes about 24 hours.  So, we don't really
> have time for v7.  We could postpone branching, but it also seems
> unreasonable as we only really need changes in about 4 patches
> out of 40, and none of the issues listed above are critical or
> breaking any functionality.  They are mostly internal code movements.
> It would be much easier to get the set merged and then work on the
> 4 items above next week and backport to the new branch.

+1 to that.

> As mentioned earlier, to save some time, I can handle the merging
> and fix all the small nits on commit instead of wasting a lot of
> time on writing them down on the mailing list and then Eelco
> fixing them.
>
> Eelco, Aaron, does that sound like a good plan?  Or did I miss
> anything in my summary?

<gimli-voice/> And my ACKs :)

(if it makes it easier, I can go through and send them or you can add
 them during your merge)

> Best regards, Ilya Maximets.

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to