Ilya Maximets <[email protected]> writes: > On 1/15/26 10:19 AM, Eelco Chaudron wrote: >> This patch series introduces a major architectural >> refactoring of Open vSwitch's hardware offload >> infrastructure. It replaces the tightly coupled >> `netdev-offload` implementation with a new, modular >> `dpif-offload-provider` framework. >> >> MOTIVATION >> ------------------------------------------------------------- >> The existing `netdev-offload` API tightly couples datapath >> implementations (like `dpif-netdev`) with specific offload >> technologies (DPDK's rte_flow). This design has several >> limitations: >> >> - Rigid Architecture: It creates complex dependencies, >> making the code difficult to maintain and extend. >> >> - Limited Flexibility: Supporting multiple offload backends >> simultaneously or adding new ones is cumbersome. >> >> - Inconsistent APIs: The logic for handling different >> offload types is scattered, leading to an inconsistent >> and hard-to-follow API surface. >> >> This refactoring aims to resolve these issues by creating a >> clean separation of concerns, improving modularity, and >> establishing a clear path for future hardware offload >> integrations. >> >> PROPOSED SOLUTION: THE `DPIF-OFFLOAD-PROVIDER` FRAMEWORK >> ------------------------------------------------------------- >> This series introduces the `dpif-offload-provider` >> framework, which functions similarly to the existing >> `dpif-provider` pattern. It treats hardware offload as a >> distinct layer with multiple, dynamically selectable >> backends. >> >> Key features of the new framework include: >> >> 1. Modular Architecture: A clean separation between the >> generic datapath interface and specific offload >> provider implementations (e.g., `dpif-offload-tc`, >> `dpif-offload-dpdk`). `dpif` layers are now generic >> clients of the offload API. >> >> 2. Provider-based System: Allows multiple offload backends >> to coexist. >> >> 3. Unified and Asynchronous API: Establishes a consistent >> API across all offload providers. For userspace >> datapaths, the API is extended to support asynchronous >> flow operations with callbacks, making `dpif-netdev` a >> more efficient client. >> >> 4. Enhanced Configuration: Provides granular control over >> offload provider selection through a global and per-port >> priority system (`hw-offload-priority`), allowing >> fine-tuned policies for different hardware. >> >> 5. Improved Testing: Includes a new test framework >> specifically for validating DPDKs rte_flow offloads, >> enhancing long-term maintainability. >> >> PATCH SERIES ORGANIZATION >> ------------------------------------------------------------- >> This large series is organized logically to facilitate >> review: >> >> 1. Framework Foundation: The initial patches establish the >> core `dpif-offload-provider` framework, including the >> necessary APIs for port management, flow mark >> allocation, configuration, and a dummy provider for >> testing. >> >> 2. Provider Implementation: These patches introduce the new >> `dpif-offload-tc` and `dpif-offload-dpdk` providers, >> building out their specific implementations on top of >> the new framework. >> >> 3. API Migration and Decoupling: The bulk of the series >> systematically migrates functionality from the legacy >> `netdev-offload` layer to the new providers. Key >> commits here decouple `dpif-netdev` and, crucially, >> `dpif-netlink` from their hardware offload >> entanglements. >> >> 4. Cleanup: The final patches remove the now-redundant >> global APIs and structures from `netdev-offload`, >> completing the transition. >> >> BACKWARD COMPATIBILITY >> ------------------------------------------------------------- >> This refactoring maintains full API compatibility from a >> user's perspective. All existing `ovs-vsctl` and >> `ovs-appctl` commands continue to function as before. The >> changes are primarily internal architectural improvements >> designed to make OVS more robust and extensible. >> >> REQUEST FOR COMMENTS >> ------------------------------------------------------------- >> This is a significant architectural change that affects >> core OVS infrastructure. We welcome feedback on: >> >> - The overall architectural approach and the >> `dpif-offload-provider` concept. >> - The API design, particularly the new asynchronous model >> for `dpif-netdev`. >> - The migration strategy and any potential backward >> compatibility concerns. >> - Performance implications of the new framework. >> >> ------------------------------------------------------------- >> Changes from v1: >> - Fixed issues reported by Aaron and my AI experiments. >> - See individual patches for specific changes. >> >> Changes from v2: >> - See individual patches for specific changes. >> >> Changes from v3: >> - In patch 7, removed leftover netdev_close(port->netdev) >> causing netdev reference count issues. >> - Changed naming for dpif_offload_impl_type enum entries. >> - Merged previous patch 36, 'dpif-netdev: Add full name >> to the dp_netdev structure', with the next patch. >> >> Changes from v4: >> - Applied review comments from Aaron, see individual patches. >> - Added Eli's ACK to the individual patches. >> >> Changes from v5: >> - Addressed Ilya's comments, see upstream discussion for >> details. >> >> Eelco Chaudron (40): >> dpif-offload-provider: Add dpif-offload-provider implementation. >> dpif-offload: Add provider for tc offload. >> dpif-offload: Add provider for dpdk (rte_flow). >> dpif-offload: Allow configuration of offload provider priority. >> dpif-offload: Move hw-offload configuration to dpif-offload. >> dpif-offload: Add offload provider set_config API. >> dpif-offload: Add port registration and management APIs. >> dpif-offload-tc: Add port management framework. >> dpif-offload-dpdk: Add port management framework. >> dpif-offload: Validate mandatory port class callbacks on registration. >> dpif-offload: Allow per-port offload provider priority config. >> dpif-offload: Introduce provider debug information API. >> dpif-offload: Call flow-flush netdev-offload APIs via dpif-offload. >> dpif-offload: Call meter netdev-offload APIs via dpif-offload. >> dpif-offload: Move the flow_get_n_flows() netdev-offload API to dpif. >> dpif-offload: Move hw_post_process netdev API to dpif. >> dpif-offload: Add flow dump APIs to dpif-offload. >> dpif-offload: Move the tc flow dump netdev APIs to dpif-offload. >> dpif-netlink: Remove netlink-offload integration. >> dpif-netlink: Add API to get offloaded netdev from port_id. >> dpif-offload: Add API to find offload implementation type. >> dpif-offload: Add operate implementation to dpif-offload. >> netdev-offload: Temporarily move thread-related APIs to dpif-netdev. >> dpif-offload: Add port dump APIs to dpif-offload. >> dpif-netdev: Remove indirect DPDK netdev offload API calls. >> dpif: Add dpif_get_features() API. >> dpif-offload: Add flow operations to dpif-offload-tc. >> dpif-netlink: Remove entangled hardware offload. >> dpif-offload-tc: Remove netdev-offload dependency. >> netdev_dummy: Remove hardware offload override. >> dpif-offload: Move the netdev_any_oor() API to dpif-offload. >> netdev-offload: Remove the global netdev-offload API. >> dpif-offload: Add inline flow APIs for userspace datapaths. >> dpif_netdev: Fix nullable memcpy in queue_netdev_flow_put(). >> dpif-offload: Move offload_stats_get() API to dpif-offload. >> dpif-offload-dpdk: Abstract rte_flow implementation from dpif-netdev. >> dpif-offload-dummy: Add flow add/del/get APIs. >> netdev-offload: Fold netdev-offload APIs and files into dpif-offload. >> tests: Fix NSH decap header test for real Ethernet devices. >> tests: Add a simple DPDK rte_flow test framework. >> >> Documentation/topics/testing.rst | 19 + >> include/openvswitch/json.h | 1 + >> include/openvswitch/netdev.h | 1 + >> include/openvswitch/thread.h | 6 + >> lib/automake.mk | 17 +- >> lib/dp-packet.h | 1 + >> lib/dpctl.c | 50 +- >> lib/dpdk.c | 2 - >> lib/dpif-netdev-avx512.c | 4 +- >> lib/dpif-netdev-private-flow.h | 9 +- >> lib/dpif-netdev.c | 1243 +++--------- >> lib/dpif-netlink.c | 557 +----- >> ...load-dpdk.c => dpif-offload-dpdk-netdev.c} | 592 ++++-- >> lib/dpif-offload-dpdk-private.h | 73 + >> lib/dpif-offload-dpdk.c | 1114 +++++++++++ >> lib/dpif-offload-dummy.c | 890 +++++++++ >> lib/dpif-offload-provider.h | 412 ++++ >> ...-offload-tc.c => dpif-offload-tc-netdev.c} | 229 ++- >> lib/dpif-offload-tc-private.h | 73 + >> lib/dpif-offload-tc.c | 810 ++++++++ >> lib/dpif-offload.c | 1773 +++++++++++++++++ >> lib/dpif-offload.h | 227 +++ >> lib/dpif-provider.h | 66 +- >> lib/dpif.c | 164 +- >> lib/dpif.h | 14 +- >> lib/dummy.h | 9 + >> lib/json.c | 7 + >> lib/netdev-dpdk.c | 9 +- >> lib/netdev-dpdk.h | 2 +- >> lib/netdev-dummy.c | 199 +- >> lib/netdev-linux.c | 3 +- >> lib/netdev-offload-provider.h | 148 -- >> lib/netdev-offload.c | 910 --------- >> lib/netdev-offload.h | 169 -- >> lib/netdev-provider.h | 10 +- >> lib/netdev.c | 71 +- >> lib/netdev.h | 22 + >> lib/tc.c | 2 +- >> ofproto/ofproto-dpif-upcall.c | 50 +- >> ofproto/ofproto-dpif.c | 82 +- >> tests/.gitignore | 3 + >> tests/automake.mk | 24 + >> tests/dpif-netdev.at | 37 +- >> tests/ofproto-dpif.at | 170 ++ >> tests/ofproto-macros.at | 17 +- >> tests/sendpkt.py | 12 +- >> tests/system-dpdk-offloads-macros.at | 236 +++ >> tests/system-dpdk-offloads-testsuite.at | 28 + >> tests/system-dpdk-offloads.at | 223 +++ >> tests/system-dpdk.at | 35 + >> tests/system-kmod-macros.at | 5 + >> tests/system-offloads-testsuite-macros.at | 5 + >> tests/system-offloads-traffic.at | 46 + >> tests/system-traffic.at | 9 +- >> tests/system-userspace-macros.at | 5 + >> vswitchd/bridge.c | 7 +- >> vswitchd/vswitch.xml | 44 + >> 57 files changed, 7601 insertions(+), 3345 deletions(-) >> rename lib/{netdev-offload-dpdk.c => dpif-offload-dpdk-netdev.c} (83%) >> create mode 100644 lib/dpif-offload-dpdk-private.h >> create mode 100644 lib/dpif-offload-dpdk.c >> create mode 100644 lib/dpif-offload-dummy.c >> create mode 100644 lib/dpif-offload-provider.h >> rename lib/{netdev-offload-tc.c => dpif-offload-tc-netdev.c} (95%) >> create mode 100644 lib/dpif-offload-tc-private.h >> create mode 100644 lib/dpif-offload-tc.c >> create mode 100644 lib/dpif-offload.c >> create mode 100644 lib/dpif-offload.h >> delete mode 100644 lib/netdev-offload-provider.h >> delete mode 100644 lib/netdev-offload.c >> delete mode 100644 lib/netdev-offload.h >> create mode 100644 tests/system-dpdk-offloads-macros.at >> create mode 100644 tests/system-dpdk-offloads-testsuite.at >> create mode 100644 tests/system-dpdk-offloads.at >> > > Thanks, Eelco for the update! I read through the set again and it > reads much easier after the renames. There are still a few small > style issues throughout the set, but they are all minor like an > extra empty line at the EOF of dpif-offload.c in patch 7, a few > more places with a missing double space or underscore or the thread > safety annotations not on a separate line. There is also a missed > rename of dpif_offload_flow_get_n_offloaded() into something like > dpif_ofload_flow_count(), since the underlying callback was renamed. > And the provider_collection_add() should return a positive errno > after all, since it's passed directly into ovs_strerror(). > > As discussed off-list, instead of me writing all of these nits down > in the emails and then you fixing them, it's simpler if I just > handle the merge and fix the nits on commit.
Thanks for the work on that. > Also, found a couple more issues that we should address: > > 1. In the first patch, the dpif_offload_dump_next() function is > describing a case that should be impossible, but still tries > to handle it. We shouldn't do that. And instead we need to > just take the collection reference in dpif_offload_dump_start() > to make sure the collection doesn't go away during the dump. > And then we can fully rely on the LIST_FOR_EACH_CONTINUE. > > 2. The per-port priority configuration should be per-interface > instead as datapath doesn't deal with ports, it only works > with interfaces. And some ports, like bonds can have multiple > interfaces. It may not be a big deal as in a normal case > I don't think we would need different offload for interfaces > in the same port, but it's not really correct to configure > ports, when it is applied on the interface. > > And there are two things that I flagged in v5, but we decided to > handle separately: > > 3. Mass-rename of offload provider structures and functions to > make them coherent and shorter than they are. Most of them > are not exported and don't need extensive prefixes. And these > prefixes need unification after moving stuff between netdev* > and dpif* modules. > > 4. Move of the port manager instances from provider-specific > structures to the common struct dpif_offload. This should > eliminate some duplication and make the module boundaries > more clear. This should also eliminate the need for the > port dump API or at least significantly simplify it, e.g. > by just iterating over port cmap with a cmap_cursor. > > > So, the situation is the following: We have branching planned for > tomorrow. Full CI run takes about 24 hours. So, we don't really > have time for v7. We could postpone branching, but it also seems > unreasonable as we only really need changes in about 4 patches > out of 40, and none of the issues listed above are critical or > breaking any functionality. They are mostly internal code movements. > It would be much easier to get the set merged and then work on the > 4 items above next week and backport to the new branch. +1 to that. > As mentioned earlier, to save some time, I can handle the merging > and fix all the small nits on commit instead of wasting a lot of > time on writing them down on the mailing list and then Eelco > fixing them. > > Eelco, Aaron, does that sound like a good plan? Or did I miss > anything in my summary? <gimli-voice/> And my ACKs :) (if it makes it easier, I can go through and send them or you can add them during your merge) > Best regards, Ilya Maximets. _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
