On 2 Oct 2025, at 12:19, Ilya Maximets wrote:
> On 9/26/25 1:41 PM, Eelco Chaudron via dev wrote:
>> This RFC patch series introduces a major architectural
>> refactoring of Open vSwitch's hardware offload
>> infrastructure. It replaces the tightly coupled
>> `netdev-offload` implementation with a new, modular
>> `dpif-offload-provider` framework.
>>
>> MOTIVATION
>> -------------------------------------------------------------
>> The existing `netdev-offload` API tightly couples datapath
>> implementations (like `dpif-netdev`) with specific offload
>> technologies (rte_flow). This design has several
>> limitations:
>>
>> - Rigid Architecture: It creates complex dependencies,
>>   making the code difficult to maintain and extend.
>>
>> - Limited Flexibility: Supporting multiple offload backends
>>   simultaneously, or adding new ones, is cumbersome.
>>
>> - Inconsistent APIs: The logic for handling different
>>   offload types is scattered, leading to an inconsistent
>>   and hard-to-follow API surface.
>>
>> This refactoring aims to resolve these issues by creating a
>> clean separation of concerns, improving modularity, and
>> establishing a clear path for future hardware offload
>> integrations.
>>
>> PROPOSED SOLUTION: THE `DPIF-OFFLOAD-PROVIDER` FRAMEWORK
>> -------------------------------------------------------------
>> This series introduces the `dpif-offload-provider`
>> framework, which functions similarly to the existing
>> `dpif-provider` pattern. It treats hardware offload as a
>> distinct layer with multiple, dynamically selectable
>> backends.
>>
>> Key features of the new framework include:
>>
>> 1. Modular Architecture: A clean separation between the
>>    generic datapath interface and specific offload
>>    provider implementations (e.g., `dpif-offload-tc`,
>>    `dpif-offload-rte_flow`). `dpif` layers are now generic
>>    clients of the offload API (a rough sketch of a
>>    provider class follows this list).
>>
>> 2. Provider-based System: Allows multiple offload backends
>>    to coexist.
>>
>> 3. Unified and Asynchronous API: Establishes a consistent
>>    API across all offload providers. For userspace
>>    datapaths, the API is extended to support asynchronous
>>    flow operations with callbacks, making `dpif-netdev` a
>>    more efficient client (see the second sketch below).
>>
>> 4. Enhanced Configuration: Provides granular control over
>>    offload provider selection through a global and per-port
>>    priority system (`hw-offload-priority`), allowing
>>    fine-tuned policies for different hardware (see the
>>    configuration example below).
>>
>> 5. Improved Testing: Includes a new test framework
>>    specifically for validating rte_flow offloads, improving
>>    long-term maintainability.
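>>
>> To make the provider pattern concrete, here is a minimal
>> sketch of what a provider class could look like. The type
>> and member names below are illustrative assumptions, not
>> the actual definitions from this series; they simply follow
>> the style of the existing `dpif_class` function-pointer
>> table:
>>
>>     /* Hypothetical provider class: each backend (tc,
>>      * rte_flow, dummy) fills in one of these and registers
>>      * it with the framework at startup. */
>>     struct dpif_offload_class {
>>         const char *type;       /* E.g. "tc", "rte_flow". */
>>
>>         int (*init)(void);
>>         int (*port_add)(struct netdev *netdev,
>>                         odp_port_t port_no);
>>         int (*port_del)(odp_port_t port_no);
>>         int (*flow_put)(struct dpif_offload *offload,
>>                         const struct dpif_offload_flow *flow);
>>         int (*flow_del)(struct dpif_offload *offload,
>>                         const struct dpif_offload_flow *flow);
>>     };
>>
>> Registered backends would then be selected per port
>> according to the configured priority order.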
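>>
>> In the same hedged spirit, the asynchronous model for
>> userspace datapaths could look roughly like the following;
>> the function and type names are again assumptions, not the
>> series' actual API:
>>
>>     /* Hypothetical asynchronous flow-put: the provider
>>      * queues the operation and invokes 'cb' with the
>>      * result once the hardware has processed it, so the
>>      * dpif-netdev PMD threads never block on offloads. */
>>     typedef void (*offload_done_cb)(int error, void *aux);
>>
>>     int dpif_offload_flow_put_async(
>>         struct dpif_offload *offload,
>>         const struct dpif_offload_flow *flow,
>>         offload_done_cb cb, void *aux);
>>
>>     /* Example completion handler; in dpif-netdev this is
>>      * where the flow's offload state would be updated. */
>>     static void
>>     flow_put_done(int error, void *aux)
>>     {
>>         struct dp_netdev_flow *flow = aux;
>>
>>         /* 'offload_ok' is a hypothetical flag, shown only
>>          * for illustration. */
>>         flow->offload_ok = !error;
>>     }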
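>>
>> As a usage illustration of the priority system: only the
>> `hw-offload-priority` name below comes from this series;
>> the exact table and column placement, and the port name,
>> are assumptions:
>>
>>     # Prefer tc globally, but rte_flow on one port
>>     # (hypothetical invocation, see note above).
>>     ovs-vsctl set Open_vSwitch . \
>>         other_config:hw-offload-priority=tc,rte_flow
>>     ovs-vsctl set Interface dpdk0 \
>>         other_config:hw-offload-priority=rte_flow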
>>
>> PATCH SERIES ORGANIZATION
>> -------------------------------------------------------------
>> This large series is organized logically to facilitate
>> review:
>>
>> 1. Framework Foundation: The initial patches establish the
>>    core `dpif-offload-provider` framework, including the
>>    necessary APIs for port management, flow mark
>>    allocation, and configuration, plus a dummy provider
>>    for testing.
>>
>> 2. Provider Implementation: These patches introduce the
>>    new `dpif-offload-tc` and `dpif-offload-rte_flow`
>>    providers, building out their specific implementations
>>    on top of the new framework.
>>
>> 3. API Migration and Decoupling: The bulk of the series
>>    systematically migrates functionality from the legacy
>>    `netdev-offload` layer to the new providers. Key
>>    commits here decouple `dpif-netdev` and, crucially,
>>    `dpif-netlink` from their hardware offload
>>    entanglements.
>>
>> 4. Cleanup: The final patches remove the now-redundant
>>    global APIs and structures from `netdev-offload`,
>>    completing the transition.
>>
>> BACKWARD COMPATIBILITY
>> -------------------------------------------------------------
>> This refactoring maintains full compatibility from the
>> user's perspective: all existing `ovs-vsctl` and
>> `ovs-appctl` commands continue to function as before. The
>> changes are primarily internal architectural improvements
>> designed to make OVS more robust and extensible.
>>
>> REQUEST FOR COMMENTS
>> -------------------------------------------------------------
>> This is a significant architectural change that affects
>> core OVS infrastructure. We welcome feedback on:
>>
>> - The overall architectural approach and the
>>   `dpif-offload-provider` concept.
>> - The API design, particularly the new asynchronous model
>>   for `dpif-netdev`.
>> - The migration strategy and any potential backward
>>   compatibility concerns.
>> - Performance implications of the new framework.
>>
>> -------------------------------------------------------------
>>
>> v2:
>> - Fixed some minor AI review comments (see individual
>>   patches).
>> - Fixed some Coverity issues reported on this patch set.
>> - Fixed and investigated some rte_flow unit tests.
>>
>
> Hi, Eelco. Thanks for the series! That's a huge amount of work.
>
> Not a technical review yet, as I didn't get to read the set, but I
> have a couple of high-level comments below.
>
> First of all - naming. :) I think there was a similar discussion
> before, when I was working on the netdev-offload provider API (at
> least there was one in my head). This series uses 'rte_flow' as the
> name of one of the providers, which is how the API is named in DPDK,
> and that is fine. But it frequently devolves into just 'rte', which
> makes no sense, and even DPDK itself has been trying to get rid of
> that anachronism for a while now. We also have user-facing bits that
> name the offload provider as 'dpdk', for example in the flow dump
> filters and outputs. So, since rte_flow is the only flow offload API
> that DPDK has, and given that it's not a user-friendly name anyway,
> I'd suggest renaming all the rte_flow and rte stuff to dpdk, i.e.
> dpif-offload-dpdk. This applies to all the internal function and
> structure names, filenames, and so on.

Makes sense to me. I was debating whether we should move to rte_flow
or not. If no one else objects, I’ll start renaming next week before
sending out a v3.

If DPDK ever gets an additional flow offload library, we can still
call it dpdk_NFFO (New Fancy Flow Offload) ;)

> One other concern is the new testing framework. While it's great to
> have it, I'm not sure about maintainability. We don't have any CI
> that would be capable of running it, and most developers will likely
> not be able to run it due to hardware requirements. So, unless we
> have a dummy rte_flow implementation inside DPDK itself (or inside
> OVS, which would be less ideal), it's going to be hard to keep this
> testsuite in good shape. Do you have some thoughts on this?

I do feel we need some form of testing, even if it can only be run
manually, and only occasionally, by a developer. Perhaps this is a
topic for broader discussion. Maybe we should only allow new offload
providers if there is a public testing ground, for example, if they
provide a GitHub self-hosted runner or something equivalent. Not sure
how this would work, but it’s worth discussing.

> Best regards,
> Ilya Maximets.
