On 6/24/25 1:31 PM, Tim Rozet wrote:
Thanks, Mark, for the detailed response and for taking the time to review the proposal. See inline.

Tim Rozet
Red Hat OpenShift Networking Team


On Tue, Jun 24, 2025 at 12:04 PM Mark Michelson <mmich...@redhat.com> wrote:

    On 6/18/25 8:26 PM, Numan Siddique wrote:
     > On Fri, Jun 13, 2025 at 8:44 AM Tim Rozet via dev
     > <ovs-dev@openvswitch.org> wrote:
     >>
     >> Hello,
     >> In OVN-Kubernetes we have been discussing and designing a way to
     >> implement Service Function Chaining (SFC) for various use cases. Some of
     >> these use cases are fairly complicated, involving a DPU and multiple
     >> clusters. However, we have tried to abstract the OVN design and use case
     >> into a generic implementation that is not specific to our particular use
     >> cases. It follows SFC designs previously done within other projects like
     >> OpenStack Neutron and OpenDaylight. Please see:
     >>
     >> https://docs.google.com/document/d/1dLdpx_9ZCnjHHldbNZABIpJF_GXd69qb/edit#bookmark=id.a7vfofkk8rj5
     >>
     >> tl;dr the design includes new tables to declare chains and classifiers to
     >> get traffic into that chain. There needs to be a new stage in the datapath
     >> pipeline to evaluate this behavior upon port ingress. We also need these
     >> flows to be hardware offloadable.
     >>
     >> For more details on the specific use cases we are targeting in the
     >> OVN-Kubernetes project, please see:
     >>
     >> https://docs.google.com/document/d/1MDZlu4oHL3RCWndbSgC-IGLgs1QfnB1l47nPqtM5iNo/edit?tab=t.0#heading=h.g8u53k9ds9s5
     >>
     >> Would appreciate feedback (either on the mailing list or in the design doc)
     >> and thoughts from the OVN experts on how we can accommodate this feature.
     >>
     >
     > Hi Tim,
     >
     > There is a very similar proposal from @Sragdhara Datta Chaudhuri to
     > add Network Functions support in OVN.
     > Can you please take a look at it? It looks like there are many
     > similarities in the requirements.
     >
     > https://mail.openvswitch.org/pipermail/ovs-dev/2025-May/423586.html
     > https://mail.openvswitch.org/pipermail/ovs-dev/2025-June/424102.html
     >
     >
     > Thanks
     > Numan

    Hi Tim and Numan,

    I've looked at both the ovn-k proposal and the Nutanix patch series. I
    think the biggest differences between the proposals (aside from small
    things, like naming) are the following:

    1) Nutanix amends the ACL table to include a network function group to
    send the packet to if the packet matches. The ovn-k proposal suggests a
    new SFC_Classifier table that includes an ACL-like match.

    2) ovn-k wants load balancing of the service functions. The Nutanix
    patch series has no load balancing.

    3) ovn-k wants a Service_Function_Chain table that allows for multiple
    services to be chained. The Nutanix patch series provides a
    Network_Function_Group table that allows a single network function to be
    the active one. There is no concept of chaining in the patch series.

    4) ovn-k wants NSH-awareness. I don't 100% know what this entails, but
    there is no NSH in the Nutanix patch series.


We don't necessarily require NSH. A limited set of Cisco products support NSH, but I'm not aware of support from other vendors. So for now the majority of the CNF use cases would be proxied. However, we do need some mechanism to store metadata so that we know which chain the packet is currently on, especially as packets go between nodes. This could be Geneve TLV metadata. I'm looking for feedback on this in the doc, as I'm not sure what is best suited for it or whether it is offloadable.

    IMO, items 2, 3, and 4 can be made as add-ons to the Nutanix patch
    series.


How do you envision it being added on? Would it be a separate feature, or an extension of the Nutanix effort?

These are great questions. My thought had been that it would be an extension of the Nutanix feature.

I'm a bit concerned if it is the latter, because I worry we will have boxed ourselves into a certain paradigm and be less flexible to accommodate the full SFC RFC. For example, in the Nutanix proposal it looks like the functionality relies on standard networking principles. The client ports are connected to the same subnet as the network function. In my proposal, there is no concept of this network connectivity. The new stage simply takes the packet and delivers it to the port, without any requirement of layer 2 or layer 3 connectivity.

I'm not 100% sure I understand what you mean about the Nutanix proposal relying on standard networking principles. My reading of the Nutanix patches is that if the ACL matches, the packet is sent to the configured switch outport. Then, when the packet re-arrives on the configured switch inport, conntrack information is used to put the packet back on track to its intended destination. Based on that alone, the service function does not appear to require any sort of L2 switching.

Even the final patch that introduces the health monitoring doesn't rely on the switch subnet but instead uses NB_Global options to determine the destination MAC to check. It doesn't seem to be necessary to be on the same subnet as the switch on which the service is configured.

I may be misinterpreting, though.

Furthermore, in the Nutanix proposal there are requirements around the packet not being modified, while in SFC it is totally OK for the packet to be modified. Once classified, the packet is identified by its chain ID and position in the chain (the aforementioned NSH/Geneve metadata).
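
To make the "chain ID plus position" metadata concrete, here is a minimal sketch of how the two values could be carried in a single Geneve option TLV. It is only an illustration under stated assumptions: the option class and type values are placeholders, the 24-bit chain ID / 8-bit index split simply mirrors the NSH service path identifier and service index, and the function names are hypothetical. Nothing here is taken from the design doc or from OVN.

```python
import struct

# Placeholder values chosen for illustration only (assumptions, not
# assigned to SFC anywhere in this thread).
SFC_OPT_CLASS = 0xFFF0
SFC_OPT_TYPE = 0x01


def encode_sfc_option(chain_id: int, index: int) -> bytes:
    """Pack a 24-bit chain ID and an 8-bit chain position into one option."""
    if not 0 <= chain_id < (1 << 24):
        raise ValueError("chain_id must fit in 24 bits")
    if not 0 <= index < (1 << 8):
        raise ValueError("index must fit in 8 bits")
    data = struct.pack("!I", (chain_id << 8) | index)
    # Geneve option header: 16-bit class, 8-bit type, then 3 reserved bits
    # and a 5-bit length counted in 4-byte words (header not included).
    length_words = len(data) // 4
    header = struct.pack("!HBB", SFC_OPT_CLASS, SFC_OPT_TYPE, length_words & 0x1F)
    return header + data


def decode_sfc_option(option: bytes) -> tuple[int, int]:
    """Return (chain_id, index) from an option built by encode_sfc_option."""
    opt_class, opt_type, length = struct.unpack("!HBB", option[:4])
    if (opt_class, opt_type) != (SFC_OPT_CLASS, SFC_OPT_TYPE):
        raise ValueError("not an SFC option")
    (word,) = struct.unpack("!I", option[4:4 + (length & 0x1F) * 4])
    return word >> 8, word & 0xFF


option = encode_sfc_option(chain_id=42, index=3)
chain_id, index = decode_sfc_option(option)
```

Because the metadata travels with the packet rather than being looked up from the packet's headers, a service function could rewrite addresses or ports without losing the chain context. Whether such an option is hardware offloadable is exactly the open question above.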

Can you refresh me on how the chain ID is determined in the SFC proposal? In the patch series, the function group ID is stored in conntrack, so when the packet re-arrives in OVN, we use conntrack to identify that the packet has come from a network function and needs to be "resumed", as it were. Because the patches use conntrack, the packet's identifying info (src IP, dst IP, src port, dst port, L4 protocol) can't be changed; if it were, we would no longer be able to find the packet in conntrack.

In the SFC proposal, if the packet is modified, then that means we would need to use something other than conntrack to track the chain ID. Would we require NSH in order to track the chain ID properly? Or is there some other way?
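
The constraint described above can be shown with a toy model. This is a sketch only: a plain dictionary stands in for conntrack, and the names `FiveTuple`, `redirect`, and `resume` are hypothetical. Real conntrack behaviour (zones, reply direction, NAT handling) is deliberately not modelled.

```python
from typing import NamedTuple, Optional


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str


# Toy stand-in for conntrack: 5-tuple -> function group ID.
conntrack: dict[FiveTuple, int] = {}


def redirect(pkt: FiveTuple, group_id: int) -> None:
    """Remember which function group the packet was sent to."""
    conntrack[pkt] = group_id


def resume(pkt: FiveTuple) -> Optional[int]:
    """Look up the group ID when the packet comes back, if it is still known."""
    return conntrack.get(pkt)


original = FiveTuple("10.0.0.5", "10.0.0.9", 40000, 443, "tcp")
redirect(original, group_id=7)

# Unmodified packet: the lookup succeeds and the packet can be "resumed".
unmodified_result = resume(original)

# A function that rewrites the source address breaks the lookup.
rewritten = original._replace(src_ip="192.0.2.99")
rewritten_result = resume(rewritten)
```

Here `unmodified_result` is 7 and `rewritten_result` is `None`, which is why a proposal that allows packet modification needs the chain ID to travel some other way.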



    Item 1 is the biggest sticking point. From my point of view, I prefer
    the Nutanix approach of modifying the ACL table since,
    * ACLs can be applied to switches or port groups. The proposed
    SFC_Classifier only applies to port groups.
    * ACLs have things like logging and sampling that can be useful in this
    scenario.
    * ACLs can be tiered.
    However, if there's a good reason why this will not work for ovn-k's
    scenario, then that would be good to know.


Using the ACLs I think would be fine for the OVNK use case as well. The reasons I didn't propose using ACLs were twofold:
1. Trying to create a clear boundary for SFC. Since SFC does not behave like normal networking, I thought it would make sense to make it its own entity.

This is where I really wish we had something like composable services in place, because it sounds like SFC is only being added to logical switches because that's the current best fit for them. They would really be better suited to their own datapath type.

But for now, putting them on a logical switch is the best choice.

The nice thing about ACL stages is that they are very early in the logical switch pipelines. We perform FDB and mirror actions before the ACL, but that's it.

2. I didn't think OVN would be amenable to modifying ACL to have a new column to send to a chain.

In the Nutanix proposal it looks like the column is added to send to an NFG. Would we also add the ability to send to an SFC?

The way I had thought about it, we could expand NFGs to contain SFCs. Currently, an NFG has a list of network functions. But we could create a new column in the NFG table that could be one or more SFCs. The idea would be that if you configure the network_functions column, we use those. If you configure the service_function_chains column, we use those instead. It would be a misconfiguration to use both at the same time.
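
A minimal sketch of that either/or rule, assuming a simplified in-memory view of an NFG row. The `service_function_chains` column does not exist today, and the class and function names below are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkFunctionGroup:
    name: str
    # Existing idea: a list of individual network functions.
    network_functions: list[str] = field(default_factory=list)
    # Proposed (hypothetical) column: a list of service function chains.
    service_function_chains: list[str] = field(default_factory=list)


def resolve_targets(nfg: NetworkFunctionGroup) -> list[str]:
    """Return the configured targets, rejecting rows that set both columns."""
    if nfg.network_functions and nfg.service_function_chains:
        raise ValueError(
            f"NFG {nfg.name!r}: network_functions and "
            "service_function_chains are mutually exclusive"
        )
    return nfg.service_function_chains or nfg.network_functions


functions_only = resolve_targets(NetworkFunctionGroup("nfg1", network_functions=["fw1", "fw2"]))
chains_only = resolve_targets(NetworkFunctionGroup("nfg2", service_function_chains=["chain-a"]))
```

In practice a check like this would presumably live wherever NB rows are validated and would log a warning rather than raise, but the rule itself is the same.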



    Currently, I would prefer to review and accept the Nutanix patch series
    (for ovn25.09) and then add on the ovn-k features that are not present
    in the series (for ovn26.03).

    Tim, what do you think?


I think first we should have a solid plan for how we will add on the SFC part. For example, will we expand NFG so that we can load balance across it, or only have 1 active at a time? If the former, then it would maybe make sense to add a new field to the NFG now to indicate this mode. Those are the kinds of details I would like to iron out and have a plan for, so we don't find ourselves cornered when we try to add SFC later. wdyt?

Yes, this is how my thought process was as well. The current NFG configuration allows for multiple network functions to be configured, choosing a single one as the active one based on health checks.

We have to consider that we want to:
1) Allow for multiple functions to be chained.
2) Allow for multiple functions/chains to be load balanced.

There are many possibilities for how to implement these based on the current patch series.

For chaining, I think the best plan is to create a new Service_Function_Chain (or Network_Function_Chain if we want to keep the same nomenclature) table. Then the NFG's network_function column could allow for either singular functions or chains in the list of network_functions.

Alternatively, we could get rid of the current Network_Function table in favor of replacing it with the Service_Function_Chain table. A Network_Function is nothing more than a Service_Function_Chain with a single function, after all.

For load balancing, we could either:
a) Add a boolean to the NFG table, called load_balance. If set to false, then a single active network function or service function chain is chosen from the list. If set to true, then all network functions or service function chains are viable, and we use load balancing to determine which to use. We can still use health checks to ensure we only try to load balance between live functions.
b) Create a new Load_Balanced_Service_Function_Chain table that specifies lists of load balanced service function chains. Then the NFG could place these in the network_functions as well.
c) The same as b), but instead of adding a new table, add a new column to the existing Load_Balancer table that allows a list of network_functions (or chains) to be listed. Then these load balancers could be applied to the NFG the same way as a network function.
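
Option a) can be sketched roughly as follows. This is an illustration under assumptions: the function name `pick_target` is hypothetical, and hashing a flow key with CRC32 is just a simple stand-in for whatever selection mechanism OVN would actually use.

```python
import zlib
from typing import Optional


def pick_target(
    candidates: list[str],
    healthy: set[str],
    load_balance: bool,
    flow_key: bytes,
) -> Optional[str]:
    """Choose a network function or chain for a flow.

    load_balance=False: use the first healthy candidate (single active one).
    load_balance=True: spread flows across all healthy candidates by hashing
    the flow key, so the same flow always lands on the same candidate.
    """
    live = [c for c in candidates if c in healthy]
    if not live:
        return None
    if not load_balance:
        return live[0]
    return live[zlib.crc32(flow_key) % len(live)]


candidates = ["chain-a", "chain-b", "chain-c"]
healthy = {"chain-b", "chain-c"}  # chain-a failed its health check

active = pick_target(candidates, healthy, load_balance=False, flow_key=b"flow-1")
balanced = pick_target(candidates, healthy, load_balance=True, flow_key=b"flow-1")
```

With `load_balance` false, the result is the first live entry (`chain-b` here); with it true, the result is one of the live entries and is stable for a given flow key. Options b) and c) would move the same selection logic behind a separate table or the existing Load_Balancer table.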



    Thanks,
    Mark Michelson
     >
     >> Thanks
     >>
     >> Tim Rozet
     >> Red Hat OpenShift Networking Team
     >>
     >>>
     >> _______________________________________________
     >> dev mailing list
     >> d...@openvswitch.org <mailto:d...@openvswitch.org>
     >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev <https://
    mail.openvswitch.org/mailman/listinfo/ovs-dev>

