On 6/24/25 1:31 PM, Tim Rozet wrote:
Thanks Mark for the detailed response and taking the time to review the
proposal. See inline.
Tim Rozet
Red Hat OpenShift Networking Team
On Tue, Jun 24, 2025 at 12:04 PM Mark Michelson <mmich...@redhat.com> wrote:
On 6/18/25 8:26 PM, Numan Siddique wrote:
> On Fri, Jun 13, 2025 at 8:44 AM Tim Rozet via dev
> <ovs-dev@openvswitch.org> wrote:
>>
>> Hello,
>> In the OVN-Kubernetes project we have been discussing and designing a way to
>> implement Service Function Chaining (SFC) for various use cases. Some of
>> these use cases are fairly complicated, involving a DPU and multiple
>> clusters. However, we have tried to abstract the OVN design and use case
>> into a generic implementation that is not specific to our particular use
>> cases. It follows SFC designs previously done within other projects like
>> OpenStack Neutron and OpenDaylight. Please see:
>>
>> https://docs.google.com/document/d/1dLdpx_9ZCnjHHldbNZABIpJF_GXd69qb/edit#bookmark=id.a7vfofkk8rj5
>>
>> tl;dr the design includes new tables to declare chains and classifiers to
>> get traffic into that chain. There needs to be a new stage in the datapath
>> pipeline to evaluate this behavior upon port ingress. We also need these
>> flows to be hardware offloadable.
>>
>> For more details on the specific use cases we are targeting in the
>> OVN-Kubernetes project, please see:
>>
>> https://docs.google.com/document/d/1MDZlu4oHL3RCWndbSgC-IGLgs1QfnB1l47nPqtM5iNo/edit?tab=t.0#heading=h.g8u53k9ds9s5
>>
>> Would appreciate feedback (either on the mailing list or in the design doc)
>> and thoughts from the OVN experts on how we can accommodate this feature.
>>
>
> Hi Tim,
>
> There is a very similar proposal from @Sragdhara Datta Chaudhuri to
> add Network Functions support in OVN.
> Can you please take a look at it ? Looks like there are many
> similarities in the requirements.
>
> https://mail.openvswitch.org/pipermail/ovs-dev/2025-May/423586.html
> https://mail.openvswitch.org/pipermail/ovs-dev/2025-June/424102.html
>
>
> Thanks
> Numan
Hi Tim and Numan,
I've looked at both the ovn-k proposal and the Nutanix patch series. I
think the biggest differences between the proposals (aside from small
things, like naming) are the following:
1) Nutanix amends the ACL table to include a network function group to
send the packet to if the packet matches. The ovn-k proposal suggests a
new SFC_Classifier table that includes an ACL-like match.
2) ovn-k wants load balancing of the service functions. The Nutanix
patch series has no load balancing.
3) ovn-k wants a Service_Function_Chain table, that allows for multiple
services to be chained. The Nutanix patch series provides a
Network_Function_Group table that allows a single network function to be
the active one. There is no concept of chaining in the patch series.
4) ovn-k wants NSH-awareness. I don't 100% know what this entails, but
there is no NSH in the Nutanix patch series.
We don't necessarily require NSH. Some limited Cisco products support
NSH, but I'm not aware of other vendors. So for now the majority of the
CNF use case would be proxied. However, we do need some mechanism to
store metadata to know what chain the packet is currently on, especially
as packets go between nodes. This could be Geneve TLV metadata. I'm
looking for feedback on this kind of stuff in the doc, as I'm not sure
what is best suited for this and if it is offloadable.
IMO, items 2, 3, and 4 can be made as add-ons to the Nutanix patch
series.
How do you envision it being added on? Would it be a separate feature,
or an extension of the Nutanix effort?
These are great questions. My thought had been that it would be an
extension of the Nutanix feature.
I'm a bit concerned if it is the latter, because I worry we will have
boxed ourselves into a certain paradigm and be less flexible to
accommodate the full SFC RFC. For
example, in the Nutanix proposal it looks like the functionality relies
on standard networking principles. The client ports are connected to the
same subnet as the network function. In my proposal, there is no concept
of this network connectivity. The new stage simply takes the packet and
delivers it to the port, without any requirement of layer 2 or layer 3
connectivity.
I'm not 100% sure I understand what you mean about the Nutanix proposal
relying on standard network principles. For instance, my reading of the
Nutanix patches is that if the ACL matches, then the packet is sent to
the configured switch outport. Then when the packet re-arrives on the
configured switch inport, it uses conntrack information to put the
packet back on track to go to its intended destination. The service
function does not appear to require any sort of L2 switching based
solely on that.
Even the final patch that introduces the health monitoring doesn't rely
on the switch subnet but instead uses NB_Global options to determine the
destination MAC to check. It doesn't seem to be necessary to be on the
same subnet as the switch on which the service is configured.
I may be misinterpreting, though.
Furthermore in the Nutanix proposal there are requirements
around the packet not being modified, while in SFC it is totally OK for
the packet to be modified. Once classified, the packet is identified by
its chain id and position in the chain (aforementioned NSH/Geneve metadata).
Can you refresh me on how the chain ID is determined in the SFC
proposal? In the patch series, the function group ID is stored in
conntrack, so when the packet rearrives into OVN, we use conntrack to
identify that the packet has come from a network function and needs to
be "resumed" as it were. Because the patches use conntrack, the packet's
identifying info (src IP, dst IP, src port, dst port, l4 protocol) can't
be changed, since it means that we won't be able to find the packet in
conntrack any longer.
In the SFC proposal, if the packet is modified, then that means we would
need to use something other than conntrack to track the chain ID. Would
we require NSH in order to track the chain ID properly? Or is there some
other way?
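
To make the conntrack constraint above concrete, here is a minimal Python sketch. It is a toy model, not OVN or kernel conntrack code: the ToyConntrack class, the addresses, and the group ID value are all invented for illustration. It only shows why a lookup keyed on the 5-tuple stops working once a function rewrites any of those fields.

```python
from typing import Dict, NamedTuple, Optional


class FiveTuple(NamedTuple):
    """The identifying fields listed above (hypothetical representation)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str


class ToyConntrack:
    """Toy stand-in for a conntrack table: maps a 5-tuple to a group ID."""

    def __init__(self) -> None:
        self._entries: Dict[FiveTuple, int] = {}

    def commit(self, flow: FiveTuple, group_id: int) -> None:
        # Remember which function group the flow was redirected to.
        self._entries[flow] = group_id

    def lookup(self, flow: FiveTuple) -> Optional[int]:
        # Returns None when the packet no longer matches a known flow.
        return self._entries.get(flow)


ct = ToyConntrack()
original = FiveTuple("10.0.0.5", "10.0.0.9", 40000, 443, "tcp")
ct.commit(original, group_id=7)

# Packet comes back from the function unmodified: the group ID is found.
resumed_group = ct.lookup(original)

# Packet comes back with a rewritten source IP: the entry is not found.
rewritten = original._replace(src_ip="192.0.2.1")
lost_group = ct.lookup(rewritten)
```

If the function may modify the packet, the chain ID has to travel some other way, which is the question being asked above.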
Item 1 is the biggest sticking point. From my point of view, I prefer
the Nutanix approach of modifying the ACL table since,
* ACLs can be applied to switches or port groups. The proposed
SFC_Classifier only applies to port groups.
* ACLs have things like logging and sampling that can be useful in this
scenario.
* ACLs can be tiered.
However, if there's a good reason why this will not work for ovn-k's
scenario, then that would be good to know.
Using the ACLs I think would be fine for the OVNK use case as well. The
reasons I didn't propose using ACLs were twofold:
1. Trying to create a clear boundary for SFC. Since SFC does not behave
like normal networking, I thought it would make sense to make it its own
entity.
This is where I really wish we had something like composable services in
place, because it sounds like SFC is only being added to logical
switches because that's the current best fit for them. They would really
be better suited to their own datapath type.
But for now, putting them on a logical switch is the best choice.
The nice thing about ACL stages is that they are very early in the
logical switch pipelines. We perform FDB and mirror actions before the
ACL, but that's it.
2. I didn't think OVN would be amenable to modifying ACL to have a new
column to send to a chain.
In the Nutanix proposal it looks like the column is added to send to a
NFG. Would we also add the ability to send to a SFC?
The way I had thought about it, we could expand NFGs to contain SFCs.
Currently, an NFG has a list of network functions. But we could create a
new column in the NFG table that could be one or more SFCs. The idea
would be that if you configure the network_functions column, we use
those. If you configure the service_function_chains column, we use those
instead. It would be a misconfiguration to use both at the same time.
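
As a rough sketch of that either/or rule, assuming a hypothetical in-memory representation of an NFG row (the class and function names below are illustrative and are not part of the patch series or of any OVN schema):

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NetworkFunctionGroup:
    """Hypothetical view of an NFG row with the two columns discussed above."""
    name: str
    network_functions: List[str] = field(default_factory=list)
    service_function_chains: List[str] = field(default_factory=list)


def resolve_targets(nfg: NetworkFunctionGroup) -> List[str]:
    """Return whichever column is configured; reject rows that set both."""
    if nfg.network_functions and nfg.service_function_chains:
        raise ValueError(
            f"NFG {nfg.name}: network_functions and "
            "service_function_chains are mutually exclusive"
        )
    return nfg.service_function_chains or nfg.network_functions


nf_only = NetworkFunctionGroup("nfg-a", network_functions=["nf1", "nf2"])
sfc_only = NetworkFunctionGroup("nfg-b", service_function_chains=["chain1"])
```

In a real implementation the check would presumably live wherever northd validates NB rows, and a misconfigured row might be logged and skipped rather than raising.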
Currently, I would prefer to review and accept the Nutanix patch series
(for ovn25.09) and then add on the ovn-k features that are not present
in the series (for ovn26.03).
Tim, what do you think?
I think first we should have a solid plan for how we will add on the SFC
part. For example will we expand NFG so that we can load balance across
it or only have 1 active at a time? If so, then it would maybe make
sense now to add a new field to the NFG to indicate this mode. Those
types of detail I would like to iron out and have a plan for so we don't
find ourselves cornered when we try to add SFC later. wdyt?
Yes, this is how my thought process was as well. The current NFG
configuration allows for multiple network functions to be configured,
choosing a single one as the active one based on health checks.
We have to consider that we want to:
1) Allow for multiple functions to be chained.
2) Allow for multiple functions/chains to be load balanced.
There are many possibilities for how to implement these based on the
current patch series.
For chaining, I think the best plan is to create a new
Service_Function_Chain (or Network_Function_Chain if we want to keep the
same nomenclature) table. Then the NFG's network_function column could
allow for either singular functions or chains in the list of
network_functions.
Alternatively, we could get rid of the current Network_Function table in
favor of replacing it with the Service_Function_Chain table. A
Network_Function is nothing more than a Service_Function_Chain with a
single function, after all.
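
A small sketch of that equivalence, under the assumption that a chain is just an ordered list of function names. The types here are hypothetical and only illustrate how a lone function can be treated as a one-hop chain:

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class ServiceFunctionChain:
    """Hypothetical chain: an ordered sequence of network function names."""
    name: str
    functions: Tuple[str, ...]


def as_chain(entry: Union[str, ServiceFunctionChain]) -> ServiceFunctionChain:
    """Normalize an NFG entry so later code only ever deals with chains."""
    if isinstance(entry, ServiceFunctionChain):
        return entry
    # A single network function becomes a chain with exactly one hop.
    return ServiceFunctionChain(name=entry, functions=(entry,))


single = as_chain("firewall")
multi = as_chain(ServiceFunctionChain("inspect", ("firewall", "ids")))
```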
For load balancing, we could either:
a) Add a boolean to the NFG table, called load_balance. If set to false,
then a single active network function or service function chain is
chosen from the list. If set to true, then all network functions or
service function chains are viable, and we use load balancing to
determine which to use. We can still use health checks to ensure we only
try to load balance between live functions.
b) Create a new Load_Balanced_Service_Function_Chain table that
specifies lists of load balanced service function chains. Then the NFG
could place these in the network_functions as well.
c) The same as (b), but instead of adding a new table, add a new column to
the existing Load_Balancer table that allows a list of network_functions
(or chains) to be listed. Then these load balancers could be applied to
the NFG the same way as a network function.
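
For option (a), the selection logic could look roughly like the sketch below. Everything in it is an assumption made for illustration: the function name, the set of healthy targets, and especially the CRC32 hash over a flow key, which is only a placeholder for whatever selection mechanism the datapath would actually use.

```python
import zlib
from typing import List, Optional, Set


def pick_target(targets: List[str], healthy: Set[str],
                load_balance: bool, flow_key: bytes) -> Optional[str]:
    """Choose a function or chain from an NFG under option (a)."""
    # Health checks apply in both modes: only live targets are candidates.
    live = [t for t in targets if t in healthy]
    if not live:
        return None
    if not load_balance:
        # load_balance=false: a single active target, the first live one.
        return live[0]
    # load_balance=true: spread flows over all live targets. Hashing the
    # flow key keeps every packet of one flow on the same target.
    return live[zlib.crc32(flow_key) % len(live)]


targets = ["nf1", "nf2", "nf3"]
healthy = {"nf2", "nf3"}
active = pick_target(targets, healthy, load_balance=False, flow_key=b"flow-a")
balanced = pick_target(targets, healthy, load_balance=True, flow_key=b"flow-a")
```

Options (b) and (c) would keep the same selection step but move the list of candidates into a separate table or a Load_Balancer column.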
Thanks,
Mark Michelson
>
>> Thanks
>>
>> Tim Rozet
>> Red Hat OpenShift Networking Team
>>
>>>
>> _______________________________________________
>> dev mailing list
>> d...@openvswitch.org <mailto:d...@openvswitch.org>
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev