Sure Ales, I will take a look at your doc. I've been waiting for the review for
some time; glad it finally got picked up. Hope we can merge the changes soon.

Thanks,
Sragdhara

From: Ales Musil <amu...@redhat.com>
Date: Tuesday, April 29, 2025 at 7:21 AM
To: Numan Siddique <num...@ovn.org>
Cc: Sragdhara Datta Chaudhuri <sragdha.chau...@nutanix.com>, 
ovs-dev@openvswitch.org <ovs-dev@openvswitch.org>
Subject: Re: [ovs-dev] [PATCH OVN v2 0/5] *** Network Function Insertion. ***

On Mon, Apr 28, 2025 at 4:53 PM Numan Siddique <num...@ovn.org> wrote:
On Thu, Mar 13, 2025 at 4:46 AM Sragdhara Datta Chaudhuri
<sragdha.chau...@nutanix.com> wrote:
>
> RFC: NETWORK FUNCTION INSERTION IN OVN
>
> 1. Introduction
> ================
> The objective is to insert a Network Function (NF) in the path of 
> outbound/inbound traffic from/to a port-group. The use case is to integrate a 
> 3rd party service in the path of traffic. An example of such a service would 
> be layer7 firewall. The NF VM will be like a bump in the wire and should not 
> modify the packet, i.e. the IP header, the MAC addresses, VLAN tag, sequence 
> numbers remain unchanged.
>
> Here are some of the highlights:
> - A new entity network-function (NF) has been introduced. It contains a pair 
> of LSPs. The CMS would designate one as “inport” and the other as “outport”.
> - For high-availability, a network function group (NFG) entity consists of a 
> group of NFs. Only one NF in a NFG has an active role based on health 
> monitoring.
> - ACL would accept NFG as a parameter and traffic matching the ACL would be 
> redirected to the associated active NF’s port. NFG is accepted for stateful 
> allow action only.
> - The ACL’s port-group is the point of reference when defining the role of 
> the NF ports. The “inport” is the port closer to the port-group and “outport” 
> is the one away from it. For from-lport ACLs, the request packets would be 
> redirected to the NF “inport” and for to-lport ACLs, the request packets 
> would be redirected to NF “outport”. When the same packet comes out of the 
> other NF port, it gets simply forwarded.
> - Statefulness will be maintained, i.e. the response traffic will also go 
> through the same pair of NF ports but in reverse order.
> - For the NF ports we need to disable port security check, fdb learning and 
> multicast/broadcast forwarding.
> - Health monitoring involves ovn-controller periodically injecting ICMP probe
> packets into the NF inport and monitoring the same packet coming out of the NF
> outport.
> - If the traffic redirection involves cross-host traffic (e.g. for a 
> from-lport ACL, if the source VM and NF VM are on different hosts), packets 
> would be tunneled to and from the NF VM's host.
> - If the port-group to which the ACL is being applied has members spread 
> across multiple LSs, CMS needs to create child ports for the NF ports on each 
> of these LSs. The redirection rules in each LS will use the child ports on 
> that LS.
>
> 2. NB tables
> =============
> New NB tables
> -------------
> Network_Function: Each row contains {inport, outport, health_check}
> Network_Function_Group: Each row contains a list of Network_Function 
> entities. It also contains a unique id (between 1 and 255, generated by 
> northd) and a reference to the current active NF.
> Network_Function_Health_Check: Each row contains configuration for probes in 
> options field: {interval, timeout, success_count, failure_count}
>
>         "Network_Function_Health_Check": {
>             "columns": {
>                 "name": {"type": "string"},
>                 "options": {
>                      "type": {"key": "string",
>                               "value": "string",
>                               "min": 0,
>                               "max": "unlimited"}},
>                 "external_ids": {
>                     "type": {"key": "string", "value": "string",
>                              "min": 0, "max": "unlimited"}}},
>             "isRoot": true},
>         "Network_Function": {
>             "columns": {
>                 "name": {"type": "string"},
>                 "outport": {"type": {"key": {"type": "uuid",
>                                              "refTable": "Logical_Switch_Port",
>                                              "refType": "strong"},
>                                      "min": 1, "max": 1}},
>                 "inport": {"type": {"key": {"type": "uuid",
>                                             "refTable": "Logical_Switch_Port",
>                                             "refType": "strong"},
>                                     "min": 1, "max": 1}},
>                 "health_check": {"type": {
>                     "key": {"type": "uuid",
>                             "refTable": "Network_Function_Health_Check",
>                             "refType": "strong"},
>                     "min": 0, "max": 1}},
>                 "external_ids": {
>                     "type": {"key": "string", "value": "string",
>                              "min": 0, "max": "unlimited"}}},
>             "isRoot": true},
>         "Network_Function_Group": {
>             "columns": {
>                 "name": {"type": "string"},
>                 "network_function": {"type":
>                                   {"key": {"type": "uuid",
>                                            "refTable": "Network_Function",
>                                            "refType": "strong"},
>                                            "min": 0, "max": "unlimited"}},
>                 "mode": {"type": {"key": {"type": "string",
>                                           "enum": ["set", ["inline"]]}}},
>                 "network_function_active": {"type":
>                                   {"key": {"type": "uuid",
>                                            "refTable": "Network_Function",
>                                            "refType": "strong"},
>                                            "min": 0, "max": 1}},
>                 "id": {
>                      "type": {"key": {"type": "integer",
>                                       "minInteger": 0,
>                                       "maxInteger": 255}}},
>                 "external_ids": {
>                     "type": {"key": "string", "value": "string",
>                              "min": 0, "max": "unlimited"}}},
>             "isRoot": true},
>
>
> Modified NB table
> -----------------
> ACL: The ACL entity would have a new optional field that is a reference to a 
> Network_Function_Group entity. This field can be present only for stateful 
> allow ACLs.
>
>         "ACL": {
>             "columns": {
>                 "network_function_group": {"type": {"key": {"type": "uuid",
>                                            "refTable": "Network_Function_Group",
>                                            "refType": "strong"},
>                                            "min": 0,
>                                            "max": 1}},
>
> New options for Logical_Switch_Port
> ------------------------------------
> receive_multicast=<boolean>: Default true. If set to false, LS will not 
> forward broadcast/multicast traffic to this port. This is to prevent looping 
> of such packets.
>
> lsp_learn_fdb=<boolean>: Default true. If set to false, fdb learning will be 
> skipped for packets coming out of this port. Redirected packets from the NF 
> port would be carrying the originating VM’s MAC in source, and so learning 
> should not happen.
>
> CMS needs to set both the above options to false for NF ports, in addition to 
> disabling port security.
>
> network-function-linked-port=<lsp-name>: Each NF port needs to have this set 
> to the other NF port of the pair.
>
> New NB_global options
> ---------------------
> svc_monitor_mac_dst: destination MAC of probe packets (svc_monitor_mac is 
> already there and will be used as source MAC)
> svc_monitor_ip4: source IP of probe packets
> svc_monitor_ip4_dst: destination IP of probe packets
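>
> As an illustrative sketch only (the MAC and IP values below are arbitrary
> placeholders that the CMS would choose), these could be configured with:
>
> ovn-nbctl set NB_Global . options:svc_monitor_mac_dst="aa:bb:cc:dd:ee:ff" \
>     options:svc_monitor_ip4="169.254.100.1" options:svc_monitor_ip4_dst="169.254.100.2"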
>
> Sample configuration
> --------------------
> ovn-nbctl ls-add ls1
> ovn-nbctl lsp-add ls1 nfp1
> ovn-nbctl lsp-add ls1 nfp2
> ovn-nbctl set logical_switch_port nfp1 options:receive_multicast=false 
> options:lsp_learn_fdb=false options:network-function-linked-port=nfp2
> ovn-nbctl set logical_switch_port nfp2 options:receive_multicast=false 
> options:lsp_learn_fdb=false options:network-function-linked-port=nfp1
> ovn-nbctl network-function-add nf1 nfp1 nfp2
> ovn-nbctl network-function-group-add nfg1 nf1
> ovn-nbctl lsp-add ls1 p1 -- lsp-set-addresses p1 "50:6b:8d:3e:ed:c4 10.1.1.4"
> ovn-nbctl pg-add pg1 p1
> ovn-nbctl create Address_Set name=as1 addresses=10.1.1.4
> ovn-nbctl lsp-add ls1 p2 -- lsp-set-addresses p2 "50:6b:8d:3e:ed:c5 10.1.1.5"
> ovn-nbctl create Address_Set name=as2 addresses=10.1.1.5
> ovn-nbctl acl-add pg1 from-lport 200 'inport==@pg1 && ip4.dst == $as2' 
> allow-related nfg1
> ovn-nbctl acl-add pg1 to-lport 100 'outport==@pg1 && ip4.src == $as2' 
> allow-related nfg1
>
> 3. SB tables
> ============
> Service_Monitor:
> This is currently used by Load balancer. New fields are: “type” - to indicate 
> LB or NF, “mac” - the destination MAC address for monitor packets, 
> “logical_input_port” - the LSP to which the probe packet would be sent. Also, 
> “icmp” has been added as a protocol type, used only for NF.
>
>          "Service_Monitor": {
>              "columns": {
>                "type": {"type": {"key": {
>                           "type": "string",
>                           "enum": ["set", ["load-balancer", "network-function"]]}}},
>                "mac": {"type": "string"},
>                  "protocol": {
>                      "type": {"key": {"type": "string",
>                             "enum": ["set", ["tcp", "udp", "icmp"]]},
>                               "min": 0, "max": 1}},
>                "logical_input_port": {"type": "string"},
>
> northd would create one Service_Monitor entity for each NF. The 
> logical_input_port and logical_port would be populated from the NF inport and 
> outport fields respectively. The probe packets would be injected into the 
> logical_input_port and would be monitored out of logical_port.
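>
> As a rough illustration (not actual output; only the fields discussed above
> are shown, and the MAC value is a placeholder taken from svc_monitor_mac_dst),
> the Service_Monitor row created for nf1 would look along these lines:
>
>   type                : network-function
>   protocol            : icmp
>   logical_input_port  : "nfp1"
>   logical_port        : "nfp2"
>   mac                 : "aa:bb:cc:dd:ee:ff"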
>
> 4. Logical Flows
> ================
> Logical Switch ingress pipeline:
> - in_network_function added after in_stateful.
> - Modifications to in_acl_eval, in_stateful and in_l2_lookup.
> Logical Switch egress pipeline:
> - out_network_function added after out_stateful.
> - Modifications to out_pre_acl, out_acl_eval and out_stateful.
>
> 4.1 from-lport ACL
> ------------------
> The diagram shows the request path for packets from VM1 port p1, which is a 
> member of the pg to which ACL is applied. The response would follow the 
> reverse path, i.e. packet would be redirected to nfp2 and come out of nfp1 
> and be forwarded to p1.
> Also, p2 does not need to be on the same LS. Only p1, nfp1 and nfp2 need to
> be on the same LS.
>
>       -----                  -------                  -----
>      | VM1 |                | NF VM |                | VM2 |
>       -----                  -------                  -----
>         |                    /\    |                   / \
>         |                    |     |                    |
>        \ /                   |    \ /                   |
>    ------------------------------------------------------------
>   |     p1                 nfp1  nfp2                   p2     |
>   |                                                            |
>   |                      Logical Switch                        |
>    -------------------------------------------------------------
> pg1: [p1]         as2: [p2-ip]
> ovn-nbctl network-function-add nf1 nfp1 nfp2
> ovn-nbctl network-function-group-add nfg1 nf1
> ovn-nbctl acl-add pg1 from-lport 200 'inport==@pg1 && ip4.dst == $as2' 
> allow-related nfg1
> Say the unique id northd assigned to this NFG is 123.
>
> The request packets from p1 matching a from-lport ACL with NFG are
> redirected to nfp1, and the NFG id is committed to the CT label in p1's zone.
> When the same packet comes out of nfp2 it gets forwarded the normal way.
> Response packets have p1's MAC as the destination. Ingress processing sets the
> outport to p1, and the CT lookup in the egress pipeline (in p1's CT zone) yields
> the NFG id; the packet is injected back into the ingress pipeline after setting
> the outport to nfp2.
>
> Below are the changes in detail.
>
> 4.1.1 Request processing
> ------------------------
>
> in_acl_eval: For from-lport ACLs with NFG, the existing rule's action has 
> been enhanced to set:
>  - reg8[21] = 1: to indicate that packet has matched a rule with NFG
>  - reg5[0..7] = <NFG-unique-id>
>  - reg8[22] = <direction> (1: request, 0: response)
>
>   table=8 (ls_in_acl_eval), priority=1200 , match=(reg0[7] == 1 && 
> (inport==@pg1 && ip4.dst == $as2)), action=(reg8[16] = 1; reg0[1] = 1; 
> reg8[21] = 1; reg8[22] = 1; reg5[0..7] = 123; next;)
>   table=8 (ls_in_acl_eval), priority=1200 , match=(reg0[8] == 1 && 
> (inport==@pg1 && ip4.dst == $as2)), action=(reg8[16] = 1; reg8[21] = 1; 
> reg8[22] = 1; reg5[0..7] = 123; next;)
>
> in_stateful: Priority 110: set NFG id in CT label if reg8[21] is set.
>  - bit 7 (ct_label.network_function_group): Set to 1 to indicate NF insertion.
>  - bits 17 to 24 (ct_label.network_function_group_id): Stores the 8 bit NFG id
>
>   table=21(ls_in_stateful     ), priority=110  , match=(reg0[1] == 1 && 
> reg0[13] == 0 && reg8[21] == 1), action=(ct_commit { ct_mark.blocked = 0; 
> ct_mark.allow_established = reg0[20]; ct_label.acl_id = reg2[16..31]; 
> ct_label.network_function_group = 1; ct_label.network_function_group_id = 
> reg5[0..7]; }; next;)
>   table=21(ls_in_stateful     ), priority=110  , match=(reg0[1] == 1 && 
> reg0[13] == 1 && reg8[21] == 1), action=(ct_commit { ct_mark.blocked = 0; 
> ct_mark.allow_established = reg0[20]; ct_mark.obs_stage = reg8[19..20]; 
> ct_mark.obs_collector_id = reg8[8..15]; ct_label.obs_point_id = reg9; 
> ct_label.acl_id = reg2[16..31]; ct_label.network_function_group = 1; 
> ct_label.network_function_group_id = reg5[0..7]; }; next;)
>   table=21(ls_in_stateful     ), priority=100  , match=(reg0[1] == 1 && 
> reg0[13] == 0), action=(ct_commit { ct_mark.blocked = 0; 
> ct_mark.allow_established = reg0[20]; ct_label.acl_id = reg2[16..31]; 
> ct_label.network_function_group = 0; ct_label.network_function_group_id = 0; 
> }; next;)
>   table=21(ls_in_stateful     ), priority=100  , match=(reg0[1] == 1 && 
> reg0[13] == 1), action=(ct_commit { ct_mark.blocked = 0; 
> ct_mark.allow_established = reg0[20]; ct_mark.obs_stage = reg8[19..20]; 
> ct_mark.obs_collector_id = reg8[8..15]; ct_label.obs_point_id = reg9; 
> ct_label.acl_id = reg2[16..31]; ct_label.network_function_group = 0; 
> ct_label.network_function_group_id = 0; }; next;)
>   table=21(ls_in_stateful     ), priority=0    , match=(1), action=(next;)
>
>
> For non-NFG cases, the existing priority 100 rules are hit. An additional
> action has been added there to clear the NFG bits in the CT label.
>
> in_network_function: A new stage with priority 99 rules to redirect packets 
> by setting outport to the NF “inport” (or its child port) based on the NFG id 
> set by the prior ACL stage.
> Priority 100 rules ensure that when the same packets come out of the NF 
> ports, they are not redirected again (the setting of reg5 here relates to the 
> cross-host packet tunneling and will be explained later).
> Priority 1 rule: if reg8[21] is set, but the NF port (or child port) is not 
> present on this LS, drop packets.
>
>   table=22(ls_in_network_function), priority=100  , match=(inport == "nfp1"), 
> action=(reg5[16..31] = ct_label.tun_if_id; next;)
>   table=22(ls_in_network_function), priority=100  , match=(inport == "nfp2"), 
> action=(reg5[16..31] = ct_label.tun_if_id; next;)
>   table=22(ls_in_network_function), priority=100  , match=(reg8[21] == 1 && 
> eth.mcast), action=(next;)
>   table=22(ls_in_network_function), priority=99   , match=(reg8[21] == 1 && 
> reg8[22] == 1 && reg5[0..7] == 1), action=(outport = "nfp1"; output;)
>   table=22(ls_in_network_function), priority=1    , match=(reg8[21] == 1), 
> action=(drop;)
>   table=22(ls_in_network_function), priority=0    , match=(1), action=(next;)
>
>
> 4.1.2 Response processing
> -------------------------
> out_acl_eval: High priority rules that allow response and related packets to 
> go through have been enhanced to also copy CT label NFG bit into reg8[21].
>
>   table=6(ls_out_acl_eval), priority=65532, match=(!ct.est && ct.rel && !ct.new
>  && !ct.inv && ct_mark.blocked == 0), action=(reg8[21] =
> ct_label.network_function_group; reg8[16] = 1; ct_commit_nat;)
>   table=6(ls_out_acl_eval), priority=65532, match=(ct.est && !ct.rel && !ct.new
>  && !ct.inv && ct.rpl && ct_mark.blocked == 0), action=(reg8[21] =
> ct_label.network_function_group; reg8[16] = 1; next;)
>
> out_network_function: Priority 99 rule matches on the nfg_id in ct_label and
> sets the outport to the NF “outport”. It also sets reg8[23]=1 and injects the
> packet back into the ingress pipeline (in_l2_lkup).
> Priority 100 rule forwards all packets to NF ports to the next table.
>
>   table=11 (ls_out_network_function), priority=100  , match=(outport == 
> "nfp1"), action=(next;)
>   table=11 (ls_out_network_function), priority=100  , match=(outport == 
> "nfp2"), action=(next;)
>   table=11(ls_out_network_function), priority=100  , match=(reg8[21] == 1 && 
> eth.mcast), action=(next;)
>   table=11 (ls_out_network_function), priority=99   , match=(reg8[21] == 1 && 
> reg8[22] == 0 && ct_label.network_function_group_id == 123), action=(outport 
> = "nfp2"; reg8[23] = 1; next(pipeline=ingress, table=29);)
>   table=11 (ls_out_network_function), priority=1    , match=(reg8[21] == 1), 
> action=(drop;)
>   table=11 (ls_out_network_function), priority=0    , match=(1), 
> action=(next;)
>
> in_l2_lkup: if reg8[23] == 1 (packet has come back from egress), simply 
> forward such packets as outport is already set.
>
>   table=29(ls_in_l2_lkup), priority=100  , match=(reg8[23] == 1), 
> action=(output;)
>
> The above set of rules ensures that the response packet is sent to nfp2. When
> the same packet comes out of nfp1, the ingress pipeline sets the outport
> to p1 and the packet enters the egress pipeline.
>
> out_pre_acl: If the packet is coming from the NF inport, skip the egress
> pipeline up to the out_network_function stage, as the packet has already gone
> through it and we don't want the same packet to be processed by CT twice.
>   table=2 (ls_out_pre_acl     ), priority=110  , match=(inport == "nfp1"), 
> action=(next(pipeline=egress, table=12);)
>
>
> 4.2 to-lport ACL
> ----------------
>       -----                  --------                  -----
>      | VM1 |                |  NF VM |                | VM2 |
>       -----                  --------                  -----
>        / \                    |   / \                    |
>         |                     |    |                     |
>         |                    \ /   |                    \ /
>    -------------------------------------------------------------
>   |     p1                  nfp1   nfp2                  p2     |
>   |                                                             |
>   |                      Logical Switch                         |
>    -------------------------------------------------------------
> ovn-nbctl acl-add pg1 to-lport 100 'outport==@pg1 && ip4.src == $as2'
> allow-related nfg1
> The diagram shows the request traffic path. The response will follow the reverse path.
>
> Ingress pipeline sets the outport to p1 based on destination MAC lookup. The
> packet enters the egress pipeline. There the to-lport ACL with NFG gets
> evaluated and the NFG id gets committed to the CT label. Then the outport is
> set to nfp2 and the packet is injected back into the ingress pipeline. When the
> same packet comes out of nfp1, it gets forwarded to p1 the normal way.
> For the response packet from p1, the ingress pipeline gets the NFG id from the
> CT label and accordingly redirects it to nfp1. When it comes out of nfp2 it is
> forwarded the normal way.
>
> 4.2.1 Request processing
> ------------------------
> out_acl_eval: For to-lport ACLs with NFG, the existing rule's action has been 
> enhanced to set:
>  - reg8[21] = 1: to indicate that packet has matched a rule with NFG
>  - reg5[0..7] = <NFG-unique-id>
>  - reg8[22] = <direction> (1: request, 0: response)
>
>   table=6 (ls_out_acl_eval    ), priority=1100 , match=(reg0[7] == 1 && 
> (outport==@pg1 && ip4.src == $as2)), action=(reg8[16] = 1; reg0[1] = 1; 
> reg8[21] = 1; reg8[22] = 1; reg5[0..7] = 1; next;)
>   table=6 (ls_out_acl_eval    ), priority=1100 , match=(reg0[8] == 1 && 
> (outport==@pg1 && ip4.src == $as2)), action=(reg8[16] = 1; reg0[1] = 1; 
> reg8[21] = 1; reg8[22] = 1; reg5[0..7] = 1; next;)
>
>
>
> Out_stateful: Priority 110: set NFG id in CT label if reg8[21] is set.
>
>   table=10(ls_out_stateful    ), priority=110  , match=(reg0[1] == 1 && 
> reg0[13] == 0 && reg8[21] == 1), action=(ct_commit { ct_mark.blocked = 0; 
> ct_mark.allow_established = reg0[20]; ct_label.acl_id = reg2[16..31]; 
> ct_label.network_function_group = 1; ct_label.network_function_group_id = 
> reg5[0..7]; }; next;)
>   table=10(ls_out_stateful    ), priority=110  , match=(reg0[1] == 1 && 
> reg0[13] == 1 && reg8[21] == 1), action=(ct_commit { ct_mark.blocked = 0; 
> ct_mark.allow_established = reg0[20]; ct_mark.obs_stage = reg8[19..20]; 
> ct_mark.obs_collector_id = reg8[8..15]; ct_label.obs_point_id = reg9; 
> ct_label.acl_id = reg2[16..31]; ct_label.network_function_group = 1; 
> ct_label.network_function_group_id = reg5[0..7]; }; next;)
>   table=10(ls_out_stateful    ), priority=100  , match=(reg0[1] == 1 && 
> reg0[13] == 0), action=(ct_commit { ct_mark.blocked = 0; 
> ct_mark.allow_established = reg0[20]; ct_label.acl_id = reg2[16..31]; 
> ct_label.network_function_group = 0; ct_label.network_function_group_id = 0; 
> }; next;)
>   table=10(ls_out_stateful    ), priority=100  , match=(reg0[1] == 1 && 
> reg0[13] == 1), action=(ct_commit { ct_mark.blocked = 0; 
> ct_mark.allow_established = reg0[20]; ct_mark.obs_stage = reg8[19..20]; 
> ct_mark.obs_collector_id = reg8[8..15]; ct_label.obs_point_id = reg9; 
> ct_label.acl_id = reg2[16..31]; ct_label.network_function_group = 0; 
> ct_label.network_function_group_id = 0; }; next;)
>   table=10(ls_out_stateful    ), priority=0    , match=(1), action=(next;)
>
> out_network_function: A new stage that has priority 99 rules to redirect
> packets by setting the outport to the NF “outport” (or its child port) based on
> the NFG id set by the prior ACL stage, and then injecting them back to ingress.
> Priority 100 rules ensure that when the packets are going to NF ports, they 
> are not redirected again.
> Priority 1 rule: if reg8[21] is set, but the NF port (or child port) is not 
> present on this LS, drop packets.
>
>   table=11(ls_out_network_function), priority=100  , match=(outport == 
> "nfp1"), action=(next;)
>   table=11(ls_out_network_function), priority=100  , match=(outport == 
> "nfp2"), action=(next;)
>   table=11(ls_out_network_function), priority=100  , match=(reg8[21] == 1 && 
> eth.mcast), action=(next;)
>   table=11(ls_out_network_function), priority=99   , match=(reg8[21] == 1 && 
> reg8[22] == 1 && reg5[0..7] == 123), action=(outport = "nfp2"; reg8[23] = 1; 
> next(pipeline=ingress, table=29);)
>   table=11(ls_out_network_function), priority=1    , match=(reg8[21] == 1), 
> action=(drop;)
>   table=11(ls_out_network_function), priority=0    , match=(1), action=(next;)
>
>
> in_l2_lkup: As described earlier, the priority 100 rule will forward these 
> packets.
>
> Then the same packet comes out from nfp1 and goes through the ingress 
> processing where the outport gets set to p1. The egress pipeline out_pre_acl 
> priority 110 rule described earlier, matches against inport as nfp1 and 
> directly jumps to the stage after out_network_function. Thus the packet is 
> not redirected again.
>
> 4.2.2 Response processing
> -------------------------
> in_acl_eval: High priority rules that allow response and related packets to 
> go through have been enhanced to also copy CT label NFG bit into reg8[21].
>
>   table=8(ls_in_acl_eval), priority=65532, match=(!ct.est && ct.rel && !ct.new
>  && !ct.inv && ct_mark.blocked == 0), action=(reg0[17] = 1; reg8[21] =
> ct_label.network_function_group; reg8[16] = 1; ct_commit_nat;)
>   table=8 (ls_in_acl_eval), priority=65532, match=(ct.est && !ct.rel && !ct.new
>  && !ct.inv && ct.rpl && ct_mark.blocked == 0), action=(reg0[9] = 0; reg0[10]
> = 0; reg0[17] = 1; reg8[21] = ct_label.network_function_group; reg8[16] = 1;
> next;)
>
> in_network_function: Priority 99 rule matches on the nfg_id in ct_label and 
> sets the outport to the NF “inport”.
> Priority 100 rule forwards all packets to NF ports to the next table.
>   table=22(ls_in_network_function), priority=99   , match=(reg8[21] == 1 && 
> reg8[22] == 0 && ct_label.network_function_group_id == 123), action=(outport 
> = "nfp1"; output;)
>
>
> 5. Cross-host Traffic for VLAN Network
> ======================================
> For overlay subnets, all cross-host traffic exchanges are tunneled. In the
> case of VLAN subnets, there needs to be special handling to selectively
> tunnel only the traffic to or from the NF ports.
> Take the example of a from-lport ACL. Packets from p1 to p2 get redirected
> to nfp1 on host1. If such a packet is simply sent out from host1, the physical
> network will directly forward it to host2 where VM2 is. So, we need to tunnel
> the redirected packets from host1 to host3. Now, once the packets come out of
> nfp2, if host3 sends the packets out, the physical network would learn p1's
> MAC coming from host3. So, these packets need to be tunneled back to host1.
> From there the packet would be forwarded to VM2 via the physical network.
>
>       -----                  -----                  --------
>      | VM2 |                | VM1 |                | NF VM  |
>       -----                  -----                  --------
>        / \                     |                    / \   |
>         | (7)                  |  (1)             (3)|    |(4)
>         |                     \ /                    |   \ /
>   --------------        --------------   (2)    ---------------
>  |      p2      |  (6) |      p1      |______\ |   nfp1  nfp2  |
>  |              |/____ |              |------/ |               |
>  |    host2     |\     |     host1    |/______ |     host3     |
>  |              |      |              |\------ |               |
>   --------------        --------------   (5)    --------------
>
> The above figure shows the request packet path for a from-lport ACL. Response 
> would follow the same path in reverse direction.
>
> To achieve this, the following would be done:
>
> On host where the ACL port group members are present (host1)
> -------------------------------------------------------------
> REMOTE_OUTPUT (table 42):
> Currently, it tunnels traffic destined to all non-local overlay ports to 
> their associated hosts. The same rule is now also added for traffic to 
> non-local NF ports. Thus the packets from p1 get tunneled to host 3.
>
> On host with NF (host3) forward packet to nfp1
> -----------------------------------------------
> Upon reaching host3, the following rules come into play:
> PHY_TO_LOG (table 0):
> Priority 100: Existing rule - for each geneve tunnel interface on the
> chassis, it copies info from the header into the inport, outport and metadata
> registers. Now the same rule also stores the tunnel interface id in a register
> (reg5[16..31]).
>
> CHECK_LOOPBACK (table 44)
> This table has a rule that clears all the registers. The change is to skip 
> the clearing of reg5[16..31].
>
> Logical egress pipeline:
>
> ls_out_stateful priority 120: If the outport is an NF port, copy reg5[16..31]
> (table 0 had set it) to ct_label.tun_if_id.
>
>   table=10(ls_out_stateful    ), priority=120  , match=(outport == "nfp1" && 
> reg0[13] == 0), action=(ct_commit { ct_mark.blocked = 0; 
> ct_mark.allow_established = reg0[20]; ct_label.acl_id = reg2[16..31]; 
> ct_label.tun_if_id = reg5[16..31]; }; next;)
>   table=10(ls_out_stateful    ), priority=120  , match=(outport == "nfp1" && 
> reg0[13] == 1), action=(ct_commit { ct_mark.blocked = 0; 
> ct_mark.allow_established = reg0[20]; ct_label.acl_id = reg2[16..31]; 
> ct_mark.obs_stage = reg8[19..20]; ct_mark.obs_collector_id = reg8[8..15]; 
> ct_label.obs_point_id = reg9; ct_label.tun_if_id = reg5[16..31]; }; next;)
>
> The above sequence of flows ensures that if a packet is received via tunnel on
> host3 with outport as nfp1, the tunnel interface id is committed to the CT
> entry in nfp1's zone.
>
> On host with NF (host3) tunnel packets from nfp2 back to host1
> ---------------------------------------------------------------
> When the same packet comes out of nfp2 on host3:
>
> LOCAL_OUTPUT (table 43)
> When the packet comes out of the other NF port (nfp2), the following two rules
> send it back to the host that it originally came from:
>
> Priority 110: For each NF port local to this host, the following rule processes
> the packet through the CT of the linked port (for nfp2, it is nfp1):
>   match: inport==nfp2 && RECIRC_BIT==0
>   action: RECIRC_BIT = 1, ct(zone=nfp1’s zone, table=LOCAL), resubmit to 
> table 43
>
> Priority 109: For each {tunnel_id, NF port} on this host, if the tun_if_id in
> ct_label matches the tunnel_id, send the recirculated packet using that tunnel_id:
>   match: inport==nfp1 && RECIRC_BIT==1 && ct_label.tun_if_id==<tun-id>
>   action: tunnel packet using tun-id
>
> If p1 and nfp1 happen to be on the same host, the tun_if_id would not be set 
> and thus none of the priority 109 rules would match. It would be forwarded 
> the usual way matching the existing priority 100 rules in LOCAL_TABLE.
>
> Special handling of the case where the NF responds back on nfp1, instead of
> forwarding the packet out of nfp2:
> For example, a SYN packet from p1 got redirected to nfp1. Then the NF, which
> is a firewall VM, drops the SYN and sends an RST back on port nfp1. In this
> case, looking up the linked port's (nfp2) CT zone will not give anything.
> ct.inv is used to identify such scenarios, and nfp1’s CT zone is used to send
> the packet back. To achieve this, the following two rules are installed:
>
> in_network_function:
> The priority 100 rule that allows packets coming in from NF-type ports is
> enhanced with an additional action to store the tun_if_id from ct_label into
> reg5[16..31].
>   table=22(ls_in_network_function), priority=100  , match=(inport == "nfp1"), 
> action=(reg5[16..31] = ct_label.tun_if_id; next;)
>
> LOCAL_OUTPUT (table 43)
> Priority 110 rule: for recirculated packets, if the CT lookup (in the linked
> port's zone) is invalid, use the tun id from reg5[16..31] to tunnel the packet
> back to host1 (as the CT zone info has been overwritten by the above priority
> 110 rule in table 43).
>       match: inport==nfp1 && RECIRC_BIT==1 && ct.inv &&
> MFF_LOG_TUN_OFPORT==<tun-id>
>       action: tunnel packet using tun-id
>
>
> 6. NF insertion across logical switches
> =======================================
> If the port-group where the ACL is being applied has members across multiple
> logical switches, there needs to be an NF port pair on each of these switches.
> The NF VM will have only one inport and one outport. The CMS is expected to 
> create child ports linked to these ports on each logical switch where 
> port-group members are present.
> The network-function entity would be configured with the parent ports only. 
> When CMS creates the child ports, it does not need to change any of the NF, 
> NFG or ACL config tables.
> When northd configures the redirection rules for a specific LS, it will use 
> the parent or child port depending on what it finds on that LS.
>                                      --------
>                                     | NF VM  |
>                                      --------
>                                      |      |
>           -----                      |      |              -----
>          | VM1 |                    nfp1   nfp2           | VM2 |
>           -----    |     |         --------------          -----    |      |
>             |      |     |        |    SVC LS    |           |      |      |
>           p1|  nfp1_ch1  nfp2_ch1  --------------          p3|  nfp1_ch2  nfp2_ch2
>           --------------------                             --------------------
>          |         LS1        |                           |        LS2         |
>           --------------------                             --------------------
>
> In this example, the CMS created the parent ports for the NF VM on LS named 
> SVC LS. The ports are nfp1 and nfp2. The CMS configures the NF using these 
> ports:
> ovn-nbctl network-function-add nf1 nfp1 nfp2
> ovn-nbctl network-function-group-add nfg1 nf1
> ovn-nbctl acl-add pg1 from-lport 200 'inport==@pg1 && ip4.dst == $as2' 
> allow-related nfg1
>
> The port group to which the ACL is applied is pg1 and pg1 has two ports: p1 
> on LS1 and p3 on LS2.
> The CMS needs to create child ports for the NF ports on LS1 and LS2. On LS1: 
> nfp1_ch1 and nfp2_ch1. On LS2: nfp1_ch2 and nfp2_ch2
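>
> As a sketch of one possible way to create these (assuming the existing
> lsp-add parent/tag_request mechanism is what is used for child ports;
> tag_request 0 lets northd pick a tag, and the exact mechanism is up to the
> CMS and this series):
>
> ovn-nbctl lsp-add LS1 nfp1_ch1 nfp1 0
> ovn-nbctl lsp-add LS1 nfp2_ch1 nfp2 0
> ovn-nbctl lsp-add LS2 nfp1_ch2 nfp1 0
> ovn-nbctl lsp-add LS2 nfp2_ch2 nfp2 0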
>
> When northd creates rules on LS1, it would use nfp1_ch1 and nfp2_ch1.
>
>   table=22(ls_in_network_function), priority=100  , match=(inport == 
> "nfp2_ch1"), action=(reg5[16..31] = ct_label.tun_if_id; next;)
>   table=22(ls_in_network_function), priority=99   , match=(reg8[21] == 1 && 
> reg8[22] == 1 && reg5[0..7] == 1), action=(outport = "nfp1_ch1"; output;)
>
> When northd is creating rules on LS2, it would use nfp1_ch2 and nfp2_ch2.
>   table=22(ls_in_network_function), priority=100  , match=(inport == 
> "nfp2_ch2"), action=(reg5[16..31] = ct_label.tun_if_id; next;)
>   table=22(ls_in_network_function), priority=99   , match=(reg8[21] == 1 && 
> reg8[22] == 1 && reg5[0..7] == 1), action=(outport = "nfp1_ch2"; output;)
>
>

Hi Sragdhara,

Sorry for the late reviews on this patch series.  I haven't looked
into the series yet.  I plan to take a look this week.  Is it possible
to rebase and submit v3, as it has conflicts?

Thanks
Numan

> 7. Health Monitoring
> ====================
> The LB health monitoring functionality has been extended to support NFs.
> Network_Function_Group has a list of Network_Functions, each of which has a
> reference to Network_Function_Health_Check that holds the monitoring config.
> There is a corresponding SB Service_Monitor row maintaining the online/offline
> status. When the status changes, northd picks one of the “online” NFs and sets
> it in the network_function_active field of the NFG. The redirection rule in the
> LS uses the ports from this NF.
>
> ovn-controller performs the health monitoring by sending an ICMP echo request
> with the source IP and MAC from the NB global options “svc_monitor_ip4” and
> “svc_monitor_mac”, and the destination IP and MAC from the new NB global options
> “svc_monitor_ip4_dst” and “svc_monitor_mac_dst”. The sequence number and id
> are randomly generated and stored in service_mon. The NF VM forwards the same
> packet out of the other port. When it comes out, ovn-controller matches the
> sequence number and id with the stored values and marks the NF online if they match.
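>
> A minimal sketch of wiring up the monitoring config with the generic
> ovn-nbctl database commands (the option values are arbitrary examples; the
> series may also add dedicated nbctl commands for this):
>
> ovn-nbctl -- --id=@hc create Network_Function_Health_Check name=nf1-hc \
>     options:interval=5 options:timeout=3 \
>     options:success_count=3 options:failure_count=3 \
>     -- set Network_Function nf1 health_check=@hc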
>
> V1:
>   - First patch.
>
> V2:
>   - Rebased code.
>   - Added "mode" field in Network_function_group table, with only allowed
>     value as "inline". This is for future expansion to include "mirror" mode.
>   - Added a flow in the in_network_function and out_network_function table to
>     skip redirection of multicast traffic.
>
> Sragdhara Datta Chaudhuri (5):
>   ovn-nb: Network Function insertion OVN-NB schema changes
>   ovn-nbctl: Network Function insertion commands.
>   northd, tests: Network Function insertion logical flow programming.
>   controller, tests: Network Function insertion tunneling of cross-host
>     VLAN traffic.
>   northd, controller: Network Function Health monitoring.
>
>  controller/physical.c        | 249 ++++++++++-
>  controller/pinctrl.c         | 252 +++++++++--
>  include/ovn/logical-fields.h |  16 +-
>  lib/logical-fields.c         |  26 ++
>  lib/ovn-util.h               |   2 +-
>  northd/en-global-config.c    |  75 ++++
>  northd/en-global-config.h    |  12 +-
>  northd/en-multicast.c        |   2 +-
>  northd/en-northd.c           |   8 +
>  northd/en-sync-sb.c          |  16 +-
>  northd/inc-proc-northd.c     |   6 +-
>  northd/northd.c              | 789 +++++++++++++++++++++++++++++++++--
>  northd/northd.h              |  39 +-
>  ovn-nb.ovsschema             |  64 ++-
>  ovn-nb.xml                   | 123 ++++++
>  ovn-sb.ovsschema             |  12 +-
>  ovn-sb.xml                   |  22 +-
>  tests/ovn-controller.at      |   6 +-
>  tests/ovn-nbctl.at           |  83 ++++
>  tests/ovn-northd.at          | 508 ++++++++++++++++------
>  tests/ovn.at                 | 137 ++++++
>  utilities/ovn-nbctl.c        | 533 ++++++++++++++++++++++-
>  22 files changed, 2747 insertions(+), 233 deletions(-)
>
> --
> 2.39.3
>

Hello Sragdhara and Numan,
first of all sorry for the huge delay in reviews.
We are currently working on the design of service function
chaining that would be used primarily by ovn-k [0]. I hope we will
be able to align both efforts into something that is applicable to both
use cases. Please take a look at the document if you can.

[0] https://docs.google.com/document/d/1dLdpx_9ZCnjHHldbNZABIpJF_GXd69qb 
Thanks,
Ales
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
