John, On Thu, Apr 6, 2017 at 9:15 AM, John McDowall < [email protected]> wrote:
> Mickey, > > > > Thanks, here is what I propose, based on your comments: > > > > 1. Support for non-ip traffic : I will make the agreed on changes. > > 2. Match Filter: Will make the changes > Not sure that there are any changes to make other than documentation, unless you found some magic solution :-). Are you referring to documentation? > 3. Support for VNF’s that change src/dst: Will leave as an open > issue for now and look for relevant use cases. > OK. > 4. Multiple classifiers: Will limit each port(application ) to a > single chain for now, to prevent any errors. If there are use cases for > multiple chains we can revisit > Would you actually check and limit? Or just document? > 5. Mac Learning Issue: I looked at using other tunnels but as you > correctly point out it adds a lot of complexity for not a huge amount of > value. I think it is better to use the current Geneve overlay in OVN. I > will make the changes to add a new register flag for chaining and have the > flows exit the chain from the original src so the mac addresses are correct. > If you advance to an ingress table after S_SWITCH_IN_CHAIN, then you do not need the flag. I saw the ability to start at a later table after I wrote the flag comment, and forgot that that makes the flag unnecessary. Mickey > > > Regards > > > > John > > > > > > > > *From: *Mickey Spiegel <[email protected]> > *Date: *Wednesday, April 5, 2017 at 4:02 PM > *To: *John McDowall <[email protected]> > *Cc: *ovs dev <[email protected]> > *Subject: *Re: [ovs-dev] [ovs-dev, RFC] ovn: Revised support for service > function chaining > > > > John, > > > > On Tue, Apr 4, 2017 at 10:03 AM, John McDowall < > [email protected]> wrote: > > Mickey, > > > > Here are the proposed changes to address issues you brought up, if these > changes are acceptable I will add them and re-submit the patch. > > > > 1. Support for Non-IP Traffic: The current code uses ipv4.src and > ipv4.dst to identify flow source and destination. Change this is eth.src > and eth.dst, there is no reason to require IP. This also addresses IPv6 > issues. If a user wants to add additional IP filtering they can add it to > the “match” filter. > > This would resolve the issue with non-IP traffic. > > > > Do you want to support VNFs that modify the source or destination ethernet > address? > > If so, that would require further investigation and additional logic. > > > > > 1. Match filter: Can cause conflicts but not really any different from > ACL where user can construct conflicting filters. I will try and add some > light checking (warnings) to catch any major conflicts. > > Are you referring to this issue that I had mentioned? > > > > a) If two different logical port chain classifiers with different "match" > were specified on the same port, with different logical port chains that > share at least one logical port pair, then the IP address does not > distinguish between the two logical port chains. > > > > This seems like a tricky issue. I don't see any way to resolve it unless > you use a tunneling format that specifies a chain ID such as NSH. > > > > If it is not resolved, then people have to think carefully whenever they > specific a "match" condition. IMO this is less obvious than ACL conflicts, > since it involves interactions between the "match" conditions and the > logical port chain definitions. > > > > > 1. End of the service chain (MAC Learning Issue): I will add a new > rule at the end of the chain to return the flow to the original src to > ensure that the packet header is “correct” when leaving the host. I think > I can set the recirculate flag at the end of the chain and then detect the > flag and send it to the original destination.This will ensure that no > packet leaves OVN with an “incorrect” header. > > When you say "original src", I assume you mean the location where eth.src > resides? > > > > I am not sure what you mean by the "recirculate flag"? > > Are you referring to "flags.loopback"? Lots of things set that including > DHCP and ARP responder, so you cannot infer much from the fact that the > flag is set. In any case, that just lets the outport be the same as the > inport. It does not determine what the outport is. > > > > To do item #3 I need some help composing the rule to recirculate the flow > back to the original host - can someone point me to a good example or > sample code? > > > > I assume you want the logic in build_chain to be something like: > > > > + if (j == (lpc->n_port_pair_groups-1)) { > > /* Not sure why you are using xasprintf rather than using > > * OVS dynamic strings like everything else in ovn-northd.c */ > > + lcc_action = xasprintf("outport=%s; output;", > > + traffic_port->json_key); > > + } > > > > The big question is whether you want to use the normal geneve tunnels that > already exist between OVN chassis (which is what will happen with the > simple logic that I proposed above), or you want to add SFC specific > tunnels. If you use the normal geneve tunnels, then you end up on the > egress pipeline at the hypervisor where eth.src resides. If you add your > own tunnels, you might be able to carry additional information or inject to > the ingress pipeline instead, but it will add lots of complexity. > > > > To turn the packet around after it arrives at the hypervisor where eth.src > resides in the egress pipeline, you would have to do "egress loopback". I > proposed this for distributed NAT, and Ben developed some patches to give > us the actions that we need to make it happen. > > > > See the Egress Loopback table code in ovn/northd/ovn-northd.c starting at > line 4644 (this is for distributed routers, you would have to add something > similar for logical switches running SFC). The key part is the actions: > > > > ds_clear(&actions); > > ds_put_format(&actions, > > "clone { ct_clear; " > > "inport = outport; outport = \"\"; " > > "flags = 0; flags.loopback = 1; "); > > for (int i = 0; i < MFF_N_LOG_REGS; i++) { > > ds_put_format(&actions, "reg%d = 0; ", i); > > } > > ds_put_format(&actions, REGBIT_EGRESS_LOOPBACK" = 1; " > > "next(pipeline=ingress, table=0); };"); > > > > Instead of REGBIT_EGRESS_LOOPBACK, you would have to define a different > REGBIT that you can match on, in order to avoid hitting the logical port > chain classifier again. > > > > You might also want to adjust the table to start at in the ingress > pipeline, in order to skip the tables that were already processed before > S_SWITCH_IN_CHAIN. > > > > I do not remember why I set flags.loopback, not sure whether that is > necessary. > > > > I guess you would want to do this before the other egress pipeline stages, > since ingress traffic is not subject to egress ACLs, etc. > > > > Where I have a bit of trouble and raised my concerns about security is the > match conditions for egress loopback. What would they be in case of SFC? > > Just (outport == traffic_port && eth.src == traffic_logical_port_ea)? > > Maybe with a higher priority rule matching eth.dst == > traffic_logical_port_ea with action "next; " to avoid looping in case both > eth.src and eth.dst == traffic_logical_port_ea? > > > > Mickey > > > > > > Regards > > > > John > > > > > > *From: *Mickey Spiegel <[email protected]> > *Date: *Wednesday, March 15, 2017 at 6:24 PM > *To: *John McDowall <[email protected]> > *Cc: *ovs dev <[email protected]> > *Subject: *Re: [ovs-dev] [ovs-dev, RFC] ovn: Revised support for service > function chaining > > > > > > On Mon, Mar 13, 2017 at 1:28 PM, John McDowall < > [email protected]> wrote: > > This patch set is an alternative implementation of service function > chaining (SFC) for OVS/OVN. The major change from the previous patch is > that the overloading of the ACL stage in ovn-northd.c has been removed > and replaced with additional logic in the CHAIN stage. > > This was done to improve modularity of the code as it was felt that > overloading the ACL stage did not add a lot and made it hard to compose > reusable SFCs. > > The new approach can still use the matching logic in OVN as a match > argument has been added. This has not been fully implemented yet as it > may need some error checking to ensure the match does not conflicit > with the lport in the chain and the bi-directional case. > > The other major change is the logic for the final destination of the > flows exiting the service chain. This has been simplified such that the > rules in the chain stage determine when the flow is exiting the port-chain > and then the flow just follows the normal path to the src or dst of the > flow. > > Areas to review > > 1) Logic for delivering flow after it leaves the chain. I think it is now > general and should work across subnets etc. > > > > This revision does address one of my concerns with the previous revision, > specifically the previous proposal to specify a 'last_hop_port' at the > end of the logical port chain. > > > > However, it does not address my other major concern about the behavior at > the end of the service chain, the interaction with the way OVN forwards > traffic to the outside world. > > > > > > Repeating my comments from the last revision: > > > > In OVN without SFC, traffic is forwarded through a logical switch to a > logical switch port of type 'localnet' that effectively resides on all > hypervisors, that then forwards to the physical L2 network. When a VM > residing on that same logical switch originates traffic destined for the > outside world, it is directed to the local instance of the type 'localnet' > logical switch port on the VM's hypervisor. The physical L2 network learns > that the VM's MAC address resides on that hypervisor, and forwards all > traffic to that MAC address to that specific hypervisor. Typically the > physical L2 network first learns the VM's MAC address due to a gratuitous > ARP packet sent by the ovn-controller on the VM's hypervisor. > > > > In this SFC proposal, at the end of the logical port chain, traffic is > forwarded directly to the destination. This might work for destinations > attached directly to OVN, but it would break forwarding to the physical L2 > network: > > > > 1. If the last port pair group in the logical port chain contains more > than one port pair spread across more than one chassis, then this would > completely break upstream L2 learning. Traffic would have to be forwarded > to a common point before being sent upstream. > > > > 2. What if the SFC classifier granularity is finer than per source VM, and > the multiple logical port chains (that one VM's traffic can be redirected > to) end at different chassis? > > > > 3. The gratuitous ARP packets generated by OVN for VIFs residing on the > same logical switch as the 'localnet' port are sent from each VIF's > chassis. This may be different than the chassis at the end of the logical > port chain. > > > > There are some topologies where this SFC proposal will work even for > traffic destined for the outside world: if there are no VIFs on the same > logical switch as the 'localnet' port, and either a gateway router is used > or a distributed router with only centralized NAT rules is used. If a > distributed router is specified with distributed NAT rules, then point 3 > above gets even worse, since ARP replies for a distributed NAT external IP > address are restricted to the chassis where the corresponding VIF resides. > > > > My preference is to return to the originating hypervisor at the end of the > logical port chain. With such an approach, L2 and L3 forwarding are > unaffected. > However, there is a question how to determine from a packet at the end of > the logical port chain, what is the originating hypervisor? > > The best answer is to use an encapsulation such as Geneve or NSH that can > carry context information. > Another alternative is to do a lookup of the source address when the chain > was entered in the "exit-lport" direction. > > > > > > An example to clarify the problem: > > > > Single Logical Switch LS1. > > VM1 on HV1 with MAC1, IP1. > > Logical Port Chain LPC1 with one Logical Port Pair Group LPPG1. > > LPPG1 has two Logical Port Pairs LPP1 and LPP2, residing on HV2 and HV3 > respectively. > > > > When VM1 comes up, HV1 will send a gratuitous ARP packet out the localnet > interface on HV1 to the physical L2 network, using MAC1 and IP1. > > The physical L2 network will learn that MAC1 is on HV1. > > > > VM1 sends a packet to the outside world. > > The packet gets redirected down LPC1 to LPP1 on HV2. Using the logic > below, the packet will go through a normal L2 destination lookup on HV2, so > it will be sent out the localnet interface on HV2 to the physical L2 > network. > > The physical L2 network will now learn that MAC1 is on HV2. > > > > VM1 sends a packet for a different flow to the outside world. > > The packet gets redirected down LPC1 to LPP2 on HV3. Using the logical > below, the packet will go through a normal L2 destination lookup on HV3, so > it will be sent out the localnet interface on HV3 to the physical L2 > network. > > The physical L2 network will now learn that MAC1 is on HV3. > > > > > > 2) Do the command names make sense - currently rather long and complex. > 3) Using the match to classify flows to a finer granularity is mainly > useful for uni-directional flows but could lead to conflicits with > bi-directional flows (I think) and suggestions on how to make this better > would be appreciated. > > > > I wonder about the implicit classification in this patch. > > > > If I am reading the code correctly, when a user specifies a "port" in a > "Logical_Port_Chain_Classifier", that port is not used directly as match > criteria. Instead, the first IP address of the port is used in match > criteria, not just for the initial classification into the logical port > chain, but also for subsequent hops along the port chain. > > > > The primary benefit of this approach is that it allows a VNF to be used in > different logical port chains. Since the IP address is used in the match > criteria, different logical port chain classifiers on different ports with > different logical port chains can use the same logical port pair. > > > > This approach also allows for "entry-lport" logical port chain classifiers > to be applied in the ingress pipeline, before the egress port is determined. > > > > I can think of several issues with this approach: > > > > a) If two different logical port chain classifiers with different "match" > were specified on the same port, with different logical port chains that > share at least one logical port pair, then the IP address does not > distinguish between the two logical port chains. > > > > b) This approach does not support non-IP traffic. > > > > c) This approach does not support VNFs that modify IP addresses such as > load balancers or NAT. > > > > Another issue is that OVN allows a port to have multiple LSP addresses, > each specifying any number of IP addresses. This can be fixed in the code > below. > > > > In terms of interaction with "match", in general it should not be a > problem. Just concatenate (" && %s", implicit match criteria) with the user > specified "match". For bidirectional logical port chains, it would be up to > the user to determine that the "match" criteria makes sense, e.g. does not > include any "source" or "destination" matches. Otherwise the user should > use "uni-directional" logical port chains. > > > > > Current todo list > > 1) I have standalone tests need to add tests to ovs/ovn framework. > 2) Add IPv6 support > 3) Implement "match" in the CHAIN stage. > 4) Load-balancing support for port-pair-groups > 5) Publish more detailed examples. > > Simple test case using ovn-trace > > #!/bin/sh > # > clear > ovn-nbctl ls-add swt1 > > ovn-nbctl lsp-add swt1 swt1-appc > ovn-nbctl lsp-add swt1 swt1-apps > ovn-nbctl lsp-add swt1 swt1-vnfp1 > ovn-nbctl lsp-add swt1 swt1-vnfp2 > > ovn-nbctl lsp-set-addresses swt1-appc "00:00:00:00:00:01 192.168.33.1" > ovn-nbctl lsp-set-addresses swt1-apps "00:00:00:00:00:02 192.168.33.2" > ovn-nbctl lsp-set-addresses swt1-vnfp1 00:00:00:00:00:03 > ovn-nbctl lsp-set-addresses swt1-vnfp2 00:00:00:00:00:04 > # > # Configure Service chain > # > ovn-nbctl lsp-pair-add swt1 swt1-vnfp1 swt1-vnfp2 pp1 > ovn-nbctl lsp-chain-add swt1 pc1 > ovn-nbctl lsp-pair-group-add pc1 ppg1 > ovn-nbctl lsp-pair-group-add-port-pair ppg1 pp1 > ovn-nbctl lsp-chain-classifier-add swt1 pc1 swt1-appc "exit-lport" > "bi-directional" "" pcc1 > # > ovn-sbctl dump-flows > # > # Run trace command > printf "\n---------Flow 1 -------------\n\n" > ovn-trace --detailed swt1 'inport == "swt1-appc" && eth.src == > 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02 && ip4.src == > 192.168.33.1 && ip4.dst == 192.168.33.2' > printf "\n---------Flow 2 -------------\n\n" > ovn-trace --detailed swt1 'inport == "swt1-vnfp1" && eth.src == > 00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02 && ip4.src == > 192.168.33.1 && ip4.dst == 192.168.33.2' > printf "\n---------Flow 3 -------------\n\n" > ovn-trace --detailed swt1 'inport == "swt1-apps" && eth.dst == > 00:00:00:00:00:01 && eth.src == 00:00:00:00:00:02 && ip4.dst == > 192.168.33.1 && ip4.src == 192.168.33.2' > printf "\n---------Flow 4 -------------\n\n" > ovn-trace --detailed swt1 'inport == "swt1-vnfp2" && eth.dst == > 00:00:00:00:00:01 && eth.src == 00:00:00:00:00:02 && ip4.dst == > 192.168.33.1 && ip4.src == 192.168.33.2' > # > # Cleanup > # > ovn-nbctl lsp-chain-classifier-del pcc1 > ovn-nbctl lsp-pair-group-del ppg1 > ovn-nbctl lsp-chain-del pc1 > ovn-nbctl lsp-pair-del pp1 > ovn-nbctl ls-del swt1 > > Co-authored-by: Flavio Fernandes <flavio at flaviof.com > <https://urldefense.proofpoint.com/v2/url?u=http-3A__flaviof.com&d=DwMFaQ&c=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo&r=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ&m=yLQ5jFc-CTEYRKuasTBad0BHVwQnNkrMyXaQgEkryqw&s=q-H4fUtyXYXLrq0zqHd2wDd4Uk00KfFClqzQGDa7Vgs&e=> > > > Reported at: https://mail.openvswitch.org/pipermail/ovs-discuss/2016- > March/040381.html > <https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2016-2DMarch_040381.html&d=DwMFaQ&c=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo&r=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ&m=yLQ5jFc-CTEYRKuasTBad0BHVwQnNkrMyXaQgEkryqw&s=Az85h2_odJeFPHWmi4r6wR9sLyAxO-AqWaCfFT_tsiw&e=> > Reported at: https://mail.openvswitch.org/pipermail/ovs-discuss/2016- > May/041359.html > <https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2016-2DMay_041359.html&d=DwMFaQ&c=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo&r=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ&m=yLQ5jFc-CTEYRKuasTBad0BHVwQnNkrMyXaQgEkryqw&s=IPTcf9byltkg5tQveUq31LIie5mFEqro5Sm_VKcxifs&e=> > > Signed-off-by: John McDowall <jmcdowall at paloaltonetworks.com> > --- > ovn/northd/ovn-northd.8.xml | 60 ++- > ovn/northd/ovn-northd.c | 293 ++++++++++- > ovn/ovn-architecture.7.xml | 91 ++++ > ovn/ovn-nb.ovsschema | 87 +++- > ovn/ovn-nb.xml | 187 +++++++ > ovn/utilities/ovn-nbctl.8.xml | 218 ++++++++ > ovn/utilities/ovn-nbctl.c | 1133 ++++++++++++++++++++++++++++++ > +++++++++++ > 7 files changed, 2042 insertions(+), 27 deletions(-) > > _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
