Re: [ovs-dev] [ovs-dev, RFC] ovn: Revised support for service function chaining

John McDowall Thu, 06 Apr 2017 10:03:07 -0700

Mickey,

2. Documentation and clean up the code.
4. Will put a check in the code and issue an error also will add to the 
documentation.
5. OK – will make the changes and write some tests to verify

Really appreciate all the help.

Regards

John

From: Mickey Spiegel <[email protected]>
Date: Thursday, April 6, 2017 at 9:37 AM
To: John McDowall <[email protected]>
Cc: ovs dev <[email protected]>
Subject: Re: [ovs-dev] [ovs-dev, RFC] ovn: Revised support for service function 
chaining

John,

On Thu, Apr 6, 2017 at 9:15 AM, John McDowall 
<[email protected]<mailto:[email protected]>> wrote:
Mickey,

Thanks, here is what I propose, based on your comments:

1.       Support for non-ip traffic : I will make the agreed on changes.

2.       Match Filter: Will make the changes
Not sure that there are any changes to make other than documentation, unless 
you found some magic solution :-). Are you referring to documentation?

3.       Support for VNF’s that change src/dst: Will leave as an open issue for 
now and look for relevant use cases.
OK.

4.       Multiple classifiers: Will limit each port(application ) to a single 
chain for now, to prevent any errors. If there are use cases for multiple 
chains we can revisit
Would you actually check and limit?
Or just document?

5.       Mac Learning Issue: I looked at using other tunnels but as you 
correctly point out it adds a lot of complexity for not a huge amount of value. 
I think it is better to use the current Geneve overlay in OVN. I will make the 
changes to add a new register flag for chaining and have the flows exit the 
chain from the original src so the mac addresses are correct.
If you advance to an ingress table after S_SWITCH_IN_CHAIN, then you do not 
need the flag. I saw the ability to start at a later table after I wrote the 
flag comment, and forgot that that makes the flag unnecessary.

Mickey

Regards

John

From: Mickey Spiegel <[email protected]<mailto:[email protected]>>
Date: Wednesday, April 5, 2017 at 4:02 PM
To: John McDowall 
<[email protected]<mailto:[email protected]>>
Cc: ovs dev <[email protected]<mailto:[email protected]>>
Subject: Re: [ovs-dev] [ovs-dev, RFC] ovn: Revised support for service function 
chaining

John,

On Tue, Apr 4, 2017 at 10:03 AM, John McDowall 
<[email protected]<mailto:[email protected]>> wrote:
Mickey,

Here are the proposed changes to address issues you brought up,  if these 
changes are acceptable I will add them and re-submit the patch.

  1.  Support for Non-IP Traffic: The current code uses ipv4.src and ipv4.dst 
to identify flow source and destination. Change this is eth.src and eth.dst, 
there is no reason to require IP. This also addresses IPv6 issues. If a user 
wants to add additional IP filtering they can add it to the “match” filter.
This would resolve the issue with non-IP traffic.

Do you want to support VNFs that modify the source or destination ethernet 
address?
If so, that would require further investigation and additional logic.

  1.  Match filter: Can cause conflicts but not really any different from ACL 
where user can construct conflicting filters. I will try and add some light 
checking (warnings) to catch any major conflicts.
Are you referring to this issue that I had mentioned?

a) If two different logical port chain classifiers with different "match" were 
specified on the same port, with different logical port chains that share at 
least one logical port pair, then the IP address does not distinguish between 
the two logical port chains.

This seems like a tricky issue. I don't see any way to resolve it unless you 
use a tunneling format that specifies a chain ID such as NSH.

If it is not resolved, then people have to think carefully whenever they 
specific a "match" condition. IMO this is less obvious than ACL conflicts, 
since it involves interactions between the "match" conditions and the logical 
port chain definitions.

  1.  End of the service chain (MAC Learning Issue): I will add a new rule at 
the end of the chain to return the flow to the original src to ensure that the 
packet header is “correct” when leaving the host. I think I can set the 
recirculate flag at the end of the chain and then detect the flag and send it 
to the original destination.This will ensure that no packet leaves OVN with an 
“incorrect” header.
When you say "original src", I assume you mean the location where eth.src 
resides?

I am not sure what you mean by the "recirculate flag"?
Are you referring to "flags.loopback"? Lots of things set that including DHCP 
and ARP responder, so you cannot infer much from the fact that the flag is set. 
In any case, that just lets the outport be the same as the inport. It does not 
determine what the outport is.

To do item #3 I need some help composing the rule to recirculate the flow back 
to the original host - can someone point me to a good example or sample code?

I assume you want the logic in build_chain to be something like:

+        if (j == (lpc->n_port_pair_groups-1)) {
             /* Not sure why you are using xasprintf rather than using
              * OVS dynamic strings like everything else in ovn-northd.c */
+            lcc_action = xasprintf("outport=%s; output;",
+                                   traffic_port->json_key);
+            }

The big question is whether you want to use the normal geneve tunnels that 
already exist between OVN chassis (which is what will happen with the simple 
logic that I proposed above), or you want to add SFC specific tunnels. If you 
use the normal geneve tunnels, then you end up on the egress pipeline at the 
hypervisor where eth.src resides. If you add your own tunnels, you might be 
able to carry additional information or inject to the ingress pipeline instead, 
but it will add lots of complexity.

To turn the packet around after it arrives at the hypervisor where eth.src 
resides in the egress pipeline, you would have to do "egress loopback". I 
proposed this for distributed NAT, and Ben developed some patches to give us 
the actions that we need to make it happen.

See the Egress Loopback table code in ovn/northd/ovn-northd.c starting at line 
4644 (this is for distributed routers, you would have to add something similar 
for logical switches running SFC). The key part is the actions:

                ds_clear(&actions);
                ds_put_format(&actions,
                              "clone { ct_clear; "
                              "inport = outport; outport = \"\"; "
                              "flags = 0; flags.loopback = 1; ");
                for (int i = 0; i < MFF_N_LOG_REGS; i++) {
                    ds_put_format(&actions, "reg%d = 0; ", i);
                }
                ds_put_format(&actions, REGBIT_EGRESS_LOOPBACK" = 1; "
                              "next(pipeline=ingress, table=0); };");

Instead of REGBIT_EGRESS_LOOPBACK, you would have to define a different REGBIT 
that you can match on, in order to avoid hitting the logical port chain 
classifier again.

You might also want to adjust the table to start at in the ingress pipeline, in 
order to skip the tables that were already processed before S_SWITCH_IN_CHAIN.

I do not remember why I set flags.loopback, not sure whether that is necessary.

I guess you would want to do this before the other egress pipeline stages, 
since ingress traffic is not subject to egress ACLs, etc.

Where I have a bit of trouble and raised my concerns about security is the 
match conditions for egress loopback. What would they be in case of SFC?
Just (outport == traffic_port && eth.src == traffic_logical_port_ea)?
Maybe with a higher priority rule matching eth.dst == traffic_logical_port_ea 
with action "next; " to avoid looping in case both eth.src and eth.dst == 
traffic_logical_port_ea?

Mickey

Regards

John

From: Mickey Spiegel <[email protected]<mailto:[email protected]>>
Date: Wednesday, March 15, 2017 at 6:24 PM
To: John McDowall 
<[email protected]<mailto:[email protected]>>
Cc: ovs dev <[email protected]<mailto:[email protected]>>
Subject: Re: [ovs-dev] [ovs-dev, RFC] ovn: Revised support for service function 
chaining

On Mon, Mar 13, 2017 at 1:28 PM, John McDowall 
<[email protected]<mailto:[email protected]>> wrote:
This patch set is an alternative implementation of service function
chaining (SFC) for OVS/OVN. The major change from the previous patch is
that the overloading of the ACL stage in ovn-northd.c has been removed
and replaced with additional logic in the CHAIN stage.

This was done to improve modularity of the code as it was felt that
overloading the ACL stage did not add a lot and made it hard to compose
reusable SFCs.

The new approach can still use the matching logic in OVN as a match
argument has been added. This has not been fully implemented yet as it
may need some error checking to ensure the match does not conflicit
with the lport in the chain and the bi-directional case.

The other major change is the logic for the final destination of the
flows exiting the service chain. This has been simplified such that the
rules in the chain stage determine when the flow is exiting the port-chain
and then the flow just follows the normal path to the src or dst of the flow.

Areas to review

1) Logic for delivering flow after it leaves the chain. I think it is now
general and should work across subnets etc.

This revision does address one of my concerns with the previous revision, 
specifically the previous proposal to specify a 'last_hop_port' at the end of 
the logical port chain.

However, it does not address my other major concern about the behavior at the 
end of the service chain, the interaction with the way OVN forwards traffic to 
the outside world.

Repeating my comments from the last revision:

In OVN without SFC, traffic is forwarded through a logical switch to a logical 
switch port of type 'localnet' that effectively resides on all hypervisors, 
that then forwards to the physical L2 network.  When a VM residing on that same 
logical switch originates traffic destined for the outside world, it is 
directed to the local instance of the type 'localnet' logical switch port on 
the VM's hypervisor.  The physical L2 network learns that the VM's MAC address 
resides on that hypervisor, and forwards all traffic to that MAC address to 
that specific hypervisor.  Typically the physical L2 network first learns the 
VM's MAC address due to a gratuitous ARP packet sent by the ovn-controller on 
the VM's hypervisor.

In this SFC proposal, at the end of the logical port chain, traffic is 
forwarded directly to the destination.  This might work for destinations 
attached directly to OVN, but it would break forwarding to the physical L2 
network:

1. If the last port pair group in the logical port chain contains more than one 
port pair spread across more than one chassis, then this would completely break 
upstream L2 learning.  Traffic would have to be forwarded to a common point 
before being sent upstream.

2. What if the SFC classifier granularity is finer than per source VM, and the 
multiple logical port chains (that one VM's traffic can be redirected to) end 
at different chassis?

3. The gratuitous ARP packets generated by OVN for VIFs residing on the same 
logical switch as the 'localnet' port are sent from each VIF's chassis.  This 
may be different than the chassis at the end of the logical port chain.

There are some topologies where this SFC proposal will work even for traffic 
destined for the outside world: if there are no VIFs on the same logical switch 
as the 'localnet' port, and either a gateway router is used or a distributed 
router with only centralized NAT rules is used.  If a distributed router is 
specified with distributed NAT rules, then point 3 above gets even worse, since 
ARP replies for a distributed NAT external IP address are restricted to the 
chassis where the corresponding VIF resides.

My preference is to return to the originating hypervisor at the end of the 
logical port chain.  With such an approach, L2 and L3 forwarding are unaffected.
However, there is a question how to determine from a packet at the end of the 
logical port chain, what is the originating hypervisor?
The best answer is to use an encapsulation such as Geneve or NSH that can carry 
context information.
Another alternative is to do a lookup of the source address when the chain was 
entered in the "exit-lport" direction.

An example to clarify the problem:

Single Logical Switch LS1.
VM1 on HV1 with MAC1, IP1.
Logical Port Chain LPC1 with one Logical Port Pair Group LPPG1.
LPPG1 has two Logical Port Pairs LPP1 and LPP2, residing on HV2 and HV3 
respectively.

When VM1 comes up, HV1 will send a gratuitous ARP packet out the localnet 
interface on HV1 to the physical L2 network, using MAC1 and IP1.
The physical L2 network will learn that MAC1 is on HV1.

VM1 sends a packet to the outside world.
The packet gets redirected down LPC1 to LPP1 on HV2. Using the logic below, the 
packet will go through a normal L2 destination lookup on HV2, so it will be 
sent out the localnet interface on HV2 to the physical L2 network.
The physical L2 network will now learn that MAC1 is on HV2.

VM1 sends a packet for a different flow to the outside world.
The packet gets redirected down LPC1 to LPP2 on HV3. Using the logical below, 
the packet will go through a normal L2 destination lookup on HV3, so it will be 
sent out the localnet interface on HV3 to the physical L2 network.
The physical L2 network will now learn that MAC1 is on HV3.

2) Do the command names make sense - currently rather long and complex.
3) Using the match to classify flows to a finer granularity is mainly
useful for uni-directional flows but could lead to conflicits with
bi-directional flows (I think) and suggestions on how to make this better
would be appreciated.

I wonder about the implicit classification in this patch.

If I am reading the code correctly, when a user specifies a "port" in a 
"Logical_Port_Chain_Classifier", that port is not used directly as match 
criteria. Instead, the first IP address of the port is used in match criteria, 
not just for the initial classification into the logical port chain, but also 
for subsequent hops along the port chain.

The primary benefit of this approach is that it allows a VNF to be used in 
different logical port chains. Since the IP address is used in the match 
criteria, different logical port chain classifiers on different ports with 
different logical port chains can use the same logical port pair.

This approach also allows for "entry-lport" logical port chain classifiers to 
be applied in the ingress pipeline, before the egress port is determined.

I can think of several issues with this approach:

a) If two different logical port chain classifiers with different "match" were 
specified on the same port, with different logical port chains that share at 
least one logical port pair, then the IP address does not distinguish between 
the two logical port chains.

b) This approach does not support non-IP traffic.

c) This approach does not support VNFs that modify IP addresses such as load 
balancers or NAT.

Another issue is that OVN allows a port to have multiple LSP addresses, each 
specifying any number of IP addresses. This can be fixed in the code below.

In terms of interaction with "match", in general it should not be a problem. 
Just concatenate (" && %s", implicit match criteria) with the user specified 
"match". For bidirectional logical port chains, it would be up to the user to 
determine that the "match" criteria makes sense, e.g. does not include any 
"source" or "destination" matches. Otherwise the user should use 
"uni-directional" logical port chains.

Current todo list

1) I have standalone tests need to add tests to ovs/ovn framework.
2) Add IPv6 support
3) Implement "match" in the CHAIN stage.
4) Load-balancing support for port-pair-groups
5) Publish more detailed examples.

Simple test case using ovn-trace

#!/bin/sh
#
clear
ovn-nbctl ls-add swt1

ovn-nbctl lsp-add swt1 swt1-appc
ovn-nbctl lsp-add swt1 swt1-apps
ovn-nbctl lsp-add swt1 swt1-vnfp1
ovn-nbctl lsp-add swt1 swt1-vnfp2

ovn-nbctl lsp-set-addresses swt1-appc "00:00:00:00:00:01 192.168.33.1"
ovn-nbctl lsp-set-addresses swt1-apps "00:00:00:00:00:02 192.168.33.2"
ovn-nbctl lsp-set-addresses swt1-vnfp1 00:00:00:00:00:03
ovn-nbctl lsp-set-addresses swt1-vnfp2 00:00:00:00:00:04
#
# Configure Service chain
#
ovn-nbctl lsp-pair-add swt1 swt1-vnfp1 swt1-vnfp2 pp1
ovn-nbctl lsp-chain-add swt1 pc1
ovn-nbctl lsp-pair-group-add pc1 ppg1
ovn-nbctl lsp-pair-group-add-port-pair ppg1 pp1
ovn-nbctl lsp-chain-classifier-add swt1 pc1 swt1-appc "exit-lport" 
"bi-directional" "" pcc1
#
ovn-sbctl dump-flows
#
# Run trace command
printf "\n---------Flow 1 -------------\n\n"
ovn-trace --detailed  swt1 'inport == "swt1-appc" && eth.src == 
00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02 && ip4.src == 192.168.33.1 && 
ip4.dst == 192.168.33.2'
printf "\n---------Flow 2 -------------\n\n"
ovn-trace --detailed  swt1 'inport == "swt1-vnfp1" && eth.src == 
00:00:00:00:00:01 && eth.dst == 00:00:00:00:00:02 && ip4.src == 192.168.33.1 && 
ip4.dst == 192.168.33.2'
printf "\n---------Flow 3 -------------\n\n"
ovn-trace --detailed  swt1 'inport == "swt1-apps" && eth.dst == 
00:00:00:00:00:01 && eth.src == 00:00:00:00:00:02 && ip4.dst == 192.168.33.1 && 
ip4.src == 192.168.33.2'
printf "\n---------Flow 4 -------------\n\n"
ovn-trace --detailed  swt1 'inport == "swt1-vnfp2" && eth.dst == 
00:00:00:00:00:01 && eth.src == 00:00:00:00:00:02 && ip4.dst == 192.168.33.1 && 
ip4.src == 192.168.33.2'
#
# Cleanup
#
ovn-nbctl lsp-chain-classifier-del pcc1
ovn-nbctl lsp-pair-group-del ppg1
ovn-nbctl lsp-chain-del pc1
ovn-nbctl lsp-pair-del pp1
ovn-nbctl ls-del swt1

Co-authored-by: Flavio Fernandes <flavio at 
flaviof.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__flaviof.com&d=DwMFaQ&c=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo&r=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ&m=yLQ5jFc-CTEYRKuasTBad0BHVwQnNkrMyXaQgEkryqw&s=q-H4fUtyXYXLrq0zqHd2wDd4Uk00KfFClqzQGDa7Vgs&e=>>
Reported at: 
https://mail.openvswitch.org/pipermail/ovs-discuss/2016-March/040381.html<https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2016-2DMarch_040381.html&d=DwMFaQ&c=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo&r=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ&m=yLQ5jFc-CTEYRKuasTBad0BHVwQnNkrMyXaQgEkryqw&s=Az85h2_odJeFPHWmi4r6wR9sLyAxO-AqWaCfFT_tsiw&e=>
Reported at: 
https://mail.openvswitch.org/pipermail/ovs-discuss/2016-May/041359.html<https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddiscuss_2016-2DMay_041359.html&d=DwMFaQ&c=V9IgWpI5PvzTw83UyHGVSoW3Uc1MFWe5J8PTfkrzVSo&r=vZ6VUDaavDpfOdPQrz1ED54jEjvAE36A8TVJroVlrOQ&m=yLQ5jFc-CTEYRKuasTBad0BHVwQnNkrMyXaQgEkryqw&s=IPTcf9byltkg5tQveUq31LIie5mFEqro5Sm_VKcxifs&e=>

Signed-off-by: John McDowall <jmcdowall at 
paloaltonetworks.com<http://paloaltonetworks.com>>
---
 ovn/northd/ovn-northd.8.xml   |   60 ++-
 ovn/northd/ovn-northd.c       |  293 ++++++++++-
 ovn/ovn-architecture.7.xml    |   91 ++++
 ovn/ovn-nb.ovsschema          |   87 +++-
 ovn/ovn-nb.xml                |  187 +++++++
 ovn/utilities/ovn-nbctl.8.xml |  218 ++++++++
 ovn/utilities/ovn-nbctl.c     | 1133 +++++++++++++++++++++++++++++++++++++++++
 7 files changed, 2042 insertions(+), 27 deletions(-)
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [ovs-dev, RFC] ovn: Revised support for service function chaining

Reply via email to