On Mon, May 4, 2020 at 9:52 AM Ihar Hrachyshka <[email protected]> wrote:
>
> On Fri, May 1, 2020 at 3:28 AM Han Zhou <[email protected]> wrote:
> >
> >
> >
> > On Tue, Apr 28, 2020 at 1:38 PM Ihar Hrachyshka <[email protected]> wrote:
> > >
> > > Assuming only a single localnet port is actually plugged on each
> > > chassis, this allows maintaining disjoint networks plugged into the
> > > same switch.  This is useful to simplify resource management for the
> > > OpenStack "routed provider networks" feature [1], where a single
> > > "network" (which traditionally maps to a logical switch in OVN) is
> > > comprised of multiple L2 segments and assumes external L3 routing
> > > implemented between the segments.
> > >
> > > TODO: consider E-W routing between localnet vlan tagged LSs
> > >       (ovn-chassis-mac-mappings).
> > >
> > > [1]: https://docs.openstack.org/ocata/networking-guide/config-routed-networks.html
> >
> > After reading the context and related discussions, I am a little
> > concerned about this approach. It changes the concept of a logical
> > switch in OVN, which was an L2 concept. It may be ok to change the
> > concept if all the aspects are understood and properly documented and
> > implemented. However, at least it is not clear to me how it would work
> > in the physical pipeline when VIF1 on chassis A sends a packet to VIF2
> > on chassis B through the physical network when the two VIFs are on the
> > same LS that has multiple localnet ports. Today only one localnet port
> > is allowed, so it is handled by sending the packet through *the*
> > localnet port (without a localnet port it goes through a tunnel). With
> > this patch it seems it would choose the last localnet port, regardless
> > of the physical location (L2 segment) of each VIF. I would expect the
> > data structure change in "local_datapath" to maintain a list of
> > localnet ports instead of only one, and the logic in physical.c to
> > pick the desired localnet port according to the L2 segment it
> > physically belongs to. (It shouldn't be a mapping between chassis and
> > physical network/VLAN, because it is natural to have multiple VLANs
> > connected to a single chassis and to have VIFs on different VLANs.)
>
> The assumption in this patch is, while the north database may have
> multiple localnet ports per switch, only one segment can actually be
> plugged per chassis. Which reduces the problem of multiple localnet
> ports and selecting the right one to the current situation where
> there's a single localnet port per switch for each chassis.

OK. I think I didn't get the real meaning of this assumption earlier. It is
reasonable to have this assumption, just like the earlier assumption that
we only allowed one localnet port for an L2 LS. Now that the LS is actually
modeling an L3 fabric connected to multiple L2 segments, it is reasonable
to assume that:
1. Only one localnet port should exist for each L2 segment. (So if there
are N localnet ports in one LS, we assume they represent N L2 segments
connected by an L3 fabric.)
2. Each chassis should plug into only one L2 segment for each such L3
LS. (I think this needs to be validated in binding.c to at least warn about
such misconfiguration.)

It seems the implementation supports these assumptions well, but it would
help to document the L3 LS concept (I am not sure what a better word could
be, since LS appears to be an L2 concept) and its use cases more clearly,
along with the above assumptions/constraints.

In addition, it is better to make it clear that a chassis can still be
plugged with multiple localnet ports for different L3 LSes. This is a
typical scenario of the routed network use case with VRFs. It should still
work with this patch, but I'd suggest adding a test case to cover such a
scenario to make sure.

> (The
> assumption is implicit right now but we can also enforce it if
> needed.) Perhaps in the future we should consider expanding the
> feature to allow for multiple localnet ports plugged to fabric per
> chassis, but that's not the goal here.
>
For different L3 LSes, I think it is supported already (and should be). For
multiple localnet ports on the same L3 LS, I am not sure whether it would
be reasonable in the future, but at least I totally agree that it is not a
goal for now.

> >
> > Furthermore, if we change the concept of an LS being a single L2
> > domain, we may need to consider more changes in the L2 logical
> > pipelines, such as:
> > - The L2 lookup stage should only look up destinations for the ports
> > that are in the same segment.
> > - BUM traffic should only go to the localnet port that belongs to the
> > same L2 segment of the LS.
> >
I thought about the BUM (broadcast, unknown unicast, multicast) traffic
again. It seems the current physical pipeline handling for localnet would
also handle the case of multiple localnet ports in the same LS well. (I
implemented that logic initially and never thought about this use case, so
I am surprised that it seems it would just work here, given the earlier
assumptions.) I'd suggest adding at least one or two test cases to ensure
it works and to avoid future regressions, since there are implications to
making this work and I am afraid further changes in the future could break
it. The test cases can be:
- For "unknown MAC", it is sent out through the physical bridge, once and
only once.
- For broadcast traffic (e.g. ARP), it is sent to all ports that are
physically located on the same L2 segment (some on the same chassis, some
on other chassis but the same L2 segment), once and only once.

> > All this suggests that we need to introduce a *segment* abstraction
> > in OVN, with multiple segments belonging to the same LS. I wonder if
> > it is even more complex to implement this in OVN than in the Neutron
> > plugin. Maybe I missed something here and made it more complex than
> > necessary. Please correct me if I am wrong :)
>
> Indeed, if OVN models the OpenStack multi-segment network concept
> natively, 1) it would be easier for OpenStack and 2) it would just
> shift complexity from OpenStack to OVN. (I don't think it would be
> MORE complex than implementing it in the neutron plugin, but about the
> same.) This patch series is an attempt to facilitate neutron
> implementation of routed provider networks (multi-segment networks
> with L3-adjacent L2-domains) with as little blood on both sides.
>
Initially I was thinking about full L2 segment modeling within the LS. In
theory it would be the same level of complexity as implementing it in the
Neutron plugin, but C would usually take more effort than Python for such
things. However, I realized that the modeling change may not be necessary
in OVN to support the routed network use case, given that the
above-mentioned assumptions are strictly followed. Each localnet port is
the modeling of an L2 segment - we just can't tell which port belongs to
which L2 segment from the NB objects. The mapping is only reflected by the
port bindings and chassis configuration. So L2 becomes the physical part,
and the NB only tells the L3 model of the routed network. With this design,
there are still things that would look confusing because of the modeling
missing in the NB, such as:
- The logical flow generation blindly generates L2 lookup flows, ARP
responder flows, etc. for all ports on the LS.
- The MC_Groups (MC_Flood, MC_Unknown, which are L2 concepts) generated in
the SB contain all ports in the L3 network.

In most cases the tricks we did with the localnet port in the physical
pipeline would handle it well. Maybe one weird scenario I could think of is
when VIF1 sends an ARP request for the IP of VIF2, which is on a different
chassis on a different L2 segment: VIF1 would get the ARP resolved, while
this could never happen in a physical network. However, I don't think this
should create any real problem. I hope this will not result in any
unexpected limitations because of the "simplified" modeling in the NB for
routed networks.

To summarize, I agree with the solution of this patch, with the comments
below:
- Document the concept change of the LS (aka L3 LS), the use case, and the
assumptions and constraints more clearly, in the localnet description in
ovn-architecture.xml.
- Validate the assumptions and print warning logs if they are not met.
- Add test cases to cover:
  a) The VRF use case of the routed network, i.e. M logical networks
(LSes), each with N segments (so in total there are M x N localnet ports);
the M localnet ports of the M LSes can still work on the same chassis.
  b) For an "unknown MAC", it is sent out through the physical bridge, once
and only once.
  c) For broadcast traffic (e.g. ARP), it is sent to all ports that are
physically located on the same L2 segment (some on the same chassis, some
on other chassis but the same L2 segment), once and only once.

Thanks,
Han
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
