+1 on using VRFs for tenant isolation. We do not want to run a separate BGP
daemon per tenant for the purpose of isolation; we want to use VRFs instead.
There should be one BGP daemon at the system level that serves multiple tenants.
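
To illustrate why a single daemon plus VRFs can be enough for isolation, here
is a minimal Python sketch. It is only a toy model, not FRR or OVN code: the
class name, the VRF names and the next-hop labels are all made up for
illustration. It assumes nothing more than "one process, one routing table per
VRF", and shows that two tenants can hold the same prefix without colliding.

```python
import ipaddress
from collections import defaultdict


class SingleDaemonRib:
    """Toy model (hypothetical): one routing daemon, one table per VRF."""

    def __init__(self):
        # VRF name -> {prefix: next hop}; tables never share entries.
        self._tables = defaultdict(dict)

    def add_route(self, vrf, prefix, next_hop):
        self._tables[vrf][ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, vrf, address):
        """Longest-prefix match restricted to the given VRF's table."""
        addr = ipaddress.ip_address(address)
        matches = [
            (net, hop)
            for net, hop in self._tables.get(vrf, {}).items()
            if addr.version == net.version and addr in net
        ]
        if not matches:
            return None
        return max(matches, key=lambda item: item[0].prefixlen)[1]


rib = SingleDaemonRib()
# Two hypothetical tenants use the same prefix without colliding.
rib.add_route("vrf-tenant-a", "10.0.0.0/24", "lrp-a")
rib.add_route("vrf-tenant-b", "10.0.0.0/24", "lrp-b")
rib.add_route("vrf-tenant-a", "10.0.0.128/25", "lrp-a2")

print(rib.lookup("vrf-tenant-a", "10.0.0.200"))  # lrp-a2
print(rib.lookup("vrf-tenant-b", "10.0.0.200"))  # lrp-b
print(rib.lookup("vrf-tenant-b", "192.0.2.1"))   # None
```

Every lookup is scoped to one VRF, so a tenant's routes are never visible to
another tenant even though a single process owns all the tables.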

Regards
Gurpreet

> On Sep 1, 2024, at 6:07 AM, Frode Nordahl <fnord...@ubuntu.com> wrote:
> 
> 
> 
> On Fri, Aug 30, 2024 at 19:48, Roberto Bartzen Acosta <roberto.aco...@luizalabs.com> wrote:
>> Hello Frode,
>> 
>> Thanks for working on this.
> 
> 
> Hello, Roberto,
> 
> Thank you for your interest in the work.
> 
>> 
>> On Fri, Aug 30, 2024 at 12:37, Frode Nordahl <fnord...@ubuntu.com> wrote:
>>> Hello, Tiago,
>>> 
>>> Please find my response in-line below.
>>> 
>>> On Fri, Aug 30, 2024 at 17:09, Tiago Pires <tiago.pi...@luizalabs.com> wrote:
>>> 
>>> > Hi all,
>>> >
>>> > I tested your routing protocol port redirection patch, and I was
>>> > wondering how you are planning to handle the learning and
>>> > advertisement of routes between the logical router's routing table and
>>> > the BGP daemon.
>>> >
>>> 
>>> Thank you for your interest in this work. The answer to your question is in
>>> the thread you are quoting, I'll highlight it with comments below.
>>> 
>>> Thanks
>>> >
>>> > Regards,
>>> >
>>> > Tiago Pires
>>> >
>>> > On Tue, Aug 6, 2024 at 9:03 AM Frode Nordahl <fnord...@ubuntu.com> wrote:
>>> >
>>> >> On Mon, Aug 5, 2024 at 12:02 PM Ales Musil <amu...@redhat.com> wrote:
>>> >> > On Thu, Aug 1, 2024 at 6:04 PM Frode Nordahl <fnord...@ubuntu.com> wrote:
>>> >> >>
>>> >> >> Hello, Ales,
>>> >> >>
>>> >> >> This is a fork of the thread to go back to discuss some of the items
>>> >> >> raised in the most recent instance of the OVN A/V Community meeting
>>> >> >> [6].
>>> >> >
>>> >> >
>>> >> >
>>> >> > Hi Frode,
>>> >> >
>>> >> > thank you for the followup discussion.
>>> >> >
>>> >> >>
>>> >> >> On Fri, Jun 28, 2024 at 11:03 AM Ales Musil <amu...@redhat.com> wrote:
>>> >> >> > On Tue, Jun 25, 2024 at 6:52 PM Frode Nordahl <fnord...@ubuntu.com> wrote:
>>> >> >> >>
>>> >> >> >> Hello,
>>> >> >> >>
>>> >> >> >> We are increasingly seeing requests for integration between OVN
>>> >> >> >> powered CMSs/workloads and the fabric.
>>> >> >> >>
>>> >> >> >> As a side note, this is a very interesting topic to me personally,
>>> >> >> >> and I think there are opportunities in the long term for this class
>>> >> >> >> of software to potentially fill a void for more automated and
>>> >> >> >> SDN-like ways of managing the physical network, as previously closed
>>> >> >> >> physical switch hardware is increasingly opening up to programmatic
>>> >> >> >> extension and control.
>>> >> >> >>
>>> >> >> >> While very exciting, it will take a while, both in terms of evolving
>>> >> >> >> how networking teams are organized, in terms of the longevity of
>>> >> >> >> networking gear making entity wide refresh cycles very long, not to
>>> >> >> >> mention gathering agreement and momentum to build such a thing from
>>> >> >> >> the pieces we have.
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> So to be pragmatic, we need to integrate with something that fabric
>>> >> >> >> network engineers are comfortable with, and already available on
>>> >> >> >> most networking hardware, be it closed or open, today.
>>> >> >> >>
>>> >> >> >> The most ubiquitous routing protocol, which has prevailed in modern
>>> >> >> >> layer 3 only data center designs [0], is BGP.
>>> >> >> >>
>>> >> >> >> Use cases:
>>> >> >> >> * Allow fabric to locate and direct traffic to reroutable resources
>>> >> >> >> such as IPv4/IPv6 prefixes, Floating IPs (FIPs) and Load Balancer
>>> >> >> >> VIPs.
>>> >> >> >>
>>> >> >> >> * Use the fabric as a load balancer, announcing the same service IP
>>> >> >> >> on multiple hosts (anycast).
>>> >> >> >>
>>> >> >> >> * Aggregate announcements from stacked CMSes (e.g. Kubernetes
>>> >> >> >> running on top of OpenStack).
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> Requirements:
>>> >> >> >> * Data path must be hardware offloaded, i.e. the next hop address
>>> >> >> >> the peer resolves for announcements of OVN resources needs to be an
>>> >> >> >> LRP IP.
>>> >> >> >>
>>> >> >> >> * Minimize configuration overhead through the use of IPv6 LLAs for
>>> >> >> >> peering, routing both IPv4 and IPv6 prefixes over an IPv6 BGP session
>>> >> >> >> [1] (a.k.a. “BGP Unnumbered”).
>>> >> >> >>
>>> >> >> >> * Support ECMP out of the host, i.e. use L3 interfaces potentially
>>> >> >> >> connecting to two different ToRs, instead of bonds, avoiding the
>>> >> >> >> additional complexity of multi-chassis bonds.
>>> >> >> >>
>>> >> >> >> * Support BGP authentication [2][3], i.e. the source, destination
>>> >> >> >> address and ports in packet headers can not be changed.
>>> >> >> >>
>>> >> >> >> * Compatibility
>>> >> >> >>    * Running a BGP protocol suite on the host is becoming a thing in
>>> >> >> >> its own right, and our users may have requirements of their own that
>>> >> >> >> influence their choice of implementation. We need to take this into
>>> >> >> >> account and choose integration methods that allow OVN to work with
>>> >> >> >> multiple protocol suite implementations.
>>> >> >> >>
>>> >> >> >>    * While we have the power to change and fix issues in popular
>>> >> >> >> routing protocol suites, such as FRR, we need to be able to
>>> >> >> >> integrate with versions that exist on networking hardware out there
>>> >> >> >> today.
>>> >> >> >>
>>> >> >> >> Limitations that influence/dictate implementation choices:
>>> >> >> >> * Peering with IPv6 LLAs to meet the configuration overhead
>>> >> >> >> requirement makes the peering relationship point to point.
>>> >> >> >>
>>> >> >> >> * Popular BGP implementations, such as FRR, which is used as the
>>> >> >> >> routing protocol suite by many open source ToR NOSes, do not accept
>>> >> >> >> sending/receiving an IPv6 LLA next hop with the route, so the BGP
>>> >> >> >> peer address will be used as next hop. (There are even mentions of
>>> >> >> >> 3rd party nexthop currently not being supported, but not sure if
>>> >> >> >> that is accurate [4]).
>>> >> >> >>
>>> >> >> >> * As mentioned above, BGP authentication requires IP headers to be
>>> >> >> >> unchanged for the BGP TCP packets going to/from the BGP speaker.
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> Proposed implementation:
>>> >> >> >>
>>> >> >> >> We are in the process of preparing some RFC/PoC patches that at a
>>> >> >> >> high level will:
>>> >> >> >> * Manage a VRF in the system serving two purposes:
>>> >> >> >>    * Leaking of route information from ovn-controller to the VRF
>>> >> >> >> routing table, which a routing protocol suite can redistribute
>>> >> >> >> subject to configuration.
>>> >> >> >>
>>> >> >> >>    * Provide an IP endpoint that a VRF aware application, such as
>>> >> >> >> FRR, can bind to, serving as a BGP speaker on behalf of an OVN LRP IP.
>>> >> >> >>
>>> >> >> >> * We will attach an OVN VIF to this VRF that has data path rules
>>> >> >> >> that:
>>> >> >> >>
>>> >> >> >>    * Forward required traffic destined to the OVN LRP IP to the VRF.
>>> >> >> >>
>>> >> >> >>    * Forward required traffic from the application bound to the VRF
>>> >> >> >> as if it originated from the OVN LRP IP.
>>> >>
>>> >
>>> The above bullets give an overview of the proposed implementation.
>> 
>> Just to validate my understanding and align expectations: the main purpose
>> of this implementation/module developed for route-exchange-netlink is to
>> export addresses that are typically "external" from the point of view of the
>> OVN router / SDN. In this case, that means dnat_and_snat rules for FIPs and
>> LB VIPs configured on the Logical Router, right?
> 
> 
> Development happens in iterations, so this is indeed where we start. We also 
> want to get to learning of routes, as that would simplify configuration, and 
> hopefully we will.
> 
>> So, it's out of scope to advertise/learn routes from the router's
>> internal networks, as well as the router's static route table, directly
>> connected LS subnets, etc. Therefore, it's also out of scope to make the
>> service and integration of the BGP daemon multi-tenant, since there is no
>> plan for segmentation by namespaces to run different BGP daemons, right? I
>> imagine that in the multi-tenant use case we would have a BGP daemon for
>> each logical router, in addition to working with the router's complete
>> route table.
> 
> 
> The BGP redirect option and route redistribute options are per LRP, and there 
> are no restrictions on how the LSP redirected to is terminated in the system. 
> Likewise, the use of VRFs to exchange route information gives you isolation.
> We view the BGP daemon as a system/admin level entity and the isolation you 
> seek can be configured in it?
> 
> We have large users that do not use NAT and would be interested in 
> redistributing LR networks attached to distributed gateways (the OpenStack 
> use case).
> 
> However, before diving into that we need to figure out how to connect a 
> distributed topology with the per chassis gateway router, as the answer to 
> that will impact what resources it makes sense to redistribute where.
> 
> --
> Frode Nordahl 
> 
> 
> 
>> Best regards,
>> Roberto
>> 
>>> 
>>> >> >>
>>> >> >> >>
>>> >> >> >> Hopefully we'll have something up on the list before the end of this
>>> >> >> >> week, which makes it real and easier to reason about for further
>>> >> >> >> discussion.
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> Prior art:
>>> >> >> >>
>>> >> >> >> We recognize that there already exists a third party approach to
>>> >> >> >> this in the ovn-bgp-agent [5] governed by OpenStack, and our goal
>>> >> >> >> with this work is to provide a tighter integration that might cater
>>> >> >> >> generically for other CMSes and use cases.
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> 0: https://datatracker.ietf.org/doc/html/rfc7938
>>> >> >> >> 1: https://datatracker.ietf.org/doc/html/rfc5549
>>> >> >> >> 2: https://datatracker.ietf.org/doc/html/rfc2385
>>> >> >> >> 3: https://datatracker.ietf.org/doc/html/rfc5925
>>> >> >> >> 4: https://github.com/FRRouting/frr/blob/cc3519f3e6eaa06f762e0d447202df32df66e129/bgpd/bgp_route.c#L2719
>>> >> >> >> 5: https://docs.openstack.org/ovn-bgp-agent/latest/
>>> >> >>
>>> >> >>
>>> >> >> 6: https://mail.openvswitch.org/pipermail/ovs-dev/2024-August/416209.html
>>> >> >>
>>> >> >> >
>>> >> >> > Hi Frode,
>>> >> >> >
>>> >> >> > looking forward to the RFC.
>>> >> >>
>>> >> >> As we agreed, the current set of patches that we have
>>> >> >> [7][8][9][10][11][12][13] will not be considered for the 24.09 release
>>> >> >> as we would like to make it more feature complete and target the 25.03
>>> >> >> release instead. In that context I guess they serve as the RFC
>>> >> >> patches.
>>> >> >>
>>> >> >> 7: https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/416038.html
>>> >> >> 8: https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/416039.html
>>> >> >> 9: https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/416040.html
>>> >> >> 10: https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/416042.html
>>> >> >> 11: https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/416041.html
>>> >> >> 12: https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/416043.html
>>> >> >> 13: https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/416044.html
>>> >> >>
>>> >>
>>> >
>>> The above patches are the current state of the work and we will continue it
>>> in the coming cycle with the intention to get it into the 25.03 release. If
>>> you have cycles to try it out and provide feedback, that would be most
>>> welcome.
>>> 
>>> --
>>> Frode Nordahl
>>> 
>>> >> >> In addition to the above there is the LRP BGP redirect patch from
>>> >> >> Martin [14], which could be useful independently.
>>> >> >>
>>> >> >> 14: https://mail.openvswitch.org/pipermail/ovs-dev/2024-July/416095.html
>>> >> >>
>>> >> >> Discussion points from meeting:
>>> >> >> 1) OVN and Netlink code
>>> >> >> In the meeting you raised some concerns about introducing Netlink code
>>> >> >> in the OVN repository. I agree with you 100% that the part of
>>> >> >> [10] that vendors code from OVS (contents of
>>> >> >> route-exchange-netlink-private.h) should instead be a patch for OVS.
>>> >> >>
>>> >> >> However, the parts of [10] that provide higher layer helper functions,
>>> >> >> consuming OVS library code, do not naturally fit in OVS as OVS itself
>>> >> >> has no use for them.
>>> >> >
>>> >> >
>>> >> >
>>> >> > After the discussion that we had during the meeting I agree that we
>>> >> > should reuse as much OvS code as possible.
>>> >> >
>>> >> >>
>>> >> >> As a quick reminder, we are looking at Netlink because it provides a
>>> >> >> simple and established API for exchange of this type of information
>>> >> >> which is already supported by all routing protocol suites out there.
>>> >> >> It is not tied to any particular data path type, we could
>>> >> >> theoretically even use it as IPC between two userspace processes,
>>> >> >> removing the kernel from the picture with support on the routing
>>> >> >> protocol suite side. (There has been some discussion on this for BIRD:
>>> >> >> https://bird.network.cz/pipermail/bird-users/2021-September/015707.html).
>>> >> >>
>>> >> >> Would it be possible to reach some compromise to include only the
>>> >> >> parts that consume OVS library code (route-exchange-netlink.{c|h})?
>>> >> >>
>>> >> >> While a plugin based approach was also suggested, and we have prior
>>> >> >> examples of successfully using that, it does not come without
>>> >> >> substantial cost. So I want to explore what options there are to host
>>> >> >> this inside the main repository.
>>> >> >
>>> >> >
>>> >> > We could maintain the netlink plugin in the OVN codebase as an example
>>> >> > and still have the plugin system in place. The plugin system has the
>>> >> > potential benefit of not locking ourselves to just netlink, and the
>>> >> > API could potentially be used regardless of the OVN code. Would that
>>> >> > be enough justification to add the plugin system + netlink as the
>>> >> > default plugin housed in the OVN codebase?
>>> >>
>>> >> This makes sense and provides a path forward, thanks!
>>> >>
>>> >> We are at a point of the development cycle where we need to tend to
>>> >> some downstream stuff, but we will continue this work. I'll try to get
>>> >> patches to make the OVS route-table module consumable/reusable for
>>> >> this work posted as soon as possible and start thinking about what's
>>> >> needed in the route-exchange provider interface.
>>> >>
>>> >> >> 2) OVS OpenFlow extensions
>>> >> >> One of the counter proposals you brought up was to add OVS OpenFlow
>>> >> >> extensions to allow OVN to instruct OVS to insert routes into a system
>>> >> >> routing table.
>>> >> >>
>>> >> >> While I see this could be a clear separation of concerns between OVN
>>> >> >> and OVS, and OpenFlow being OVN's native integration language, I
>>> >> >> struggle a bit with the general usefulness of such an extension.
>>> >> >>
>>> >> >> Our use case for inserting routes into a system routing table is
>>> >> >> purely for exchange of control plane information with some external
>>> >> >> system, such as a routing protocol suite, and we have no interest in
>>> >> >> using it for actual data path control. This is in contrast to how OVN
>>> >> >> uses OpenFlow generally, which as far as I understand is to control
>>> >> >> the data path.
>>> >> >
>>> >> >
>>> >> > In the light of the other discussion it doesn't really make sense to
>>> >> > have a plugin when most of the netlink code wouldn't be in the OvS
>>> >> > codebase anyway.
>>> >>
>>> >> Assuming you are referring to the OpenFlow extension here, and yes, I
>>> >> agree.
>>> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> [ snip ]
>>> >> >>
>>> >> >> > Another important part that we should keep in mind, if possible, is
>>> >> >> > the EVPN use case: being able to configure VXLAN tunnels based on the
>>> >> >> > info that we will receive.
>>> >> >> >
>>> >> >> > I'm not sure how far/deep in the actual design you are but maybe the
>>> >> >> > following might be helpful in some way. What I had in mind was a sort
>>> >> >> > of plugin that would expose the info of bound entities that are
>>> >> >> > interesting in terms of BGP (it could be configurable), so mainly
>>> >> >> > FIPs, LBs, GW router IPs. For the import part (which would be
>>> >> >> > applicable only for GW LR) we would create entries in SB DB similarly
>>> >> >> > to what we currently do with Multicast_Group (BGP_Routes?
>>> >> >> > EVPN_Tunnels?). Northd could consume those values and configure
>>> >> >> > logical flows and encaps as needed.
>>> >> >>
>>> >> >> While the EVPN part is not a priority for us at this point in time, we
>>> >> >> will of course be interested in making sure the work we put into stage
>>> >> >> 1 (ovn-controller redistributing FIPs, LBs and GW router IPs) and stage
>>> >> >> 2 (ovn-controller learning routes) will be consumable for a stage 3.
>>> >> >
>>> >> >
>>> >> > Great, yeah my point during the discussion wasn't about
>>> >> > making it available right away just to reiterate that it would be of
>>> >> > interest for us and if needed we can of course help with the
>>> >> > development process.
>>> >>
>>> >> Cool stuff, let's do it!
>>> >>
>>> >> --
>>> >> Frode Nordahl
>>> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> Frode Nordahl
>>> >> >>
>>> >> >> > Let me know if that makes sense.
>>> >> >> >
>>> >> >> > Thanks,
>>> >> >> > Ales
>>> >> >> >
>>> >> >> >
>>> >> >> > --
>>> >> >> >
>>> >> >> > Ales Musil
>>> >> >> >
>>> >> >> > Senior Software Engineer - OVN Core
>>> >> >> >
>>> >> >> > Red Hat EMEA
>>> >> >> >
>>> >> >> > amu...@redhat.com
>>> >> >>
>>> >> >
>>> >> > Thanks,
>>> >> > Ales
>>> >> >
>>> >> > --
>>> >> >
>>> >> > Ales Musil
>>> >> >
>>> >> > Senior Software Engineer - OVN Core
>>> >> >
>>> >> > Red Hat EMEA
>>> >> >
>>> >> > amu...@redhat.com
>>> >> _______________________________________________
>>> >> dev mailing list
>>> >> d...@openvswitch.org
>>> >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>> >>
>>> >
>>> >
>> 
>> 
>> 

