On 8/6/24 19:22, Vladislav Odintsov wrote: > Hi Martin, Dumitru, > Hi Vladislav,
> On 06.08.2024 19:23, Dumitru Ceara wrote: >> Hi Martin, >> >> Sorry, I was reviewing V3 but I was slow in actually sending out the email. >> >> On 8/6/24 13:17, Martin Kalcok wrote: >>> This change adds a 'bgp-redirect' option to LRP that allows >>> redirecting of BGP control plane traffic to an arbitrary LSP >>> in its peer LS. >>> >>> The option expects a string with a LSP name. When set, >>> any traffic entering LS that's destined for any of the >>> LRP's IP addresses (including IPv6 LLA) is redirected >>> to the LSP specified in the option's value. >>> >>> This enables external BGP daemons to listen on an interface >>> bound to a LSP and effectively act as if they were listening >>> on (and speaking from) LRP's IP address. >>> >>> Signed-off-by: Martin Kalcok <[email protected]> >>> --- >> Strictly on this version of the patch, and with the thought in mind that >> we might want to consider this feature experimental [0] and maybe change >> it (NB too) in the future, I left a few comments inline. With those >> addressed I think the patch looks OK to me. >> >> [0] >> https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/ >> >> Now, for the point when we'd remove the "experimental" tag: >> >> In general, it makes me a bit uneasy that we have to share the MAC and >> IP between the LRP and the VIF of logical switch port that's connected >> to the same switch as the LRP. >> >> I was thinking of alternatives for the future and the only things I >> could come up with until now are: >> >> 1. Create a separate, "disconnected" logical switch: >> ovn-nbctl ls-add bgp >> >> Add the bgp-daemon port to it: >> ovn-nbctl lsp-add bgp bgp-daemon. >> >> Then we don't need "unknown" either, I think. >> >> But it's not possible today to move packets from the ing,ress pipeline of >> a logical switch ("public" in this test) to the egress pipeline of a >> different logical switch ("bgp" in my example). It also feels icky to >> implement that. >> >> 2. Add the ability to bind an OVS port directly to a logical router port: >> >> then we could do the same type of redirect you do here but instead in >> the logical router pipeline. The advantage is we don't have to drop any >> non-bgp traffic in the switch pipeline. The switch keeps functioning as >> it does today. >> >> Maybe an advantage of this second alternative would be that we can >> easily attach a filtering option to this LRP (e.g., >> LRP.options:control-traffic-filter) to allow us to be more flexible in >> what kind of traffic we forward to the actuall routing protocol daemon >> that runs behind that OVS port - Vladislav also mentioned during the >> meeting it might be interesting to forward BFD packets to the FRR (or >> whatever application implements the routing protocol) instance too. > > @Martin, I'm again kindly asking about possible support for BFD > redirect. Is it possible to incorporate these changes as an additional > configuration know ("bfd-redirect") into your patch so we can start > using this functionality internally and possibly give some feedback > (hopefully positive)? > > It doesn't seem to be a very big change but looks very attractive for us. > > I can send a separate patch for this on top of your patch, but > technically it won't be able to jump into the 24.09 because formally > it's already a soft-freeze in progress. As an option I can send a patch, > which can be squashed into this one. > > What do you think? I didn't check with other maintainers (added them in CC now), but given that we agreed to try to accept the port redirecting patch as experimental I think it's fine to expand it to allow BFD support too. Posting a follow up patch on top of this one is fine from my perspective. However, what if we change the option to? - NB.Logical_Router_Port.options:routing-protocol-redirect=<LSP-NAME> - NB.Logical_Router_Port.options:routing-protocols=<LIST_OF_PROTOCOLS> E.g.: NB.Logical_Router_Port.options:routing-protocol-redirect="bgp-daemon" NB.Logical_Router_Port.options:routing-protocols="BGP;BFD" Would this have a higher chance of becoming fully supported in the next release? Regards, Dumitru > >> But again, if we consider the current feature "experimental", we can >> spend more time during the next release cycle to figure out what's best. >> >>> v4 of this patch renames the feature from "bgp-mirror" to "bgp-redirect" >>> as discussed during community meeting. >>> >>> northd/northd.c | 108 ++++++++++++++++++++++++++++++++++++++++ >>> northd/ovn-northd.8.xml | 23 +++++++++ >>> ovn-nb.xml | 14 ++++++ >>> tests/ovn-northd.at | 58 +++++++++++++++++++++ >>> tests/system-ovn.at | 86 ++++++++++++++++++++++++++++++++ >>> 5 files changed, 289 insertions(+) >>> >>> diff --git a/northd/northd.c b/northd/northd.c >>> index 4353df07d..088104f25 100644 >>> --- a/northd/northd.c >>> +++ b/northd/northd.c >>> @@ -13048,6 +13048,113 @@ build_arp_resolve_flows_for_lrp(struct ovn_port >>> *op, >>> } >>> } >>> >>> +static void >>> +build_bgp_redirect_rule__( >>> + const char *s_addr, const char *bgp_port_name, bool is_ipv6, >>> + struct ovn_port *ls_peer, struct lflow_table *lflows, >>> + struct ds *match, struct ds *actions) >>> +{ >>> + int ip_ver = is_ipv6 ? 6 : 4; >>> + /* Redirect packets in the input pipeline destined for LR's IP to >>> + * the port specified in 'bgp-redirect' option. >>> + */ >>> + ds_clear(match); >>> + ds_clear(actions); >>> + ds_put_format(match, "ip%d.dst == %s && tcp.dst == 179", ip_ver, >>> s_addr); >>> + ds_put_format(actions, "outport = \"%s\"; output;", bgp_port_name); >>> + ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_L2_LKUP, 100, >>> + ds_cstr(match), >>> + ds_cstr(actions), >>> + ls_peer->lflow_ref); >>> + >>> + >>> + /* Drop any traffic originating from 'bgp-redirect' port that does >>> + * not originate from BGP daemon port. This blocks unnecessary >>> + * traffic like ARP broadcasts or IPv6 router solicitation packets >>> + * from the dummy 'bgp-redirect' port. >>> + */ >>> + ds_clear(match); >>> + ds_put_format(match, "inport == \"%s\"", bgp_port_name); >>> + ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_CHECK_PORT_SEC, 80, >>> + ds_cstr(match), >>> + REGBIT_PORT_SEC_DROP " = 1; next;", >>> + ls_peer->lflow_ref); >> Not a huge deal but this logical flow gets regenerated multiple times >> for each bgp_port_name (for each LRP address). In the end we'll only >> install one in the database so there's no impact on SB. But maybe we >> should consider moving this part out of this function. >> >>> + >>> + ds_put_format(match, >>> + " && ip%d.src == %s && tcp.src == 179", >>> + ip_ver, >>> + s_addr); >>> + ovn_lflow_add(lflows, ls_peer->od, S_SWITCH_IN_CHECK_PORT_SEC, 81, >>> + ds_cstr(match), >>> + REGBIT_PORT_SEC_DROP " = check_in_port_sec(); next;", >>> + ls_peer->lflow_ref); >> In your system test we set 'addresses = "unknown"' for 'bgp_port_name'. >> And I guess we kind of have to do it like that all the time when using >> this feature. I'm trying to think of scenarios in which users set >> LSP.addresses = 'unknown' and LSP.port-security = '<some-addresses>'. >> I'm not sure I saw those until now but maybe I'm wrong. >> >>> +} >>> + >>> +static void >>> +build_lrouter_bgp_redirect( >>> + struct ovn_port *op, struct lflow_table *lflows, >>> + struct ds *match, struct ds *actions) >>> +{ >>> + /* LRP has to have a peer.*/ >>> + if (op->peer == NULL) { >>> + return; >>> + } >>> + /* LRP has to have NB record.*/ >>> + if (op->nbrp == NULL) { >>> + return; >>> + } >>> + >>> + /* Proceed only for LRPs that have 'bgp-redirect' option set. Value of >>> this >>> + * option is the name of LSP to which a BGP traffic will be redirected. >>> + */ >>> + const char *bgp_port = smap_get(&op->nbrp->options, "bgp-redirect"); >>> + if (bgp_port == NULL) { >>> + return; >>> + } >>> + >>> + /* Ensure that LSP, to which the BGP traffic is redirected, exists.*/ >>> + struct ovn_port *peer_lsp; >>> + bool bgp_port_exists = false; >>> + HMAP_FOR_EACH (peer_lsp, dp_node, &op->peer->od->ports) { >>> + size_t peer_lsp_s = strlen(peer_lsp->key); >>> + if (peer_lsp_s == strlen(bgp_port) >>> + && !strncmp(peer_lsp->key, bgp_port, peer_lsp_s)){ >>> + bgp_port_exists = true; >>> + break; >>> + } >>> + } >>> + >>> + if (!bgp_port_exists) { >>> + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); >>> + VLOG_WARN_RL(&rl, "Option 'bgp-redirect' set on Logical Router >>> Port " >>> + "'%s' refers to non-existent Logical Switch >>> Port. " >>> + "BGP redirecting won't be configured.", >>> + op->key); >>> + return; >>> + } >>> + >>> + if (op->cr_port != NULL) { >>> + static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5); >>> + VLOG_WARN_RL(&rl, "Option 'bgp-redirect' is not supported on" >>> + " Distributed Gateway Port '%s'", op->key); >>> + return; >>> + } >>> + >>> + /* Mirror traffic destined for LRP's IPs and default BGP port >>> + * to the port defined in 'bgp-redirect' option. >>> + */ >>> + for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs; i++) { >>> + const char *ip_s = op->lrp_networks.ipv4_addrs[i].addr_s; >>> + build_bgp_redirect_rule__(ip_s, bgp_port, false, op->peer, lflows, >>> + match, actions); >>> + } >>> + for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs; i++) { >>> + const char *ip_s = op->lrp_networks.ipv6_addrs[i].addr_s; >>> + build_bgp_redirect_rule__(ip_s, bgp_port, true, op->peer, lflows, >>> + match, actions); >>> + } >>> +} >>> + >>> /* This function adds ARP resolve flows related to a LSP. */ >>> static void >>> build_arp_resolve_flows_for_lsp( >>> @@ -16003,6 +16110,7 @@ build_lswitch_and_lrouter_iterate_by_lrp(struct >>> ovn_port *op, >>> lsi->meter_groups, op->lflow_ref); >>> build_lrouter_icmp_packet_toobig_admin_flows(op, lsi->lflows, >>> &lsi->match, >>> &lsi->actions, >>> op->lflow_ref); >>> + build_lrouter_bgp_redirect(op, lsi->lflows, &lsi->match, >>> &lsi->actions); >>> } >>> >>> static void * >>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml >>> index b06b09ac5..4cbdf8f74 100644 >>> --- a/northd/ovn-northd.8.xml >>> +++ b/northd/ovn-northd.8.xml >>> @@ -284,6 +284,21 @@ >>> dropped in the next stage. >>> </li> >>> >>> + <li> >>> + For each logical port that's defined as a target of BGP redirecting >>> + (via <code>bgp-redirect</code> option set on Logical Router Port), >>> a >>> + filter is set in place that disallows any traffic entering this >>> port >>> + that does not originate from Logical Router Port's IPs and default >>> BGP >>> + port (TCP 179). This filtering is achieved by two rules. First >>> rule has >>> + priority 81, it matches on <code>inport == >>> <var>BGP_MIRROR_PORT</var> >>> + && ip.src == <var>LRP_IP</var> && tcp.src == >>> 179</code> >>> + and allows traffic further with <code>REGBIT_PORT_SEC_DROP" = >>> + check_in_port_sec(); next;</code>. Second rule with priority 80 >>> matches >>> + the rest of the traffic from that port and sets >>> + <code>REGBIT_PORT_SEC_DROP" = 1; next;"</code> so that the packets >>> are >>> + dropped in the next stage. >>> + </li> >>> + >>> <li> >>> For each (enabled) vtep logical port, a priority 70 flow is added >>> which >>> matches on all packets and applies the action >>> @@ -1949,6 +1964,14 @@ output; >>> on the logical switch. >>> </li> >>> >>> + <li> >>> + For any logical port that's defined as a target of BGP redirecting >>> (via >>> + <code>bgp-redirect</code> option set on Logical Router Port), a >>> + priority-100 flow is added that redirects traffic for Logical >>> Router >>> + Port IPs (including IPv6 LLA) and TCP port 179, to the targeted >>> + logical port. >>> + </li> >>> + >>> <li> >>> Priority-90 flows for transit switches that forward registered >>> IP multicast traffic to their corresponding multicast group , which >>> diff --git a/ovn-nb.xml b/ovn-nb.xml >>> index 9552534f6..2d2f5c9e0 100644 >>> --- a/ovn-nb.xml >>> +++ b/ovn-nb.xml >>> @@ -3440,6 +3440,20 @@ or >>> </p> >>> </column> >>> >>> + <column name="options" key="bgp-redirect" type='{"type": "string"}'> >>> + <p> >>> + This option expects a name of a Logical Switch Port that's >>> present >>> + in the peer's Logical Switch. If set, it causes any traffic >>> + that's destined for Logical Router Port's IP addresses (including >>> + its IPv6 LLA) and the default BGP port (TCP 179), to be >>> redirected >>> + to the specified Logical Switch Port. >>> + >>> + This allows external BGP daemon to be bound to a port in OVN's >>> + Logical Switch and act as if it was listening on Logical Router >>> + Port's IP address. >>> + </p> >>> + </column> >>> + >>> <column name="options" key="gateway_mtu_bypass"> >>> <p> >>> When configured, represents a match expression, in the same >>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at >>> index a389d1988..df1e924ab 100644 >>> --- a/tests/ovn-northd.at >>> +++ b/tests/ovn-northd.at >>> @@ -12721,3 +12721,61 @@ AT_CHECK([ovn-sbctl dump-flows lr | grep >>> lr_in_dnat | ovn_strip_lflows], [0], [d >>> >>> AT_CLEANUP >>> ]) >>> + >>> +OVN_FOR_EACH_NORTHD_NO_HV([ >>> +AT_SETUP([BGP control plane redirect]) >>> +ovn_start >>> + >>> +check ovn-sbctl chassis-add hv1 geneve 127.0.0.1 >>> + >>> +check ovn-nbctl lr-add lr -- \ >>> + lrp-add lr lr-ls 02:ac:10:01:00:01 172.16.1.1/24 >>> +check ovn-nbctl --wait=sb set logical_router lr options:chassis=hv1 >>> + >>> +check ovn-nbctl ls-add ls -- \ >>> + lsp-add ls ls-lr -- \ >>> + lsp-set-type ls-lr router -- \ >>> + lsp-set-addresses ls-lr router -- \ >>> + lsp-set-options ls-lr router-port=lr-ls >>> + >>> +check ovn-nbctl lsp-add ls lsp-bgp -- \ >>> + lsp-set-addresses lsp-bgp unknown >>> + >>> +# Function that ensures that no BGP redirect ruels are installed. >>> +check_no_bgp_redirect() { >>> + AT_CHECK([ovn-sbctl dump-flows ls | grep ls_in_l2_lkup | grep >>> "tcp.dst == 179" | wc -l], [0], [0 >>> +]) >>> + >>> + AT_CHECK([ovn-sbctl dump-flows ls | grep ls_in_check_port_sec | grep >>> -E "priority=80|priority=81" | wc -l], [0], [0 >>> +]) >>> +} >>> + >>> +# By default, no rules related to BGP redirect are present >>> +check_no_bgp_redirect >>> + >>> +# Set "lsp-bgp" port as target of BGP control plane redirected traffic >>> +check ovn-nbctl --wait=sb set logical_router_port lr-ls >>> options:bgp-redirect=lsp-bgp >>> + >>> +# Check that BGP control plane traffic is redirected to "lsp-bgp" >>> +AT_CHECK([ovn-sbctl dump-flows ls | grep ls_in_l2_lkup | grep "tcp.dst == >>> 179" | ovn_strip_lflows], [0], [dnl >>> + table=??(ls_in_l2_lkup ), priority=100 , match=(ip4.dst == >>> 172.16.1.1 && tcp.dst == 179), action=(outport = "lsp-bgp"; output;) >>> + table=??(ls_in_l2_lkup ), priority=100 , match=(ip6.dst == >>> fe80::ac:10ff:fe01:1 && tcp.dst == 179), action=(outport = "lsp-bgp"; >>> output;) >>> +]) >>> + >>> +# Check that only BGP-related traffic is accepted on "lsp-bgp" port >>> +AT_CHECK([ovn-sbctl dump-flows ls | grep ls_in_check_port_sec | grep -E >>> "priority=80|priority=81" | ovn_strip_lflows], [0], [dnl >>> + table=??(ls_in_check_port_sec), priority=80 , match=(inport == >>> "lsp-bgp"), action=(reg0[[15]] = 1; next;) >>> + table=??(ls_in_check_port_sec), priority=81 , match=(inport == >>> "lsp-bgp" && ip4.src == 172.16.1.1 && tcp.src == 179), action=(reg0[[15]] = >>> check_in_port_sec(); next;) >>> + table=??(ls_in_check_port_sec), priority=81 , match=(inport == >>> "lsp-bgp" && ip6.src == fe80::ac:10ff:fe01:1 && tcp.src == 179), >>> action=(reg0[[15]] = check_in_port_sec(); next;) >>> +]) >>> + >>> +# Remove 'bgp-redirect' option from LRP and check that rules are removed >>> +check ovn-nbctl --wait=sb remove logical_router_port lr-ls options >>> bgp-redirect >>> +check_no_bgp_redirect >>> + >>> +# Set non-existent LSP as target of 'bgp-redirect' and check that no rules >>> are added >>> +check ovn-nbctl --wait=sb set logical_router_port lr-ls >>> options:bgp-redirect=lsp-foo >>> +check_no_bgp_redirect >>> + >>> +AT_CLEANUP >>> +]) >>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at >>> index c24ede7c5..c09c1136e 100644 >>> --- a/tests/system-ovn.at >>> +++ b/tests/system-ovn.at >>> @@ -13027,3 +13027,89 @@ OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port >>> patch-.*/d >>> /connection dropped.*/d"]) >>> AT_CLEANUP >>> ]) >>> + >>> +OVN_FOR_EACH_NORTHD([ >>> +AT_SETUP([BGP control plane redirect]) >> Nit: probably best to add AT_SKIP_IF([test $HAVE_NC = no]) here. >> >>> +ovn_start >>> +OVS_TRAFFIC_VSWITCHD_START() >>> + >>> +ADD_BR([br-int]) >>> +ADD_BR([br-ext]) >>> + >>> +ovs-ofctl add-flow br-ext action=normal >>> +# Set external-ids in br-int needed for ovn-controller >>> +ovs-vsctl \ >>> + -- set Open_vSwitch . external-ids:system-id=hv1 \ >>> + -- set Open_vSwitch . >>> external-ids:ovn-remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \ >>> + -- set Open_vSwitch . external-ids:ovn-encap-type=geneve \ >>> + -- set Open_vSwitch . external-ids:ovn-encap-ip=169.0.0.1 \ >>> + -- set bridge br-int fail-mode=secure >>> other-config:disable-in-band=true >> Most ov*-ctl commands should be prefixed with "check" in this test. >> >>> + >>> +# Start ovn-controller >>> +start_daemon ovn-controller >>> + >>> +ovn-nbctl lr-add R1 \ >>> + -- set Logical_Router R1 options:chassis=hv1 >>> + >>> +ovn-nbctl ls-add public >>> + >>> +ovn-nbctl lrp-add R1 rp-public 00:00:02:01:02:03 172.16.1.1/24 >>> + >>> +ovn-nbctl lsp-add public public-rp -- set Logical_Switch_Port public-rp \ >>> + type=router options:router-port=rp-public \ >>> + -- lsp-set-addresses public-rp router >>> + >>> +ovn-nbctl lsp-add public bgp-daemon \ >>> + -- lsp-set-addresses bgp-daemon unknown >>> + >>> +AT_CHECK([ovs-vsctl set Open_vSwitch . >>> external-ids:ovn-bridge-mappings=phynet:br-ext]) >>> +ovn-nbctl lsp-add public public1 \ >>> + -- lsp-set-addresses public1 unknown \ >>> + -- lsp-set-type public1 localnet \ >>> + -- lsp-set-options public1 network_name=phynet >>> + >>> +ovn-nbctl --wait=hv sync >>> + >>> +# Set option that redirects BGP control plane traffic to a LSP "bgp-daemon" >>> +ovn-nbctl --wait=sb set logical_router_port rp-public >>> options:bgp-redirect=bgp-daemon >>> + >>> +# Create "bgp-daemon" interface in a namespace with IP and MAC matching >>> LRP "rp-public" >>> +ADD_NAMESPACES(bgp-daemon) >>> +ADD_VETH(bgp-daemon, bgp-daemon, br-int, "172.16.1.1/24", >>> "00:00:02:01:02:03") >>> + >>> +ADD_NAMESPACES(ext-foo) >>> +ADD_VETH(ext-foo, ext-foo, br-ext, "172.16.1.100/24", "00:10:10:01:02:13", >>> \ >>> + "172.16.1.1") >>> + >>> +# Flip the interface down/up to get proper IPv6 LLA >>> +NS_EXEC([bgp-daemon], [ip link set down bgp-daemon]) >>> +NS_EXEC([bgp-daemon], [ip link set up bgp-daemon]) >>> + >>> +# Wait until IPv6 LLA loses the "tentative" flag otherwise it can't be >>> bound to. >>> +OVS_WAIT_UNTIL([NS_EXEC([bgp-daemon], [ip a show dev bgp-daemon | grep >>> "fe80::" | grep -v tentative])]) >>> +OVS_WAIT_UNTIL([NS_EXEC([ext-foo], [ip a show dev ext-foo | grep "fe80::" >>> | grep -v tentative])]) >>> + >>> +# Verify that BGP control plane traffic is delivered to the "bgp-daemon" >>> +# interface on both IPv4 and IPv6 LLA addresses >>> +NETNS_DAEMONIZE([bgp-daemon], [nc -l -k 172.16.1.1 179], [bgp_v4.pid]) >>> +NS_CHECK_EXEC([ext-foo], [echo "TCP test" | nc --send-only 172.16.1.1 179]) >>> + >>> +NETNS_DAEMONIZE([bgp-daemon], [nc -l -6 -k >>> fe80::200:2ff:fe01:203%bgp-daemon 179], [bgp_v6.pid]) >>> +NS_CHECK_EXEC([ext-foo], [echo "TCP test" | nc --send-only -6 >>> fe80::200:2ff:fe01:203%ext-foo 179]) >>> + >>> +OVS_APP_EXIT_AND_WAIT([ovn-controller]) >>> + >>> +as ovn-sb >>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) >>> + >>> +as ovn-nb >>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server]) >>> + >>> +as northd >>> +OVS_APP_EXIT_AND_WAIT([ovn-northd]) >>> + >>> +as >>> +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d >>> +/.*terminating with signal 15.*/d"]) >>> +AT_CLEANUP >>> +]) >> Regards, >> Dumitru >> >> _______________________________________________ >> dev mailing list >> [email protected] >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev _______________________________________________ dev mailing list [email protected] https://mail.openvswitch.org/mailman/listinfo/ovs-dev
