On 8/8/24 13:51, Vladislav Odintsov wrote:
>
> On 08.08.2024 18:45, Dumitru Ceara wrote:
>> On 8/8/24 11:14, Vladislav Odintsov wrote:
>>> On 08.08.2024 15:51, [email protected] wrote:
>>>> On Thu, 2024-08-08 at 10:46 +0200, Dumitru Ceara wrote:
>>>>> On 8/8/24 10:42, Vladislav Odintsov wrote:
>>>>>> On 08.08.2024 03:16, Dumitru Ceara wrote:
>>>>>>> Re-adding the dev list.
>>>>>>>
>>>>>>> On 8/7/24 18:12, Vladislav Odintsov wrote:
>>>>>>>> Hi Dumitru,
>>>>>>>>
>>>>>>> Hi Vladislav,
>>>>>>>
>>>>>>>> I'd like to add some thoughts to your inputs.
>>>>>>>>
>>>>>>> Thanks for that, I added some more of my own below. :)
>>>>>>>
>>>>>>>> On 06.08.2024 19:23, Dumitru Ceara wrote:
>>>>>>>>> Hi Martin,
>>>>>>>>>
>>>>>>>>> Sorry, I was reviewing V3 but I was slow in actually sending
>>>>>>>>> out the email.
>>>>>>>>>
>>>>>>>>> On 8/6/24 13:17, Martin Kalcok wrote:
>>>>>>>>>> This change adds a 'bgp-redirect' option to LRP that allows
>>>>>>>>>> redirecting of BGP control plane traffic to an arbitrary
>>>>>>>>>> LSP
>>>>>>>>>> in its peer LS.
>>>>>>>>>>
>>>>>>>>>> The option expects a string with a LSP name. When set,
>>>>>>>>>> any traffic entering LS that's destined for any of the
>>>>>>>>>> LRP's IP addresses (including IPv6 LLA) is redirected
>>>>>>>>>> to the LSP specified in the option's value.
>>>>>>>>>>
>>>>>>>>>> This enables external BGP daemons to listen on an interface
>>>>>>>>>> bound to a LSP and effectively act as if they were
>>>>>>>>>> listening
>>>>>>>>>> on (and speaking from) LRP's IP address.
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Martin Kalcok <[email protected]>
>>>>>>>>>> ---
>>>>>>>>> Strictly on this version of the patch, and with the thought
>>>>>>>>> in mind that
>>>>>>>>> we might want to consider this feature experimental [0] and
>>>>>>>>> maybe change
>>>>>>>>> it (NB too) in the future, I left a few comments inline.
>>>>>>>>> With those
>>>>>>>>> addressed I think the patch looks OK to me.
>>>>>>>>>
>>>>>>>>> [0]
>>>>>>>>> https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/
>>>>>>>>>
>>>>>>>>> Now, for the point when we'd remove the "experimental" tag:
>>>>>>>>>
>>>>>>>>> In general, it makes me a bit uneasy that we have to share
>>>>>>>>> the MAC and
>>>>>>>>> IP between the LRP and the VIF of logical switch port that's
>>>>>>>>> connected
>>>>>>>>> to the same switch as the LRP.
>>>>>>>>>
>>>>>>>>> I was thinking of alternatives for the future and the only
>>>>>>>>> things I
>>>>>>>>> could come up with until now are:
>>>>>>>>>
>>>>>>>>> 1. Create a separate, "disconnected" logical switch:
>>>>>>>>> ovn-nbctl ls-add bgp
>>>>>>>>>
>>>>>>>>> Add the bgp-daemon port to it:
>>>>>>>>> ovn-nbctl lsp-add bgp bgp-daemon.
>>>>>>>>>
>>>>>>>>> Then we don't need "unknown" either, I think.
>>>>>>>>>
>>>>>>>>> But it's not possible today to move packets from the ingress
>>>>>>>>> pipeline of
>>>>>>>>> a logical switch ("public" in this test) to the egress
>>>>>>>>> pipeline of a
>>>>>>>>> different logical switch ("bgp" in my example). It also
>>>>>>>>> feels icky to
>>>>>>>>> implement that.
>>>>>>>>>
>>>>>>>>> 2. Add the ability to bind an OVS port directly to a logical
>>>>>>>>> router port:
>>>>>>>>>
>>>>>>>>> then we could do the same type of redirect you do here but
>>>>>>>>> instead in
>>>>>>>>> the logical router pipeline. The advantage is we don't have
>>>>>>>>> to drop any
>>>>>>>>> non-bgp traffic in the switch pipeline. The switch keeps
>>>>>>>>> functioning as
>>>>>>>>> it does today.
>>>>>>>>>
>>>>>>>>> Maybe an advantage of this second alternative would be that
>>>>>>>>> we can
>>>>>>>>> easily attach a filtering option to this LRP (e.g.,
>>>>>>>>> LRP.options:control-traffic-filter) to allow us to be more
>>>>>>>>> flexible in
>>>>>>>>> what kind of traffic we forward to the actuall routing
>>>>>>>>> protocol daemon
>>>>>>>>> that runs behind that OVS port - Vladislav also mentioned
>>>>>>>>> during the
>>>>>>>>> meeting it might be interesting to forward BFD packets to the
>>>>>>>>> FRR (or
>>>>>>>>> whatever application implements the routing protocol)
>>>>>>>>> instance too.
>>>>>>>> The idea to be able to bind LRP to OVS port sounds very
>>>>>>>> interesting to
>>>>>>>> me. But with a note that it's not a pure "bind", but a partial:
>>>>>>>> as you
>>>>>>>> wrote with some "filter" applied to pass not all the traffic.
>>>>>>>> The main
>>>>>>>> idea here is to pass only control plane traffic destined to
>>>>>>>> LRP's
>>>>>>> As we're discussing on the other thread (Martin pointed it out)
>>>>>>> we also
>>>>>>> probably need to involve conntrack and allow replies to any kind
>>>>>>> of
>>>>>>> traffic initiated by the entity behind the LRP's VIF (e.g., BGP
>>>>>>> sessions
>>>>>>> initiated from the host).
>>>>>>>
>>>>>>>> addresses. Other than that seems odd. Transit traffic should be
>>>>>>>> remained
>>>>>>>> in LR pipeline otherwise it will have no difference with a
>>>>>>>> regular VIF LSP.
>>>>>>>>
>>>>>>> Definitely, traffic that needs to be DNATed (DNAT/unSNAT rules or
>>>>>>> LB
>>>>>>> rules that will change the destination IP from LRP IP to
>>>>>>> something else)
>>>>>>> should not be affected. All other "transit" traffic doesn't have
>>>>>>> LRP
>>>>>>> IP, does it?
>>>>>> You're right.
>>>>>>
>>>>>> @Dumitru, @Martin, what if we just redirect all traffic destined to
>>>>>> LRP
>>>>>> IPs to redirect port? Is there any drawbacks?
>>>>>> For security it is possible to optionally use ACLs with or without
>>>>>> conntrack. It's up to user/CMS.
>>>>>>
>>>>>> This seems quite simple in the code and very flexible. So no
>>>>>> additional
>>>>>> flows seems to be needed in future to support any other routing
>>>>>> protocols (or for another possible non-routing usecases).
>>>>>>
>>>>> Won't this break all "transit" traffic that has destination IP the
>>>>> LRP
>>>>> IP (DNAT, LB), etc?
>>>>>
>>>> I'm afraid so. I guess I was just tired yesterday when I made that
>>>> proposal for general redirect. Given that the redirect is implemented
>>>> in the LS pipeline, it would eat up all the traffic for LRP's IP before
>>>> the DNAT/unSNAT occurs in the LR pipeline. I'll give it a quick test in
>>>> case I'm wrong here again.
>>> Oops, this will definitely break that cases. For some reason I thought
>>> only about routing cases.
>>>
>>>> If that's the case, I'll try the conntrack approach and update you all
>>>> asap.
>>>>
>>>> Martin.
>>>>
>>>>>>>> Having a filter (or match like in ACL or
>>>>>>>> Logical_Router_Policies) looks
>>>>>>>> more flexible in terms of used protocols. This can coexist with
>>>>>>>> proposal
>>>>>>>> from current patch, where the flow is not programmable from
>>>>>>>> user.
>>>>>>>>
>>>>>>> I think so too.
>>>>>>> Regards,
>>>>>>> Dumitru
>>>>>>>
>>>>>>>>> But again, if we consider the current feature "experimental",
>>>>>>>>> we can
>>>>>>>>> spend more time during the next release cycle to figure out
>>>>>>>>> what's best.
>>>>>>>>>
>>>>>>>>>> v4 of this patch renames the feature from "bgp-mirror" to
>>>>>>>>>> "bgp-redirect"
>>>>>>>>>> as discussed during community meeting.
>>>>>>>>>>
>>>>>>>>>> northd/northd.c | 108
>>>>>>>>>> ++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>> northd/ovn-northd.8.xml | 23 +++++++++
>>>>>>>>>> ovn-nb.xml | 14 ++++++
>>>>>>>>>> tests/ovn-northd.at | 58 +++++++++++++++++++++
>>>>>>>>>> tests/system-ovn.at | 86
>>>>>>>>>> ++++++++++++++++++++++++++++++++
>>>>>>>>>> 5 files changed, 289 insertions(+)
>>>>>>>>>>
>>>>>>>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>>>>>>>> index 4353df07d..088104f25 100644
>>>>>>>>>> --- a/northd/northd.c
>>>>>>>>>> +++ b/northd/northd.c
>>>>>>>>>> @@ -13048,6 +13048,113 @@
>>>>>>>>>> build_arp_resolve_flows_for_lrp(struct ovn_port *op,
>>>>>>>>>> }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> +static void
>>>>>>>>>> +build_bgp_redirect_rule__(
>>>>>>>>>> + const char *s_addr, const char *bgp_port_name,
>>>>>>>>>> bool is_ipv6,
>>>>>>>>>> + struct ovn_port *ls_peer, struct lflow_table
>>>>>>>>>> *lflows,
>>>>>>>>>> + struct ds *match, struct ds *actions)
>>>>>>>>>> +{
>>>>>>>>>> + int ip_ver = is_ipv6 ? 6 : 4;
>>>>>>>>>> + /* Redirect packets in the input pipeline destined for
>>>>>>>>>> LR's IP to
>>>>>>>>>> + * the port specified in 'bgp-redirect' option.
>>>>>>>>>> + */
>>>>>>>>>> + ds_clear(match);
>>>>>>>>>> + ds_clear(actions);
>>>>>>>>>> + ds_put_format(match, "ip%d.dst == %s && tcp.dst ==
>>>>>>>>>> 179", ip_ver, s_addr);
>>>>>>>>>> + ds_put_format(actions, "outport = \"%s\"; output;",
>>>>>>>>>> bgp_port_name);
>>>>>>>>>> + ovn_lflow_add(lflows, ls_peer->od,
>>>>>>>>>> S_SWITCH_IN_L2_LKUP, 100,
>>>>>>>>>> + ds_cstr(match),
>>>>>>>>>> + ds_cstr(actions),
>>>>>>>>>> + ls_peer->lflow_ref);
>>>>>>>>>> +
>>>>>>>>>> +
>>>>>>>>>> + /* Drop any traffic originating from 'bgp-redirect'
>>>>>>>>>> port that does
>>>>>>>>>> + * not originate from BGP daemon port. This blocks
>>>>>>>>>> unnecessary
>>>>>>>>>> + * traffic like ARP broadcasts or IPv6 router
>>>>>>>>>> solicitation packets
>>>>>>>>>> + * from the dummy 'bgp-redirect' port.
>>>>>>>>>> + */
>>>>>>>>>> + ds_clear(match);
>>>>>>>>>> + ds_put_format(match, "inport == \"%s\"",
>>>>>>>>>> bgp_port_name);
>>>>>>>>>> + ovn_lflow_add(lflows, ls_peer->od,
>>>>>>>>>> S_SWITCH_IN_CHECK_PORT_SEC, 80,
>>>>>>>>>> + ds_cstr(match),
>>>>>>>>>> + REGBIT_PORT_SEC_DROP " = 1; next;",
>>>>>>>>>> + ls_peer->lflow_ref);
>>>>>>>>> Not a huge deal but this logical flow gets regenerated
>>>>>>>>> multiple times
>>>>>>>>> for each bgp_port_name (for each LRP address). In the end
>>>>>>>>> we'll only
>>>>>>>>> install one in the database so there's no impact on SB. But
>>>>>>>>> maybe we
>>>>>>>>> should consider moving this part out of this function.
>>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>> + ds_put_format(match,
>>>>>>>>>> + " && ip%d.src == %s && tcp.src == 179",
>>>>>>>>>> + ip_ver,
>>>>>>>>>> + s_addr);
>>>>>>>>>> + ovn_lflow_add(lflows, ls_peer->od,
>>>>>>>>>> S_SWITCH_IN_CHECK_PORT_SEC, 81,
>>>>>>>>>> + ds_cstr(match),
>>>>>>>>>> + REGBIT_PORT_SEC_DROP " =
>>>>>>>>>> check_in_port_sec(); next;",
>>>>>>>>>> + ls_peer->lflow_ref);
>>>>>>>>> In your system test we set 'addresses = "unknown"' for
>>>>>>>>> 'bgp_port_name'.
>>>>>>>>> And I guess we kind of have to do it like that all the time
>>>>>>>>> when using
>>>>>>>>> this feature. I'm trying to think of scenarios in which
>>>>>>>>> users set
>>>>>>>>> LSP.addresses = 'unknown' and LSP.port-security = '<some-
>>>>>>>>> addresses>'.
>>>>>>>>> I'm not sure I saw those until now but maybe I'm wrong.
>>>>>>>> I'm also interested whether this LSP port with unknown
>>>>>>>> addresses will
>>>>>>>> get all the broadcast traffic?
>>>>>>>>
>>>>>>>> @Martin, didn't you look for ARP behavior? If we have one LS
>>>>>>>> with three
>>>>>>>> LSPs:
>>>>>>>> - 1 router type LSP with MAC 00:00:00:00:00:01 and IP
>>>>>>>> 169.254.0.1/30
>>>>>>>> - one "vif" LSP with "unknown" addresses set and configured
>>>>>>>> underlying
>>>>>>>> OVS port with same MAC and IP set
>>>>>>>> - one localnet port for external connectivity.
>>>>>>>>
>>>>>>>> If some party from localnet side sends broadcast ARP request
>>>>>>>> "Who has
>>>>>>>> 169.254.0.1? Answer <...>", will this ARP request got flooded
>>>>>>>> to both
>>>>>>>> LRP and LSP with "unknown" addresses? If yes, will the sender
>>>>>>>> receive 2
>>>>>>>> ARP replies?
>>>>>>>>
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +static void
>>>>>>>>>> +build_lrouter_bgp_redirect(
>>>>>>>>>> + struct ovn_port *op, struct lflow_table *lflows,
>>>>>>>>>> + struct ds *match, struct ds *actions)
>>>>>>>>>> +{
>>>>>>>>>> + /* LRP has to have a peer.*/
>>>>>>>>>> + if (op->peer == NULL) {
>>>>>>>>>> + return;
>>>>>>>>>> + }
>>>>>>>>>> + /* LRP has to have NB record.*/
>>>>>>>>>> + if (op->nbrp == NULL) {
>>>>>>>>>> + return;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + /* Proceed only for LRPs that have 'bgp-redirect'
>>>>>>>>>> option set. Value of this
>>>>>>>>>> + * option is the name of LSP to which a BGP traffic
>>>>>>>>>> will be redirected.
>>>>>>>>>> + */
>>>>>>>>>> + const char *bgp_port = smap_get(&op->nbrp->options,
>>>>>>>>>> "bgp-redirect");
>>>>>>>>>> + if (bgp_port == NULL) {
>>>>>>>>>> + return;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + /* Ensure that LSP, to which the BGP traffic is
>>>>>>>>>> redirected, exists.*/
>>>>>>>>>> + struct ovn_port *peer_lsp;
>>>>>>>>>> + bool bgp_port_exists = false;
>>>>>>>>>> + HMAP_FOR_EACH (peer_lsp, dp_node, &op->peer->od-
>>>>>>>>>>> ports) {
>>>>>>>>>> + size_t peer_lsp_s = strlen(peer_lsp->key);
>>>>>>>>>> + if (peer_lsp_s == strlen(bgp_port)
>>>>>>>>>> + && !strncmp(peer_lsp->key, bgp_port,
>>>>>>>>>> peer_lsp_s)){
>>>>>>>>>> + bgp_port_exists = true;
>>>>>>>>>> + break;
>>>>>>>>>> + }
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + if (!bgp_port_exists) {
>>>>>>>>>> + static struct vlog_rate_limit rl =
>>>>>>>>>> VLOG_RATE_LIMIT_INIT(1, 5);
>>>>>>>>>> + VLOG_WARN_RL(&rl, "Option 'bgp-redirect' set on
>>>>>>>>>> Logical Router Port "
>>>>>>>>>> + "'%s' refers to non-existent
>>>>>>>>>> Logical Switch Port. "
>>>>>>>>>> + "BGP redirecting won't be
>>>>>>>>>> configured.",
>>>>>>>>>> + op->key);
>>>>>>>>>> + return;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + if (op->cr_port != NULL) {
>>>>>>>>>> + static struct vlog_rate_limit rl =
>>>>>>>>>> VLOG_RATE_LIMIT_INIT(1, 5);
>>>>>>>>>> + VLOG_WARN_RL(&rl, "Option 'bgp-redirect' is not
>>>>>>>>>> supported on"
>>>>>>>>>> + " Distributed Gateway Port
>>>>>>>>>> '%s'", op->key);
>>>>>>>>>> + return;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + /* Mirror traffic destined for LRP's IPs and default
>>>>>>>>>> BGP port
>>>>>>>>>> + * to the port defined in 'bgp-redirect' option.
>>>>>>>>>> + */
>>>>>>>>>> + for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs;
>>>>>>>>>> i++) {
>>>>>>>>>> + const char *ip_s = op-
>>>>>>>>>>> lrp_networks.ipv4_addrs[i].addr_s;
>>>>>>>>>> + build_bgp_redirect_rule__(ip_s, bgp_port, false,
>>>>>>>>>> op->peer, lflows,
>>>>>>>>>> + match, actions);
>>>>>>>>>> + }
>>> I'm thinking more and more that having a "match" expression in a
>>> "NB.Logical_Router_Port.options:redirect-match" option it a good choice.
>>>
>> Would the match be enforced only for redirected traffic (from outside to
>> bgp target LSP)?
> I guess I didn't get your question, sorry.
I was thinking of the case Martin brought up yesterday, with the session
initiated from inside.
If we have the following config (not sure if the option names will
change but I hope it's clear):
LRP {
.name = lrp
.networks = [{00:00:00:00:00:01, 1.1.1.1/24}]
.options = {
.redirect-port = 'bgp-daemon'
.redirect-match = 'tcp.dst == 179 || tcp.src == 179'
}
}
Then, in the switch pipeline in the L2_LOOKUP stage, we'll only force
the outport to be 'bgp-daemon' if "destination IP == 1.1.1.1 &&
'redirect-match'" is true. So, only traffic coming from "outside"
towards the router IP and that also matches the filter.
No additional restrictions.
Will this be enough though?
What if 'bgp-daemon' ARPs for 1.1.1.2 (an external host)? How does that
work? ARP reply is unicast and will have dmac that of the LRP.
We could include ARP and ND in the match too but then what about ARP
replies that actually should go to the router port?
I'm actually thinking more and more that we shouldn't redirect any
traffic like this, we should just bind the VIF directly to the LRP.
>>> LRP can have multiple IP addresses and enabling redirect for all of them
>>> could be implicit for the user and may contain security concerns. As an
>>> example: LR can have configured LB with tcp/179. If user configured
>>> peering IP-address (which can be in a link-local address range) and also
>>> added an IP address for Load Balancer in a `networks` list of a LRP.
>>> Enabling BGP redirect for all IPs of LRP will break load balancing for
>>> tcp/179. While it is not needed for load balancers to assign their VIP
>>> to networks, there is no validations, which prohibit doing so. It is
>> I think this is actually something that ovn-kubernetes does in their
>> deployments for cluster-IP services. And I suspect that today there's
>> nothing preventing users from creating cluster-IP services for TCP/179.
>
> So, this proofs that this is a possible problem if enabled redirect and
> if load balancer's IP is configured to be in networks list.
>
> In any case this doesn't prevent from current implementation to merged
> as "experimental". I'm just started discussion about a GA solution.
>
>>> still possible and can be difficult to catch such behavior for
>>> redirected traffic with Load Balancers.
>>>
>>>>>>>>>> + for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs;
>>>>>>>>>> i++) {
>>>>>>>>>> + const char *ip_s = op-
>>>>>>>>>>> lrp_networks.ipv6_addrs[i].addr_s;
>>>>>>>>>> + build_bgp_redirect_rule__(ip_s, bgp_port, true,
>>>>>>>>>> op->peer, lflows,
>>>>>>>>>> + match, actions);
>>>>>>>>>> + }
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> /* This function adds ARP resolve flows related to a LSP.
>>>>>>>>>> */
>>>>>>>>>> static void
>>>>>>>>>> build_arp_resolve_flows_for_lsp(
>>>>>>>>>> @@ -16003,6 +16110,7 @@
>>>>>>>>>> build_lswitch_and_lrouter_iterate_by_lrp(struct ovn_port
>>>>>>>>>> *op,
>>>>>>>>>> lsi->meter_groups, op-
>>>>>>>>>>> lflow_ref);
>>>>>>>>>> build_lrouter_icmp_packet_toobig_admin_flows(op, lsi-
>>>>>>>>>>> lflows, &lsi->match,
>>>>>>>>>> &lsi-
>>>>>>>>>>> actions, op->lflow_ref);
>>>>>>>>>> + build_lrouter_bgp_redirect(op, lsi->lflows, &lsi-
>>>>>>>>>>> match, &lsi->actions);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> static void *
>>>>>>>>>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-
>>>>>>>>>> northd.8.xml
>>>>>>>>>> index b06b09ac5..4cbdf8f74 100644
>>>>>>>>>> --- a/northd/ovn-northd.8.xml
>>>>>>>>>> +++ b/northd/ovn-northd.8.xml
>>>>>>>>>> @@ -284,6 +284,21 @@
>>>>>>>>>> dropped in the next stage.
>>>>>>>>>> </li>
>>>>>>>>>>
>>>>>>>>>> + <li>
>>>>>>>>>> + For each logical port that's defined as a target
>>>>>>>>>> of BGP redirecting
>>>>>>>>>> + (via <code>bgp-redirect</code> option set on
>>>>>>>>>> Logical Router Port), a
>>>>>>>>>> + filter is set in place that disallows any traffic
>>>>>>>>>> entering this port
>>>>>>>>>> + that does not originate from Logical Router Port's
>>>>>>>>>> IPs and default BGP
>>>>>>>>>> + port (TCP 179). This filtering is achieved by two
>>>>>>>>>> rules. First rule has
>>>>>>>>>> + priority 81, it matches on <code>inport ==
>>>>>>>>>> <var>BGP_MIRROR_PORT</var>
>>>>>>>>>> + && ip.src == <var>LRP_IP</var> &&
>>>>>>>>>> tcp.src == 179</code>
>>>>>>>>>> + and allows traffic further with
>>>>>>>>>> <code>REGBIT_PORT_SEC_DROP" =
>>>>>>>>>> + check_in_port_sec(); next;</code>. Second rule
>>>>>>>>>> with priority 80 matches
>>>>>>>>>> + the rest of the traffic from that port and sets
>>>>>>>>>> + <code>REGBIT_PORT_SEC_DROP" = 1; next;"</code> so
>>>>>>>>>> that the packets are
>>>>>>>>>> + dropped in the next stage.
>>>>>>>>>> + </li>
>>>>>>>>>> +
>>>>>>>>>> <li>
>>>>>>>>>> For each (enabled) vtep logical port, a priority
>>>>>>>>>> 70 flow is added which
>>>>>>>>>> matches on all packets and applies the action
>>>>>>>>>> @@ -1949,6 +1964,14 @@ output;
>>>>>>>>>> on the logical switch.
>>>>>>>>>> </li>
>>>>>>>>>>
>>>>>>>>>> + <li>
>>>>>>>>>> + For any logical port that's defined as a target of
>>>>>>>>>> BGP redirecting (via
>>>>>>>>>> + <code>bgp-redirect</code> option set on Logical
>>>>>>>>>> Router Port), a
>>>>>>>>>> + priority-100 flow is added that redirects traffic
>>>>>>>>>> for Logical Router
>>>>>>>>>> + Port IPs (including IPv6 LLA) and TCP port 179, to
>>>>>>>>>> the targeted
>>>>>>>>>> + logical port.
>>>>>>>>>> + </li>
>>>>>>>>>> +
>>>>>>>>>> <li>
>>>>>>>>>> Priority-90 flows for transit switches that
>>>>>>>>>> forward registered
>>>>>>>>>> IP multicast traffic to their corresponding
>>>>>>>>>> multicast group , which
>>>>>>>>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>>>>>>>>>> index 9552534f6..2d2f5c9e0 100644
>>>>>>>>>> --- a/ovn-nb.xml
>>>>>>>>>> +++ b/ovn-nb.xml
>>>>>>>>>> @@ -3440,6 +3440,20 @@ or
>>>>>>>>>> </p>
>>>>>>>>>> </column>
>>>>>>>>>>
>>>>>>>>>> + <column name="options" key="bgp-redirect"
>>>>>>>>>> type='{"type": "string"}'>
>>>>>>>>>> + <p>
>>>>>>>>>> + This option expects a name of a Logical Switch
>>>>>>>>>> Port that's present
>>>>>>>>>> + in the peer's Logical Switch. If set, it causes
>>>>>>>>>> any traffic
>>>>>>>>>> + that's destined for Logical Router Port's IP
>>>>>>>>>> addresses (including
>>>>>>>>>> + its IPv6 LLA) and the default BGP port (TCP
>>>>>>>>>> 179), to be redirected
>>>>>>>>>> + to the specified Logical Switch Port.
>>>>>>>>>> +
>>>>>>>>>> + This allows external BGP daemon to be bound to a
>>>>>>>>>> port in OVN's
>>>>>>>>>> + Logical Switch and act as if it was listening on
>>>>>>>>>> Logical Router
>>>>>>>>>> + Port's IP address.
>>>>>>>>>> + </p>
>>>>>>>>>> + </column>
>>>>>>>>>> +
>>>>>>>>>> <column name="options" key="gateway_mtu_bypass">
>>>>>>>>>> <p>
>>>>>>>>>> When configured, represents a match expression,
>>>>>>>>>> in the same
>>>>>>>>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>>>>>>>>> index a389d1988..df1e924ab 100644
>>>>>>>>>> --- a/tests/ovn-northd.at
>>>>>>>>>> +++ b/tests/ovn-northd.at
>>>>>>>>>> @@ -12721,3 +12721,61 @@ AT_CHECK([ovn-sbctl dump-flows lr
>>>>>>>>>> | grep lr_in_dnat | ovn_strip_lflows], [0], [d
>>>>>>>>>>
>>>>>>>>>> AT_CLEANUP
>>>>>>>>>> ])
>>>>>>>>>> +
>>>>>>>>>> +OVN_FOR_EACH_NORTHD_NO_HV([
>>>>>>>>>> +AT_SETUP([BGP control plane redirect])
>>>>>>>>>> +ovn_start
>>>>>>>>>> +
>>>>>>>>>> +check ovn-sbctl chassis-add hv1 geneve 127.0.0.1
>>>>>>>>>> +
>>>>>>>>>> +check ovn-nbctl lr-add lr -- \
>>>>>>>>>> + lrp-add lr lr-ls 02:ac:10:01:00:01 172.16.1.1/24
>>>>>>>>>> +check ovn-nbctl --wait=sb set logical_router lr
>>>>>>>>>> options:chassis=hv1
>>>>>>>>>> +
>>>>>>>>>> +check ovn-nbctl ls-add ls -- \
>>>>>>>>>> + lsp-add ls ls-lr -- \
>>>>>>>>>> + lsp-set-type ls-lr router -- \
>>>>>>>>>> + lsp-set-addresses ls-lr router -- \
>>>>>>>>>> + lsp-set-options ls-lr router-port=lr-ls
>>>>>>>>>> +
>>>>>>>>>> +check ovn-nbctl lsp-add ls lsp-bgp -- \
>>>>>>>>>> + lsp-set-addresses lsp-bgp unknown
>>>>>>>>>> +
>>>>>>>>>> +# Function that ensures that no BGP redirect ruels are
>>>>>>>>>> installed.
>>>>>>>>>> +check_no_bgp_redirect() {
>>>>>>>>>> + AT_CHECK([ovn-sbctl dump-flows ls | grep
>>>>>>>>>> ls_in_l2_lkup | grep "tcp.dst == 179" | wc -l], [0], [0
>>>>>>>>>> +])
>>>>>>>>>> +
>>>>>>>>>> + AT_CHECK([ovn-sbctl dump-flows ls | grep
>>>>>>>>>> ls_in_check_port_sec | grep -E "priority=80|priority=81" |
>>>>>>>>>> wc -l], [0], [0
>>>>>>>>>> +])
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>> +# By default, no rules related to BGP redirect are present
>>>>>>>>>> +check_no_bgp_redirect
>>>>>>>>>> +
>>>>>>>>>> +# Set "lsp-bgp" port as target of BGP control plane
>>>>>>>>>> redirected traffic
>>>>>>>>>> +check ovn-nbctl --wait=sb set logical_router_port lr-ls
>>>>>>>>>> options:bgp-redirect=lsp-bgp
>>>>>>>>>> +
>>>>>>>>>> +# Check that BGP control plane traffic is redirected to
>>>>>>>>>> "lsp-bgp"
>>>>>>>>>> +AT_CHECK([ovn-sbctl dump-flows ls | grep ls_in_l2_lkup |
>>>>>>>>>> grep "tcp.dst == 179" | ovn_strip_lflows], [0], [dnl
>>>>>>>>>> + table=??(ls_in_l2_lkup ), priority=100 ,
>>>>>>>>>> match=(ip4.dst == 172.16.1.1 && tcp.dst == 179),
>>>>>>>>>> action=(outport = "lsp-bgp"; output;)
>>>>>>>>>> + table=??(ls_in_l2_lkup ), priority=100 ,
>>>>>>>>>> match=(ip6.dst == fe80::ac:10ff:fe01:1 && tcp.dst == 179),
>>>>>>>>>> action=(outport = "lsp-bgp"; output;)
>>>>>>>>>> +])
>>>>>>>>>> +
>>>>>>>>>> +# Check that only BGP-related traffic is accepted on "lsp-
>>>>>>>>>> bgp" port
>>>>>>>>>> +AT_CHECK([ovn-sbctl dump-flows ls | grep
>>>>>>>>>> ls_in_check_port_sec | grep -E "priority=80|priority=81" |
>>>>>>>>>> ovn_strip_lflows], [0], [dnl
>>>>>>>>>> + table=??(ls_in_check_port_sec), priority=80 ,
>>>>>>>>>> match=(inport == "lsp-bgp"), action=(reg0[[15]] = 1; next;)
>>>>>>>>>> + table=??(ls_in_check_port_sec), priority=81 ,
>>>>>>>>>> match=(inport == "lsp-bgp" && ip4.src == 172.16.1.1 &&
>>>>>>>>>> tcp.src == 179), action=(reg0[[15]] = check_in_port_sec();
>>>>>>>>>> next;)
>>>>>>>>>> + table=??(ls_in_check_port_sec), priority=81 ,
>>>>>>>>>> match=(inport == "lsp-bgp" && ip6.src ==
>>>>>>>>>> fe80::ac:10ff:fe01:1 && tcp.src == 179), action=(reg0[[15]]
>>>>>>>>>> = check_in_port_sec(); next;)
>>>>>>>>>> +])
>>>>>>>>>> +
>>>>>>>>>> +# Remove 'bgp-redirect' option from LRP and check that
>>>>>>>>>> rules are removed
>>>>>>>>>> +check ovn-nbctl --wait=sb remove logical_router_port lr-ls
>>>>>>>>>> options bgp-redirect
>>>>>>>>>> +check_no_bgp_redirect
>>>>>>>>>> +
>>>>>>>>>> +# Set non-existent LSP as target of 'bgp-redirect' and
>>>>>>>>>> check that no rules are added
>>>>>>>>>> +check ovn-nbctl --wait=sb set logical_router_port lr-ls
>>>>>>>>>> options:bgp-redirect=lsp-foo
>>>>>>>>>> +check_no_bgp_redirect
>>>>>>>>>> +
>>>>>>>>>> +AT_CLEANUP
>>>>>>>>>> +])
>>>>>>>>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>>>>>>>>> index c24ede7c5..c09c1136e 100644
>>>>>>>>>> --- a/tests/system-ovn.at
>>>>>>>>>> +++ b/tests/system-ovn.at
>>>>>>>>>> @@ -13027,3 +13027,89 @@
>>>>>>>>>> OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-
>>>>>>>>>> .*/d
>>>>>>>>>> /connection dropped.*/d"])
>>>>>>>>>> AT_CLEANUP
>>>>>>>>>> ])
>>>>>>>>>> +
>>>>>>>>>> +OVN_FOR_EACH_NORTHD([
>>>>>>>>>> +AT_SETUP([BGP control plane redirect])
>>>>>>>>> Nit: probably best to add AT_SKIP_IF([test $HAVE_NC = no])
>>>>>>>>> here.
>>>>>>>>>
>>>>>>>>>> +ovn_start
>>>>>>>>>> +OVS_TRAFFIC_VSWITCHD_START()
>>>>>>>>>> +
>>>>>>>>>> +ADD_BR([br-int])
>>>>>>>>>> +ADD_BR([br-ext])
>>>>>>>>>> +
>>>>>>>>>> +ovs-ofctl add-flow br-ext action=normal
>>>>>>>>>> +# Set external-ids in br-int needed for ovn-controller
>>>>>>>>>> +ovs-vsctl \
>>>>>>>>>> + -- set Open_vSwitch . external-ids:system-id=hv1 \
>>>>>>>>>> + -- set Open_vSwitch . external-ids:ovn-
>>>>>>>>>> remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>>>>>>>>>> + -- set Open_vSwitch . external-ids:ovn-encap-
>>>>>>>>>> type=geneve \
>>>>>>>>>> + -- set Open_vSwitch . external-ids:ovn-encap-
>>>>>>>>>> ip=169.0.0.1 \
>>>>>>>>>> + -- set bridge br-int fail-mode=secure other-
>>>>>>>>>> config:disable-in-band=true
>>>>>>>>> Most ov*-ctl commands should be prefixed with "check" in this
>>>>>>>>> test.
>>>>>>>>>
>>>>>>>>>> +
>>>>>>>>>> +# Start ovn-controller
>>>>>>>>>> +start_daemon ovn-controller
>>>>>>>>>> +
>>>>>>>>>> +ovn-nbctl lr-add R1 \
>>>>>>>>>> + -- set Logical_Router R1 options:chassis=hv1
>>>>>>>>>> +
>>>>>>>>>> +ovn-nbctl ls-add public
>>>>>>>>>> +
>>>>>>>>>> +ovn-nbctl lrp-add R1 rp-public 00:00:02:01:02:03
>>>>>>>>>> 172.16.1.1/24
>>>>>>>>>> +
>>>>>>>>>> +ovn-nbctl lsp-add public public-rp -- set
>>>>>>>>>> Logical_Switch_Port public-rp \
>>>>>>>>>> + type=router options:router-port=rp-public \
>>>>>>>>>> + -- lsp-set-addresses public-rp router
>>>>>>>>>> +
>>>>>>>>>> +ovn-nbctl lsp-add public bgp-daemon \
>>>>>>>>>> + -- lsp-set-addresses bgp-daemon unknown
>>>>>>>>>> +
>>>>>>>>>> +AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-
>>>>>>>>>> bridge-mappings=phynet:br-ext])
>>>>>>>>>> +ovn-nbctl lsp-add public public1 \
>>>>>>>>>> + -- lsp-set-addresses public1 unknown \
>>>>>>>>>> + -- lsp-set-type public1 localnet \
>>>>>>>>>> + -- lsp-set-options public1 network_name=phynet
>>>>>>>>>> +
>>>>>>>>>> +ovn-nbctl --wait=hv sync
>>>>>>>>>> +
>>>>>>>>>> +# Set option that redirects BGP control plane traffic to a
>>>>>>>>>> LSP "bgp-daemon"
>>>>>>>>>> +ovn-nbctl --wait=sb set logical_router_port rp-public
>>>>>>>>>> options:bgp-redirect=bgp-daemon
>>>>>>>>>> +
>>>>>>>>>> +# Create "bgp-daemon" interface in a namespace with IP and
>>>>>>>>>> MAC matching LRP "rp-public"
>>>>>>>>>> +ADD_NAMESPACES(bgp-daemon)
>>>>>>>>>> +ADD_VETH(bgp-daemon, bgp-daemon, br-int, "172.16.1.1/24",
>>>>>>>>>> "00:00:02:01:02:03")
>>>>>>>>>> +
>>>>>>>>>> +ADD_NAMESPACES(ext-foo)
>>>>>>>>>> +ADD_VETH(ext-foo, ext-foo, br-ext, "172.16.1.100/24",
>>>>>>>>>> "00:10:10:01:02:13", \
>>>>>>>>>> + "172.16.1.1")
>>>>>>>>>> +
>>>>>>>>>> +# Flip the interface down/up to get proper IPv6 LLA
>>>>>>>>>> +NS_EXEC([bgp-daemon], [ip link set down bgp-daemon])
>>>>>>>>>> +NS_EXEC([bgp-daemon], [ip link set up bgp-daemon])
>>>>>>>>>> +
>>>>>>>>>> +# Wait until IPv6 LLA loses the "tentative" flag otherwise
>>>>>>>>>> it can't be bound to.
>>>>>>>>>> +OVS_WAIT_UNTIL([NS_EXEC([bgp-daemon], [ip a show dev bgp-
>>>>>>>>>> daemon | grep "fe80::" | grep -v tentative])])
>>>>>>>>>> +OVS_WAIT_UNTIL([NS_EXEC([ext-foo], [ip a show dev ext-foo
>>>>>>>>>> | grep "fe80::" | grep -v tentative])])
>>>>>>>>>> +
>>>>>>>>>> +# Verify that BGP control plane traffic is delivered to
>>>>>>>>>> the "bgp-daemon"
>>>>>>>>>> +# interface on both IPv4 and IPv6 LLA addresses
>>>>>>>>>> +NETNS_DAEMONIZE([bgp-daemon], [nc -l -k 172.16.1.1 179],
>>>>>>>>>> [bgp_v4.pid])
>>>>>>>>>> +NS_CHECK_EXEC([ext-foo], [echo "TCP test" | nc --send-only
>>>>>>>>>> 172.16.1.1 179])
>>>>>>>>>> +
>>>>>>>>>> +NETNS_DAEMONIZE([bgp-daemon], [nc -l -6 -k
>>>>>>>>>> fe80::200:2ff:fe01:203%bgp-daemon 179], [bgp_v6.pid])
>>>>>>>>>> +NS_CHECK_EXEC([ext-foo], [echo "TCP test" | nc --send-only
>>>>>>>>>> -6 fe80::200:2ff:fe01:203%ext-foo 179])
>>>>>>>>>> +
>>>>>>>>>> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>>>>>>>>> +
>>>>>>>>>> +as ovn-sb
>>>>>>>>>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>>>>>>>>> +
>>>>>>>>>> +as ovn-nb
>>>>>>>>>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>>>>>>>>> +
>>>>>>>>>> +as northd
>>>>>>>>>> +OVS_APP_EXIT_AND_WAIT([ovn-northd])
>>>>>>>>>> +
>>>>>>>>>> +as
>>>>>>>>>> +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
>>>>>>>>>> +/.*terminating with signal 15.*/d"])
>>>>>>>>>> +AT_CLEANUP
>>>>>>>>>> +])
>>>>>>>>> Regards,
>>>>>>>>> Dumitru
>>>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> dev mailing list
>>>>>>> [email protected]
>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>> _______________________________________________
>>>> dev mailing list
>>>> [email protected]
>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>> _______________________________________________
>> dev mailing list
>> [email protected]
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev