On 8/8/24 14:04, Dumitru Ceara wrote:
> On 8/8/24 13:51, Vladislav Odintsov wrote:
>>
>> On 08.08.2024 18:45, Dumitru Ceara wrote:
>>> On 8/8/24 11:14, Vladislav Odintsov wrote:
>>>> On 08.08.2024 15:51, [email protected] wrote:
>>>>> On Thu, 2024-08-08 at 10:46 +0200, Dumitru Ceara wrote:
>>>>>> On 8/8/24 10:42, Vladislav Odintsov wrote:
>>>>>>> On 08.08.2024 03:16, Dumitru Ceara wrote:
>>>>>>>> Re-adding the dev list.
>>>>>>>>
>>>>>>>> On 8/7/24 18:12, Vladislav Odintsov wrote:
>>>>>>>>> Hi Dumitru,
>>>>>>>>>
>>>>>>>> Hi Vladislav,
>>>>>>>>
>>>>>>>>> I'd like to add some thoughts to your inputs.
>>>>>>>>>
>>>>>>>> Thanks for that, I added some more of my own below. :)
>>>>>>>>
>>>>>>>>> On 06.08.2024 19:23, Dumitru Ceara wrote:
>>>>>>>>>> Hi Martin,
>>>>>>>>>>
>>>>>>>>>> Sorry, I was reviewing V3 but I was slow in actually sending
>>>>>>>>>> out the email.
>>>>>>>>>>
>>>>>>>>>> On 8/6/24 13:17, Martin Kalcok wrote:
>>>>>>>>>>> This change adds a 'bgp-redirect' option to LRP that allows
>>>>>>>>>>> redirecting of BGP control plane traffic to an arbitrary
>>>>>>>>>>> LSP
>>>>>>>>>>> in its peer LS.
>>>>>>>>>>>
>>>>>>>>>>> The option expects a string with a LSP name. When set,
>>>>>>>>>>> any traffic entering LS that's destined for any of the
>>>>>>>>>>> LRP's IP addresses (including IPv6 LLA) is redirected
>>>>>>>>>>> to the LSP specified in the option's value.
>>>>>>>>>>>
>>>>>>>>>>> This enables external BGP daemons to listen on an interface
>>>>>>>>>>> bound to a LSP and effectively act as if they were
>>>>>>>>>>> listening
>>>>>>>>>>> on (and speaking from) LRP's IP address.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Martin Kalcok <[email protected]>
>>>>>>>>>>> ---
>>>>>>>>>> Strictly on this version of the patch, and with the thought
>>>>>>>>>> in mind that
>>>>>>>>>> we might want to consider this feature experimental [0] and
>>>>>>>>>> maybe change
>>>>>>>>>> it (NB too) in the future, I left a few comments inline. 
>>>>>>>>>> With those
>>>>>>>>>> addressed I think the patch looks OK to me.
>>>>>>>>>>
>>>>>>>>>> [0]
>>>>>>>>>> https://patchwork.ozlabs.org/project/ovn/patch/[email protected]/
>>>>>>>>>>
>>>>>>>>>> Now, for the point when we'd remove the "experimental" tag:
>>>>>>>>>>
>>>>>>>>>> In general, it makes me a bit uneasy that we have to share
>>>>>>>>>> the MAC and
>>>>>>>>>> IP between the LRP and the VIF of logical switch port that's
>>>>>>>>>> connected
>>>>>>>>>> to the same switch as the LRP.
>>>>>>>>>>
>>>>>>>>>> I was thinking of alternatives for the future and the only
>>>>>>>>>> things I
>>>>>>>>>> could come up with until now are:
>>>>>>>>>>
>>>>>>>>>> 1. Create a separate, "disconnected" logical switch:
>>>>>>>>>> ovn-nbctl ls-add bgp
>>>>>>>>>>
>>>>>>>>>> Add the bgp-daemon port to it:
>>>>>>>>>> ovn-nbctl lsp-add bgp bgp-daemon.
>>>>>>>>>>
>>>>>>>>>> Then we don't need "unknown" either, I think.
>>>>>>>>>>
>>>>>>>>>> But it's not possible today to move packets from the ingress
>>>>>>>>>> pipeline of
>>>>>>>>>> a logical switch ("public" in this test) to the egress
>>>>>>>>>> pipeline of a
>>>>>>>>>> different logical switch ("bgp" in my example).  It also
>>>>>>>>>> feels icky to
>>>>>>>>>> implement that.
>>>>>>>>>>
>>>>>>>>>> 2. Add the ability to bind an OVS port directly to a logical
>>>>>>>>>> router port:
>>>>>>>>>>
>>>>>>>>>> then we could do the same type of redirect you do here but
>>>>>>>>>> instead in
>>>>>>>>>> the logical router pipeline.  The advantage is we don't have
>>>>>>>>>> to drop any
>>>>>>>>>> non-bgp traffic in the switch pipeline.  The switch keeps
>>>>>>>>>> functioning as
>>>>>>>>>> it does today.
>>>>>>>>>>
>>>>>>>>>> Maybe an advantage of this second alternative would be that
>>>>>>>>>> we can
>>>>>>>>>> easily attach a filtering option to this LRP (e.g.,
>>>>>>>>>> LRP.options:control-traffic-filter) to allow us to be more
>>>>>>>>>> flexible in
>>>>>>>>>> what kind of traffic we forward to the actuall routing
>>>>>>>>>> protocol daemon
>>>>>>>>>> that runs behind that OVS port - Vladislav also mentioned
>>>>>>>>>> during the
>>>>>>>>>> meeting it might be interesting to forward BFD packets to the
>>>>>>>>>> FRR (or
>>>>>>>>>> whatever application implements the routing protocol)
>>>>>>>>>> instance too.
>>>>>>>>> The idea to be able to bind LRP to OVS port sounds very
>>>>>>>>> interesting to
>>>>>>>>> me. But with a note that it's not a pure "bind", but a partial:
>>>>>>>>> as you
>>>>>>>>> wrote with some "filter" applied to pass not all the traffic.
>>>>>>>>> The main
>>>>>>>>> idea here is to pass only control plane traffic destined to
>>>>>>>>> LRP's
>>>>>>>> As we're discussing on the other thread (Martin pointed it out)
>>>>>>>> we also
>>>>>>>> probably need to involve conntrack and allow replies to any kind
>>>>>>>> of
>>>>>>>> traffic initiated by the entity behind the LRP's VIF (e.g., BGP
>>>>>>>> sessions
>>>>>>>> initiated from the host).
>>>>>>>>
>>>>>>>>> addresses. Other than that seems odd. Transit traffic should be
>>>>>>>>> remained
>>>>>>>>> in LR pipeline otherwise it will have no difference with a
>>>>>>>>> regular VIF LSP.
>>>>>>>>>
>>>>>>>> Definitely, traffic that needs to be DNATed (DNAT/unSNAT rules or
>>>>>>>> LB
>>>>>>>> rules that will change the destination IP from LRP IP to
>>>>>>>> something else)
>>>>>>>> should not be affected.  All other "transit" traffic doesn't have
>>>>>>>> LRP
>>>>>>>> IP, does it?
>>>>>>> You're right.
>>>>>>>
>>>>>>> @Dumitru, @Martin, what if we just redirect all traffic destined to
>>>>>>> LRP
>>>>>>> IPs to redirect port? Is there any drawbacks?
>>>>>>> For security it is possible to optionally use ACLs with or without
>>>>>>> conntrack. It's up to user/CMS.
>>>>>>>
>>>>>>> This seems quite simple in the code and very flexible. So no
>>>>>>> additional
>>>>>>> flows seems to be needed in future to support any other routing
>>>>>>> protocols (or for another possible non-routing usecases).
>>>>>>>
>>>>>> Won't this break all "transit" traffic that has destination IP the
>>>>>> LRP
>>>>>> IP (DNAT, LB), etc?
>>>>>>
>>>>> I'm afraid so. I guess I was just tired yesterday when I made that
>>>>> proposal for general redirect. Given that the redirect is implemented
>>>>> in the LS pipeline, it would eat up all the traffic for LRP's IP before
>>>>> the DNAT/unSNAT occurs in the LR pipeline. I'll give it a quick test in
>>>>> case I'm wrong here again.
>>>> Oops, this will definitely break that cases. For some reason I thought
>>>> only about routing cases.
>>>>
>>>>> If that's the case, I'll try the conntrack approach and update you all
>>>>> asap.
>>>>>
>>>>> Martin.
>>>>>
>>>>>>>>> Having a filter (or match like in ACL or
>>>>>>>>> Logical_Router_Policies) looks
>>>>>>>>> more flexible in terms of used protocols. This can coexist with
>>>>>>>>> proposal
>>>>>>>>> from current patch, where the flow is not programmable from
>>>>>>>>> user.
>>>>>>>>>
>>>>>>>> I think so too.
>>>>>>>> Regards,
>>>>>>>> Dumitru
>>>>>>>>
>>>>>>>>>> But again, if we consider the current feature "experimental",
>>>>>>>>>> we can
>>>>>>>>>> spend more time during the next release cycle to figure out
>>>>>>>>>> what's best.
>>>>>>>>>>
>>>>>>>>>>> v4 of this patch renames the feature from "bgp-mirror" to
>>>>>>>>>>> "bgp-redirect"
>>>>>>>>>>> as discussed during community meeting.
>>>>>>>>>>>
>>>>>>>>>>>  northd/northd.c         | 108
>>>>>>>>>>> ++++++++++++++++++++++++++++++++++++++++
>>>>>>>>>>>  northd/ovn-northd.8.xml |  23 +++++++++
>>>>>>>>>>>  ovn-nb.xml              |  14 ++++++
>>>>>>>>>>>  tests/ovn-northd.at     |  58 +++++++++++++++++++++
>>>>>>>>>>>  tests/system-ovn.at     |  86
>>>>>>>>>>> ++++++++++++++++++++++++++++++++
>>>>>>>>>>>  5 files changed, 289 insertions(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/northd/northd.c b/northd/northd.c
>>>>>>>>>>> index 4353df07d..088104f25 100644
>>>>>>>>>>> --- a/northd/northd.c
>>>>>>>>>>> +++ b/northd/northd.c
>>>>>>>>>>> @@ -13048,6 +13048,113 @@
>>>>>>>>>>> build_arp_resolve_flows_for_lrp(struct ovn_port *op,
>>>>>>>>>>>      }
>>>>>>>>>>>  }
>>>>>>>>>>>  
>>>>>>>>>>> +static void
>>>>>>>>>>> +build_bgp_redirect_rule__(
>>>>>>>>>>> +        const char *s_addr, const char *bgp_port_name,
>>>>>>>>>>> bool is_ipv6,
>>>>>>>>>>> +        struct ovn_port *ls_peer, struct lflow_table
>>>>>>>>>>> *lflows,
>>>>>>>>>>> +        struct ds *match, struct ds *actions)
>>>>>>>>>>> +{
>>>>>>>>>>> +    int ip_ver = is_ipv6 ? 6 : 4;
>>>>>>>>>>> +    /* Redirect packets in the input pipeline destined for
>>>>>>>>>>> LR's IP to
>>>>>>>>>>> +     * the port specified in 'bgp-redirect' option.
>>>>>>>>>>> +     */
>>>>>>>>>>> +    ds_clear(match);
>>>>>>>>>>> +    ds_clear(actions);
>>>>>>>>>>> +    ds_put_format(match, "ip%d.dst == %s && tcp.dst ==
>>>>>>>>>>> 179", ip_ver, s_addr);
>>>>>>>>>>> +    ds_put_format(actions, "outport = \"%s\"; output;",
>>>>>>>>>>> bgp_port_name);
>>>>>>>>>>> +    ovn_lflow_add(lflows, ls_peer->od,
>>>>>>>>>>> S_SWITCH_IN_L2_LKUP, 100,
>>>>>>>>>>> +                  ds_cstr(match),
>>>>>>>>>>> +                  ds_cstr(actions),
>>>>>>>>>>> +                  ls_peer->lflow_ref);
>>>>>>>>>>> +
>>>>>>>>>>> +
>>>>>>>>>>> +    /* Drop any traffic originating from 'bgp-redirect'
>>>>>>>>>>> port that does
>>>>>>>>>>> +     * not originate from BGP daemon port. This blocks
>>>>>>>>>>> unnecessary
>>>>>>>>>>> +     * traffic like ARP broadcasts or IPv6 router
>>>>>>>>>>> solicitation packets
>>>>>>>>>>> +     * from the dummy 'bgp-redirect' port.
>>>>>>>>>>> +     */
>>>>>>>>>>> +    ds_clear(match);
>>>>>>>>>>> +    ds_put_format(match, "inport == \"%s\"",
>>>>>>>>>>> bgp_port_name);
>>>>>>>>>>> +    ovn_lflow_add(lflows, ls_peer->od,
>>>>>>>>>>> S_SWITCH_IN_CHECK_PORT_SEC, 80,
>>>>>>>>>>> +                  ds_cstr(match),
>>>>>>>>>>> +                  REGBIT_PORT_SEC_DROP " = 1; next;",
>>>>>>>>>>> +                  ls_peer->lflow_ref);
>>>>>>>>>> Not a huge deal but this logical flow gets regenerated
>>>>>>>>>> multiple times
>>>>>>>>>> for each bgp_port_name (for each LRP address).  In the end
>>>>>>>>>> we'll only
>>>>>>>>>> install one in the database so there's no impact on SB.  But
>>>>>>>>>> maybe we
>>>>>>>>>> should consider moving this part out of this function.
>>>>>>>>>>
>>>>>>>>>>> +
>>>>>>>>>>> +    ds_put_format(match,
>>>>>>>>>>> +                  " && ip%d.src == %s && tcp.src == 179",
>>>>>>>>>>> +                  ip_ver,
>>>>>>>>>>> +                  s_addr);
>>>>>>>>>>> +    ovn_lflow_add(lflows, ls_peer->od,
>>>>>>>>>>> S_SWITCH_IN_CHECK_PORT_SEC, 81,
>>>>>>>>>>> +                  ds_cstr(match),
>>>>>>>>>>> +                  REGBIT_PORT_SEC_DROP " =
>>>>>>>>>>> check_in_port_sec(); next;",
>>>>>>>>>>> +                  ls_peer->lflow_ref);
>>>>>>>>>> In your system test we set 'addresses = "unknown"' for
>>>>>>>>>> 'bgp_port_name'.
>>>>>>>>>>   And I guess we kind of have to do it like that all the time
>>>>>>>>>> when using
>>>>>>>>>> this feature.  I'm trying to think of scenarios in which
>>>>>>>>>> users set
>>>>>>>>>> LSP.addresses = 'unknown' and LSP.port-security = '<some-
>>>>>>>>>> addresses>'.
>>>>>>>>>> I'm not sure I saw those until now but maybe I'm wrong.
>>>>>>>>> I'm also interested whether this LSP port with unknown
>>>>>>>>> addresses will
>>>>>>>>> get all the broadcast traffic?
>>>>>>>>>
>>>>>>>>> @Martin, didn't you look for ARP behavior? If we have one LS
>>>>>>>>> with three
>>>>>>>>> LSPs:
>>>>>>>>> - 1 router type LSP with MAC 00:00:00:00:00:01 and IP
>>>>>>>>> 169.254.0.1/30
>>>>>>>>> - one "vif" LSP with "unknown" addresses set and configured
>>>>>>>>> underlying
>>>>>>>>> OVS port with same MAC and IP set
>>>>>>>>> - one localnet port for external connectivity.
>>>>>>>>>
>>>>>>>>> If some party from localnet side sends broadcast ARP request
>>>>>>>>> "Who has
>>>>>>>>> 169.254.0.1? Answer <...>", will this ARP request got flooded
>>>>>>>>> to both
>>>>>>>>> LRP and LSP with "unknown" addresses? If yes, will the sender
>>>>>>>>> receive 2
>>>>>>>>> ARP replies?
>>>>>>>>>
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +static void
>>>>>>>>>>> +build_lrouter_bgp_redirect(
>>>>>>>>>>> +        struct ovn_port *op, struct lflow_table *lflows,
>>>>>>>>>>> +        struct ds *match, struct ds *actions)
>>>>>>>>>>> +{
>>>>>>>>>>> +    /* LRP has to have a peer.*/
>>>>>>>>>>> +    if (op->peer == NULL) {
>>>>>>>>>>> +        return;
>>>>>>>>>>> +    }
>>>>>>>>>>> +    /* LRP has to have NB record.*/
>>>>>>>>>>> +    if (op->nbrp == NULL) {
>>>>>>>>>>> +        return;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    /* Proceed only for LRPs that have 'bgp-redirect'
>>>>>>>>>>> option set. Value of this
>>>>>>>>>>> +     * option is the name of LSP to which a BGP traffic
>>>>>>>>>>> will be redirected.
>>>>>>>>>>> +     */
>>>>>>>>>>> +    const char *bgp_port = smap_get(&op->nbrp->options,
>>>>>>>>>>> "bgp-redirect");
>>>>>>>>>>> +    if (bgp_port == NULL) {
>>>>>>>>>>> +        return;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    /* Ensure that LSP, to which the BGP traffic is
>>>>>>>>>>> redirected, exists.*/
>>>>>>>>>>> +    struct ovn_port *peer_lsp;
>>>>>>>>>>> +    bool bgp_port_exists = false;
>>>>>>>>>>> +    HMAP_FOR_EACH (peer_lsp, dp_node, &op->peer->od-
>>>>>>>>>>>> ports) {
>>>>>>>>>>> +        size_t peer_lsp_s = strlen(peer_lsp->key);
>>>>>>>>>>> +        if (peer_lsp_s == strlen(bgp_port)
>>>>>>>>>>> +            && !strncmp(peer_lsp->key, bgp_port,
>>>>>>>>>>> peer_lsp_s)){
>>>>>>>>>>> +            bgp_port_exists = true;
>>>>>>>>>>> +            break;
>>>>>>>>>>> +        }
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    if (!bgp_port_exists) {
>>>>>>>>>>> +        static struct vlog_rate_limit rl =
>>>>>>>>>>> VLOG_RATE_LIMIT_INIT(1, 5);
>>>>>>>>>>> +        VLOG_WARN_RL(&rl, "Option 'bgp-redirect' set on
>>>>>>>>>>> Logical Router Port "
>>>>>>>>>>> +                          "'%s' refers to non-existent
>>>>>>>>>>> Logical Switch Port. "
>>>>>>>>>>> +                          "BGP redirecting won't be
>>>>>>>>>>> configured.",
>>>>>>>>>>> +                          op->key);
>>>>>>>>>>> +        return;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    if (op->cr_port != NULL) {
>>>>>>>>>>> +        static struct vlog_rate_limit rl =
>>>>>>>>>>> VLOG_RATE_LIMIT_INIT(1, 5);
>>>>>>>>>>> +        VLOG_WARN_RL(&rl, "Option 'bgp-redirect' is not
>>>>>>>>>>> supported on"
>>>>>>>>>>> +                          " Distributed Gateway Port
>>>>>>>>>>> '%s'", op->key);
>>>>>>>>>>> +        return;
>>>>>>>>>>> +    }
>>>>>>>>>>> +
>>>>>>>>>>> +    /* Mirror traffic destined for LRP's IPs and default
>>>>>>>>>>> BGP port
>>>>>>>>>>> +     * to the port defined in 'bgp-redirect' option.
>>>>>>>>>>> +     */
>>>>>>>>>>> +    for (size_t i = 0; i < op->lrp_networks.n_ipv4_addrs;
>>>>>>>>>>> i++) {
>>>>>>>>>>> +        const char *ip_s = op-
>>>>>>>>>>>> lrp_networks.ipv4_addrs[i].addr_s;
>>>>>>>>>>> +        build_bgp_redirect_rule__(ip_s, bgp_port, false, 
>>>>>>>>>>> op->peer, lflows,
>>>>>>>>>>> +                                  match, actions);
>>>>>>>>>>> +    }
>>>> I'm thinking more and more that having a "match" expression in a
>>>> "NB.Logical_Router_Port.options:redirect-match" option it a good choice.
>>>>
>>> Would the match be enforced only for redirected traffic (from outside to
>>> bgp target LSP)?
>> I guess I didn't get your question, sorry.
> 
> I was thinking of the case Martin brought up yesterday, with the session
> initiated from inside.
> 
> If we have the following config (not sure if the option names will
> change but I hope it's clear):
> 
> LRP {
>   .name = lrp
>   .networks = [{00:00:00:00:00:01, 1.1.1.1/24}]
>   .options = {
>     .redirect-port = 'bgp-daemon'
>     .redirect-match = 'tcp.dst == 179 || tcp.src == 179'
>   }
> }
> 
> Then, in the switch pipeline in the L2_LOOKUP stage, we'll only force
> the outport to be 'bgp-daemon' if "destination IP == 1.1.1.1 &&
> 'redirect-match'" is true.  So, only traffic coming from "outside"
> towards the router IP and that also matches the filter.
> 
> No additional restrictions.
> 
> Will this be enough though?
> 
> What if 'bgp-daemon' ARPs for 1.1.1.2 (an external host)?  How does that
> work?  ARP reply is unicast and will have dmac that of the LRP.
> 
> We could include ARP and ND in the match too but then what about ARP
> replies that actually should go to the router port?
> 

Also, what about other control traffic, e.g.:

a. ICMP replies from external hosts for ICMP echo initiated by bgp-daemon

vs

b. ICMP replies from external hosts for ICMP echo initiated by something
else (a different VIF) behind the LR that got SNAT-ed to the LRP IP.

> I'm actually thinking more and more that we shouldn't redirect any
> traffic like this, we should just bind the VIF directly to the LRP.
> 
>>>> LRP can have multiple IP addresses and enabling redirect for all of them
>>>> could be implicit for the user and may contain security concerns. As an
>>>> example: LR can have configured LB with tcp/179. If user configured
>>>> peering IP-address (which can be in a link-local address range) and also
>>>> added an IP address for Load Balancer in a `networks` list of a LRP.
>>>> Enabling BGP redirect for all IPs of LRP will break load balancing for
>>>> tcp/179. While it is not needed for load balancers to assign their VIP
>>>> to networks, there is no validations, which prohibit doing so. It is
>>> I think this is actually something that ovn-kubernetes does in their
>>> deployments for cluster-IP services.  And I suspect that today there's
>>> nothing preventing users from creating cluster-IP services for TCP/179.
>>
>> So, this proofs that this is a possible problem if enabled redirect and
>> if load balancer's IP is configured to be in networks list.
>>
>> In any case this doesn't prevent from current implementation to merged
>> as "experimental". I'm just started discussion about a GA solution.
>>
>>>> still possible and can be difficult to catch such behavior for
>>>> redirected traffic with Load Balancers.
>>>>
>>>>>>>>>>> +    for (size_t i = 0; i < op->lrp_networks.n_ipv6_addrs;
>>>>>>>>>>> i++) {
>>>>>>>>>>> +        const char *ip_s = op-
>>>>>>>>>>>> lrp_networks.ipv6_addrs[i].addr_s;
>>>>>>>>>>> +        build_bgp_redirect_rule__(ip_s, bgp_port, true,
>>>>>>>>>>> op->peer, lflows,
>>>>>>>>>>> +                                  match, actions);
>>>>>>>>>>> +    }
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>>  /* This function adds ARP resolve flows related to a LSP.
>>>>>>>>>>> */
>>>>>>>>>>>  static void
>>>>>>>>>>>  build_arp_resolve_flows_for_lsp(
>>>>>>>>>>> @@ -16003,6 +16110,7 @@
>>>>>>>>>>> build_lswitch_and_lrouter_iterate_by_lrp(struct ovn_port
>>>>>>>>>>> *op,
>>>>>>>>>>>                                  lsi->meter_groups, op-
>>>>>>>>>>>> lflow_ref);
>>>>>>>>>>>      build_lrouter_icmp_packet_toobig_admin_flows(op, lsi-
>>>>>>>>>>>> lflows, &lsi->match,
>>>>>>>>>>>                                                   &lsi-
>>>>>>>>>>>> actions, op->lflow_ref);
>>>>>>>>>>> +    build_lrouter_bgp_redirect(op, lsi->lflows, &lsi-
>>>>>>>>>>>> match, &lsi->actions);
>>>>>>>>>>>  }
>>>>>>>>>>>  
>>>>>>>>>>>  static void *
>>>>>>>>>>> diff --git a/northd/ovn-northd.8.xml b/northd/ovn-
>>>>>>>>>>> northd.8.xml
>>>>>>>>>>> index b06b09ac5..4cbdf8f74 100644
>>>>>>>>>>> --- a/northd/ovn-northd.8.xml
>>>>>>>>>>> +++ b/northd/ovn-northd.8.xml
>>>>>>>>>>> @@ -284,6 +284,21 @@
>>>>>>>>>>>          dropped in the next stage.
>>>>>>>>>>>        </li>
>>>>>>>>>>>  
>>>>>>>>>>> +      <li>
>>>>>>>>>>> +        For each logical port that's defined as a target
>>>>>>>>>>> of BGP redirecting
>>>>>>>>>>> +        (via <code>bgp-redirect</code> option set on
>>>>>>>>>>> Logical Router Port), a
>>>>>>>>>>> +        filter is set in place that disallows any traffic
>>>>>>>>>>> entering this port
>>>>>>>>>>> +        that does not originate from Logical Router Port's
>>>>>>>>>>> IPs and default BGP
>>>>>>>>>>> +        port (TCP 179). This filtering is achieved by two
>>>>>>>>>>> rules. First rule has
>>>>>>>>>>> +        priority 81, it matches on <code>inport ==
>>>>>>>>>>> <var>BGP_MIRROR_PORT</var>
>>>>>>>>>>> +        &amp;&amp; ip.src == <var>LRP_IP</var> &amp;&amp;
>>>>>>>>>>> tcp.src == 179</code>
>>>>>>>>>>> +        and allows traffic further with
>>>>>>>>>>> <code>REGBIT_PORT_SEC_DROP" =
>>>>>>>>>>> +        check_in_port_sec(); next;</code>. Second rule
>>>>>>>>>>> with priority 80 matches
>>>>>>>>>>> +        the rest of the traffic from that port and sets
>>>>>>>>>>> +        <code>REGBIT_PORT_SEC_DROP" = 1; next;"</code> so
>>>>>>>>>>> that the packets are
>>>>>>>>>>> +        dropped in the next stage.
>>>>>>>>>>> +      </li>
>>>>>>>>>>> +
>>>>>>>>>>>        <li>
>>>>>>>>>>>          For each (enabled) vtep logical port, a priority
>>>>>>>>>>> 70 flow is added which
>>>>>>>>>>>          matches on all packets and applies the action
>>>>>>>>>>> @@ -1949,6 +1964,14 @@ output;
>>>>>>>>>>>          on the logical switch.
>>>>>>>>>>>        </li>
>>>>>>>>>>>  
>>>>>>>>>>> +      <li>
>>>>>>>>>>> +        For any logical port that's defined as a target of
>>>>>>>>>>> BGP redirecting (via
>>>>>>>>>>> +        <code>bgp-redirect</code> option set on Logical
>>>>>>>>>>> Router Port), a
>>>>>>>>>>> +        priority-100 flow is added that redirects traffic
>>>>>>>>>>> for Logical Router
>>>>>>>>>>> +        Port IPs (including IPv6 LLA) and TCP port 179, to
>>>>>>>>>>> the targeted
>>>>>>>>>>> +        logical port.
>>>>>>>>>>> +      </li>
>>>>>>>>>>> +
>>>>>>>>>>>        <li>
>>>>>>>>>>>          Priority-90 flows for transit switches that
>>>>>>>>>>> forward registered
>>>>>>>>>>>          IP multicast traffic to their corresponding
>>>>>>>>>>> multicast group , which
>>>>>>>>>>> diff --git a/ovn-nb.xml b/ovn-nb.xml
>>>>>>>>>>> index 9552534f6..2d2f5c9e0 100644
>>>>>>>>>>> --- a/ovn-nb.xml
>>>>>>>>>>> +++ b/ovn-nb.xml
>>>>>>>>>>> @@ -3440,6 +3440,20 @@ or
>>>>>>>>>>>          </p>
>>>>>>>>>>>        </column>
>>>>>>>>>>>  
>>>>>>>>>>> +      <column name="options" key="bgp-redirect"
>>>>>>>>>>> type='{"type": "string"}'>
>>>>>>>>>>> +        <p>
>>>>>>>>>>> +          This option expects a name of a Logical Switch
>>>>>>>>>>> Port that's present
>>>>>>>>>>> +          in the peer's Logical Switch. If set, it causes
>>>>>>>>>>> any traffic
>>>>>>>>>>> +          that's destined for Logical Router Port's IP
>>>>>>>>>>> addresses (including
>>>>>>>>>>> +          its IPv6 LLA) and the default BGP port (TCP
>>>>>>>>>>> 179), to be redirected
>>>>>>>>>>> +          to the specified Logical Switch Port.
>>>>>>>>>>> +
>>>>>>>>>>> +          This allows external BGP daemon to be bound to a
>>>>>>>>>>> port in OVN's
>>>>>>>>>>> +          Logical Switch and act as if it was listening on
>>>>>>>>>>> Logical Router
>>>>>>>>>>> +          Port's IP address.
>>>>>>>>>>> +        </p>
>>>>>>>>>>> +      </column>
>>>>>>>>>>> +
>>>>>>>>>>>        <column name="options" key="gateway_mtu_bypass">
>>>>>>>>>>>          <p>
>>>>>>>>>>>            When configured, represents a match expression,
>>>>>>>>>>> in the same
>>>>>>>>>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>>>>>>>>>> index a389d1988..df1e924ab 100644
>>>>>>>>>>> --- a/tests/ovn-northd.at
>>>>>>>>>>> +++ b/tests/ovn-northd.at
>>>>>>>>>>> @@ -12721,3 +12721,61 @@ AT_CHECK([ovn-sbctl dump-flows lr
>>>>>>>>>>> | grep lr_in_dnat | ovn_strip_lflows], [0], [d
>>>>>>>>>>>  
>>>>>>>>>>>  AT_CLEANUP
>>>>>>>>>>>  ])
>>>>>>>>>>> +
>>>>>>>>>>> +OVN_FOR_EACH_NORTHD_NO_HV([
>>>>>>>>>>> +AT_SETUP([BGP control plane redirect])
>>>>>>>>>>> +ovn_start
>>>>>>>>>>> +
>>>>>>>>>>> +check ovn-sbctl chassis-add hv1 geneve 127.0.0.1
>>>>>>>>>>> +
>>>>>>>>>>> +check ovn-nbctl lr-add lr -- \
>>>>>>>>>>> +    lrp-add lr lr-ls 02:ac:10:01:00:01 172.16.1.1/24
>>>>>>>>>>> +check ovn-nbctl --wait=sb set logical_router lr
>>>>>>>>>>> options:chassis=hv1
>>>>>>>>>>> +
>>>>>>>>>>> +check ovn-nbctl ls-add ls -- \
>>>>>>>>>>> +    lsp-add ls ls-lr -- \
>>>>>>>>>>> +    lsp-set-type ls-lr router -- \
>>>>>>>>>>> +    lsp-set-addresses ls-lr router -- \
>>>>>>>>>>> +    lsp-set-options ls-lr router-port=lr-ls
>>>>>>>>>>> +
>>>>>>>>>>> +check ovn-nbctl lsp-add ls lsp-bgp -- \
>>>>>>>>>>> +    lsp-set-addresses lsp-bgp unknown
>>>>>>>>>>> +
>>>>>>>>>>> +# Function that ensures that no BGP redirect ruels are
>>>>>>>>>>> installed.
>>>>>>>>>>> +check_no_bgp_redirect() {
>>>>>>>>>>> +    AT_CHECK([ovn-sbctl dump-flows ls | grep
>>>>>>>>>>> ls_in_l2_lkup  | grep "tcp.dst == 179" | wc -l], [0], [0
>>>>>>>>>>> +])
>>>>>>>>>>> +
>>>>>>>>>>> +    AT_CHECK([ovn-sbctl dump-flows ls | grep
>>>>>>>>>>> ls_in_check_port_sec | grep -E "priority=80|priority=81" |
>>>>>>>>>>> wc -l], [0], [0
>>>>>>>>>>> +])
>>>>>>>>>>> +}
>>>>>>>>>>> +
>>>>>>>>>>> +# By default, no rules related to BGP redirect are present
>>>>>>>>>>> +check_no_bgp_redirect
>>>>>>>>>>> +
>>>>>>>>>>> +# Set "lsp-bgp" port as target of BGP control plane
>>>>>>>>>>> redirected traffic
>>>>>>>>>>> +check ovn-nbctl --wait=sb set logical_router_port lr-ls
>>>>>>>>>>> options:bgp-redirect=lsp-bgp
>>>>>>>>>>> +
>>>>>>>>>>> +# Check that BGP control plane traffic is redirected to
>>>>>>>>>>> "lsp-bgp"
>>>>>>>>>>> +AT_CHECK([ovn-sbctl dump-flows ls | grep ls_in_l2_lkup |
>>>>>>>>>>> grep "tcp.dst == 179" | ovn_strip_lflows], [0], [dnl
>>>>>>>>>>> +  table=??(ls_in_l2_lkup      ), priority=100  ,
>>>>>>>>>>> match=(ip4.dst == 172.16.1.1 && tcp.dst == 179),
>>>>>>>>>>> action=(outport = "lsp-bgp"; output;)
>>>>>>>>>>> +  table=??(ls_in_l2_lkup      ), priority=100  ,
>>>>>>>>>>> match=(ip6.dst == fe80::ac:10ff:fe01:1 && tcp.dst == 179),
>>>>>>>>>>> action=(outport = "lsp-bgp"; output;)
>>>>>>>>>>> +])
>>>>>>>>>>> +
>>>>>>>>>>> +# Check that only BGP-related traffic is accepted on "lsp-
>>>>>>>>>>> bgp" port
>>>>>>>>>>> +AT_CHECK([ovn-sbctl dump-flows ls | grep
>>>>>>>>>>> ls_in_check_port_sec | grep -E "priority=80|priority=81" |
>>>>>>>>>>> ovn_strip_lflows], [0], [dnl
>>>>>>>>>>> +  table=??(ls_in_check_port_sec), priority=80   ,
>>>>>>>>>>> match=(inport == "lsp-bgp"), action=(reg0[[15]] = 1; next;)
>>>>>>>>>>> +  table=??(ls_in_check_port_sec), priority=81   ,
>>>>>>>>>>> match=(inport == "lsp-bgp" && ip4.src == 172.16.1.1 &&
>>>>>>>>>>> tcp.src == 179), action=(reg0[[15]] = check_in_port_sec();
>>>>>>>>>>> next;)
>>>>>>>>>>> +  table=??(ls_in_check_port_sec), priority=81   ,
>>>>>>>>>>> match=(inport == "lsp-bgp" && ip6.src ==
>>>>>>>>>>> fe80::ac:10ff:fe01:1 && tcp.src == 179), action=(reg0[[15]]
>>>>>>>>>>> = check_in_port_sec(); next;)
>>>>>>>>>>> +])
>>>>>>>>>>> +
>>>>>>>>>>> +# Remove 'bgp-redirect' option from LRP and check that
>>>>>>>>>>> rules are removed
>>>>>>>>>>> +check ovn-nbctl --wait=sb remove logical_router_port lr-ls
>>>>>>>>>>> options bgp-redirect
>>>>>>>>>>> +check_no_bgp_redirect
>>>>>>>>>>> +
>>>>>>>>>>> +# Set non-existent LSP as target of 'bgp-redirect' and
>>>>>>>>>>> check that no rules are added
>>>>>>>>>>> +check ovn-nbctl --wait=sb set logical_router_port lr-ls
>>>>>>>>>>> options:bgp-redirect=lsp-foo
>>>>>>>>>>> +check_no_bgp_redirect
>>>>>>>>>>> +
>>>>>>>>>>> +AT_CLEANUP
>>>>>>>>>>> +])
>>>>>>>>>>> diff --git a/tests/system-ovn.at b/tests/system-ovn.at
>>>>>>>>>>> index c24ede7c5..c09c1136e 100644
>>>>>>>>>>> --- a/tests/system-ovn.at
>>>>>>>>>>> +++ b/tests/system-ovn.at
>>>>>>>>>>> @@ -13027,3 +13027,89 @@
>>>>>>>>>>> OVS_TRAFFIC_VSWITCHD_STOP(["/failed to query port patch-
>>>>>>>>>>> .*/d
>>>>>>>>>>>  /connection dropped.*/d"])
>>>>>>>>>>>  AT_CLEANUP
>>>>>>>>>>>  ])
>>>>>>>>>>> +
>>>>>>>>>>> +OVN_FOR_EACH_NORTHD([
>>>>>>>>>>> +AT_SETUP([BGP control plane redirect])
>>>>>>>>>> Nit: probably best to add AT_SKIP_IF([test $HAVE_NC = no])
>>>>>>>>>> here.
>>>>>>>>>>
>>>>>>>>>>> +ovn_start
>>>>>>>>>>> +OVS_TRAFFIC_VSWITCHD_START()
>>>>>>>>>>> +
>>>>>>>>>>> +ADD_BR([br-int])
>>>>>>>>>>> +ADD_BR([br-ext])
>>>>>>>>>>> +
>>>>>>>>>>> +ovs-ofctl add-flow br-ext action=normal
>>>>>>>>>>> +# Set external-ids in br-int needed for ovn-controller
>>>>>>>>>>> +ovs-vsctl \
>>>>>>>>>>> +        -- set Open_vSwitch . external-ids:system-id=hv1 \
>>>>>>>>>>> +        -- set Open_vSwitch . external-ids:ovn-
>>>>>>>>>>> remote=unix:$ovs_base/ovn-sb/ovn-sb.sock \
>>>>>>>>>>> +        -- set Open_vSwitch . external-ids:ovn-encap-
>>>>>>>>>>> type=geneve \
>>>>>>>>>>> +        -- set Open_vSwitch . external-ids:ovn-encap-
>>>>>>>>>>> ip=169.0.0.1 \
>>>>>>>>>>> +        -- set bridge br-int fail-mode=secure other-
>>>>>>>>>>> config:disable-in-band=true
>>>>>>>>>> Most ov*-ctl commands should be prefixed with "check" in this
>>>>>>>>>> test.
>>>>>>>>>>
>>>>>>>>>>> +
>>>>>>>>>>> +# Start ovn-controller
>>>>>>>>>>> +start_daemon ovn-controller
>>>>>>>>>>> +
>>>>>>>>>>> +ovn-nbctl lr-add R1 \
>>>>>>>>>>> +    -- set Logical_Router R1 options:chassis=hv1
>>>>>>>>>>> +
>>>>>>>>>>> +ovn-nbctl ls-add public
>>>>>>>>>>> +
>>>>>>>>>>> +ovn-nbctl lrp-add R1 rp-public 00:00:02:01:02:03
>>>>>>>>>>> 172.16.1.1/24
>>>>>>>>>>> +
>>>>>>>>>>> +ovn-nbctl lsp-add public public-rp -- set
>>>>>>>>>>> Logical_Switch_Port public-rp \
>>>>>>>>>>> +    type=router options:router-port=rp-public \
>>>>>>>>>>> +    -- lsp-set-addresses public-rp router
>>>>>>>>>>> +
>>>>>>>>>>> +ovn-nbctl lsp-add public bgp-daemon \
>>>>>>>>>>> +    -- lsp-set-addresses bgp-daemon unknown
>>>>>>>>>>> +
>>>>>>>>>>> +AT_CHECK([ovs-vsctl set Open_vSwitch . external-ids:ovn-
>>>>>>>>>>> bridge-mappings=phynet:br-ext])
>>>>>>>>>>> +ovn-nbctl lsp-add public public1 \
>>>>>>>>>>> +        -- lsp-set-addresses public1 unknown \
>>>>>>>>>>> +        -- lsp-set-type public1 localnet \
>>>>>>>>>>> +        -- lsp-set-options public1 network_name=phynet
>>>>>>>>>>> +
>>>>>>>>>>> +ovn-nbctl --wait=hv sync
>>>>>>>>>>> +
>>>>>>>>>>> +# Set option that redirects BGP control plane traffic to a
>>>>>>>>>>> LSP "bgp-daemon"
>>>>>>>>>>> +ovn-nbctl --wait=sb set logical_router_port rp-public
>>>>>>>>>>> options:bgp-redirect=bgp-daemon
>>>>>>>>>>> +
>>>>>>>>>>> +# Create "bgp-daemon" interface in a namespace with IP and
>>>>>>>>>>> MAC matching LRP "rp-public"
>>>>>>>>>>> +ADD_NAMESPACES(bgp-daemon)
>>>>>>>>>>> +ADD_VETH(bgp-daemon, bgp-daemon, br-int, "172.16.1.1/24",
>>>>>>>>>>> "00:00:02:01:02:03")
>>>>>>>>>>> +
>>>>>>>>>>> +ADD_NAMESPACES(ext-foo)
>>>>>>>>>>> +ADD_VETH(ext-foo, ext-foo, br-ext, "172.16.1.100/24",
>>>>>>>>>>> "00:10:10:01:02:13", \
>>>>>>>>>>> +         "172.16.1.1")
>>>>>>>>>>> +
>>>>>>>>>>> +# Flip the interface down/up to get proper IPv6 LLA
>>>>>>>>>>> +NS_EXEC([bgp-daemon], [ip link set down bgp-daemon])
>>>>>>>>>>> +NS_EXEC([bgp-daemon], [ip link set up bgp-daemon])
>>>>>>>>>>> +
>>>>>>>>>>> +# Wait until IPv6 LLA loses the "tentative" flag otherwise
>>>>>>>>>>> it can't be bound to.
>>>>>>>>>>> +OVS_WAIT_UNTIL([NS_EXEC([bgp-daemon], [ip a show dev bgp-
>>>>>>>>>>> daemon | grep "fe80::" | grep -v tentative])])
>>>>>>>>>>> +OVS_WAIT_UNTIL([NS_EXEC([ext-foo], [ip a show dev ext-foo
>>>>>>>>>>> | grep "fe80::" | grep -v tentative])])
>>>>>>>>>>> +
>>>>>>>>>>> +# Verify that BGP control plane traffic is delivered to
>>>>>>>>>>> the "bgp-daemon"
>>>>>>>>>>> +# interface on both IPv4 and IPv6 LLA addresses
>>>>>>>>>>> +NETNS_DAEMONIZE([bgp-daemon], [nc -l -k 172.16.1.1 179],
>>>>>>>>>>> [bgp_v4.pid])
>>>>>>>>>>> +NS_CHECK_EXEC([ext-foo], [echo "TCP test" | nc --send-only
>>>>>>>>>>> 172.16.1.1 179])
>>>>>>>>>>> +
>>>>>>>>>>> +NETNS_DAEMONIZE([bgp-daemon], [nc -l -6 -k
>>>>>>>>>>> fe80::200:2ff:fe01:203%bgp-daemon 179], [bgp_v6.pid])
>>>>>>>>>>> +NS_CHECK_EXEC([ext-foo], [echo "TCP test" | nc --send-only
>>>>>>>>>>> -6 fe80::200:2ff:fe01:203%ext-foo 179])
>>>>>>>>>>> +
>>>>>>>>>>> +OVS_APP_EXIT_AND_WAIT([ovn-controller])
>>>>>>>>>>> +
>>>>>>>>>>> +as ovn-sb
>>>>>>>>>>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>>>>>>>>>> +
>>>>>>>>>>> +as ovn-nb
>>>>>>>>>>> +OVS_APP_EXIT_AND_WAIT([ovsdb-server])
>>>>>>>>>>> +
>>>>>>>>>>> +as northd
>>>>>>>>>>> +OVS_APP_EXIT_AND_WAIT([ovn-northd])
>>>>>>>>>>> +
>>>>>>>>>>> +as
>>>>>>>>>>> +OVS_TRAFFIC_VSWITCHD_STOP(["/.*error receiving.*/d
>>>>>>>>>>> +/.*terminating with signal 15.*/d"])
>>>>>>>>>>> +AT_CLEANUP
>>>>>>>>>>> +])
>>>>>>>>>> Regards,
>>>>>>>>>> Dumitru
>>>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> dev mailing list
>>>>>>>> [email protected]
>>>>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>>>> _______________________________________________
>>>>> dev mailing list
>>>>> [email protected]
>>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>>> _______________________________________________
>>> dev mailing list
>>> [email protected]
>>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to