Hi Daniel,

I agree with Numan that this seems like a good approach to take.

On 11/16/2018 12:41 PM, Daniel Alvarez Sanchez wrote:

On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff <[email protected] <mailto:[email protected]>> wrote:
 >
 > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez <[email protected] <mailto:[email protected]>>
 > > wrote:
 > >
 > > > Hi,
 > > >
 > > > After digging further. The problem seems to be reduced to reusing an
 > > > old gateway IP address for a dnat_and_snat entry.
 > > > When a gateway port is bound to a chassis, its entry will show up in
 > > > the MAC_Binding table (at least when that Logical Switch is connected
 > > > to more than one Logical Router). After deleting the Logical Router
 > > > and all its ports, this entry will remain there. If a new Logical
 > > > Router is created and a Floating IP (dnat_and_snat) is assigned to a
 > > > VM with the old gw IP address, it will become unreachable.
 > > >
 > > > A workaround now from networking-ovn (OpenStack integration) is to
 > > > delete MAC_Binding entries for that IP address upon a FIP creation. I
 > > > think that this however should be done from OVN, what do you folks
 > > > think?
 > > >
 > > >
 > > Agree. Since the MAC_Binding table row is created by ovn-controller, it
 > > should
 > > be handled properly within OVN.
 >
 > I see that this has been sitting here for a while.  The solution seems
 > reasonable to me.  Are either of you working on it?

I started working on it. I came up with a solution (see patch below) which works but I wanted to give you a bit more of context and get your feedback:


                            ^ localnet
                            |
                        +---+---+
                        |       |
                 +------+  pub  +------+
                 |      |       |      |
                 |      +-------+      |
                 | 172.24.4.0/24 <http://172.24.4.0/24>    |
                 |                     |
    172.24.4.220 |                     | 172.24.4.221
             +---+---+             +---+---+
             |       |             |       |
             |  LR0  |             |  LR1  |
             |       |             |       |
             +---+---+             +---+---+
      10.0.0.254 |                     | 20.0.0.254
                 |                     |
             +---+---+             +---+---+
             |       |             |       |
10.0.0.0/24 <http://10.0.0.0/24> |  SW0  |             |  SW1  | 20.0.0.0/24 <http://20.0.0.0/24>
             |       |             |       |
             +---+---+             +---+---+
                 |                     |
                 |                     |
             +---+---+             +---+---+
             |       |             |       |
             |  VM0  |             |  VM1  |
             |       |             |       |
             +-------+             +-------+
             10.0.0.10             20.0.0.10
           172.24.4.100           172.24.4.200


When I ping VM1 floating IP from the external network, a new entry for 172.24.4.221 in the LR0 datapath appears in the MAC_Binding table:

_uuid               : 85e30e87-3c59-423e-8681-ec4cfd9205f9
datapath            : ac5984b9-0fea-485f-84d4-031bdeced29b
ip                  : "172.24.4.221"
logical_port        : "lrp02"
mac                 : "00:00:02:01:02:04"


Now, if LR1 gets removed and the old gateway IP (172.24.4.221) is reused for VM2 FIP with different MAC and new gateway IP is created (for example 172.24.4.222 00:00:02:01:02:99),  VM2 FIP becomes unreachable from VM1 until the old MAC_Binding entry gets deleted as pinging 172.24.4.221 will use the wrong address ("00:00:02:01:02:04").

With the patch below, removing LR1 results in deleting all MAC_Binding entries for every datapath where '172.24.4.221' appears in the 'ip' column so the problem goes away.

Another solution would be implementing some kind of 'aging' for MAC_Binding entries but perhaps it's more complex.
Looking forward for your comments :)


diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 58bef7d..a86733e 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2324,6 +2324,18 @@ cleanup_mac_bindings(struct northd_context *ctx, struct hmap *ports)
      }
  }

+static void
+delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip)
+{
+    const struct sbrec_mac_binding *b, *n;
+    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
+        if (strstr(ip, b->ip)) {
+            sbrec_mac_binding_delete(b);
+        }
+    }
+}
+
+
 /* Updates the southbound Port_Binding table so that it contains the logical
   * switch ports specified by the northbound database.
   *
@@ -2383,6 +2395,15 @@ build_ports(struct northd_context *ctx,
      /* Delete southbound records without northbound matches. */
      LIST_FOR_EACH_SAFE(op, next, list, &sb_only) {
          ovs_list_remove(&op->list);
+
+        /* Delete all MAC_Binding entries which match the IP addresses of the
+         * deleted logical router port (ie. port with a peer). */
+        const char *peer = smap_get(&op->sb->options, "peer");
+        if (peer) {
+            for (int i = 0; i < op->sb->n_mac; i++) {
+                delete_mac_binding_by_ip(ctx, op->sb->mac[i]);
+            }
+        }
          sbrec_port_binding_delete(op->sb);
          ovn_port_destroy(ports, op);
      }


_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss

Reply via email to