Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-12-03 Thread Daniel Alvarez Sanchez
On Mon, Dec 3, 2018 at 3:48 PM Mark Michelson  wrote:
>
> On 12/01/2018 03:44 PM, Han Zhou wrote:
> >
> >
> > On Fri, Nov 30, 2018 at 7:29 AM Daniel Alvarez Sanchez
> > <dalva...@redhat.com> wrote:
> >  >
> >  > Thanks folks again for the discussion.
> >  > I sent an RFC patch here [0]. I tried it out with my reproducer and it
> >  > seems to work well. Instead of outputting the packet to the localnet
> >  > ofport, it will inject it to the public switch pipeline so it'll get
> >  > broadcasted to the rest of the ports resulting in other Logical
> >  > Routers connected to the external switch updating their neighbours. As
> >  > it's broadcasted, the GARP will also be sent out through the localnet
> >  > port as before.
> >  >
> >  > Looking forward to your comments before moving on and writing tests.
> >  >
> >  > Thanks Numan for your help!
> >  >
> >  > [0]
> > https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/354220.html
> >  > On Wed, Nov 28, 2018 at 3:32 PM Daniel Alvarez Sanchez
> >  > <dalva...@redhat.com> wrote:
> >  > >
> >  > > Hi all,
> >  > >
> >  > > As this thread is getting big I'm summarizing the issue I see so far:
> >  > >
> >  > > * When a dnat_and_snat entry is added to a logical router (or port
> >  > > gets bound to a chassis), ovn-controller will send GARPs to announce
> >  > > the MAC address of the FIP(s) (either the gw port or of the actual FIP
> >  > > MAC address if distributed) only through localnet ports [0].
> >  > >
> >  > > * This means that gateway ports bound to that same chassis and
> >  > > connected to the public switch won't get the GARPs, so they won't
> >  > > update their MAC_Binding entries causing unreachability. In the
> >  > > diagram of this thread, LR0 won't get the GARP sent by ovn-controller
> >  > > if both gateway ports are bound to the same chassis.
> >  > >
> >  > > I tried out sending GARPs from the external network using master
> >  > > branch and MAC_Binding entries get updated. However, in order to cover
> >  > > missing cases, I think it would make sense to send the GARPs not only
> >  > > to localnet ports but to all ports of those logical switches that have
> >  > > a localnet port. What do you think?
> >  > >
> >  > > [0]
> > https://github.com/openvswitch/ovs/blob/master/ovn/controller/pinctrl.c#L2073
> >  > >
> >  > > On Fri, Nov 23, 2018 at 5:28 PM Daniel Alvarez Sanchez
> >  > > <dalva...@redhat.com> wrote:
> >  > > >
> >  > > > On Wed, Nov 21, 2018 at 9:04 PM Han Zhou wrote:
> >  > > > >
> >  > > > >
> >  > > > >
> >  > > > > On Tue, Nov 20, 2018 at 5:21 AM Mark Michelson
> >  > > > > <mmich...@redhat.com> wrote:
> >  > > > > >
> >  > > > > > Hi Daniel,
> >  > > > > >
> >  > > > > > I agree with Numan that this seems like a good approach to take.
> >  > > > > >
> >  > > > > > On 11/16/2018 12:41 PM, Daniel Alvarez Sanchez wrote:
> >  > > > > > >
> >  > > > > > > On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff wrote:
> >  > > > > > >  >
> >  > > > > > >  > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> >  > > > > > >  > > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez
> >  > > > > > >  > > <dalva...@redhat.com> wrote:
> >  > > > > > >  > >
> >  > > > > > >  > > > Hi,
> >  > > > > > >  > > >
> >  > > > > > >  > > > After digging further. The problem seems to be
> > reduced to reusing an
> >  > > > > > >  > > > old gateway IP address for a dnat_and_snat entry.
> >  > > > > > >  > > > When a gateway port is bound to a chassis, its entry
> > will show up in
> >  > > > > > >  > > > the MAC_Binding table (at least when that Logical
> > Switch is connected
> >  > > > > > >  > > > to more than one Logical Router). After deleting the
> > Logical Router
> >  > > > > > >  > > > and all its ports, this entry will remain there. If
> > a new Logical
> >  > > > > > >  > > > Router is created and a Floating IP (dnat_and_snat)
> > is assigned to a
> >  > > > > > >  > > > VM with the old gw IP address, it will become
> > unreachable.
> >  > > > > > >  > > >
> >  > > > > > >  > > > A workaround now from networking-ovn (OpenStack
> > integration) is to
> >  > > > > > >  > > > delete MAC_Binding entries for that IP address upon
> > a FIP creation. I
> >  > > > > > >  > > > think that this however should be done from OVN,
> > what do you folks
> >  > > > > > >  > > > think?
> >  > > > > > >  > > >
> >  > > > > > >  > > >
> >  > > > > > >  > > Agree. Since the MAC_Binding table row is created by
> > ovn-controller, it
> >  > > > > > >  > > should
> >  > > > > > >  > > be handled properly within OVN.
> >  > > > > > >  >
> >  > > > > > >  > I see that this has been sitting here for a while.  The
> > solution seems
> 

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-12-01 Thread Han Zhou
On Fri, Nov 30, 2018 at 7:29 AM Daniel Alvarez Sanchez 
wrote:
>
> Thanks folks again for the discussion.
> I sent an RFC patch here [0]. I tried it out with my reproducer and it
> seems to work well. Instead of outputting the packet to the localnet
> ofport, it will inject it to the public switch pipeline so it'll get
> broadcasted to the rest of the ports resulting in other Logical
> Routers connected to the external switch updating their neighbours. As
> it's broadcasted, the GARP will also be sent out through the localnet
> port as before.
>
> Looking forward to your comments before moving on and writing tests.
>
> Thanks Numan for your help!
>
> [0]
https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/354220.html
> On Wed, Nov 28, 2018 at 3:32 PM Daniel Alvarez Sanchez
>  wrote:
> >
> > Hi all,
> >
> > As this thread is getting big I'm summarizing the issue I see so far:
> >
> > * When a dnat_and_snat entry is added to a logical router (or port
> > gets bound to a chassis), ovn-controller will send GARPs to announce
> > the MAC address of the FIP(s) (either the gw port or of the actual FIP
> > MAC address if distributed) only through localnet ports [0].
> >
> > * This means that gateway ports bound to that same chassis and
> > connected to the public switch won't get the GARPs, so they won't
> > update their MAC_Binding entries causing unreachability. In the
> > diagram of this thread, LR0 won't get the GARP sent by ovn-controller
> > if both gateway ports are bound to the same chassis.
> >
> > I tried out sending GARPs from the external network using master
> > branch and MAC_Binding entries get updated. However, in order to cover
> > missing cases, I think it would make sense to send the GARPs not only
> > to localnet ports but to all ports of those logical switches that have
> > a localnet port. What do you think?
> >
> > [0]
https://github.com/openvswitch/ovs/blob/master/ovn/controller/pinctrl.c#L2073
> >
> > On Fri, Nov 23, 2018 at 5:28 PM Daniel Alvarez Sanchez
> > wrote:
> > >
> > > On Wed, Nov 21, 2018 at 9:04 PM Han Zhou  wrote:
> > > >
> > > >
> > > >
> > > > On Tue, Nov 20, 2018 at 5:21 AM Mark Michelson 
wrote:
> > > > >
> > > > > Hi Daniel,
> > > > >
> > > > > I agree with Numan that this seems like a good approach to take.
> > > > >
> > > > > On 11/16/2018 12:41 PM, Daniel Alvarez Sanchez wrote:
> > > > > >
> > > > > > > On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff wrote:
> > > > > > >  >
> > > > > > >  > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> > > > > > >  > > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez
> > > > > > >  > > <dalva...@redhat.com> wrote:
> > > > > >  > >
> > > > > >  > > > Hi,
> > > > > >  > > >
> > > > > >  > > > After digging further. The problem seems to be reduced
to reusing an
> > > > > >  > > > old gateway IP address for a dnat_and_snat entry.
> > > > > >  > > > When a gateway port is bound to a chassis, its entry
will show up in
> > > > > >  > > > the MAC_Binding table (at least when that Logical Switch
is connected
> > > > > >  > > > to more than one Logical Router). After deleting the
Logical Router
> > > > > >  > > > and all its ports, this entry will remain there. If a
new Logical
> > > > > >  > > > Router is created and a Floating IP (dnat_and_snat) is
assigned to a
> > > > > >  > > > VM with the old gw IP address, it will become
unreachable.
> > > > > >  > > >
> > > > > >  > > > A workaround now from networking-ovn (OpenStack
integration) is to
> > > > > >  > > > delete MAC_Binding entries for that IP address upon a
FIP creation. I
> > > > > >  > > > think that this however should be done from OVN, what do
you folks
> > > > > >  > > > think?
> > > > > >  > > >
> > > > > >  > > >
> > > > > >  > > Agree. Since the MAC_Binding table row is created by
ovn-controller, it
> > > > > >  > > should
> > > > > >  > > be handled properly within OVN.
> > > > > >  >
> > > > > >  > I see that this has been sitting here for a while.  The
solution seems
> > > > > >  > reasonable to me.  Are either of you working on it?
> > > > > >
> > > > > > I started working on it. I came up with a solution (see patch
below)
> > > > > > which works but I wanted to give you a bit more of context and
get your
> > > > > > feedback:
> > > > > >
> > > > > >
> > > > > > ^ localnet
> > > > > > |
> > > > > > +---+---+
> > > > > > |   |
> > > > > >  +--+  pub  +--+
> > > > > >  |  |   |  |
> > > > > >  |  +---+  |
> > > > > >  | 172.24.4.0/24 |
> > > > > >  | |
> > > > > > 172.24.4.220 | | 172.24.4.221
> > > > > >  +---+---+

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-30 Thread Daniel Alvarez Sanchez
Thanks again, folks, for the discussion.
I sent an RFC patch here [0]. I tried it out with my reproducer and it
seems to work well. Instead of outputting the packet to the localnet
ofport, it injects the packet into the public switch pipeline, so it gets
broadcast to the rest of the ports and the other Logical Routers
connected to the external switch update their neighbours. As it's
broadcast, the GARP will also be sent out through the localnet port as
before.

Looking forward to your comments before moving on and writing tests.

Thanks Numan for your help!

[0] https://mail.openvswitch.org/pipermail/ovs-dev/2018-November/354220.html
On Wed, Nov 28, 2018 at 3:32 PM Daniel Alvarez Sanchez
 wrote:
>
> Hi all,
>
> As this thread is getting big I'm summarizing the issue I see so far:
>
> * When a dnat_and_snat entry is added to a logical router (or port
> gets bound to a chassis), ovn-controller will send GARPs to announce
> the MAC address of the FIP(s) (either the gw port or of the actual FIP
> MAC address if distributed) only through localnet ports [0].
>
> * This means that gateway ports bound to that same chassis and
> connected to the public switch won't get the GARPs, so they won't
> update their MAC_Binding entries causing unreachability. In the
> diagram of this thread, LR0 won't get the GARP sent by ovn-controller
> if both gateway ports are bound to the same chassis.
>
> I tried out sending GARPs from the external network using master
> branch and MAC_Binding entries get updated. However, in order to cover
> missing cases, I think it would make sense to send the GARPs not only
> to localnet ports but to all ports of those logical switches that have
> a localnet port. What do you think?
>
> [0] 
> https://github.com/openvswitch/ovs/blob/master/ovn/controller/pinctrl.c#L2073
>
> On Fri, Nov 23, 2018 at 5:28 PM Daniel Alvarez Sanchez
> wrote:
> >
> > On Wed, Nov 21, 2018 at 9:04 PM Han Zhou  wrote:
> > >
> > >
> > >
> > > On Tue, Nov 20, 2018 at 5:21 AM Mark Michelson  
> > > wrote:
> > > >
> > > > Hi Daniel,
> > > >
> > > > I agree with Numan that this seems like a good approach to take.
> > > >
> > > > On 11/16/2018 12:41 PM, Daniel Alvarez Sanchez wrote:
> > > > >
> > > > > > On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff wrote:
> > > > > >  >
> > > > > >  > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> > > > > >  > > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez
> > > > > >  > > <dalva...@redhat.com> wrote:
> > > > >  > >
> > > > >  > > > Hi,
> > > > >  > > >
> > > > >  > > > After digging further. The problem seems to be reduced to 
> > > > > reusing an
> > > > >  > > > old gateway IP address for a dnat_and_snat entry.
> > > > >  > > > When a gateway port is bound to a chassis, its entry will show 
> > > > > up in
> > > > >  > > > the MAC_Binding table (at least when that Logical Switch is 
> > > > > connected
> > > > >  > > > to more than one Logical Router). After deleting the Logical 
> > > > > Router
> > > > >  > > > and all its ports, this entry will remain there. If a new 
> > > > > Logical
> > > > >  > > > Router is created and a Floating IP (dnat_and_snat) is 
> > > > > assigned to a
> > > > >  > > > VM with the old gw IP address, it will become unreachable.
> > > > >  > > >
> > > > >  > > > A workaround now from networking-ovn (OpenStack integration) 
> > > > > is to
> > > > >  > > > delete MAC_Binding entries for that IP address upon a FIP 
> > > > > creation. I
> > > > >  > > > think that this however should be done from OVN, what do you 
> > > > > folks
> > > > >  > > > think?
> > > > >  > > >
> > > > >  > > >
> > > > >  > > Agree. Since the MAC_Binding table row is created by 
> > > > > ovn-controller, it
> > > > >  > > should
> > > > >  > > be handled properly within OVN.
> > > > >  >
> > > > >  > I see that this has been sitting here for a while.  The solution 
> > > > > seems
> > > > >  > reasonable to me.  Are either of you working on it?
> > > > >
> > > > > I started working on it. I came up with a solution (see patch below)
> > > > > which works but I wanted to give you a bit more of context and get 
> > > > > your
> > > > > feedback:
> > > > >
> > > > >
> > > > > ^ localnet
> > > > > |
> > > > > +---+---+
> > > > > |   |
> > > > >  +--+  pub  +--+
> > > > >  |  |   |  |
> > > > >  |  +---+  |
> > > > >  | 172.24.4.0/24 |
> > > > >  | |
> > > > > 172.24.4.220 | | 172.24.4.221
> > > > >  +---+---+ +---+---+
> > > > >  |   | |   |
> > > > >  |  LR0  | |  LR1  |
> > 

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-28 Thread Ben Pfaff
On Wed, Nov 28, 2018 at 03:17:10PM +0100, Daniel Alvarez Sanchez wrote:
> On Wed, Nov 28, 2018 at 3:10 PM Ben Pfaff  wrote:
> >
> > On Wed, Nov 28, 2018 at 12:07:55PM +0100, Daniel Alvarez Sanchez wrote:
> > > On Mon, Nov 26, 2018 at 9:30 PM Ben Pfaff  wrote:
> > > >
> > > > On Fri, Nov 16, 2018 at 06:41:33PM +0100, Daniel Alvarez Sanchez wrote:
> > > > > +static void
> > > > > +delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip)
> > > > > +{
> > > > > +    const struct sbrec_mac_binding *b, *n;
> > > > > +    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
> > > > > +        if (strstr(ip, b->ip)) {
> > > > > +            sbrec_mac_binding_delete(b);
> > > > > +        }
> > > > > +    }
> > > > > +}
> > > >
> > > > I haven't read the whole thread properly yet, but: why does this use
> > > > strstr()?
> > >
> > > I used it because b->ip could be like "50:57:00:00:00:02 20.0.0.10"
> > > and wanted to check if the IP address was present there.
> >
> > Is the 'ip' column in the MAC_Binding table documented incorrectly?  It
> > is currently documented as:
> >
> >ip: string
> >   The bound IP address.
> >
> > which doesn't mention strings that also contain a MAC address.
> Sorry for the confusion, the prototype is misleading. It's not the
> 'ip' col of the MAC_Binding table but the 'mac' column of the
> Port_Binding table which is what's being passed to the
> 'delete_mac_binding_by_ip()' function.
> +    for (int i = 0; i < op->sb->n_mac; i++) {
> +        delete_mac_binding_by_ip(ctx, op->sb->mac[i]);

Got it, thanks.
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-28 Thread Daniel Alvarez Sanchez
Hi all,

As this thread is getting big I'm summarizing the issue I see so far:

* When a dnat_and_snat entry is added to a logical router (or a port
gets bound to a chassis), ovn-controller sends GARPs announcing the MAC
address of the FIP(s) (the gw port's MAC or, if distributed, the FIP's
own MAC address), but only through localnet ports [0].

* This means that gateway ports bound to that same chassis and
connected to the public switch won't get the GARPs, so they won't
update their MAC_Binding entries causing unreachability. In the
diagram of this thread, LR0 won't get the GARP sent by ovn-controller
if both gateway ports are bound to the same chassis.

I tried out sending GARPs from the external network using master
branch and MAC_Binding entries get updated. However, in order to cover
missing cases, I think it would make sense to send the GARPs not only
to localnet ports but to all ports of those logical switches that have
a localnet port. What do you think?
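
For readers less familiar with the packet being discussed: a gratuitous
ARP is an ARP message whose sender and target protocol addresses are the
same, sent to the Ethernet broadcast address so that neighbours refresh
their entry for that IP. Below is a minimal sketch in C of one common form
(an ARP request), assuming IPv4 over Ethernet. The names build_garp and
GARP_FRAME_LEN are illustrative and not taken from OVN, and the exact
frame that ovn-controller composes may differ.

```c
#include <stdint.h>
#include <string.h>

#define GARP_FRAME_LEN 42   /* 14-byte Ethernet header + 28-byte ARP body. */

/* Fills 'frame' with a gratuitous ARP request announcing that the IPv4
 * address 'ip' (4 bytes, network byte order) is at 'mac' (6 bytes).
 * Illustrative helper only; not part of OVN. */
static void
build_garp(uint8_t frame[GARP_FRAME_LEN], const uint8_t mac[6],
           const uint8_t ip[4])
{
    static const uint8_t arp_fixed[8] = {
        0x00, 0x01,     /* Hardware type: Ethernet. */
        0x08, 0x00,     /* Protocol type: IPv4. */
        6, 4,           /* Hardware and protocol address lengths. */
        0x00, 0x01,     /* Opcode: request. */
    };

    memset(frame, 0xff, 6);             /* Ethernet dst: broadcast. */
    memcpy(frame + 6, mac, 6);          /* Ethernet src. */
    frame[12] = 0x08;                   /* EtherType 0x0806: ARP. */
    frame[13] = 0x06;
    memcpy(frame + 14, arp_fixed, 8);
    memcpy(frame + 22, mac, 6);         /* Sender hardware address. */
    memcpy(frame + 28, ip, 4);          /* Sender protocol address. */
    memset(frame + 32, 0, 6);           /* Target hardware address: unset. */
    memcpy(frame + 38, ip, 4);          /* Target protocol address == sender. */
}
```

Because the destination is the broadcast address, which ports the frame
is delivered to decides who learns the new MAC; that is the point of
sending it to every port of the switch rather than only to the localnet
port.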

[0] https://github.com/openvswitch/ovs/blob/master/ovn/controller/pinctrl.c#L2073

On Fri, Nov 23, 2018 at 5:28 PM Daniel Alvarez Sanchez wrote:
>
> On Wed, Nov 21, 2018 at 9:04 PM Han Zhou  wrote:
> >
> >
> >
> > On Tue, Nov 20, 2018 at 5:21 AM Mark Michelson  wrote:
> > >
> > > Hi Daniel,
> > >
> > > I agree with Numan that this seems like a good approach to take.
> > >
> > > On 11/16/2018 12:41 PM, Daniel Alvarez Sanchez wrote:
> > > >
> > > > > On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff wrote:
> > > > >  >
> > > > >  > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> > > > >  > > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez
> > > > >  > > <dalva...@redhat.com> wrote:
> > > >  > >
> > > >  > > > Hi,
> > > >  > > >
> > > >  > > > After digging further. The problem seems to be reduced to 
> > > > reusing an
> > > >  > > > old gateway IP address for a dnat_and_snat entry.
> > > >  > > > When a gateway port is bound to a chassis, its entry will show 
> > > > up in
> > > >  > > > the MAC_Binding table (at least when that Logical Switch is 
> > > > connected
> > > >  > > > to more than one Logical Router). After deleting the Logical 
> > > > Router
> > > >  > > > and all its ports, this entry will remain there. If a new Logical
> > > >  > > > Router is created and a Floating IP (dnat_and_snat) is assigned 
> > > > to a
> > > >  > > > VM with the old gw IP address, it will become unreachable.
> > > >  > > >
> > > >  > > > A workaround now from networking-ovn (OpenStack integration) is 
> > > > to
> > > >  > > > delete MAC_Binding entries for that IP address upon a FIP 
> > > > creation. I
> > > >  > > > think that this however should be done from OVN, what do you 
> > > > folks
> > > >  > > > think?
> > > >  > > >
> > > >  > > >
> > > >  > > Agree. Since the MAC_Binding table row is created by 
> > > > ovn-controller, it
> > > >  > > should
> > > >  > > be handled properly within OVN.
> > > >  >
> > > >  > I see that this has been sitting here for a while.  The solution 
> > > > seems
> > > >  > reasonable to me.  Are either of you working on it?
> > > >
> > > > I started working on it. I came up with a solution (see patch below)
> > > > which works but I wanted to give you a bit more of context and get your
> > > > feedback:
> > > >
> > > >
> > > > ^ localnet
> > > > |
> > > > +---+---+
> > > > |   |
> > > >  +--+  pub  +--+
> > > >  |  |   |  |
> > > >  |  +---+  |
> > > >  | 172.24.4.0/24 |
> > > >  | |
> > > > 172.24.4.220 | | 172.24.4.221
> > > >  +---+---+ +---+---+
> > > >  |   | |   |
> > > >  |  LR0  | |  LR1  |
> > > >  |   | |   |
> > > >  +---+---+ +---+---+
> > > >   10.0.0.254 | | 20.0.0.254
> > > >  | |
> > > >  +---+---+ +---+---+
> > > >  |   | |   |
> > > > 10.0.0.0/24  |  SW0  | |  SW1  |
> > > > 20.0.0.0/24 
> > > >  |   | |   |
> > > >  +---+---+ +---+---+
> > > >  | |
> > > >  | |
> > > >  +---+---+ +---+---+
> > > >  |   | |   |
> > > >  |  VM0  | |  VM1  |
> > > >  |   | |   |
> > > >  +---+ +---+
> > > >  10.0.0.10  

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-28 Thread Daniel Alvarez Sanchez
On Wed, Nov 28, 2018 at 3:10 PM Ben Pfaff  wrote:
>
> On Wed, Nov 28, 2018 at 12:07:55PM +0100, Daniel Alvarez Sanchez wrote:
> > On Mon, Nov 26, 2018 at 9:30 PM Ben Pfaff  wrote:
> > >
> > > On Fri, Nov 16, 2018 at 06:41:33PM +0100, Daniel Alvarez Sanchez wrote:
> > > > +static void
> > > > +delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip)
> > > > +{
> > > > +    const struct sbrec_mac_binding *b, *n;
> > > > +    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
> > > > +        if (strstr(ip, b->ip)) {
> > > > +            sbrec_mac_binding_delete(b);
> > > > +        }
> > > > +    }
> > > > +}
> > >
> > > I haven't read the whole thread properly yet, but: why does this use
> > > strstr()?
> >
> > I used it because b->ip could be like "50:57:00:00:00:02 20.0.0.10"
> > and wanted to check if the IP address was present there.
>
> Is the 'ip' column in the MAC_Binding table documented incorrectly?  It
> is currently documented as:
>
>ip: string
>   The bound IP address.
>
> which doesn't mention strings that also contain a MAC address.
Sorry for the confusion, the prototype is misleading. It's not the
'ip' col of the MAC_Binding table but the 'mac' column of the
Port_Binding table which is what's being passed to the
'delete_mac_binding_by_ip()' function.
+    for (int i = 0; i < op->sb->n_mac; i++) {
+        delete_mac_binding_by_ip(ctx, op->sb->mac[i]);

Thanks!
Daniel


>
> > I am sending another email to this thread with more details about the
> > current issue, to gather more feedback.  As Han says, the patch I sent
> > is not covering all situations and perhaps it's not the best way to
> > fix it but need to confirm few things before moving forward.
>
> Thanks.


Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-28 Thread Ben Pfaff
On Wed, Nov 28, 2018 at 12:07:55PM +0100, Daniel Alvarez Sanchez wrote:
> On Mon, Nov 26, 2018 at 9:30 PM Ben Pfaff  wrote:
> >
> > On Fri, Nov 16, 2018 at 06:41:33PM +0100, Daniel Alvarez Sanchez wrote:
> > > +static void
> > > +delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip)
> > > +{
> > > +    const struct sbrec_mac_binding *b, *n;
> > > +    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
> > > +        if (strstr(ip, b->ip)) {
> > > +            sbrec_mac_binding_delete(b);
> > > +        }
> > > +    }
> > > +}
> >
> > I haven't read the whole thread properly yet, but: why does this use
> > strstr()?
> 
> I used it because b->ip could be like "50:57:00:00:00:02 20.0.0.10"
> and wanted to check if the IP address was present there. 

Is the 'ip' column in the MAC_Binding table documented incorrectly?  It
is currently documented as:

   ip: string
  The bound IP address.

which doesn't mention strings that also contain a MAC address.

> I am sending another email to this thread with more details about the
> current issue, to gather more feedback.  As Han says, the patch I sent
> is not covering all situations and perhaps it's not the best way to
> fix it but need to confirm few things before moving forward.

Thanks.


Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-28 Thread Daniel Alvarez Sanchez
On Mon, Nov 26, 2018 at 9:30 PM Ben Pfaff  wrote:
>
> On Fri, Nov 16, 2018 at 06:41:33PM +0100, Daniel Alvarez Sanchez wrote:
> > +static void
> > +delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip)
> > +{
> > +    const struct sbrec_mac_binding *b, *n;
> > +    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
> > +        if (strstr(ip, b->ip)) {
> > +            sbrec_mac_binding_delete(b);
> > +        }
> > +    }
> > +}
>
> I haven't read the whole thread properly yet, but: why does this use
> strstr()?

I used it because b->ip could be like "50:57:00:00:00:02 20.0.0.10"
and wanted to check if the IP address was present there. I am sending
another email to this thread with more details about the current
issue, to gather more feedback.
As Han says, the patch I sent is not covering all situations and
perhaps it's not the best way to fix it but need to confirm few things
before moving forward.
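
One concern with the strstr() call in the quoted patch is that it matches
substrings: with an address string of "50:57:00:00:00:02 20.0.0.10", a
MAC_Binding row whose ip is "20.0.0.1" would also match and be deleted.
A minimal sketch of an exact, token-based comparison follows, assuming the
address string is a space-separated list such as "MAC IP [IP ...]". The
helper name addresses_contain_ip is hypothetical and not part of OVN.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if 'ip' appears as a complete space-separated token in
 * 'addresses' (for example "50:57:00:00:00:02 20.0.0.10").
 * Hypothetical helper for illustration; not part of OVN. */
static bool
addresses_contain_ip(const char *addresses, const char *ip)
{
    size_t ip_len = strlen(ip);
    const char *p = addresses;

    if (!ip_len) {
        return false;
    }
    while (*p) {
        /* Skip separators, then measure the next token. */
        p += strspn(p, " ");
        size_t tok_len = strcspn(p, " ");
        if (tok_len == ip_len && !memcmp(p, ip, ip_len)) {
            return true;
        }
        p += tok_len;
    }
    return false;
}
```

With this check, "20.0.0.10" is found in the example string above while
"20.0.0.1" is not, whereas strstr() reports a match for both.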


Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-26 Thread Ben Pfaff
On Fri, Nov 16, 2018 at 06:41:33PM +0100, Daniel Alvarez Sanchez wrote:
> +static void
> +delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip)
> +{
> +    const struct sbrec_mac_binding *b, *n;
> +    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
> +        if (strstr(ip, b->ip)) {
> +            sbrec_mac_binding_delete(b);
> +        }
> +    }
> +}

I haven't read the whole thread properly yet, but: why does this use
strstr()?


Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-23 Thread Daniel Alvarez Sanchez
On Wed, Nov 21, 2018 at 9:04 PM Han Zhou  wrote:
>
>
>
> On Tue, Nov 20, 2018 at 5:21 AM Mark Michelson  wrote:
> >
> > Hi Daniel,
> >
> > I agree with Numan that this seems like a good approach to take.
> >
> > On 11/16/2018 12:41 PM, Daniel Alvarez Sanchez wrote:
> > >
> > > > On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff wrote:
> > > >  >
> > > >  > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> > > >  > > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez
> > > >  > > <dalva...@redhat.com> wrote:
> > > >  > >
> > > >  > > > Hi,
> > > >  > > >
> > > >  > > > After digging further, the problem seems to be reduced to
> > > >  > > > reusing an old gateway IP address for a dnat_and_snat entry.
> > > >  > > > When a gateway port is bound to a chassis, its entry will show
> > > >  > > > up in the MAC_Binding table (at least when that Logical Switch
> > > >  > > > is connected to more than one Logical Router). After deleting
> > > >  > > > the Logical Router and all its ports, this entry will remain
> > > >  > > > there. If a new Logical Router is created and a Floating IP
> > > >  > > > (dnat_and_snat) is assigned to a VM with the old gw IP address,
> > > >  > > > it will become unreachable.
> > > >  > > >
> > > >  > > > A workaround now from networking-ovn (OpenStack integration) is
> > > >  > > > to delete MAC_Binding entries for that IP address upon a FIP
> > > >  > > > creation. I think that this however should be done from OVN,
> > > >  > > > what do you folks think?
> > > >  > > >
> > > >  > > >
> > > >  > > Agree. Since the MAC_Binding table row is created by
> > > >  > > ovn-controller, it should be handled properly within OVN.
> > > >  >
> > > >  > I see that this has been sitting here for a while.  The solution seems
> > > >  > reasonable to me.  Are either of you working on it?
> > >
> > > I started working on it. I came up with a solution (see patch below)
> > > which works but I wanted to give you a bit more of context and get your
> > > feedback:
> > >
> > >
> > > >                             ^ localnet
> > > >                             |
> > > >                         +---+---+
> > > >                         |       |
> > > >                  +------+  pub  +------+
> > > >                  |      |       |      |
> > > >                  |      +-------+      |
> > > >                  |    172.24.4.0/24    |
> > > >                  |                     |
> > > >     172.24.4.220 |                     | 172.24.4.221
> > > >              +---+---+             +---+---+
> > > >              |       |             |       |
> > > >              |  LR0  |             |  LR1  |
> > > >              |       |             |       |
> > > >              +---+---+             +---+---+
> > > >       10.0.0.254 |                     | 20.0.0.254
> > > >                  |                     |
> > > >              +---+---+             +---+---+
> > > >              |       |             |       |
> > > > 10.0.0.0/24  |  SW0  |             |  SW1  |  20.0.0.0/24
> > > >              |       |             |       |
> > > >              +---+---+             +---+---+
> > > >                  |                     |
> > > >                  |                     |
> > > >              +---+---+             +---+---+
> > > >              |       |             |       |
> > > >              |  VM0  |             |  VM1  |
> > > >              |       |             |       |
> > > >              +-------+             +-------+
> > > >              10.0.0.10             20.0.0.10
> > > >            172.24.4.100          172.24.4.200
> > >
> > >
> > > When I ping VM1 floating IP from the external network, a new entry for
> > > 172.24.4.221 in the LR0 datapath appears in the MAC_Binding table:
> > >
> > > _uuid   : 85e30e87-3c59-423e-8681-ec4cfd9205f9
> > > datapath: ac5984b9-0fea-485f-84d4-031bdeced29b
> > > ip  : "172.24.4.221"
> > > logical_port: "lrp02"
> > > mac : "00:00:02:01:02:04"
> > >
> > >
> > > Now, if LR1 gets removed and the old gateway IP (172.24.4.221) is reused
> > > for VM2 FIP with different MAC and new gateway IP is created (for
> > > example 172.24.4.222 00:00:02:01:02:99),  VM2 FIP becomes unreachable
> > > from VM1 until the old MAC_Binding entry gets deleted as pinging
> > > 172.24.4.221 will use the wrong address ("00:00:02:01:02:04").
> > >
> > > With the patch below, removing LR1 results in deleting all MAC_Binding
> > > entries for every datapath where '172.24.4.221' appears in the 'ip'
> > > column so the problem goes away.
> > >
> > > Another solution would be implementing some kind of 'aging' for
> > > MAC_Binding entries but perhaps it's more complex.
> > > Looking forward for your comments :)
> > >
> > >
> > > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> > > index 58bef7d..a86733e 100644
> > > --- a/ovn/northd/ovn-northd.c
> > > +++ 

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-23 Thread Daniel Alvarez Sanchez
Hi Han,

Yes, I agree that the patch is not enough. I'll take a look at the
GARP thing because it's either not implemented or not working. Here's
a reproducer while I jump back into it.

When you ping 172.24.4.200 from namespace ns1 for the first time, a
MAC_Binding entry gets created:

# ovn-sbctl list mac_binding | grep 200 -C2
_uuid   : 07967416-c89c-4233-8cc2-4dc929720838
datapath: 918a9363-fa6e-4086-98ee-8d073b924d29
ip  : "172.24.4.200"
logical_port: "lr0-public"
mac : "00:00:20:20:12:15"


After recreating lr1 and sw1 using a different MAC address,
172.24.4.200 becomes unreachable from sw0 as the MAC_Binding entry
never gets updated.
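
The staleness can be pictured with a toy neighbour table. This is a
minimal sketch in C, assuming a binding is keyed by (logical_port, ip) and
is only changed when the router port sees a new ARP or GARP for that IP.
The struct and function names are illustrative and not OVN code, and the
second MAC address used in the usage note is made up for the example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_BINDINGS 16

/* Toy stand-in for one MAC_Binding row.  Field names mirror the columns
 * shown above; the types are simplified for illustration. */
struct binding {
    char logical_port[32];
    char ip[16];
    char mac[18];
};

struct binding_table {
    struct binding rows[MAX_BINDINGS];
    size_t n;
};

/* Returns the MAC bound to (logical_port, ip), or NULL if there is none. */
static const char *
binding_lookup(const struct binding_table *t, const char *logical_port,
               const char *ip)
{
    for (size_t i = 0; i < t->n; i++) {
        if (!strcmp(t->rows[i].logical_port, logical_port)
            && !strcmp(t->rows[i].ip, ip)) {
            return t->rows[i].mac;
        }
    }
    return NULL;
}

/* Learns or refreshes a binding, as a router port would on seeing an ARP
 * or GARP for 'ip'.  Returns 0 on success, -1 if the table is full. */
static int
binding_learn(struct binding_table *t, const char *logical_port,
              const char *ip, const char *mac)
{
    for (size_t i = 0; i < t->n; i++) {
        if (!strcmp(t->rows[i].logical_port, logical_port)
            && !strcmp(t->rows[i].ip, ip)) {
            snprintf(t->rows[i].mac, sizeof t->rows[i].mac, "%s", mac);
            return 0;
        }
    }
    if (t->n == MAX_BINDINGS) {
        return -1;
    }
    struct binding *b = &t->rows[t->n++];
    snprintf(b->logical_port, sizeof b->logical_port, "%s", logical_port);
    snprintf(b->ip, sizeof b->ip, "%s", ip);
    snprintf(b->mac, sizeof b->mac, "%s", mac);
    return 0;
}
```

After binding_learn(&t, "lr0-public", "172.24.4.200", "00:00:20:20:12:15"),
every lookup keeps returning that MAC until binding_learn() is called
again for the same key, for instance with "00:00:20:20:12:99". If the
GARP for the recreated router never reaches lr0-public, that second call
never happens and traffic keeps going to the old address.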


reproducer.sh

#!/bin/bash
for i in $(ovn-sbctl list mac_binding | grep uuid | awk '{print $3}'); do
    ovn-sbctl destroy mac_binding $i
done

ip net del ns1
ip net del ns2
ovs-vsctl del-port ns1
ovs-vsctl del-port ns2
ovn-nbctl lr-del lr0
ovn-nbctl lr-del lr1
ovn-nbctl ls-del sw0
ovn-nbctl ls-del sw1
ovn-nbctl ls-del public

chassis_name=`ovn-sbctl find chassis | grep ^name | awk '{print $3}'`
ovn-nbctl ls-add sw0
ovn-nbctl lsp-add sw0 sw0-port1
ovn-nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:01 10.0.0.10"


ovn-nbctl lr-add lr0
# Connect sw0 to lr0
ovn-nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.0.254/24
ovn-nbctl lsp-add sw0 sw0-lr0
ovn-nbctl lsp-set-type sw0-lr0 router
ovn-nbctl lsp-set-addresses sw0-lr0 router
ovn-nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0


ovn-nbctl ls-add public
ovn-nbctl lrp-add lr0  lr0-public 00:00:20:20:12:13 172.24.4.220/24
ovn-nbctl lsp-add public public-lr0
ovn-nbctl lsp-set-type public-lr0 router
ovn-nbctl lsp-set-addresses public-lr0 router
ovn-nbctl lsp-set-options public-lr0 router-port=lr0-public

# localnet port
ovn-nbctl lsp-add public ln-public
ovn-nbctl lsp-set-type ln-public localnet
ovn-nbctl lsp-set-addresses ln-public unknown
ovn-nbctl lsp-set-options ln-public network_name=public

ovn-nbctl ls-add sw1
ovn-nbctl lsp-add sw1 sw1-port1
ovn-nbctl lsp-set-addresses sw1-port1 "50:57:00:00:00:02 20.0.0.10"

ovn-nbctl lr-add lr1
# Connect sw1 to lr1
ovn-nbctl lrp-add lr1 lr1-sw1 00:00:00:00:ff:02 20.0.0.254/24
ovn-nbctl lsp-add sw1 sw1-lr1
ovn-nbctl lsp-set-type sw1-lr1 router
ovn-nbctl lsp-set-addresses sw1-lr1 router
ovn-nbctl lsp-set-options sw1-lr1 router-port=lr1-sw1

ovn-nbctl lrp-add lr1  lr1-public 00:00:20:20:12:15 172.24.4.221/24
ovn-nbctl lsp-add public public-lr1
ovn-nbctl lsp-set-type public-lr1 router
ovn-nbctl lsp-set-addresses public-lr1 router
ovn-nbctl lsp-set-options public-lr1 router-port=lr1-public


ovn-nbctl lr-nat-add lr0 snat 172.24.4.220 10.0.0.0/24
ovn-nbctl lr-nat-add lr1 snat 172.24.4.221  20.0.0.0/24

# Create the FIPs
ovn-nbctl lr-nat-add lr0 dnat_and_snat 172.24.4.100 10.0.0.10
ovn-nbctl lr-nat-add lr1 dnat_and_snat 172.24.4.200 20.0.0.10

# Schedule the gateways
ovn-nbctl lrp-set-gateway-chassis lr0-public $chassis_name 20
ovn-nbctl lrp-set-gateway-chassis lr1-public $chassis_name  20


add_phys_port() {
name=$1
mac=$2
ip=$3
mask=$4
gw=$5
iface_id=$6
ip netns add $name
ovs-vsctl add-port br-int $name -- set interface $name type=internal
ip link set $name netns $name
ip netns exec $name ip link set $name address $mac
ip netns exec $name ip addr add $ip/$mask dev $name
ip netns exec $name ip link set $name up
ip netns exec $name ip route add default via $gw
ovs-vsctl set Interface $name external_ids:iface-id=$iface_id
}


add_phys_port ns1 50:54:00:00:00:01 10.0.0.10  24 10.0.0.254 sw0-port1
add_phys_port ns2 50:57:00:00:00:02 20.0.0.10  24 20.0.0.254 sw1-port1

# Pinging from sw0
ip net e ns1 ping -c 4 172.24.4.200

ovn-nbctl lr-del lr1
ovn-nbctl ls-del sw1

ovn-nbctl ls-add sw1
ovn-nbctl lsp-add sw1 sw1-port1
ovn-nbctl lsp-set-addresses sw1-port1 "50:57:00:00:00:02 20.0.0.10"

ovn-nbctl lr-add lr1
# Connect sw1 to lr1
ovn-nbctl lrp-add lr1 lr1-sw1 00:00:00:00:ff:02 20.0.0.254/24
ovn-nbctl lsp-add sw1 sw1-lr1
ovn-nbctl lsp-set-type sw1-lr1 router
ovn-nbctl lsp-set-addresses sw1-lr1 router
ovn-nbctl lsp-set-options sw1-lr1 router-port=lr1-sw1


# Change the MAC address of the LRP
ovn-nbctl lrp-add lr1  lr1-public 00:00:20:20:12:95 172.24.4.221/24

ovn-nbctl lr-nat-add lr1 snat 172.24.4.221  20.0.0.0/24
ovn-nbctl lr-nat-add lr1 dnat_and_snat 172.24.4.200 20.0.0.10

ovn-nbctl lrp-set-gateway-chassis lr1-public $chassis_name 20

# Pinging from sw0 won't work now. For the outside it will.
ip net e ns1 ping -c 4 172.24.4.200

On Wed, Nov 21, 2018 at 9:04 PM Han Zhou wrote:
>
>
>
> On Tue, Nov 20, 2018 at 5:21 AM Mark Michelson  wrote:
> >
> > Hi Daniel,
> >
> > I agree with Numan that this seems like a good approach to take.
> >
> > On 11/16/2018 12:41 PM, Daniel Alvarez Sanchez wrote:
> > >
> > > > On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff wrote:
> > >  >
> > >  > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> > >  > > On 

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-21 Thread Han Zhou
On Tue, Nov 20, 2018 at 5:21 AM Mark Michelson  wrote:
>
> Hi Daniel,
>
> I agree with Numan that this seems like a good approach to take.
>
> On 11/16/2018 12:41 PM, Daniel Alvarez Sanchez wrote:
> >
> > On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff wrote:
> >  >
> >  > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> >  > > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez
> >  > > <dalva...@redhat.com> wrote:
> >  > >
> >  > > > Hi,
> >  > > >
> >  > > > After digging further. The problem seems to be reduced to reusing an
> >  > > > old gateway IP address for a dnat_and_snat entry.
> >  > > > When a gateway port is bound to a chassis, its entry will show up in
> >  > > > the MAC_Binding table (at least when that Logical Switch is connected
> >  > > > to more than one Logical Router). After deleting the Logical Router
> >  > > > and all its ports, this entry will remain there. If a new Logical
> >  > > > Router is created and a Floating IP (dnat_and_snat) is assigned to a
> >  > > > VM with the old gw IP address, it will become unreachable.
> >  > > >
> >  > > > A workaround now from networking-ovn (OpenStack integration) is to
> >  > > > delete MAC_Binding entries for that IP address upon a FIP creation. I
> >  > > > think that this however should be done from OVN, what do you folks
> >  > > > think?
> >  > > >
> >  > > >
> >  > > Agree. Since the MAC_Binding table row is created by ovn-controller, it
> >  > > should
> >  > > be handled properly within OVN.
> >  >
> >  > I see that this has been sitting here for a while.  The solution seems
> >  > reasonable to me.  Are either of you working on it?
> >
> > I started working on it. I came up with a solution (see patch below)
> > which works but I wanted to give you a bit more of context and get your
> > feedback:
> >
> >
> >                             ^ localnet
> >                             |
> >                         +---+---+
> >                         |       |
> >                  +------+  pub  +------+
> >                  |      |       |      |
> >                  |      +-------+      |
> >                  |    172.24.4.0/24    |
> >                  |                     |
> >     172.24.4.220 |                     | 172.24.4.221
> >              +---+---+             +---+---+
> >              |       |             |       |
> >              |  LR0  |             |  LR1  |
> >              |       |             |       |
> >              +---+---+             +---+---+
> >       10.0.0.254 |                     | 20.0.0.254
> >                  |                     |
> >              +---+---+             +---+---+
> >              |       |             |       |
> > 10.0.0.0/24  |  SW0  |             |  SW1  | 20.0.0.0/24
> >              |       |             |       |
> >              +---+---+             +---+---+
> >                  |                     |
> >                  |                     |
> >              +---+---+             +---+---+
> >              |       |             |       |
> >              |  VM0  |             |  VM1  |
> >              |       |             |       |
> >              +-------+             +-------+
> >              10.0.0.10             20.0.0.10
> >            172.24.4.100           172.24.4.200
> >
> >
> > When I ping VM1 floating IP from the external network, a new entry for
> > 172.24.4.221 in the LR0 datapath appears in the MAC_Binding table:
> >
> > _uuid               : 85e30e87-3c59-423e-8681-ec4cfd9205f9
> > datapath            : ac5984b9-0fea-485f-84d4-031bdeced29b
> > ip                  : "172.24.4.221"
> > logical_port        : "lrp02"
> > mac                 : "00:00:02:01:02:04"
> >
> >
> > Now, if LR1 gets removed and the old gateway IP (172.24.4.221) is reused
> > for VM2 FIP with different MAC and new gateway IP is created (for
> > example 172.24.4.222 00:00:02:01:02:99),  VM2 FIP becomes unreachable
> > from VM1 until the old MAC_Binding entry gets deleted as pinging
> > 172.24.4.221 will use the wrong address ("00:00:02:01:02:04").
> >
> > With the patch below, removing LR1 results in deleting all MAC_Binding
> > entries for every datapath where '172.24.4.221' appears in the 'ip'
> > column so the problem goes away.
> >
> > Another solution would be implementing some kind of 'aging' for
> > MAC_Binding entries but perhaps it's more complex.
> > Looking forward for your comments :)
> >
> >
> > diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> > index 58bef7d..a86733e 100644
> > --- a/ovn/northd/ovn-northd.c
> > +++ b/ovn/northd/ovn-northd.c
> > @@ -2324,6 +2324,18 @@ cleanup_mac_bindings(struct northd_context *ctx, struct hmap *ports)
> >      }
> >  }
> >
> > +static void
> > +delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip)
> > +{
> > +    const struct sbrec_mac_binding *b, *n;
> > +

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-20 Thread Mark Michelson

Hi Daniel,

I agree with Numan that this seems like a good approach to take.

On 11/16/2018 12:41 PM, Daniel Alvarez Sanchez wrote:


On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff wrote:

 >
 > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
 > > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez
 > > <dalva...@redhat.com> wrote:
 > >
 > > > Hi,
 > > >
 > > > After digging further. The problem seems to be reduced to reusing an
 > > > old gateway IP address for a dnat_and_snat entry.
 > > > When a gateway port is bound to a chassis, its entry will show up in
 > > > the MAC_Binding table (at least when that Logical Switch is connected
 > > > to more than one Logical Router). After deleting the Logical Router
 > > > and all its ports, this entry will remain there. If a new Logical
 > > > Router is created and a Floating IP (dnat_and_snat) is assigned to a
 > > > VM with the old gw IP address, it will become unreachable.
 > > >
 > > > A workaround now from networking-ovn (OpenStack integration) is to
 > > > delete MAC_Binding entries for that IP address upon a FIP creation. I
 > > > think that this however should be done from OVN, what do you folks
 > > > think?
 > > >
 > > >
 > > Agree. Since the MAC_Binding table row is created by ovn-controller, it
 > > should
 > > be handled properly within OVN.
 >
 > I see that this has been sitting here for a while.  The solution seems
 > reasonable to me.  Are either of you working on it?

I started working on it. I came up with a solution (see patch below) 
which works but I wanted to give you a bit more of context and get your 
feedback:



                            ^ localnet
                            |
                        +---+---+
                        |       |
                 +------+  pub  +------+
                 |      |       |      |
                 |      +-------+      |
                 |    172.24.4.0/24    |
                 |                     |
    172.24.4.220 |                     | 172.24.4.221
             +---+---+             +---+---+
             |       |             |       |
             |  LR0  |             |  LR1  |
             |       |             |       |
             +---+---+             +---+---+
      10.0.0.254 |                     | 20.0.0.254
                 |                     |
             +---+---+             +---+---+
             |       |             |       |
10.0.0.0/24  |  SW0  |             |  SW1  | 20.0.0.0/24
             |       |             |       |
             +---+---+             +---+---+
                 |                     |
                 |                     |
             +---+---+             +---+---+
             |       |             |       |
             |  VM0  |             |  VM1  |
             |       |             |       |
             +-------+             +-------+
             10.0.0.10             20.0.0.10
           172.24.4.100           172.24.4.200


When I ping VM1 floating IP from the external network, a new entry for 
172.24.4.221 in the LR0 datapath appears in the MAC_Binding table:


_uuid               : 85e30e87-3c59-423e-8681-ec4cfd9205f9
datapath            : ac5984b9-0fea-485f-84d4-031bdeced29b
ip                  : "172.24.4.221"
logical_port        : "lrp02"
mac                 : "00:00:02:01:02:04"


Now, if LR1 gets removed and the old gateway IP (172.24.4.221) is reused 
for VM2 FIP with different MAC and new gateway IP is created (for 
example 172.24.4.222 00:00:02:01:02:99),  VM2 FIP becomes unreachable 
from VM1 until the old MAC_Binding entry gets deleted as pinging 
172.24.4.221 will use the wrong address ("00:00:02:01:02:04").


With the patch below, removing LR1 results in deleting all MAC_Binding 
entries for every datapath where '172.24.4.221' appears in the 'ip' 
column so the problem goes away.


Another solution would be implementing some kind of 'aging' for 
MAC_Binding entries but perhaps it's more complex.

Looking forward for your comments :)


diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 58bef7d..a86733e 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2324,6 +2324,18 @@ cleanup_mac_bindings(struct northd_context *ctx, struct hmap *ports)
      }
  }

+static void
+delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip)
+{
+    const struct sbrec_mac_binding *b, *n;
+    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
+        if (strstr(ip, b->ip)) {
+            sbrec_mac_binding_delete(b);
+        }
+    }
+}
+
+
  /* Updates the southbound Port_Binding table so that it contains the logical
   * switch ports specified by the northbound database.
   *
@@ -2383,6 +2395,15 @@ build_ports(struct northd_context *ctx,
      /* Delete southbound records without northbound matches. */
      LIST_FOR_EACH_SAFE(op, next, list, &sb_only) {
         

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-19 Thread Numan Siddique
On Mon, Nov 19, 2018 at 2:56 PM Daniel Alvarez Sanchez 
wrote:

> Having thought this again, I'd rather merge the patch I proposed in my
> previous email (I'd need tests and propose a formal patch after your
> feedback) but in the long term I think it'd make sense to also implement
> some sort of aging to the MAC_Binding entries so that they eventually
> expire, especially for entries that come from external networks.
>
> On Fri, Nov 16, 2018 at 6:41 PM Daniel Alvarez Sanchez <
> dalva...@redhat.com> wrote:
>
>>
>> On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff  wrote:
>> >
>> > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
>> > > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez <
>> dalva...@redhat.com>
>> > > wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > After digging further. The problem seems to be reduced to reusing an
>> > > > old gateway IP address for a dnat_and_snat entry.
>> > > > When a gateway port is bound to a chassis, its entry will show up in
>> > > > the MAC_Binding table (at least when that Logical Switch is
>> connected
>> > > > to more than one Logical Router). After deleting the Logical Router
>> > > > and all its ports, this entry will remain there. If a new Logical
>> > > > Router is created and a Floating IP (dnat_and_snat) is assigned to a
>> > > > VM with the old gw IP address, it will become unreachable.
>> > > >
>> > > > A workaround now from networking-ovn (OpenStack integration) is to
>> > > > delete MAC_Binding entries for that IP address upon a FIP creation.
>> I
>> > > > think that this however should be done from OVN, what do you folks
>> > > > think?
>> > > >
>> > > >
>> > > Agree. Since the MAC_Binding table row is created by ovn-controller,
>> it
>> > > should
>> > > be handled properly within OVN.
>> >
>> > I see that this has been sitting here for a while.  The solution seems
>> > reasonable to me.  Are either of you working on it?
>>
>> I started working on it. I came up with a solution (see patch below)
>> which works but I wanted to give you a bit more of context and get your
>> feedback:
>>
>>
>>                             ^ localnet
>>                             |
>>                         +---+---+
>>                         |       |
>>                  +------+  pub  +------+
>>                  |      |       |      |
>>                  |      +-------+      |
>>                  |    172.24.4.0/24    |
>>                  |                     |
>>     172.24.4.220 |                     | 172.24.4.221
>>              +---+---+             +---+---+
>>              |       |             |       |
>>              |  LR0  |             |  LR1  |
>>              |       |             |       |
>>              +---+---+             +---+---+
>>       10.0.0.254 |                     | 20.0.0.254
>>                  |                     |
>>              +---+---+             +---+---+
>>              |       |             |       |
>> 10.0.0.0/24  |  SW0  |             |  SW1  | 20.0.0.0/24
>>              |       |             |       |
>>              +---+---+             +---+---+
>>                  |                     |
>>                  |                     |
>>              +---+---+             +---+---+
>>              |       |             |       |
>>              |  VM0  |             |  VM1  |
>>              |       |             |       |
>>              +-------+             +-------+
>>              10.0.0.10             20.0.0.10
>>            172.24.4.100           172.24.4.200
>>
>>
>> When I ping VM1 floating IP from the external network, a new entry for
>> 172.24.4.221 in the LR0 datapath appears in the MAC_Binding table:
>>
>> _uuid               : 85e30e87-3c59-423e-8681-ec4cfd9205f9
>> datapath            : ac5984b9-0fea-485f-84d4-031bdeced29b
>> ip                  : "172.24.4.221"
>> logical_port        : "lrp02"
>> mac                 : "00:00:02:01:02:04"
>>
>>
>> Now, if LR1 gets removed and the old gateway IP (172.24.4.221) is reused
>> for VM2 FIP with different MAC and new gateway IP is created (for example
>> 172.24.4.222 00:00:02:01:02:99),  VM2 FIP becomes unreachable from VM1
>> until the old MAC_Binding entry gets deleted as pinging 172.24.4.221 will
>> use the wrong address ("00:00:02:01:02:04").
>>
>> With the patch below, removing LR1 results in deleting all MAC_Binding
>> entries for every datapath where '172.24.4.221' appears in the 'ip' column
>> so the problem goes away.
>>
>> Another solution would be implementing some kind of 'aging' for
>> MAC_Binding entries but perhaps it's more complex.
>> Looking forward for your comments :)
>>
>>
As discussed with you offline, ageing by itself might not solve this issue.
We might still hit it until the MAC_Binding entry ages out and gets
flushed. Your proposed solution seems fine to me.

Thanks
Numan


>> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
>> index 58bef7d..a86733e 100644
>> --- a/ovn/northd/ovn-northd.c
>> +++ b/ovn/northd/ovn-northd.c
>> @@ 

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-19 Thread Daniel Alvarez Sanchez
Having thought about this again, I'd rather merge the patch I proposed in my
previous email (I'd need to add tests and propose a formal patch after your
feedback), but in the long term I think it'd make sense to also implement
some sort of aging for the MAC_Binding entries so that they eventually
expire, especially for entries that come from external networks.
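
To make the aging idea a bit more concrete, here is a rough shell
sketch. It is purely hypothetical: it assumes every binding carries a
last-seen timestamp (seconds since the epoch), which the MAC_Binding
records shown in this thread do not have, and expired_bindings is a
made-up name.

```shell
#!/bin/sh
# Hypothetical aging check. Reads "<uuid> <last_seen_epoch>" lines on
# stdin and prints the uuid of every entry older than max_age seconds.
# $1 = current time (epoch seconds), $2 = max_age (seconds).
expired_bindings() {
    awk -v now="$1" -v max_age="$2" 'now - $2 > max_age { print $1 }'
}

# Usage, given some source of "<uuid> <last_seen_epoch>" lines:
#   ... | expired_bindings "$(date +%s)" 300
```

Something would still have to refresh the timestamp whenever the entry
is confirmed by traffic, otherwise live neighbours would be expired too.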

On Fri, Nov 16, 2018 at 6:41 PM Daniel Alvarez Sanchez 
wrote:

>
> On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff  wrote:
> >
> > On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> > > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez <
> dalva...@redhat.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > After digging further. The problem seems to be reduced to reusing an
> > > > old gateway IP address for a dnat_and_snat entry.
> > > > When a gateway port is bound to a chassis, its entry will show up in
> > > > the MAC_Binding table (at least when that Logical Switch is connected
> > > > to more than one Logical Router). After deleting the Logical Router
> > > > and all its ports, this entry will remain there. If a new Logical
> > > > Router is created and a Floating IP (dnat_and_snat) is assigned to a
> > > > VM with the old gw IP address, it will become unreachable.
> > > >
> > > > A workaround now from networking-ovn (OpenStack integration) is to
> > > > delete MAC_Binding entries for that IP address upon a FIP creation. I
> > > > think that this however should be done from OVN, what do you folks
> > > > think?
> > > >
> > > >
> > > Agree. Since the MAC_Binding table row is created by ovn-controller, it
> > > should
> > > be handled properly within OVN.
> >
> > I see that this has been sitting here for a while.  The solution seems
> > reasonable to me.  Are either of you working on it?
>
> I started working on it. I came up with a solution (see patch below) which
> works but I wanted to give you a bit more of context and get your feedback:
>
>
>                             ^ localnet
>                             |
>                         +---+---+
>                         |       |
>                  +------+  pub  +------+
>                  |      |       |      |
>                  |      +-------+      |
>                  |    172.24.4.0/24    |
>                  |                     |
>     172.24.4.220 |                     | 172.24.4.221
>              +---+---+             +---+---+
>              |       |             |       |
>              |  LR0  |             |  LR1  |
>              |       |             |       |
>              +---+---+             +---+---+
>       10.0.0.254 |                     | 20.0.0.254
>                  |                     |
>              +---+---+             +---+---+
>              |       |             |       |
> 10.0.0.0/24  |  SW0  |             |  SW1  | 20.0.0.0/24
>              |       |             |       |
>              +---+---+             +---+---+
>                  |                     |
>                  |                     |
>              +---+---+             +---+---+
>              |       |             |       |
>              |  VM0  |             |  VM1  |
>              |       |             |       |
>              +-------+             +-------+
>              10.0.0.10             20.0.0.10
>            172.24.4.100           172.24.4.200
>
>
> When I ping VM1 floating IP from the external network, a new entry for
> 172.24.4.221 in the LR0 datapath appears in the MAC_Binding table:
>
> _uuid               : 85e30e87-3c59-423e-8681-ec4cfd9205f9
> datapath            : ac5984b9-0fea-485f-84d4-031bdeced29b
> ip                  : "172.24.4.221"
> logical_port        : "lrp02"
> mac                 : "00:00:02:01:02:04"
>
>
> Now, if LR1 gets removed and the old gateway IP (172.24.4.221) is reused
> for VM2 FIP with different MAC and new gateway IP is created (for example
> 172.24.4.222 00:00:02:01:02:99),  VM2 FIP becomes unreachable from VM1
> until the old MAC_Binding entry gets deleted as pinging 172.24.4.221 will
> use the wrong address ("00:00:02:01:02:04").
>
> With the patch below, removing LR1 results in deleting all MAC_Binding
> entries for every datapath where '172.24.4.221' appears in the 'ip' column
> so the problem goes away.
>
> Another solution would be implementing some kind of 'aging' for
> MAC_Binding entries but perhaps it's more complex.
> Looking forward for your comments :)
>
>
> diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
> index 58bef7d..a86733e 100644
> --- a/ovn/northd/ovn-northd.c
> +++ b/ovn/northd/ovn-northd.c
> @@ -2324,6 +2324,18 @@ cleanup_mac_bindings(struct northd_context *ctx, struct hmap *ports)
>      }
>  }
>
> +static void
> +delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip)
> +{
> +    const struct sbrec_mac_binding *b, *n;
> +    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
> +        if (strstr(ip, b->ip)) {
> +            sbrec_mac_binding_delete(b);
> +        }
> +    }
> +}
> +
> +
>  /* 

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-16 Thread Daniel Alvarez Sanchez
On Sat, Nov 10, 2018 at 12:21 AM Ben Pfaff  wrote:
>
> On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> > On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez <
dalva...@redhat.com>
> > wrote:
> >
> > > Hi,
> > >
> > > After digging further. The problem seems to be reduced to reusing an
> > > old gateway IP address for a dnat_and_snat entry.
> > > When a gateway port is bound to a chassis, its entry will show up in
> > > the MAC_Binding table (at least when that Logical Switch is connected
> > > to more than one Logical Router). After deleting the Logical Router
> > > and all its ports, this entry will remain there. If a new Logical
> > > Router is created and a Floating IP (dnat_and_snat) is assigned to a
> > > VM with the old gw IP address, it will become unreachable.
> > >
> > > A workaround now from networking-ovn (OpenStack integration) is to
> > > delete MAC_Binding entries for that IP address upon a FIP creation. I
> > > think that this however should be done from OVN, what do you folks
> > > think?
> > >
> > >
> > Agree. Since the MAC_Binding table row is created by ovn-controller, it
> > should
> > be handled properly within OVN.
>
> I see that this has been sitting here for a while.  The solution seems
> reasonable to me.  Are either of you working on it?

I started working on it. I came up with a solution (see patch below) which
works but I wanted to give you a bit more of context and get your feedback:


                            ^ localnet
                            |
                        +---+---+
                        |       |
                 +------+  pub  +------+
                 |      |       |      |
                 |      +-------+      |
                 |    172.24.4.0/24    |
                 |                     |
    172.24.4.220 |                     | 172.24.4.221
             +---+---+             +---+---+
             |       |             |       |
             |  LR0  |             |  LR1  |
             |       |             |       |
             +---+---+             +---+---+
      10.0.0.254 |                     | 20.0.0.254
                 |                     |
             +---+---+             +---+---+
             |       |             |       |
10.0.0.0/24  |  SW0  |             |  SW1  | 20.0.0.0/24
             |       |             |       |
             +---+---+             +---+---+
                 |                     |
                 |                     |
             +---+---+             +---+---+
             |       |             |       |
             |  VM0  |             |  VM1  |
             |       |             |       |
             +-------+             +-------+
             10.0.0.10             20.0.0.10
           172.24.4.100           172.24.4.200


When I ping VM1 floating IP from the external network, a new entry for
172.24.4.221 in the LR0 datapath appears in the MAC_Binding table:

_uuid               : 85e30e87-3c59-423e-8681-ec4cfd9205f9
datapath            : ac5984b9-0fea-485f-84d4-031bdeced29b
ip                  : "172.24.4.221"
logical_port        : "lrp02"
mac                 : "00:00:02:01:02:04"


Now, if LR1 gets removed, the old gateway IP (172.24.4.221) is reused as
the FIP of a VM2 with a different MAC, and a new gateway IP is created (for
example 172.24.4.222 with MAC 00:00:02:01:02:99), the VM2 FIP becomes
unreachable from VM1 until the old MAC_Binding entry gets deleted, because
pinging 172.24.4.221 will still use the stale MAC ("00:00:02:01:02:04").

With the patch below, removing LR1 results in deleting all MAC_Binding
entries for every datapath where '172.24.4.221' appears in the 'ip' column
so the problem goes away.

Another solution would be implementing some kind of 'aging' for MAC_Binding
entries, but that is perhaps more complex.
Looking forward to your comments :)


diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 58bef7d..a86733e 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -2324,6 +2324,18 @@ cleanup_mac_bindings(struct northd_context *ctx, struct hmap *ports)
     }
 }

+static void
+delete_mac_binding_by_ip(struct northd_context *ctx, const char *ip)
+{
+    const struct sbrec_mac_binding *b, *n;
+    SBREC_MAC_BINDING_FOR_EACH_SAFE (b, n, ctx->ovnsb_idl) {
+        if (strstr(ip, b->ip)) {
+            sbrec_mac_binding_delete(b);
+        }
+    }
+}
+
+
 /* Updates the southbound Port_Binding table so that it contains the logical
  * switch ports specified by the northbound database.
  *
@@ -2383,6 +2395,15 @@ build_ports(struct northd_context *ctx,
     /* Delete southbound records without northbound matches. */
     LIST_FOR_EACH_SAFE(op, next, list, &sb_only) {
         ovs_list_remove(&op->list);
+
+        /* Delete all MAC_Binding entries which match the IP addresses of the
+         * deleted logical router port (ie. port with a peer). */
+        const char *peer = smap_get(&op->sb->options, "peer");
+        if (peer) {
+            for (int i = 0; i < op->sb->n_mac; i++) {
+ 
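
One thing to keep in mind about delete_mac_binding_by_ip() above:
strstr(ip, b->ip) is a substring test, so a binding for 172.24.4.22
would also match when the port address contains 172.24.4.221. The
sketch below shows an exact comparison instead, written in shell
rather than C to stay with the reproducer's language. It assumes the
'ip' argument is a space-separated address string such as
"00:00:20:20:12:95 172.24.4.221/24" (the call site is cut off in the
archive, so that is a guess), and addr_has_ip is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if one token of the address
# string $1 is exactly the IP $2 (an optional /prefix is ignored).
addr_has_ip() {
    want=$2
    for tok in $1; do
        # Strip an optional /prefix-length before comparing.
        if [ "${tok%%/*}" = "$want" ]; then
            return 0
        fi
    done
    return 1
}

# Usage:
#   addr_has_ip "00:00:20:20:12:95 172.24.4.221/24" 172.24.4.221  # match
#   addr_has_ip "00:00:20:20:12:95 172.24.4.221/24" 172.24.4.22   # no match
```

The equivalent in C would be to split the address string and compare
each token with strcmp() rather than searching with strstr().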

Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-11-09 Thread Ben Pfaff
On Mon, Oct 29, 2018 at 05:21:13PM +0530, Numan Siddique wrote:
> On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez 
> wrote:
> 
> > Hi,
> >
> > After digging further. The problem seems to be reduced to reusing an
> > old gateway IP address for a dnat_and_snat entry.
> > When a gateway port is bound to a chassis, its entry will show up in
> > the MAC_Binding table (at least when that Logical Switch is connected
> > to more than one Logical Router). After deleting the Logical Router
> > and all its ports, this entry will remain there. If a new Logical
> > Router is created and a Floating IP (dnat_and_snat) is assigned to a
> > VM with the old gw IP address, it will become unreachable.
> >
> > A workaround now from networking-ovn (OpenStack integration) is to
> > delete MAC_Binding entries for that IP address upon a FIP creation. I
> > think that this however should be done from OVN, what do you folks
> > think?
> >
> >
> Agree. Since the MAC_Binding table row is created by ovn-controller, it
> should
> be handled properly within OVN.

I see that this has been sitting here for a while.  The solution seems
reasonable to me.  Are either of you working on it?
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-10-29 Thread Numan Siddique
On Mon, Oct 29, 2018 at 5:00 PM Daniel Alvarez Sanchez 
wrote:

> Hi,
>
> After digging further. The problem seems to be reduced to reusing an
> old gateway IP address for a dnat_and_snat entry.
> When a gateway port is bound to a chassis, its entry will show up in
> the MAC_Binding table (at least when that Logical Switch is connected
> to more than one Logical Router). After deleting the Logical Router
> and all its ports, this entry will remain there. If a new Logical
> Router is created and a Floating IP (dnat_and_snat) is assigned to a
> VM with the old gw IP address, it will become unreachable.
>
> A workaround now from networking-ovn (OpenStack integration) is to
> delete MAC_Binding entries for that IP address upon a FIP creation. I
> think that this however should be done from OVN, what do you folks
> think?
>
>
Agree. Since the MAC_Binding table row is created by ovn-controller, it
should
be handled properly within OVN.

Thanks
Numan



> Thanks,
> Daniel
> On Fri, Oct 26, 2018 at 11:39 AM Daniel Alvarez Sanchez
>  wrote:
> >
> > Hi all,
> >
> > While analyzing a problem in OpenStack I think I have found out a
> > severe bug in OVN when it comes to reuse floating IPs (which is a very
> > common use case in OpenStack and Kubernetes). Let me explain the
> > scenario, issue and possible solutions:
> >
> > * Three logical switches  (Neutron networks) LS1, LS2, LS3
> > * LS3 has external connectivity (localnet port to a provider bridge).
> > * Two logical routers LR1 and LR2.
> > * LS1 and LS3 connected to LR1
> > * LS2 and LS3 connected to LR2.
> > * VM1 in LS1 with a FIP (dnat_and_snat NAT entry) in LS3 CIDR
> > * VM2 in LS2 with a FIP (dnat_and_snat NAT entry) in LS3 CIDR
> > * Ping from VM1 to VM2 FIP and viceversa works.
> >
> > Echo requests from VM1 reach to VM2 and VM2 responds to the FIP of VM1.
> > First time, ovn-controller will insert the ARP responder and add a new
> > entry to MAC_Binding table like:
> >
> > _uuid               : 447eaf43-119a-43b2-a821-0c79d8885d68
> > datapath            : 07a76c72-6896-464a-8683-3df145d02434
> > ip                  : "172.24.5.13"
> > logical_port        : "lrp-82af833f-f78b-4f45-9fc8-719db0f9e619"
> > mac                 : "fa:16:3e:22:6c:0a"
> >
> > |binding|INFO|cr-lrp-198e5576-b654-4605-80c0-b9cf6d21ea2b: Claiming
> > fa:16:3e:22:6c:0a 172.24.5.4/24
> >
> > The problem happens when VM1, LS1, LR1 entry are deleted and recreated
> > again. If the FIP (172.24.5.13) is reused, the MAC_Binding entry won't
> > get updated and VM2 will be now unable to respond to pings coming from
> > VM1 as it'll attempt to do it to fa:16:3e:22:6c:0a.
> >
> > If I manually delete the MAC_Binding entry, a new one will then
> > correctly be recreated by ovn-controller with the right MAC address
> > (the one of the new cr-lrp).
> >
> > |00126|binding|INFO|cr-lrp-f09b2186-1cb2-4e50-99a5-587f680db8ad:
> > Claiming fa:16:3e:14:48:20 172.24.5.6/24
> >
> > _uuid               : dae11bdb-47d3-471e-8826-9aefb8572700
> > datapath            : 07a76c72-6896-464a-8683-3df145d02434
> > ip                  : "172.24.5.13"
> > logical_port        : "lrp-82af833f-f78b-4f45-9fc8-719db0f9e619"
> > mac                 : "fa:16:3e:14:48:20"
> >
> >
> > Possible solutions:
> >
> > 1) Make ovn-controller (or ovn-northd?) update the MAC_Binding
> > entries whenever a new NAT row is created.
> >
> > 2) Send GARPs (I guess we're not doing this yet) whenever a LRP gets
> > bound to a chassis for all the nat_addresses that it has configured.
> >
> > For 2), I guess that it would make MAC_Binding entries getting updated
> > automatically?
> >
> > How does this sound?
> >
> > Thanks a lot,
> > Daniel Alvarez


Re: [ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-10-29 Thread Daniel Alvarez Sanchez
Hi,

After digging further, the problem seems to come down to reusing an
old gateway IP address for a dnat_and_snat entry.
When a gateway port is bound to a chassis, its entry will show up in
the MAC_Binding table (at least when that Logical Switch is connected
to more than one Logical Router). After deleting the Logical Router
and all its ports, this entry will remain there. If a new Logical
Router is created and a Floating IP (dnat_and_snat) is assigned to a
VM with the old gw IP address, it will become unreachable.

A workaround now from networking-ovn (OpenStack integration) is to
delete MAC_Binding entries for that IP address upon a FIP creation. I
think that this however should be done from OVN, what do you folks
think?
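
For reference, the workaround boils down to something like the sketch
below. This is not the networking-ovn code, just a shell approximation
of it: flush_mac_bindings and the SBCTL variable are made-up names,
while `find` with --bare/--columns and `destroy` are the regular
ovn-sbctl database commands.

```shell
#!/bin/sh
# Hypothetical helper: delete every MAC_Binding row whose ip column
# equals $1. SBCTL may be overridden, e.g. to add --db=... options.
flush_mac_bindings() {
    ip=$1
    sbctl=${SBCTL:-ovn-sbctl}
    # --bare --columns=_uuid prints one bare UUID per matching row.
    for uuid in $($sbctl --bare --columns=_uuid find mac_binding ip="$ip"); do
        $sbctl destroy mac_binding "$uuid"
    done
}

# Usage (needs a running OVN southbound DB):
#   flush_mac_bindings 172.24.4.221
```

Unlike grepping the output of `list`, the find condition matches the ip
column exactly, so 172.24.4.22 does not catch 172.24.4.221.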

Thanks,
Daniel
On Fri, Oct 26, 2018 at 11:39 AM Daniel Alvarez Sanchez
 wrote:
>
> Hi all,
>
> While analyzing a problem in OpenStack I think I have found out a
> severe bug in OVN when it comes to reuse floating IPs (which is a very
> common use case in OpenStack and Kubernetes). Let me explain the
> scenario, issue and possible solutions:
>
> * Three logical switches  (Neutron networks) LS1, LS2, LS3
> * LS3 has external connectivity (localnet port to a provider bridge).
> * Two logical routers LR1 and LR2.
> * LS1 and LS3 connected to LR1
> * LS2 and LS3 connected to LR2.
> * VM1 in LS1 with a FIP (dnat_and_snat NAT entry) in LS3 CIDR
> * VM2 in LS2 with a FIP (dnat_and_snat NAT entry) in LS3 CIDR
> * Ping from VM1 to VM2 FIP and viceversa works.
>
> Echo requests from VM1 reach to VM2 and VM2 responds to the FIP of VM1.
> First time, ovn-controller will insert the ARP responder and add a new
> entry to MAC_Binding table like:
>
> _uuid               : 447eaf43-119a-43b2-a821-0c79d8885d68
> datapath            : 07a76c72-6896-464a-8683-3df145d02434
> ip                  : "172.24.5.13"
> logical_port        : "lrp-82af833f-f78b-4f45-9fc8-719db0f9e619"
> mac                 : "fa:16:3e:22:6c:0a"
>
> |binding|INFO|cr-lrp-198e5576-b654-4605-80c0-b9cf6d21ea2b: Claiming
> fa:16:3e:22:6c:0a 172.24.5.4/24
>
> The problem happens when VM1, LS1 and LR1 are deleted and recreated.
> If the FIP (172.24.5.13) is reused, the MAC_Binding entry won't
> get updated and VM2 will no longer be able to respond to pings coming
> from VM1, as its replies will still be sent to fa:16:3e:22:6c:0a.
>
> If I manually delete the MAC_Binding entry, a new one is then
> correctly created by ovn-controller with the right MAC address
> (the one of the new cr-lrp).
>
> |00126|binding|INFO|cr-lrp-f09b2186-1cb2-4e50-99a5-587f680db8ad: Claiming fa:16:3e:14:48:20 172.24.5.6/24
>
> _uuid               : dae11bdb-47d3-471e-8826-9aefb8572700
> datapath            : 07a76c72-6896-464a-8683-3df145d02434
> ip                  : "172.24.5.13"
> logical_port        : "lrp-82af833f-f78b-4f45-9fc8-719db0f9e619"
> mac                 : "fa:16:3e:14:48:20"
>
>
> Possible solutions:
>
> 1) Make ovn-controller (or ovn-northd?) update the MAC_Binding
> entries whenever a new NAT row is created.
>
> 2) Send GARPs for all the nat_addresses configured on an LRP whenever
> it gets bound to a chassis (I guess we're not doing this yet).
>
> For 2), I guess that it would make the MAC_Binding entries get updated
> automatically?
>
> How does this sound?
>
> Thanks a lot,
> Daniel Alvarez
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] OVN: MAC_Binding entries not getting updated leads to unreachable destinations

2018-10-26 Thread Daniel Alvarez Sanchez
Hi all,

While analyzing a problem in OpenStack I think I have found a
severe bug in OVN when it comes to reusing floating IPs (which is a very
common use case in OpenStack and Kubernetes). Let me explain the
scenario, issue and possible solutions:

* Three logical switches (Neutron networks): LS1, LS2 and LS3.
* LS3 has external connectivity (localnet port to a provider bridge).
* Two logical routers: LR1 and LR2.
* LS1 and LS3 are connected to LR1.
* LS2 and LS3 are connected to LR2.
* VM1 in LS1 with a FIP (dnat_and_snat NAT entry) in the LS3 CIDR.
* VM2 in LS2 with a FIP (dnat_and_snat NAT entry) in the LS3 CIDR.
* Ping from VM1 to the VM2 FIP and vice versa works.

Echo requests from VM1 reach VM2, and VM2 responds to the FIP of VM1.
The first time, ovn-controller will insert the ARP responder and add a
new entry to the MAC_Binding table like:

_uuid               : 447eaf43-119a-43b2-a821-0c79d8885d68
datapath            : 07a76c72-6896-464a-8683-3df145d02434
ip                  : "172.24.5.13"
logical_port        : "lrp-82af833f-f78b-4f45-9fc8-719db0f9e619"
mac                 : "fa:16:3e:22:6c:0a"

|binding|INFO|cr-lrp-198e5576-b654-4605-80c0-b9cf6d21ea2b: Claiming fa:16:3e:22:6c:0a 172.24.5.4/24

The problem happens when VM1, LS1 and LR1 are deleted and recreated.
If the FIP (172.24.5.13) is reused, the MAC_Binding entry won't
get updated and VM2 will no longer be able to respond to pings coming
from VM1, as its replies will still be sent to fa:16:3e:22:6c:0a.

If I manually delete the MAC_Binding entry, a new one is then
correctly created by ovn-controller with the right MAC address
(the one of the new cr-lrp).

|00126|binding|INFO|cr-lrp-f09b2186-1cb2-4e50-99a5-587f680db8ad: Claiming fa:16:3e:14:48:20 172.24.5.6/24

_uuid               : dae11bdb-47d3-471e-8826-9aefb8572700
datapath            : 07a76c72-6896-464a-8683-3df145d02434
ip                  : "172.24.5.13"
logical_port        : "lrp-82af833f-f78b-4f45-9fc8-719db0f9e619"
mac                 : "fa:16:3e:14:48:20"
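
The behaviour above can be modelled with a toy lookup table. This is
only a sketch under a simplifying assumption: the router resolves an
IP over ARP only when no row exists for it, and otherwise trusts the
cached row. MacBindingTable and its methods are hypothetical names, not
OVN code; the addresses are the ones from the output above.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]  # (logical_port, ip), as in the MAC_Binding rows


class MacBindingTable:
    """Toy model: a row is learned once and never refreshed on its own."""

    def __init__(self) -> None:
        self.rows: Dict[Key, str] = {}

    def resolve(self, port: str, ip: str, arp: Callable[[str], str]) -> str:
        key = (port, ip)
        if key not in self.rows:
            # Only a lookup miss triggers ARP resolution (assumption).
            self.rows[key] = arp(ip)
        return self.rows[key]

    def delete(self, port: str, ip: str) -> None:
        self.rows.pop((port, ip), None)


LRP = "lrp-82af833f-f78b-4f45-9fc8-719db0f9e619"
FIP = "172.24.5.13"

table = MacBindingTable()
# First ping: the FIP is owned by the old cr-lrp.
old = table.resolve(LRP, FIP, lambda ip: "fa:16:3e:22:6c:0a")
# LR1 is recreated and the FIP reused; the cached row still wins.
stale = table.resolve(LRP, FIP, lambda ip: "fa:16:3e:14:48:20")
# Manual deletion forces a fresh resolution with the new MAC.
table.delete(LRP, FIP)
fresh = table.resolve(LRP, FIP, lambda ip: "fa:16:3e:14:48:20")
```

In this model `stale` still holds the old MAC, while `fresh` holds the
new one only because the row was deleted first.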


Possible solutions:

1) Make ovn-controller (or ovn-northd?) update the MAC_Binding
entries whenever a new NAT row is created.

2) Send GARPs for all the nat_addresses configured on an LRP whenever
it gets bound to a chassis (I guess we're not doing this yet).

For 2), I guess that it would make the MAC_Binding entries get updated
automatically?
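
To make option 2) concrete, here is a rough sketch of what such a GARP
looks like on the wire: a broadcast ARP request whose sender and target
IP are both the FIP. It is plain Python for illustration, not what
ovn-controller does internally; build_garp and parse_sender are
hypothetical helper names.

```python
import socket
import struct
from typing import Tuple


def build_garp(mac: str, ip: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)
    # Ethernet header: broadcast destination, our MAC, EtherType ARP.
    eth = b"\xff" * 6 + mac_bytes + struct.pack("!H", 0x0806)
    # ARP header: Ethernet/IPv4, 6-byte MAC, 4-byte IP, opcode 1 (request).
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    # Sender MAC/IP, zeroed target MAC, target IP equal to sender IP.
    arp += mac_bytes + ip_bytes + b"\x00" * 6 + ip_bytes
    return eth + arp


def parse_sender(frame: bytes) -> Tuple[str, str]:
    """Return (ip, mac) of the ARP sender, as a receiver would learn it."""
    mac = ":".join(f"{b:02x}" for b in frame[22:28])
    ip = socket.inet_ntoa(frame[28:32])
    return ip, mac


garp = build_garp("fa:16:3e:14:48:20", "172.24.5.13")
learned_ip, learned_mac = parse_sender(garp)
```

A router that receives this frame could overwrite its row for
learned_ip with learned_mac, which is the automatic update asked about
above.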

How does this sound?

Thanks a lot,
Daniel Alvarez