Re: [ovs-discuss] [OVN] ovn-controller Incremental Processing scale testing

2019-07-08 Thread Numan Siddique
On Tue, Jul 9, 2019 at 11:05 AM Han Zhou  wrote:

>
>
> On Fri, Jun 21, 2019 at 12:31 AM Han Zhou  wrote:
> >
> >
> >
> > On Thu, Jun 20, 2019 at 11:42 PM Numan Siddique 
> wrote:
> > >
> > >
> > >
> > > On Fri, Jun 21, 2019, 11:47 AM Han Zhou  wrote:
> > >>
> > >>
> > >>
> > >> On Tue, Jun 11, 2019 at 9:16 AM Daniel Alvarez Sanchez <
> dalva...@redhat.com> wrote:
> > >> >
> > >> > Thanks a lot Han for the answer!
> > >> >
> > >> > On Tue, Jun 11, 2019 at 5:57 PM Han Zhou  wrote:
> > >> > >
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Tue, Jun 11, 2019 at 5:12 AM Dumitru Ceara 
> wrote:
> > >> > > >
> > >> > > > On Tue, Jun 11, 2019 at 10:40 AM Daniel Alvarez Sanchez
> > >> > > >  wrote:
> > >> > > > >
> > >> > > > > Hi Han, all,
> > >> > > > >
> > >> > > > > Lucas, Numan and I have been doing some 'scale' testing of
> OpenStack
> > >> > > > > using OVN and wanted to present some results and issues that
> we've
> > >> > > > > found with the Incremental Processing feature in
> ovn-controller. Below
> > >> > > > > is the scenario that we executed:
> > >> > > > >
> > >> > > > > * 7 baremetal nodes setup: 3 controllers (running
> > >> > > > > ovn-northd/ovsdb-servers in A/P with pacemaker) + 4 compute
> nodes. OVS
> > >> > > > > 2.10.
> > >> > > > > * The test consists of:
> > >> > > > >   - Create openstack network (OVN LS), subnet and router
> > >> > > > >   - Attach subnet to the router and set gw to the external
> network
> > >> > > > >   - Create an OpenStack port and apply a Security Group (ACLs
> to allow
> > >> > > > > UDP, SSH and ICMP).
> > >> > > > >   - Bind the port to one of the 4 compute nodes (randomly) by
> > >> > > > > attaching it to a network namespace.
> > >> > > > >   - Wait for the port to be ACTIVE in Neutron ('up == True'
> in NB)
> > >> > > > >   - Wait until the test can ping the port
> > >> > > > > * Running browbeat/rally with 16 simultaneous processes to
> execute the
> > >> > > > > test above 150 times.
> > >> > > > > * When all the 150 'fake VMs' are created, browbeat will
> delete all
> > >> > > > > the OpenStack/OVN resources.
> > >> > > > >
> > >> > > > > We first tried with OVS/OVN 2.10 and pulled some results
> which showed
> > >> > > > > 100% success but ovn-controller is quite loaded (as expected)
> in all
> > >> > > > > the nodes especially during the deletion phase:
> > >> > > > >
> > >> > > > > - Compute node: https://imgur.com/a/tzxfrIR
> > >> > > > > - Controller node (ovn-northd and ovsdb-servers):
> https://imgur.com/a/8ffKKYF
> > >> > > > >
> > >> > > > > After conducting the tests above, we replaced ovn-controller
> in all 7
> > >> > > > > nodes by the one with the current master branch (actually
> from last
> > >> > > > > week). We also replaced ovn-northd and ovsdb-servers but the
> > >> > > > > ovs-vswitchd has been left untouched (still on 2.10). The
> expected
> > >> > > > > results were to get less ovn-controller CPU usage and also
> better
> > >> > > > > times due to the Incremental Processing feature introduced
> recently.
> > >> > > > > However, the results don't look very good:
> > >> > > > >
> > >> > > > > - Compute node: https://imgur.com/a/wuq87F1
> > >> > > > > - Controller node (ovn-northd and ovsdb-servers):
> https://imgur.com/a/99kiyDp
> > >> > > > >
> > >> > > > > One thing that we can tell from the ovs-vswitchd CPU
> consumption is
> > >> > > > > that it's much less in the Incremental Processing (IP) case
> which
> > >> > > > > apparently doesn't make much sense. This led us to think that
> perhaps
> > >> > > > > ovn-controller was not installing the necessary flows in the
> switch
> > >> > > > > and we confirmed this hypothesis by looking into the dataplane
> > >> > > > > results. Out of the 150 VMs, 10% of them were unreachable via
> ping
> > >> > > > > when using ovn-controller from master.
> > >> > > > >
> > >> > > > > @Han, others, do you have any ideas as to what could be
> happening
> > >> > > > > here? We'll be able to use this setup for a few more days so
> let me
> > >> > > > > know if you want us to pull some other data/traces, ...
> > >> > > > >
> > >> > > > > Some other interesting things:
> > >> > > > > On each of the compute nodes, (with an almost evenly
> distributed
> > >> > > > > number of logical ports bound to them), the max amount of
> logical
> > >> > > > > flows in br-int is ~90K (by the end of the test, right before
> deleting
> > >> > > > > the resources).
> > >> > > > >
> > >> > > > > It looks like with the IP version, ovn-controller leaks some
> memory:
> > >> > > > > https://imgur.com/a/trQrhWd
> > >> > > > > While with OVS 2.10, it remains pretty flat during the test:
> > >> > > > > https://imgur.com/a/KCkIT4O
> > >> > > >
> > >> > > > Hi Daniel, Han,
> > >> > > >
> > >> > > > I just sent a small patch for the ovn-controller memory leak:
> > >> > > > https://patchwork.ozlabs.org/patch/1113758/
> > >> > > >
> > >> > > > At least on my setup this is what valgrind was pointing at.
> > >> > > >
> > >> > > > Cheers,
> 

Re: [ovs-discuss] [OVN] ovn-controller Incremental Processing scale testing

2019-07-08 Thread Han Zhou
On Fri, Jun 21, 2019 at 12:31 AM Han Zhou  wrote:
>
>
>
> On Thu, Jun 20, 2019 at 11:42 PM Numan Siddique 
wrote:
> >
> >
> >
> > On Fri, Jun 21, 2019, 11:47 AM Han Zhou  wrote:
> >>
> >>
> >>
> >> On Tue, Jun 11, 2019 at 9:16 AM Daniel Alvarez Sanchez <
dalva...@redhat.com> wrote:
> >> >
> >> > Thanks a lot Han for the answer!
> >> >
> >> > On Tue, Jun 11, 2019 at 5:57 PM Han Zhou  wrote:
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Tue, Jun 11, 2019 at 5:12 AM Dumitru Ceara 
wrote:
> >> > > >
> >> > > > On Tue, Jun 11, 2019 at 10:40 AM Daniel Alvarez Sanchez
> >> > > >  wrote:
> >> > > > >
> >> > > > > Hi Han, all,
> >> > > > >
> >> > > > > Lucas, Numan and I have been doing some 'scale' testing of
OpenStack
> >> > > > > using OVN and wanted to present some results and issues that
we've
> >> > > > > found with the Incremental Processing feature in
ovn-controller. Below
> >> > > > > is the scenario that we executed:
> >> > > > >
> >> > > > > * 7 baremetal nodes setup: 3 controllers (running
> >> > > > > ovn-northd/ovsdb-servers in A/P with pacemaker) + 4 compute
nodes. OVS
> >> > > > > 2.10.
> >> > > > > * The test consists of:
> >> > > > >   - Create openstack network (OVN LS), subnet and router
> >> > > > >   - Attach subnet to the router and set gw to the external
network
> >> > > > >   - Create an OpenStack port and apply a Security Group (ACLs
to allow
> >> > > > > UDP, SSH and ICMP).
> >> > > > >   - Bind the port to one of the 4 compute nodes (randomly) by
> >> > > > > attaching it to a network namespace.
> >> > > > >   - Wait for the port to be ACTIVE in Neutron ('up == True' in
NB)
> >> > > > >   - Wait until the test can ping the port
> >> > > > > * Running browbeat/rally with 16 simultaneous processes to
execute the
> >> > > > > test above 150 times.
> >> > > > > * When all the 150 'fake VMs' are created, browbeat will
delete all
> >> > > > > the OpenStack/OVN resources.
> >> > > > >
> >> > > > > We first tried with OVS/OVN 2.10 and pulled some results which
showed
> >> > > > > 100% success but ovn-controller is quite loaded (as expected)
in all
> >> > > > > the nodes especially during the deletion phase:
> >> > > > >
> >> > > > > - Compute node: https://imgur.com/a/tzxfrIR
> >> > > > > - Controller node (ovn-northd and ovsdb-servers):
https://imgur.com/a/8ffKKYF
> >> > > > >
> >> > > > > After conducting the tests above, we replaced ovn-controller
in all 7
> >> > > > > nodes by the one with the current master branch (actually from
last
> >> > > > > week). We also replaced ovn-northd and ovsdb-servers but the
> >> > > > > ovs-vswitchd has been left untouched (still on 2.10). The
expected
> >> > > > > results were to get less ovn-controller CPU usage and also
better
> >> > > > > times due to the Incremental Processing feature introduced
recently.
> >> > > > > However, the results don't look very good:
> >> > > > >
> >> > > > > - Compute node: https://imgur.com/a/wuq87F1
> >> > > > > - Controller node (ovn-northd and ovsdb-servers):
https://imgur.com/a/99kiyDp
> >> > > > >
> >> > > > > One thing that we can tell from the ovs-vswitchd CPU
consumption is
> >> > > > > that it's much less in the Incremental Processing (IP) case
which
> >> > > > > apparently doesn't make much sense. This led us to think that
perhaps
> >> > > > > ovn-controller was not installing the necessary flows in the
switch
> >> > > > > and we confirmed this hypothesis by looking into the dataplane
> >> > > > > results. Out of the 150 VMs, 10% of them were unreachable via
ping
> >> > > > > when using ovn-controller from master.
> >> > > > >
> >> > > > > @Han, others, do you have any ideas as to what could be
happening
> >> > > > > here? We'll be able to use this setup for a few more days so
let me
> >> > > > > know if you want us to pull some other data/traces, ...
> >> > > > >
> >> > > > > Some other interesting things:
> >> > > > > On each of the compute nodes, (with an almost evenly
distributed
> >> > > > > number of logical ports bound to them), the max amount of
logical
> >> > > > > flows in br-int is ~90K (by the end of the test, right before
deleting
> >> > > > > the resources).
> >> > > > >
> >> > > > > It looks like with the IP version, ovn-controller leaks some
memory:
> >> > > > > https://imgur.com/a/trQrhWd
> >> > > > > While with OVS 2.10, it remains pretty flat during the test:
> >> > > > > https://imgur.com/a/KCkIT4O
> >> > > >
> >> > > > Hi Daniel, Han,
> >> > > >
> >> > > > I just sent a small patch for the ovn-controller memory leak:
> >> > > > https://patchwork.ozlabs.org/patch/1113758/
> >> > > >
> >> > > > At least on my setup this is what valgrind was pointing at.
> >> > > >
> >> > > > Cheers,
> >> > > > Dumitru
> >> > > >
> >> > > > >
> >> > > > > Looking forward to hearing back :)
> >> > > > > Daniel
> >> > > > >
> >> > > > > PS. Sorry for my previous email, I sent it by mistake without
the subject
> >> > > > > ___
> >> > > > > discuss mailing list
> >> > > > > 

Re: [ovs-discuss] [OVN] Aging mechanism for MAC_Binding table

2019-07-08 Thread Ben Pfaff
On Mon, Jul 08, 2019 at 06:19:23PM -0700, Han Zhou wrote:
> On Thu, Jun 27, 2019 at 6:44 AM Ben Pfaff  wrote:
> >
> > On Tue, Jun 25, 2019 at 01:05:21PM +0200, Daniel Alvarez Sanchez wrote:
> > > Lately we've been trying to solve certain issues related to stale
> > > entries in the MAC_Binding table (e.g. [0]). On the other hand, for
> > > the OpenStack + Octavia (Load Balancing service) use case, we see that
> > > a reused VIP can be as well affected by stale entries in this table
> > > due to the fact that it's never bound to a VIF so ovn-controller won't
> > > claim it and send the GARPs to update the neighbors.
> > >
> > > I'm not sure if other scenarios may suffer from this issue but it seems
> > > reasonable to have an aging mechanism (as we discussed at some point
> > > in the past) that makes unused/old entries expire. After talking to
> > > Numan on IRC, since a new pinctrl thread has been introduced recently
> > > [1], it'd be nice to implement this aging mechanism there.
> > > At the same time we'd also be reducing the number of entries on
> > > long-lived systems, where the table would otherwise grow indefinitely.
> > >
> > > Any thoughts?
> > >
> > > Thanks!
> > > Daniel
> > >
> > > PS. With regards to the 'unused' vs 'old' entries I think it has to be
> > > 'old' rather than 'unused' as I don't see a way to reset the TTL of a
> > > MAC_Binding entry when we see packets coming. The implication is that
> > > we'll be seeing ARPs sent out more often when perhaps they're not
> > > needed. This also leads to the discussion of making the cache timeout
> > > configurable.
> >
> > I've always considered the MAC_Binding implementation incomplete because
> > of this issue and others.  ovn/TODO.rst says:
> >
> > * Dynamic IP to MAC binding enhancements.
> >
> >   OVN has basic support for establishing IP to MAC bindings dynamically,
> >   using ARP.
> >
> >   * Ratelimiting.
> >
> >     From casual observation, Linux appears to generate at most one ARP per
> >     second per destination.
> >
> >     This might be supported by adding a new OVN logical action for
> >     rate-limiting.
> >
> >   * Tracking queries
> >
> >     It's probably best to only record in the database responses to queries
> >     actually issued by an L3 logical router, so somehow they have to be
> >     tracked, probably by putting a tentative binding without a MAC address
> >     into the database.
> >
> >   * Renewal and expiration.
> >
> >     Something needs to make sure that bindings remain valid and expire those
> >     that become stale.
> >
> >     One way to do this might be to add some support for time to the database
> >     server itself.
> >
> >   * Table size limiting.
> >
> >     The table of MAC bindings must not be allowed to grow unreasonably large.
> >
> >   * MTU handling (fragmentation on output)
> >
> > So, what do we do about it?  First, I think that adding support for time
> > to the database server is a terrible idea (even though I think I wrote
> > the above originally).  Let's not do that.  The following is some
> > "thinking out loud" on the subject.
> >
> > I think there's a challenge around which ovn-controller should take care
> > of a given MAC_Binding.  We don't want every ovn-controller expiring
> > every binding.  Ideally, we want exactly one ovn-controller expiring a
> > binding.  One way would be to add an owner column (but it would be
> > better if we don't need it).
> >
> > If we want to keep track of "unused" bindings, I can imagine a
> > statistical mechanism to do that.  Any user of a binding occasionally
> > and probabilistically changes a serial number column that we'd introduce
> > into the MAC_Binding table (this could be optimized to not bother if it
> > has changed recently).  The owner checks the serial number every so
> > often and if it hasn't changed then it deletes the row.
> >
> 
> Thanks Ben for the advice. Since the user of a binding is simply an OpenFlow
> rule match, I guess we will need a "controller" action to trigger the
> serial number column update in ovn-controller, combined with a meter action
> so that only a small number of packets trigger the update. Is this what you
> are suggesting?

I had not thought that far ahead!  That approach would work, although
the trigger percentage would be difficult to figure out--it seems like
really we'd want "every Nth second", not "every Nth packet".  Another
approach that might work would be for ovn-controller to notice the
statistics on appropriate OpenFlow flows changing, or to use "learn"
actions as a way to make a controller action trigger only every so
often.
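
For illustration only, here is a minimal Python sketch of the serial-number
idea from this thread. Everything in it is an assumption made for the example:
the `serial` field stands in for the proposed new MAC_Binding column, the
refresh interval implements the "every Nth second" variant, and a single owner
runs the sweep. None of these names exist in OVN.

```python
from dataclasses import dataclass


@dataclass
class Binding:
    """Stand-in for a MAC_Binding row; 'serial' is a hypothetical new column."""
    ip: str
    mac: str
    serial: int = 0
    last_touch: float = float("-inf")  # local bookkeeping, not a DB column


class BindingTable:
    def __init__(self, refresh_interval=10.0):
        self.refresh_interval = refresh_interval  # seconds between serial bumps
        self.rows = {}
        self.seen = {}  # owner's view: ip -> serial observed at the last sweep

    def add(self, ip, mac):
        self.rows[ip] = Binding(ip, mac)

    def use(self, ip, now):
        """Called when traffic uses a binding. Bumps the serial at most once
        per refresh_interval seconds, however many packets arrive."""
        row = self.rows.get(ip)
        if row is None:
            return False
        if now - row.last_touch >= self.refresh_interval:
            row.serial += 1
            row.last_touch = now
            return True
        return False

    def sweep(self):
        """Owner's periodic check: delete rows whose serial has not changed
        since the previous sweep and return the expired IPs."""
        expired = []
        for ip, row in list(self.rows.items()):
            if ip in self.seen and self.seen[ip] == row.serial:
                expired.append(ip)
                del self.rows[ip]
                del self.seen[ip]
            else:
                self.seen[ip] = row.serial
        return expired
```

A row survives a sweep only if some user bumped its serial since the previous
sweep, so the sweep period has to be longer than the refresh interval.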
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


[ovs-discuss] Re:[HELP] Question about userspace geneve/vxlan port

2019-07-08 Thread txfh2007 via discuss
Hi Ben:
I have read the "userspace tunnel" document several times, but I still
have no clue how a tunnel packet gets parsed in the rx direction. In my first
mail, I found that "tnl_port_receive" should only be called during upcall
processing, but in the userspace upcall path, after miniflow_extract the
metadata field has been set to "0", so the tunnel header can't be parsed.

   My test environment is shown below. There are two OVS userspace bridges. If
dpdk0 on br-provider receives a packet with a tunnel header, the packet is
delivered to the internal port br-provider, but it can't be sent to the OVN-XXX
port on br-int.
   I am wondering whether my test topology is wrong or whether there is another
mechanism to parse the tunnel header.

Thank you !

Timo  

    Bridge br-int
        fail_mode: secure

        Port "ovn-e1c6a3-0"
            Interface "ovn-e1c6a3-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="10.142.18.12"}
        Port "vhuf77e9f1f-d9"
            Interface "vhuf77e9f1f-d9"
                type: dpdkvhostuser

        Port br-int
            Interface br-int
                type: internal

    Bridge br-provider
        Port br-provider
            Interface br-provider
                type: internal

        Port "dpdk0"
            Interface "dpdk0"
                type: dpdk
                options: {dpdk-devargs=":02:00.0", n_rxq="2"}

--
Ben Pfaff 
txfh2007 
ovs-discuss 
Re: Re:[ovs-discuss] Re:[HELP] Question about userspace geneve/vxlan port


Native tunneling and userspace tunneling are the same thing.

The mechanism should be symmetric: configuration for sending packets out
should also work for parsing them on the way back in.

On Mon, Jul 08, 2019 at 03:57:46PM +0800, txfh2007 wrote:
> Hi Ben:
> Thanks for your reply! I didn't find a "native tunneling" document in the
> Open vSwitch repository. Did you mean the document "userspace-tunneling.rst"?
> That document only explains that br-phy can send tunnel packets out. When a
> dpdk type port receives packets with a tunnel header, how can I configure the
> "native tunnel" mechanism to parse and handle them? Or do you mean that
> OVS currently cannot parse tunnel packets in userspace?
> 
> Thank you 
> 
> Timo 
> 
> 
> --
> Ben Pfaff 
> txfh2007 
> ovs-discuss 
> Re: [ovs-discuss] Re:[HELP] Question about userspace geneve/vxlan port
> 
> 
> On Thu, Jul 04, 2019 at 05:27:28PM +0800, txfh2007 via discuss wrote:
> > I have found that, theoretically, the function tnl_port_receive could be
> > called during upcall processing (via upcall_cb() -> upcall_receive()
> > -> xlate_lookup() -> xport_lookup). But in my environment, after tracing
> > the code with gdb, I found that "tnl_port_should_receive(flow)" always
> > returns "false" because flow->tunnel->ip_dst is "0", even if the packet
> > received by the dpdk port has a tunnel header.
> 
> Yes.
> 
> > I guess the reason is that in the userspace function
> > "handle_packet_upcall", match.tun_md.valid has been set to "false", so the
> > expanded flow has no tunnel info. Also, in "miniflow_extract" in flow.c,
> > packet->md is null because in dfc_processing the "md_is_valid" flag is
> > always "false". Am I right?
> 
> Yes.
> 
> OVS takes what some might consider an idiosyncratic approach to tunnel
> processing.  The "obvious" approach is to simply parse tunnel headers
> and throw those into the flow.  If OVS did that, then you'd see what you
> expect, but this isn't what OVS does.
> 
> Instead, OVS treats tunnels and their headers as metadata.  This is
> because of OVS's history as part of the Linux kernel.  The Linux kernel
> has tunnel implementations as part of the TCP/IP stack.  When a tunnel
> packet arrives at a physical port in Linux, it passes into the TCP/IP
> stack, where it gets processed and received on a tunnel network device.
> This effectively strips the tunnel headers and transforms them into
> metadata.  If the tunnel network device is part of an OVS bridge, then
> it gets the packet at that point, and treats the metadata as something
> that can be matched.
> 
> With other datapaths, OVS expects some equivalent mechanism to exist.
> For the userspace datapath, OVS implements "native tunneling" to provide
> that mechanism.  It's described in the OVS documentation.
> 
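
For illustration only, a conceptual Python sketch of "tunnel headers become
metadata" as described in the quoted reply. It is not the OVS implementation:
the class and function names are hypothetical, and it assumes decapsulation
happens only when the outer destination IP is a local address and the UDP
destination port is the Geneve port. The local address 10.142.18.11 used in
the usage note is made up.

```python
from dataclasses import dataclass
from typing import Optional

GENEVE_UDP_PORT = 6081  # IANA-assigned UDP port for Geneve


@dataclass
class TunnelMetadata:
    ip_src: str
    ip_dst: str
    tun_id: Optional[int]


@dataclass
class Packet:
    in_port: str
    outer_ip_src: Optional[str] = None
    outer_ip_dst: Optional[str] = None
    outer_udp_dst: Optional[int] = None
    vni: Optional[int] = None
    inner: bytes = b""
    tunnel: Optional[TunnelMetadata] = None  # empty until decapsulated


def decapsulate(pkt, local_ips, tunnel_port_name):
    """If the packet is Geneve addressed to one of our local IPs, strip the
    outer headers, record them as metadata, and hand the inner frame to the
    tunnel port. Otherwise return the packet unchanged."""
    if pkt.outer_udp_dst != GENEVE_UDP_PORT or pkt.outer_ip_dst not in local_ips:
        return pkt
    meta = TunnelMetadata(pkt.outer_ip_src, pkt.outer_ip_dst, pkt.vni)
    return Packet(in_port=tunnel_port_name, inner=pkt.inner, tunnel=meta)
```

In this model, a packet arriving on dpdk0 with outer destination 10.142.18.11
is re-presented on "ovn-e1c6a3-0" with tunnel metadata only if 10.142.18.11 is
in `local_ips`. If it is not, the packet stays on the physical bridge with no
tunnel metadata, which resembles the symptom reported in this thread.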



Re: [ovs-discuss] [OVN] Aging mechanism for MAC_Binding table

2019-07-08 Thread Han Zhou
On Thu, Jun 27, 2019 at 6:44 AM Ben Pfaff  wrote:
>
> On Tue, Jun 25, 2019 at 01:05:21PM +0200, Daniel Alvarez Sanchez wrote:
> > Lately we've been trying to solve certain issues related to stale
> > entries in the MAC_Binding table (e.g. [0]). On the other hand, for
> > the OpenStack + Octavia (Load Balancing service) use case, we see that
> > a reused VIP can be as well affected by stale entries in this table
> > due to the fact that it's never bound to a VIF so ovn-controller won't
> > claim it and send the GARPs to update the neighbors.
> >
> > I'm not sure if other scenarios may suffer from this issue but it seems
> > reasonable to have an aging mechanism (as we discussed at some point
> > in the past) that makes unused/old entries expire. After talking to
> > Numan on IRC, since a new pinctrl thread has been introduced recently
> > [1], it'd be nice to implement this aging mechanism there.
> > At the same time we'd also be reducing the number of entries on
> > long-lived systems, where the table would otherwise grow indefinitely.
> >
> > Any thoughts?
> >
> > Thanks!
> > Daniel
> >
> > PS. With regards to the 'unused' vs 'old' entries I think it has to be
> > 'old' rather than 'unused' as I don't see a way to reset the TTL of a
> > MAC_Binding entry when we see packets coming. The implication is that
> > we'll be seeing ARPs sent out more often when perhaps they're not
> > needed. This also leads to the discussion of making the cache timeout
> > configurable.
>
> I've always considered the MAC_Binding implementation incomplete because
> of this issue and others.  ovn/TODO.rst says:
>
> * Dynamic IP to MAC binding enhancements.
>
>   OVN has basic support for establishing IP to MAC bindings dynamically,
>   using ARP.
>
>   * Ratelimiting.
>
>     From casual observation, Linux appears to generate at most one ARP per
>     second per destination.
>
>     This might be supported by adding a new OVN logical action for
>     rate-limiting.
>
>   * Tracking queries
>
>     It's probably best to only record in the database responses to queries
>     actually issued by an L3 logical router, so somehow they have to be
>     tracked, probably by putting a tentative binding without a MAC address
>     into the database.
>
>   * Renewal and expiration.
>
>     Something needs to make sure that bindings remain valid and expire those
>     that become stale.
>
>     One way to do this might be to add some support for time to the database
>     server itself.
>
>   * Table size limiting.
>
>     The table of MAC bindings must not be allowed to grow unreasonably large.
>
>   * MTU handling (fragmentation on output)
>
> So, what do we do about it?  First, I think that adding support for time
> to the database server is a terrible idea (even though I think I wrote
> the above originally).  Let's not do that.  The following is some
> "thinking out loud" on the subject.
>
> I think there's a challenge around which ovn-controller should take care
> of a given MAC_Binding.  We don't want every ovn-controller expiring
> every binding.  Ideally, we want exactly one ovn-controller expiring a
> binding.  One way would be to add an owner column (but it would be
> better if we don't need it).
>
> If we want to keep track of "unused" bindings, I can imagine a
> statistical mechanism to do that.  Any user of a binding occasionally
> and probabilistically changes a serial number column that we'd introduce
> into the MAC_Binding table (this could be optimized to not bother if it
> has changed recently).  The owner checks the serial number every so
> often and if it hasn't changed then it deletes the row.
>

Thanks Ben for the advice. Since the user of a binding is simply an OpenFlow
rule match, I guess we will need a "controller" action to trigger the
serial number column update in ovn-controller, combined with a meter action
so that only a small number of packets trigger the update. Is this what you
are suggesting?
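
For illustration only, a rough Python model of the "controller action behind a
meter" idea: a token bucket decides which packets are allowed to reach the
controller and trigger an update. This is a sketch of the rate-limiting
behaviour, not of OpenFlow meter semantics or of any OVN code; the class name,
the function name and the default rate are assumptions.

```python
class TokenBucket:
    """Simple token bucket: 'rate' tokens per second, capped at 'burst'."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst
        self.last = None  # time of the previous packet, if any

    def allow(self, now):
        """Return True if a packet arriving at time 'now' may pass."""
        if self.last is not None:
            refill = (now - self.last) * self.rate
            self.tokens = min(self.burst, self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def count_punted(arrival_times, rate=1.0, burst=1.0):
    """Count how many packets would reach the controller."""
    meter = TokenBucket(rate, burst)
    return sum(1 for t in arrival_times if meter.allow(t))
```

With 40 packets arriving 0.25 s apart and a rate of one token per second,
`count_punted` lets 10 of them through, so the update rate depends on elapsed
time rather than on packet count.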


> The owner could also occasionally revalidate the binding.
>
> Any thoughts?
>
> Thanks,
>
> Ben.


Re: [ovs-discuss] Issue with failover running ovsdb-server in A/P mode with Pacemaker

2019-07-08 Thread Daniel Alvarez Sanchez
On Mon, Jul 8, 2019 at 5:43 PM Ben Pfaff  wrote:
>
> Would you mind formally submitting this?  It seems like the best
> immediate solution.

Will do, thanks a lot Ben!
>
> On Mon, Jul 08, 2019 at 02:27:31PM +0200, Daniel Alvarez Sanchez wrote:
> > I tried a simple patch and it fixes the issue (see below). The
> > question now is, do we want to do this? I think it makes sense to drop
> > *all* the connections when the role changes but I'm curious to see
> > what other people think:
> >
> > diff --git a/ovsdb/jsonrpc-server.c b/ovsdb/jsonrpc-server.c
> > index 4dda63a..ddbbc2e 100644
> > --- a/ovsdb/jsonrpc-server.c
> > +++ b/ovsdb/jsonrpc-server.c
> > @@ -365,7 +365,7 @@ ovsdb_jsonrpc_server_set_read_only(struct ovsdb_jsonrpc_server *svr,
> >  {
> >      if (svr->read_only != read_only) {
> >          svr->read_only = read_only;
> > -        ovsdb_jsonrpc_server_reconnect(svr, false,
> > +        ovsdb_jsonrpc_server_reconnect(svr, true,
> >                                         xstrdup(read_only
> >                                                 ? "making server read-only"
> >                                                 : "making server read/write"));
> >
> >
> > $export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach)
> > $ovn-nbctl ls-add sw0
> > $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
> > state: active
> > $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/set-active-ovsdb-server
> > tcp:192.0.2.2:6641
> > $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/connect-active-ovsdb-server
> > $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
> > state: backup
> > connecting: tcp:192.0.2.2:6641
> > $ ovn-nbctl ls-add sw1
> > ovn-nbctl: transaction error: {"details":"insert operation not allowed
> > when database server is in read only mode","error":"not allowed"}
> >
> > On Mon, Jul 8, 2019 at 1:25 PM Daniel Alvarez Sanchez
> >  wrote:
> > >
> > > I *think* that it may not be a bug in ovsdb-server but a problem with
> > > ovn-controller as it doesn't seem to be a DB change aware client.
> > >
> > > When the role changes from master to backup or vice versa, connections
> > > are expected to be reestablished for all clients except those that are
> > > not aware of db changes [0] (note the 'false' argument). This flag is
> > > explained here [1] and looks like since ovn-controller is not
> > > monitoring the Database table in the _Server database, then the
> > > connection with it is not re-established. This is just a blind guess
> > > but  I can give it a shot :)
> > >
> > > [0] 
> > > https://github.com/openvswitch/ovs/blob/403a6a0cb003f1d48b0a3cbf11a2806c45e9d076/ovsdb/jsonrpc-server.c#L368
> > > [1] 
> > > https://github.com/openvswitch/ovs/blob/403a6a0cb003f1d48b0a3cbf11a2806c45e9d076/ovsdb/jsonrpc-server.c#L450-L456
> > >
> > > On Mon, Jul 8, 2019 at 12:45 PM Numan Siddique  
> > > wrote:
> > > >
> > > >
> > > >
> > > >
> > > > On Mon, Jul 8, 2019 at 3:52 PM Daniel Alvarez Sanchez 
> > > >  wrote:
> > > >>
> > > >> Hi folks,
> > > >>
> > > >> While working with an OpenStack environment running OVN and
> > > >> ovsdb-server in A/P configuration with Pacemaker we hit an issue that
> > > >> has probably been around for a long time. The bug itself seems to be
> > > >> related to ovsdb-server not updating the read-only flag properly.
> > > >>
> > > >> With a 3-node cluster running ovsdb-server in active/passive mode,
> > > >> when we restart the master node, Pacemaker promotes another node to
> > > >> master and moves the associated IPAddr2 resource to it.
> > > >> At this point, ovn-controller instances across the cloud reconnect to
> > > >> the new node but there's a window where ovsdb-server is still running
> > > >> as backup.
> > > >>
> > > >> For those ovn-controller instances that reconnect within that window,
> > > >> every attempt to write in the OVSDB will fail with "operation not
> > > >> allowed when database server is in read only mode". This state will
> > > >> remain forever unless a reconnection is forced. Restarting
> > > >> ovn-controller or killing the connection (for example with tcpkill)
> > > >> will make things work again.
> > > >>
> > > >> A workaround in the OVN OCF script could be for the
> > > >> ovsdb_server_promote function to wait until we get 'running/active' on
> > > >> that instance.
> > > >>
> > > >> Another open question is what clients (in this case,
> > > >> ovn-controller) should do in such a situation. Should they log an
> > > >> error and attempt a (rate-limited) reconnection?
> > > >
> > > >
> > > > Thanks for reporting this issue Daniel.
> > > >
> > > > I can easily  reproduce the issue with the below commands.
> > > >
> > > > $  > > > $export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach)
> > > > $ovn-nbctl ls-add sw0
> > > > $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
> > > > state: active
> > > > $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/set-active-ovsdb-server 
> > > > tcp:192.0.2.2:6641
> > > > $ovs-appctl -t 

Re: [ovs-discuss] Issue with failover running ovsdb-server in A/P mode with Pacemaker

2019-07-08 Thread Ben Pfaff
Would you mind formally submitting this?  It seems like the best
immediate solution.

On Mon, Jul 08, 2019 at 02:27:31PM +0200, Daniel Alvarez Sanchez wrote:
> I tried a simple patch and it fixes the issue (see below). The
> question now is, do we want to do this? I think it makes sense to drop
> *all* the connections when the role changes but I'm curious to see
> what other people think:
> 
> diff --git a/ovsdb/jsonrpc-server.c b/ovsdb/jsonrpc-server.c
> index 4dda63a..ddbbc2e 100644
> --- a/ovsdb/jsonrpc-server.c
> +++ b/ovsdb/jsonrpc-server.c
> @@ -365,7 +365,7 @@ ovsdb_jsonrpc_server_set_read_only(struct ovsdb_jsonrpc_server *svr,
>  {
>      if (svr->read_only != read_only) {
>          svr->read_only = read_only;
> -        ovsdb_jsonrpc_server_reconnect(svr, false,
> +        ovsdb_jsonrpc_server_reconnect(svr, true,
>                                         xstrdup(read_only
>                                                 ? "making server read-only"
>                                                 : "making server read/write"));
> 
> 
> $export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach)
> $ovn-nbctl ls-add sw0
> $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
> state: active
> $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/set-active-ovsdb-server
> tcp:192.0.2.2:6641
> $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/connect-active-ovsdb-server
> $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
> state: backup
> connecting: tcp:192.0.2.2:6641
> $ ovn-nbctl ls-add sw1
> ovn-nbctl: transaction error: {"details":"insert operation not allowed
> when database server is in read only mode","error":"not allowed"}
> 
> On Mon, Jul 8, 2019 at 1:25 PM Daniel Alvarez Sanchez
>  wrote:
> >
> > I *think* that it may not be a bug in ovsdb-server but a problem with
> > ovn-controller as it doesn't seem to be a DB change aware client.
> >
> > When the role changes from master to backup or vice versa, connections
> > are expected to be reestablished for all clients except those that are
> > not aware of db changes [0] (note the 'false' argument). This flag is
> > explained here [1] and looks like since ovn-controller is not
> > monitoring the Database table in the _Server database, then the
> > connection with it is not re-established. This is just a blind guess
> > but  I can give it a shot :)
> >
> > [0] 
> > https://github.com/openvswitch/ovs/blob/403a6a0cb003f1d48b0a3cbf11a2806c45e9d076/ovsdb/jsonrpc-server.c#L368
> > [1] 
> > https://github.com/openvswitch/ovs/blob/403a6a0cb003f1d48b0a3cbf11a2806c45e9d076/ovsdb/jsonrpc-server.c#L450-L456
> >
> > On Mon, Jul 8, 2019 at 12:45 PM Numan Siddique  wrote:
> > >
> > >
> > >
> > >
> > > On Mon, Jul 8, 2019 at 3:52 PM Daniel Alvarez Sanchez 
> > >  wrote:
> > >>
> > >> Hi folks,
> > >>
> > >> While working with an OpenStack environment running OVN and
> > >> ovsdb-server in A/P configuration with Pacemaker we hit an issue that
> > >> has probably been around for a long time. The bug itself seems to be
> > >> related to ovsdb-server not updating the read-only flag properly.
> > >>
> > >> With a 3-node cluster running ovsdb-server in active/passive mode,
> > >> when we restart the master node, Pacemaker promotes another node to
> > >> master and moves the associated IPAddr2 resource to it.
> > >> At this point, ovn-controller instances across the cloud reconnect to
> > >> the new node but there's a window where ovsdb-server is still running
> > >> as backup.
> > >>
> > >> For those ovn-controller instances that reconnect within that window,
> > >> every attempt to write in the OVSDB will fail with "operation not
> > >> allowed when database server is in read only mode". This state will
> > >> remain forever unless a reconnection is forced. Restarting
> > >> ovn-controller or killing the connection (for example with tcpkill)
> > >> will make things work again.
> > >>
> > >> A workaround in OVN OCF script could be to wait for the
> > >> ovsdb_server_promote function to wait until we get 'running/active' on
> > >> that instance.
> > >>
> > >> Another open question is what should clients (in this case,
> > >> ovn-controller) do in such situation? Shall they log an error and
> > >> attempt a reconnection (rate limited)?
> > >
> > >
> > > Thanks for reporting this issue Daniel.
> > >
> > > I can easily  reproduce the issue with the below commands.
> > >
> > > $  > > $export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach)
> > > $ovn-nbctl ls-add sw0
> > > $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
> > > state: active
> > > $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/set-active-ovsdb-server 
> > > tcp:192.0.2.2:6641
> > > $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/connect-active-ovsdb-server
> > > $ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
> > > state: backup
> > > connecting: tcp:192.0.2.2:6641
> > > $ovn-nbctl ls-add sw1  --> This should have failed. Since OVN_NB_DAEMON 
> > > is set, ovn-nbctl talks to the
> > >

Re: [ovs-discuss] Issue with failover running ovsdb-server in A/P mode with Pacemaker

2019-07-08 Thread Ben Pfaff
ovn-controller is in fact change-aware, but the _Server database doesn't
report whether a particular database is read-only or read/write.  I
guess that was an oversight when I designed that schema.  That means
that there's no way for clients to monitor whether a particular database
changes between read-only and read/write.

I guess there are two ways to fix it:

1. Add a read/write column to the _Server schema and implement it in
   ovsdb-server and ovn-controller.

2. Make ovsdb-server kill connections when read/write status changes.

#2 is probably what we should do right away.  #1 can wait.
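
A toy model of option #2, for illustration only. It is not ovsdb-server code: the Session and Server classes and the drop_all switch are hypothetical names. It assumes, as described in this thread, that a session keeps the read-only mode it saw when it connected.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """One client connection; read_only is fixed when the session is created."""
    name: str
    read_only: bool
    connected: bool = True


@dataclass
class Server:
    """Hypothetical stand-in for a database server in backup or active mode."""
    read_only: bool = True  # a backup server rejects writes
    sessions: list = field(default_factory=list)

    def connect(self, name: str) -> Session:
        # The new session inherits the server's mode at connect time.
        session = Session(name, self.read_only)
        self.sessions.append(session)
        return session

    def set_read_only(self, read_only: bool, drop_all: bool) -> None:
        if self.read_only == read_only:
            return
        self.read_only = read_only
        if drop_all:
            # Option #2: close every session so that clients reconnect
            # and pick up the new mode.
            for session in self.sessions:
                session.connected = False
            self.sessions.clear()


def promote_and_reconnect(drop_all: bool) -> bool:
    """Return True if the client ends up with a writable session."""
    server = Server(read_only=True)
    session = server.connect("ovn-controller")  # connects while still backup
    server.set_read_only(False, drop_all)       # promotion to active
    if not session.connected:
        session = server.connect("ovn-controller")  # client reconnects
    return not session.read_only
```

With drop_all=False the client keeps its stale read-only session after the promotion; with drop_all=True it reconnects and gets a writable one.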

On Mon, Jul 08, 2019 at 01:25:09PM +0200, Daniel Alvarez Sanchez wrote:
> I *think* that it may not be a bug in ovsdb-server but a problem with
> ovn-controller, as it doesn't seem to be a DB change aware client.
>
> When the role changes from master to backup or vice versa, connections
> are expected to be reestablished for all clients except those that are
> not aware of db changes [0] (note the 'false' argument). This flag is
> explained here [1], and it looks like, since ovn-controller is not
> monitoring the Database table in the _Server database, the
> connection with it is not re-established. This is just a blind guess
> but I can give it a shot :)
> 
> [0] 
> https://github.com/openvswitch/ovs/blob/403a6a0cb003f1d48b0a3cbf11a2806c45e9d076/ovsdb/jsonrpc-server.c#L368
> [1] 
> https://github.com/openvswitch/ovs/blob/403a6a0cb003f1d48b0a3cbf11a2806c45e9d076/ovsdb/jsonrpc-server.c#L450-L456
> 
___
discuss mailing list
disc...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss


Re: [ovs-discuss] Re: Re:[HELP] Question about userspace geneve/vxlan port

2019-07-08 Thread Ben Pfaff
Native tunneling and userspace tunneling are the same thing.

The mechanism should be symmetric: configuration for sending packets out
should also work for parsing them on the way back in.

On Mon, Jul 08, 2019 at 03:57:46PM +0800, txfh2007 wrote:
> Hi Ben:
> Thanks for your reply! I didn't find a "native tunneling" document in the
> Open vSwitch repository. Did you mean the document "userspace-tunneling.rst"?
> That document just tells us that br-phy can send tunnel packets out, but when a
> dpdk-type port receives packets with a tunnel header, how can I configure the
> "native tunnel" mechanism to parse and handle them? Or do you mean that OVS
> currently cannot parse tunnel packets in userspace?
> 
> Thank you 
> 
> Timo 
> 
> 


Re: [ovs-discuss] Issue with failover running ovsdb-server in A/P mode with Pacemaker

2019-07-08 Thread Daniel Alvarez Sanchez
I tried a simple patch and it fixes the issue (see below). The
question now is, do we want to do this? I think it makes sense to drop
*all* the connections when the role changes but I'm curious to see
what other people think:

diff --git a/ovsdb/jsonrpc-server.c b/ovsdb/jsonrpc-server.c
index 4dda63a..ddbbc2e 100644
--- a/ovsdb/jsonrpc-server.c
+++ b/ovsdb/jsonrpc-server.c
@@ -365,7 +365,7 @@ ovsdb_jsonrpc_server_set_read_only(struct ovsdb_jsonrpc_server *svr,
 {
     if (svr->read_only != read_only) {
         svr->read_only = read_only;
-        ovsdb_jsonrpc_server_reconnect(svr, false,
+        ovsdb_jsonrpc_server_reconnect(svr, true,
                                        xstrdup(read_only
                                                ? "making server read-only"
                                                : "making server read/write"));


$ export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach)
$ ovn-nbctl ls-add sw0
$ ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
state: active
$ ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/set-active-ovsdb-server tcp:192.0.2.2:6641
$ ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/connect-active-ovsdb-server
$ ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
state: backup
connecting: tcp:192.0.2.2:6641
$ ovn-nbctl ls-add sw1
ovn-nbctl: transaction error: {"details":"insert operation not allowed when database server is in read only mode","error":"not allowed"}
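
The effect of flipping that argument can be sketched as below. This is a model, not the C implementation. It assumes that with the flag set to false the server only disconnects clients that are not DB change aware, and that with true it disconnects everyone. The function name and the "legacy-client" entry are hypothetical.

```python
def sessions_to_drop(sessions, force):
    """Return the names of the sessions disconnected on a role change.

    sessions maps a client name to whether it is DB change aware.
    Assumption: with force=False only clients that are not change aware
    are dropped; with force=True every client is dropped.
    """
    return sorted(name for name, change_aware in sessions.items()
                  if force or not change_aware)


# "legacy-client" is a made-up client that is not change aware.
clients = {"ovn-controller": True, "legacy-client": False}
before_patch = sessions_to_drop(clients, force=False)
after_patch = sessions_to_drop(clients, force=True)
```

Under that assumption a change-aware client keeps its old session before the patch and is forced to reconnect after it.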


Re: [ovs-discuss] Issue with failover running ovsdb-server in A/P mode with Pacemaker

2019-07-08 Thread Daniel Alvarez Sanchez
I *think* that it may not be a bug in ovsdb-server but a problem with
ovn-controller, as it doesn't seem to be a DB change aware client.

When the role changes from master to backup or vice versa, connections
are expected to be reestablished for all clients except those that are
not aware of db changes [0] (note the 'false' argument). This flag is
explained here [1], and it looks like, since ovn-controller is not
monitoring the Database table in the _Server database, the
connection with it is not re-established. This is just a blind guess
but I can give it a shot :)

[0] 
https://github.com/openvswitch/ovs/blob/403a6a0cb003f1d48b0a3cbf11a2806c45e9d076/ovsdb/jsonrpc-server.c#L368
[1] 
https://github.com/openvswitch/ovs/blob/403a6a0cb003f1d48b0a3cbf11a2806c45e9d076/ovsdb/jsonrpc-server.c#L450-L456



Re: [ovs-discuss] Issue with failover running ovsdb-server in A/P mode with Pacemaker

2019-07-08 Thread Lucas Alvares Gomes
Hi,

Thanks for reporting, Daniel.

On Mon, Jul 8, 2019 at 11:22 AM Daniel Alvarez Sanchez
 wrote:
>
> Another open question is what should clients (in this case,
> ovn-controller) do in such situation? Shall they log an error and
> attempt a reconnection (rate limited)?
>

I would say so: ovn-controller _requires_ a read-write session to
function properly. It can either keep retrying the connection as you
suggested, assert and exit if the connection is read-only, or combine
the two (retry first and then exit).
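
A minimal sketch of the combined policy (retry with a growing delay, then exit). It is illustrative only and does not reflect how ovn-controller is written: ensure_writable, its parameters and the connect callback are hypothetical, and connect() is assumed to return True once the session accepts writes.

```python
import time


def ensure_writable(connect, max_attempts=5, base_delay=1.0, max_delay=8.0,
                    sleep=time.sleep):
    """Reconnect until the session is writable, then give up and exit.

    Returns the number of attempts used. The delay doubles after each
    failed attempt, capped at max_delay, so reconnections are rate limited.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        if connect():
            return attempt
        if attempt < max_attempts:
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise SystemExit("still read-only after %d attempts" % max_attempts)
```

Passing sleep as a parameter keeps the policy testable without real waiting.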

Also, we need to improve the logs for such errors. While debugging the
problem it wasn't "easy" to find out why ovn-controller wasn't updating
the database (we were looking at the nb_cfg column of the Chassis
table in the Southbound OVSDB). We checked the state of the
connection (it was stable), the process was healthy, etc. It was only
when we enabled the DBG log level for ovn-controller that we
started seeing messages such as:

2019-07-04T15:11:19.522Z|00148|jsonrpc|DBG|tcp:172.17.1.27:6642: received notification, method="update2", params=[["monid","OVN_Southbound"],{"Chassis":{"cb669c72-0f84-412c-a3bf-482119649d85":{"modify":{"nb_cfg":3300]
2019-07-04T15:11:19.522Z|00149|jsonrpc|DBG|tcp:172.17.1.27:6642: received reply, result=[{"details":"update operation not allowed when database server is in read only mode","error":"not allowed"}], id=8062

So, perhaps logging it as ERROR would be better because without the
DBG level all we could see in the logs was two INFO messages saying
that it reconnected to the Southbound OVSDB.
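
As a rough illustration of the suggestion, a client could scan each transaction reply for error entries and log them at ERROR level. The sketch below only assumes the reply shape shown in the DBG output above (a JSON list whose entries may carry "error" and "details"). The function and logger names are hypothetical.

```python
import json
import logging

log = logging.getLogger("ovsdb_client_sketch")  # hypothetical logger name


def transaction_errors(reply_result):
    """Return (error, details) pairs found in a transaction reply's result list."""
    return [(op["error"], op.get("details", ""))
            for op in reply_result
            if isinstance(op, dict) and "error" in op]


def log_transaction_errors(raw_result):
    """Parse the JSON result and log every failed operation at ERROR level."""
    errors = transaction_errors(json.loads(raw_result))
    for error, details in errors:
        log.error("transaction failed: %s (%s)", error, details)
    return errors
```

With this in place the read-only failure would show up without enabling DBG.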

Cheers,
Lucas


Re: [ovs-discuss] Issue with failover running ovsdb-server in A/P mode with Pacemaker

2019-07-08 Thread Numan Siddique
On Mon, Jul 8, 2019 at 3:52 PM Daniel Alvarez Sanchez 
wrote:

> Another open question is what should clients (in this case,
> ovn-controller) do in such situation? Shall they log an error and
> attempt a reconnection (rate limited)?
>

Thanks for reporting this issue Daniel.

I can easily  reproduce the issue with the below commands.

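$ [...]
$ export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach)
$ ovn-nbctl ls-add sw0
$ ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
state: active
$ ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/set-active-ovsdb-server tcp:192.0.2.2:6641
$ ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/connect-active-ovsdb-server
$ ovs-appctl -t $PWD/sandbox/nb1 ovsdb-server/sync-status
state: backup
connecting: tcp:192.0.2.2:6641
$ ovn-nbctl ls-add sw1  --> This should have failed. Since OVN_NB_DAEMON is set, ovn-nbctl talks to the ovn-nbctl daemon, and it is able to create a logical switch even though the db is in backup mode.
$ unset OVN_NB_DAEMON
$ ovn-nbctl ls-add sw2
ovn-nbctl: transaction error: {"details":"insert operation not allowed when database server is in read only mode","error":"not allowed"}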


I looked into the ovsdb-server code: when the user changes the state of the
ovsdb-server, the read_only param of the active ovsdb_server_sessions
is not updated.

Thanks
Numan




[ovs-discuss] Issue with failover running ovsdb-server in A/P mode with Pacemaker

2019-07-08 Thread Daniel Alvarez Sanchez
Hi folks,

While working with an OpenStack environment running OVN and
ovsdb-server in A/P configuration with Pacemaker we hit an issue that
has probably been around for a long time. The bug itself seems to be
related to ovsdb-server not updating the read-only flag properly.

With a 3-node cluster running ovsdb-server in active/passive mode,
when we restart the master node, Pacemaker promotes another node to
master and moves the associated IPaddr2 resource to it.
At this point, ovn-controller instances across the cloud reconnect to
the new node but there's a window where ovsdb-server is still running
as backup.

For those ovn-controller instances that reconnect within that window,
every attempt to write in the OVSDB will fail with "operation not
allowed when database server is in read only mode". This state will
remain forever unless a reconnection is forced. Restarting
ovn-controller or killing the connection (for example with tcpkill)
will make things work again.

A workaround in the OVN OCF script could be for the
ovsdb_server_promote function to wait until we get 'running/active' on
that instance.
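
The waiting logic of that workaround could look roughly like the sketch below. The OCF script itself is shell; this Python version only shows the polling loop. wait_until_active and its parameters are hypothetical, and get_sync_status is assumed to return the text printed by "ovs-appctl ... ovsdb-server/sync-status", which contains a "state: active" line once the server is active.

```python
import time


def wait_until_active(get_sync_status, timeout=30.0, interval=1.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll until the sync-status output reports 'state: active'.

    Returns True once the server is active, or False if the timeout
    expires first.
    """
    deadline = clock() + timeout
    while True:
        status = get_sync_status()
        if any(line.strip() == "state: active" for line in status.splitlines()):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The promote step would report success only after this returns True, so clients that follow the virtual IP do not land on a server that is still a backup.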

Another open question is what clients (in this case,
ovn-controller) should do in such a situation. Should they log an error
and attempt a reconnection (rate limited)?

Thoughts?

Thanks a lot,
Daniel


[ovs-discuss] Re: Re:[HELP] Question about userspace geneve/vxlan port

2019-07-08 Thread txfh2007 via discuss
Hi Ben:
Thanks for your reply! I didn't find a "native tunneling" document in the
Open vSwitch repository. Did you mean the document "userspace-tunneling.rst"?
That document just tells us that br-phy can send tunnel packets out, but when a
dpdk-type port receives packets with a tunnel header, how can I configure the
"native tunnel" mechanism to parse and handle them? Or do you mean that OVS
currently cannot parse tunnel packets in userspace?

Thank you 

Timo 


--
From: Ben Pfaff
To: txfh2007
Cc: ovs-discuss
Subject: Re: [ovs-discuss] Re:[HELP] Question about userspace geneve/vxlan port


On Thu, Jul 04, 2019 at 05:27:28PM +0800, txfh2007 via discuss wrote:
> I have found that, theoretically, tnl_port_receive could be called during
> the upcall process (via upcall_cb() -> upcall_receive() -> xlate_lookup()
> -> xport_lookup). But in my environment, after tracing the code with gdb, I
> found that "tnl_port_should_receive(flow)" always returns "false" because
> flow->tunnel->ip_dst is "0", even if the packet received by the dpdk port
> has a tunnel header.

Yes.

> I guess the reason is that in the userspace function "handle_packet_upcall",
> match.tun_md.valid has been set to "false", so the expanded flow has no
> tunnel info, and also that in "miniflow_extract" in flow.c, packet->md is
> null because in dfc_processing the "md_is_valid" flag is always "false".
> Am I right?

Yes.

OVS takes what some might consider an idiosyncratic approach to tunnel
processing.  The "obvious" approach is to simply parse tunnel headers
and throw those into the flow.  If OVS did that, then you'd see what you
expect, but this isn't what OVS does.

Instead, OVS treats tunnels and their headers as metadata.  This is
because of OVS's history as part of the Linux kernel.  The Linux kernel
has tunnel implementations as part of the TCP/IP stack.  When a tunnel
packet arrives at a physical port in Linux, it passes into the TCP/IP
stack, where it gets processed and received on a tunnel network device.
This effectively strips the tunnel headers and transforms them into
metadata.  If the tunnel network device is part of an OVS bridge, then
it gets the packet at that point, and treats the metadata as something
that can be matched.

With other datapaths, OVS expects some equivalent mechanism to exist.
For the userspace datapath, OVS implements "native tunneling" to provide
that mechanism.  It's described in the OVS documentation.
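
The "headers become metadata" idea can be shown with a small VXLAN example. This is not OVS code: vxlan_decap and the metadata key names are hypothetical. The only outside fact assumed is the VXLAN header layout (8 bytes, flag 0x08 marking a valid VNI, VNI in the upper 24 bits of the second 32-bit word).

```python
import struct

VXLAN_HEADER_LEN = 8
VXLAN_FLAG_VNI_VALID = 0x08


def vxlan_decap(udp_payload, outer_src, outer_dst):
    """Strip the VXLAN header and return (tunnel metadata, inner frame).

    After this step the packet no longer carries tunnel headers; what
    they said is available only as metadata that a flow can match on.
    """
    if len(udp_payload) < VXLAN_HEADER_LEN:
        raise ValueError("truncated VXLAN header")
    flags, vni_and_reserved = struct.unpack(
        "!B3xI", udp_payload[:VXLAN_HEADER_LEN])
    if not flags & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    metadata = {
        "tun_id": vni_and_reserved >> 8,  # VNI is the upper 24 bits
        "ip_src": outer_src,
        "ip_dst": outer_dst,
    }
    return metadata, udp_payload[VXLAN_HEADER_LEN:]
```

Until a step like this has run, the tunnel destination in the metadata is unset, which matches the ip_dst of "0" seen in the gdb trace on the physical port.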
