On Tue, Jul 26, 2016 at 3:48 PM, Andy Zhou <az...@ovn.org> wrote:

> On Tue, Jul 26, 2016 at 11:59 AM, Russell Bryant <russ...@ovn.org> wrote:
>
>> On Tue, Jul 26, 2016 at 2:41 PM, Andy Zhou <az...@ovn.org> wrote:
>>
>>> On Tue, Jul 26, 2016 at 5:37 AM, Russell Bryant <russ...@ovn.org> wrote:
>>>
>>>> On Mon, Jul 25, 2016 at 8:15 PM, Andy Zhou <az...@ovn.org> wrote:
>>>>
>>>>> Hi, Rayn and Russell,
>>>>>
>>>>
>>>> Can we move this discussion to the ovs dev mailing list? Feel free to
>>>> just add it in a reply if you'd like.
>>>>
>>> Done.
>>>
>>>>
>>>>> I am wondering how we can actually use the active/backup feature that
>>>>> is now part of OVSDB to increase OVN availability.
>>>>>
>>>>
>>>> To be clear, I haven't actually tried this yet. I'm only speaking
>>>> about how I think it should work.
>>>>
>>>>> Specifically:
>>>>>
>>>>> 1. When the active OVSDB server fails, should the backup server take
>>>>> over and allow write transactions? One simpler possibility is to allow
>>>>> read-only access to the backup server.
>>>>>
>>>>
>>>> The backup server needs to take over. It's OK if that requires
>>>> intervention by an HA manager like Pacemaker. If we can't make the passive
>>>> server take over, I'd say the solution is incomplete.
>>>>
>>>
>>> O.K., makes sense.
>>>
>>> One possible issue with the backup server taking over is "split brain".
>>> If, due to a network error, the backup server becomes disconnected from
>>> the active server, then we may have both servers thinking they are the
>>> active server. Does Pacemaker help with solving this issue?
>>>
>>
>> It can, yes. I would expect Pacemaker to explicitly configure a node to
>> be either the active or passive node.
>>
> Manual switching is more straightforward. I agree.
>
>>>>
>>>>> 2. When a crashed active OVSDB server recovers, should it become the
>>>>> new backup, or should it switch back?
>>>>>
>>>>
>>>> Becoming the new backup is fine. Again, this can be orchestrated by an
>>>> HA manager (Pacemaker).
>>>>
>>> I am not familiar with Pacemaker. Can I assume it can provide a correct
>>> --sync-from argument (pointing to the backup server) when relaunching the
>>> OVSDB server?
>>>
>>
>> Yes. I'd have to consult with some Pacemaker experts on exactly what the
>> implementation would look like, but roughly:
>>
>> Pacemaker manages services using "OCF Resource Agents", which are just
>> scripts with a defined set of inputs and outputs for service management. I
>> would imagine a Pacemaker cluster being told it must have exactly 1 active
>> and 1 passive OVSDB service. When the passive OVSDB service is started, it
>> would include the "sync-from" argument based on where the active OVSDB
>> service is currently running.
>>
>> We really need to prototype this and document it. I'm guessing too
>> much. Pacemaker is frequently used to manage active/passive HA, though.
>>
> Sounds reasonable. I will work on ovsdb internal changes to support
> manual switching, using appctl commands. Then I will look into prototyping
> with HA systems. I have not used Pacemaker in the past, so it may take some
> time to ramp up.
>
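
The start-up behaviour described above (exactly one active server, with the
passive one started with a --sync-from argument that points at wherever the
active server currently runs) could be sketched roughly as follows. This is
only an illustration under assumptions, not what a Pacemaker resource agent
would actually contain: the node names, addresses, port 6641, database path,
and helper names are all hypothetical. Only --sync-from comes from the thread;
--remote=ptcp:... is the usual ovsdb-server listener syntax.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Node:
    name: str     # hypothetical cluster node name
    address: str  # hypothetical IP address of that node


def find_active(roles: Dict[str, str]) -> str:
    """Return the name of the single active node.

    Raises ValueError when zero or several nodes claim the active role,
    which is the split-brain case discussed in the thread.
    """
    active = [name for name, role in roles.items() if role == "active"]
    if len(active) != 1:
        raise ValueError(f"expected exactly one active node, got {active}")
    return active[0]


def ovsdb_server_args(local: Node, active: Node, db_path: str,
                      port: int = 6641) -> List[str]:
    """Build an ovsdb-server command line for `local`.

    The port and db_path are assumptions made for this sketch.
    """
    args = ["ovsdb-server", f"--remote=ptcp:{port}:{local.address}"]
    if local != active:
        # Passive role: replicate from the node that is currently active.
        args.append(f"--sync-from=tcp:{active.address}:{port}")
    args.append(db_path)
    return args


if __name__ == "__main__":
    node_a = Node("node-a", "192.0.2.10")
    node_b = Node("node-b", "192.0.2.11")
    nodes = {n.name: n for n in (node_a, node_b)}

    # node-a crashed and came back; node-b took over in the meantime.
    roles = {"node-a": "passive", "node-b": "active"}
    active = nodes[find_active(roles)]
    print(" ".join(ovsdb_server_args(node_a, active, "/tmp/ovnnb.db")))
```

The point of keeping the role decision outside ovsdb-server is the same one
made in the thread: the HA manager decides who is active, and the server is
simply started with arguments that match that decision.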

I should be able to help. We need to do this work anyway for integration
into OpenStack deployment tools. Let me see if I can get some helpful
examples to follow.

-- 
Russell Bryant