On Tue, Jul 26, 2016 at 3:48 PM, Andy Zhou <az...@ovn.org> wrote:

>
>
> On Tue, Jul 26, 2016 at 11:59 AM, Russell Bryant <russ...@ovn.org> wrote:
>
>>
>>
>> On Tue, Jul 26, 2016 at 2:41 PM, Andy Zhou <az...@ovn.org> wrote:
>>
>>>
>>>
>>> On Tue, Jul 26, 2016 at 5:37 AM, Russell Bryant <russ...@ovn.org> wrote:
>>>
>>>>
>>>>
>>>> On Mon, Jul 25, 2016 at 8:15 PM, Andy Zhou <az...@ovn.org> wrote:
>>>>
>>>>> Hi, Rayn and Russell,
>>>>>
>>>>
>>>> Can we move this discussion to the ovs dev mailing list?  Feel free to
>>>> just add it in a reply if you'd like.
>>>>
>>> Done.
>>>
>>>>
>>>>
>>>>> I am wondering how we can actually use the active/backup feature that
>>>>> is now part of
>>>>> OVSDB to increase OVN availability.
>>>>>
>>>>
>>>> To be clear, I haven't actually tried this yet.  I'm only speaking
>>>> about how I think it should work.
>>>>
>>>>
>>>>> Specifically:
>>>>>
>>>>> 1. When the active OVSDB server fails, should the backup server take
>>>>> over and allow write transactions? One simpler possibility is to allow
>>>>> read-only access to the backup server.
>>>>>
>>>>
>>>> The backup server needs to take over.  It's OK if that requires
>>>> intervention by an HA manager like Pacemaker.  If we can't make the passive
>>>> server take over, I'd say the solution is incomplete.
>>>>
>>>
>>> O.K., makes sense.
>>>
>>> One possible issue with the backup server taking over is "split brain".
>>> If, due to a network error, the backup server becomes disconnected from
>>> the active server, then we may have both servers thinking they are the
>>> active server.  Does Pacemaker help with solving this issue?
>>>
>>
>> It can, yes.  I would expect Pacemaker to explicitly configure a node to
>> be either the active or passive node.
>>
> Manual switching is more straightforward. I agree.
>
>>
>>>>
>>>>> 2. When a crashed active OVSDB server recovers, should it become the
>>>>> new backup, or should it switch back?
>>>>>
>>>>
>>>> Becoming the new backup is fine.  Again, this can be orchestrated by an
>>>> HA manager (Pacemaker).
>>>>
>>> I am not familiar with Pacemaker. Can I assume it can provide a correct
>>> --sync-from argument (pointing to the backup server) when relaunching the
>>> OVSDB server?
>>>
>>
>> Yes.  I'd have to consult with some Pacemaker experts on exactly what the
>> implementation would look like, but roughly:
>>
>> Pacemaker manages services using "OCF Resource Agents", which are just
>> scripts with a defined set of inputs and outputs for service management.  I
>> would imagine a Pacemaker cluster being told it must have exactly 1 active
>> and 1 passive OVSDB service.  When the passive OVSDB service is started, it
>> would include the "sync-from" argument based on where the active OVSDB
>> service is currently running.
>>
>> We really need to prototype this and document it.  I'm guessing too
>> much.  Pacemaker is frequently used to manage active/passive HA, though.
>>
> Sounds reasonable.  I will work on ovsdb internal changes to support
> manual switching, using appctl commands. Then I will look into prototyping
> with HA systems.  I have not used Pacemaker in the past, so it may take some
> time to ramp up.
>

I should be able to help.  We need to do this work anyway for integration
into OpenStack deployment tools.  Let me see if I can get some helpful
examples to follow.
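
As a starting point, the role assignment discussed above could be sketched
roughly as follows.  This is only an illustration of the intended behavior
(exactly one active server, a recovered server rejoins as backup, the backup
is started with --sync-from pointing at the active server).  The function
names, node names, and the remote address are hypothetical, and this is not
how Pacemaker or an OCF resource agent would actually be written.

```python
def assign_roles(nodes, healthy, current_active=None):
    """Return {node: "active" | "backup" | "down"} with at most one active.

    nodes: ordered list of node names (hypothetical identifiers).
    healthy: set of nodes currently reachable.
    current_active: node that held the active role before this call.
    """
    # Keep the current active node if it is still healthy; otherwise
    # promote the first healthy node.  A recovered node is never switched
    # back automatically; it simply rejoins as a backup.
    if current_active in healthy:
        active = current_active
    else:
        active = next((n for n in nodes if n in healthy), None)

    roles = {}
    for n in nodes:
        if n not in healthy:
            roles[n] = "down"
        elif n == active:
            roles[n] = "active"
        else:
            roles[n] = "backup"
    return roles


def ovsdb_server_args(role, active_remote):
    """Extra ovsdb-server arguments for a role (illustrative helper only)."""
    # Only the backup is pointed at the active server via --sync-from.
    if role == "backup":
        return ["--sync-from=" + active_remote]
    return []


if __name__ == "__main__":
    nodes = ["node-a", "node-b"]
    print(assign_roles(nodes, {"node-a", "node-b"}, "node-a"))
    print(assign_roles(nodes, {"node-b"}, "node-a"))            # node-a failed
    print(assign_roles(nodes, {"node-a", "node-b"}, "node-b"))  # node-a is back
```

Whether the active role is chosen by an HA manager or by hand, the point is
that the choice is made in one place, which is what avoids both servers
believing they are active.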

-- 
Russell Bryant
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
