On Fri, Aug 19, 2016 at 11:48 AM, Numan Siddique <nusid...@redhat.com>
wrote:

>
>
> On Wed, Aug 17, 2016 at 11:24 PM, Andy Zhou <az...@ovn.org> wrote:
>
>>
>>
>> On Wed, Aug 17, 2016 at 8:30 AM, Numan Siddique <nusid...@redhat.com>
>> wrote:
>>
>>> Hi Andy,
>>> I have started working on integrating ovsdb-server HA support with
>>> pacemaker (via OCF heartbeat, i.e. ocf:heartbeat).
>>>
>>
>> Thanks for working on it.
>>
>>>
>>> A few comments below.
>>>
>>>
>>>
>>>>
>>>> >>> Thanks for helping out.
>>>> >>>
>>>> >>> Given that, I now plan to work from bottom up, initially focusing on
>>>> >>> ovsdb server changes.
>>>> >>>
>>>> >>> 1. Add a state in ovsdb-server for it to know whether it is an active
>>>> >>> server.  Backup server will not accept any connections.  Server
>>>> >>> started with the --sync-from argument will be put in the backup state
>>>> >>> by default.
>>>> >>>
>>>> >>> 2. Add appctl commands to allow manually switching state.
>>>> >>>
>>>>
>>>
>>> In order to write the ocf script for ovsdb-server, we need a mechanism to
>>>  - know if the ovsdb-server instance is running as master or slave
>>>
>> The current 2.6 branch code does not have this feature. You can always
>> use the switch commands to force the state. On the other hand, adding an
>> appctl command seems appropriate and can be useful for troubleshooting
>> as well. I will work on it.
>>
>>  - to switch the state of the ovsdb-server to either master or slave.
>>>
>> These are currently supported via appctl commands:
>> ovsdb-server/connect-active-ovsdb-server
>> ovsdb-server/disconnect-active-ovsdb-server
>>
>
>
> Thanks for pointing me in the right direction. I added a new command to
> check if the ovsdb-server is active or backup and submitted the patch [1]
> as I needed that in the pacemaker ocf script for ovsdb server.
>
> [1] - https://patchwork.ozlabs.org/patch/660919/
>


> Based on my initial work, I have a couple of comments below
>
> 1. I want to start the ovsdb servers on 2 nodes in a backup state, i.e. with
> --sync-from defined for both of them so that pacemaker can promote a master
> (based on colocation constraints).
>
Could the process that launches the active ovsdb-server issue an appctl
command "ovsdb-server/set-active-ovsdb-server" immediately after?

>
>   When I start the ovsdb-servers this way, both of them try to get the
> schema from the other one and hang. I am not sure if this can be
> addressed, as ovsdb-server is single threaded. Or maybe there is a way to
> address it that I am not aware of.
>
Although that's not an intended use case, I am not aware of any reason it
should hang. I am working on a change to the replication core to remove the
blocking transactions it uses.
Maybe it will help.  If not, I will debug it.

>
>  Hence, I added a new command line option "--backup-server=[true/false]".
> Only if --backup-server is set to true will ovsdb-server try to sync with
> the active server. This way I can start one server as master (with
> --sync-from defined) and other as backup (both --sync-from and
> --backup-server set). It will be easier for pacemaker to demote a master to
> backup (for whatever reason) so that it can easily connect to the new
> master (without the need to send the unixctl command -
> ovsdb-server/set-active-ovsdb-server). I have a patch ready with the new
> command line option. Please let me know if this is fine.
>

I don't see why this option is necessary for the given use case. It seems
the launching process can achieve this by using appctl commands immediately
after.
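
To make the suggestion concrete, here is a rough sketch of what the
launching process could do right after starting a server. It is only an
illustration: the control socket path and the remote address are made-up
placeholders, and passing the remote as a single argument to
set-active-ovsdb-server is my assumption. The runner is injectable so the
sketch can be tried without a running ovsdb-server.

```python
import subprocess
from typing import Callable, List

# Command name taken from this thread; its argument form (one remote
# string) is an assumption made for illustration.
SET_ACTIVE = "ovsdb-server/set-active-ovsdb-server"


def appctl_argv(target: str, command: str, *args: str) -> List[str]:
    """Build an `ovs-appctl -t TARGET COMMAND [ARG...]` argument vector."""
    return ["ovs-appctl", "-t", target, command, *args]


def point_at_active(
    target: str,
    active_remote: str,
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> List[str]:
    """Tell the server behind `target` where the active server lives.

    `target` is the server's control socket (hypothetical path below) and
    `active_remote` is a placeholder remote such as "tcp:192.0.2.10:6641".
    Returns the argument vector that was handed to `run`.
    """
    argv = appctl_argv(target, SET_ACTIVE, active_remote)
    run(argv)
    return argv


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    point_at_active("/var/run/openvswitch/ovnnb_db.ctl",
                    "tcp:192.0.2.10:6641", run=print)
```

The launcher would call point_at_active() once per server it starts, so no
extra command line option is needed for this step.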

On the other hand, if we ever want to make command line option changes,
this is a good time for it since 2.6 will be the first release to add them.
A (minor) issue I have with current command line options is that the
--sync-exclude is not available from the command line.

I considered consolidating the three appctl commands
set_sync_excluded_tables, set_active_ovsdb_server and
connect_active_ovsdb_server into a single command, say, sync-from
<active-server> [<excluded-tables>].  But it seems you don't want
pacemaker to know about the address of the active/backup server. Is
this true? What's preventing pacemaker from knowing all the necessary
information?

>
> 2. When pacemaker has to change the master to backup and the backup to
> master (for whatever reason), it first demotes the master and then promotes
> the backup. When it demotes the master, both ovsdb-servers will be in
> backup mode. After it promotes the backup to master, the new backup server
> does not get updates from the new master until I send the unixctl commands
> ovsdb-server/disconnect-active-ovsdb-server
> and
> ovsdb-server/connect-active-ovsdb-server
> to the new backup.
> I think this issue needs to be addressed.
>
It looks like another example of the issues that can be caused by the
blocking transactions currently used by replication. Thanks for reporting
it. I will add a test case in my non-blocking patches.  As a workaround,
could you relaunch the new backup server after the backup server becomes
active?
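
In the meantime, the disconnect/connect sequence you describe could be
scripted for the new backup. A minimal sketch, assuming both commands take
no arguments and using a placeholder control socket path; the runner is
injectable so it can be exercised without a live server:

```python
import subprocess
from typing import Callable, List

# Both command names are taken from this thread.
DISCONNECT = "ovsdb-server/disconnect-active-ovsdb-server"
CONNECT = "ovsdb-server/connect-active-ovsdb-server"


def resync_backup(
    target: str,
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> List[List[str]]:
    """Drop and re-establish the backup's connection to the active server.

    `target` is the backup server's control socket (hypothetical path).
    The commands are issued strictly in order: disconnect, then connect.
    Returns the argument vectors that were handed to `run`.
    """
    issued = []
    for command in (DISCONNECT, CONNECT):
        argv = ["ovs-appctl", "-t", target, command]
        run(argv)
        issued.append(argv)
    return issued


if __name__ == "__main__":
    # Dry run: print the two commands instead of executing them.
    resync_backup("/var/run/openvswitch/ovnnb_db.ctl", run=print)
```

With the default runner, a failing disconnect raises before connect is
attempted, so the agent can report the error instead of continuing.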

>
> Thanks
> Numan
>
>
>>
>>> The initial (very dirty, not working properly) ocf script  can be found
>>> here - [1]
>>>
>>> I know you have already mentioned adding this support above. This
>>> is just a confirmation that it would be consumed by the pacemaker ocf
>>> script once available.
>>>
>>> [1] - https://github.com/numansiddique/resource-agents/commit/7f6c4d8977c7cf525ec22793d8adf5b308bc431e
>>>
>>>
>>> Thanks
>>> Numan
>>>
>>>
>>>> >>> 3. Add a new table for the backup server to register its address and
>>>> >>> ports. OVSDB clients can learn about them at run time. The backup
>>>> >>> server should issue a transaction to register its address before
>>>> >>> issuing the monitoring request.  This feature is not strictly
>>>> >>> necessary, and can be pushed to the HA manager, but having it built
>>>> >>> into ovsdb-server may make it simpler for integration.
>>>> >>>
>>>> >>> What do you think?
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >> Russell, would the HA manager also manage ovn-controller switchover?
>>>> >>
>>>> >
>>>> > Yes, indirectly.  The way this is typically handled is by using a
>>>> > virtual IP that moves to whatever host is currently the master.
>>>> >
>>>> Cool, then ovn-controller does not have to be HA aware.
>>>>
>>>> >
>>>> >
>>>>
>>>> >
>>>> > --
>>>> > Russell Bryant
>>>> >
>>>> _______________________________________________
>>>> dev mailing list
>>>> dev@openvswitch.org
>>>> http://openvswitch.org/mailman/listinfo/dev
>>>>
>>>
>>>
>>
>