FYI: if I first configure a healthy ovndb-server cluster with one active and
two backup servers, and then start the pacemaker ovndb-servers resource
agents, all of them become slaves...
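
To compare what each node reports against what "pcs status" shows, the
status strings seen in this thread (running/active, running/backup) can be
classified per node. This is only a minimal sketch: classify_ovndb_role is a
hypothetical helper, and it assumes "status_ovnxb" stands for the two ovn-ctl
commands status_ovnnb and status_ovnsb. The ovn-ctl path in the usage comment
is also an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of ovn-ctl or the OCF script):
# map the NB and SB status strings of one node to a single role.
classify_ovndb_role() {
    nb_status=$1
    sb_status=$2
    if [ "$nb_status" = "running/active" ] && [ "$sb_status" = "running/active" ]; then
        echo master
    elif [ "$nb_status" = "running/backup" ] && [ "$sb_status" = "running/backup" ]; then
        echo slave
    else
        # Not running, or NB and SB disagree about their role.
        echo inconsistent
    fi
}

# Usage on a node where ovn-ctl is installed (path is an assumption):
#   OVN_CTL=/usr/share/openvswitch/scripts/ovn-ctl
#   classify_ovndb_role "$($OVN_CTL status_ovnnb)" "$($OVN_CTL status_ovnsb)"
```

Running this on each node gives a quick view of whether exactly one node
reports master before and after the pacemaker resource is started.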

On Tue, Nov 28, 2017 at 10:48 PM, Numan Siddique <[email protected]>
wrote:

>
>
> On Tue, Nov 28, 2017 at 2:29 PM, Hui Xiang <[email protected]> wrote:
>
>> Hi Numan,
>>
>>
>> I finally figured out what was going wrong when running the ovndb-servers
>> OCF agent in my environment.
>>
>> 1. No ovnnb or ovnsb server is running by default in my environment. I
>> assumed pacemaker would start them, the way other typical resource agents
>> do.
>> When I created the ovndb_servers resource, nothing happened: no operation
>> was executed except monitor, which made it really hard to debug for a while.
>> The ovsdb_server_monitor() function first checks the status, which here
>> returns NOT_RUNNING. Then the ovsdb_server_master_update() function
>> executes "CRM_MASTER -D", which appears to stop every following action.
>> I am not very clear on what it does.
>>
>> So, do ovn_nb and ovn_sb need to be running before the pacemaker
>> ovndb_servers resource is created? Is there any documentation that
>> covers this?
>>
>> 2. Without your patch, every node executes ovsdb_server_monitor and
>> returns OCF_SUCCESS.
>> However, the ovsdb_server_stop action is executed on the first node of
>> the three-node cluster, for the reason shown below:
>> <27>Nov 28 15:35:11 node-1 pengine[1897010]:    error: clone_color:
>> ovndb_servers:0 is running on node-1.domain.tld which isn't allowed
>> Did I miss anything? I don't understand why it isn't allowed.
>>
>> 3. Regarding your patch [1]:
>> It first reports "/usr/lib/ocf/resource.d/ovn/ovndb-servers: line 26:
>> ocf_attribute_target: command not found ]" in my environment (pacemaker
>> 1.1.12).
>>
>
> Thanks. I will come back to you on your other points. The function
> "ocf_attribute_target" must have been added in 1.1.16-12.
>
> I think it makes sense to either remove "ocf_attribute_target" or find a
> way so that even older versions work.
>
> I will spin a v2.
> Thanks
> Numan
>
>
>
>> The log showed the same as in item 2, but I briefly saw a different state
>> in "pcs status", as shown below:
>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>      Slaves: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>> There is no promote action being executed.
>>
>>
>> Thanks for looking into this and for your help.
>>
>> [1] - https://patchwork.ozlabs.org/patch/839022/
>>
>>
>>
>>
>>
>> On Fri, Nov 24, 2017 at 10:54 PM, Numan Siddique <[email protected]>
>> wrote:
>>
>>> Hi Hui Xiang,
>>>
>>> Can you please try this patch [1] and see if it works for you?
>>> Please let me know how it goes. I am not sure, though, whether the patch
>>> will fix the issue.
>>>
>>> In brief, the OVN OCF script doesn't add a monitor action for the "Master"
>>> role, so the pacemaker resource agent does not check the status of the OVN
>>> DB servers periodically. If the OVN DB servers are killed, pacemaker won't
>>> know about it.
>>>
>>>
>>>
>>>
>>> You can also take a look at [2] to see how it is used in OpenStack
>>> with a TripleO installation.
>>>
>>> [1] - https://patchwork.ozlabs.org/patch/839022/
>>> [2] - https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/pacemaker/ovn_northd.pp
>>>
>>>
>>> Thanks
>>> Numan
>>>
>>> On Fri, Nov 24, 2017 at 3:00 PM, Hui Xiang <[email protected]> wrote:
>>>
>>>> Hi folks,
>>>>
>>>>   I am following what the doc [1] suggests to configure ovndb_servers
>>>> HA. However, I have had no luck: even after upgrading the pacemaker
>>>> packages from 1.12 to 1.16 and trying almost every kind of change, there
>>>> is still no ovndb_servers master promoted. Is there any special recipe to
>>>> get it running? It is so frustrating, sigh.
>>>>
>>>> It always showed:
>>>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>>      Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>>>
>>>> Even after I tried the steps below:
>>>> 1. pcs resource debug-stop ovndb_server on every node.
>>>>    ovn-ctl status_ovnxb: running/backup
>>>> 2. pcs resource debug-start ovndb_server on every node.
>>>>    ovn-ctl status_ovnxb: running/backup
>>>> 3. pcs resource debug-promote ovndb_server on one node.
>>>>    ovn-ctl status_ovnxb: running/active
>>>>
>>>> With the above status, pcs status still showed:
>>>>  Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>>      Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>>>
>>>>
>>>> [1]. https://github.com/openvswitch/ovs/blob/master/Documentation/topics/integration.rst
>>>>
>>>> Any hint is appreciated.
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> discuss mailing list
>>>> [email protected]
>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
>>>>
>>>>
>>>
>>
>
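Regarding the "ocf_attribute_target: command not found" error on pacemaker
1.1.12 quoted above: one way an agent might keep working on older versions
is to fall back when that function is missing. The following is only a
sketch and not the v2 patch. resolve_node_name is a hypothetical helper
name, and it assumes ocf_local_nodename from ocf-shellfuncs is available on
the older installation, with "uname -n" as a last resort.

```shell
#!/bin/sh
# Hypothetical helper: pick the node name without hard-requiring
# ocf_attribute_target, which older resource-agents may not provide.
resolve_node_name() {
    if command -v ocf_attribute_target >/dev/null 2>&1; then
        # Newer installations: use the function the patch relies on.
        ocf_attribute_target
    elif command -v ocf_local_nodename >/dev/null 2>&1; then
        # Assumed to exist in older ocf-shellfuncs.
        ocf_local_nodename
    else
        # Last resort outside of any OCF environment.
        uname -n
    fi
}
```

Inside an OCF script this would be called after sourcing ocf-shellfuncs,
for example host_name=$(resolve_node_name), so that the same script runs on
both old and new pacemaker versions.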
