FYI: if I first configure a healthy ovndb-server cluster with one active node and two slaves, and then start the pacemaker ovndb-servers resource agents, all of them become slaves...
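To compare what each node reports before and after the resource agents start, something like the following could help. It is only a sketch, not the code from the ovndb-servers agent: the function name classify_ovndb_status is made up for illustration, and it assumes that "ovn-ctl status_ovnnb" and "ovn-ctl status_ovnsb" print the running/active and running/backup strings shown in the quoted thread. The numeric values are the standard OCF return codes.

```shell
#!/bin/sh
# Standard OCF return codes relevant to a master/slave resource.
OCF_SUCCESS=0          # running as slave
OCF_NOT_RUNNING=7      # cleanly stopped
OCF_RUNNING_MASTER=8   # running as master

# Hypothetical helper (not part of the real agent): map the NB and SB
# status strings to the return code a monitor action would report.
classify_ovndb_status() {
    nb_status=$1
    sb_status=$2
    if [ "$nb_status" = "running/active" ] && [ "$sb_status" = "running/active" ]; then
        return $OCF_RUNNING_MASTER
    elif [ "$nb_status" = "running/backup" ] && [ "$sb_status" = "running/backup" ]; then
        return $OCF_SUCCESS
    fi
    # Anything else (including mixed states) is treated as not running
    # in this simplified sketch.
    return $OCF_NOT_RUNNING
}
```

On a node it could be called as classify_ovndb_status "$(ovn-ctl status_ovnnb)" "$(ovn-ctl status_ovnsb)"; echo $?, assuming ovn-ctl is on the PATH (on many installs it lives under the openvswitch scripts directory instead).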
On Tue, Nov 28, 2017 at 10:48 PM, Numan Siddique <[email protected]> wrote:

> On Tue, Nov 28, 2017 at 2:29 PM, Hui Xiang <[email protected]> wrote:
>
>> Hi Numan,
>>
>> I finally figured out what goes wrong when running the ovndb-servers
>> ocf in my environment.
>>
>> 1. There is no ovnnb and ovnsb running by default in my environment. I
>> thought pacemaker would start them, as other typical resource agents do.
>> When I create the ovndb_servers resource, nothing happens: no operation
>> is executed except monitor, which made this really hard to debug for a
>> while.
>> In the ovsdb_server_monitor() function, the status is checked first and
>> NOT_RUNNING is returned here. Then, in the ovsdb_server_master_update()
>> function, "CRM_MASTER -D" is executed, which appears to stop every
>> subsequent action. I am not clear on what it does.
>>
>> So, do ovn_nb and ovn_sb need to be running before the pacemaker
>> ovndb_servers resource is created? Is there any documentation that
>> covers this?
>>
>> 2. Without your patch, every node executes ovsdb_server_monitor and
>> returns OCF_SUCCESS.
>> However, the first node of the three-node cluster executes the
>> ovsdb_server_stop action, for the reason shown below:
>> <27>Nov 28 15:35:11 node-1 pengine[1897010]: error: clone_color:
>> ovndb_servers:0 is running on node-1.domain.tld which isn't allowed
>> Did I miss anything? I don't understand why it isn't allowed.
>>
>> 3. Regarding your patch [1]:
>> It first reports "/usr/lib/ocf/resource.d/ovn/ovndb-servers: line 26:
>> ocf_attribute_target: command not found ]" in my environment (pacemaker
>> 1.1.12).
>
> Thanks. I will come back to you on your other points. The function
> "ocf_attribute_target" must have been added in 1.1.16-12.
>
> I think it makes sense to either remove "ocf_attribute_target" or find a
> way so that even older versions work.
>
> I will spin a v2.
>
> Thanks
> Numan
>
>> The log shows the same as in item 2, but very briefly I saw a different
>> state in "pcs status", as shown below:
>> Master/Slave Set: ovndb_servers-master [ovndb_servers]
>> Slaves: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>> No promote action is executed.
>>
>> Thanks for looking into this and for your help.
>>
>> [1] - https://patchwork.ozlabs.org/patch/839022/
>>
>> On Fri, Nov 24, 2017 at 10:54 PM, Numan Siddique <[email protected]>
>> wrote:
>>
>>> Hi Hui Xiang,
>>>
>>> Can you please try this patch [1] and see if it works for you?
>>> Please let me know how it goes. I am not sure, though, that the patch
>>> will fix the issue.
>>>
>>> In brief, the OVN OCF script doesn't add a monitor action for the
>>> "Master" role, so pacemaker would not check the status of the ovn db
>>> servers periodically. If the ovn db servers are killed, pacemaker
>>> won't know about it.
>>>
>>> You can also take a look at [2] to see how it is used in openstack
>>> with a tripleo installation.
>>>
>>> [1] - https://patchwork.ozlabs.org/patch/839022/
>>> [2] - https://github.com/openstack/puppet-tripleo/blob/master/manifests/profile/pacemaker/ovn_northd.pp
>>>
>>> Thanks
>>> Numan
>>>
>>> On Fri, Nov 24, 2017 at 3:00 PM, Hui Xiang <[email protected]> wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I am following what the doc [1] suggests to configure ovndb_servers
>>>> HA. However, I have had no luck: even after upgrading the pacemaker
>>>> packages from 1.12 to 1.16 and trying almost every kind of change,
>>>> there is still no ovndb_servers master promoted. Is there any special
>>>> recipe for getting it to run? I am so frustrated with it, sigh.
>>>>
>>>> It always shows:
>>>> Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>> Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>>>
>>>> This is the case even after I tried the steps below:
>>>> 1. pcs resource debug-stop ovndb_server on every node.
>>>>    ovn-ctl status_ovnxb: running/backup
>>>> 2. pcs resource debug-start ovndb_server on every node.
>>>>    ovn-ctl status_ovnxb: running/backup
>>>> 3. pcs resource debug-promote ovndb_server on one node.
>>>>    ovn-ctl status_ovnxb: running/active
>>>>
>>>> With the above status, pcs status still shows:
>>>> Master/Slave Set: ovndb_servers-master [ovndb_servers]
>>>> Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
>>>>
>>>> [1]. https://github.com/openvswitch/ovs/blob/master/Documentation/topics/integration.rst
>>>>
>>>> Any hint is appreciated.
>>>>
>>>> _______________________________________________
>>>> discuss mailing list
>>>> [email protected]
>>>> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
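
Regarding the "ocf_attribute_target: command not found" error quoted above: one possible way to keep the script working on older installs might be to define a fallback only when the function is missing. This is a rough sketch and not what the patch does. It assumes that the node name from OCF_RESKEY_CRM_meta_on_node (set by pacemaker for resource actions) or, failing that, from "uname -n" is an acceptable substitute, and it ignores the container and bundle cases that the real helper handles.

```shell
#!/bin/sh
# Define a fallback only if the installed OCF shell functions do not
# already provide ocf_attribute_target.
if ! command -v ocf_attribute_target >/dev/null 2>&1; then
    # Assumption: the local node name is a good enough target here.
    ocf_attribute_target() {
        if [ -n "$OCF_RESKEY_CRM_meta_on_node" ]; then
            # Node name that pacemaker passes to the resource action.
            echo "$OCF_RESKEY_CRM_meta_on_node"
        else
            # Outside pacemaker (e.g. manual debugging), use the hostname.
            uname -n
        fi
    }
fi
```

The check has to run after the OCF shell functions are sourced, so that a newer install keeps its own implementation.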
