On 23/11/17 23:52 +0800, Hui Xiang wrote:
> I am working on HA with 3-nodes, which has below configurations:
>
> """
> pcs resource create ovndb_servers ocf:ovn:ovndb-servers \
> master_ip=168.254.101.2 \
> op monitor interval="10s" \
> op monitor interval="11s" role=Master
> pcs resource master ovndb_servers-master ovndb_servers \
> meta notify="true" master-max="1" master-node-max="1" clone-max="3" clone-node-max="1"
> pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=168.254.101.2 \
> op monitor interval=10s
> pcs constraint order promote ovndb_servers-master then VirtualIP
> pcs constraint colocation add VirtualIP with master ovndb_servers-master \
> score=INFINITY
> """

(Out of curiosity, this looks like a mix of output from
"pcs config export pcs-commands" [or "clufter cib2pcscmd -s"]
and manual editing. Is this a good guess?)

> However, after setting it as above, the master is not being selected, all
> are stopped, from pacemaker log, node-1 has been chosen as the master, I am
> confuse where is wrong, can anybody give a help, it would be very
> appreciated.
>
>
> Master/Slave Set: ovndb_servers-master [ovndb_servers]
> Stopped: [ node-1.domain.tld node-2.domain.tld node-3.domain.tld ]
> VirtualIP (ocf::heartbeat:IPaddr2): Stopped
>
>
> # pacemaker log
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ /cib/configuration/resources: <primitive class="ocf" id="ovndb_servers" provider="ovn" type="ovndb-servers"/>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ <instance_attributes id="ovndb_servers-instance_attributes">
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ <nvpair id="ovndb_servers-instance_attributes-master_ip" name="master_ip" value="168.254.101.2"/>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ </instance_attributes>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ <operations>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ <op id="ovndb_servers-start-timeout-30s" interval="0s" name="start" timeout="30s"/>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ <op id="ovndb_servers-stop-timeout-20s" interval="0s" name="stop" timeout="20s"/>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ <op id="ovndb_servers-promote-timeout-50s" interval="0s" name="promote" timeout="50s"/>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ <op id="ovndb_servers-demote-timeout-50s" interval="0s" name="demote" timeout="50s"/>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ <op id="ovndb_servers-monitor-interval-10s" interval="10s" name="monitor"/>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ <op id="ovndb_servers-monitor-interval-11s-role-Master" interval="11s" name="monitor" role="Master"/>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ </operations>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ </primitive>
>
> Nov 23 23:06:03 [665249] node-1.domain.tld attrd: info: attrd_peer_update: Setting master-ovndb_servers[node-1.domain.tld]: (null) -> 5 from node-1.domain.tld

If your ocf:ovn:ovndb-servers agent, in master mode, can plausibly run
something like "attrd_updater -n master-ovndb_servers -U 5", then it
was indeed launched OK; if it does not continue to run as expected,
there may be a problem with the agent itself.

You can try running "pcs resource debug-promote ovndb_servers --full"
to examine the execution details (assuming the agent responds to the
OCF_TRACE_RA=1 environment variable, as shell-based agents built on
top of the ocf-shellfuncs sourceable shell library from the
resource-agents project, including the agents that project ships,
customarily do).

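To check whether the master score keeps being set, and on which nodes, the attrd lines can be pulled out of the log. The following is only a rough sketch: the function name master_score_updates is made up for illustration, and it assumes the "attrd_peer_update: Setting master-<resource>[<node>]: <old> -> <new> from <origin>" line format quoted above, with each log entry on a single line as in the actual log file.

```shell
#!/bin/sh
# Hypothetical helper (not part of pacemaker or pcs): read a pacemaker log
# on stdin and print "<node> <old> -> <new>" for each master-score update
# of the given resource.
# Assumes the attrd_peer_update line format seen in the quoted log; other
# pacemaker versions may word this message differently.
master_score_updates() {
    rsc="$1"   # resource name, e.g. ovndb_servers (used verbatim in the regex)
    sed -n "s/.*attrd_peer_update:[[:space:]]*Setting master-${rsc}\[\([^]]*\)]: \(.*\) -> \([^ ]*\) from .*/\1 \2 -> \3/p"
}

# Example use (the log path is an assumption, adjust to your setup):
#   master_score_updates ovndb_servers < /var/log/pacemaker.log
```

If no node ever shows up here with a positive score, no instance is eligible for promotion and the agent's monitor/promote logic is the place to look.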
> Nov 23 23:06:03 [665251] node-1.domain.tld crmd: notice: process_lrm_event: Operation ovndb_servers_monitor_0: ok (node=node-1.domain.tld, call=185, rc=0, cib-update=88, confirmed=true)
> <29>Nov 23 23:06:03 node-1 crmd[665251]: notice: process_lrm_event: Operation ovndb_servers_monitor_0: ok (node=node-1.domain.tld, call=185, rc=0, cib-update=88, confirmed=true)
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: Diff: --- 0.630.2 2
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: Diff: +++ 0.630.3 (null)
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: + /cib: @num_updates=3
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_perform_op: ++ /cib/status/node_state[@id='1']/transient_attributes[@id='1']/instance_attributes[@id='status-1']: <nvpair id="status-1-master-ovndb_servers" name="master-ovndb_servers" value="5"/>
> Nov 23 23:06:03 [665246] node-1.domain.tld cib: info: cib_process_request: Completed cib_modify operation for section status: OK (rc=0, origin=node-3.domain.tld/attrd/80, version=0.630.3)

It also depends on whether there's anything interesting after this point...

-- 
Jan (Poki)