Re: [ClusterLabs] How to set up fencing/stonith
On 16. 5. 2018 at 05:52, Casey & Gina wrote:
> Hi, I'm trying to figure out how to get fencing/stonith going with pacemaker. As far as I understand it, they are both part of the same thing - setting up stonith means setting up fencing. If I'm mistaken about that, please let me know. Specifically, I want to use the external/vcenter plugin. I've got the required vCenter CLI software installed and tested with the `gethosts`, `on`, `off`, etc. commands as per /usr/share/doc/cluster-glue/stonith/README.vcenter. I'm struggling to understand how to now get it set up with pacemaker.
>
> Both the aforementioned document and https://www.hastexo.com/resources/hints-and-kinks/fencing-vmware-virtualized-pacemaker-nodes/ have instructions for crm, not pcs, and I'm not sure how exactly to translate one to the other. What I've done before in this situation is to install crmsh, execute the crm-based command, then look at the resulting XML and try to figure out a pcs command that creates an equivalent result. Anyway, those two sets of instructions give very different commands, and I don't really understand either.
>
> Firstly, I'll start with the documentation file included on my system, as I assume that should be the most authoritative. It provides the following two commands as examples:
>
>   crm configure primitive vfencing stonith::external/vcenter params \
>     VI_SERVER="10.1.1.1" VI_CREDSTORE="/etc/vicredentials.xml" \
>     HOSTLIST="hostname1=vmname1;hostname2=vmname2" RESETPOWERON="0" \
>     op monitor interval="60s"
>
>   crm configure clone Fencing vfencing

Hi, the equivalent pcs commands are:

  pcs stonith create vfencing external/vcenter \
    VI_SERVER=10.1.1.1 VI_CREDSTORE=/etc/vicredentials.xml \
    HOSTLIST="hostname1=vmname1;hostname2=vmname2" RESETPOWERON=0 \
    op monitor interval=60s

and

  pcs resource clone vfencing

However, the `op monitor interval=60s` part can be omitted, since pcs takes it from the agent automatically. The second command works, but it does not make much sense, as Andrei already mentioned.

Ivan

> Why is the second line there? What does it do? Is it necessary? Unfortunately the document doesn't give any explanation.
>
> Secondly, the web link above says to add a primitive for each node in the cluster, as well as a location constraint. This seems rather different from the approach above. Which is more correct?
>
> Lastly, searching the web for documentation on how to do this with pcs, I came across https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/configuring_the_red_hat_high_availability_add-on_with_pacemaker/s1-fencedevicecreate-haar - which has yet another totally different way of doing things, by adding a "fencing device". Fiddling around with the fence_vmware command doesn't seem to get me anywhere - how is it related to the external/vcenter module?
>
> So I'm really confused about what I should do, and why there seem to be radically different ways presented, none of which I can easily grasp. I assume these questions are the same regardless of which particular plugin is being used... Is there some good documentation that explains this in better detail and can definitively tell me the best way of going about this, preferably with pcs?
>
> Thank you,
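[Editor's note: a minimal way to sanity-check the device once it has been created, using pcs 0.9.x command forms; "hostname2" below is just a placeholder for one of the HOSTLIST entries, and the last command really reboots that node.]

  # list configured stonith devices and their status
  pcs stonith
  # show the full configuration of the device created above
  pcs stonith show vfencing --full
  # test-fire fencing against one node (it will be rebooted!)
  pcs stonith fence hostname2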
Re: [ClusterLabs] Feedback wanted: changing "master/slave" terminology
Hi,

I think there's enough sentiment for "promoted"/"started" as the role names, since it most directly reflects how pacemaker uses them.

Just a question. The property "role" of a resource operation can have the values "Stopped", "Started" and, in the case of multi-state resources, "Slave" and "Master". What does it mean when the value is "Started"? Does it mean either "Slave" or "Master", or does it mean just "Slave"?

Ivan

For the resources themselves, how about "binary clones"?

On Thu, 2018-01-18 at 10:48 -0600, Ken Gaillot wrote:

On Thu, 2018-01-18 at 08:22 +0100, Ulrich Windl wrote:

Ken Gaillot wrote on 17.01.2018 at 17:04 in message <1516205099.5103.3.ca...@redhat.com>:

On Wed, 2018-01-17 at 08:32 +0100, Ulrich Windl wrote:

Ken Gaillot wrote on 16.01.2018 at 23:33 in message <1516142036.5604.3.ca...@redhat.com>:

As we look to release Pacemaker 2.0 and (separately) update the OCF standard, this is a good time to revisit the terminology and syntax we use for master/slave resources.

I think the term "stateful resource" is a better substitute for "master/slave resource". That would mainly be a documentation change. If there will be exactly two states, it'll be a bi-state resource, and when abandoning the name, you should also abandon names like promote and demote, because they stick to master/slave. So maybe start with describing what a stateful resource is, then talk about names. BTW: all resources we have are "stateful", because they can be in started and stopped states at least ;-)

Good points. A clone is a resource with a configurable number of instances using the same resource configuration. When a clone is stateful, each active

s/the same/a common/  # if they were the same, there could be no differences

Nope, it's identical ... a single configuration in Pacemaker is used to generate all instances. The service's own configuration doesn't change, either. Each instance is either completely identical, and simply running on different nodes, or handles a subset of requests determined by information available at run-time.

instance is in one of two roles at any given time, and Pacemaker

two: just two or at least two?

Exactly two. While it is easy to imagine more than two, or even more complex scenarios (e.g. database server instances can serve as master for certain tables and replicant for other tables), we don't see any demand for managing that via pacemaker, and it would require a complete re-implementation (and someone with the resources to do that).

manages instances' roles via promote and demote actions.

Now try to define what promote and demote do ;-)

A successful call to the resource agent's "start" action must leave the resource in a particular one of the roles (the default role, from the cluster's point of view). A successful "promote" action must move an instance from the default role to the non-default role, and a successful "demote" action must move an instance from the non-default role to the default role. So, it's very generic from the cluster's point of view. Too bad "roleful" isn't a word ;-)

As you mentioned, "state" can more broadly refer to started, stopped, etc., but pacemaker does consider "started in slave role" and "started in master role" as extensions of this, so I don't think "stateful" is too far off the mark.

Maybe also state the purpose of having different roles here, and define what a role, as opposed to a state, is.

That's part of the problem -- the purpose is entirely up to the specific application. Some use it for a master copy of data vs a replicated copy of data, a read/write instance vs a read-only instance, a coordinating function vs an executing function, an active instance vs a hot-spare instance, etc. That's why I like "promoted"/"started" -- it most directly implies "whatever role you get after promote" vs "whatever role you get after start".

It would even be easy to think of the pacemaker daemons themselves as clones. The crmd would be a stateful clone whose non-default role is the DC. The attrd would be a stateful clone whose non-default role is the writer. (It might be "fun" to represent the daemons as resources one day ...)

Separately, clones (whether stateful or not) may be anonymous or unique (i.e. whether it makes sense to start more than one instance on the same node), which confuses things further.

"anonymous clone" should be defined also, just as "unique": aren't all configured resources "unique" (i.e. being different from each other)?

I'm curious about more than two roles, multiple "masters" and multiple "slaves". It's a common model to have one database master and a bunch of replicants, and with most databases having good active/active support these days, it's becoming more common to have multiple masters, with or without separate replicants. It's also common to have one coordinator with multiple workers.

Regards,
Ulrich
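[Editor's note: to make the start/promote/demote semantics above concrete, here is a minimal, hypothetical resource-agent skeleton; it is not a real agent (OCF metadata, validation and persistence are omitted) and the file path and names are placeholders.]

  #!/bin/sh
  # Sketch of a stateful agent's action dispatch.
  # "Slave" is the default role reached by start; "Master" is the non-default role.
  STATE=/var/run/demo-role

  case "$1" in
      start)   echo slave  > "$STATE" ;;   # start must end in the default role
      promote) echo master > "$STATE" ;;   # default role -> non-default role
      demote)  echo slave  > "$STATE" ;;   # non-default role -> default role
      stop)    rm -f "$STATE" ;;
      monitor)
          # a real agent reports the current role via OCF exit codes
          if [ ! -e "$STATE" ]; then
              exit 7                        # OCF_NOT_RUNNING
          elif grep -q master "$STATE"; then
              exit 8                        # OCF_RUNNING_MASTER
          else
              exit 0                        # OCF_SUCCESS (default role)
          fi ;;
  esac
  exit 0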
Re: [ClusterLabs] [Pacemaker on raspberry pi]
Hi,

> # pcs cluster node add pi05 --start --enable
> Disabling SBD service...
> pi05: sbd disabled
> Traceback (most recent call last):
>   File "/usr/sbin/pcs", line 11, in <module>
>     load_entry_point('pcs==0.9.160', 'console_scripts', 'pcs')()
>   File "/usr/lib/python3.6/site-packages/pcs/app.py", line 190, in main
>     cmd_map[command](argv)
>   File "/usr/lib/python3.6/site-packages/pcs/cluster.py", line 218, in cluster_cmd
>     cluster_node(argv)
>   File "/usr/lib/python3.6/site-packages/pcs/cluster.py", line 1674, in cluster_node
>     node_add(lib_env, node0, node1, modifiers)
>   File "/usr/lib/python3.6/site-packages/pcs/cluster.py", line 1857, in node_add
>     allow_incomplete_distribution=modifiers["skip_offline_nodes"]
>   File "/usr/lib/python3.6/site-packages/pcs/lib/commands/remote_node.py", line 58, in _share_authkey
>     node_communication_format.pcmk_authkey_file(authkey_content),
>   File "/usr/lib/python3.6/site-packages/pcs/lib/node_communication_format.py", line 47, in pcmk_authkey_file
>     "pacemaker_remote authkey": pcmk_authkey_format(authkey_content)
>   File "/usr/lib/python3.6/site-packages/pcs/lib/node_communication_format.py", line 29, in pcmk_authkey_format
>     "data": base64.b64encode(authkey_content).decode("utf-8"),
>   File "/usr/lib/python3.6/base64.py", line 58, in b64encode
>     encoded = binascii.b2a_base64(s, newline=False)
> TypeError: a bytes-like object is required, not 'str'
>
> it seems there is a TypeError

Yes - the traceback shows pcs passing the authkey to base64.b64encode() as a str, while it requires bytes. This problem has been fixed in pcs-0.9.163-2.fc27. The package is in testing and will soon be in stable (days to stable: 1).

Ivan
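[Editor's note: for anyone hitting this before the update reaches stable, the fixed build can be pulled from Fedora's testing repository; the repository name below is the standard Fedora one and may differ on other setups.]

  # install the fixed pcs build from the testing repository
  sudo dnf upgrade --enablerepo=updates-testing pcs
  # verify the installed version
  pcs --version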
Re: [ClusterLabs] Does restarting pcsd also restart resources
Hi Jonathan,

the resources will stay up. Resources do not depend on the pcsd daemon's lifecycle.

Ivan

On 1/22/19 5:31 PM, Jonathan Hull wrote:
> A quick question, this is on RHEL 7. If I were to restart the pcs daemon only (pcsd), such as is done automatically when changing the certificate with "pcs pcsd certkey", will this also restart the cluster resources, causing downtime, or do the resources stay up?
>
> Thanks.
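[Editor's note: a quick way to see this for yourself - pcsd is only the management daemon, while resources are run by the pacemaker/corosync stack, so it can be restarted on its own.]

  # restart only the management daemon
  systemctl restart pcsd
  # the cluster stack and resources keep running
  pcs status resources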
Re: [ClusterLabs] pcs constraint order set syntax
Hello Chris,

On 19. 11. 2018 at 23:32, Chris Miller wrote:
> Hello,
>
> I am attempting to add a resource to an existing ordering constraint set. The system in question came pre-configured with PCS (FreePBX HA), and I need to add a resource group (queuemetrics) to the ordering constraint set. Before modification, the existing set is as follows (output from pcs config --full):
>
>   set mysql httpd asterisk sequential=true (id:mysql-httpd-asterisk) setoptions kind=Optional (id:freepbx-start-order)
>
> I'm having issues with the "constraint order set" command syntax, specifically with setting options and IDs. Per the man page and help info, it appears the syntax should be this:
>
>   pcs constraint order set mysql httpd asterisk queuemetrics sequential=true id=mysql-httpd-asterisk setoptions kind=Optional id=freepbx-start-order

According to the pcs man page:
* sequential=true id=mysql-httpd-asterisk are options
* kind=Optional id=freepbx-start-order are constraint_options

Allowed options are: action, require-all, role, sequential. So id=mysql-httpd-asterisk is not valid. For constraint_options it is possible to use id.

However, `pcs constraint order set` only creates a constraint set; unfortunately it is not possible to update an existing constraint. As a workaround you can delete the constraint and create the new one in a single step. Something like this:

  $ pcs cluster cib temp-cib.xml
  $ pcs constraint delete freepbx-start-order -f temp-cib.xml
  $ pcs constraint order set mysql httpd asterisk queuemetrics sequential=true setoptions kind=Optional id=freepbx-start-order -f temp-cib.xml
  $ pcs cluster cib-push temp-cib.xml

> However, when running this command I receive the following error:
>
>   Call cib_replace failed (-203): Update does not conform to the configured schema
>
> I have also tried variations of this syntax, and the ID option specifically is ignored and a dynamically generated name is used instead. I'm not having any luck finding guidance on this specific issue online.
>
> Thanks in advance for your guidance.
>
> Chris

Ivan
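[Editor's note: an optional extra check before pushing the modified CIB in the workaround above; it uses the same temp-cib.xml filename from Ivan's example.]

  # inspect the constraints in the offline CIB before pushing it
  pcs constraint --full -f temp-cib.xml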
Re: [ClusterLabs] pcs 0.10.1 released
On 12/28/18 5:39 AM, digimer wrote:

On 2018-11-26 12:26 p.m., Tomas Jelinek wrote:

I am happy to announce the latest release of pcs, version 0.10.1.

Source code is available at:
https://github.com/ClusterLabs/pcs/archive/0.10.1.tar.gz
or
https://github.com/ClusterLabs/pcs/archive/0.10.1.zip

This is the first final release of the pcs-0.10 branch. Pcs-0.10 is the new main pcs branch supporting Corosync 3.x and Pacemaker 2.x clusters while dropping support for older Corosync and Pacemaker versions. Pcs-0.9, being in maintenance mode, continues to support Corosync 1.x/2.x and Pacemaker 1.x clusters.

Main changes compared to the 0.9 branch:
* Corosync 3.x and Kronosnet are supported, while Corosync 2.x and older as well as CMAN are not
* Node names are now fully supported
* Pacemaker 2.x is supported while Pacemaker 1.x is not
* Promotable clone resources replace master resources; creating master resources is no longer possible but managing existing master resources is supported
* Options starting with '-' and '--' are no longer accepted by commands for which those options have no effect
* Obsoleting parameters of resource and fence agents are now supported and preferred over deprecated parameters
* Several deprecated and / or undocumented pcs commands / options have been removed
* Python 3.6+ and Ruby 2.2+ are now required

Complete change log for this release against 0.9.163:

## [0.10.1] - 2018-11-23

### Removed
- Pcs-0.10 removes support for CMAN, Corosync 1.x, Corosync 2.x and Pacemaker 1.x based clusters. For managing those clusters use pcs-0.9.x.
- Pcs-0.10 requires Python 3.6 and Ruby 2.2; support for older Python and Ruby versions has been removed.
- `pcs resource failcount reset` command has been removed as `pcs resource cleanup` is doing exactly the same job. ([rhbz#1427273])
- Deprecated commands `pcs cluster remote-node add | remove` have been removed as they were replaced with `pcs cluster node add-guest | remove-guest`
- Ability to create master resources has been removed as they are deprecated in Pacemaker 2.x ([rhbz#1542288])
  - Instead of `pcs resource create ... master` use `pcs resource create ... promotable` or `pcs resource create ... clone promotable=true`
  - Instead of `pcs resource master` use `pcs resource promotable` or `pcs resource clone ... promotable=true`
- Deprecated --clone option from `pcs resource create` command
- Ability to manage node attributes with `pcs property set|unset|show` commands (using `--node` option). The same functionality is still available using the `pcs node attribute` command.
- Undocumented version of the `pcs constraint colocation add` command, its syntax was `pcs constraint colocation add [score] [options]`
- Deprecated commands `pcs cluster standby | unstandby`, use `pcs node standby | unstandby` instead
- Deprecated command `pcs cluster quorum unblock` which was replaced by `pcs quorum unblock`
- Subcommand `pcs status groups` as it was not showing a cluster status but cluster configuration. The same functionality is still available using the command `pcs resource group list`
- Undocumented command `pcs acl target`, use `pcs acl user` instead

### Added
- Validation for an inaccessible resource inside a bundle ([rhbz#1462248])
- Options to filter failures by an operation and its interval in `pcs resource cleanup` and `pcs resource failcount show` commands ([rhbz#1427273])
- Commands for listing and testing watchdog devices ([rhbz#1578891])
- Commands for creating promotable clone resources: `pcs resource promotable` and `pcs resource create ... promotable` ([rhbz#1542288])
- `pcs resource update` and `pcs resource meta` commands change master resources to promotable clone resources because master resources are deprecated in Pacemaker 2.x ([rhbz#1542288])
- Support for the `promoted-max` bundle option replacing the `masters` option in Pacemaker 2.x ([rhbz#1542288])
- Support for the OP_NO_RENEGOTIATION option when OpenSSL supports it (even with Python 3.6) ([rhbz#1566430])
- Support for container types `rkt` and `podman` in bundle commands ([rhbz#1619620])
- Support for promotable clone resources in pcsd and web UI ([rhbz#1542288])
- Obsoleting parameters of resource and fence agents are now supported and preferred over deprecated parameters ([rhbz#1436217])
- `pcs status` now shows failed and pending fencing actions and `pcs status --full` shows the whole fencing history. Pacemaker supporting fencing history is required. ([rhbz#1615891])
- `pcs stonith history` commands for displaying, synchronizing and cleaning up fencing history. Pacemaker supporting fencing history is required. ([rhbz#1620190])
- Validation of node existence in a cluster when creating location constraints ([rhbz#1553718])
- Command `pcs client local-auth` for authentication of the pcs client against a local pcsd. This is required when a non-root user wants to execute a command which requires root
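[Editor's note: a minimal sketch of the master-to-promotable syntax change described in the changelog; the resource name "demo-state" and the Stateful dummy agent are placeholders.]

  # pcs 0.9.x / Pacemaker 1.x style (no longer possible in pcs 0.10):
  pcs resource create demo-state ocf:pacemaker:Stateful
  pcs resource master demo-state-master demo-state

  # pcs 0.10.x / Pacemaker 2.x style:
  pcs resource create demo-state ocf:pacemaker:Stateful promotable
  # or, for an already existing resource:
  pcs resource promotable demo-state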
Re: [ClusterLabs] Why do clusters have a name?
On 26. 03. 19 21:12, Brian Reichert wrote:
> This will sound like a dumb question:
>
> The manpage for pcs(8) implies that to set up a cluster, one needs to provide a name.
>
> Why do clusters have names? Is there a use case wherein there would be multiple clusters visible in an administrative UI, such that they'd need to be differentiated?

For example, the pcs web UI has a page showing multiple clusters, and they are distinguished there by name.

Ivan
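[Editor's note: for reference, the name is given at cluster creation time and ends up in corosync.conf; the node names below are placeholders and the exact syntax depends on the pcs version.]

  # pcs 0.9.x
  pcs cluster setup --name mycluster node1 node2
  # pcs 0.10.x
  pcs cluster setup mycluster node1 node2
  # the name is stored as cluster_name in the corosync configuration
  grep cluster_name /etc/corosync/corosync.conf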