Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1
On 11/04/2016 02:29 AM, Ulrich Windl wrote:
> Ken Gaillot schrieb am 03.11.2016 um 17:08 in Nachricht
> <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:
>> ClusterLabs is happy to announce the first release candidate for
>> Pacemaker version 1.1.16. Source code is available at:
>>
>> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.16-rc1
>>
>> The most significant enhancements in this release are:
>>
>> * rsc-pattern may now be used instead of rsc in location constraints,
>> to allow a single location constraint to apply to all resources whose
>> names match a regular expression. Sed-like %0 - %9 backreferences let
>> submatches be used in node attribute names in rules.
>>
>> * The new ocf:pacemaker:attribute resource agent sets a node attribute
>> according to whether the resource is running or stopped. This may be
>> useful in combination with attribute-based rules to model dependencies
>> that simple constraints can't handle.
>
> I don't quite understand this: Isn't the state of a resource in the CIB
> status section anyway? If not, why not add it? So it would be readily
> available for anyone (rules, constraints, etc.).

This (hopefully) lets you model more complicated relationships. For
example, someone recently asked whether they could make an ordering
constraint apply only at "start-up" -- the first time resource A starts,
it does some initialization that B needs, but once that's done, B can be
independent of A.

For that case, you could group A with an ocf:pacemaker:attribute
resource. The important part is that the attribute is not set if A has
never run on a node. So, you can make a rule that B can run only where
the attribute is set, regardless of the value -- even if A is later
stopped, the attribute will still be set.

Another possible use would be for a cron job that needs to know whether
a particular resource is running; an attribute query is quicker and
easier than something like parsing crm_mon output or probing the service.
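The start-up dependency described above could be sketched roughly like
this with pcs. The resource names (rscA, rscB), the attribute name
(rscA_ran), and the agent's "name" parameter are illustrative
assumptions -- check the agent's metadata for its actual parameters:

```shell
# Group A with the attribute agent, so the attribute gets set the first
# time the group starts on a node ("name" parameter assumed from the
# agent's description; rscA/rscB/rscA_ran are made-up names):
pcs resource create rscA-ran ocf:pacemaker:attribute name=rscA_ran
pcs resource group add grpA rscA rscA-ran

# Keep B off any node where the attribute has never been set. Once A
# has run there, the attribute exists (whatever its value), so this
# rule no longer constrains B even if A is later stopped:
pcs constraint location rscB rule score=-INFINITY not_defined rscA_ran
```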
It's all theoretical at this point, and I'm not entirely sure those
examples would be useful :) but I wanted to make the agent available for
people to experiment with.

>> * Pacemaker's existing "node health" feature allows resources to move
>> off nodes that become unhealthy. Now, when using
>> node-health-strategy=progressive, a new cluster property
>> node-health-base will be used as the initial health score of newly
>> joined nodes (defaulting to 0, which is the previous behavior). This
>> allows cloned and multistate resource instances to start on a node
>> even if it has some "yellow" health attributes.
>
> So the node health is more or less a "node score"? I don't understand
> the last sentence. Maybe give an example?

Yes, node health is a score that's added when deciding where to place a
resource. It does get complicated ...

Node health monitoring is optional, and off by default. Node health
attributes are set to red, yellow or green (outside Pacemaker itself --
either by a resource agent, or by some external process).

As an example, let's say we have three node health attributes, for CPU
usage, CPU temperature, and SMART error count. With a progressive
strategy, red and yellow are assigned some negative score, and green is
0. In our example, let's say yellow gets a -10 score.

If any of our attributes are yellow, resources will avoid the node
(unless they have higher positive scores from something like stickiness
or a location constraint). Normally, this is what you want, but if your
resources are cloned on all nodes, maybe you don't care if some
attributes are yellow. In that case, you can set node-health-base=20, so
even if two attributes are yellow, it won't prevent resources from
running (20 + -10 + -10 = 0).

There is nothing in node-health-base that is specific to clones; that's
just the most likely use case.

>> * Previously, the OCF_RESKEY_CRM_meta_notify_active_* variables were
>> not properly passed to multistate resources with notification enabled.
>> This has been fixed. To help resource agents detect when the fix is
>> available, the CRM feature set has been incremented. (Whenever the
>> feature set changes, mixed-version clusters are supported only during
>> rolling upgrades -- nodes with an older version will not be allowed to
>> rejoin once they shut down.)
>
> Where can I find a description of current "CRM feature sets"?
>
> Ulrich
>
>>
>> * Watchdog-based fencing using sbd now works on remote nodes.
>>
>> * The build process now takes advantage of various compiler features
>> (RELRO, PIE, as-needed linking, etc.) that enhance security and
>> start-up performance. See the "Hardening flags" comments in the
>> configure.ac file for more details.
>>
>> * Python 3 compatibility: The Pacemaker project now targets
>> compatibility with both python 2 (versions 2.6 and later) and python 3
>> (versions 3.2 and later). All of the project's python code now meets
>> this target, with the exception of CTS, which is still python 2 only.
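The node-health arithmetic from earlier in this message (20 + -10 + -10
= 0) can be sketched as plain code. This is an illustration of the
progressive-strategy bookkeeping, not Pacemaker's actual implementation;
the penalty values are the ones assumed in the example:

```python
# Example health-to-score mapping for a progressive strategy
# (values chosen for illustration; red's penalty is configurable too).
HEALTH_SCORES = {"green": 0, "yellow": -10, "red": -1000}

def node_health(attribute_colors, node_health_base=0):
    """Total health score for a node: node-health-base plus one
    penalty per node health attribute."""
    return node_health_base + sum(HEALTH_SCORES[c] for c in attribute_colors)

# Two yellow attributes with the default base push resources away...
assert node_health(["yellow", "yellow", "green"]) == -20
# ...but node-health-base=20 cancels them out (20 + -10 + -10 = 0),
# so cloned instances can still run there.
assert node_health(["yellow", "yellow", "green"], node_health_base=20) == 0
```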
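A resource agent that wants to detect the fix sees the feature set in
the OCF_RESKEY_crm_feature_set environment variable. A minimal sketch,
assuming 3.0.11 is the feature set introduced with this fix (check the
release calendar to confirm); the comparison function is a simplified
stand-in for the ocf_version_cmp helper in resource-agents'
ocf-shellfuncs:

```shell
# True if dotted numeric version $1 >= $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -t. -k1,1n -k2,2n -k3,3n | head -n1)" = "$2" ]
}

# Gate agent behavior on the CRM feature set the cluster exports.
if version_ge "${OCF_RESKEY_crm_feature_set:-0}" "3.0.11"; then
    echo "notify_active_* variables are passed correctly"
else
    echo "older CRM feature set: do not rely on notify_active_*"
fi
```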
Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1
On 11/04/2016 02:53 AM, Jan Pokorný wrote:
> On 04/11/16 08:29 +0100, Ulrich Windl wrote:
>> Ken Gaillot schrieb am 03.11.2016 um 17:08 in Nachricht
>> <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:
>>> ClusterLabs is happy to announce the first release candidate for
>>> Pacemaker version 1.1.16. Source code is available at:
>>>
>>> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.16-rc1
>>>
>>> The most significant enhancements in this release are:
>>>
>>> [...]
>>>
>>> * Previously, the OCF_RESKEY_CRM_meta_notify_active_* variables were
>>> not properly passed to multistate resources with notification
>>> enabled. This has been fixed. To help resource agents detect when the
>>> fix is available, the CRM feature set has been incremented. (Whenever
>>> the feature set changes, mixed-version clusters are supported only
>>> during rolling upgrades -- nodes with an older version will not be
>>> allowed to rejoin once they shut down.)
>>
>> Where can I find a description of current "CRM feature sets"?
>
> It's originally internal-only versioning for the cluster to know which
> node has the oldest software and hence is predestined to be DC
> (in a rolling update scenario).
>
> Ken recently mapped these versions (together with the LRMD protocol
> versions relevant in the context of pacemaker remote communication)
> to proper release versions: http://clusterlabs.org/wiki/ReleaseCalendar

There is also a description of how the cluster uses the CRM feature set
in the new upgrade documentation:

http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Explained/index.html#_rolling_node_by_node

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
[ClusterLabs] DRBD demote/promote not called - Why? How to fix?
Hi,

I have a basic 2-node active/passive cluster with Pacemaker (1.1.14,
pcs: 0.9.148) / CMAN (3.0.12.1) / Corosync (1.4.7) on RHEL 6.8. This
cluster runs NFS on top of DRBD (8.4.4).

Basically the system is working on both nodes and I can switch the
resources from one node to the other. But switching resources to the
other node does not work if I try to move just one resource and have
the others follow due to the location constraints.

From the logged messages I see that in this "failure case" there is NO
attempt to demote/promote the DRBD clone resource.

Here is my setup:
==
Cluster Name: clst1

Corosync Nodes:
 ventsi-clst1-sync ventsi-clst2-sync
Pacemaker Nodes:
 ventsi-clst1-sync ventsi-clst2-sync

Resources:
 Resource: IPaddrNFS (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=xxx.xxx.xxx.xxx cidr_netmask=24
  Operations: start interval=0s timeout=20s (IPaddrNFS-start-interval-0s)
              stop interval=0s timeout=20s (IPaddrNFS-stop-interval-0s)
              monitor interval=5s (IPaddrNFS-monitor-interval-5s)
 Resource: NFSServer (class=ocf provider=heartbeat type=nfsserver)
  Attributes: nfs_shared_infodir=/var/lib/nfsserversettings/ nfs_ip=xxx.xxx.xxx.xxx nfsd_args="-H xxx.xxx.xxx.xxx"
  Operations: start interval=0s timeout=40 (NFSServer-start-interval-0s)
              stop interval=0s timeout=20s (NFSServer-stop-interval-0s)
              monitor interval=10s timeout=20s (NFSServer-monitor-interval-10s)
 Master: DRBDClone
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
  Resource: DRBD (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=nfsdata
   Operations: start interval=0s timeout=240 (DRBD-start-interval-0s)
               promote interval=0s timeout=90 (DRBD-promote-interval-0s)
               demote interval=0s timeout=90 (DRBD-demote-interval-0s)
               stop interval=0s timeout=100 (DRBD-stop-interval-0s)
               monitor interval=1s timeout=5 (DRBD-monitor-interval-1s)
 Resource: DRBD_global_clst (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/drbd1 directory=/drbdmnts/global_clst fstype=ext4
  Operations: start interval=0s timeout=60 (DRBD_global_clst-start-interval-0s)
              stop interval=0s timeout=60 (DRBD_global_clst-stop-interval-0s)
              monitor interval=20 timeout=40 (DRBD_global_clst-monitor-interval-20)

Stonith Devices:
 Resource: ipmi-fence-clst1 (class=stonith type=fence_ipmilan)
  Attributes: lanplus=1 login=foo passwd=bar action=reboot ipaddr=yyy.yyy.yyy.yyy pcmk_host_check=static-list pcmk_host_list=ventsi-clst1-sync auth=password timeout=30 cipher=1
  Operations: monitor interval=60s (ipmi-fence-clst1-monitor-interval-60s)
 Resource: ipmi-fence-clst2 (class=stonith type=fence_ipmilan)
  Attributes: lanplus=1 login=foo passwd=bar action=reboot ipaddr=zzz.zzz.zzz.zzz pcmk_host_check=static-list pcmk_host_list=ventsi-clst2-sync auth=password timeout=30 cipher=1
  Operations: monitor interval=60s (ipmi-fence-clst2-monitor-interval-60s)
Fencing Levels:

Location Constraints:
  Resource: ipmi-fence-clst1
    Disabled on: ventsi-clst1-sync (score:-INFINITY) (id:location-ipmi-fence-clst1-ventsi-clst1-sync--INFINITY)
  Resource: ipmi-fence-clst2
    Disabled on: ventsi-clst2-sync (score:-INFINITY) (id:location-ipmi-fence-clst2-ventsi-clst2-sync--INFINITY)
Ordering Constraints:
  start IPaddrNFS then start NFSServer (kind:Mandatory) (id:order-IPaddrNFS-NFSServer-mandatory)
  promote DRBDClone then start DRBD_global_clst (kind:Mandatory) (id:order-DRBDClone-DRBD_global_clst-mandatory)
  start DRBD_global_clst then start IPaddrNFS (kind:Mandatory) (id:order-DRBD_global_clst-IPaddrNFS-mandatory)
Colocation Constraints:
  NFSServer with IPaddrNFS (score:INFINITY) (id:colocation-NFSServer-IPaddrNFS-INFINITY)
  DRBD_global_clst with DRBDClone (score:INFINITY) (id:colocation-DRBD_global_clst-DRBDClone-INFINITY)
  IPaddrNFS with DRBD_global_clst (score:INFINITY) (id:colocation-IPaddrNFS-DRBD_global_clst-INFINITY)

Resources Defaults:
 resource-stickiness: INFINITY
Operations Defaults:
 timeout: 10s

Cluster Properties:
 cluster-infrastructure: cman
 dc-version: 1.1.14-8.el6-70404b0
 have-watchdog: false
 last-lrm-refresh: 1478277432
 no-quorum-policy: ignore
 stonith-enabled: true
 symmetric-cluster: true
==

Initial state is e.g. this (all resources at node1):

Online: [ ventsi-clst1-sync ventsi-clst2-sync ]

Full list of resources:

 ipmi-fence-clst1 (stonith:fence_ipmilan): Started ventsi-clst2-sync
 ipmi-fence-clst2 (stonith:fence_ipmilan): Started ventsi-clst1-sync
 IPaddrNFS (ocf::heartbeat:IPaddr2): Started ventsi-clst1-sync
 NFSServer (ocf::heartbeat:nfsserver): Started ventsi-clst1-sync
 Master/Slave Set: DRBDClone [DRBD]
     Masters: [ ventsi-clst1-sync ]
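For questions like this, a generic first debugging step (not an answer
specific to this configuration) is to ask Pacemaker to show the
placement scores and the transition it would compute, and to re-check
the constraints as the CIB actually records them:

```shell
# Show scores and planned actions for the current live cluster state
# (run on any cluster node); this reveals why no demote/promote is
# being scheduled:
crm_simulate --live-check --show-scores

# Dump all constraints with their IDs as stored in the CIB:
pcs constraint show --full
```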
Re: [ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1
On 04/11/16 08:29 +0100, Ulrich Windl wrote:
> Ken Gaillot schrieb am 03.11.2016 um 17:08 in Nachricht
> <8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:
>> ClusterLabs is happy to announce the first release candidate for
>> Pacemaker version 1.1.16. Source code is available at:
>>
>> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.16-rc1
>>
>> The most significant enhancements in this release are:
>>
>> [...]
>>
>> * Previously, the OCF_RESKEY_CRM_meta_notify_active_* variables were
>> not properly passed to multistate resources with notification enabled.
>> This has been fixed. To help resource agents detect when the fix is
>> available, the CRM feature set has been incremented. (Whenever the
>> feature set changes, mixed-version clusters are supported only during
>> rolling upgrades -- nodes with an older version will not be allowed to
>> rejoin once they shut down.)
>
> Where can I find a description of current "CRM feature sets"?

It's originally internal-only versioning for the cluster to know which
node has the oldest software and hence is predestined to be DC
(in a rolling update scenario).

Ken recently mapped these versions (together with the LRMD protocol
versions relevant in the context of pacemaker remote communication)
to proper release versions: http://clusterlabs.org/wiki/ReleaseCalendar

--
Jan (Poki)
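The DC-selection rule described above can be sketched as plain code.
This is an illustration of the rule as stated (lowest CRM feature set
wins), not Pacemaker's actual election code:

```python
# Given a mapping of node name -> CRM feature set string, pick the
# node running the oldest software, which is preferred as DC in a
# mixed-version (rolling upgrade) cluster.
def oldest_feature_set(nodes):
    def as_tuple(version):
        # "3.0.11" -> (3, 0, 11), so versions compare numerically.
        return tuple(int(part) for part in version.split("."))
    return min(nodes, key=lambda name: as_tuple(nodes[name]))

# node2 still runs the older feature set, so it is the DC candidate:
assert oldest_feature_set({"node1": "3.0.11", "node2": "3.0.10"}) == "node2"
```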
[ClusterLabs] Antw: Pacemaker 1.1.16 - Release Candidate 1
>>> Ken Gaillot schrieb am 03.11.2016 um 17:08 in Nachricht
<8af2ff98-05fd-a2c7-f670-58d0ff68e...@redhat.com>:
> ClusterLabs is happy to announce the first release candidate for
> Pacemaker version 1.1.16. Source code is available at:
>
> https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.16-rc1
>
> The most significant enhancements in this release are:
>
> * rsc-pattern may now be used instead of rsc in location constraints,
> to allow a single location constraint to apply to all resources whose
> names match a regular expression. Sed-like %0 - %9 backreferences let
> submatches be used in node attribute names in rules.
>
> * The new ocf:pacemaker:attribute resource agent sets a node attribute
> according to whether the resource is running or stopped. This may be
> useful in combination with attribute-based rules to model dependencies
> that simple constraints can't handle.

I don't quite understand this: Isn't the state of a resource in the CIB
status section anyway? If not, why not add it? So it would be readily
available for anyone (rules, constraints, etc.).

> * Pacemaker's existing "node health" feature allows resources to move
> off nodes that become unhealthy. Now, when using
> node-health-strategy=progressive, a new cluster property
> node-health-base will be used as the initial health score of newly
> joined nodes (defaulting to 0, which is the previous behavior). This
> allows cloned and multistate resource instances to start on a node
> even if it has some "yellow" health attributes.

So the node health is more or less a "node score"? I don't understand
the last sentence. Maybe give an example?

> * Previously, the OCF_RESKEY_CRM_meta_notify_active_* variables were
> not properly passed to multistate resources with notification enabled.
> This has been fixed. To help resource agents detect when the fix is
> available, the CRM feature set has been incremented. (Whenever the
> feature set changes, mixed-version clusters are supported only during
> rolling upgrades -- nodes with an older version will not be allowed to
> rejoin once they shut down.)

Where can I find a description of current "CRM feature sets"?

Ulrich

> * Watchdog-based fencing using sbd now works on remote nodes.
>
> * The build process now takes advantage of various compiler features
> (RELRO, PIE, as-needed linking, etc.) that enhance security and
> start-up performance. See the "Hardening flags" comments in the
> configure.ac file for more details.
>
> * Python 3 compatibility: The Pacemaker project now targets
> compatibility with both python 2 (versions 2.6 and later) and python 3
> (versions 3.2 and later). All of the project's python code now meets
> this target, with the exception of CTS, which is still python 2 only.
>
> * The Pacemaker coding guidelines have been replaced by a more
> comprehensive addition to the documentation set, "Pacemaker
> Development". It is intended for developers working on the Pacemaker
> code base itself, rather than external code such as resource agents.
> A copy is viewable at
> http://clusterlabs.org/doc/en-US/Pacemaker/1.1-pcs/html-single/Pacemaker_Development/index.html
>
> As usual, the release includes many bugfixes, including a fix for a
> serious security vulnerability (CVE-2016-7035). For a more detailed
> list of changes, see the change log:
>
> https://github.com/ClusterLabs/pacemaker/blob/1.1/ChangeLog
>
> Everyone is encouraged to download, compile and test the new release.
> We do many regression tests and simulations, but we can't cover all
> possible use cases, so your feedback is important and appreciated. Due
> to the security fix, I intend to keep this release cycle short, so
> quick testing feedback is especially appreciated.
> Many thanks to all contributors of source code to this release,
> including Andrew Beekhof, Bin Liu, Christian Schneider, Christoph
> Berg, David Shane Holden, Ferenc Wágner, Yan Gao, Hideo Yamauchi,
> Jan Pokorný, Ken Gaillot, Klaus Wenninger, Kostiantyn Ponomarenko,
> Kristoffer Grönlund, Lars Ellenberg, Masatake Yamato, Michal Koutný,
> Nakahira Kazutomo, Nate Clark, Nishanth Aravamudan, Oyvind Albrigtsen,
> Ruben Kerkhof, Tim Bishop, Vladislav Bogdanov and Yusuke Iida.
> Apologies if I have overlooked anyone.
> --
> Ken Gaillot