Re: [ClusterLabs] dead cluster after centos update
On Mon, 2017-10-23 at 15:02 -0500, Dimitri Maziuk wrote:
> I've a 2-node ZFS cluster that was working fine until redhat improved
> my user experience. Now, it's dead.
>
> How do I get it back?
>
> There were two resources: ZFS and IP. There was fence_scsi that is now
> logging
> > Oct 23 14:22:45 hereland fence_scsi: Failed: nodename or key is
> > required
> > Oct 23 14:22:45 hereland fence_scsi: Please use '-h' for usage
> > Oct 23 14:22:45 hereland stonith-ng[1753]: warning:
> > fence_scsi[1929] stderr: [ Failed: nodename or key is required ]

I'm not familiar with fence_scsi, but the above looks like the issue.
Does your fence device configuration have either nodename or key?

If your existing configuration was working previously and now isn't,
open a support ticket with Red Hat, as it sounds like a regression.

> > Oct 23 14:22:45 hereland stonith-ng[1753]: warning:
> > fence_scsi[1929] stderr: [ ]
> > Oct 23 14:22:45 hereland stonith-ng[1753]: warning:
> > fence_scsi[1929] stderr: [ Please use '-h' for usage ]
> > Oct 23 14:22:45 hereland stonith-ng[1753]: warning:
> > fence_scsi[1929] stderr: [ ]
> > Oct 23 14:22:45 hereland stonith-ng[1753]: notice: Disabling port
> > list queries for fence-tank (-201): (null)
> > Oct 23 14:22:45 hereland stonith-ng[1753]: notice: Operation on of
> > hereland-eth1 by for crmd.1784@flemish-eth1.3e4dd918: No
> > such device
> > Oct 23 14:22:45 hereland crmd[1757]: error: Unfencing of
> > hereland-eth1 by failed: No such device (-19)

Probably due to the configuration issue, unfencing fails. fence_scsi
requires unfencing, which in its case means allowing the node access
to the disk. Since that fails, nothing else can proceed.

> I disabled it for now, and
>
>   pcs resource debug-start resource-zfs --full
>
> works fine: the pool is imported, filesystems are mounted and
> exported -- but the resources remain stopped no matter what.
>
> I don't see anything useful in the logs. How do I unfsck this mess?
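For reference, a fence_scsi configuration normally identifies the shared
device(s) and enables unfencing via the "provides" meta-attribute. A
hedged sketch only: the stonith resource name matches the logs above,
but the device path and host names here are illustrative, not taken
from the poster's cluster.

```shell
# Illustrative only: replace the device path with your actual shared
# disk (a stable /dev/disk/by-id/ path is recommended).
pcs stonith create fence-tank fence_scsi \
    pcmk_host_list="hereland-eth1 flemish-eth1" \
    devices="/dev/disk/by-id/wwn-0xEXAMPLE" \
    meta provides=unfencing
```

The key or nodename mentioned in the error is normally derived
automatically from the cluster node name, which is why a missing or
mismatched device/host configuration can surface as that message.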
-- 
Ken Gaillot

___
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users
Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
[ClusterLabs] Pacemaker 1.1.18 Release Candidate 3
The third release candidate for Pacemaker version 1.1.18 is now
available at:

https://github.com/ClusterLabs/pacemaker/releases/tag/Pacemaker-1.1.18-rc3

This pre-release fixes a few minor bugs. For details, see the
ChangeLog.

This is likely to be the last release candidate before the final
release next week. Any testing you can do is very welcome.

-- 
Ken Gaillot
Re: [ClusterLabs] (Not) Coming in 1.1.18: deprecating stonith-enabled
Well, this turned out to be trickier than initially imagined.

It's actually related to a broader issue in the policy engine that has
only been addressed piecemeal: we need to make sure that a given piece
of information is learned before there is a need to use it. To properly
deprecate stonith-enabled, we'll have to move more of the
information-gathering to the beginning of the policy engine's process.
As part of that, I'll probably try to define more clearly the
information that can be relied on at any given point.

In any case, that's a bigger project than the 1.1.18 (or 2.0.0) time
frame.

On Mon, 2017-09-25 at 18:53 -0500, Ken Gaillot wrote:
> Hi all,
>
> I thought I'd call attention to one of the most visible deprecations
> coming in 1.1.18: stonith-enabled. In order to deprecate that option,
> we have to provide an alternate way to do the things that it does.
>
> stonith-enabled determines whether a resource's "requires"
> meta-attribute defaults to "quorum" or "fencing". This already has an
> alternate method, the rsc_defaults section.
>
> For everything else, e.g. whether to fence misbehaving nodes, and
> whether to start resources when fencing hasn't been configured, the
> cluster will now check additional criteria.
>
> This is my plan at the moment:
>
> Fencing will be considered possible in a configuration if:
> "no-quorum-policy" is "suicide", any resource has "requires" set to
> "unfencing" or "fencing" (the default), any operation has "on-fail"
> set to "fence" (the default for stop operations), or any fence
> resource has been configured.
>
> If fencing is not possible, the cluster will behave as if
> stonith-enabled is false (even if it's not).

-- 
Ken Gaillot
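The quoted criteria can be sketched as a simple predicate. This is a
hedged Python illustration of the stated logic only, not Pacemaker's
actual implementation (which is in C); all names here are invented.

```python
# Illustration of the "fencing will be considered possible" criteria
# described in the plan above. Invented names, not Pacemaker's code.

def fencing_possible(no_quorum_policy, resource_requires,
                     op_on_fail_values, fence_resources):
    """Return True if the configuration implies fencing is intended."""
    if no_quorum_policy == "suicide":
        return True
    # Any resource with requires="fencing" (the default) or "unfencing"
    if any(r in ("fencing", "unfencing") for r in resource_requires):
        return True
    # Any operation with on-fail="fence" (the default for stops)
    if "fence" in op_on_fail_values:
        return True
    # Any fence (stonith) resource configured
    return len(fence_resources) > 0

# If this returns False, the cluster would behave as if
# stonith-enabled were false, even when it is set to true.
```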
[ClusterLabs] dead cluster after centos update
I've a 2-node ZFS cluster that was working fine until redhat improved
my user experience. Now, it's dead.

How do I get it back?

There were two resources: ZFS and IP. There was fence_scsi that is now
logging

> Oct 23 14:22:45 hereland fence_scsi: Failed: nodename or key is required
> Oct 23 14:22:45 hereland fence_scsi: Please use '-h' for usage
> Oct 23 14:22:45 hereland stonith-ng[1753]: warning: fence_scsi[1929] stderr:
> [ Failed: nodename or key is required ]
> Oct 23 14:22:45 hereland stonith-ng[1753]: warning: fence_scsi[1929] stderr:
> [ ]
> Oct 23 14:22:45 hereland stonith-ng[1753]: warning: fence_scsi[1929] stderr:
> [ Please use '-h' for usage ]
> Oct 23 14:22:45 hereland stonith-ng[1753]: warning: fence_scsi[1929] stderr:
> [ ]
> Oct 23 14:22:45 hereland stonith-ng[1753]: notice: Disabling port list
> queries for fence-tank (-201): (null)
> Oct 23 14:22:45 hereland stonith-ng[1753]: notice: Operation on of
> hereland-eth1 by for crmd.1784@flemish-eth1.3e4dd918: No such device
> Oct 23 14:22:45 hereland crmd[1757]: error: Unfencing of hereland-eth1 by
> failed: No such device (-19)

I disabled it for now, and

  pcs resource debug-start resource-zfs --full

works fine: the pool is imported, filesystems are mounted and exported
-- but the resources remain stopped no matter what.

I don't see anything useful in the logs. How do I unfsck this mess?

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu
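For readers hitting the same symptom (debug-start works but the cluster
keeps the resource stopped), a few standard diagnostic commands help
explain what the cluster itself intends to do. A hedged sketch; the
resource name below comes from this thread, adjust for your own
cluster.

```shell
# Show full cluster state, including failed actions and disabled resources
pcs status --full

# Clear stale failure history so the cluster will retry the resource
pcs resource cleanup resource-zfs

# Dry-run the policy engine against the live cluster, showing scores
crm_simulate -sL
```

Note that debug-start bypasses the policy engine entirely, so it can
succeed even when the cluster refuses to start the resource, e.g.
because a required fencing/unfencing step has failed as in the logs
above.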