Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Klaus Wenninger
On 09/20/2016 10:25 PM, Ken Gaillot wrote: > Hi everybody, > > Currently, Pacemaker's on-fail property allows you to configure how the > cluster reacts to operation failures. The default "restart" means try to > restart on the same node, optionally moving to another node once > migration-threshold

[ClusterLabs] kind=Optional order constraint not working at startup

2016-09-21 Thread Auer, Jens
Hi, in my cluster setup I have a couple of resources, some of which I need to start in a specific order. Basically I have two cloned resources that should start after mounting a DRBD filesystem on all nodes, plus one resource that starts after the clone sets. It is important that this only

Re: [ClusterLabs] kind=Optional order constraint not working at startup

2016-09-21 Thread Auer, Jens
Hi, could this be issue 5039 (http://bugs.clusterlabs.org/show_bug.cgi?id=5039)? It sounds similar. Cheers, Jens -- Jens Auer | CGI | Software-Engineer CGI (Germany) GmbH & Co. KG Rheinstraße 95 | 64295 Darmstadt | Germany T: +49 6151 36860 154 jens.a...@cgi.com Our mandatory disclosures pursuant to §

Re: [ClusterLabs] best practice fencing with ipmi in 2node-setups / cloneresource/monitor/timeout

2016-09-21 Thread Ken Gaillot
On 09/21/2016 01:51 AM, Stefan Bauer wrote: > Hi Ken, > > let me sum it up: > > Pacemaker in recent versions is smart enough to run (trigger, execute) the > fence operation on the node that is not the target. > > If I have an external stonith device that can fence multiple nodes, a single >

[ClusterLabs] Authoritative corosync's location (Was: corosync-quorum tool, output name key on Name column if set?)

2016-09-21 Thread Jan Pokorný
On 21/09/16 09:16 +0200, Jan Friesse wrote: > Thomas Lamprecht wrote: >> I have also another, organizational question. I saw on the GitHub page from >> corosync that pull requests there are preferred, and also that the > > True At this point, it's worth noting that ClusterLabs/corosync is

Re: [ClusterLabs] kind=Optional order constraint not working at startup

2016-09-21 Thread Auer, Jens
Hi, > shared_fs has to wait for the DRBD promotion, but the other resources > have no such limitation, so they are free to start before shared_fs. Isn't there an implicit limitation by the ordering constraint? I have drbd_promote < shared_fs < snmpAgent-clone, and I would expect this to be a

Re: [ClusterLabs] kind=Optional order constraint not working at startup

2016-09-21 Thread Ken Gaillot
On 09/21/2016 09:00 AM, Auer, Jens wrote: > Hi, > > could this be issue 5039 (http://bugs.clusterlabs.org/show_bug.cgi?id=5039)? > It sounds similar. Correct -- "Optional" means honor the constraint only if both resources are starting *in the same transition*. shared_fs has to wait for the

Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Ken Gaillot
On 09/21/2016 02:23 AM, Kristoffer Grönlund wrote: > First of all, is there a use case for when fence-after-3-failures is a > useful behavior? I seem to recall some case where someone expected that > to be the behavior and was surprised by how pacemaker works, but that > problem wouldn't be

Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Ken Gaillot
On 09/20/2016 07:51 PM, Andrew Beekhof wrote: > > > On Wed, Sep 21, 2016 at 6:25 AM, Ken Gaillot > wrote: > > Hi everybody, > > Currently, Pacemaker's on-fail property allows you to configure how the > cluster reacts to operation

Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Kristoffer Grönlund
Ken Gaillot writes: > Hi everybody, > > Currently, Pacemaker's on-fail property allows you to configure how the > cluster reacts to operation failures. The default "restart" means try to > restart on the same node, optionally moving to another node once > migration-threshold

Re: [ClusterLabs] corosync-quorum tool, output name key on Name column if set?

2016-09-21 Thread Jan Friesse
Thomas Lamprecht wrote: On 09/20/2016 12:36 PM, Christine Caulfield wrote: On 20/09/16 10:46, Thomas Lamprecht wrote: Hi, when I'm using corosync-quorumtool [-l] and have my ring0_addr set to an IP address, which does not resolve to a hostname, I get the nodes' IP addresses for the 'Name'

Re: [ClusterLabs] Force Unmount - SLES 11 SP4

2016-09-21 Thread Kristoffer Grönlund
Jorge Fábregas writes: > Hi, > > I have an issue while shutting down one of our clusters. The unmounting > of an OCFS2 filesystem (ocf:heartbeat:Filesystem) is triggering a node > fence (accordingly). This is because the script for stopping the > application is not

Re: [ClusterLabs] RFC: allowing soft recovery attempts before ignore/block/etc.

2016-09-21 Thread Kristoffer Grönlund
Kristoffer Grönlund writes: > If implementing the first option, I would prefer to keep the behavior of > migration-threshold of counting all failures, not just > monitors. Otherwise there would be two closely related thresholds with > subtly divergent behavior, which seems

Re: [ClusterLabs] corosync-quorum tool, output name key on Name column if set?

2016-09-21 Thread Thomas Lamprecht
On 09/20/2016 12:36 PM, Christine Caulfield wrote: On 20/09/16 10:46, Thomas Lamprecht wrote: Hi, when I'm using corosync-quorumtool [-l] and have my ring0_addr set to an IP address, which does not resolve to a hostname, I get the nodes' IP addresses for the 'Name' column. As I'm using the