Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Ferenc Wágner
Andrei Borzenkov writes: > On Wed, Mar 16, 2016 at 2:22 PM, Ferenc Wágner wrote: > >> Pacemaker explained says about this cluster option: >> >> Advanced Use Only: Should the cluster shoot unseen nodes? Not using >> the default is very unsafe! >> >> 1.

Re: [ClusterLabs] reproducible split brain

2016-03-19 Thread Ken Gaillot
On 03/16/2016 03:04 PM, Christopher Harvey wrote: > On Wed, Mar 16, 2016, at 04:00 PM, Digimer wrote: >> On 16/03/16 03:59 PM, Christopher Harvey wrote: >>> I am able to create a split brain situation in corosync 1.1.13 using >>> iptables in a 3 node cluster. >>> >>> I have 3 nodes, vmr-132-3,

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Digimer
On 17/03/16 07:30 PM, Christopher Harvey wrote: > On Thu, Mar 17, 2016, at 06:24 PM, Ken Gaillot wrote: >> On 03/17/2016 05:10 PM, Christopher Harvey wrote: >>> If I ignore pacemaker's existence, and just run corosync, corosync >>> disagrees about node membership in the situation presented in the

Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Andrei Borzenkov
On Wed, Mar 16, 2016 at 4:18 PM, Lars Ellenberg wrote: > On Wed, Mar 16, 2016 at 01:47:52PM +0100, Ferenc Wágner wrote: >> >> And some more about fencing: >> >> >> >> 3. What's the difference in cluster behavior between >> >>- stonith-enabled=FALSE (9.3.2: how often

Re: [ClusterLabs] attrd: Fix sigsegv on exit if initialization failed

2016-03-19 Thread Ken Gaillot
On 10/12/2015 06:08 AM, Vladislav Bogdanov wrote: > Hi, > > This was caught with 0.17.1 libqb, which didn't play well with long pids. > > commit 180a943846b6d94c27b9b984b039ac0465df64da > Author: Vladislav Bogdanov > Date: Mon Oct 12 11:05:29 2015 + > > attrd:

Re: [ClusterLabs] PCS, Corosync, Pacemaker, and Bind (Ken Gaillot)

2016-03-19 Thread Andrei Borzenkov
On Wed, Mar 16, 2016 at 9:35 PM, Mike Bernhardt wrote: > I guess I have to say "never mind!" I don't know what the problem was > yesterday, but it loads just fine today, even when the named config and the > virtual ip don't match! But for your edamacation, ifconfig does NOT

[ClusterLabs] Moving resources and implicit bans - please explain?

2016-03-19 Thread Matthew Mucker
I have set up my first three-node Pacemaker cluster and was doing some testing by using "crm resource move" commands. I found that once I moved a resource off a particular node, it would not come back up on that node. I spent a while troubleshooting and eventually gave up and rebuilt the node.

[ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Ferenc Wágner
Hi, Pacemaker explained says about this cluster option: Advanced Use Only: Should the cluster shoot unseen nodes? Not using the default is very unsafe! 1. What are those "unseen" nodes? And a possibly related question: 2. If I've got UNCLEAN (offline) nodes, is there a way to clean

[ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Ulrich Windl
>>> Christopher Harvey wrote on 16.03.2016 at 21:04 in message <1458158684.122207.551267810.11f73...@webmail.messagingengine.com>: [...] >> > Would stonith solve this problem, or does this look like a bug? >> >> It should, that is its job. > > is there some log I can enable

Re: [ClusterLabs] Help required for N+1 redundancy setup

2016-03-19 Thread Nikhil Utane
Thanks Ken for the detailed response. I suppose I could even use some of the pcs/crm CLI commands then. Cheers. On Wed, Mar 16, 2016 at 8:27 PM, Ken Gaillot wrote: > On 03/16/2016 05:22 AM, Nikhil Utane wrote: > > I see following info gets updated in CIB. Can I use this or

Re: [ClusterLabs] Help required for N+1 redundancy setup

2016-03-19 Thread Ken Gaillot
On 03/16/2016 05:22 AM, Nikhil Utane wrote: > I see following info gets updated in CIB. Can I use this or there is better > way? > > crm-debug-origin="peer_update_callback" join="*down*" expected="member"> in_ccm/crmd/join reflect the current state of the node (as known by the partition that

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread vija ar
root file system is fine ... but fencing is not a necessity; a cluster should function without it .. i see the issue with corosync which has all been .. an inherent way of not working neatly or smoothly .. for e.g. take an issue where the live node is hung in db cluster .. now db perspective

Re: [ClusterLabs] Security with Corosync

2016-03-19 Thread Jan Friesse
Nikhil Utane wrote: Honza, In my CIB I see the infrastructure being set to cman. pcs status is reporting the same. [root@node3 corosync]# pcs status Cluster name: mycluster Last updated: Wed Mar 16 16:57:46 2016 Last change: Wed Mar 16 16:56:23 2016 Stack: *cman* But corosync also is

[ClusterLabs] Antw: Installed Galera, now HAProxy won't start

2016-03-19 Thread Ulrich Windl
>>> Matthew Mucker wrote on 16.03.2016 at 23:10 in message [...] > So thinking this through logically, it seems to me that the Openstack > docs were wrong in telling me to configure

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Digimer
On 19/03/16 10:10 AM, Dennis Jacobfeuerborn wrote: > On 18.03.2016 00:50, Digimer wrote: >> On 17/03/16 07:30 PM, Christopher Harvey wrote: >>> On Thu, Mar 17, 2016, at 06:24 PM, Ken Gaillot wrote: On 03/17/2016 05:10 PM, Christopher Harvey wrote: > If I ignore pacemaker's existence, and

[ClusterLabs] reproducible split brain

2016-03-19 Thread Christopher Harvey
I am able to create a split brain situation in corosync 1.1.13 using iptables in a 3 node cluster. I have 3 nodes, vmr-132-3, vmr-132-4, and vmr-132-5. All nodes are operational and form a 3 node cluster with all nodes members of that ring. vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4

[ClusterLabs] [Announce] clufter-0.56.2 released

2016-03-19 Thread Jan Pokorný
I am happy to announce that clufter-0.56.2, a tool/library for transforming/analyzing cluster configuration formats, has been released and published (incl. signature using my 60BCBB4F5CD7F9EF key, expiration of which was prolonged just a few days back so you may want to consult key servers first):

Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Lars Ellenberg
On Wed, Mar 16, 2016 at 01:47:52PM +0100, Ferenc Wágner wrote: > >> And some more about fencing: > >> > >> 3. What's the difference in cluster behavior between > >>- stonith-enabled=FALSE (9.3.2: how often will the stop operation be > >> retried?) > >>- having no configured STONITH

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Dennis Jacobfeuerborn
On 18.03.2016 00:50, Digimer wrote: > On 17/03/16 07:30 PM, Christopher Harvey wrote: >> On Thu, Mar 17, 2016, at 06:24 PM, Ken Gaillot wrote: >>> On 03/17/2016 05:10 PM, Christopher Harvey wrote: If I ignore pacemaker's existence, and just run corosync, corosync disagrees about node

Re: [ClusterLabs] PCS, Corosync, Pacemaker, and Bind (Ken Gaillot)

2016-03-19 Thread Dennis Jacobfeuerborn
On 17.03.2016 08:45, Andrei Borzenkov wrote: > On Wed, Mar 16, 2016 at 9:35 PM, Mike Bernhardt wrote: >> I guess I have to say "never mind!" I don't know what the problem was >> yesterday, but it loads just fine today, even when the named config and the >> virtual ip don't

Re: [ClusterLabs] reproducible split brain

2016-03-19 Thread Digimer
On 16/03/16 03:59 PM, Christopher Harvey wrote: > I am able to create a split brain situation in corosync 1.1.13 using > iptables in a 3 node cluster. > > I have 3 nodes, vmr-132-3, vmr-132-4, and vmr-132-5 > > All nodes are operational and form a 3 node cluster with all nodes are > members of

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Jan Friesse
Christopher, If I ignore pacemaker's existence, and just run corosync, corosync disagrees about node membership in the situation presented in the first email. While it's true that stonith just happens to quickly correct the situation after it occurs it still smells like a bug in the case where

[ClusterLabs] Antw: Re: Pacemaker startup-fencing

2016-03-19 Thread Ulrich Windl
>>> Ferenc Wágner wrote on 16.03.2016 at 13:47 in message <87k2l2zj0n@lant.ki.iif.hu>: [...] > Then I wonder why I hear the "must have working fencing if you value > your data" mantra so often (and always without explanation). After all, > it does not risk the data, only

Re: [ClusterLabs] PCS, Corosync, Pacemaker, and Bind

2016-03-19 Thread Ken Gaillot
On 03/15/2016 06:47 PM, Mike Bernhardt wrote: > Not sure if this is a BIND question or a PCS/Corosync question, but > hopefully someone has done this before: > > > > I'm setting up a new CentOS 7 DNS server cluster to replace our very old > CentOS 4 cluster. The old one uses heartbeat which is

[ClusterLabs] DRBD fencing issue on failover causes resource failure

2016-03-19 Thread Tim Walberg
Having an issue on a newly built CentOS 7.2.1511 NFS cluster with DRBD (drbd84-utils-8.9.5-1 with kmod-drbd84-8.4.7-1_1). At this point, the resources consist of a cluster address, a DRBD device mirroring between the two cluster nodes, the file system, and the nfs-server resource. The resources

Re: [ClusterLabs] Installed Galera, now HAProxy won't start

2016-03-19 Thread Ian
> configure MariaDB server to bind to all available ports ( http://docs.openstack.org/ha-guide/controller-ha-galera-config.html, scroll to "Database Configuration," note that bind-address is 0.0.0.0.). If MariaDB binds to the virtual IP address, then HAProxy can't bind to that address and

[ClusterLabs] Reload operation for multi-state resource agent

2016-03-19 Thread Michael Lychkov
Hello everyone, Is there a way to initiate a reload operation on the master instance of a multi-state resource agent? I have an OCF multi-state resource agent for a daemon service and I added a reload op to this resource agent: * two parameters of resource agent: ... ...

Re: [ClusterLabs] [DRBD-user] DRBD fencing issue on failover causes resource failure

2016-03-19 Thread Digimer
On 16/03/16 01:51 PM, Tim Walberg wrote: > Is there a way to make this work properly without STONITH? I forgot to mention > that both nodes are virtual machines (QEMU/KVM), which makes STONITH a minor > challenge. Also, since these symptoms occur even under "pcs cluster standby", > where STONITH

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Digimer
On 17/03/16 01:57 PM, vija ar wrote: > root file system is fine ... > > but fencing is not a necessity a cluster shld function without it .. i > see the issue with corosync which has all been .. a inherent way of not > working neatly or smoothly .. Absolutely wrong. If you have a service that

[ClusterLabs] [Announce] libqb 1.0rc4 release

2016-03-19 Thread Christine Caulfield
This is a bugfix release and a potential 1.0 candidate. There are no actual code changes in this release, most of the patches are to the build system. Thanks to Jan Pokorný for, er, all of them. I've bumped the library soname to 0.18.0 which should really have happened last time. Changes from

Re: [ClusterLabs] Cluster failover failure with Unresolved dependency

2016-03-19 Thread Ken Gaillot
On 03/16/2016 11:20 AM, Lorand Kelemen wrote: > Dear Ken, > > I already modified the startup as suggested during testing, thanks! I > swapped the postfix ocf resource to the amavisd systemd resource, as the latter > controls postfix startup also as it turns out and having both resources in > the

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Christopher Harvey
If I ignore pacemaker's existence, and just run corosync, corosync disagrees about node membership in the situation presented in the first email. While it's true that stonith just happens to quickly correct the situation after it occurs it still smells like a bug in the case where corosync is used

Re: [ClusterLabs] [DRBD-user] DRBD fencing issue on failover causes resource failure

2016-03-19 Thread Thomas Lamprecht
On 16.03.2016 18:51, Tim Walberg wrote: > Is there a way to make this work properly without STONITH? I forgot to mention > that both nodes are virtual machines (QEMU/KVM), which makes STONITH a minor > challenge. Also, since these symptoms occur even under "pcs cluster standby", > where STONITH

Re: [ClusterLabs] Security with Corosync

2016-03-19 Thread Nikhil Utane
[root@node3 corosync]# corosync -v Corosync Cluster Engine, version '1.4.7' Copyright (c) 2006-2009 Red Hat, Inc. So it is 1.x :( When I began I was following multiple tutorials and ended up installing multiple packages. Let me try moving to corosync 2.0. I suppose it should be as easy as doing

Re: [ClusterLabs] [DRBD-user] DRBD fencing issue on failover causes resource failure

2016-03-19 Thread Tim Walberg
Is there a way to make this work properly without STONITH? I forgot to mention that both nodes are virtual machines (QEMU/KVM), which makes STONITH a minor challenge. Also, since these symptoms occur even under "pcs cluster standby", where STONITH *shouldn't* be invoked, I'm not sure if that's the