Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Ferenc Wágner
Andrei Borzenkov  writes:

> On Wed, Mar 16, 2016 at 2:22 PM, Ferenc Wágner  wrote:
>
>> Pacemaker explained says about this cluster option:
>>
>> Advanced Use Only: Should the cluster shoot unseen nodes? Not using
>> the default is very unsafe!
>>
>> 1. What are those "unseen" nodes?
>
> Nodes that lost communication with other nodes (think of unplugging cables)

Translating to node status, does it mean UNCLEAN (offline) nodes which
suddenly return?  Can Pacemaker tell these apart from abruptly power-cycled
nodes (when the reboot happens before the comeback)?  I guess if a node was
successfully fenced at the time, it won't be considered UNCLEAN, but is that
the only way to avoid that?

>> And a possibly related question:
>>
>> 2. If I've got UNCLEAN (offline) nodes, is there a way to clean them up,
>>so that they don't get fenced when I switch them on?  I mean without
>>removing the node altogether, to keep its capacity settings for
>>example.
>
> You can declare node as down using "crm node clearstate". You should
> not really do it unless you ascertained that node is actually
> physically down.

Great.  Is there an equivalent in bare bones Pacemaker, that is, not
involving the CRM shell?  Like deleting some status or LRMD history
element of the node, for example?
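
For concreteness, a minimal sketch of the crm shell form mentioned above, plus
what I believe is the closest equivalent using only Pacemaker's own tools (the
node name is made up, and either form should only be run after verifying the
node really is powered off):

    # Declare a crashed node clean via the crm shell:
    crm node clearstate node1

    # What I believe is the closest bare-Pacemaker equivalent, telling the
    # cluster the node is safely down without going through the crm shell:
    stonith_admin --confirm=node1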

>> And some more about fencing:
>>
>> 3. What's the difference in cluster behavior between
>>    - stonith-enabled=FALSE (9.3.2: how often will the stop operation be retried?)
>>    - having no configured STONITH devices (resources won't be started, right?)
>>- failing to STONITH with some error (on every node)
>>- timing out the STONITH operation
>>- manual fencing
>
> I do not think there is much difference. Without fencing pacemaker
> cannot make decision to relocate resources so cluster will be stuck.

Then I wonder why I hear the "must have working fencing if you value
your data" mantra so often (and always without explanation).  After all,
it does not risk the data, only the automatic cluster recovery, right?

>> 4. What's the modern way to do manual fencing?  (stonith_admin
>>--confirm + what?
>
> node name.

I worded that question really poorly. :)  I meant to ask what kind of
cluster (STONITH) configuration makes the cluster sit patiently until I
do the manual fencing, then carry on without timeouts or other errors.
Just as if some automatic fencing agent had done the job, but letting me
investigate the node's status beforehand.
-- 
Thanks,
Feri

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] reproducible split brain

2016-03-19 Thread Ken Gaillot
On 03/16/2016 03:04 PM, Christopher Harvey wrote:
> On Wed, Mar 16, 2016, at 04:00 PM, Digimer wrote:
>> On 16/03/16 03:59 PM, Christopher Harvey wrote:
>>> I am able to create a split brain situation in corosync 1.1.13 using
>>> iptables in a 3 node cluster.
>>>
>>> I have 3 nodes, vmr-132-3, vmr-132-4, and vmr-132-5
>>>
>>> All nodes are operational and form a 3 node cluster with all nodes are
>>> members of that ring.
>>> vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
>>> vmr-132-4 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
>>> vmr-132-5 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
>>> so far so good.
>>>
>>> running the following on vmr-132-4 drops all incoming (but not outgoing)
>>> packets from vmr-132-3:
>>> # iptables -I INPUT -s 192.168.132.3 -j DROP
>>> # iptables -L
>>> Chain INPUT (policy ACCEPT)
>>> target prot opt source   destination
>>> DROP   all  --  192.168.132.3anywhere
>>>
>>> Chain FORWARD (policy ACCEPT)
>>> target prot opt source   destination
>>>
>>> Chain OUTPUT (policy ACCEPT)
>>> target prot opt source   destination
>>>
>>> vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
>>> vmr-132-4 ---> Online: [ vmr-132-4 vmr-132-5 ]
>>> vmr-132-5 ---> Online: [ vmr-132-4 vmr-132-5 ]
>>>
>>> vmr-132-3 thinks everything is normal and continues to provide service,
>>> vmr-132-4 and 5 form a new ring, achieve quorum and provide the same
>>> service. Splitting the link between 3 and 4 in both directions isolates
>>> vmr 3 from the rest of the cluster and everything fails over normally,
>>> so only a unidirectional failure causes problems.
>>>
>>> I don't have stonith enabled right now, and looking over the
>>> pacemaker.log file closely to see if 4 and 5 would normally have fenced
>>> 3, but I didn't see any fencing or stonith logs.
>>>
>>> Would stonith solve this problem, or does this look like a bug?
>>
>> It should, that is its job.
> 
> is there some log I can enable that would say
> "ERROR: hey, I would use stonith here, but you have it disabled! your
> warranty is void past this point! do not pass go, do not file a bug"?

Enable fencing, and create a fence device with a static host list that
doesn't match any of your nodes. Pacemaker will think fencing is
configured, but when it tries to actually fence a node, no devices will
be capable of it, and there will be errors to that effect (including "No
such device"). The cluster will block at that point. You can use
stonith_admin --confirm to manually indicate the node is down and
unblock the cluster (but be absolutely sure the node really is down!).
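
A rough sketch of that setup with pcs (the device id is made up, and fence_xvm
is just one agent that happens to be commonly available; any agent will do,
since the static host list will never match a real node):

    # Hypothetical fence device whose host list matches no real node:
    pcs stonith create manual-gate fence_xvm pcmk_host_list="no-such-node"
    pcs property set stonith-enabled=true

    # When a node fails, the cluster blocks on fencing; once you are certain
    # the node really is down, acknowledge it manually:
    stonith_admin --confirm=<node-name>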

>> -- 
>> Digimer
>> Papers and Projects: https://alteeve.ca/w/
>> What if the cure for cancer is trapped in the mind of a person without
>> access to education?


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Digimer
On 17/03/16 07:30 PM, Christopher Harvey wrote:
> On Thu, Mar 17, 2016, at 06:24 PM, Ken Gaillot wrote:
>> On 03/17/2016 05:10 PM, Christopher Harvey wrote:
>>> If I ignore pacemaker's existence, and just run corosync, corosync
>>> disagrees about node membership in the situation presented in the first
>>> email. While it's true that stonith just happens to quickly correct the
>>> situation after it occurs it still smells like a bug in the case where
>>> corosync in used in isolation. Corosync is after all a membership and
>>> total ordering protocol, and the nodes in the cluster are unable to
>>> agree on membership.
>>>
>>> The Totem protocol specifies a ring_id in the token passed in a ring.
>>> Since all of the 3 nodes but one have formed a new ring with a new id
>>> how is it that the single node can survive in a ring with no other
>>> members passing a token with the old ring_id?
>>>
>>> Are there network failure situations that can fool the Totem membership
>>> protocol or is this an implementation problem? I don't see how it could
>>> not be one or the other, and it's bad either way.
>>
>> Neither, really. In a split brain situation, there simply is not enough
>> information for any protocol or implementation to reliably decide what
>> to do. That's what fencing is meant to solve -- it provides the
>> information that certain nodes are definitely not active.
>>
>> There's no way for either side of the split to know whether the opposite
>> side is down, or merely unable to communicate properly. If the latter,
>> it's possible that they are still accessing shared resources, which
>> without proper communication, can lead to serious problems (e.g. data
>> corruption of a shared volume).
> 
> The totem protocol is silent on the topic of fencing and resources, much
> the way TCP is.
> 
> Please explain to me what needs to be fenced in a cluster without
> resources where membership and total message ordering are the only
> concern. If fencing were a requirement for membership and ordering,
> wouldn't stonith be part of corosync and not pacemaker?

Corosync is a membership and communication layer (and in v2+, a quorum
provider). It doesn't care about or manage anything higher up. So it
doesn't care about fencing itself.

It simply cares about:

* Who is in the cluster?
* How do the members communicate?
* (v2+) Are there enough members for quorum?
* Notifying resource managers of membership changes (join or loss).

The resource manager, pacemaker or rgmanager, cares about resources, so
it is what makes the smart decisions. As Ken pointed out, without
fencing, it can never tell the difference between no access and a dead
peer.

This is (again) why fencing is critical.
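
For what it's worth, the standard corosync 2.x tools expose exactly that
layer's view, which is handy when reasoning about membership separately from
Pacemaker:

    corosync-quorumtool -s   # quorum state and current member list
    corosync-cfgtool -s      # ring status as corosync sees it
    corosync-cpgtool         # CPG groups that resource managers subscribe to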

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Andrei Borzenkov
On Wed, Mar 16, 2016 at 4:18 PM, Lars Ellenberg
 wrote:
> On Wed, Mar 16, 2016 at 01:47:52PM +0100, Ferenc Wágner wrote:
>> >> And some more about fencing:
>> >>
>> >> 3. What's the difference in cluster behavior between
>> >>- stonith-enabled=FALSE (9.3.2: how often will the stop operation be 
>> >> retried?)
>> >>- having no configured STONITH devices (resources won't be started, 
>> >> right?)
>> >>- failing to STONITH with some error (on every node)
>> >>- timing out the STONITH operation
>> >>- manual fencing
>> >
>> > I do not think there is much difference. Without fencing pacemaker
>> > cannot make decision to relocate resources so cluster will be stuck.
>>
>> Then I wonder why I hear the "must have working fencing if you value
>> your data" mantra so often (and always without explanation).  After all,
>> it does not risk the data, only the automatic cluster recovery, right?
>
> stonith-enabled=false
> means:
> if some node becomes unresponsive,
> it is immediately *assumed* it was "clean" dead.
> no fencing takes place,
> resource takeover happens without further protection.
>

Oh! Actually, that is not quite clear from the documentation; it does not
explain what happens in the case of stonith-enabled=false at all.

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] attrd: Fix sigsegv on exit if initialization failed

2016-03-19 Thread Ken Gaillot
On 10/12/2015 06:08 AM, Vladislav Bogdanov wrote:
> Hi,
> 
> This was caught with 0.17.1 libqb, which didn't play well with long pids.
> 
> commit 180a943846b6d94c27b9b984b039ac0465df64da
> Author: Vladislav Bogdanov 
> Date:   Mon Oct 12 11:05:29 2015 +
> 
> attrd: Fix sigsegv on exit if initialization failed
> 
> diff --git a/attrd/main.c b/attrd/main.c
> index 069e9fa..94e9212 100644
> --- a/attrd/main.c
> +++ b/attrd/main.c
> @@ -368,8 +368,12 @@ main(int argc, char **argv)
>  crm_notice("Cleaning up before exit");
> 
>  election_fini(writer);
> -crm_client_disconnect_all(ipcs);
> -qb_ipcs_destroy(ipcs);
> +
> +if (ipcs) {
> +crm_client_disconnect_all(ipcs);
> +qb_ipcs_destroy(ipcs);
> +}
> +
>  g_hash_table_destroy(attributes);
> 
>  if (the_cib) {

I set aside this message to merge it, then promptly lost it ... finally
ran across it again. It's merged into master now. Thanks for reporting
the problem and providing a patch.

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] PCS, Corosync, Pacemaker, and Bind (Ken Gaillot)

2016-03-19 Thread Andrei Borzenkov
On Wed, Mar 16, 2016 at 9:35 PM, Mike Bernhardt  wrote:
> I guess I have to say "never mind!" I don't know what the problem was
> yesterday, but it loads just fine today, even when the named config and the
> virtual ip don't match! But for your edamacation, ifconfig does NOT show the
> address although ip addr does:
>

That's normal. ifconfig knows nothing about addresses added using "ip
addr add". You can make them visible to ifconfig by adding a label:

ip addr add dev eth1 ... label eth1:my-label
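
Fleshing that out slightly (the address and interface are made up):

    # Add the address with a label so legacy ifconfig can see it:
    ip addr add 192.168.132.10/24 dev eth1 label eth1:vip

    # Both of these should now show it:
    ip addr show dev eth1
    ifconfig eth1:vip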

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Moving resources and implicit bans - please explain?

2016-03-19 Thread Matthew Mucker
I have set up my first three-node Pacemaker cluster and was doing some testing 
by using "crm resource move" commands. I found that once I moved a resource off 
a particular node, it would not come back up on that node. I spent a while 
troubleshooting and eventually gave up and rebuilt the node.

After the rebuild, the same thing happened. I then found this in the documentation 
for the crm_resource command, under the move command: "NOTE: This may prevent the 
resource from running on the previous location node until the implicit 
constraints expire or are removed with --unban"

This is a regrettably vague note. What dictates the conditions for "may 
prevent?" How do I determine what implicit constraints are present on my 
resources and when they'll expire?

I did find that explicitly removing bans with crm_resource -U solved my 
problem. However, I'd like to understand this further. Any explanation would be 
appreciated. A Google search on "pacemaker move resource ban" didn't turn up 
anything that was obviously authoritative.
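
For reference, the sequence in question looks roughly like this (resource and
node names are made up, and the exact long-option spellings vary a little
between Pacemaker versions):

    # Move the resource away; this silently adds a location constraint:
    crm_resource --move --resource my_rsc --node node2

    # Inspect the implicit constraint it left behind (the ids usually start
    # with cli-prefer- or cli-ban-):
    cibadmin --query --scope constraints

    # Remove it so the resource is allowed back on its previous node:
    crm_resource -U --resource my_rsc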

I'd appreciate any expertise the community could share with me!

Thanks,

-Matthew
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Ferenc Wágner
Hi,

Pacemaker explained says about this cluster option:

Advanced Use Only: Should the cluster shoot unseen nodes? Not using
the default is very unsafe!

1. What are those "unseen" nodes?

And a possibly related question:

2. If I've got UNCLEAN (offline) nodes, is there a way to clean them up,
   so that they don't get fenced when I switch them on?  I mean without
   removing the node altogether, to keep its capacity settings for
   example.

And some more about fencing:

3. What's the difference in cluster behavior between
   - stonith-enabled=FALSE (9.3.2: how often will the stop operation be retried?)
   - having no configured STONITH devices (resources won't be started, right?)
   - failing to STONITH with some error (on every node)
   - timing out the STONITH operation
   - manual fencing

4. What's the modern way to do manual fencing?  (stonith_admin
   --confirm + what?  I ask because meatware.so comes from
   cluster-glue and uses the old API).
-- 
Thanks,
Feri

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Ulrich Windl
>>> Christopher Harvey wrote on 2016-03-16 at 21:04 in message
<1458158684.122207.551267810.11f73...@webmail.messagingengine.com>:
[...]
>> > Would stonith solve this problem, or does this look like a bug?
>> 
>> It should, that is its job.
> 
> is there some log I can enable that would say
> "ERROR: hey, I would use stonith here, but you have it disabled! your
> warranty is void past this point! do not pass go, do not file a bug"?

What should the kernel say during boot if the user has not defined a root file 
system?

Maybe the "stonith-enabled=false" setting should be called either 
"data-corruption-mode=true" or "hang-forever-on-error=true" ;-)

Regards,
Ulrich



___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Help required for N+1 redundancy setup

2016-03-19 Thread Nikhil Utane
Thanks Ken for the detailed response.
I suppose I could even use some of the pcs/crm CLI commands then.
Cheers.

On Wed, Mar 16, 2016 at 8:27 PM, Ken Gaillot  wrote:

> On 03/16/2016 05:22 AM, Nikhil Utane wrote:
> > I see following info gets updated in CIB. Can I use this or there is
> better
> > way?
> >
> >  > crm-debug-origin="peer_update_callback" join="*down*" expected="member">
>
> in_ccm/crmd/join reflect the current state of the node (as known by the
> partition that you're looking at the CIB on), so if the node went down
> and came back up, it won't tell you anything about being down.
>
> - in_ccm indicates that the node is part of the underlying cluster layer
> (heartbeat/cman/corosync)
>
> - crmd indicates that the node is communicating at the pacemaker layer
>
> - join indicates what phase of the join process the node is at
>
> There's not a direct way to see what node went down after the fact.
> There are ways however:
>
> - if the node was running resources, those will be failed, and those
> failures (including node) will be shown in the cluster status
>
> - the logs show all node membership events; you can search for logs such
> as "state is now lost" and "left us"
>
> - "stonith -H $NODE_NAME" will show the fence history for a given node,
> so if the node went down due to fencing, it will show up there
>
> - you can configure an ocf:pacemaker:ClusterMon resource to run crm_mon
> periodically and run a script for node events, and you can write the
> script to do whatever you want (email you, etc.) (in the upcoming 1.1.15
> release, built-in notifications will make this more reliable and easier,
> but any script you use with ClusterMon will still be usable with the new
> method)
>
> > On Wed, Mar 16, 2016 at 12:40 PM, Nikhil Utane <
> nikhil.subscri...@gmail.com>
> > wrote:
> >
> >> Hi Ken,
> >>
> >> Sorry about the long delay. This activity was de-focussed but now it's
> >> back on track.
> >>
> >> One part of question that is still not answered is on the newly active
> >> node, how to find out which was the node that went down?
> >> Anything that gets updated in the status section that can be read and
> >> figured out?
> >>
> >> Thanks.
> >> Nikhil
> >>
> >> On Sat, Jan 9, 2016 at 3:31 AM, Ken Gaillot 
> wrote:
> >>
> >>> On 01/08/2016 11:13 AM, Nikhil Utane wrote:
> > I think stickiness will do what you want here. Set a stickiness
> higher
> > than the original node's preference, and the resource will want to
> stay
> > where it is.
> 
>  Not sure I understand this. Stickiness will ensure that resources
> don't
>  move back when original node comes back up, isn't it?
>  But in my case, I want the newly standby node to become the backup
> node
> >>> for
>  all other nodes. i.e. it should now be able to run all my resource
> >>> groups
>  albeit with a lower score. How do I achieve that?
> >>>
> >>> Oh right. I forgot to ask whether you had an opt-out
> >>> (symmetric-cluster=true, the default) or opt-in
> >>> (symmetric-cluster=false) cluster. If you're opt-out, every node can
> run
> >>> every resource unless you give it a negative preference.
> >>>
> >>> Partly it depends on whether there is a good reason to give each
> >>> instance a "home" node. Often, there's not. If you just want to balance
> >>> resources across nodes, the cluster will do that by default.
> >>>
> >>> If you prefer to put certain resources on certain nodes because the
> >>> resources require more physical resources (RAM/CPU/whatever), you can
> >>> set node attributes for that and use rules to set node preferences.
> >>>
> >>> Either way, you can decide whether you want stickiness with it.
> >>>
>  Also can you answer, how to get the values of node that goes active
> and
> >>> the
>  node that goes down inside the OCF agent?  Do I need to use
> >>> notification or
>  some simpler alternative is available?
>  Thanks.
> 
> 
>  On Fri, Jan 8, 2016 at 9:30 PM, Ken Gaillot 
> >>> wrote:
> 
> > On 01/08/2016 06:55 AM, Nikhil Utane wrote:
> >> Would like to validate my final config.
> >>
> >> As I mentioned earlier, I will be having (upto) 5 active servers
> and 1
> >> standby server.
> >> The standby server should take up the role of active that went down.
> >>> Each
> >> active has some unique configuration that needs to be preserved.
> >>
> >> 1) So I will create total 5 groups. Each group has a
> >>> "heartbeat::IPaddr2
> >> resource (for virtual IP) and my custom resource.
> >> 2) The virtual IP needs to be read inside my custom OCF agent, so I
> >>> will
> >> make use of attribute reference and point to the value of IPaddr2
> >>> inside
> > my
> >> custom resource to avoid duplication.
> >> 3) I will then configure location constraint to run the group
> resource
> > on 5
> >> active nodes with higher score and lesser 

Re: [ClusterLabs] Help required for N+1 redundancy setup

2016-03-19 Thread Ken Gaillot
On 03/16/2016 05:22 AM, Nikhil Utane wrote:
> I see following info gets updated in CIB. Can I use this or there is better
> way?
> 
>  crm-debug-origin="peer_update_callback" join="*down*" expected="member">

in_ccm/crmd/join reflect the current state of the node (as known by the
partition that you're looking at the CIB on), so if the node went down
and came back up, it won't tell you anything about being down.

- in_ccm indicates that the node is part of the underlying cluster layer
(heartbeat/cman/corosync)

- crmd indicates that the node is communicating at the pacemaker layer

- join indicates what phase of the join process the node is at

There's no direct way to see which node went down after the fact.
There are indirect ways, however:

- if the node was running resources, those will be failed, and those
failures (including node) will be shown in the cluster status

- the logs show all node membership events; you can search for logs such
as "state is now lost" and "left us"

- "stonith -H $NODE_NAME" will show the fence history for a given node,
so if the node went down due to fencing, it will show up there

- you can configure an ocf:pacemaker:ClusterMon resource to run crm_mon
periodically and run a script for node events, and you can write the
script to do whatever you want (email you, etc.) (in the upcoming 1.1.15
release, built-in notifications will make this more reliable and easier,
but any script you use with ClusterMon will still be usable with the new
method)
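
A few hedged sketches of those approaches (node, path and script names are
made up; I'm using stonith_admin's --history option, which I believe is the
usual way to query fence history on recent Pacemaker):

    # Search the logs for membership-loss events (log path depends on the distro):
    grep -e "state is now lost" -e "left us" /var/log/cluster/corosync.log

    # Fence history for a given node:
    stonith_admin --history node1

    # Run crm_mon periodically and call an external script on node events:
    pcs resource create cluster-mon ocf:pacemaker:ClusterMon \
        extra_options="-E /usr/local/bin/node-event.sh" --clone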

> On Wed, Mar 16, 2016 at 12:40 PM, Nikhil Utane 
> wrote:
> 
>> Hi Ken,
>>
>> Sorry about the long delay. This activity was de-focussed but now it's
>> back on track.
>>
>> One part of question that is still not answered is on the newly active
>> node, how to find out which was the node that went down?
>> Anything that gets updated in the status section that can be read and
>> figured out?
>>
>> Thanks.
>> Nikhil
>>
>> On Sat, Jan 9, 2016 at 3:31 AM, Ken Gaillot  wrote:
>>
>>> On 01/08/2016 11:13 AM, Nikhil Utane wrote:
> I think stickiness will do what you want here. Set a stickiness higher
> than the original node's preference, and the resource will want to stay
> where it is.

 Not sure I understand this. Stickiness will ensure that resources don't
 move back when original node comes back up, isn't it?
 But in my case, I want the newly standby node to become the backup node
>>> for
 all other nodes. i.e. it should now be able to run all my resource
>>> groups
 albeit with a lower score. How do I achieve that?
>>>
>>> Oh right. I forgot to ask whether you had an opt-out
>>> (symmetric-cluster=true, the default) or opt-in
>>> (symmetric-cluster=false) cluster. If you're opt-out, every node can run
>>> every resource unless you give it a negative preference.
>>>
>>> Partly it depends on whether there is a good reason to give each
>>> instance a "home" node. Often, there's not. If you just want to balance
>>> resources across nodes, the cluster will do that by default.
>>>
>>> If you prefer to put certain resources on certain nodes because the
>>> resources require more physical resources (RAM/CPU/whatever), you can
>>> set node attributes for that and use rules to set node preferences.
>>>
>>> Either way, you can decide whether you want stickiness with it.
>>>
 Also can you answer, how to get the values of node that goes active and
>>> the
 node that goes down inside the OCF agent?  Do I need to use
>>> notification or
 some simpler alternative is available?
 Thanks.


 On Fri, Jan 8, 2016 at 9:30 PM, Ken Gaillot 
>>> wrote:

> On 01/08/2016 06:55 AM, Nikhil Utane wrote:
>> Would like to validate my final config.
>>
>> As I mentioned earlier, I will be having (upto) 5 active servers and 1
>> standby server.
>> The standby server should take up the role of active that went down.
>>> Each
>> active has some unique configuration that needs to be preserved.
>>
>> 1) So I will create total 5 groups. Each group has a
>>> "heartbeat::IPaddr2
>> resource (for virtual IP) and my custom resource.
>> 2) The virtual IP needs to be read inside my custom OCF agent, so I
>>> will
>> make use of attribute reference and point to the value of IPaddr2
>>> inside
> my
>> custom resource to avoid duplication.
>> 3) I will then configure location constraint to run the group resource
> on 5
>> active nodes with higher score and lesser score on standby.
>> For e.g.
>> Group  NodeScore
>> -
>> MyGroup1node1   500
>> MyGroup1node6   0
>>
>> MyGroup2node2   500
>> MyGroup2node6   0
>> ..
>> MyGroup5node5   500
>> MyGroup5node6   0
>>
>> 4) Now if say 

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread vija ar
The root file system analogy is fine ...

but fencing is not a necessity; a cluster should function without it. I see
the issue with corosync, which has always had an inherent way of not working
neatly or smoothly.

For example, take an issue where the live node in a DB cluster is hung. From
the DB perspective, transactions are not happening, and that is fine since the
node is having some issue. There is no need to fence this hung node, just to
switch over to the passive one. But that doesn't happen; fencing takes place
instead, either by reboot or shutdown, which leaves the DB dirty or even in a
non-recoverable state, which would not have happened if a normal switch to the
other node, as in a cluster, had happened.

I see that fencing is not a solution; it is only required to forcefully take
control, which is not always the case.

On Thu, Mar 17, 2016 at 12:49 PM, Ulrich Windl <
ulrich.wi...@rz.uni-regensburg.de> wrote:

> >>> Christopher Harvey wrote on 2016-03-16 at 21:04 in message
> <1458158684.122207.551267810.11f73...@webmail.messagingengine.com>:
> [...]
> >> > Would stonith solve this problem, or does this look like a bug?
> >>
> >> It should, that is its job.
> >
> > is there some log I can enable that would say
> > "ERROR: hey, I would use stonith here, but you have it disabled! your
> > warranty is void past this point! do not pass go, do not file a bug"?
>
> What should the kernel say during boot if the user has not defined a root
> file system?
>
> Maybe the "stonith-enabled=false" setting should be called either
> "data-corruption-mode=true" or "hang-forever-on-error=true" ;-)
>
> Regards,
> Ulrich
>
>
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Security with Corosync

2016-03-19 Thread Jan Friesse

Nikhil Utane wrote:

Honza,

In my CIB I see the infrastructure being set to cman. pcs status is
reporting the same.



[root@node3 corosync]# pcs status
Cluster name: mycluster
Last updated: Wed Mar 16 16:57:46 2016
Last change: Wed Mar 16 16:56:23 2016
Stack: *cman*

But corosync also is running fine.

[root@node2 nikhil]# pcs status nodes corosync
Corosync Nodes:
  Online: node2 node3
  Offline: node1

I did a cibadmin query and replace from cman to corosync but it doesn't
change (even though replace operation succeeds)
I read that CMAN internally uses corosync but in corosync 2 CMAN support is
removed.
Totally confused. Please help.


The best start is to find out what versions you are using. If you have 
corosync 1.x and are really using cman (which is highly probable), 
corosync.conf is completely ignored and cluster.conf 
(/etc/cluster/cluster.conf) is used instead. cluster.conf uses the cman 
keyfile, and if that is not provided the encryption key is simply the 
cluster name. This is probably the reason why everything worked even 
though you didn't have the authkey on one of the nodes.


Honza
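
A couple of quick checks that may help sort out which stack is actually in use
(standard tools, assuming they are installed):

    corosync -v          # corosync version (1.x vs 2.x)
    cman_tool status     # succeeds only on a CMAN-based stack

    # On CMAN, the key file (if any) is set in /etc/cluster/cluster.conf, e.g.
    #   <cman keyfile="/etc/cluster/corosync.authkey"/>
    # and without it the cluster name is used as the key, as described above.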



-Thanks
Nikhil

On Mon, Mar 14, 2016 at 1:19 PM, Jan Friesse  wrote:


Nikhil Utane wrote:


Follow-up question.
I noticed that secauth was turned off in my corosync.conf file. I enabled
it on all 3 nodes and restarted the cluster. Everything was working fine.
However I just noticed that I had forgotten to copy the authkey to one of
the node. It is present on 2 nodes but not the third. And I did a failover
and the third node took over without any issue.
How is the 3rd node participating in the cluster if it doesn't have the
authkey?



It's just not possible. If you would enabled secauth correctly and you
didn't have /etc/corosync/authkey, message like "Could not open
/etc/corosync/authkey: No such file or directory" would show up. There are
few exceptions:
- you have changed totem.keyfile with file existing on all nodes
- you are using totem.key then everything works as expected (it has
priority over default authkey file but not over totem.keyfile)
- you are using COROSYNC_TOTEM_AUTHKEY_FILE env with file existing on all
nodes

Regards,
   Honza




On Fri, Mar 11, 2016 at 4:15 PM, Nikhil Utane <
nikhil.subscri...@gmail.com>
wrote:

Perfect. Thanks for the quick response Honza.


Cheers
Nikhil

On Fri, Mar 11, 2016 at 4:10 PM, Jan Friesse 
wrote:

Nikhil,


Nikhil Utane wrote:

Hi,


I changed some configuration and captured packets. I can see that the
data
is already garbled and not in the clear.
So does corosync already have this built-in?
Can somebody provide more details as to what all security features are
incorporated?



See man page corosync.conf(5) options crypto_hash, crypto_cipher (for
corosync 2.x) and potentially secauth (for coorsync 1.x and 2.x).

Basically corosync by default uses aes256 for encryption and sha1 for
hmac authentication.

Pacemaker uses corosync cpg API so as long as encryption is enabled in
the corosync.conf, messages interchanged between nodes are encrypted.

Regards,
Honza
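
A minimal sketch of what that looks like in practice on corosync 2.x
(hostnames are made up; the totem values simply spell out the defaults
described above):

    # corosync.conf, per corosync.conf(5):
    #   totem {
    #       crypto_cipher: aes256
    #       crypto_hash: sha1
    #   }

    corosync-keygen                                  # writes /etc/corosync/authkey
    scp /etc/corosync/authkey node2:/etc/corosync/   # the same key must exist on every node
    scp /etc/corosync/authkey node3:/etc/corosync/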


-Thanks

Nikhil

On Fri, Mar 11, 2016 at 11:38 AM, Nikhil Utane <
nikhil.subscri...@gmail.com>
wrote:

Hi,



Does corosync provide mechanism to secure the communication path
between
nodes of a cluster?
I would like all the data that gets exchanged between all nodes to be
encrypted.

A quick google threw up this link:
https://github.com/corosync/corosync/blob/master/SECURITY

Can I make use of it with pacemaker?

-Thanks
Nikhil






___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org








___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org





___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org





[ClusterLabs] Antw: Installed Galera, now HAProxy won't start

2016-03-19 Thread Ulrich Windl
>>> Matthew Mucker wrote on 2016-03-16 at 23:10:

[...]
> So thinking this through logically, it seems to me that the Openstack 
> docs were wrong in telling me to configure MariaDB server to bind to all 
> available ports 

In a cluster environment with virtual IP addresses it's always wrong to bind a 
service to all IP addresses. To make things worse, a user may connect via the 
wrong address, and then failover of the virtual IP will hit them.
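
For example, the usual way to pin the services from this thread to the virtual
address instead of 0.0.0.0 (the address is made up; option names come from the
respective service documentation):

    # MariaDB (my.cnf):      bind-address = 192.168.1.10
    # BIND (named.conf):     listen-on { 192.168.1.10; };
    # HAProxy (haproxy.cfg): bind 192.168.1.10:3306

    # Note: binding to an address the node does not currently hold fails unless
    # the kernel is told to allow it:
    sysctl -w net.ipv4.ip_nonlocal_bind=1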

[...]

Regards,
Ulrich



___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Digimer
On 19/03/16 10:10 AM, Dennis Jacobfeuerborn wrote:
> On 18.03.2016 00:50, Digimer wrote:
>> On 17/03/16 07:30 PM, Christopher Harvey wrote:
>>> On Thu, Mar 17, 2016, at 06:24 PM, Ken Gaillot wrote:
 On 03/17/2016 05:10 PM, Christopher Harvey wrote:
> If I ignore pacemaker's existence, and just run corosync, corosync
> disagrees about node membership in the situation presented in the first
> email. While it's true that stonith just happens to quickly correct the
> situation after it occurs it still smells like a bug in the case where
> corosync in used in isolation. Corosync is after all a membership and
> total ordering protocol, and the nodes in the cluster are unable to
> agree on membership.
>
> The Totem protocol specifies a ring_id in the token passed in a ring.
> Since all of the 3 nodes but one have formed a new ring with a new id
> how is it that the single node can survive in a ring with no other
> members passing a token with the old ring_id?
>
> Are there network failure situations that can fool the Totem membership
> protocol or is this an implementation problem? I don't see how it could
> not be one or the other, and it's bad either way.

 Neither, really. In a split brain situation, there simply is not enough
 information for any protocol or implementation to reliably decide what
 to do. That's what fencing is meant to solve -- it provides the
 information that certain nodes are definitely not active.

 There's no way for either side of the split to know whether the opposite
 side is down, or merely unable to communicate properly. If the latter,
 it's possible that they are still accessing shared resources, which
 without proper communication, can lead to serious problems (e.g. data
 corruption of a shared volume).
>>>
>>> The totem protocol is silent on the topic of fencing and resources, much
>>> the way TCP is.
>>>
>>> Please explain to me what needs to be fenced in a cluster without
>>> resources where membership and total message ordering are the only
>>> concern. If fencing were a requirement for membership and ordering,
>>> wouldn't stonith be part of corosync and not pacemaker?
>>
>> Corosync is a membership and communication layer (and in v2+, a quorum
>> provider). It doesn't care about or manage anything higher up. So it
>> doesn't care about fencing itself.
>>
>> It simply cares about;
>>
>> * Who is in the cluster?
>> * How do the members communicate?
>> * (v2+) Is there enough members for quorum?
>> * Notify resource managers of membership changes (join or loss).
>>
>> The resource manager, pacemaker or rgmanager, care about resources, so
>> it is what cares about making smart decisions. As Ken pointed out,
>> without fencing, it can never tell the difference between no access and
>> dead peer.
>>
>> This is (again) why fencing is critical.
> 
> I think the key issue here is that people think about corosync they
> believe there can only be two state for membership (true or false) when
> in reality there are three possible states: true, false and unknown.
> 
> The problem then is that corosync apparently has no built-in way to deal
> with the "unknown" situation and requires guidance from an external
> entity for that (in this case pacemakers fencing).
> 
> This means that corosync alone simply cannot give you reliable
> membership guarantees. I strictly requires external help to be able to
> provide that.
> 
> Regards,
>   Dennis

I'm not sure that is accurate.

If corosync declares a node lost (failed to receive X tokens in Y time),
the node is declared lost and it reforms a new cluster, without the lost
member. So from corosync's perspective, the lost node is no longer a
member (it won't receive messages). It is possible that the lost node
might itself be alive, in which case it's corosync will do the same
thing (reform a new cluster, possibly with itself as the sole member).

If you're trying to have corosync *do* something, then that is missing
the point of corosync, I think. In all cases I've ever seen, you need a
separate resource manager to actually react to the membership changes.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] reproducible split brain

2016-03-19 Thread Christopher Harvey
I am able to create a split brain situation in corosync 1.1.13 using
iptables in a 3 node cluster.

I have 3 nodes, vmr-132-3, vmr-132-4, and vmr-132-5

All nodes are operational and form a 3 node cluster with all nodes are
members of that ring.
vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
vmr-132-4 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
vmr-132-5 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
so far so good.

running the following on vmr-132-4 drops all incoming (but not outgoing)
packets from vmr-132-3:
# iptables -I INPUT -s 192.168.132.3 -j DROP
# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source   destination
DROP   all  --  192.168.132.3anywhere

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination

vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
vmr-132-4 ---> Online: [ vmr-132-4 vmr-132-5 ]
vmr-132-5 ---> Online: [ vmr-132-4 vmr-132-5 ]

vmr-132-3 thinks everything is normal and continues to provide service,
vmr-132-4 and 5 form a new ring, achieve quorum and provide the same
service. Splitting the link between 3 and 4 in both directions isolates
vmr 3 from the rest of the cluster and everything fails over normally,
so only a unidirectional failure causes problems.
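
For contrast, the bidirectional split described above can be reproduced on
vmr-132-4 with (same address as before):

# iptables -I INPUT  -s 192.168.132.3 -j DROP
# iptables -I OUTPUT -d 192.168.132.3 -j DROP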

I don't have stonith enabled right now, and looking over the
pacemaker.log file closely to see if 4 and 5 would normally have fenced
3, but I didn't see any fencing or stonith logs.

Would stonith solve this problem, or does this look like a bug?

Thanks,
Chris

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] [Announce] clufter-0.56.2 released

2016-03-19 Thread Jan Pokorný
I am happy to announce that clufter-0.56.2, a tool/library for
transforming/analyzing cluster configuration formats, has been
released and published (incl. signature using my 60BCBB4F5CD7F9EF key,
expiration of which was prolonged just a few days back so you may
want to consult key servers first):


or alternative (original) location:



The test suite for this version is also provided:

or alternatively:


Changelog highlights (also available as a tag message):
- this is a bug fix release with one minor enhancement
- bug fixes:
  . with {cib,pcs}2pcscmd* commands, clufter no longer chokes on
validation failures (unless --nocheck provided) due to source CIB
not containing "status" section (which is normally the case with
implicit input located in /var/lib/pacemaker/cib/cib.xml);
now the bundled, compacted schemas mark this section optional
and also the recipe to distill such format from pacemaker native
schemas ensures the same assumption holds even if not pre-existed
[resolves: rhbz#1269964, comment 9]
[see also: https://github.com/ClusterLabs/pacemaker/pull/957]
  . internal representations of command + options/arguments was fixed
in several ways so as to provide correct outcomes in both general
(previously, some options could be duplicated while overwriting
other options/arguments, and standalone negative numbers were
considered options) and pcs (--wait=X cannot be decoupled the same
way option parsers can usually cope with, as pcs built-in parser
treats this specifically) cases
- enhancements:
  . [cp]cs2pcscmd* commands now support the "--wait" parameter to pcs
command for starting the cluster and prefers it to static "sleep"
when possible (pcs version recent enough)
[see also: rhbz#1229822]

* * *

The public repository (notably master and next branches) is currently at

(rather than ).

Official, signed releases can be found at
 or, alternatively, at

(also beware, automatic archives by GitHub preserve a "dev structure").

Natively packaged in Fedora (python-clufter, clufter-cli).

Issues & suggestions can be reported at either of (regardless if Fedora)
,

(rather than ).


Happy clustering/high-availing :)

-- 
Jan (Poki)


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Pacemaker startup-fencing

2016-03-19 Thread Lars Ellenberg
On Wed, Mar 16, 2016 at 01:47:52PM +0100, Ferenc Wágner wrote:
> >> And some more about fencing:
> >>
> >> 3. What's the difference in cluster behavior between
> >>- stonith-enabled=FALSE (9.3.2: how often will the stop operation be 
> >> retried?)
> >>- having no configured STONITH devices (resources won't be started, 
> >> right?)
> >>- failing to STONITH with some error (on every node)
> >>- timing out the STONITH operation
> >>- manual fencing
> >
> > I do not think there is much difference. Without fencing pacemaker
> > cannot make decision to relocate resources so cluster will be stuck.
> 
> Then I wonder why I hear the "must have working fencing if you value
> your data" mantra so often (and always without explanation).  After all,
> it does not risk the data, only the automatic cluster recovery, right?

stonith-enabled=false
means:
if some node becomes unresponsive,
it is immediately *assumed* it was "clean" dead.
no fencing takes place,
resource takeover happens without further protection.

That very much risks at least data divergence (replicas evoling
independently), if not data corruption (shared disks and the like).

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R, Integration, Ops, Consulting, Support

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Dennis Jacobfeuerborn
On 18.03.2016 00:50, Digimer wrote:
> On 17/03/16 07:30 PM, Christopher Harvey wrote:
>> On Thu, Mar 17, 2016, at 06:24 PM, Ken Gaillot wrote:
>>> On 03/17/2016 05:10 PM, Christopher Harvey wrote:
 If I ignore pacemaker's existence, and just run corosync, corosync
 disagrees about node membership in the situation presented in the first
 email. While it's true that stonith just happens to quickly correct the
 situation after it occurs it still smells like a bug in the case where
 corosync in used in isolation. Corosync is after all a membership and
 total ordering protocol, and the nodes in the cluster are unable to
 agree on membership.

 The Totem protocol specifies a ring_id in the token passed in a ring.
 Since all of the 3 nodes but one have formed a new ring with a new id
 how is it that the single node can survive in a ring with no other
 members passing a token with the old ring_id?

 Are there network failure situations that can fool the Totem membership
 protocol or is this an implementation problem? I don't see how it could
 not be one or the other, and it's bad either way.
>>>
>>> Neither, really. In a split brain situation, there simply is not enough
>>> information for any protocol or implementation to reliably decide what
>>> to do. That's what fencing is meant to solve -- it provides the
>>> information that certain nodes are definitely not active.
>>>
>>> There's no way for either side of the split to know whether the opposite
>>> side is down, or merely unable to communicate properly. If the latter,
>>> it's possible that they are still accessing shared resources, which
>>> without proper communication, can lead to serious problems (e.g. data
>>> corruption of a shared volume).
>>
>> The totem protocol is silent on the topic of fencing and resources, much
>> the way TCP is.
>>
>> Please explain to me what needs to be fenced in a cluster without
>> resources where membership and total message ordering are the only
>> concern. If fencing were a requirement for membership and ordering,
>> wouldn't stonith be part of corosync and not pacemaker?
> 
> Corosync is a membership and communication layer (and in v2+, a quorum
> provider). It doesn't care about or manage anything higher up. So it
> doesn't care about fencing itself.
> 
> It simply cares about;
> 
> * Who is in the cluster?
> * How do the members communicate?
> * (v2+) Is there enough members for quorum?
> * Notify resource managers of membership changes (join or loss).
> 
> The resource manager, pacemaker or rgmanager, care about resources, so
> it is what cares about making smart decisions. As Ken pointed out,
> without fencing, it can never tell the difference between no access and
> dead peer.
> 
> This is (again) why fencing is critical.

I think the key issue here is that when people think about corosync they
believe there can only be two states for membership (true or false) when
in reality there are three possible states: true, false and unknown.

The problem then is that corosync apparently has no built-in way to deal
with the "unknown" situation and requires guidance from an external
entity for that (in this case pacemaker's fencing).

This means that corosync alone simply cannot give you reliable
membership guarantees. It strictly requires external help to be able to
provide that.

Regards,
  Dennis


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] PCS, Corosync, Pacemaker, and Bind (Ken Gaillot)

2016-03-19 Thread Dennis Jacobfeuerborn
On 17.03.2016 08:45, Andrei Borzenkov wrote:
> On Wed, Mar 16, 2016 at 9:35 PM, Mike Bernhardt  wrote:
>> I guess I have to say "never mind!" I don't know what the problem was
>> yesterday, but it loads just fine today, even when the named config and the
>> virtual ip don't match! But for your edamacation, ifconfig does NOT show the
>> address although ip addr does:
>>
> 
> That's normal. ifconfig knows nothing about addresses added using "ip
> addr add". You can make it visible to ifconfig by adding label:
> 
> ip addr add dev eth1 ... label eth1:my-label

Just stop using ifconfig/route and keep using the ip command.
ifconfig/route have been deprecated for a decade now, and I know old
habits die hard, but at some point you just have to move on. Otherwise you
might end up like those admins who stopped learning in the 80s and now
haunt discussion forums, mostly contributing snark and negativity because
they've come to hate their jobs.
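
For anyone making the switch, the common equivalents are (standard iproute2
commands):

    ip addr show            # instead of: ifconfig -a
    ip link set eth0 up     # instead of: ifconfig eth0 up
    ip route show           # instead of: route -n
    ip -s link              # instead of: netstat -i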

Regards,
  Dennis



___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] reproducible split brain

2016-03-19 Thread Digimer
On 16/03/16 03:59 PM, Christopher Harvey wrote:
> I am able to create a split brain situation in corosync 1.1.13 using
> iptables in a 3 node cluster.
> 
> I have 3 nodes, vmr-132-3, vmr-132-4, and vmr-132-5
> 
> All nodes are operational and form a 3 node cluster with all nodes are
> members of that ring.
> vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> vmr-132-4 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> vmr-132-5 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> so far so good.
> 
> running the following on vmr-132-4 drops all incoming (but not outgoing)
> packets from vmr-132-3:
> # iptables -I INPUT -s 192.168.132.3 -j DROP
> # iptables -L
> Chain INPUT (policy ACCEPT)
> target prot opt source   destination
> DROP   all  --  192.168.132.3anywhere
> 
> Chain FORWARD (policy ACCEPT)
> target prot opt source   destination
> 
> Chain OUTPUT (policy ACCEPT)
> target prot opt source   destination
> 
> vmr-132-3 ---> Online: [ vmr-132-3 vmr-132-4 vmr-132-5 ]
> vmr-132-4 ---> Online: [ vmr-132-4 vmr-132-5 ]
> vmr-132-5 ---> Online: [ vmr-132-4 vmr-132-5 ]
> 
> vmr-132-3 thinks everything is normal and continues to provide service,
> vmr-132-4 and 5 form a new ring, achieve quorum and provide the same
> service. Splitting the link between 3 and 4 in both directions isolates
> vmr 3 from the rest of the cluster and everything fails over normally,
> so only a unidirectional failure causes problems.
> 
> I don't have stonith enabled right now, and looking over the
> pacemaker.log file closely to see if 4 and 5 would normally have fenced
> 3, but I didn't see any fencing or stonith logs.
> 
> Would stonith solve this problem, or does this look like a bug?

It should, that is its job.

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Jan Friesse

Christopher,



If I ignore pacemaker's existence, and just run corosync, corosync
disagrees about node membership in the situation presented in the first
email. While it's true that stonith just happens to quickly correct the
situation after it occurs it still smells like a bug in the case where
corosync in used in isolation. Corosync is after all a membership and
total ordering protocol, and the nodes in the cluster are unable to
agree on membership.

The Totem protocol specifies a ring_id in the token passed in a ring.
Since all of the 3 nodes but one have formed a new ring with a new id
how is it that the single node can survive in a ring with no other
members passing a token with the old ring_id?

Are there network failure situations that can fool the Totem membership
protocol or is this an implementation problem? I don't see how it could


The main problem (as you noted in your original mail) is really about 
blocking only one direction (the input one). This is called a Byzantine 
failure, and it is something corosync is unable to handle. Totem was 
simply never designed to solve Byzantine failures.


Regards,
  Honza



not be one or the other, and it's bad either way.

On Thu, Mar 17, 2016, at 02:08 PM, Digimer wrote:

On 17/03/16 01:57 PM, vija ar wrote:

root file system is fine ...

but fencing is not a necessity a cluster shld function without it .. i
see the issue with corosync which has all been .. a inherent way of not
working neatly or smoothly ..


Absolutely wrong.

If you have a service that can run on both/all nodes at the same time
without coordination, you don't need a cluster, just run your services
everywhere.

If that's not the case, then you need fencing so that the (surviving)
node(s) can be sure that they know where services are and are not
running.


for e.g. take an issue where the live node is hung in db cluster .. now
db perspective transactions r not happening and tht is fine as the node
is having some issue .. now there is no need to fence this hung node but
just to switch over to passive one .. but tht doesnt happens and fencing
takes place either by reboot or shut .. which further makes the DB dirty
or far more than tht in non-recoverable state which wouldnt have happen
if a normal switch to other node as in cluster would have happened ...

i see fencing is not a solution its only required to forcefully take
control which is not the case always

On Thu, Mar 17, 2016 at 12:49 PM, Ulrich Windl
> wrote:

 >>> Christopher Harvey wrote on 2016-03-16 at 21:04 in message
 <1458158684.122207.551267810.11f73...@webmail.messagingengine.com
 >:
 [...]
 >> > Would stonith solve this problem, or does this look like a bug?
 >>
 >> It should, that is its job.
 >
 > is there some log I can enable that would say
 > "ERROR: hey, I would use stonith here, but you have it disabled! your
 > warranty is void past this point! do not pass go, do not file a bug"?

 What should the kernel say during boot if the user has not defined a
 root file system?

 Maybe the "stonith-enabled=false" setting should be called either
 "data-corruption-mode=true" or "hang-forever-on-error=true" ;-)

 Regards,
 Ulrich



 ___
 Users mailing list: Users@clusterlabs.org 
 http://clusterlabs.org/mailman/listinfo/users

 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs: http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org




___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: 

[ClusterLabs] Antw: Re: Pacemaker startup-fencing

2016-03-19 Thread Ulrich Windl
>>> Ferenc Wágner wrote on 2016-03-16 at 13:47 in message
<87k2l2zj0n@lant.ki.iif.hu>:
[...]
> Then I wonder why I hear the "must have working fencing if you value
> your data" mantra so often (and always without explanation).  After all,
> it does not risk the data, only the automatic cluster recovery, right?
[...]

Imagine this situation: You have a ext[234] filesystem on a shared disk that
is mounted on node n1 in a two node cluster.

Then the network connection breaks. n1 thinks n2 crashed and continues to use
the shared disk and the filesystem (and maybe some application that modifies
it) after having waited for the fencing to succeed. n2 thinks n1 crashed and
starts to use the shared disk (and filesystem, and maybe application), also
after having waited for the fencing to succeed.

Now if you have no fencing, n1 and n2 will both write on the ext[234]
filesystem, corrupting it.

We had that once, and it is no fun. Specifically, the repair tools are not
very well tuned for that type of corruption!
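
As an aside, enabling fencing for such a pair is not much work; a minimal
sketch with pcs (the stonith device names, IPMI addresses and credentials
below are made-up placeholders, not from any real setup):

  # pcs stonith create fence_n1 fence_ipmilan pcmk_host_list="n1" \
      ipaddr="192.168.1.101" login="admin" passwd="secret" \
      op monitor interval=60s
  # pcs stonith create fence_n2 fence_ipmilan pcmk_host_list="n2" \
      ipaddr="192.168.1.102" login="admin" passwd="secret" \
      op monitor interval=60s
  # pcs property set stonith-enabled=true

With that in place, whichever node wins the fence race powers the other
off before touching the filesystem, so both can never write to it at the
same time.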

You can also trigger that by misusing "clearstate" while a resource is still
in use...

Clear now?

Regards,
Ulrich


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] PCS, Corosync, Pacemaker, and Bind

2016-03-19 Thread Ken Gaillot
On 03/15/2016 06:47 PM, Mike Bernhardt wrote:
> Not sure if this is a BIND question or a PCS/Corosync question, but
> hopefully someone has done this before:
> 
>  
> 
> I'm setting up a new CentOS 7 DNS server cluster to replace our very old
> CentOS 4 cluster. The old one uses heartbeat which is no longer supported,
> so I'm now using pcs, corosync, and pacemaker.  The new one is running the
> latest 9.10.x production release of BIND. I want BIND to listen on, query
> from, etc on a particular IP address, which is virtualized with pacemaker. 
> 
>  
> 
> This worked fine on the old cluster. But whereas heartbeat would create a
> virtual subinterface (i.e. eth0:0) to support the virtual IP, corosync does
> not do that; at least it doesn't by default. So although the virtual IP
> exists and is pingable, it is not tied to a "physical" interface- ifconfig
> does not find it. And when BIND tries to start up, it fails because it can't
> find the virtual IP it's configured to run on, even though it is reachable.
> I only need IPv4, not IPv6.

The old subinterfaces are no longer needed in linux-land for "virtual"
IPs, which are now actually first-class citizens, just one of multiple
IPs assigned to the interface. ifconfig (or its fashionably new
alternative, "ip addr") should show both addresses on the same interface.

BIND shouldn't have any problems finding the IP. Can you show the error
messages that come up, and your pacemaker configuration?

>  
> 
> So, I'm hoping that there is a way to tell corosync (hopefully using pcsd)
> to create a virtual interface, not just a virtual address, so BIND can find
> it.
> 
>  
> 
> Thanks in advance!


___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] DRBD fencing issue on failover causes resource failure

2016-03-19 Thread Tim Walberg
Having an issue on a newly built CentOS 7.2.1511 NFS cluster with DRBD
(drbd84-utils-8.9.5-1 with kmod-drbd84-8.4.7-1_1). At this point, the
resources consist of a cluster address, a DRBD device mirroring between the
two cluster nodes, the file system, and the nfs-server resource. The
resources all behave properly until an extended failover or outage.

I have tested failover in several ways ("pcs cluster standby", "pcs cluster
stop", "init 0", "init 6", "echo b > /proc/sysrq-trigger", etc.) and the
symptoms are that, until the killed node is brought back into the cluster,
failover never seems to complete. The DRBD device appears on the remaining
node to be in a "Secondary/Unknown" state, and the resources end up looking
like:

# pcs status
Cluster name: nfscluster
Last updated: Wed Mar 16 12:05:33 2016  Last change: Wed Mar 16
12:04:46 2016 by root via cibadmin on nfsnode01
Stack: corosync
Current DC: nfsnode01 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with
quorum
2 nodes and 5 resources configured

Online: [ nfsnode01 ]
OFFLINE: [ nfsnode02 ]

Full list of resources:

 nfsVIP  (ocf::heartbeat:IPaddr2):   Started nfsnode01
 nfs-server (systemd:nfs-server):   Stopped
 Master/Slave Set: drbd_master [drbd_dev]
 Slaves: [ nfsnode01 ]
 Stopped: [ nfsnode02 ]
 drbd_fs   (ocf::heartbeat:Filesystem):Stopped

PCSD Status:
  nfsnode01: Online
  nfsnode02: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

As soon as I bring the second node back online, the failover completes. But
this is obviously not a good state, as an extended outage for any reason on
one node essentially kills the cluster services. There's obviously
something I've missed in configuring the resources, but I haven't been able
to pinpoint it yet.

Perusing the logs, it appears that, upon the initial failure, DRBD does in
fact promote the drbd_master resource, but immediately after that, pengine
calls for it to be demoted for reasons I haven't been able to determine
yet, but seems to be tied to the fencing configuration. I can see that the
crm-fence-peer.sh script is called, but it almost seems like it's fencing
the wrong node... Indeed, I do see that it adds a -INFINITY location
constraint for the surviving node, which would explain the decision to
demote the DRBD master.
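
For reference, the constraint can be listed and (once the situation is
understood) cleared with something like the following; the constraint id
below only illustrates the naming pattern crm-fence-peer.sh uses, it is
not copied from my logs:

  # pcs constraint --full | grep drbd-fence
  # pcs constraint remove drbd-fence-by-handler-drbd0-drbd_master

Normally the after-resync-target handler (crm-unfence-peer.sh) removes the
constraint again once the peer is back in sync.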

My DRBD resource looks like this:

# cat /etc/drbd.d/drbd0.res
resource drbd0 {

protocol C;
startup { wfc-timeout 0; degr-wfc-timeout 120; }

disk {
on-io-error detach;
fencing resource-only;
}

handlers {
fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}

on nfsnode01 {
device /dev/drbd0;
disk /dev/vg_nfs/lv_drbd0;
meta-disk internal;
address 10.0.0.2:7788;
}

on nfsnode02 {
device /dev/drbd0;
disk /dev/vg_nfs/lv_drbd0;
meta-disk internal;
address 10.0.0.3:7788;
}
}

If I comment out the three lines having to do with fencing, the failover
works properly. But I'd prefer to have the fencing there in the odd chance
that we end up with a split brain instead of just a node outage...

And, here's "pcs config --full":

# pcs config --full
Cluster Name: nfscluster
Corosync Nodes:
 nfsnode01 nfsnode02
Pacemaker Nodes:
 nfsnode01 nfsnode02

Resources:
 Resource: nfsVIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=10.0.0.1 cidr_netmask=24
  Operations: start interval=0s timeout=20s (nfsVIP-start-interval-0s)
  stop interval=0s timeout=20s (nfsVIP-stop-interval-0s)
  monitor interval=15s (nfsVIP-monitor-interval-15s)
 Resource: nfs-server (class=systemd type=nfs-server)
  Operations: monitor interval=60s (nfs-server-monitor-interval-60s)
 Master: drbd_master
  Meta Attrs: master-max=1 master-node-max=1 clone-max=2 clone-node-max=1
notify=true
  Resource: drbd_dev (class=ocf provider=linbit type=drbd)
   Attributes: drbd_resource=drbd0
   Operations: start interval=0s timeout=240 (drbd_dev-start-interval-0s)
   promote interval=0s timeout=90 (drbd_dev-promote-interval-0s)
   demote interval=0s timeout=90 (drbd_dev-demote-interval-0s)
   stop interval=0s timeout=100 (drbd_dev-stop-interval-0s)
   monitor interval=29s role=Master
(drbd_dev-monitor-interval-29s)
   monitor interval=31s role=Slave
(drbd_dev-monitor-interval-31s)
 Resource: drbd_fs (class=ocf provider=heartbeat type=Filesystem)
  Attributes: device=/dev/drbd0 directory=/exports/drbd0 fstype=xfs
  Operations: start interval=0s timeout=60 (drbd_fs-start-interval-0s)
  stop interval=0s timeout=60 (drbd_fs-stop-interval-0s)
  monitor interval=20 timeout=40 (drbd_fs-monitor-interval-20)

Stonith Devices:
Fencing 

Re: [ClusterLabs] Installed Galera, now HAProxy won't start

2016-03-19 Thread Ian
> configure MariaDB server to bind to all available ports (
http://docs.openstack.org/ha-guide/controller-ha-galera-config.html, scroll
to "Database Configuration," note that bind-address is 0.0.0.0.). If
MariaDB binds to the virtual IP address, then HAProxy can't bind to that
address and therefore won't start. Right?

That is correct as far as my understanding goes.  By binding to port 3306
on all IPs (0.0.0.0), you are effectively preventing HAProxy from being
able to use port 3306 on its own IP and vice-versa.

Try setting specific bind addresses for your Galera nodes; I would be
surprised and interested if it didn't work.
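
Something along these lines (a sketch; adjust the file path to wherever
your Galera settings live, and use each host's own, non-virtual address):

  # e.g. /etc/my.cnf.d/galera.cnf on controller1 (10.0.0.11)
  [mysqld]
  bind-address = 10.0.0.11

That leaves 10.0.0.10:3306 free for the "bind 10.0.0.10:3306" line in your
haproxy listen section.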


On Wed, Mar 16, 2016 at 6:10 PM, Matthew Mucker  wrote:

> Sorry, folks, for being a pest here, but I'm finding the learning curve on
> this clustering stuff to be pretty steep.
>
>
> I'm following the docs to set up a three-node Openstack Controller
> cluster. I got Pacemaker running and I had two resources, the virtual IP
> and HAProxy, up and running and I could move these resources to any of the
> three nodes. Success!
>
>
> I then moved on to installing Galera.
>
>
> The MariaDB engine started fine on 2 of the 3 nodes but refused to start
> on the third. After some digging and poking (and swearing), I found that
> HAProxy was listening on the virtual IP on the mySQL port, which prevented
> MariaDB from listening on that port. Makes sense. So I moved HAProxy to
> another node and started MariaDB on my third node and now I have a
> three-node Galera cluster.
>
>
> But.
>
>
> Now HAPRoxy won't start on any node. I imagine it's because MariaDB is
> already listening on the same IP:Port combination that Galera wants. (After
> all, HAProxy is supposed to proxy that IP:Port, right?) Unfortunately, I
> don't see anything useful in the HAProxy.log file so I don't really know
> what's wrong.
>
>
> So thinking this through logically, it seems to me that the Openstack
> docs were wrong in telling me to configure MariaDB server to bind to all
> available ports (
> http://docs.openstack.org/ha-guide/controller-ha-galera-config.html,
> scroll to "Database Configuration," note that bind-address is 0.0.0.0.). If
> MariaDB binds to the virtual IP address, then HAProxy can't bind to that
> address and therefore won't start. Right?
>
>
> Am I thinking correctly here, or is something else wrong with my setup? In
> general, I've found that the OpenStack documents tend to be right, but in
> this case my understanding of the concepts involved makes me wonder.
>
>
> In any case, I'm having difficulty getting HAProxy and Galera running on
> the same nodes. My HAProxy config file is:
>
>
> global
>   chroot  /var/lib/haproxy
>   daemon
>   group  haproxy
>   maxconn  4000
>   pidfile  /var/run/haproxy.pid
>   user  haproxy
>
> defaults
>   log  global
>   maxconn  4000
>   option  redispatch
>   retries  3
>   timeout  http-request 10s
>   timeout  queue 1m
>   timeout  connect 10s
>   timeout  client 1m
>   timeout  server 1m
>   timeout  check 10s
>
> listen galera_cluster
>   bind 10.0.0.10:3306
>   balance  source
>   option  httpchk
>   server controller1 10.0.0.11:3306 check port 9200 inter 2000 rise 2
> fall 5
>   server controller2 10.0.0.12:3306 backup check port 9200 inter 2000
> rise 2 fall 5
>   server controller3 10.0.0.13:3306 backup check port 9200 inter 2000
> rise 2 fall 5
>
>
>
> Does the server name under "listen galera_cluster" need to match the
> hostname of the node? What else could be causing these two daemons to not
> play nicely together?
>
>
> Thanks!
>
>
> -Matthew
>
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
>
>
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] Reload operation for multi-state resource agent

2016-03-19 Thread Michael Lychkov
Hello everyone,

Is there a way to initiate a reload operation call for the master
instance of a multi-state resource agent?

I have an OCF multi-state resource agent for a daemon service, and I
added a reload op to this resource agent:

* two parameters of resource agent:


...


...


* reload op declaration in meta-data:



* reload op processing:

case "$1" in

monitor)svc_monitor
exit $?;;
...
reload) svc_reload
exit $?;;
...

When I change the *init_svc_reload* parameter to a different value, the
reload operation is executed only for the slave instance of the resource
agent. The only impact on the master instance is an early execution of
the monitor operation, but I would prefer a reload execution for this
instance as well.
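
For reference, the declarations I am talking about look roughly like this
(a generic sketch; everything except the init_svc_reload name is made up):

  <parameter name="init_svc_reload" unique="0" required="0">
    <longdesc lang="en">Dummy option; changing it should trigger a reload.</longdesc>
    <shortdesc lang="en">Reload trigger</shortdesc>
    <content type="string" default=""/>
  </parameter>
  ...
  <actions>
    ...
    <action name="reload" timeout="20"/>
  </actions>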

[root@vm1 ~]# rpm -qi pacemaker
Name    : pacemaker          Relocations: (not relocatable)
Version : 1.1.12             Vendor: Red Hat, Inc.
Release : 4.el6              Build Date: Thu 03 Jul 2014 04:05:56 PM MSK
...

[root@vm1 ~]# rpm -qi corosync
Name    : corosync           Relocations: (not relocatable)
Version : 1.4.7              Vendor: Red Hat, Inc.
Release : 2.el6              Build Date: Mon 02 Mar 2015 08:21:24 PM MSK
...

[root@pershing-vm5 ~]# rpm -qi resource-agents
Name    : resource-agents    Relocations: (not relocatable)
Version : 3.9.5              Vendor: Red Hat, Inc.
Release : 12.el6_6.4         Build Date: Thu 12 Feb 2015 01:13:26 AM MSK
...

[root@vm1 ~]# lsb_release -d
Description:Red Hat Enterprise Linux Server release 6.6 (Santiago)

---

Best regards, Mike Lychkov.
___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] [DRBD-user] DRBD fencing issue on failover causes resource failure

2016-03-19 Thread Digimer
On 16/03/16 01:51 PM, Tim Walberg wrote:
> Is there a way to make this work properly without STONITH? I forgot to mention
> that both nodes are virtual machines (QEMU/KVM), which makes STONITH a minor
> challenge. Also, since these symptoms occur even under "pcs cluster standby",
> where STONITH *shouldn't* be invoked, I'm not sure if that's the entire 
> answer.

Not sanely, no. All clusters need fencing (if you don't need to coordinate
services, you don't need a cluster). With shared storage, though, it
becomes extra critical.

As for fencing KVM: fence_xvm or fence_virsh (the former being ideal for
production, the latter being easier to set up but more fragile, so only
good for testing).
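
As an illustration, a fence_virsh device for one guest could be created
roughly like this (the host address, credentials and libvirt domain name
are placeholders, not taken from this thread):

  # pcs stonith create fence_nfsnode01 fence_virsh \
      pcmk_host_list="nfsnode01" ipaddr="kvm-host.example.com" \
      login="root" passwd="secret" port="nfsnode01-vm" \
      op monitor interval=60s
  # pcs constraint location fence_nfsnode01 avoids nfsnode01
  # pcs property set stonith-enabled=true

One such device per guest, each kept off the node it is meant to kill.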

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Digimer
On 17/03/16 01:57 PM, vija ar wrote:
> root file system is fine ...
> 
> but fencing is not a necessity a cluster shld function without it .. i
> see the issue with corosync which has all been .. a inherent way of not
> working neatly or smoothly ..

Absolutely wrong.

If you have a service that can run on both/all nodes at the same time
without coordination, you don't need a cluster, just run your services
everywhere.

If that's not the case, then you need fencing so that the (surviving)
node(s) can be sure that they know where services are and are not running.

> for e.g. take an issue where the live node is hung in db cluster .. now
> db perspective transactions r not happening and tht is fine as the node
> is having some issue .. now there is no need to fence this hung node but
> just to switch over to passive one .. but tht doesnt happens and fencing
> takes place either by reboot or shut .. which further makes the DB dirty
> or far more than tht in non-recoverable state which wouldnt have happen
> if a normal switch to other node as in cluster would have happened ...
> 
> i see fencing is not a solution its only required to forcefully take
> control which is not the case always
> 
> On Thu, Mar 17, 2016 at 12:49 PM, Ulrich Windl
>  > wrote:
> 
> >>> Christopher Harvey  schrieb am 16.03.2016 um 21:04
> in Nachricht
> <1458158684.122207.551267810.11f73...@webmail.messagingengine.com
> 
> >:
> [...]
> >> > Would stonith solve this problem, or does this look like a bug?
> >>
> >> It should, that is its job.
> >
> > is there some log I can enable that would say
> > "ERROR: hey, I would use stonith here, but you have it disabled! your
> > warranty is void past this point! do not pass go, do not file a bug"?
> 
> What should the kernel say during boot if the user has not defined a
> root file system?
> 
> Maybe the "stonith-enabled=false" setting should be called either
> "data-corruption-mode=true" or "hang-forever-on-error=true" ;-)
> 
> Regards,
> Ulrich
> 
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org 
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 
> 
> 
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
> 


-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[ClusterLabs] [Announce] libqb 1.0 rc4 release

2016-03-19 Thread Christine Caulfield
This is a bugfix release and a potential 1.0 candidate.

There are no actual code changes in this release, most of the patches
are to the build system. Thanks to Jan Pokorný for, er, all of them.
I've bumped the library soname to 0.18.0 which should really have
happened last time.

Changes from 1.0rc3

build: fix tests/_syslog_override.h not being distributed
build: enable syslog tests when configuring in spec
build: do not install syslog_override for the RPM packaging
build: update library soname to 0.18.0
build: do not try to second-guess "distdir" Automake variable
build: switch to XZ tarball format for {,s}rpm packaging
CI: make sure RPM can be built all the time
Generating the man pages definitely doesn't depend on existence of
(possibly generated) header files that we omit anyway.
build: drop extra qbconfig.h rule for auto_check_header self-test
build: extra clean-local rule instead of overriding clean-generic
build: docs: {dependent -> public}_headers + more robust obtaining
build: tests: grab "public_headers" akin to docs precedent
build: fix preposterous usage of $(AM_V_GEN)
build: tests: add intermediate check-headers target
CI: "make check" already included in "make distcheck"
build: fix out-of-tree build broken with 0b04ed5 (#184)
docs: enhance qb_log_ctl* description + fix doxygen warning
docs: apply "doxygen -u" on {html,man}.dox.in, fix deprecations
docs: {html,man}.dox.in: strip options for unused outputs
docs: {html,man}.dox.in: unify where reasonable
docs: make README.markdown always point to "CURRENT" docs
build: reorder LINT_FLAGS in a more logical way
build: make the code splint-friendly where not already
build: better support for splint checker
build: make splint check tolerant of existing defects



The current release tarball is here:
https://github.com/ClusterLabs/libqb/releases/download/v1.0rc4/libqb-1.0rc4.tar.gz

The github repository is here:
https://github.com/ClusterLabs/libqb

Please report bugs and issues in bugzilla:
https://bugzilla.redhat.com

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] Cluster failover failure with Unresolved dependency

2016-03-19 Thread Ken Gaillot
On 03/16/2016 11:20 AM, Lorand Kelemen wrote:
> Dear Ken,
> 
> I already modified the startup as suggested during testing, thanks! I
> swapped the postfix ocf resource to the amavisd systemd resource, as latter
> controls postfix startup also as it turns out and having both resouces in
> the mail-services group causes conflicts (postfix is detected as not
> running).
> 
> Still experiencing the same behaviour, killing amavisd returns an rc=7 for
> the monitoring operation on the "victim" node, this soungs logical, but the
> logs contain the same: amavisd and virtualip cannot run anywhere.
> 
> I made sure systemd is clean (amavisd = inactive, not running instead of
> failed) and also reset the failcount on all resources before killing
> amavisd.
> 
> How can I make sure to have a clean state for the resources beside above
> actions?

What you did is fine. I'm not sure why amavisd and virtualip can't run.
Can you show the output of "cibadmin -Q" when the cluster is in that state?
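
For example, right after killing amavisd, something like:

  # cibadmin -Q > cib-stuck.xml
  # crm_mon -1frA > crm_mon-stuck.txt

so the status section still shows the failed monitor and whatever
constraints are in effect at that moment.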

> Also note: when causing a filesystem resource to fail (e.g. with unmout),
> the failover happens successfully, all resources are started on the
> "survivor" node.
> 
> Best regards,
> Lorand
> 
> 
> On Wed, Mar 16, 2016 at 4:34 PM, Ken Gaillot  wrote:
> 
>> On 03/16/2016 05:49 AM, Lorand Kelemen wrote:
>>> Dear Ken,
>>>
>>> Thanks for the reply! I lowered migration-threshold to 1 and rearranged
>>> contraints like you suggested:
>>>
>>> Location Constraints:
>>> Ordering Constraints:
>>>   promote mail-clone then start fs-services (kind:Mandatory)
>>>   promote spool-clone then start fs-services (kind:Mandatory)
>>>   start fs-services then start network-services (kind:Mandatory)
>>
>> Certainly not a big deal, but I would change the above constraint to
>> start fs-services then start mail-services. The IP doesn't care whether
>> the filesystems are up yet or not, but postfix does.
>>
>>>   start network-services then start mail-services (kind:Mandatory)
>>> Colocation Constraints:
>>>   fs-services with spool-clone (score:INFINITY) (rsc-role:Started)
>>> (with-rsc-role:Master)
>>>   fs-services with mail-clone (score:INFINITY) (rsc-role:Started)
>>> (with-rsc-role:Master)
>>>   network-services with mail-services (score:INFINITY)
>>>   mail-services with fs-services (score:INFINITY)
>>>
>>> Now virtualip and postfix becomes stopped, I guess these are relevant
>> but I
>>> attach also full logs:
>>>
>>> Mar 16 11:38:06 [7419] HWJ-626.domain.localpengine: info:
>>> native_color: Resource postfix cannot run anywhere
>>> Mar 16 11:38:06 [7419] HWJ-626.domain.localpengine: info:
>>> native_color: Resource virtualip-1 cannot run anywhere
>>>
>>> Interesting, will try to play around with ordering - colocation, the
>>> solution must be in these settings...
>>>
>>> Best regards,
>>> Lorand
>>>
>>> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
>>> cib_perform_op:   Diff: --- 0.215.7 2
>>> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
>>> cib_perform_op:   Diff: +++ 0.215.8 (null)
>>> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
>>> cib_perform_op:   +  /cib:  @num_updates=8
>>> Mar 16 11:38:06 [7415] HWJ-626.domain.localcib: info:
>>> cib_perform_op:   ++
>>>
>> /cib/status/node_state[@id='1']/lrm[@id='1']/lrm_resources/lrm_resource[@id='postfix']:
>>>  >> operation_key="postfix_monitor_45000" operation="monitor"
>>> crm-debug-origin="do_update_resource" crm_feature_set="3.0.10"
>>> transition-key="86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> transition-magic="0:7;86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a"
>>> on_node="mail1" call-id="1333" rc-code="7"
>>> Mar 16 11:38:06 [7420] HWJ-626.domain.local   crmd: info:
>>> abort_transition_graph:   Transition aborted by postfix_monitor_45000
>>> 'create' on mail1: Inactive graph
>>> (magic=0:7;86:2962:0:ae755a85-c250-498f-9c94-ddd8a7e2788a, cib=0.215.8,
>>> source=process_graph_event:598, 1)
>>> Mar 16 11:38:06 [7420] HWJ-626.domain.local   crmd: info:
>>> update_failcount: Updating failcount for postfix on mail1 after
>> failed
>>> monitor: rc=7 (update=value++, time=1458124686)
>>
>> I don't think your constraints are causing problems now; the above
>> message indicates that the postfix resource failed. Postfix may not be
>> able to run anywhere because it's already failed on both nodes, and the
>> IP would be down because it has to be colocated with postfix, and
>> postfix can't run.
>>
>> The rc=7 above indicates that the postfix agent's monitor operation
>> returned 7, which is "not running". I'd check the logs for postfix errors.
>>
>>> Mar 16 11:38:06 [7420] HWJ-626.domain.local   crmd: info:
>>> process_graph_event:  Detected action (2962.86)
>>> postfix_monitor_45000.1333=not running: failed
>>> Mar 16 11:38:06 [7418] HWJ-626.domain.local  attrd: info:
>>> attrd_client_update:  Expanded 

Re: [ClusterLabs] Antw: Re: reproducible split brain

2016-03-19 Thread Christopher Harvey
If I ignore pacemaker's existence and just run corosync, corosync
disagrees about node membership in the situation presented in the first
email. While it's true that stonith just happens to quickly correct the
situation after it occurs, it still smells like a bug in the case where
corosync is used in isolation. Corosync is, after all, a membership and
total-ordering protocol, and the nodes in the cluster are unable to
agree on membership.

The Totem protocol specifies a ring_id in the token passed around a ring.
Since all but one of the 3 nodes have formed a new ring with a new id,
how is it that the single remaining node can survive in a ring with no
other members, passing a token with the old ring_id?

Are there network failure situations that can fool the Totem membership
protocol or is this an implementation problem? I don't see how it could
not be one or the other, and it's bad either way.
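
For what it's worth, the disagreement is easy to see while the partition
is in place by comparing each node's view, e.g. (exact output differs
between corosync versions):

  # corosync-cfgtool -s   # local ring status / ring id
  # crm_node -p           # members of this node's partition

I would expect the isolated node to keep reporting the old membership
while the other two report the new two-node ring.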

On Thu, Mar 17, 2016, at 02:08 PM, Digimer wrote:
> On 17/03/16 01:57 PM, vija ar wrote:
> > root file system is fine ...
> > 
> > but fencing is not a necessity a cluster shld function without it .. i
> > see the issue with corosync which has all been .. a inherent way of not
> > working neatly or smoothly ..
> 
> Absolutely wrong.
> 
> If you have a service that can run on both/all nodes at the same time
> without coordination, you don't need a cluster, just run your services
> everywhere.
> 
> If that's not the case, then you need fencing so that the (surviving)
> node(s) can be sure that they know where services are and are not
> running.
> 
> > for e.g. take an issue where the live node is hung in db cluster .. now
> > db perspective transactions r not happening and tht is fine as the node
> > is having some issue .. now there is no need to fence this hung node but
> > just to switch over to passive one .. but tht doesnt happens and fencing
> > takes place either by reboot or shut .. which further makes the DB dirty
> > or far more than tht in non-recoverable state which wouldnt have happen
> > if a normal switch to other node as in cluster would have happened ...
> > 
> > i see fencing is not a solution its only required to forcefully take
> > control which is not the case always
> > 
> > On Thu, Mar 17, 2016 at 12:49 PM, Ulrich Windl
> >  > > wrote:
> > 
> > >>> Christopher Harvey  schrieb am 16.03.2016 um 21:04
> > in Nachricht
> > <1458158684.122207.551267810.11f73...@webmail.messagingengine.com
> > 
> > >:
> > [...]
> > >> > Would stonith solve this problem, or does this look like a bug?
> > >>
> > >> It should, that is its job.
> > >
> > > is there some log I can enable that would say
> > > "ERROR: hey, I would use stonith here, but you have it disabled! your
> > > warranty is void past this point! do not pass go, do not file a bug"?
> > 
> > What should the kernel say during boot if the user has not defined a
> > root file system?
> > 
> > Maybe the "stonith-enabled=false" setting should be called either
> > "data-corruption-mode=true" or "hang-forever-on-error=true" ;-)
> > 
> > Regards,
> > Ulrich
> > 
> > 
> > 
> > ___
> > Users mailing list: Users@clusterlabs.org 
> > http://clusterlabs.org/mailman/listinfo/users
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> > 
> > 
> > 
> > 
> > ___
> > Users mailing list: Users@clusterlabs.org
> > http://clusterlabs.org/mailman/listinfo/users
> > 
> > Project Home: http://www.clusterlabs.org
> > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > Bugs: http://bugs.clusterlabs.org
> > 
> 
> 
> -- 
> Digimer
> Papers and Projects: https://alteeve.ca/w/
> What if the cure for cancer is trapped in the mind of a person without
> access to education?
> 
> ___
> Users mailing list: Users@clusterlabs.org
> http://clusterlabs.org/mailman/listinfo/users
> 
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

___
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [ClusterLabs] [DRBD-user] DRBD fencing issue on failover causes resource failure

2016-03-19 Thread Thomas Lamprecht


On 16.03.2016 18:51, Tim Walberg wrote:
> Is there a way to make this work properly without STONITH? I forgot to mention
> that both nodes are virtual machines (QEMU/KVM), which makes STONITH a minor
> challenge. Also, since these symptoms occur even under "pcs cluster standby",
> where STONITH *shouldn't* be invoked, I'm not sure if that's the entire 
> answer.
> 

There exist a lot of fence agents which make use of the hypervisor that
hosts the VMs, e.g. fence_pve for Proxmox VE virtual machines; VMware,
VirtualBox and Xen are also implemented, and libvirt should be too, but I
don't know for sure.

See:
https://github.com/ClusterLabs/fence-agents
https://fedorahosted.org/cluster/wiki/fence-agents

This is a fairly easy way for me to set up fencing, and I use it quite
often for tests.

I didn't set pacemaker up with such an agent but I see no problem which
could prevent that.

cheers,
Thomas

> 
> On 03/16/2016 13:34 -0400, Digimer wrote:
>>> On 16/03/16 01:17 PM, Tim Walberg wrote:
>>> > Having an issue on a newly built CentOS 7.2.1511 NFS cluster with DRBD
>>> > (drbd84-utils-8.9.5-1 with kmod-drbd84-8.4.7-1_1). At this point, the
>>> > resources consist of a cluster address, a DRBD device mirroring 
>>> between
>>> > the two cluster nodes, the file system, and the nfs-server resource. 
>>> The
>>> > resources all behave properly until an extended failover or outage.
>>> > 
>>> > I have tested failover in several ways ("pcs cluster standby", "pcs
>>> > cluster stop", "init 0", "init 6", "echo b > /proc/sysrq-trigger", 
>>> etc.)
>>> > and the symptoms are that, until the killed node is brought back into
>>> > the cluster, failover never seems to complete. The DRBD device appears
>>> > on the remaining node to be in a "Secondary/Unknown" state, and the
>>> > resources end up looking like:
>>> > 
>>> > # pcs status
>>> > Cluster name: nfscluster
>>> > Last updated: Wed Mar 16 12:05:33 2016  Last change: Wed Mar 
>>> 16
>>> > 12:04:46 2016 by root via cibadmin on nfsnode01
>>> > Stack: corosync
>>> > Current DC: nfsnode01 (version 1.1.13-10.el7_2.2-44eb2dd) - partition
>>> > with quorum
>>> > 2 nodes and 5 resources configured
>>> > 
>>> > Online: [ nfsnode01 ]
>>> > OFFLINE: [ nfsnode02 ]
>>> > 
>>> > Full list of resources:
>>> > 
>>> >  nfsVIP  (ocf::heartbeat:IPaddr2):   Started nfsnode01
>>> >  nfs-server (systemd:nfs-server):   Stopped
>>> >  Master/Slave Set: drbd_master [drbd_dev]
>>> >  Slaves: [ nfsnode01 ]
>>> >  Stopped: [ nfsnode02 ]
>>> >  drbd_fs   (ocf::heartbeat:Filesystem):Stopped
>>> > 
>>> > PCSD Status:
>>> >   nfsnode01: Online
>>> >   nfsnode02: Online
>>> > 
>>> > Daemon Status:
>>> >   corosync: active/enabled
>>> >   pacemaker: active/enabled
>>> >   pcsd: active/enabled
>>> > 
>>> > As soon as I bring the second node back online, the failover 
>>> completes.
>>> > But this is obviously not a good state, as an extended outage for any
>>> > reason on one node essentially kills the cluster services. There's
>>> > obviously something I've missed in configuring the resources, but I
>>> > haven't been able to pinpoint it yet.
>>> > 
>>> > Perusing the logs, it appears that, upon the initial failure, DRBD 
>>> does
>>> > in fact promote the drbd_master resource, but immediately after that,
>>> > pengine calls for it to be demoted for reasons I haven't been able to
>>> > determine yet, but seems to be tied to the fencing configuration. I 
>>> can
>>> > see that the crm-fence-peer.sh script is called, but it almost seems
>>> > like it's fencing the wrong node... Indeed, I do see that it adds a
>>> > -INFINITY location constraint for the surviving node, which would
>>> > explain the decision to demote the DRBD master.
>>> > 
>>> > My DRBD resource looks like this:
>>> > 
>>> > # cat /etc/drbd.d/drbd0.res
>>> > resource drbd0 {
>>> > 
>>> > protocol C;
>>> > startup { wfc-timeout 0; degr-wfc-timeout 120; }
>>> > 
>>> > disk {
>>> > on-io-error detach;
>>> > fencing resource-only;
>>> 
>>> This should be 'resource-and-stonith;', but alone won't do anything
>>> until pacemaker's stonith is working.
>>> 
>>> > }
>>> > 
>>> > handlers {
>>> > fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>>> > after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>>> > }
>>> > 
>>> > on nfsnode01 {
>>> > device /dev/drbd0;
>>> > disk /dev/vg_nfs/lv_drbd0;
>>> > meta-disk internal;
>>> > address 10.0.0.2:7788 ;
>>> > }
>>> > 
>>> > on 

Re: [ClusterLabs] Security with Corosync

2016-03-19 Thread Nikhil Utane
[root@node3 corosync]# corosync -v
Corosync Cluster Engine, version '1.4.7'
Copyright (c) 2006-2009 Red Hat, Inc.

So it is 1.x :(
When I began, I was following multiple tutorials and ended up installing
multiple packages. Let me try moving to corosync 2.0.
I suppose it should be as easy as doing a yum install.
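
Once on corosync 2.x, my understanding is that the relevant corosync.conf
bits would look something like this (a sketch only, using my cluster
name):

  totem {
      version: 2
      cluster_name: mycluster
      crypto_cipher: aes256
      crypto_hash: sha1
      ...
  }

plus the same /etc/corosync/authkey (generated with corosync-keygen)
copied to every node.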

On Wed, Mar 16, 2016 at 10:29 PM, Jan Friesse  wrote:

> Nikhil Utane napsal(a):
>
>> Honza,
>>
>> In my CIB I see the infrastructure being set to cman. pcs status is
>> reporting the same.
>>
>> > name="cluster-infrastructure" value="*cman*"/>
>>
>> [root@node3 corosync]# pcs status
>> Cluster name: mycluster
>> Last updated: Wed Mar 16 16:57:46 2016
>> Last change: Wed Mar 16 16:56:23 2016
>> Stack: *cman*
>>
>> But corosync also is running fine.
>>
>> [root@node2 nikhil]# pcs status nodes corosync
>> Corosync Nodes:
>>   Online: node2 node3
>>   Offline: node1
>>
>> I did a cibadmin query and replace from cman to corosync but it doesn't
>> change (even though replace operation succeeds)
>> I read that CMAN internally uses corosync but in corosync 2 CMAN support
>> is
>> removed.
>> Totally confused. Please help.
>>
>
> Best start is to find out what versions you are using? If you have
> corosync 1.x and really using cman (what is highly probable), corosync.conf
> is completely ignored and instead cluster.conf (/etc/cluster/cluster.conf)
> is used. cluster.conf uses cman keyfile and if this is not provided,
> encryption key is simply cluster name. This is probably reason why
> everything worked when you haven't had authkey on one of nodes.
>
> Honza
>
>
>
>> -Thanks
>> Nikhil
>>
>> On Mon, Mar 14, 2016 at 1:19 PM, Jan Friesse  wrote:
>>
>> Nikhil Utane napsal(a):
>>>
>>> Follow-up question.
 I noticed that secauth was turned off in my corosync.conf file. I
 enabled
 it on all 3 nodes and restarted the cluster. Everything was working
 fine.
 However I just noticed that I had forgotten to copy the authkey to one
 of
 the node. It is present on 2 nodes but not the third. And I did a
 failover
 and the third node took over without any issue.
 How is the 3rd node participating in the cluster if it doesn't have the
 authkey?


>>> It's just not possible. If you would enabled secauth correctly and you
>>> didn't have /etc/corosync/authkey, message like "Could not open
>>> /etc/corosync/authkey: No such file or directory" would show up. There
>>> are
>>> few exceptions:
>>> - you have changed totem.keyfile with file existing on all nodes
>>> - you are using totem.key then everything works as expected (it has
>>> priority over default authkey file but not over totem.keyfile)
>>> - you are using COROSYNC_TOTEM_AUTHKEY_FILE env with file existing on all
>>> nodes
>>>
>>> Regards,
>>>Honza
>>>
>>>
>>>
>>> On Fri, Mar 11, 2016 at 4:15 PM, Nikhil Utane <
 nikhil.subscri...@gmail.com>
 wrote:

 Perfect. Thanks for the quick response Honza.

>
> Cheers
> Nikhil
>
> On Fri, Mar 11, 2016 at 4:10 PM, Jan Friesse 
> wrote:
>
> Nikhil,
>
>>
>> Nikhil Utane napsal(a):
>>
>> Hi,
>>
>>>
>>> I changed some configuration and captured packets. I can see that the
>>> data
>>> is already garbled and not in the clear.
>>> So does corosync already have this built-in?
>>> Can somebody provide more details as to what all security features
>>> are
>>> incorporated?
>>>
>>>
>>> See man page corosync.conf(5) options crypto_hash, crypto_cipher (for
>> corosync 2.x) and potentially secauth (for coorsync 1.x and 2.x).
>>
>> Basically corosync by default uses aes256 for encryption and sha1 for
>> hmac authentication.
>>
>> Pacemaker uses corosync cpg API so as long as encryption is enabled in
>> the corosync.conf, messages interchanged between nodes are encrypted.
>>
>> Regards,
>> Honza
>>
>>
>> -Thanks
>>
>>> Nikhil
>>>
>>> On Fri, Mar 11, 2016 at 11:38 AM, Nikhil Utane <
>>> nikhil.subscri...@gmail.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>>
 Does corosync provide mechanism to secure the communication path
 between
 nodes of a cluster?
 I would like all the data that gets exchanged between all nodes to
 be
 encrypted.

 A quick google threw up this link:
 https://github.com/corosync/corosync/blob/master/SECURITY

 Can I make use of it with pacemaker?

 -Thanks
 Nikhil





>>> ___
>>> Users mailing list: Users@clusterlabs.org
>>> http://clusterlabs.org/mailman/listinfo/users
>>>
>>> Project Home: http://www.clusterlabs.org
>>> Getting started:
>>> 

Re: [ClusterLabs] [DRBD-user] DRBD fencing issue on failover causes resource failure

2016-03-19 Thread Tim Walberg
Is there a way to make this work properly without STONITH? I forgot to mention
that both nodes are virtual machines (QEMU/KVM), which makes STONITH a minor
challenge. Also, since these symptoms occur even under "pcs cluster standby",
where STONITH *shouldn't* be invoked, I'm not sure if that's the entire answer.


On 03/16/2016 13:34 -0400, Digimer wrote:
>>  On 16/03/16 01:17 PM, Tim Walberg wrote:
>>  > Having an issue on a newly built CentOS 7.2.1511 NFS cluster with DRBD
>>  > (drbd84-utils-8.9.5-1 with kmod-drbd84-8.4.7-1_1). At this point, the
>>  > resources consist of a cluster address, a DRBD device mirroring 
>> between
>>  > the two cluster nodes, the file system, and the nfs-server resource. 
>> The
>>  > resources all behave properly until an extended failover or outage.
>>  > 
>>  > I have tested failover in several ways ("pcs cluster standby", "pcs
>>  > cluster stop", "init 0", "init 6", "echo b > /proc/sysrq-trigger", 
>> etc.)
>>  > and the symptoms are that, until the killed node is brought back into
>>  > the cluster, failover never seems to complete. The DRBD device appears
>>  > on the remaining node to be in a "Secondary/Unknown" state, and the
>>  > resources end up looking like:
>>  > 
>>  > # pcs status
>>  > Cluster name: nfscluster
>>  > Last updated: Wed Mar 16 12:05:33 2016  Last change: Wed Mar 
>> 16
>>  > 12:04:46 2016 by root via cibadmin on nfsnode01
>>  > Stack: corosync
>>  > Current DC: nfsnode01 (version 1.1.13-10.el7_2.2-44eb2dd) - partition
>>  > with quorum
>>  > 2 nodes and 5 resources configured
>>  > 
>>  > Online: [ nfsnode01 ]
>>  > OFFLINE: [ nfsnode02 ]
>>  > 
>>  > Full list of resources:
>>  > 
>>  >  nfsVIP  (ocf::heartbeat:IPaddr2):   Started nfsnode01
>>  >  nfs-server (systemd:nfs-server):   Stopped
>>  >  Master/Slave Set: drbd_master [drbd_dev]
>>  >  Slaves: [ nfsnode01 ]
>>  >  Stopped: [ nfsnode02 ]
>>  >  drbd_fs   (ocf::heartbeat:Filesystem):Stopped
>>  > 
>>  > PCSD Status:
>>  >   nfsnode01: Online
>>  >   nfsnode02: Online
>>  > 
>>  > Daemon Status:
>>  >   corosync: active/enabled
>>  >   pacemaker: active/enabled
>>  >   pcsd: active/enabled
>>  > 
>>  > As soon as I bring the second node back online, the failover 
>> completes.
>>  > But this is obviously not a good state, as an extended outage for any
>>  > reason on one node essentially kills the cluster services. There's
>>  > obviously something I've missed in configuring the resources, but I
>>  > haven't been able to pinpoint it yet.
>>  > 
>>  > Perusing the logs, it appears that, upon the initial failure, DRBD 
>> does
>>  > in fact promote the drbd_master resource, but immediately after that,
>>  > pengine calls for it to be demoted for reasons I haven't been able to
>>  > determine yet, but seems to be tied to the fencing configuration. I 
>> can
>>  > see that the crm-fence-peer.sh script is called, but it almost seems
>>  > like it's fencing the wrong node... Indeed, I do see that it adds a
>>  > -INFINITY location constraint for the surviving node, which would
>>  > explain the decision to demote the DRBD master.
>>  > 
>>  > My DRBD resource looks like this:
>>  > 
>>  > # cat /etc/drbd.d/drbd0.res
>>  > resource drbd0 {
>>  > 
>>  > protocol C;
>>  > startup { wfc-timeout 0; degr-wfc-timeout 120; }
>>  > 
>>  > disk {
>>  > on-io-error detach;
>>  > fencing resource-only;
>>  
>>  This should be 'resource-and-stonith;', but alone won't do anything
>>  until pacemaker's stonith is working.
>>  
>>  > }
>>  > 
>>  > handlers {
>>  > fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
>>  > after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
>>  > }
>>  > 
>>  > on nfsnode01 {
>>  > device /dev/drbd0;
>>  > disk /dev/vg_nfs/lv_drbd0;
>>  > meta-disk internal;
>>  > address 10.0.0.2:7788 ;
>>  > }
>>  > 
>>  > on nfsnode02 {
>>  > device /dev/drbd0;
>>  > disk /dev/vg_nfs/lv_drbd0;
>>  > meta-disk internal;
>>  > address 10.0.0.3:7788 ;
>>  > }
>>  > }
>>  > 
>>  > If I comment out the three lines having to do with fencing, the 
>> failover
>>  > works properly. But I'd prefer to have the fencing there in the odd
>>  > chance that we end up with a split brain instead of just a node 
>> outage...
>>  > 
>>  > And, here's "pcs config --full":
>>  > 
>>