Re: [Pacemaker] unknown third node added to a 2 node cluster?

2014-10-22 Thread Brian J. Murrell (brian)
On Mon, 2014-10-13 at 12:51 +1100, Andrew Beekhof wrote:
> 
> Even the same address can be a problem. That brief window where things were 
> getting renewed can screw up corosync.

But as I proved, there was no renewal at all during the period of this
entire pacemaker run, so the use of DHCP here is a red-herring and does
not explain the observed behaviour.

> Never ever use dhcp for a cluster node. Ever. Really, never.

Fair enough.  But since this was not the cause of this problem, it's
still unexplained.  Is it a bug in pacemaker that it doesn't handle this
mysterious third node's appearance/disappearance and that it fouls up
the cluster?

> Yes. That is what nodeid's are calculated from.
> Different nodeid == different address

So your theory is that corosync on one of the nodes momentarily decided
to change which interface it was binding to and ...

> localhost is the most common one

... bound to localhost?  If so, I guess I should take this to the
corosync list.
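
(For the record, one way to take the guesswork out of which interface
corosync binds would be to pin it in corosync.conf.  A minimal sketch of
just the relevant fragment, assuming the ring 0 network here were
10.14.80.0 -- adjust bindnetaddr to the real cluster subnet; this is not
a complete totem section:

totem {
    interface {
        ringnumber: 0
        bindnetaddr: 10.14.80.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}

Even then, corosync 1.x has been known to fall back to binding 127.0.0.1
if the address covered by bindnetaddr disappears, however briefly, which
would line up with the localhost theory above.)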

b.






Re: [Pacemaker] unknown third node added to a 2 node cluster?

2014-10-10 Thread Brian J. Murrell (brian)
On Wed, 2014-10-08 at 12:39 +1100, Andrew Beekhof wrote:
> On 8 Oct 2014, at 2:09 am, Brian J. Murrell (brian) 
>  wrote:
> 
> > Given a 2 node pacemaker-1.1.10-14.el6_5.3 cluster with nodes "node5"
> > and "node6" I saw an "unknown" third node being added to the cluster,
> > but only on node5:
> 
> Is either node using dhcp?

Yes, they both are.  The server is the ISC DHCP server (on EL6) and the
address pool is much larger than the node count.  That is all just to say
that the DHCP server serving these nodes abides by the DHCP RFC's
recommendation to let clients continue using addresses they have already
been assigned when making a renewal request.  And indeed, it gives them
the same address they had previously even after a lease expiry, as long
as the pool is not constrained and the address is not needed to satisfy a
request from a different machine.

> I would guess node6 got a new IP address

These nodes are using the ISC DHCP client.  That DHCP client logs in the
same log (/var/log/messages) as was posted in my prior message when it
renews a lease with messages such as:

Oct 10 05:56:19 node6 dhclient[1026]: DHCPREQUEST on eth0 to 10.14.80.6 port 67 
(xid=0x4f11c576)
Oct 10 05:56:19 node6 dhclient[1026]: DHCPACK from 10.14.80.6 (xid=0x4f11c576)
Oct 10 05:56:20 node6 dhclient[1026]: bound to 10.14.82.141 -- renewal in 8546 
seconds.

In the logs that I pasted from in my previous message, such messages
don't even exist because the nodes are not left up long enough to even
get to a lease expiry.  These are test nodes and so are rebooted
frequently.

TL;DR: I am quite certain the node did not get a new/different address.
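
(For anyone wanting to double-check the same thing on their own nodes, a
quick sketch, assuming dhclient logs to /var/log/messages as shown above:

grep -E 'dhclient.*(DHCPREQUEST|DHCPACK|bound to)' /var/log/messages

If nothing shows up between the corosync/pacemaker start and the point
where the membership went strange, no renewal happened during the run.)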

> (or that corosync decided to bind to a different one)

Bind to a different what?  Address?  As in binding to an address that
was not even configured on the machine?

b.






[Pacemaker] unknown third node added to a 2 node cluster?

2014-10-07 Thread Brian J. Murrell (brian)
Given a 2 node pacemaker-1.1.10-14.el6_5.3 cluster with nodes "node5"
and "node6" I saw an "unknown" third node being added to the cluster,
but only on node5:

Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] notice: pcmk_peer_update: 
Transitional membership event on ring 12: memb=2, new=0, lost=0
Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: memb: 
node6 3713011210
Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: memb: 
node5 3729788426
Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] notice: pcmk_peer_update: 
Stable membership event on ring 12: memb=3, new=1, lost=0
Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: update_member: Creating 
entry for node 2085752330 born on 12
Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: update_member: Node 
2085752330/unknown is now: member
Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: NEW:  
.pending. 2085752330
Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: MEMB: 
node6 3713011210
Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: MEMB: 
node5 3729788426
Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: pcmk_peer_update: MEMB: 
.pending. 2085752330

Above is where this third node seems to appear.
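
(An aside that may help anyone else staring at these numbers: the plugin's
nodeids can be decoded back into addresses.  A sketch, assuming the nodeid
is just the ring 0 IPv4 address read as a host-byte-order integer on a
little-endian box -- an assumption, but it is consistent with other posts
in this archive where a nodeid and its r(0) address appear together:

nodeid=2085752330
printf '%d.%d.%d.%d\n' \
    $((  nodeid        & 0xff )) \
    $(( (nodeid >> 8)  & 0xff )) \
    $(( (nodeid >> 16) & 0xff )) \
    $(( (nodeid >> 24) & 0xff ))
# prints 10.14.82.124 for the value above

Decoding the mystery nodeid this way would at least point at the address
the phantom member was using.)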

Sep 18 22:52:16 node5 corosync[17321]:   [pcmk  ] info: 
send_member_notification: Sending membership update 12 to 2 children
Sep 18 22:52:16 node5 corosync[17321]:   [TOTEM ] A processor joined or left 
the membership and a new membership was formed.
Sep 18 22:52:16 node5 cib[17371]:   notice: crm_update_peer_state: 
plugin_handle_membership: Node (null)[2085752330] - state is now member (was 
(null))
Sep 18 22:52:16 node5 crmd[17376]:   notice: crm_update_peer_state: 
plugin_handle_membership: Node (null)[2085752330] - state is now member (was 
(null))
Sep 18 22:52:16 node5 crmd[17376]:   notice: do_state_transition: State 
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL 
origin=abort_transition_graph ]
Sep 18 22:52:16 node5 crmd[17376]:error: join_make_offer: No recipient for 
welcome message
Sep 18 22:52:16 node5 crmd[17376]:  warning: do_state_transition: Only 2 of 3 
cluster nodes are eligible to run resources - continue 0
Sep 18 22:52:16 node5 attrd[17374]:   notice: attrd_local_callback: Sending 
full refresh (origin=crmd)
Sep 18 22:52:16 node5 attrd[17374]:   notice: attrd_trigger_update: Sending 
flush op to all hosts for: probe_complete (true)
Sep 18 22:52:16 node5 stonith-ng[17372]:   notice: unpack_config: On loss of 
CCM Quorum: Ignore
Sep 18 22:52:16 node5 cib[17371]:   notice: cib:diff: Diff: --- 0.31.2
Sep 18 22:52:16 node5 cib[17371]:   notice: cib:diff: Diff: +++ 0.32.1 
4a679012144955c802557a39707247a2
Sep 18 22:52:16 node5 cib[17371]:   notice: cib:diff: --   
Sep 18 22:52:16 node5 cib[17371]:   notice: cib:diff: ++   
Sep 18 22:52:16 node5 pengine[17375]:   notice: unpack_config: On loss of CCM 
Quorum: Ignore
Sep 18 22:52:16 node5 pengine[17375]:   notice: LogActions: Start   
res1#011(node5)
Sep 18 22:52:16 node5 crmd[17376]:   notice: te_rsc_command: Initiating action 
7: start res1_start_0 on node5 (local)
Sep 18 22:52:16 node5 pengine[17375]:   notice: process_pe_message: Calculated 
Transition 22: /var/lib/pacemaker/pengine/pe-input-165.bz2
Sep 18 22:52:16 node5 stonith-ng[17372]:   notice: stonith_device_register: 
Device 'st-fencing' already existed in device list (1 active devices)

On node6 at the same time the following was in the log:

Sep 18 22:52:15 node6 corosync[11178]:   [TOTEM ] Incrementing problem counter 
for seqid 5 iface 10.128.0.221 to [1 of 10]
Sep 18 22:52:16 node6 corosync[11178]:   [TOTEM ] Incrementing problem counter 
for seqid 8 iface 10.128.0.221 to [2 of 10]
Sep 18 22:52:17 node6 corosync[11178]:   [TOTEM ] Decrementing problem counter 
for iface 10.128.0.221 to [1 of 10]
Sep 18 22:52:19 node6 corosync[11178]:   [TOTEM ] ring 1 active with no faults

Any idea what's going on here?

Cheers,
b.






[Pacemaker] another "node rebooting too quickly" bug?

2014-04-24 Thread Brian J. Murrell
Hi,

As was previously discussed, there is a bug in the handling of a STONITH
if a node reboots too quickly.  I had a different kind of failure that I
suspect is the same kind of problem, just with a different symptom.

The situation is a two node cluster with two resources plus a fencing
resource.  Each node is running one of the two resources and one is
running the fencing resource.  One of the nodes is killed.  The
remaining node should notice this, STONITH it and take over the
resource.  What actually happens is all of the above except that, rather
than being taken over, the resource is restarted on the node that was
killed.

Here's the log from the surviving node -- the one that issues the
STONITH and *should* take over the resource:

Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: 
Transitional membership event on ring 20: memb=1, new=0, lost=1
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: memb: 
node2 2085752330
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: lost: 
node1 2068975114
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: 
Stable membership event on ring 20: memb=1, new=0, lost=0
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: pcmk_peer_update: MEMB: 
node2 2085752330
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: 
ais_mark_unseen_peer_dead: Node node1 was not seen in the previous transition
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: update_member: Node 
2068975114/node1 is now: lost
Apr 22 19:47:26 node2 corosync[3204]:   [pcmk  ] info: 
send_member_notification: Sending membership update 20 to 2 children
Apr 22 19:47:26 node2 corosync[3204]:   [TOTEM ] A processor joined or left the 
membership and a new membership was formed.

node1 is gone

Apr 22 19:47:26 node2 corosync[3204]:   [CPG   ] chosen downlist: sender r(0) 
ip(10.14.82.124) r(1) ip(10.128.2.124) ; members(old:2 left:1)
Apr 22 19:47:26 node2 corosync[3204]:   [MAIN  ] Completed service 
synchronization, ready to provide service.
Apr 22 19:47:26 node2 crmd[3255]:   notice: plugin_handle_membership: 
Membership 20: quorum lost
Apr 22 19:47:26 node2 crmd[3255]:   notice: crm_update_peer_state: 
plugin_handle_membership: Node node1[2068975114] - state is now lost (was 
member)
Apr 22 19:47:26 node2 crmd[3255]:  warning: match_down_event: No match for 
shutdown action on node1
Apr 22 19:47:26 node2 crmd[3255]:   notice: peer_update_callback: 
Stonith/shutdown of node1 not matched

Is this "Stonith/shutdown of node1 not matched" because the fencing
resource was running on node1 but now it's gone?

Apr 22 19:47:26 node2 cib[3250]:   notice: plugin_handle_membership: Membership 
20: quorum lost
Apr 22 19:47:26 node2 cib[3250]:   notice: crm_update_peer_state: 
plugin_handle_membership: Node node1[2068975114] - state is now lost (was 
member)
Apr 22 19:47:26 node2 crmd[3255]:   notice: do_state_transition: State 
transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL 
origin=check_join_state ]
Apr 22 19:47:26 node2 attrd[3253]:   notice: attrd_local_callback: Sending full 
refresh (origin=crmd)
Apr 22 19:47:26 node2 attrd[3253]:   notice: attrd_trigger_update: Sending 
flush op to all hosts for: probe_complete (true)
Apr 22 19:47:26 node2 pengine[3254]:   notice: unpack_config: On loss of CCM 
Quorum: Ignore
Apr 22 19:47:26 node2 pengine[3254]:  warning: pe_fence_node: Node node1 will 
be fenced because the node is no longer part of the cluster
Apr 22 19:47:26 node2 pengine[3254]:  warning: determine_online_status: Node 
node1 is unclean
Apr 22 19:47:26 node2 pengine[3254]:  warning: custom_action: Action 
st-fencing_stop_0 on node1 is unrunnable (offline)
Apr 22 19:47:26 node2 pengine[3254]:  warning: custom_action: Action 
resource1_stop_0 on node1 is unrunnable (offline)

Resources from node1 unrunnable because it's AWOL!

Apr 22 19:47:26 node2 pengine[3254]:  warning: stage6: Scheduling Node node1 
for STONITH

So kill it!

Apr 22 19:47:26 node2 pengine[3254]:   notice: LogActions: Move
st-fencing#011(Started node1 -> node2)
Apr 22 19:47:26 node2 pengine[3254]:   notice: LogActions: Move
resource1#011(Started node1 -> node2)

So here are the resources that were on node1 being scheduled to move to
node2.

Apr 22 19:47:26 node2 crmd[3255]:   notice: te_fence_node: Executing reboot 
fencing operation (13) on node1 (timeout=6)
Apr 22 19:47:26 node2 stonith-ng[3251]:   notice: handle_request: Client 
crmd.3255.0a3d203a wants to fence (reboot) 'node1' with device '(any)'
Apr 22 19:47:26 node2 stonith-ng[3251]:   notice: initiate_remote_stonith_op: 
Initiating remote operation reboot for node1: 
7660cf0e-10da-4e96-b3bd-4b415c72201c (0)

The actual killing of node1.

Apr 22 19:47:26 node2 pengine[3254]:  warning: process_pe_message: Calculated 
Transition 23: /var/lib/pacemaker/pengine/pe-warn-0.bz2
Apr 22 19:48:22 node2 corosync[3204]:   [pcmk  ] notice: pcmk_peer_update: 
Transitional membership event on ring 24: memb=1, new

Re: [Pacemaker] Node stuck in pending state

2014-04-10 Thread Brian J. Murrell
On Thu, 2014-04-10 at 10:04 +1000, Andrew Beekhof wrote: 
> 
> Brian: the detective work above is highly appreciated

NP.  I feel like I am getting better at reading these logs and can
provide some more detailed dissection of them.  And am happy to do so to
help get to the bottom of things.  :-)

> Essentially the node is returning "too fast" (specifically, before the 
> fencing notification arrives) causing pacemaker to forget the node is up and 
> healthy.
> 
> The fix for this is https://github.com/beekhof/pacemaker/commit/e777b17 and 
> is present in 1.1.11

Is there any chance of getting that applied to RHEL 6.5, "quickly"?

Much thanks for your looking into this one Andrew.

b.





Re: [Pacemaker] Node stuck in pending state

2014-04-09 Thread Brian J. Murrell
On Tue, 2014-04-08 at 17:29 -0400, Digimer wrote: 
> Looks like your fencing (stonith) failed.

Where?  If I'm reading the logs correctly, it looks like stonith worked.
Here's the stonith:

Apr  8 09:53:21 lotus-4vm6 stonith-ng[2492]:   notice: log_operation: Operation 
'reboot' [3306] (call 2 from crmd.2496) for host 'lotus-4vm5' with device 
'st-fencing' returned: 0 (OK)

and then corosync reporting that the node left the cluster:

Apr  8 09:53:26 lotus-4vm6 corosync[2442]:   [pcmk  ] info: pcmk_peer_update: 
lost: lotus-4vm5 3176140298

Yes?  Or am I misunderstanding that message?

Doesn't this below also further indicate that the vm5 node did actually
get stonithed?

Apr  8 09:53:26 lotus-4vm6 corosync[2442]:   [pcmk  ] info: 
ais_mark_unseen_peer_dead: Node lotus-4vm5 was not seen in the previous 
transition
Apr  8 09:53:26 lotus-4vm6 corosync[2442]:   [pcmk  ] info: update_member: Node 
3176140298/lotus-4vm5 is now: lost

crmd and cib also seem to be noticing the node has gone away too, don't
they here:

Apr  8 09:53:26 lotus-4vm6 cib[2491]:   notice: plugin_handle_membership: 
Membership 20: quorum lost
Apr  8 09:53:26 lotus-4vm6 cib[2491]:   notice: crm_update_peer_state: 
plugin_handle_membership: Node lotus-4vm5[3176140298] - state is now lost (was 
member)
Apr  8 09:53:26 lotus-4vm6 crmd[2496]:   notice: plugin_handle_membership: 
Membership 20: quorum lost
Apr  8 09:53:26 lotus-4vm6 crmd[2496]:   notice: crm_update_peer_state: 
plugin_handle_membership: Node lotus-4vm5[3176140298] - state is now lost (was 
member)

And then the node comes back:

Apr  8 09:54:04 lotus-4vm6 corosync[2442]:   [pcmk  ] notice: pcmk_peer_update: 
Transitional membership event on ring 24: memb=1, new=0, lost=0
Apr  8 09:54:04 lotus-4vm6 corosync[2442]:   [pcmk  ] info: pcmk_peer_update: 
memb: lotus-4vm6 3192917514
Apr  8 09:54:04 lotus-4vm6 corosync[2442]:   [pcmk  ] notice: pcmk_peer_update: 
Stable membership event on ring 24: memb=2, new=1, lost=0
Apr  8 09:54:04 lotus-4vm6 corosync[2442]:   [pcmk  ] info: update_member: Node 
3176140298/lotus-4vm5 is now: member
Apr  8 09:54:04 lotus-4vm6 corosync[2442]:   [pcmk  ] info: pcmk_peer_update: 
NEW:  lotus-4vm5 3176140298
Apr  8 09:54:04 lotus-4vm6 corosync[2442]:   [pcmk  ] info: pcmk_peer_update: 
MEMB: lotus-4vm5 3176140298

And now crmd realizes the node is back:

Apr  8 09:54:04 lotus-4vm6 crmd[2496]:   notice: crm_update_peer_state: 
plugin_handle_membership: Node lotus-4vm5[3176140298] - state is now member 
(was lost)

As well as cib:

Apr  8 09:54:04 lotus-4vm6 cib[2491]:   notice: crm_update_peer_state: 
plugin_handle_membership: Node lotus-4vm5[3176140298] - state is now member 
(was lost)

And stonith-ng and crmd reports successful reboot:

Apr  8 09:54:04 lotus-4vm6 stonith-ng[2492]:   notice: remote_op_done: 
Operation reboot of lotus-4vm5 by lotus-4vm6 for 
crmd.2496-ZBdUr1hrI04s+xCAc1R/N1ez/nohh...@public.gmane.org:
 OK
Apr  8 09:54:04 lotus-4vm6 crmd[2496]:   notice: tengine_stonith_callback: 
Stonith operation 2/13:0:0:f325afae-64b0-4812-a897-70556ab1e806: OK (0)
Apr  8 09:54:04 lotus-4vm6 crmd[2496]:   notice: tengine_stonith_notify: Peer 
lotus-4vm5 was terminated (reboot) by lotus-4vm6 for lotus-4vm6: OK 
(ref=ae82b411-b07a-4235-be55-5a30a00b323b) by client crmd.2496

But all of a sudden, crmd reports the node is "lost" again?

Apr  8 09:54:04 lotus-4vm6 crmd[2496]:   notice: crm_update_peer_state: 
send_stonith_update: Node lotus-4vm5[3176140298] - state is now lost (was 
member)

But why?

Not surprising that we get these messages (below) if crmd thinks it was
suddenly "lost" (when it was in fact not according to the log for vm5:
)

Apr  8 09:54:11 lotus-4vm6 crmd[2496]:  warning: crmd_cs_dispatch: Recieving 
messages from a node we think is dead: lotus-4vm5[-1118826998]
Apr  8 09:54:31 lotus-4vm6 crmd[2496]:   notice: do_election_count_vote: 
Election 2 (current: 2, owner: lotus-4vm5): Processed vote from lotus-4vm5 
(Peer is not part of our cluster)

So I think the question is, why did crmd suddenly believe the node to be
"lost" when there was no evidence that it was lost?

b.





Re: [Pacemaker] error: send_cpg_message: Sending message via cpg FAILED: (rc=6) Try again

2014-02-06 Thread Brian J. Murrell (brian)
On Thu, 2014-02-06 at 10:42 -0500, Brian J. Murrell (brian) wrote:
> On Wed, 2014-01-08 at 13:30 +1100, Andrew Beekhof wrote:
> > What version of pacemaker?
> 
> Most recently I have been seeing this in 1.1.10 as shipped by RHEL6.5.

Doh!  Somebody did a test run that had not been updated to use the
newest pacemaker.  So disregard this until I do see one using 1.1.10,
which I have not so far, so hopefully I never will.  :-)

Sorry for the noise.

b.






Re: [Pacemaker] error: send_cpg_message: Sending message via cpg FAILED: (rc=6) Try again

2014-02-06 Thread Brian J. Murrell (brian)
On Wed, 2014-01-08 at 13:30 +1100, Andrew Beekhof wrote:
> What version of pacemaker?

Most recently I have been seeing this in 1.1.10 as shipped by RHEL6.5.

> On 10 Dec 2013, at 4:40 am, Brian J. Murrell 
>  wrote:
> 

I didn't seem to get a response to any of the below questions.  I was
hoping the answers to them would help me gather as much data as possible
to try to diagnose this issue:

> > Would that same information be available in /var/log/messages if I have
> > configured corosync such as:
> > 
> > logging {
> >fileline: off
> >to_stderr: no
> >to_logfile: no
> >to_syslog: yes
> >logfile: /var/log/cluster/corosync.log
> >debug: off
> >timestamp: on
> >logger_subsys {
> >subsys: AMF
> >debug: off
> >}
> > }
> > 
> > If so, then the log snippet I posted in the prior message includes all
> > that corosync had to report.  Should I increase the amount of logging?
> > Any suggestions on an appropriate amount/flags, etc.?

> > corosync-1.4.1-15.el6_4.1.x86_64 as shipped by RH in EL6.
> > 
> > Is this new enough?  I know 2.x is also available but I don't think RH
> > is shipping that yet.  Hopefully their 1.4.1 is still supported.

Cheers,
b.






Re: [Pacemaker] crm_resource -L not trustable right after restart

2014-01-21 Thread Brian J. Murrell (brian)
On Thu, 2014-01-16 at 14:49 +1100, Andrew Beekhof wrote:
> 
> What crm_mon are you looking at?
> I see stuff like:
> 
>  virt-fencing (stonith:fence_xvm):Started rhos4-node3 
>  Resource Group: mysql-group
>  mysql-vip(ocf::heartbeat:IPaddr2):   Started rhos4-node3 
>  mysql-fs (ocf::heartbeat:Filesystem):Started rhos4-node3 
>  mysql-db (ocf::heartbeat:mysql): Started rhos4-node3 

Yes, you are right.  I couldn't see the forest for the trees.

I initially was optimistic about crm_mon being more truthful than
crm_resource but it turns out it is not.

Take for example these commands to set a constraint and start a resource
(which has already been defined at this point):

[21/Jan/2014:13:46:40] cibadmin -o constraints -C -X ''
[21/Jan/2014:13:46:41] cibadmin -o constraints -C -X ''
[21/Jan/2014:13:46:42] crm_resource -r 'res1' -p target-role -m -v 'Started'

and then these repeated calls to crm_mon -1 on node5:

[21/Jan/2014:13:46:42] crm_mon -1
Last updated: Tue Jan 21 13:46:42 2014
Last change: Tue Jan 21 13:46:42 2014 via crm_resource on node5
Stack: openais
Current DC: node5 - partition with quorum
Version: 1.1.10-14.el6_5.1-368c726
2 Nodes configured
2 Resources configured


Online: [ node5 node6 ]

 st-fencing (stonith:fence_product):Started node5 
 res1   (ocf::product:Target):  Started node6 

[21/Jan/2014:13:46:42] crm_mon -1
Last updated: Tue Jan 21 13:46:42 2014
Last change: Tue Jan 21 13:46:42 2014 via crm_resource on node5
Stack: openais
Current DC: node5 - partition with quorum
Version: 1.1.10-14.el6_5.1-368c726
2 Nodes configured
2 Resources configured


Online: [ node5 node6 ]

 st-fencing (stonith:fence_product):Started node5 
 res1   (ocf::product:Target):  Started node6 

[21/Jan/2014:13:46:49] crm_mon -1 -r
Last updated: Tue Jan 21 13:46:49 2014
Last change: Tue Jan 21 13:46:42 2014 via crm_resource on node5
Stack: openais
Current DC: node5 - partition with quorum
Version: 1.1.10-14.el6_5.1-368c726
2 Nodes configured
2 Resources configured


Online: [ node5 node6 ]

Full list of resources:

 st-fencing (stonith:fence_product):Started node5 
 res1   (ocf::product:Target):  Started node5 

The first two are not correct, showing the resource started on node6
when it was actually started on node5.  Finally, 7 seconds later, it is
reporting correctly.  The logs on node{5,6} bear this out.  The resource
was actually only ever started on node5 and never on node6.

Here's the log for node5:

Jan 21 13:42:00 node5 pacemaker: Starting Pacemaker Cluster Manager
Jan 21 13:42:00 node5 pacemakerd[8684]:   notice: main: Starting Pacemaker 
1.1.10-14.el6_5.1 (Build: 368c726):  generated-manpages agent-manpages 
ascii-docs publican-docs ncurses libqb-logging libqb-ipc nagios  
corosync-plugin cman
Jan 21 13:42:00 node5 pacemakerd[8684]:   notice: get_node_name: Defaulting to 
uname -n for the local classic openais (with plugin) node name
Jan 21 13:42:00 node5 stonith-ng[8691]:   notice: crm_cluster_connect: 
Connecting to cluster infrastructure: classic openais (with plugin)
Jan 21 13:42:00 node5 cib[8690]:   notice: main: Using new config location: 
/var/lib/pacemaker/cib
Jan 21 13:42:00 node5 cib[8690]:  warning: retrieveCib: Cluster configuration 
not found: /var/lib/pacemaker/cib/cib.xml
Jan 21 13:42:00 node5 cib[8690]:  warning: readCibXmlFile: Primary 
configuration corrupt or unusable, trying backups
Jan 21 13:42:00 node5 cib[8690]:  warning: readCibXmlFile: Continuing with an 
empty configuration.
Jan 21 13:42:00 node5 attrd[8693]:   notice: crm_cluster_connect: Connecting to 
cluster infrastructure: classic openais (with plugin)
Jan 21 13:42:00 node5 crmd[8695]:   notice: main: CRM Git Version: 368c726
Jan 21 13:42:00 node5 attrd[8693]:   notice: get_node_name: Defaulting to uname 
-n for the local classic openais (with plugin) node name
Jan 21 13:42:00 node5 corosync[8646]:   [pcmk  ] info: pcmk_ipc: Recorded 
connection 0x1cbc3c0 for attrd/0
Jan 21 13:42:00 node5 stonith-ng[8691]:   notice: get_node_name: Defaulting to 
uname -n for the local classic openais (with plugin) node name
Jan 21 13:42:00 node5 corosync[8646]:   [pcmk  ] info: pcmk_ipc: Recorded 
connection 0x1cb8040 for stonith-ng/0
Jan 21 13:42:00 node5 attrd[8693]:   notice: get_node_name: Defaulting to uname 
-n for the local classic openais (with plugin) node name
Jan 21 13:42:00 node5 stonith-ng[8691]:   notice: get_node_name: Defaulting to 
uname -n for the local classic openais (with plugin) node name
Jan 21 13:42:00 node5 attrd[8693]:   notice: main: Starting mainloop...
Jan 21 13:42:00 node5 cib[8690]:   notice: crm_cluster_connect: Connecting to 
cluster infrastructure: classic openais (with plugin)
Jan 21 13:42:00 node5 cib[8690]:   notice: get_node_name: Defaulting to uname 
-n for the local classic openais (with plugin) node name
Jan 21 13:42:00 node5 corosync[8646]:   [pcmk  ] info: pcmk_ipc: Recorded 
connection 0x1cc0740 for cib/0
Jan 21 13:42:00 node5

Re: [Pacemaker] crm_resource -L not trustable right after restart

2014-01-15 Thread Brian J. Murrell (brian)
On Thu, 2014-01-16 at 08:35 +1100, Andrew Beekhof wrote:
> 
> I know, I was giving you another example of when the cib is not completely 
> up-to-date with reality.

Yeah, I understood that.  I was just countering with why that example is
actually more acceptable.

> It may very well be partially started.

Sure.

> Its almost certainly not stopped which is what is being reported.

Right.  But until it is completely started (and ready to do whatever
it's supposed to do), it might as well be considered stopped.  If you
have to make a binary state out of stopped, starting and started, I think
most people will agree that the two states are stopped and started, and
that stopped covers anything < started, since most things are not useful
until they are fully started.

> You're not using the output to decide whether to perform some logic?

Nope.  Just reporting the state.  But that's difficult when you have two
participants making positive assertions about state when one is not
really in a position to do so.

> Because crm_mon is the more usual command to run right after startup

The problem with crm_mon is that it doesn't tell you where a resource is
running.

>  (which would give you enough context to know things are still syncing).

That's interesting.  Would polling crm_mon be more efficient than
polling the remote CIB with cibadmin -Q?

> DC election happens at the crmd.

So would it be fair to say then that I should not trust the local CIB
until DC election has finished or could there be latency between that
completing and the CIB being refreshed?

If DC election completion is accurate, what's the best way to determine
that has completed?
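
(One crude way to wait for that, for what it's worth -- a sketch, assuming
crm_mon keeps printing "Current DC: NONE" until an election has completed,
which is what the 1.1.x builds here appear to do:

until crm_mon -1 2>/dev/null | grep '^Current DC:' | grep -qv 'NONE'; do
    sleep 1
done

Whether the local CIB is guaranteed to have been refreshed by that point
is exactly the question above, of course.)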

b.






Re: [Pacemaker] crm_resource -L not trustable right after restart

2014-01-15 Thread Brian J. Murrell (brian)
On Wed, 2014-01-15 at 17:11 +1100, Andrew Beekhof wrote:
> 
> Consider any long running action, such as starting a database.
> We do not update the CIB until after actions have completed, so there can and 
> will be times when the status section is out of date to one degree or another.

But that is the opposite of what I am reporting, and that case is
acceptable.  It's acceptable for a resource that is in the process of
starting to be reported as stopped, because it's not yet started.

What I am seeing is resources being reported as stopped when they are in
fact started/running and have been for a long time.

> At node startup is another point at which the status could potentially be 
> behind.

Right.  Which is the case I am talking about.

> It sounds to me like you're trying to second guess the cluster, which is a 
> dangerous path.

No, not trying to second guess at all.  I'm just trying to ask the
cluster what the state is and not getting the truth.  I am willing to
believe whatever state the cluster says it's in as long as what I am
getting is the truth.

> What if its the first node to start up?

I'd think a timeout comes in to play here.

> There'd be no fresh copy to arrive in that case.

I can't say that I know how the CIB works internally/entirely, but I'd
imagine that when a cluster node starts up it tries to see if there is a
more fresh CIB out there in the cluster.  Maybe this is part of the
process of choosing/discovering a DC.  But ultimately if the node is the
first one up, it will eventually figure that out so that it can nominate
itself as the DC.  Or it finds out that there is a DC already (and gets
a fresh CIB from it?).  It's during that window that I propose that
crm_resource should not be asserting anything and should just admit that
it does not (yet) know.

> If it had enough information to know it was out of date, it wouldn't be out 
> of date.

But surely it understands if it is in the process of joining a cluster
or not, and therefore does know enough to know that it doesn't know if
it's out of date or not.  But that it could be.

> As above, there are situations when you'd never get an answer.

I should have added to my proposal "or has determined that there is
nothing to refresh its CIB from" and that its local copy is
authoritative for the whole cluster.

b.






Re: [Pacemaker] crm_resource -L not trustable right after restart

2014-01-14 Thread Brian J. Murrell (brian)
On Tue, 2014-01-14 at 16:01 +1100, Andrew Beekhof wrote:
> 
> > On Tue, 2014-01-14 at 08:09 +1100, Andrew Beekhof wrote:
> >> 
> >> The local cib hasn't caught up yet by the looks of it.

I should have asked in my previous message: is this entirely an artifact
of having just restarted or are there any other times where the local
CIB can in fact be out of date (and thus crm_resource is inaccurate),
even if only for a brief period of time?  I just want to completely understand
the nature of this situation.

> It doesn't know that it doesn't know.

But it (pacemaker at least) does know that it's just started up, and
should also know whether it's gotten a fresh copy of the CIB since
starting up, right?  I think I'd consider it required behaviour that
pacemaker not consider itself authoritative enough to provide answers
like "location" until it has gotten a fresh copy of the CIB.

> Does it show anything as running?  Any nodes as online?


> I'd not expect that it stays in that situation for more than a second or 
> two...

You are probably right about that.  But unfortunately that second or two
provides a large enough window to provide mis-information.

> We could add an option to force crm_resource to use the master instance 
> instead of the local one I guess.

Or, depending on the answers to the above (like whether this
local-is-not-true situation can ever manifest itself at times other than
"just started"),
perhaps just don't allow crm_resource (or any other tool) to provide
information from the local CIB until it's been refreshed at least once
since a startup.

I would much rather crm_resource experience some latency in being able
to provide answers than provide wrong ones.  Perhaps there needs to be a
switch to indicate if it should block waiting for the local CIB to be
up-to-date or should return immediately with an "unknown" type response
if the local CIB has not yet been updated since a start.

Cheers,
b.






Re: [Pacemaker] crm_resource -L not trustable right after restart

2014-01-13 Thread Brian J. Murrell (brian)
On Tue, 2014-01-14 at 08:09 +1100, Andrew Beekhof wrote:
> 
> The local cib hasn't caught up yet by the looks of it.

Should crm_resource actually be [mis-]reporting as if it were
knowledgeable when it's not though?  IOW is this expected behaviour or
should it be considered a bug?  Should I open a ticket?

> You could compare 'cibadmin -Ql' with 'cibadmin -Q'

Is there no other way to force crm_resource to be truthful/accurate or
silent if it cannot be truthful/accurate?  Having to run this kind of
pre-check before every crm_resource --locate seems like it's going to
drive overhead up quite a bit.
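
(For reference, the literal-minded version of that comparison would be
something like this -- a sketch, bash-only because of the process
substitution:

if diff <(cibadmin -Ql) <(cibadmin -Q) >/dev/null 2>&1; then
    crm_resource -L
else
    echo "local CIB differs from the cluster copy; not trusting it" >&2
fi

which works, but it adds a full cluster round-trip in front of every
query.)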

Maybe I am using the wrong tool for the job.  Is there a better tool
than crm_resource to ascertain, with full truthfulness (or silence if
truthfulness is not possible), where resources are running?

Cheers,
b.






[Pacemaker] crm_resource -L not trustable right after restart

2014-01-13 Thread Brian J. Murrell (brian)
Hi,

I found a situation using pacemaker 1.1.10 on RHEL6.5 where the output
of "crm_resource -L" is not trust-able, shortly after a node is booted.

Here is the output from crm_resource -L on one of the nodes in a two
node cluster (the one that was not rebooted):

 st-fencing (stonith:fence_foo):Started 
 res1   (ocf::foo:Target):  Started 
 res2   (ocf::foo:Target):  Started 

Here is the output from the same command on the other node in the two
node cluster right after it was rebooted:

 st-fencing (stonith:fence_foo):Stopped 
 res1   (ocf::foo:Target):  Stopped 
 res2   (ocf::foo:Target):  Stopped 

These were collected at the same time (within the same second) on the
two nodes.

Clearly the rebooted node is not telling the truth.  Perhaps the truth
for it is "I don't know", which would be fair enough but that's not what
pacemaker is asserting there.

So, how do I know (i.e. programmatically -- what command can I issue to
know) if and when crm_resource can be trusted to be truthful?

b.






Re: [Pacemaker] does adding a second ring actually work with cman?

2013-12-17 Thread Brian J. Murrell
On Tue, 2013-12-17 at 16:33 +0100, Florian Crouzat wrote: 
> 
> Is it possible that lotus-5vm8 (from DNS) and lotus-5vm8-ring1 (from 
> /etc/hosts) resolves to the same IP (10.128.0.206) which could maybe 
> confuse cman and make it decide that there is only one ring ?

No, they do resolve to two different IPs -- the respective IPs for the
interfaces.

The problem was just a misunderstanding on my part: CMAN does not
interactively update the live configuration and does indeed require a
"service cman re{load|start}" to activate the new ring configuration.

Cheers,
b.






[Pacemaker] does adding a second ring actually work with cman?

2013-12-16 Thread Brian J. Murrell
So, I was reading:

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/s2-rrp-ccs-CA.html

about adding a second ring to one's CMAN configuration.  I tried to add
a second ring to my configuration without success.

Given the example:

# ccs -h clusternet-node1-eth1 --addalt clusternet-node1-eth1 
clusternet-node1-eth2

I'm assuming that clusternet-node1-eth2 is a name that resolves to the
IP address of the interface to add as the second ring?  Otherwise I see
no way that the ccs command is supposed to know which (of possibly many)
interfaces to use for the second ring.

That didn't seem to work for me in any case:

# ifconfig eth1
eth1  Link encap:Ethernet  HWaddr 52:54:00:05:A7:08  
  inet addr:10.128.0.206  Bcast:10.128.7.255  Mask:255.255.248.0
  inet6 addr: fe80::5054:ff:fe05:a708/64 Scope:Link
  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
  RX packets:4662465 errors:0 dropped:0 overruns:0 frame:0
  TX packets:1555036 errors:0 dropped:0 overruns:0 carrier:0
  collisions:0 txqueuelen:1000 
  RX bytes:520360401 (496.2 MiB)  TX bytes:171863479 (163.9 MiB)

# grep 10.128.0.206 /etc/hosts
10.128.0.206lotus-5vm8-ring1
# hostname
lotus-5vm8
# ccs -f /etc/cluster/cluster.conf --addalt $(hostname) $(hostname)-ring1
# ccs -f /etc/cluster/cluster.conf --getconf
  [cluster.conf XML output not preserved by the list archive]

# corosync-objctl  | grep ring
totem.interface.ringnumber=0

So is there something I am misunderstanding or is this not actually
working?  The node I was trying this on is EL6.4.

Cheers,
b.





[Pacemaker] cman, ccs: Validation Failure, unable to modify configuration file

2013-12-16 Thread Brian J. Murrell
So, trying to create a cluster on a given node with ccs:

# ccs -p xxx -h $(hostname) --createcluster foo2
Validation Failure, unable to modify configuration file (use -i to ignore this 
error).

But there shouldn't be any configuration here yet.  I've not done
anything with this node:

# ccs -p xxx -h $(hostname) --getconf
  [essentially empty cluster.conf XML, not preserved by the list archive]

So what's wrong with this (empty) configuration?

Even using -i as suggested doesn't work:

# ccs -p xxx -h $(hostname) --createcluster -i foo2
Validation Failure, unable to modify configuration file (use -i to ignore this 
error).

Any ideas?

b.





Re: [Pacemaker] is ccs as racy as it feels?

2013-12-10 Thread Brian J. Murrell
On Tue, 2013-12-10 at 10:27 +, Christine Caulfield wrote: 
> 
> Sadly you're not wrong.

That's what I was afraid of.

> But it's actually no worse than updating 
> corosync.conf manually,

I think it is...

> in fact it's pretty much the same thing,

Not really.  Updating corosync.conf on any given node means only having
to write that file on that node.  There is no cluster-wide
synchronization needed and therefore no last-write-wins race so all
nodes can do that in parallel.  Plus adding a new node means only having
to update the corosync.conf on that new node (and starting up corosync
of course) and corosync then does the job of telling its peers about
the new node rather than having to have the administrator go out and
touch every node to inform them of the new member.

It's this removal of node auto-discovery and changing it to an operator
task that is really complicating the workflow.  Granted, it's not so
much complicating it for a human operator who is naturally only
single-threaded and mostly incapable of inducing the last-write-wins
races.

But when you are writing tools, having to take what used to be a very
capable multithreaded task, free of races, and shove it down a
single-threaded pipe/queue just to eliminate races is a huge step
backwards in evolution.

> so 
> nothing is actually getting worse.

It is though.  See above.

> All the CIB information is still 
> properly replicated.

Yeah.  I understand/understood that.  Pacemaker's actual operations go
mostly unchanged.  It's the cluster membership process that's gotten
needlessly complicated and regressed in functionality.

> The main difficulty is in safely replicating information that's needed 
> to boot the system.

Do you literally mean starting the system up?  I guess the use-case you
are describing here is booting nodes from a clustered filesystem?  But
what if you don't need that complication?  This process is being made
more complicated to satisfy only a subset of the use-cases.

> In general use we've not found it to be a huge problem (though, I'm 
> still not keen on it either TBH) because most management is done by one 
> person from one node.

Indeed.  As I said above, WRT single-threaded operators.  But when
you are writing a management system on top of all of this, which
naturally wants to be multi-threaded (because scalable systems avoid
bottlenecking through single choke points) and was able to be
multithreaded when it was just corosync.conf, having to choke everything
back down into a single thread just sucks.

> There is not really any concept of nodes trying to 
> "add themselves" to a cluster, it needs to be done by a person - which 
> maybe what you're unhappy with.

Yes, not so much "add themselves" but allowed to be added, in parallel
without fear of racing.

This ccs tool wouldn't be so bad if it operated more like the CIB where
modifications were replicated automatically and properly locked so that
modifications could be made anywhere on the cluster and all members got
those modifications automatically, rather than pushing the work of
locking, replication and serialization off onto the caller.

b.





[Pacemaker] is ccs as racy as it feels?

2013-12-09 Thread Brian J. Murrell
So, I'm trying to wrap my head around this need to migrate to pacemaker
+CMAN.  I've been looking at
http://clusterlabs.org/quickstart-redhat.html and
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/

It seems "ccs" is the tool to configure the CMAN part of things.

The first URL talks about using ccs to create a local configuration and
then "copy" that around to the rest of the cluster.  Yuck.

The first URL doesn't really cover how one builds up clusters (i.e. over
time) but assumes that you know what your cluster is going to look like
before you build that configuration and says nothing about what to do
when you decide to add new nodes at some later point.  I would guess
more "ccm -f /etc/cluster/cluster.conf" and some more copying around
again.  Does anything need to be prodded to get this new configuration
that was just copied?  I do hope just "prodding" and not a restart of
all services including pacemaker managed resources.

The second URL talks about ricci for propagating the configuration
around.  But it seems to assume that all configuration is done from a
single node and then "sync'd" to the rest of the cluster with ricci in a
"last write wins" sort of work-flow.

So unlike pacemaker itself where any node can modify the configuration
of the CIB (raciness in tools like crm aside), having multiple nodes
using ccs feels quite dangerous in a "last-write-wins" kind of way.  Am
I correct?

This makes it quite difficult to dispatch the task of configuring the
cluster out to the nodes that will be participating in the cluster --
having them configure their own participation.  This distribution of
configuration tasks all works fine for pacemaker-proper (if you avoid
tools like crm) but feels like it's going to blow up when having
multiple nodes trying to add themselves and their own configuration to
the CMAN configuration -- all in parallel.

Am I correct about all of this?  I hope I am not, because if I am this
all feels like a (very) huge step backward from the days where corosync
+pacemaker configuration could be carried out in parallel, on multiple
nodes without having to designate particular (i.e. one per cluster)
nodes as the single configuration point and feeding these designated
nodes the configuration items through a single-threaded work queue all
just to avoid the races that didn't exist using just corosync+pacemaker.

Cheers,
b.






Re: [Pacemaker] error: send_cpg_message: Sending message via cpg FAILED: (rc=6) Try again

2013-12-09 Thread Brian J. Murrell
On Mon, 2013-12-09 at 09:28 +0100, Jan Friesse wrote:
> 
> Error 6 error means "try again". This is happening ether if corosync is
> overloaded or creating new membership. Please take a look to
> /var/log/cluster/corosync.log if you see something strange there (+ make
> sure you have newest corosync).

Would that same information be available in /var/log/messages if I have
configured corosync such as:

logging {
fileline: off
to_stderr: no
to_logfile: no
to_syslog: yes
logfile: /var/log/cluster/corosync.log
debug: off
timestamp: on
logger_subsys {
subsys: AMF
debug: off
}
}

If so, then the log snippet I posted in the prior message includes all
that corosync had to report.  Should I increase the amount of logging?
Any suggestions on an appropriate amount/flags, etc.?
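
Unless advised otherwise, my first guess at "more logging" would be a
more verbose variant of the same section, something like this (a sketch;
debug output is noisy, so presumably only while reproducing the problem):

logging {
    fileline: on
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
    debug: on
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}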

> (+ make
> sure you have newest corosync).

corosync-1.4.1-15.el6_4.1.x86_64 as shipped by RH in EL6.

Is this new enough?  I know 2.x is also available but I don't think RH
is shipping that yet.  Hopefully their 1.4.1 is still supported.

Cheers,
b.





[Pacemaker] error: send_cpg_message: Sending message via cpg FAILED: (rc=6) Try again

2013-12-06 Thread Brian J. Murrell (brian)
I seem to have another instance where pacemaker fails to exit at the end
of a shutdown.  Here's the log from the start of the "service pacemaker
stop":

Dec  3 13:00:39 wtm-60vm8 crmd[14076]:   notice: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
Dec  3 13:00:39 wtm-60vm8 crmd[14076]: info: do_te_invoke: Processing graph 
19 (ref=pe_calc-dc-1386093636-83) derived from /var/lib/pengine/pe-input-40.bz2
Dec  3 13:00:39 wtm-60vm8 crmd[14076]:   notice: run_graph:  Transition 19 
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, 
Source=/var/lib/pengine/pe-input-40.bz2): Complete
Dec  3 13:00:39 wtm-60vm8 crmd[14076]:   notice: do_state_transition: State 
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS 
cause=C_FSA_INTERNAL origin=notify_crmd ]
Dec  3 13:00:39 wtm-60vm8 pengine[14075]:   notice: process_pe_message: 
Transition 19: PEngine Input stored in: /var/lib/pengine/pe-input-40.bz2
Dec  3 13:02:39 wtm-60vm8 crmd[14076]: info: handle_shutdown_request: 
Creating shutdown request for wtm-60vm7 (state=S_IDLE)
Dec  3 13:02:39 wtm-60vm8 crmd[14076]: info: abort_transition_graph: 
te_update_diff:176 - Triggered transition abort (complete=1, tag=nvpair, 
id=status-wtm-60vm7-shutdown, name=shutdown, value=1386093759, magic=NA, 
cib=0.48.3) : Transient attribute: update
Dec  3 13:02:39 wtm-60vm8 crmd[14076]:   notice: do_state_transition: State 
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL 
origin=abort_transition_graph ]
Dec  3 13:02:39 wtm-60vm8 pengine[14075]:   notice: unpack_config: On loss of 
CCM Quorum: Ignore
Dec  3 13:02:39 wtm-60vm8 pengine[14075]:   notice: stage6: Scheduling Node 
wtm-60vm7 for shutdown
Dec  3 13:02:39 wtm-60vm8 pengine[14075]:   notice: LogActions: Move
rsrc1#011(Started wtm-60vm7 -> wtm-60vm8)
Dec  3 13:02:39 wtm-60vm8 crmd[14076]:   notice: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
Dec  3 13:02:39 wtm-60vm8 crmd[14076]: info: do_te_invoke: Processing graph 
20 (ref=pe_calc-dc-1386093759-84) derived from /var/lib/pengine/pe-input-41.bz2
Dec  3 13:02:39 wtm-60vm8 crmd[14076]: info: te_rsc_command: Initiating 
action 11: stop rsrc1_stop_0 on wtm-60vm7
Dec  3 13:02:39 wtm-60vm8 pengine[14075]:   notice: process_pe_message: 
Transition 20: PEngine Input stored in: /var/lib/pengine/pe-input-41.bz2
Dec  3 13:02:44 wtm-60vm8 crmd[14076]: info: te_rsc_command: Initiating 
action 12: start rsrc1_start_0 on wtm-60vm8 (local)
Dec  3 13:02:44 wtm-60vm8 lrmd: [14073]: info: rsc:rsrc1:14: start
Dec  3 13:02:44 wtm-60vm8 crmd[14076]: info: te_crm_command: Executing 
crm-event (16): do_shutdown on wtm-60vm7
Dec  3 13:02:44 wtm-60vm8 crmd[14076]:   notice: crmd_peer_update: Status 
update: Client wtm-60vm7/crmd now has status [offline] (DC=true)
Dec  3 13:02:44 wtm-60vm8 cib[14071]: info: cib_process_shutdown_req: 
Shutdown REQ from wtm-60vm7
Dec  3 13:02:44 wtm-60vm8 cib[14071]: info: cib_process_request: Operation 
complete: op cib_shutdown_req for section 'all' 
(origin=wtm-60vm7/wtm-60vm7/(null), version=0.48.5): ok (rc=0)
Dec  3 13:02:51 wtm-60vm8 corosync[14032]:   [TOTEM ] A processor joined or 
left the membership and a new membership was formed.
Dec  3 13:02:51 wtm-60vm8 cib[14071]:   notice: ais_dispatch_message: 
Membership 20: quorum lost
Dec  3 13:02:51 wtm-60vm8 cib[14071]: info: crm_update_peer: Node 
wtm-60vm7: id=3866040492 state=lost (new) addr=r(0) ip(172.24.111.230) r(1) 
ip(10.0.0.102)  votes=1 born=16 seen=16 proc=0002
Dec  3 13:02:51 wtm-60vm8 crmd[14076]:   notice: ais_dispatch_message: 
Membership 20: quorum lost
Dec  3 13:02:51 wtm-60vm8 crmd[14076]: info: ais_status_callback: status: 
wtm-60vm7 is now lost (was member)
Dec  3 13:02:51 wtm-60vm8 crmd[14076]: info: crm_update_peer: Node 
wtm-60vm7: id=3866040492 state=lost (new) addr=r(0) ip(172.24.111.230) r(1) 
ip(10.0.0.102)  votes=1 born=16 seen=16 proc=0002
Dec  3 13:02:51 wtm-60vm8 cib[14071]: info: cib_process_request: Operation 
complete: op cib_modify for section nodes (origin=local/crmd/180, 
version=0.48.6): ok (rc=0)
Dec  3 13:02:51 wtm-60vm8 cib[14071]: info: cib_process_request: Operation 
complete: op cib_modify for section cib (origin=local/crmd/182, 
version=0.48.8): ok (rc=0)
Dec  3 13:02:51 wtm-60vm8 crmd[14076]: info: crmd_ais_dispatch: Setting 
expected votes to 2
Dec  3 13:02:51 wtm-60vm8 cib[14071]: info: cib_process_request: Operation 
complete: op cib_modify for section crm_config (origin=local/crmd/184, 
version=0.48.9): ok (rc=0)
Dec  3 13:02:52 wtm-60vm8 pacemakerd[14067]: info: crm_signal_dispatch: 
Invoking handler for signal 15: Terminated
Dec  3 13:02:52 wtm-60vm8 pacemakerd[14067]:   notice: pcmk_shutdown_worker:

[Pacemaker] prevent starting resources on failed node

2013-12-06 Thread Brian J. Murrell (brian)
[ Hopefully this doesn't cause a duplicate post but my first attempt
returned an error. ]

Using pacemaker 1.1.10 (but I think this issue is more general than that
release), I want to enforce a policy that once a node fails, no
resources can be started/run on it until the user permits it.

I have been successful in achieving this using resource stickiness.
Mostly.  It seems that once the resource has been successfully started
on another node, it stays put, even once the failed node comes back up.
So this is all good.

Where it does seem to be falling down though is that if the failed node
comes back up before the resource can be successfully started on another
node, pacemaker seems to include the just-failed-and-restarted node in
the candidate list of nodes it tries to start the resource on.  So in
this manner, it seems that resource stickiness only applies once the
resource has been started (which is not surprising; it seems a
reasonable behaviour).

The question then is, anyone have any ideas on how to implement such a
policy?  That is, once a node fails, no resources are allowed to start
on it, even if it means not starting the resource (i.e. all other nodes
are unable to start it for whatever reason)?  Simply not starting the
node would be one way to achieve it, yes, but we cannot rely on the node
not being started.

It seems perhaps the installation of a constraint when a node is
stonithed might do the trick, but the question is how to couple/trigger
the installation of a constraint with a stonith action?

Or is there a better/different way to achieve this?
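
For concreteness, the kind of constraint I have in mind would be
something like this (a sketch; "res1" and "failednode" are placeholders),
installed when the node is fenced and removed when an operator re-enables
the node:

cibadmin -o constraints -C -X \
    '<rsc_location id="ban-res1-failednode" rsc="res1" node="failednode" score="-INFINITY"/>'

The open question remains what to hook that into so it fires on the
stonith event itself.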

Cheers,
b.






Re: [Pacemaker] catch-22: can't fence node A because node A has the fencing resource

2013-12-03 Thread Brian J. Murrell

On Tue, 2013-12-03 at 18:26 -0500, David Vossel wrote: 
> 
> We did away with all of the policy engine logic involved with trying to move 
> fencing devices off of the target node before executing the fencing action. 
> Behind the scenes all fencing devices are now essentially clones.  If the 
> target node to be fenced has a fencing device running on it, that device can 
> execute anywhere in the cluster to avoid the "suicide" situation.

OK.

> When you are looking at crm_mon output and see a fencing device is running on 
> a specific node, all that really means is that we are going to attempt to 
> execute fencing actions for that device from that node first. If that node is 
> unavailable,

Would it be better to not even try to use a node and ask it to commit
suicide but always try to use another node?

> we'll try that same device anywhere in the cluster we can get it to work

OK.

> (unless you've specifically built some location constraint that prevents the 
> fencing device from ever running on a specific node)

While I do have constraints on the more service-oriented resources to
give them preferred nodes, I don't have any constraints on the fencing
resources.

So given all of the above, and given the log I supplied showing that the
fencing was just not being attempted anywhere other than the node to be
fenced (which was down during that log), any clues as to where to look
for why?

> Hope that helps.

It explains the differences, but unfortunately I'm still not sure why it
wouldn't get run somewhere else, eventually, rather than continually
being attempted on the node to be killed (which as I mentioned, was shut
down at the time the log was made).

Cheers,
b.









[Pacemaker] catch-22: can't fence node A because node A has the fencing resource

2013-12-02 Thread Brian J. Murrell
So, I'm migrating my working pacemaker configuration from 1.1.7 to
1.1.10 and am finding what appears to be a new behavior in 1.1.10.

If a given node is running a fencing resource and that node goes AWOL,
it needs to be fenced (of course).  But any other node trying to take
over the fencing resource to fence it appears to first want the current
owner of the fencing resource to fence the node.  Of course that can't
happen since the node that needs to do the fencing is AWOL.

So while I can buy into the general policy that a node needs to be
fenced in order to take over its resources, fencing resources have to
be excepted from this or there can be this catch-22.

I believe that is how things were working in 1.1.7 but now that I'm on
1.1.10[-1.el6_4.4] this no longer seems to be the case.

Or perhaps there is some additional configuration that 1.1.10 needs to
effect this behavior.  Here is my configuration:

Cluster Name: 
Corosync Nodes:
 
Pacemaker Nodes:
 host1 host2 

Resources: 
 Resource: rsc1 (class=ocf provider=foo type=Target)
  Attributes: target=111bad0a-a86a-40e3-b056-c5c93168aa0d 
  Meta Attrs: target-role=Started 
  Operations: monitor interval=5 timeout=60 (rsc1-monitor-5)
  start interval=0 timeout=300 (rsc1-start-0)
  stop interval=0 timeout=300 (rsc1-stop-0)
 Resource: rsc2 (class=ocf provider=chroma type=Target)
  Attributes: target=a8efa349-4c73-4efc-90d3-d6be7d73c515 
  Meta Attrs: target-role=Started 
  Operations: monitor interval=5 timeout=60 (rsc2-monitor-5)
  start interval=0 timeout=300 (rsc2-start-0)
  stop interval=0 timeout=300 (rsc2-stop-0)

Stonith Devices: 
 Resource: st-fencing (class=stonith type=fence_foo)
Fencing Levels: 

Location Constraints:
  Resource: rsc1
Enabled on: host1 (score:20) (id:rsc1-primary)
Enabled on: host2 (score:10) (id:rsc1-secondary)
  Resource: rsc2
Enabled on: host2 (score:20) (id:rsc2-primary)
Enabled on: host1 (score:10) (id:rsc2-secondary)
Ordering Constraints:
Colocation Constraints:

Cluster Properties:
 cluster-infrastructure: classic openais (with plugin)
 dc-version: 1.1.10-1.el6_4.4-368c726
 expected-quorum-votes: 2
 no-quorum-policy: ignore
 stonith-enabled: true
 symmetric-cluster: true

One thing that PCS is not showing that might be relevant here is that I
have a resource stickiness value set to 1000 to prevent resources from
failing back to nodes after a failover.
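
(That is, a resource default along the lines of

pcs resource defaults resource-stickiness=1000

if memory serves as to the exact syntax.)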

With the above configuration if host1 is shut down, host2 just spins in
a loop doing:

Dec  2 20:00:02 host2 pengine[8923]:  warning: pe_fence_node: Node host1 will 
be fenced because the node is no longer part of the cluster
Dec  2 20:00:02 host2 pengine[8923]:  warning: determine_online_status: Node 
host1 is unclean
Dec  2 20:00:02 host2 pengine[8923]:  warning: custom_action: Action 
st-fencing_stop_0 on host1 is unrunnable (offline)
Dec  2 20:00:02 host2 pengine[8923]:  warning: custom_action: Action 
rsc1_stop_0 on host1 is unrunnable (offline)
Dec  2 20:00:02 host2 pengine[8923]:  warning: stage6: Scheduling Node host1 
for STONITH
Dec  2 20:00:02 host2 pengine[8923]:   notice: LogActions: Move
st-fencing#011(Started host1 -> host2)
Dec  2 20:00:02 host2 pengine[8923]:   notice: LogActions: Move
rsc1#011(Started host1 -> host2)
Dec  2 20:00:02 host2 crmd[8924]:   notice: te_fence_node: Executing reboot 
fencing operation (13) on host1 (timeout=6)
Dec  2 20:00:02 host2 stonith-ng[8920]:   notice: handle_request: Client 
crmd.8924.39504cd3 wants to fence (reboot) 'host1' with device '(any)'
Dec  2 20:00:02 host2 stonith-ng[8920]:   notice: initiate_remote_stonith_op: 
Initiating remote operation reboot for host1: 
ad69ead5-0bbb-45d8-bb07-30bcd405ace2 (0)
Dec  2 20:00:02 host2 pengine[8923]:  warning: process_pe_message: Calculated 
Transition 22: /var/lib/pacemaker/pengine/pe-warn-2.bz2  
Dec  2 20:01:14 host2 stonith-ng[8920]:error: remote_op_done: Operation 
reboot of host1 by host2 for crmd.8924@host2.ad69ead5: Timer expired
Dec  2 20:01:14 host2 crmd[8924]:   notice: tengine_stonith_callback: Stonith 
operation 4/13:22:0:0171e376-182e-485f-a484-9e638e1bd355: Timer expired (-62)
Dec  2 20:01:14 host2 crmd[8924]:   notice: tengine_stonith_callback: Stonith 
operation 4 for host1 failed (Timer expired): aborting transition.
Dec  2 20:01:14 host2 crmd[8924]:   notice: tengine_stonith_notify: Peer host1 
was not terminated (reboot) by host2 for host2: Timer expired 
(ref=ad69ead5-0bbb-45d8-bb07-30bcd405ace2) by client crmd.8924
Dec  2 20:01:14 host2 crmd[8924]:   notice: run_graph: Transition 22 
(Complete=1, Pending=0, Fired=0, Skipped=7, Incomplete=0, 
Source=/var/lib/pacemaker/pengine/pe-warn-2.bz2): Stopped
Dec  2 20:01:14 host2 pengine[8923]:   notice: unpack_config: On loss of CCM 
Quorum: Ignore
Dec  2 20:01:14 host2 pengine[8923]:  warning: pe_fence_node: Node host1 will 
be fenced because the node is no longer part of the cluster  
Dec  2 20:01:14 host2 pengine[8923]:  warning: d

Re: [Pacemaker] Best way to notify stonith action

2013-07-08 Thread Brian J. Murrell

On 13-07-08 03:48 AM, Andreas Mock wrote:

Hi all,

I'm just wondering what the best way is to
let an admin know that the cluster (rest of
a cluster) has stonithed some other nodes?


You could modify or even just wrap the stonith agent.  They are usually 
just Python or shell scripts anyway (well, except for fence_xvm, 
unfortunately).
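
For example -- completely untested, and the agent path and mail
recipient are just placeholders -- a wrapper could be as simple as:

#!/bin/sh
# call the real agent, then notify the admin of the outcome
REAL=/usr/sbin/fence_foo.real

"$REAL" "$@"
rc=$?

echo "fence agent $REAL finished with exit code $rc on $(hostname)" \
    | mail -s "cluster fencing action attempted" root@example.com

exit $rc

You would rename the real agent aside and install the wrapper under the
original name (or point the stonith resource at the wrapper).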


Cheers,
b.



___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] error: do_exit: Could not recover from internal error

2013-05-23 Thread Brian J. Murrell
On 13-05-22 07:05 PM, Andrew Beekhof wrote:
>  
> Also, 1.1.8-7 was not tested with the plugin _at_all_ (and neither will 
> future RHEL builds).

Was 1.1.7-* in EL 6.3 tested with the plugin?  Is staying with the most
recent EL 6.3 pacemaker-1.1.7 release really the more stable option for
people not able to re-tool their clusters to use CMAN at this point in time?

When it's RHEL 7 upgrade time, a re-tooling of the HA framework can be
put on the table, but such a thing cannot be considered during a
stable release series, unfortunately.

Thanks,
b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] error: do_exit: Could not recover from internal error

2013-05-22 Thread Brian J. Murrell
Using pacemaker 1.1.8-7 on EL6, I got the following series of events while
trying to shut down pacemaker and then corosync.  The corosync shutdown
(service corosync stop) ended up spinning/hanging indefinitely (~7 hrs
now).  The events included:

May 21 23:47:18 node1 crmd[17598]:error: do_exit: Could not recover from 
internal error

For completeness, I've included the logging from the whole session,
from corosync startup until it all went haywire.  The badness seems to
start at about 23:43:18.

May 21 23:42:51 node1 corosync[17541]:   [MAIN  ] Corosync Cluster Engine 
('1.4.1'): started and ready to provide service.
May 21 23:42:51 node1 corosync[17541]:   [MAIN  ] Corosync built-in features: 
nss dbus rdma snmp
May 21 23:42:51 node1 corosync[17541]:   [MAIN  ] Successfully read main 
configuration file '/etc/corosync/corosync.conf'.
May 21 23:42:51 node1 corosync[17541]:   [TOTEM ] Initializing transport 
(UDP/IP Multicast).
May 21 23:42:51 node1 corosync[17541]:   [TOTEM ] Initializing transmit/receive 
security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 21 23:42:51 node1 corosync[17541]:   [TOTEM ] Initializing transport 
(UDP/IP Multicast).
May 21 23:42:51 node1 corosync[17541]:   [TOTEM ] Initializing transmit/receive 
security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
May 21 23:42:51 node1 corosync[17541]:   [TOTEM ] The network interface 
[192.168.122.253] is now up.
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: process_ais_conf: 
Reading configure
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] ERROR: process_ais_conf: You 
have configured a cluster using the Pacemaker plugin for Corosync. The plugin 
is not supported in this environment and will be removed very soon.

May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] ERROR: process_ais_conf:  
Please see Chapter 8 of 'Clusters from Scratch' 
(http://www.clusterlabs.org/doc) for details on using Pacemaker with CMAN
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: config_find_init: Local 
handle: 5650605097994944513 for logging
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: config_find_next: 
Processing additional logging options...
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: get_config_opt: Found 
'off' for option: debug
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: get_config_opt: Found 
'no' for option: to_logfile
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: get_config_opt: Found 
'yes' for option: to_syslog
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: get_config_opt: 
Defaulting to 'daemon' for option: syslog_facility
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: config_find_init: Local 
handle: 273040974342370 for quorum
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: config_find_next: No 
additional configuration supplied for: quorum
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: get_config_opt: No 
default for option: provider
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: config_find_init: Local 
handle: 5880381755227111427 for service
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: config_find_next: 
Processing additional service options...
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: get_config_opt: Found 
'1' for option: ver
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: process_ais_conf: 
Enabling MCP mode: Use the Pacemaker init script to complete Pacemaker startup
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: get_config_opt: 
Defaulting to 'pcmk' for option: clustername
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: get_config_opt: 
Defaulting to 'no' for option: use_logd
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: get_config_opt: 
Defaulting to 'no' for option: use_mgmtd
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: pcmk_startup: CRM: 
Initialized
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] Logging: Initialized 
pcmk_startup
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: pcmk_startup: Service: 9
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: pcmk_startup: Local 
hostname: node1
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: pcmk_update_nodeid: 
Local node id: 4252674240
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: update_member: Creating 
entry for node 4252674240 born on 0
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: update_member: 
0x12c4640 Node 4252674240 now known as node1 (was: (null))
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: update_member: Node 
node1 now has 1 quorum votes (was 0)
May 21 23:42:51 node1 corosync[17541]:   [pcmk  ] info: update_member: Node 
4252674240/node1 is now: member
May 21 23:42:51 node1 corosync[17541]:   [SERV  ] Service engine loaded: 
Pacemaker Cluster Manager 1.1.8
May 21 23:42:51 node1 corosync[17541]:   [SERV  ] Service engine loaded: 
corosync extended virtual synchrony service
May 21 23:42:51 node1 corosync[17541]:   [SERV  ] Service engine loade

[Pacemaker] stonith-ng: error: remote_op_done: Operation reboot of node2 by node1 for stonith_admin: Timer expired

2013-05-16 Thread Brian J. Murrell
Using Pacemaker 1.1.8 on EL6.4 with the pacemaker plugin, I'm finding
strange behavior with "stonith_admin -B node2".  It seems to shut the
node down but not start it back up, and ends up reporting a timer
expired:

# stonith_admin -B node2
Command failed: Timer expired

The pacemaker log for the operation is:

May 16 13:50:41 node1 stonith_admin[23174]:   notice: crm_log_args: Invoked: 
stonith_admin -B node2 
May 16 13:50:41 node1 stonith-ng[1673]:   notice: handle_request: Client 
stonith_admin.23174.4a093de2 wants to fence (reboot) 'node2' with device '(any)'
May 16 13:50:41 node1 stonith-ng[1673]:   notice: initiate_remote_stonith_op: 
Initiating remote operation reboot for node2: 
aa230634-6a38-42b7-8ed4-0a0eb64af39a (0)
May 16 13:50:41 node1 cibadmin[23176]:   notice: crm_log_args: Invoked: 
cibadmin --query 
May 16 13:50:49 node1 corosync[1376]:   [TOTEM ] A processor failed, forming 
new configuration.
May 16 13:50:55 node1 corosync[1376]:   [pcmk  ] notice: pcmk_peer_update: 
Transitional membership event on ring 76: memb=1, new=0, lost=1
May 16 13:50:55 node1 corosync[1376]:   [pcmk  ] info: pcmk_peer_update: memb: 
node1 4252674240
May 16 13:50:55 node1 corosync[1376]:   [pcmk  ] info: pcmk_peer_update: lost: 
node2 2608507072
May 16 13:50:55 node1 corosync[1376]:   [pcmk  ] notice: pcmk_peer_update: 
Stable membership event on ring 76: memb=1, new=0, lost=0
May 16 13:50:55 node1 corosync[1376]:   [pcmk  ] info: pcmk_peer_update: MEMB: 
node1 4252674240
May 16 13:50:55 node1 corosync[1376]:   [pcmk  ] info: 
ais_mark_unseen_peer_dead: Node node2 was not seen in the previous transition
May 16 13:50:55 node1 corosync[1376]:   [pcmk  ] info: update_member: Node 
2608507072/node2 is now: lost
May 16 13:50:55 node1 corosync[1376]:   [pcmk  ] info: 
send_member_notification: Sending membership update 76 to 2 children
May 16 13:50:55 node1 corosync[1376]:   [TOTEM ] A processor joined or left the 
membership and a new membership was formed.
May 16 13:50:55 node1 corosync[1376]:   [CPG   ] chosen downlist: sender r(0) 
ip(192.168.122.253) r(1) ip(10.0.0.253) ; members(old:2 left:1)
May 16 13:50:55 node1 corosync[1376]:   [MAIN  ] Completed service 
synchronization, ready to provide service.
May 16 13:50:55 node1 cib[1672]:   notice: ais_dispatch_message: Membership 76: 
quorum lost
May 16 13:50:55 node1 cib[1672]:   notice: crm_update_peer_state: 
crm_update_ais_node: Node node2[2608507072] - state is now lost
May 16 13:50:55 node1 crmd[1677]:   notice: ais_dispatch_message: Membership 
76: quorum lost
May 16 13:50:55 node1 crmd[1677]:   notice: crm_update_peer_state: 
crm_update_ais_node: Node node2[2608507072] - state is now lost
May 16 13:50:55 node1 crmd[1677]:  warning: match_down_event: No match for 
shutdown action on node2
May 16 13:50:55 node1 crmd[1677]:   notice: peer_update_callback: 
Stonith/shutdown of node2 not matched
May 16 13:50:55 node1 crmd[1677]:   notice: do_state_transition: State 
transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL 
origin=check_join_state ]
May 16 13:50:57 node1 attrd[1675]:   notice: attrd_local_callback: Sending full 
refresh (origin=crmd)
May 16 13:50:57 node1 attrd[1675]:   notice: attrd_trigger_update: Sending 
flush op to all hosts for: last-failure-resource1 (1368710825)
May 16 13:50:57 node1 attrd[1675]:   notice: attrd_trigger_update: Sending 
flush op to all hosts for: probe_complete (true)
May 16 13:50:58 node1 pengine[1676]:   notice: unpack_config: On loss of CCM 
Quorum: Ignore
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: Defaulting to 
'now'
May 16 13:50:58 node1 pengine[1676]: crit: get_timet_now: De

Re: [Pacemaker] resource starts but then fails right away

2013-05-10 Thread Brian J. Murrell
On 13-05-09 09:53 PM, Andrew Beekhof wrote:
> 
> May  7 02:36:16 node1 crmd[16836]: info: delete_resource: Removing 
> resource testfs-resource1 for 18002_crm_resource (internal) on node1
> May  7 02:36:16 node1 lrmd: [16833]: info: flush_op: process for operation 
> monitor[8] on ocf::Target::testfs-resource1 for client 16836 still running, 
> flush delayed
> May  7 02:36:16 node1 crmd[16836]: info: lrm_remove_deleted_op: Removing 
> op testfs-resource1_monitor_0:8 for deleted resource testfs-resource1
> 
> So apparently a badly timed cleanup was run.

:-(  I didn't know there could be such timing problems.  I might have to
change my process a bit then perhaps.

> Did you do that or was it the crm shell?

That was "me" doing a "crm resource cleanup" (soon to become
"crm_resource -r ... --cleanup").  The process is typically:

- create resource
- start resource
- wait for resource to start

where "start resource" is:
- "clean it to start with a known clean resource"
  (crm resource cleanup)
- "start resource"
  (crm_resource -r ... -p target-role -m -v Started)

and "wait for resource" is a loop of "crm resource status ..." (soon to
be "crm_resource -r ... --locate")

So the create, clean, start operations happen in quite quick succession
(i.e. scripted).  Is that pathological?  Is a clean between create and
start known to be problematic?

FWIW, the reason for the clean before the start, even right after
creating the resource, is that "clean" and "start" are lumped together
into a function that is called after create but can also be called at
other times during the life-cycle, so a clean could be needed before
trying to start.  I was hoping that cleaning a just-created resource
would effectively be a no-op.

I guess for completeness, I should add here that creating the resource
is a "cibadmin -o resource -C ..." operation.

> If the machine is heavily loaded, or just very busy with file I/O, that can 
> still take quite a long time.

Yeah, not very loaded at all, especially at this point.  This is all
happening before anything really gets started on the machine... this is
the process of getting the resources up and running and the machine is
dedicated to running the tasks associated with these resources.

Cheers,
b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] resource starts but then fails right away

2013-05-09 Thread Brian J. Murrell
Using Pacemaker 1.1.7 on EL6.3, I'm getting an intermittent recurrence
of a situation where I add a resource and start it and it seems to
start but then right away fail.  i.e.

# clean up resource before trying to start, just to make sure we start with a 
clean slate
# crm resource cleanup testfs-resource1
Cleaning up testfs-resource1 on node1

Waiting for 2 replies from the CRMd.. OK

# now try to start it
# crm_resource -r testfs-resource1 -p target-role -m -v Started

# monitor the startup for success
# crm resource status testfs-resource1:

resource testfs-resource1 is NOT running

# crm resource status testfs-resource1

resource testfs-resource1 is NOT running

# crm resource status testfs-resource1

resource testfs-resource1 is NOT running

...

# crm resource status testfs-resource1

resource testfs-resource1 is NOT running

# crm resource status testfs-resource1

resource testfs-resource1 is NOT running

# crm resource status testfs-resource1
resource testfs-resource1 is running on: node1

# it started.  check once more:

# crm status

Last updated: Tue May  7 02:37:34 2013
Last change: Tue May  7 02:36:17 2013 via crm_resource on node1
Stack: openais
Current DC: node1 - partition WITHOUT quorum
Version: 1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14
1 Nodes configured, 2 expected votes
3 Resources configured.


Online: [ node1 ]

 st-fencing (stonith:fence_foo):Started node1
 resource2  (ocf::foo:Target):  Started node1
 testfs-resource1   (ocf::foo:Target):  Started node1 FAILED

Failed actions:
testfs-resource1_monitor_0 (node=node1, call=-1, rc=1, status=Timed Out): 
unknown error

# but lo and behold, it failed, with a monitor operation failing.

# stop it
# crm_resource -r testfs-resource1 -p target-role -m -v Stopped: 0

The syslog for this whole operation, starting with adding the resource
is as follows:

May  7 02:36:12 node1 cib[16831]: info: cib:diff: - 
May  7 02:36:12 node1 crmd[16836]: info: abort_transition_graph: 
te_update_diff:126 - Triggered transition abort (complete=1, tag=diff, 
id=(null), magic=NA, cib=0.16.1) : Non-status change
May  7 02:36:12 node1 crmd[16836]:   notice: do_state_transition: State 
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL 
origin=abort_transition_graph ]
May  7 02:36:12 node1 cib[16831]: info: cib:diff: + 
May  7 02:36:12 node1 cib[16831]: info: cib:diff: +   
May  7 02:36:12 node1 cib[16831]: info: cib:diff: + 
May  7 02:36:12 node1 cib[16831]: info: cib:diff: +   
May  7 02:36:12 node1 cib[16831]: info: cib:diff: + 

May  7 02:36:12 node1 cib[16831]: info: cib:diff: +   
May  7 02:36:12 node1 cib[16831]: info: cib:diff: + 

May  7 02:36:12 node1 cib[16831]: info: cib:diff: + 
May  7 02:36:12 node1 cib[16831]: info: cib:diff: +   
May  7 02:36:12 node1 cib[16831]: info: cib:diff: +   
May  7 02:36:12 node1 cib[16831]: info: cib:diff: +   
May  7 02:36:12 node1 cib[16831]: info: cib:diff: + 
May  7 02:36:12 node1 cib[16831]: info: cib:diff: + 

May  7 02:36:12 node1 cib[16831]: info: cib:diff: +   
May  7 02:36:12 node1 cib[16831]: info: cib:diff: + 

May  7 02:36:12 node1 cib[16831]: info: cib:diff: +   
May  7 02:36:12 node1 cib[16831]: info: cib:diff: + 
May  7 02:36:12 node1 cib[16831]: info: cib:diff: +   
May  7 02:36:12 node1 cib[16831]: info: cib:diff: + 
May  7 02:36:12 node1 cib[16831]: info: cib_process_request: Operation 
complete: op cib_create for section resources (origin=local/cibadmin/2, 
version=0.16.1): ok (rc=0)
May  7 02:36:12 node1 pengine[16835]:   notice: unpack_config: On loss of CCM 
Quorum: Ignore
May  7 02:36:12 node1 crmd[16836]:   notice: do_state_transition: State 
transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS 
cause=C_IPC_MESSAGE origin=handle_response ]
May  7 02:36:12 node1 crmd[16836]: info: do_te_invoke: Processing graph 13 
(ref=pe_calc-dc-1367894172-72) derived from /var/lib/pengine/pe-input-124.bz2
May  7 02:36:12 node1 crmd[16836]: info: te_rsc_command: Initiating action 
5: monitor testfs-resource1_monitor_0 on node1 (local)
May  7 02:36:12 node1 lrmd: [16833]: info: rsc:testfs-resource1:8: probe
May  7 02:36:12 node1 pengine[16835]:   notice: process_pe_message: Transition 
13: PEngine Input stored in: /var/lib/pengine/pe-input-124.bz2
May  7 02:36:14 node1 crmd[16836]: info: abort_transition_graph: 
te_update_diff:126 - Triggered transition abort (complete=0, tag=diff, 
id=(null), magic=NA, cib=0.17.1) : Non-status change
May  7 02:36:14 node1 cib[16831]: info: cib:diff: - 
May  7 02:36:14 node1 cib[16831]: info: cib:diff: + 
May  7 02:36:14 node1 cib[16831]: info: cib:diff: +   
May  7 02:36:14 node1 cib[16831]: info: cib:diff: + 
May  7 02:36:14 node1 cib[16831]: info: ci

[Pacemaker] warning: unpack_rsc_op: Processing failed op monitor for my_resource on node1: unknown error (1)

2013-04-30 Thread Brian J. Murrell
Using 1.1.8 on EL6.4, I am seeing this sort of thing:

pengine[1590]:  warning: unpack_rsc_op: Processing failed op monitor for 
my_resource on node1: unknown error (1)

The full log from the point of adding the resource until the errors:

Apr 30 11:46:30 node1 cibadmin[3380]:   notice: crm_log_args: Invoked: cibadmin 
-o resources -C -x /tmp/tmpHrgNZv 
Apr 30 11:46:30 node1 crmd[1591]:   notice: do_state_transition: State 
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL 
origin=abort_transition_graph ]
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: Diff: --- 0.24.5
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: Diff: +++ 0.25.1 
8a4aac3dcddc2689e4b336e1bf2078ff
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: -- 
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++   
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++ 

Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++   
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++ 

Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++ 
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++   
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++   
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++   
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++ 
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++ 

Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++   
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++ 

Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++   
Apr 30 11:46:30 node1 pengine[1590]:   notice: unpack_config: On loss of CCM 
Quorum: Ignore
Apr 30 11:46:30 node1 pengine[1590]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 11:46:30 node1 pengine[1590]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 11:46:30 node1 pengine[1590]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 11:46:30 node1 pengine[1590]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 11:46:30 node1 pengine[1590]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 11:46:30 node1 pengine[1590]:   notice: process_pe_message: Calculated 
Transition 5: /var/lib/pacemaker/pengine/pe-input-10.bz2
Apr 30 11:46:30 node1 cibadmin[3386]:   notice: crm_log_args: Invoked: cibadmin 
-o constraints -C -X  
Apr 30 11:46:30 node1 cib[1586]:   notice: log_cib_diff: cib:diff: Local-only 
Change: 0.26.1
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: -- 
Apr 30 11:46:30 node1 cib[1586]:   notice: cib:diff: ++   
Apr 30 11:46:33 node1 cib[1586]:   notice: cib:diff: Diff: --- 0.26.3
Apr 30 11:46:33 node1 cib[1586]:   notice: cib:diff: Diff: +++ 0.27.1 
8305c8fe19d06a6204bd04f437eb923a
Apr 30 11:46:33 node1 cib[1586]:   notice: cib:diff: -- 
Apr 30 11:46:33 node1 cib[1586]:   notice: cib:diff: ++ 
Apr 30 11:46:33 node1 cib[1586]:   notice: cib:diff: Diff: --- 0.27.2
Apr 30 11:46:33 node1 cib[1586]:   notice: cib:diff: Diff: +++ 0.28.1 
0dbddb3084f7cd76bffe21916538be94
Apr 30 11:46:33 node1 cib[1586]:   notice: cib:diff: --   
Apr 30 11:46:33 node1 cib[1586]:   notice: cib:diff: ++   
Apr 30 11:46:33 node1 crmd[1591]:  warning: do_update_resource: Resource 
my_resource no longer exists in the lrmd
Apr 30 11:46:33 node1 crmd[1591]:   notice: process_lrm_event: LRM operation 
my_resource_monitor_0 (call=31, rc=7, cib-update=0, confirmed=true) not running
Apr 30 11:46:33 node1 crmd[1591]:  warning: decode_transition_key: Bad UUID 
(crm_resource.c) in sscanf result (4) for 3397:0:0:crm_resource.c
Apr 30 11:46:33 node1 crmd[1591]:error: send_msg_via_ipc: Unknown 
Sub-system (3397_crm_resource)... discarding message.
Apr 30 11:47:50 node1 crmd[1591]:  warning: action_timer_callback: Timer popped 
(timeout=2, abort_level=100, complete=false)
Apr 30 11:47:50 node1 crmd[1591]:error: print_synapse: [Action5]: 
In-flight rsc op my_resource_monitor_0   on node1 (priority: 0, waiting: none)
Apr 30 11:47:50 node1 crmd[1591]:  warning: cib_action_update: rsc_op 5: 
my_resource_monitor_0 on node1 timed out
Apr 30 11:47:50 node1 crmd[1591]:   notice: run_graph: Transition 5 
(Complete=4, Pending=0, Fired=0, Skipped=1, Incomplete=0, 
Source=/var/lib/pacemaker/pengine/pe-input-10.bz2): Stopped
Apr 30 11:47:50 node1 pengine[1590]:   notice: unpack_config: On loss of CCM 
Quorum: Ignore
Apr 30 11:47:50 node1 pengine[1590]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 11:47:50 node1 pengine[1590]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 11:47:50 node1 pengine[1590]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 11:47:50 node1 pengine[1590]:  warning: unpack_rsc_op: Processing failed 
op monitor for my_resource on node1: unknown error (1)
Apr 30 11:47:50 node1 pengine[1590]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 11:47:50 node1 pengine[1590]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 11:47:50 node1 peng

Re: [Pacemaker] will a stonith resource be moved from an AWOL node?

2013-04-30 Thread Brian J. Murrell
On 13-04-30 11:13 AM, Lars Marowsky-Bree wrote:
> 
> Pacemaker 1.1.8's stonith/fencing subsystem directly ties into the CIB,
> and will complete the fencing request even if the fencing/stonith
> resource is not instantiated on the node yet.

But clearly that's not happening here.

> (There's a bug in 1.1.8 as
> released that causes an annoying delay here, but that's fixed since.)

Do you know which bug specifically so that I can see if the fix has been
applied here?

>> Node node1: UNCLEAN (pending)
>> Online: [ node2 ]
> 
>> node1 is very clearly completely off.  The cluster has been in this state, 
>> with node1 being off for several 10s of minutes now and still the stonith 
>> resource is running on it.
> 
> It shouldn't take so long. 

Indeed.  And FWIW, it's still in that state.

> I think your easiest path is to update.

Update to what?  I'm already using pacemaker-1.1.8-7 on EL6 and a yum
update is not providing anything newer.

Cheers,
b.





signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] will a stonith resource be moved from an AWOL node?

2013-04-30 Thread Brian J. Murrell
I'm using pacemaker 1.1.8 and I don't see stonith resources moving away
from AWOL hosts like I thought I did with 1.1.7.  So I guess the first
thing to do is clear up what is supposed to happen.

If I have a single stonith resource for a cluster and it's running on
node A and then node A goes AWOL, what happens to that stonith resource?

From what I think I know of pacemaker, it wants to be able to stonith
that AWOL node before moving any resources away from it, since starting
a resource on a new node while the state of the AWOL node is unknown is
unsafe, right?

But of course, if the resource that pacemaker wants to move is the
stonith resource, there's a bit of a catch-22.  It can't move the
stonith resource until it can stonith the node, which it cannot do
because the node running the resource is AWOL.

So, is pacemaker supposed to resolve this on its own or am I supposed
to create a cluster configuration that ensures that enough stonith
resources exist to mitigate this situation?
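
(i.e. do I need to spell something like the following out myself --
illustrative syntax only, with each node's fencing device kept off the
node it is meant to fence:

pcs stonith create st-node1 fence_xvm port=node1
pcs stonith create st-node2 fence_xvm port=node2
pcs constraint location st-node1 avoids node1
pcs constraint location st-node2 avoids node2

-- or is the single-device configuration below supposed to just work?)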

The case I have in hand is this:

# pcs config
Corosync Nodes:
 
Pacemaker Nodes:
 node1 node2 

Resources: 
 Resource: stonith (type=fence_xvm class=stonith)

Location Constraints:
Ordering Constraints:
Colocation Constraints:

Cluster Properties:
 dc-version: 1.1.8-7.wc1.el6-394e906
 expected-quorum-votes: 2
 no-quorum-policy: ignore
 symmetric-cluster: true
 cluster-infrastructure: classic openais (with plugin)
 stonith-enabled: true
 last-lrm-refresh: 1367331233

# pcs status
Last updated: Tue Apr 30 14:48:06 2013
Last change: Tue Apr 30 14:13:53 2013 via crmd on node2
Stack: classic openais (with plugin)
Current DC: node2 - partition WITHOUT quorum
Version: 1.1.8-7.wc1.el6-394e906
2 Nodes configured, 2 expected votes
1 Resources configured.


Node node1: UNCLEAN (pending)
Online: [ node2 ]

Full list of resources:

 stonith(stonith:fence_xvm):Started node1

node1 is very clearly completely off.  The cluster has been in this state, with 
node1 being off for several 10s of minutes now and still the stonith resource 
is running on it.

The log, since corosync noticed node1 going AWOL:

Apr 30 14:14:56 node2 corosync[1364]:   [TOTEM ] A processor failed, forming 
new configuration.
Apr 30 14:14:57 node2 corosync[1364]:   [pcmk  ] notice: pcmk_peer_update: 
Transitional membership event on ring 52: memb=1, new=0, lost=1
Apr 30 14:14:57 node2 corosync[1364]:   [pcmk  ] info: pcmk_peer_update: memb: 
node2 2608507072
Apr 30 14:14:57 node2 corosync[1364]:   [pcmk  ] info: pcmk_peer_update: lost: 
node1 4252674240
Apr 30 14:14:57 node2 corosync[1364]:   [pcmk  ] notice: pcmk_peer_update: 
Stable membership event on ring 52: memb=1, new=0, lost=0
Apr 30 14:14:57 node2 corosync[1364]:   [pcmk  ] info: pcmk_peer_update: MEMB: 
node2 2608507072
Apr 30 14:14:57 node2 corosync[1364]:   [pcmk  ] info: 
ais_mark_unseen_peer_dead: Node node1 was not seen in the previous transition
Apr 30 14:14:57 node2 corosync[1364]:   [pcmk  ] info: update_member: Node 
4252674240/node1 is now: lost
Apr 30 14:14:57 node2 corosync[1364]:   [pcmk  ] info: 
send_member_notification: Sending membership update 52 to 2 children
Apr 30 14:14:57 node2 corosync[1364]:   [TOTEM ] A processor joined or left the 
membership and a new membership was formed.
Apr 30 14:14:57 node2 corosync[1364]:   [CPG   ] chosen downlist: sender r(0) 
ip(192.168.122.155) ; members(old:2 left:1)
Apr 30 14:14:57 node2 corosync[1364]:   [MAIN  ] Completed service 
synchronization, ready to provide service.
Apr 30 14:14:57 node2 crmd[1666]:   notice: ais_dispatch_message: Membership 
52: quorum lost
Apr 30 14:14:57 node2 crmd[1666]:   notice: crm_update_peer_state: 
crm_update_ais_node: Node node1[4252674240] - state is now lost
Apr 30 14:14:57 node2 crmd[1666]:  warning: match_down_event: No match for 
shutdown action on node1
Apr 30 14:14:57 node2 crmd[1666]:   notice: peer_update_callback: 
Stonith/shutdown of node1 not matched
Apr 30 14:14:57 node2 cib[1661]:   notice: ais_dispatch_message: Membership 52: 
quorum lost
Apr 30 14:14:57 node2 cib[1661]:   notice: crm_update_peer_state: 
crm_update_ais_node: Node node1[4252674240] - state is now lost
Apr 30 14:14:57 node2 crmd[1666]:   notice: do_state_transition: State 
transition S_IDLE -> S_INTEGRATION [ input=I_NODE_JOIN cause=C_FSA_INTERNAL 
origin=check_join_state ]
Apr 30 14:14:57 node2 attrd[1664]:   notice: attrd_local_callback: Sending full 
refresh (origin=crmd)
Apr 30 14:14:57 node2 attrd[1664]:   notice: attrd_trigger_update: Sending 
flush op to all hosts for: probe_complete (true)
Apr 30 14:14:58 node2 pengine[1665]:   notice: unpack_config: On loss of CCM 
Quorum: Ignore
Apr 30 14:14:58 node2 pengine[1665]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 14:14:58 node2 pengine[1665]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 14:14:58 node2 pengine[1665]: crit: get_timet_now: Defaulting to 
'now'
Apr 30 14:14:58 node2 pengine[1665]: crit: get_timet_now: Defaulting to

Re: [Pacemaker] why so long to stonith?

2013-04-24 Thread Brian J. Murrell
On 13-04-24 01:16 AM, Andrew Beekhof wrote:
> 
> Almost certainly you are hitting:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=951340

Yup.  The patch posted there fixed it.

> I am doing my best to convince people that make decisions that this is worthy 
> of an update before 6.5.

I've added my voice to the bug, if that's any help.

b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] why so long to stonith?

2013-04-23 Thread Brian J. Murrell
Using pacemaker 1.1.8 on RHEL 6.4, I did a test where I just killed
(-KILL) corosync on a peer node.  Pacemaker seemed to take a long time,
though, to transition to stonithing it after noticing it was AWOL:

Apr 23 19:05:20 node2 corosync[1324]:   [TOTEM ] A processor failed, forming 
new configuration.
Apr 23 19:05:21 node2 corosync[1324]:   [pcmk  ] notice: pcmk_peer_update: 
Transitional membership event on ring 188: memb=1, new=0, lost=1
Apr 23 19:05:21 node2 corosync[1324]:   [pcmk  ] info: pcmk_peer_update: memb: 
node2 2608507072
Apr 23 19:05:21 node2 corosync[1324]:   [pcmk  ] info: pcmk_peer_update: lost: 
node1 4252674240
Apr 23 19:05:21 node2 corosync[1324]:   [pcmk  ] notice: pcmk_peer_update: 
Stable membership event on ring 188: memb=1, new=0, lost=0
Apr 23 19:05:21 node2 corosync[1324]:   [pcmk  ] info: pcmk_peer_update: MEMB: 
node2 2608507072
Apr 23 19:05:21 node2 corosync[1324]:   [pcmk  ] info: 
ais_mark_unseen_peer_dead: Node node1 was not seen in the previous transition
Apr 23 19:05:21 node2 corosync[1324]:   [pcmk  ] info: update_member: Node 
4252674240/node1 is now: lost
Apr 23 19:05:21 node2 corosync[1324]:   [pcmk  ] info: 
send_member_notification: Sending membership update 188 to 2 children
Apr 23 19:05:21 node2 corosync[1324]:   [TOTEM ] A processor joined or left the 
membership and a new membership was formed.
Apr 23 19:05:21 node2 corosync[1324]:   [CPG   ] chosen downlist: sender r(0) 
ip(192.168.122.155) ; members(old:2 left:1)
Apr 23 19:05:21 node2 corosync[1324]:   [MAIN  ] Completed service 
synchronization, ready to provide service.
Apr 23 19:05:21 node2 crmd[1634]:   notice: ais_dispatch_message: Membership 
188: quorum lost
Apr 23 19:05:21 node2 crmd[1634]:   notice: crm_update_peer_state: 
crm_update_ais_node: Node node1[4252674240] - state is now lost
Apr 23 19:05:21 node2 cib[1629]:   notice: ais_dispatch_message: Membership 
188: quorum lost
Apr 23 19:05:21 node2 cib[1629]:   notice: crm_update_peer_state: 
crm_update_ais_node: Node node1[4252674240] - state is now lost
Apr 23 19:08:31 node2 crmd[1634]:   notice: do_state_transition: State 
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED 
origin=crm_timer_popped ]
Apr 23 19:08:31 node2 pengine[1633]:   notice: unpack_config: On loss of CCM 
Quorum: Ignore
Apr 23 19:08:31 node2 pengine[1633]:  warning: pe_fence_node: Node node1 will 
be fenced because the node is no longer part of the cluster
Apr 23 19:08:31 node2 pengine[1633]:  warning: determine_online_status: Node 
node1 is unclean
Apr 23 19:08:31 node2 pengine[1633]:  warning: custom_action: Action 
MGS_e4a31b_stop_0 on node1 is unrunnable (offline)
Apr 23 19:08:31 node2 pengine[1633]:  warning: stage6: Scheduling Node node1 
for STONITH
Apr 23 19:08:31 node2 pengine[1633]:   notice: LogActions: Move
MGS_e4a31b#011(Started node1 -> node2)
Apr 23 19:08:31 node2 crmd[1634]:   notice: te_fence_node: Executing reboot 
fencing operation (15) on node1 (timeout=6)
Apr 23 19:08:31 node2 stonith-ng[1630]:   notice: handle_request: Client 
crmd.1634.642b9c6e wants to fence (reboot) 'node1' with device '(any)'
Apr 23 19:08:31 node2 stonith-ng[1630]:   notice: initiate_remote_stonith_op: 
Initiating remote operation reboot for node1: 
fb431eb4-789c-41bc-903e-4041d50e93b4 (0)
Apr 23 19:08:31 node2 pengine[1633]:  warning: process_pe_message: Calculated 
Transition 115: /var/lib/pacemaker/pengine/pe-warn-7.bz2
Apr 23 19:08:41 node2 stonith-ng[1630]:   notice: log_operation: Operation 
'reboot' [27682] (call 0 from crmd.1634) for host 'node1' with device 
'st-node1' returned: 0 (OK)
Apr 23 19:08:41 node2 stonith-ng[1630]:   notice: remote_op_done: Operation 
reboot of node1 by node2 for crmd.1634@node2.fb431eb4: OK
Apr 23 19:08:41 node2 crmd[1634]:   notice: tengine_stonith_callback: Stonith 
operation 3/15:115:0:c118573c-84e3-48bd-8dc9-40de24438385: OK (0)
Apr 23 19:08:41 node2 crmd[1634]:   notice: tengine_stonith_notify: Peer node1 
was terminated (st_notify_fence) by node2 for node2: OK 
(ref=fb431eb4-789c-41bc-903e-4041d50e93b4) by client crmd.1634
Apr 23 19:09:03 node2 corosync[1324]:   [pcmk  ] notice: pcmk_peer_update: 
Transitional membership event on ring 192: memb=1, new=0, lost=0
Apr 23 19:09:03 node2 corosync[1324]:   [pcmk  ] info: pcmk_peer_update: memb: 
node2 2608507072
Apr 23 19:09:03 node2 corosync[1324]:   [pcmk  ] notice: pcmk_peer_update: 
Stable membership event on ring 192: memb=2, new=1, lost=0
Apr 23 19:09:03 node2 corosync[1324]:   [pcmk  ] info: update_member: Node 
4252674240/node1 is now: member
Apr 23 19:09:03 node2 corosync[1324]:   [pcmk  ] info: pcmk_peer_update: NEW:  
node1 4252674240
Apr 23 19:09:03 node2 corosync[1324]:   [pcmk  ] info: pcmk_peer_update: MEMB: 
node2 2608507072
Apr 23 19:09:03 node2 corosync[1324]:   [pcmk  ] info: pcmk_peer_update: MEMB: 
node1 4252674240
Apr 23 19:09:03 node2 corosync[1324]:   [pcmk  ] info: 
send_member_notification: Sending membership update 192 to 2 children
Apr 23

[Pacemaker] crm_attribute not returning node attribute

2013-04-19 Thread Brian J. Murrell
Given:

host1# crm node attribute host1 show foo
scope=nodes  name=foo value=bar

Why doesn't this return anything:

host1# crm_attribute --node host1 --name foo --query
host1# echo $?
0

cibadmin -Q confirms the presence of the attribute:

  

  

  

This is on pacemaker 1.1.8 on EL6.4 and crmsh.

Thoughts?

b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] racing crm commands... last write wins?

2013-04-12 Thread Brian J. Murrell
On 13-04-10 07:02 PM, Andrew Beekhof wrote:
> 
> On 11/04/2013, at 6:33 AM, Brian J. Murrell 
>  wrote:
>>
>> Does crm_resource suffer from this problem
> 
> no

Excellent.

I was unable to find any comprehensive documentation on how to
implement a pacemaker configuration solely with crm_resource, and its
manpage doesn't seem to indicate any way to create resources, for
example.

Is it typical, when you don't want to use "crm" (or "pcs") and want
to rely on the crm_* group of commands, to do so in conjunction
with cibadmin for things like creating resources, etc.?  It seems so,
but I just want to make sure there is not something I have not uncovered
yet.
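
(For instance, I'm picturing resource creation going something like
this -- the resource, agent and file names are just made up for
illustration:

cat > /tmp/my_ip.xml <<EOF
<primitive id="my_ip" class="ocf" provider="heartbeat" type="IPaddr2">
  <instance_attributes id="my_ip-ia">
    <nvpair id="my_ip-ia-ip" name="ip" value="192.168.1.10"/>
  </instance_attributes>
</primitive>
EOF
cibadmin -o resources -C -x /tmp/my_ip.xml
crm_resource -r my_ip -p target-role -m -v Started

with everything after creation handled by crm_resource and friends.)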

Cheers,
b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] racing crm commands... last write wins?

2013-04-12 Thread Brian J. Murrell
On 13-04-11 06:00 PM, Andrew Beekhof wrote:
> 
> Actually, I think the semantics of -C are first-write-wins and any subsequent 
> attempts fail with -EEXSIST

Indeed, you are correct.  I think my point though was that it didn't
matter in my case which writer wins since they should all be trying to
write the same thing.

But it's good to make the semantics clear for anyone who comes across
this thread only to find that mine were in fact inaccurate.

Cheers
b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] racing crm commands... last write wins?

2013-04-11 Thread Brian J. Murrell
On 13-04-11 07:37 AM, Brian J. Murrell wrote:
> 
> In exploring all options, how about pcs?  Does pcs' "resource create
> ..." for example have the same read+modify+replace problem as crm
> configure or does pcs resource create also only send proper fragments to
> update just the part of the CIB it's operating on?

Having just cracked pcs open, it doesn't seem to.  It seems to create an
XML string which it then applies to the CIB with:

cibadmin -o resources -C -X $xml_resource_string

IIUC, that is compatible with my use-case of multiple nodes in the
cluster creating resources concurrently and not having last-write-wins
problems, correct?

Usually the nodes are creating separate resources, and in any case where
multiple nodes are creating the same resource (or adjusting some global
parameter), its configuration is the same from all writers, so
last-write-wins there doesn't matter, yes?

Cheers,
b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] racing crm commands... last write wins?

2013-04-11 Thread Brian J. Murrell
On 13-04-10 04:33 PM, Brian J. Murrell wrote:
> 
> Does crm_resource suffer from this problem or does it properly only send
> exactly the update to the CIB for the operation it's trying to achieve?

In exploring all options, how about pcs?  Does pcs' "resource create
..." for example have the same read+modify+replace problem as crm
configure or does pcs resource create also only send proper fragments to
update just the part of the CIB it's operating on?

Cheers,
b.





signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] racing crm commands... last write wins?

2013-04-10 Thread Brian J. Murrell
On 13-02-21 07:48 PM, Andrew Beekhof wrote:
> On Fri, Feb 22, 2013 at 5:18 AM, Brian J. Murrell 
>  wrote:
>> I wonder what happens in the case of two racing "crm" commands that want
>> to update the CIB (with non-overlapping/conflicting data).  Is there any
>> locking to ensure that one crm cannot overwrite the other's change?
>> (i.e. second one to get there has to read in the new CIB before being
>> able to apply his change and send it back)  Or if there is a situation
>> where one write stomps another's,
> 
> If my information is up-to-date, yes.
> 
> crmsh uses a read+modify+replace cycle, if B reads after A has read
> but before the replace has happened, data will be lost.

Does crm_resource suffer from this problem or does it properly only send
exactly the update to the CIB for the operation it's trying to achieve?

Cheers,
b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] stonith and avoiding split brain in two nodes cluster

2013-03-28 Thread Brian J. Murrell
On 13-03-25 03:50 PM, Jacek Konieczny wrote:
> 
> The first node to notice that the other is unreachable will fence (kill)
> the other, making sure it is the only one operating on the shared data.

Right.  But in a typical two-node cluster, where no-quorum is ignored,
as soon as there is a communications breakdown both nodes will notice
the other is unreachable and each will try to fence the other, entering
into a death-match.

It is entirely possible that both nodes end up killing each other and
now you have no nodes running any resources!

> Even though it is only half of the node, the cluster is considered
> quorate as the other node is known not to be running any cluster
> resources.
> 
> When the fenced node reboots its cluster stack starts, but with no
> quorum until it comminicates with the surviving node again. So no
> cluster services are started there until both nodes communicate
> properly and the proper quorum is recovered.

But this requires a two-node cluster to be able to determine quorum and
not be configured to ignore no-quorum, which I think is the entire point
of the OP's question.

b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] racing crm commands... last write wins?

2013-02-25 Thread Brian J. Murrell
On 13-02-25 10:30 AM, Dejan Muhamedagic wrote:
> 
> Before doing replace, crmsh queries the CIB and checks if the
> epoch was modified in the meantime.

But doesn't take out a lock of any sort to prevent an update in the
meanwhile, right?

> Those operations are not
> atomic, though.

Indeed.

> Perhaps there's a way to improve this.

Well, the CIB is a shared resource.  Shared resources need to be locked
against this sort of racy update.  Is there no locking of any kind at
any level of CIB-modifying operations?

i.e. does even cibadmin suffer from these last-write-wins races, with no
option or opportunity to lock the CIB?

> crm node/crm resource invoke crm_attribute or other crm_ tools.

Yes, that would be my expectation.  But somewhere, something has to
implement locking of the shared resource, yes?

> You should file a bugzilla then.

Indeed.  If there is no locking available anywhere, a ticket most
definitely needs filing.

Cheers,
b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] a situation where pacemaker refuses to stop

2013-02-25 Thread Brian J. Murrell
On 13-02-24 07:56 PM, Andrew Beekhof wrote:
> 
> Basically yes.
> Stonith is the first stage of recovery and supposed to be at least
> vaguely reliable.
> Have you figured out why fencing is so broken?

It wasn't really "broken" but was in the process of being configured
when this situation arose.  The set up hadn't gotten to configuring the
stonith resource yet.

> Part of the problem is that 2-node clusters have no concept of quorum,
> so they can get a bit trigger-happy in the name of data-integrity.
> If Pacemaker were to shut down in this case, it would be leaving
> things (as far as it can tell) in an inconsistent state which is
> likely result in bad things later on - there's not much point in
> "highly available corrupted data".

Fair enough I suppose.  It's a corner case that one wants/needs to try
to avoid then.  :-/

Cheers,
b.





signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] a situation where pacemaker refuses to stop

2013-02-23 Thread Brian J. Murrell
I seem to have found a situation where pacemaker (pacemaker-1.1.7-6.el6.x86_64)
refuses to stop (i.e. service pacemaker stop) on EL6.

The status of the two-node cluster was that the node being asked to stop
(node2) was continually trying to stonith another node (node1) in the
cluster which was not yet running corosync/pacemaker.  The reason
node2 was looping around the stonith operation for node1 was that there
was no stonith resource set up for node1 (yet).

The log on node2 simply repeats this over and over again:

stonith-ng[20695]:error: remote_op_done: Operation reboot of node1 by 
 for node2[d4e76f3a-42ed-4576-975e-b805ac30c04a]: Operation timed out
crmd[20699]: info: tengine_stonith_callback: StonithOp 
crmd[20699]:   notice: tengine_stonith_callback: Stonith operation 110 for 
node1 failed (Operation timed out): aborting transition.
crmd[20699]: info: abort_transition_graph: tengine_stonith_callback:454 - 
Triggered transition abort (complete=0) : Stonith failed
crmd[20699]:   notice: tengine_stonith_notify: Peer node1 was not terminated 
(reboot) by  for node2: Operation timed out 
(ref=18e93407-4efa-4b97-99e1-b331591598ef)
crmd[20699]:   notice: run_graph:  Transition 108 (Complete=2, Pending=0, 
Fired=0, Skipped=4, Incomplete=0, Source=/var/lib/pengine/pe-warn-3.bz2): 
Stopped
crmd[20699]:   notice: do_state_transition: State transition 
S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL 
origin=notify_crmd ]
pengine[20698]:   notice: unpack_config: On loss of CCM Quorum: Ignore
pengine[20698]:  warning: stage6: Scheduling Node node1 for STONITH
pengine[20698]:   notice: stage6: Scheduling Node node2 for shutdown
pengine[20698]:   notice: LogActions: Stopst-fencing#011(node2)
pengine[20698]:  warning: process_pe_message: Transition 109: WARNINGs found 
during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-3.bz2
crmd[20699]:   notice: do_state_transition: State transition S_POLICY_ENGINE -> 
S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE 
origin=handle_response ]
pengine[20698]:   notice: process_pe_message: Configuration WARNINGs found 
during PE processing.  Please run "crm_verify -L" to identify issues.
crmd[20699]: info: do_te_invoke: Processing graph 109 
(ref=pe_calc-dc-1361624958-120) derived from /var/lib/pengine/pe-warn-3.bz2
crmd[20699]:   notice: te_fence_node: Executing reboot fencing operation (7) on 
node1 (timeout=6)
stonith-ng[20695]: info: initiate_remote_stonith_op: Initiating remote 
operation reboot for node1: 96b06897-5ba7-46c3-b9d2-797113df2812
stonith-ng[20695]: info: can_fence_host_with_device: Refreshing port list 
for st-fencing
stonith-ng[20695]: info: can_fence_host_with_device: st-fencing can not 
fence node1: dynamic-list
stonith-ng[20695]: info: stonith_command: Processed st_query from node2: 
rc=0

and while that's repeating, the "service pacemaker stop" is producing:

node2# service pacemaker stop
Signaling Pacemaker Cluster Manager to terminate:  [  OK  ]
Waiting for cluster services to 
unload:.

I suppose this will continue forever until I either manually force
pacemaker down or fix up the cluster config to allow the stonith
operation to succeed.  In an environment where pacemaker is being
controlled by another process, this is clearly an undesirable
situation.

Is this behavior (the shutdown hanging while pacemaker spins trying
to stonith) expected?

Cheers,
b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] racing crm commands... last write wins?

2013-02-21 Thread Brian J. Murrell
I wonder what happens in the case of two racing "crm" commands that want
to update the CIB (with non-overlapping, non-conflicting data).  Is there
any locking to ensure that one crm cannot overwrite the other's change?
(i.e. the second one to get there has to read in the new CIB before being
able to apply its change and send it back)  Or, if there is a situation
where one write stomps on another's, is there at least some kind of
notification?

Ultimately, it would be bad for two nodes for example to issue:

# crm node attribute $(uname -n) set name value

at the same time and have one of those updates lost.

But that's what I think I am seeing.

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] return properties and rsc_defaults back to default values

2013-02-14 Thread Brian J. Murrell
Is there a way to return an individual property (or all properties)
and/or a rsc_default (or all) back to default values, using crm, or
otherwise?
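
I'm guessing something along these lines might do it, but it's an untested
sketch and the nvpair id below assumes the usual "rsc-options-<name>"
naming, which would need checking against "cibadmin -Q":

# delete a cluster property so it reverts to its built-in default
# (crm_config is crm_attribute's default --type)
crm_attribute --name default-resource-stickiness --delete

# delete an rsc_defaults entry by removing its nvpair
cibadmin --delete --xml-text '<nvpair id="rsc-options-resource-stickiness"/>'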

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] location constraint "anywhere" on asymmetric cluster

2013-01-30 Thread Brian J. Murrell
I'm experimenting with asymmetric clusters and resource location
constraints.

My cluster has some resources which have to be restricted to certain
nodes and other resources which can run on any node.  Given that, an
"opt-in" cluster seems the most manageable.  That is, it seems easier to
create constraints based on where things should run (i.e. a whitelist)
rather than having to maintain lists of where things shouldn't run
(blacklists).

I wonder though for the case of a resource that can run anywhere, if
there is any more elegant way to specify that in an asymmetric cluster
other than creating an expression using something like the "#uname"
attribute with an operation of "ne" and specifying a host that doesn't
exist for the expression's "value".

Of course other attributes could be used in a negative fashion such as
this but I wonder if there is a more "positive" syntax to express
"anywhere".  Ideally, I'd like the expression to be absolutely clear
that its intention is to allow the resource to run "anywhere" rather
than the reader having to deduce that by knowing that the node specified
doesn't exist.
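
One guess at a more "positive" spelling, untested and written in crm
configure syntax, would be a rule on an attribute that every node has:

location FOO-anywhere FOO rule 20: defined #uname

Since "defined #uname" should be true on every node, it reads as an
explicit opt-in to "anywhere" rather than a negative match against a
bogus host name.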

Alternatively, if there is a way to "whitelist" (i.e. meaning not having
to create and maintain a list of where not to run a resource) where
resources can run on a symmetric cluster that might be an acceptable
alternative.

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] best/proper way to shut down a node for service

2013-01-23 Thread Brian J. Murrell
On 13-01-23 03:32 AM, Dan Frincu wrote:
> Hi,

Hi,

> I usually put the node in standby, which means it can no longer run
> any resources on it. Both Pacemaker and Corosync continue to run, node
> provides quorum.

But a node in standby will still be STONITHed if it goes AWOL.  I put a
node in standby and then yanked its power and its peer started STONITH
operations on it.  That's the part I want to avoid.

b.




signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] best/proper way to shut down a node for service

2013-01-22 Thread Brian J. Murrell
OK.  So you have a corosync cluster of nodes with pacemaker managing
resources on them, including (of course) STONITH.

What's the best/proper way to shut down a node, say, for maintenance
such that pacemaker doesn't go trying to "fix" that situation and
STONITHing it to try to bring it back up, etc.?

Currently my practice for STONITH is to have it reboot.  Maybe it's a
better practice to have STONITH configured to just power a node down and
not try to power it back up for this exact reason?

Any other suggestions welcome.
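
For what it's worth, the sequence I'm leaning towards, sketched here with
the RHEL-style init scripts and not something I've convinced myself covers
every fencing corner case, is to drain the node and then leave the cluster
cleanly, on the theory that an orderly departure is not treated as a failure:

node1# crm node standby node1    # move resources off node1
node1# service pacemaker stop    # leave the cluster cleanly
node1# service corosync stop
... do the maintenance ...
node1# service corosync start
node1# service pacemaker start
node1# crm node online node1     # let it take resources again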

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-07-04 Thread Brian J. Murrell
On 12-07-04 04:27 AM, Andreas Kurz wrote:
> 
> beside increasing the batch limit to a higher value ... did you also
> tune corosync totem timings?

Not yet.

But a closer look at the logs reveals a bunch of these:

Jun 28 14:56:56 node-2 corosync[30497]:   [pcmk  ] ERROR: send_cluster_msg_raw: 
Child 25046 spawned to record non-fatal assertion failure line 1594: rc == 0
Jun 28 14:56:56 node-2 corosync[30497]:   [pcmk  ] ERROR: send_cluster_msg_raw: 
Message not sent (-1): 

signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-07-04 Thread Brian J. Murrell
On 12-07-04 02:12 AM, Andrew Beekhof wrote:
> On Wed, Jul 4, 2012 at 10:06 AM, Brian J. Murrell 
>  wrote:
>>
>> Just because I reduced the number of nodes doesn't mean that I reduced
>> the parallelism any.
> 
> Yes. You did.  You reduced the number of "check what state the
> resource is on every node" probes.

Let me apologize as I was not clear.  I meant I did not reduce the
amount of parallelism in *my* CIB modify operations.  I was simply
clarifying that my operations on a single node are not serialized and
thus reducing the total number of nodes and increasing the number of
operations per node was not reducing the contention of those operations
by putting more operations into a serial queue per node.

> Now I'm getting annoyed.
> I keep explaining this is not true yet you keep repeating the above assertion.

Yes, I understand what you are saying.  The ordering of the messages on
the list is unfortunate and some seem to have been crossing each other.
 The message you were replying to, which is annoying you was composed
before your subsequent messages and was in response to somebody else on
the list.

My apologies for the confusion.

b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-07-03 Thread Brian J. Murrell
On 12-07-03 04:26 PM, David Vossel wrote:
> 
> This is not a definite.  Perhaps you are experiencing this given the 
> pacemaker version you are running

Yes, that is absolutely possible and it certainly has been under
consideration throughout this process.  I did also recognize however,
that I am running the latest stable (1.1.6) release and while I might be
able to experiment with with a development branch in the lab, I could
not use it in production.  So while it would be an interesting
experiment, my primary goal had to be getting 1.1.6 to run stably.

> and the torture test you are running with all those parallel commands,

It is worth keeping in mind that all of those parallel commands are just
as parallel with the 4 node cluster as they are with the 8 (4 nodes
actively modifying the CIB + 4 completely idle nodes) and 16 node
clusters -- both of which failed.

Just because I reduced the number of nodes doesn't mean that I reduced
the parallelism any.  The commands being run on each node are not
serialized and are all launched in parallel on the 4 node cluster as
much as they were with the 16 node cluster.

So strictly speaking, it doesn't seem that parallelism in the CIB
modifications are as much of a factor as simply the number of nodes in
the cluster, even when some (i.e. in the 8 node test I did) of the nodes
are entirely passive and not modifying the CIB at all.

> but I wouldn't go as far as to say pacemaker cannot scale to more than a 
> handful of nodes.

I'd totally welcome being shown the error of my ways.

> I'm sure you know this, I just wanted to be explicit about this so there is 
> no confusion caused by people who may use your example as a concrete metric.

But of course.  In my experiments, it was clear that the cib process
could peg a single core on my 12 core Xeons with just 4 nodes in the
cluster at times.

Therefore it is also clear that some time down the road, assuming CPU is
the limiting factor here, it's quite easy to see how a faster CPU core,
or multithreading the cib would allow for better scaling, but my point
was simply at the current time, and again, assuming (since I don't know
for sure what the limiting factor really is) CPU is the limiting factor
here, somewhere between 4-8 nodes is the limit with more or less default
tunings.

> From the deployments I've seen on the mailing list and bug reports, the most 
> common clusters appear to be around the 2-6 node mark.

Which seems consistent.

> The messaging involved with keeping the all the local resource operations in 
> the CIB synced across that many nodes is pretty insane.

Indeed, and I most certainly had considered that.  What really threw a
curve in that train of thought for me though was that even idle,
non-CIB-modifying nodes (i.e. turning a working 4 node cluster into a
non-working 8 node cluster by adding 4 nodes that do nothing with the
CIB) can tip a working configuration over into non-working.

I could most certainly see how the contention of 8 nodes all trying to
jam stuff into the CIB might be taxing with all of the locking that
needs to go on, etc, but for those 4 added idle nodes to add enough
complexity to make a working 4 node cluster not work is puzzling.
Puzzling enough (granted, to somebody who knows zilch about the
messaging that goes on with CIB operations) to make it smell more like a
bug than simple contention.

> If you are set on using pacemaker,

Well, I am not necessarily married to it.  It did just seem like the
tool with the critical mass behind it.  As sketchy as it might seem to
ask, (and I only am since you seem to be hinting that there might be a
better tool for the job) is there a tool more suited to the job?

> the best approach for scaling for your situation would probably be to try and 
> figure out how to break nodes into smaller clusters that are easier to manage.

Indeed, that is what I ended up doing.  Now my 16 node cluster is 4 4
node clusters.  The problem with that though, is that when a node in a
cluster fails, it has only 3 other nodes to spread it's resources around
onto, and if 2 should fail, 2 nodes are trying to service twice their
normal load.  The benefit of larger clusters is clear. in giving
pacemaker more nodes to evenly distribute resources to, impacting the
load of other the other nodes minimally when one or more nodes of the
cluster do fail.

> I have not heard of a single deployment as large as you are thinking of.

Heh.  Not atypical of me to push the envelope I'm afraid.  :-/

Cheers, and many thanks for your input.  It is valuable to this discussion.

b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-07-03 Thread Brian J. Murrell
On 12-07-03 06:17 PM, Andrew Beekhof wrote:
> 
> Even adding passive nodes multiplies the number of probe operations
> that need to be performed and loaded into the cib.

So it seems.  I just would not have thought they'd be such a load:
from a simplistic perspective, since they are not trying to update the
CIB, it seems they just need an update of it when the rest of the nodes
doing the updating are done.  But I do admit that could be a simplistic
view.

> Did you try any of the settings I suggested?

The only setting I saw you suggest was "batch-limit" and at first glance
it did not seem clear to me which way to adjust this (up or down) and I
was running out of time for experimentation and just needed to get to
something that works.

So now that pressure is off and I have a bit of time to experiment, what
value would you suggest for that parameter given the 32 resources and
constraints I want to add on a 16 node cluster?
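
My guess, and it is only a guess, is that for this workload the value
wants to go down rather than up, since batch-limit caps how many actions
the transition engine fires in parallel:

# throttle the transition engine; the number is a placeholder to tune, not a recommendation
crm configure property batch-limit=10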

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-07-03 Thread Brian J. Murrell
On 12-06-27 11:30 PM, Andrew Beekhof wrote:
> 
> The updates from you aren't the problem.  Its the number of resource
> operations (that need to be stored in the CIB) that result from your
> changes that might be causing the problem.

Just to follow this up for anyone currently following or anyone finding
this thread in the future...

It turns out that the problem is simply the size of the HA cluster that
I want to create.  The details are in the bug I filed at
http://bugs.clusterlabs.org/show_bug.cgi?id=5076 but the short story is
that I can add the number of resources and constraints I want to add
(i.e. 32-34 of each, as previously described in this thread),
concurrently even, so long as there are not more than 4 nodes per
corosync/pacemaker cluster.

Even adding 4 passive nodes, nodes that do no CIB operations of their
own, made pacemaker crumble.  (I only tried 8 nodes total, not values
between 4 and 8, so the tipping point might be anywhere in between.)

So the summary seems to be that pacemaker cannot scale to more than a
handful of nodes, even when the nodes are big: 12 core Xeon nodes with
gobs of memory.

I can only guess that everybody is using pacemaker in "pair" (or not
much bigger) type configurations currently.  Is that accurate?

Perhaps there is some tuning that can be done to scale somewhat, but
realistically, I am looking for pacemaker clusters in the tens, if not
into the hundreds of nodes.  However, I really wonder if any amount of
tuning could be done to achieve clusters that large given the small
number of nodes supported with the default tuning values.

Thoughts?

b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-06-27 Thread Brian J. Murrell
On 12-06-26 09:54 PM, Andrew Beekhof wrote:
> 
> The DC, possibly you didn't have one at that moment in time.

It was the DC in fact.  I restarted corosync on that node and the
timeouts went away.  But note I "re"started, not started.  It was
running at the time, just not properly, apparently.

> Were there (m)any membership events occurring at the time?

I'm not sure.

I do seem to be able to reproduce this situation though with some
software I have that's driving pacemaker configuration building.

I essentially have 34 resources across 17 nodes that I need to populate
pacemaker with, complete with location constraints.  This populating is
done with a pair of cibadmin commands, one for the resource and one for
the constraint.  These pairs of commands are being run for each resource
on the nodes on which they will run.

So, that's 17 pairs of cibadmin commands being run, one pair on each
node, concurrently -- so yes, lots of thrashing of the CIB.  Is the CIB
and/or cibadmin not up to this kind of thrashing?

Typically while this is happening some number of cibadmin commands will
start failing with:

Call cib_create failed (-41): Remote node did not respond

and then calls to (say) "cibadmin -Q" on every node except the DC will
start failing with:

Call cib_query failed (-41): Remote node did not respond

After restarting corosync on the DC, (most if not all of) the non-DC
nodes are now able to return from "cibadmin -Q" but they have differing
CIB contents.  That state doesn't seem to last long and all nodes except
the (typically new/different) DC node again suffer "Remote node did not
respond".  A restart of that new DC again yields some/most of the nodes
able to complete queries again, but again with differing CIB content.

I am using corosync-1.4.1-4.el6_2.3 and pacemaker-1.1.6-3.el6 on these
nodes.

Any ideas?  Am I really pushing the CIB too hard with all of the
concurrent modifications?
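
For what it's worth, since -41 looks like a timeout rather than a hard
failure, the blunt mitigation I'm considering is simply retrying.  This is
a hypothetical wrapper (cib_retry is my own name), not a fix for whatever
the underlying issue is:

cib_retry() {
    # retry a cibadmin call a few times; -41 means the DC did not answer in
    # time, so resubmitting the same request sometimes gets through
    local tries=5
    while ! cibadmin "$@"; do
        tries=$((tries - 1))
        [ $tries -gt 0 ] || return 1
        sleep 2
    done
}

cib_retry -o resources -C -x /tmp/resource-1.xml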

b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] Call cib_query failed (-41): Remote node did not respond

2012-06-26 Thread Brian J. Murrell
So, I have an 18 node cluster here (so a small haystack, indeed, but
still a haystack in which to try to find a needle) where a certain
set of (yet unknown, figuring that out is part of this process)
operations are pooching pacemaker.  The symptom is that on one or
more nodes I get the following kinds of errors:

# cibadmin -Q
Call cib_query failed (-41): Remote node did not respond

along with similar things in the log:

Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 7 
failed: (rc=-41) Remote node did not respond
Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 8 
failed: (rc=-41) Remote node did not respond
Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 9 
failed: (rc=-41) Remote node did not respond
Jun 26 19:51:38 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 10 
failed: (rc=-41) Remote node did not respond
Jun 26 19:51:39 iu-18 crmd: [19119]: WARN: cib_rsc_callback: Resource update 11 
failed: (rc=-41) Remote node did not respond

Clearly some node in the cluster has a problem, but nothing in any
of these messages is helping me figure out which one it is.

Any hints on how I figure out which node this "iu-18" node is having
problems communicating with?

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] manually failing back resources when set sticky

2012-03-30 Thread Brian J. Murrell
On 12-03-30 02:35 PM, Florian Haas wrote:
> 
> crm configure rsc_defaults resource-stickiness=0
> 
> ... and then when resources have moved back, set it to 1000 again.
> It's really that simple. :)

That sounds racy.  I am changing a parameter which has the potential to
affect the stickiness of all resources for a (hopefully brief) period of
time.  If there is some other fail{ure,over} transaction in play while I
do this I might adversely affect my policy of no-automatic-failback
mightn't I?

Since this suggestion is also non-atomic, meaning I set a constraint,
wait for the result of the change in allocation due to that setting and
then "undo" it when the allocation change has completed, wouldn't I just
be better to use "crm resource migrate FOO" and then monitor for the
reallocation and then remove the "cli-standby-FOO" constraint when it
has?  Wouldn't this effect your suggestion in the same non-atomic manner
but be sure to only affect the one resource I am trying to fail back?

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] manually failing back resources when set sticky

2012-03-30 Thread Brian J. Murrell
In my cluster configuration, each resource can be run on one of two nodes
and I designate a "primary" and a "secondary" using location constraints
such as:

location FOO-primary FOO 20: bar1
location FOO-secondary FOO 10: bar2

And I also set a default stickiness to prevent auto-fail-back (i.e. to
prevent flapping):

rsc_defaults $id="rsc-options" resource-stickiness="1000"

This all works as I expect.  Resources run where I expect them to while
everything is operating normally and when a node fails the resource
migrates to the secondary and stays there even when the primary node
comes back.

The question is, what is the proper administrative command(s) to move
the resource back to its "primary" after I have manually determined
that that node is OK after coming back from a failure?

I figure I could just create a new resource constraint, wait for the
migration and then remove it, but I just wonder if there is a more
atomic "move back to your preferred node" command I can issue.

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] resources show as running on all nodes right after adding them

2012-03-28 Thread Brian J. Murrell
On 12-03-28 10:39 AM, Florian Haas wrote:
> 
> Probably because your resource agent reports OCF_SUCCESS on a probe
> operation

To be clear, is this the "status" $OP in the agent?

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


[Pacemaker] resources show as running on all nodes right after adding them

2012-03-28 Thread Brian J. Murrell
We seem to have occasion where we find crm_resource reporting that a
resource is running on more nodes than it should be (usually all!) when
we query right after adding it:

# crm_resource --resource chalkfs-OST_3 --locate
resource chalkfs-OST_3 is running on: chalk02 
resource chalkfs-OST_3 is running on: chalk03 
resource chalkfs-OST_3 is running on: chalk04 
resource chalkfs-OST_3 is running on: chalk01 

Further checking reveals:

# crm status

Last updated: Mon Dec 19 11:30:31 2011
Stack: openais
Current DC: chalk01 - partition with quorum
Version: 1.0.11-1554a83db0d3c3e546cfd3aaff6af1184f79ee87
4 Nodes configured, 4 expected votes
3 Resources configured.


Online: [ chalk01 chalk02 chalk03 chalk04 ]

MGS_1   (ocf::hydra:Target):Started chalk01
chalkfs-OST_3   (ocf::hydra:Target) Started [   chalk02 chalk03 chalk04 
chalk01 ]
resource chalkfs-OST_3 is running on: chalk02 
resource chalkfs-OST_3 is running on: chalk03 
resource chalkfs-OST_3 is running on: chalk04 
resource chalkfs-OST_3 is running on: chalk01 

Clearly this resource is not running on all nodes, so why is it
being reported as such?

b.





signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] running a resource on any node in an asymmetric cluster

2011-10-26 Thread Brian J. Murrell
On 11-10-26 10:19 AM, Brian J. Murrell wrote:
> 
> # cat /tmp/foo.xml
> 
>   
   ^^^
I figured it out.  This "integer" has to be quoted.  I'm thinking too
much like a programmer.  :-/
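
For the record, the constraint was something along these lines; the ids
and score here are illustrative rather than the exact file I used, so
treat it as a sketch:

cat > /tmp/foo.xml <<'EOF'
<rsc_location id="bar-anywhere" rsc="bar">
  <rule id="bar-anywhere-rule" score="100">
    <expression id="bar-anywhere-expr" attribute="#uname" operation="ne" value="foo"/>
  </rule>
</rsc_location>
EOF
cibadmin -o constraints -C -x /tmp/foo.xml

Note the score attribute quoted like any other XML attribute value.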

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


[Pacemaker] running a resource on any node in an asymmetric cluster

2011-10-26 Thread Brian J. Murrell
I want to be able to run a resource on any node in an asymmetric
cluster so I tried creating a rule to run it on any node not named
"foo" since there are no nodes named foo in my cluster:

# cat /tmp/foo.xml

  

  


for the resource bar:

primitive bar stonith:fence_virsh \
params ipaddr="192.168.122.1" login="root" 
identity_file="/root/.ssh/id_rsa-virsh" port="node2" action="reboot" 
secure="true" pcmk_host_list="node2" pcmk_host_check="static-list" 
pcmk_host_map=""

and apply that with:

# cibadmin -o constraints -C -x /tmp/foo.xml

I get:

Call cib_create failed (-47): Update does not conform to the configured 
schema/DTD

Can anyone point out why?  I thought I followed Example 8.9 from

http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-rules-location.html

I even tried substituting the node foo for "node2" (since it doesn't
make much sense to run a stonith resource for node2 on node2) but that
doesn't change the result of cibadmin.

Cheers and thanks,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


[Pacemaker] cloning primatives with differing params

2011-10-25 Thread Brian J. Murrell
I want to create a stonith primitive and clone it for each node in my
cluster.  I'm using the fence-agents virsh agent as my stonith
primitive.  Currently for a single node it looks like:

primitive st-pm-node1 stonith:fence_virsh \
params ipaddr="192.168.122.1" login="xxx" passwd="xxx" port="node1" 
action="reboot" pcmk_host_list="node1" pcmk_host_check="static-list" 
pcmk_host_map="" secure="true"

But of course that only works for one node and I want to create a
clonable primitive that will apply to all nodes as they are added
to the cluster.  What is stumping me though is the required "port"
parameter which is the node to stonith.  I've not seen an example
of how a clone resource can be created that can substitute values
in for each clone.  Is that even possible?
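
If per-clone parameter substitution turns out not to be possible, my
fallback plan is to just generate one primitive per node from a loop.
A sketch, with placeholder node names and credentials:

for n in node1 node2; do
    crm configure primitive st-pm-$n stonith:fence_virsh \
        params ipaddr="192.168.122.1" login="xxx" passwd="xxx" port="$n" \
        action="reboot" pcmk_host_list="$n" pcmk_host_check="static-list" secure="true"
done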

On a pretty un-related question... given an asymmetric cluster, is there
a way to specify that a resource can run on any node without having
to add a location constraint for each node as they are added?

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] stonith configured but not happening

2011-10-18 Thread Brian J. Murrell
On 11-10-18 09:40 AM, Andreas Kurz wrote:
> Hello,

Hi,

> I'd expect this to be the problem ... if you insist on using an
> unsymmetric cluster you must add a location score for each resource you
> want to be up on a node ... so add a location constraint for the fencing
> clone for each node ... or use a symmetric cluster.

Ah.  Of course.  Setting location constraints fixed that.

Thanx much!

b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


[Pacemaker] stonith configured but not happening

2011-10-18 Thread Brian J. Murrell
I have a pacemaker 1.0.10 installation on rhel5 but I can't seem to
manage to get a working stonith configuration.  I have tested my stonith
device manually using the stonith command and it works fine.  What
doesn't seem to be happening is pacemaker/stonithd actually asking for a
stonith.  In my log I get:

Oct 18 08:54:23 mds1 stonithd: [4645]: ERROR: Failed to STONITH the node
oss1: optype=RESET, op_result=TIMEOUT
Oct 18 08:54:23 mds1 crmd: [4650]: info: tengine_stonith_callback:
call=-975, optype=1, node_name=oss1, result=2, node_list=,
action=17:1023:0:4e12e206-e0be-4915-bfb8-b4e052057f01
Oct 18 08:54:23 mds1 crmd: [4650]: ERROR: tengine_stonith_callback:
Stonith of oss1 failed (2)... aborting transition.
Oct 18 08:54:23 mds1 crmd: [4650]: info: abort_transition_graph:
tengine_stonith_callback:402 - Triggered transition abort (complete=0) :
Stonith failed
Oct 18 08:54:23 mds1 crmd: [4650]: info: update_abort_priority: Abort
priority upgraded from 0 to 100
Oct 18 08:54:23 mds1 crmd: [4650]: info: update_abort_priority: Abort
action done superceeded by restart
Oct 18 08:54:23 mds1 crmd: [4650]: info: run_graph:

Oct 18 08:54:23 mds1 crmd: [4650]: notice: run_graph: Transition 1023
(Complete=2, Pending=0, Fired=0, Skipped=7, Incomplete=0,
Source=/var/lib/pengine/pe-warn-5799.bz2): Stopped
Oct 18 08:54:23 mds1 crmd: [4650]: info: te_graph_trigger: Transition
1023 is now complete
Oct 18 08:54:23 mds1 crmd: [4650]: info: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_FSA_INTERNAL origin=notify_crmd ]
Oct 18 08:54:23 mds1 crmd: [4650]: info: do_state_transition: All 1
cluster nodes are eligible to run resources.
Oct 18 08:54:23 mds1 crmd: [4650]: info: do_pe_invoke: Query 1307:
Requesting the current CIB: S_POLICY_ENGINE
Oct 18 08:54:23 mds1 crmd: [4650]: info: do_pe_invoke_callback: Invoking
the PE: query=1307, ref=pe_calc-dc-1318942463-1164, seq=16860, quorate=0
Oct 18 08:54:23 mds1 pengine: [4649]: notice: unpack_config: On loss of
CCM Quorum: Ignore
Oct 18 08:54:23 mds1 pengine: [4649]: info: unpack_config: Node scores:
'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Oct 18 08:54:23 mds1 pengine: [4649]: WARN: pe_fence_node: Node oss1
will be fenced because it is un-expectedly down
Oct 18 08:54:23 mds1 pengine: [4649]: info:
determine_online_status_fencing: #011ha_state=active, ccm_state=false,
crm_state=online, join_state=pending, expected=member
Oct 18 08:54:23 mds1 pengine: [4649]: WARN: determine_online_status:
Node oss1 is unclean
Oct 18 08:54:23 mds1 pengine: [4649]: WARN: pe_fence_node: Node mds2
will be fenced because it is un-expectedly down
Oct 18 08:54:23 mds1 pengine: [4649]: info:
determine_online_status_fencing: #011ha_state=active, ccm_state=false,
crm_state=online, join_state=pending, expected=member
Oct 18 08:54:23 mds1 pengine: [4649]: WARN: determine_online_status:
Node mds2 is unclean
Oct 18 08:54:23 mds1 pengine: [4649]: info:
determine_online_status_fencing: Node oss2 is down
Oct 18 08:54:23 mds1 pengine: [4649]: info: determine_online_status:
Node mds1 is online
Oct 18 08:54:23 mds1 pengine: [4649]: notice: native_print:
MGS_2#011(ocf::hydra:Target):#011Started mds1
Oct 18 08:54:23 mds1 pengine: [4649]: notice: native_print:
testfs-MDT_3#011(ocf::hydra:Target):#011Started mds2
Oct 18 08:54:23 mds1 pengine: [4649]: notice: native_print:
testfs-OST_4#011(ocf::hydra:Target):#011Started oss1
Oct 18 08:54:23 mds1 pengine: [4649]: notice: clone_print:  Clone Set:
fencing
Oct 18 08:54:23 mds1 pengine: [4649]: notice: short_print:  Stopped:
[ st-pm:0 st-pm:1 st-pm:2 st-pm:3 ]
Oct 18 08:54:23 mds1 pengine: [4649]: info: get_failcount:
testfs-MDT_3 has failed 10 times on mds1
Oct 18 08:54:23 mds1 pengine: [4649]: notice: common_apply_stickiness:
testfs-MDT_3 can fail 90 more times on mds1 before being forced off
Oct 18 08:54:23 mds1 pengine: [4649]: info: native_color: Resource
testfs-OST_4 cannot run anywhere
Oct 18 08:54:23 mds1 pengine: [4649]: info: native_color: Resource
st-pm:0 cannot run anywhere
Oct 18 08:54:23 mds1 pengine: [4649]: info: native_color: Resource
st-pm:1 cannot run anywhere
Oct 18 08:54:23 mds1 pengine: [4649]: info: native_color: Resource
st-pm:2 cannot run anywhere
Oct 18 08:54:23 mds1 pengine: [4649]: info: native_color: Resource
st-pm:3 cannot run anywhere
Oct 18 08:54:23 mds1 pengine: [4649]: WARN: custom_action: Action
testfs-MDT_3_stop_0 on mds2 is unrunnable (offline)
Oct 18 08:54:23 mds1 pengine: [4649]: WARN: custom_action: Marking node
mds2 unclean
Oct 18 08:54:23 mds1 pengine: [4649]: notice: RecurringOp:  Start
recurring monitor (120s) for testfs-MDT_3 on mds1
Oct 18 08:54:23 mds1 pengine: [4649]: WARN: custom_action: Action
testfs-OST_4_stop_0 on oss1 is unrunnable (offline)
Oct 18 08:54:23 mds1 pengine: [4649]: WARN: custom_action: Marking node
oss1 unclean
Oct 18 08:54:23 mds1 pengine: [4649]: WARN: stage6: 

[Pacemaker] concurrent uses of cibadmin: Signon to CIB failed: connection failed

2011-09-29 Thread Brian J. Murrell
So, in another thread there was a discussion of using cibadmin to
mitigate a possible concurrency issue of crm shell. I have written a test
program to test that theory and unfortunately cibadmin falls down in the
face of heavy concurrency also with errors such as:

Signon to CIB failed: connection failed
Init failed, could not perform requested operations
Signon to CIB failed: connection failed
Init failed, could not perform requested operations
Signon to CIB failed: connection failed
Init failed, could not perform requested operations

Effectively my test runs:

for x in $(seq 1 50); do
cibadmin -o resources -C -x resource-$x.xml &
done

My complete test program is attached for review/experimentation if you wish.

Am I doing something wrong or is this a bug?  I'm using pacemaker
1.0.10-1.4.el5 for what it's worth.

Cheers,
b.
#!/bin/bash

set -e

trap 'echo "got an error"' ERR

add_resource() {
local num="$1"
local add_constraints=${2:-false}
local use_crm=${3:-false}

cat <<EOF > /tmp/resource-$num.xml

  

  
  



  
  

  

EOF

if ! $use_crm; then
cibadmin -o resources -C -x /tmp/resource-$num.xml
if $add_constraints; then
cibadmin -o constraints -C -X ""
fi
else
crm configure primitive resource-$num ocf:foo:Target meta \
target-role="stopped" operations \$id="resource-$num-operations" \
op monitor interval="120" timeout="60" op start interval="0" \
timeout="300" op stop interval="0" timeout="300" params \
target="resource-$num"
fi

}

remove_resource() {
local num="$1"

cibadmin -D -X ""
cibadmin -D -X ""

}

CLEAN=${CLEAN:-false}
USE_CRM=${USE_CRM:-false}
ADD_CONSTRAINTS=${ADD_CONSTRAINTS:-false}
CONCURRENT=${CONCURRENT:-false}

# not interested in concurrent cleaning at this time
if $CLEAN; then
CONCURRENT=false
fi

for x in $(seq 1 50); do
if $CLEAN; then
remove_resource $x $AMP
else
if $CONCURRENT; then
add_resource $x $USE_CRM $ADD_CONSTRAINTS &
else
add_resource $x $USE_CRM $ADD_CONSTRAINTS 
fi
fi
done
if $CONCURRENT; then
echo "waiting: $(date)"
wait
echo "done: $(date)"
fi


signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] Concurrent runs of 'crm configure primitive' interfering

2011-09-28 Thread Brian J. Murrell
On 11-09-28 10:20 AM, Dejan Muhamedagic wrote:
> Hi,

Hi,

> I'm really not sure. Need to investigate this area more.

Well, I am experimenting with cibadmin.  It's certainly not as nice and
shiny as crm shell though.  :-)

> cibadmin talks to the cib (the process) and cib should allow
> only one writer at the time.

Good.  That's needed of course.  But what does it do with other
would-be writers?  Do they block until the CIB is available to write,
or are they turned away with an error?

> The shell keeps the changes in its memory until the user says
> commit (or if it's a single-shot configure command). Just before
> doing the commit, it checks (using cibadmin) if the CIB changed
> in the meantime (i.e. since it was last time loaded or refreshed
> in crm) and if so it refuses to commit changes.

Ah.

> That is,
> _unless_ it is forced to do so. So, if you use the -F option,
> one crm instance is likely to override changes of another crm
> instance or, for that matter, of anybody else.

But is crm writing (i.e. replacing) entire CIBs or just updating
fragments of it, like the resources and constraints, etc. it's being
asked to operate on by the user?

If the the latter, then two crm instances that are forced to write
non-overlapping fragments should result in both being successful, if the
cib is locking out concurrent cibadmin writers the way it should be, yes?

> In short, having more than one crm instance trying to modify the
> configuration simultaneously probably won't give good results.

As long as they are making non-colliding changes, shouldn't they both be
successful?

> And the matter is simple: If the cluster CIB changed since the
> crm itself accepted configuration modifications, there's no way
> to say which changes should take precedence and there's no
> obvious way to merge the changes coming from two different
> sources.

Indeed, assuming they conflict.  But if they don't, there shouldn't be
any problem with two crms working on independent resources and
constraints, yes?

> What's your use case?

We're using tools to drive HA configuration where those tools go out to
the various nodes in the cluster and perform configuration tasks,
possibly and probably in parallel, one of which is to issue the crm
commands to configure the resources and constraints that that node will
primarily be responsible for.

b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] Concurrent runs of 'crm configure primitive' interfering

2011-09-28 Thread Brian J. Murrell
On 11-09-16 11:14 AM, Dejan Muhamedagic wrote:
> On Thu, Sep 08, 2011 at 03:41:42PM +0100, John Spray wrote:
> 
>>  * Is there another way of adding resources which would be safe when
>> run concurrently?
> 
> cibadmin.

But doesn't crm use cibadmin itself and if so, shouldn't whatever
benefits of using cibadmin directly filter up to crm shell?  Put another
way, if crm shell is just using cibadmin, isn't it likely that cibadmin
will exhibit the same concurrency issue?

> This sounds to me like a deficiency in the crm shell.

Of course I'm still willing to believe that.  I just wonder what crm
shell could be doing that fails with concurrent commands in a way that
cibadmin would not also fail.

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] Call cib_modify failed (-22): The object/attribute does not exist

2011-09-26 Thread Brian J. Murrell
On 11-09-25 09:21 PM, Andrew Beekhof wrote:
> 
> As the error says, the resource R_10.10.10.101 doesn't exist yet.
> Put it in a  tag or use -C instead of -U

Thanks much.  I already replied to Tim, but the summary is that the
manpage is incorrect in two places.  One is specifying the attributes
tag in the fragment it wants to add in the EXAMPLES section and the
second is that it's suggesting to use -U to "add" resources.

Cheers and thanks,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] Call cib_modify failed (-22): The object/attribute does not exist

2011-09-26 Thread Brian J. Murrell
On 11-09-26 03:44 AM, Tim Serong wrote:
> 
> Because:
> 
> 1) You need to run "cibadmin -o resources -C -x test.xml" to create the
>resource (-C creates, -U updates an existing resource).

That's what I thought/wondered but the EXAMPLES section in the manpage
is quite clear that it's asking one to use "-U" to "add an IPaddr2
resource to the resources  section", and then -C wouldn't work either
because...

> 2) Even if you use -C, it will probably still fail due to a schema
>violation,

Indeed, which was pushing me back to -U as I was interpreting the schema
violation error as telling me that -C wanted an entire CIB XML file, not
just a resource fragment.

> because the  element is bogus (apparently the
>cibadmin man page needs tweaking).

Very much so.

> Try:
> 
>   provider="heartbeat">
> 
>   
>   
> 
>   

Indeed, that works wonderfully.

> Better yet, use the crm shell instead of cibadmin, and you can forget
> about the XML :)

I so wish.  I have already gone down that path but ran into
http://permalink.gmane.org/gmane.linux.highavailability.pacemaker/10911
so now I am testing the recommended use of cibadmin instead of crm shell.

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


[Pacemaker] Call cib_modify failed (-22): The object/attribute does not exist

2011-09-24 Thread Brian J. Murrell
Using pacemaker-1.0.10-1.4.el5 I am trying to add the "R_10.10.10.101"
IPaddr2 example resource:


 
  
   
   
  
 


from the cibadmin manpage under EXAMPLES and getting:

# cibadmin -o resources -U -x test.xml
Call cib_modify failed (-22): The object/attribute does not exist


Any ideas why?

Thanx,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


Re: [Pacemaker] resource stickiness and preventing stonith on failback

2011-09-19 Thread Brian J. Murrell
On 11-09-19 11:02 PM, Andrew Beekhof wrote:
> On Wed, Aug 24, 2011 at 6:56 AM, Brian J. Murrell 
>  wrote:
>>
>> 2. preventing the active node from being STONITHed when the resource
>>   is moved back to it's failed-and-restored node after a failover.
>>   IOW: BAR1 is available on foo1, which fails and the resource is moved
>>   to foo2.  foo1 returns and the resource is failed back to foo1, but
>>   in doing that foo2 is STONITHed.
>>
>> As for #2, the issue with STONITHing foo2 when failing back to foo1 is
>> that foo1 and foo2 are an active/active pair of servers.  STONITHing
>> foo2 just to restore foo1's services puts foo2's services out of service,
>>
>> I do want a node that is believed to be dead to be STONITHed before it's
>> resource(s) are failed over though.
> 
> Thats a great way to ensure your data gets trashed.

What's that?

> If the "node that is believed to be dead" isn't /actually/ dead,
> you'll have two nodes running the same resources and writing to the
> same files.

Where did I say I wanted a node that was believed to be dead not to be
STONITHed before another node takes over the resource?  I actually said
(I left it in the quoted portion above if you want to go back and read
it) "I do want a node that is believed to be dead to be STONITHed before
it's resource(s) are failed over though."

The node I don't want STONITHed is the failover node that is alive and
well and can be told to release the resource cleanly and can confirm its
release.  This is the node in the active/active pair (i.e. a pair that
each serve half of the resources) that is currently running all of the
resources due to it's partner having failed.  Of course I don't want
this node's resources to have to be interrupted just because the failed
node has come back.

And it all does seem to work that way, FWIW.  I'm not sure why my
earlier experiments didn't bear that out.

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


[Pacemaker] is a single node cluster possible?

2011-08-31 Thread Brian J. Murrell
I have a need to create single node clusters with pacemaker.

Crazy you might say.  It does seem crazy at first but there are two
drivers for this:

The first is testing.  I want to write a single code path for
controlling the starting and stopping of resources in larger, real,
multi-node clusters controlled with pacemaker.

The second, more practical reason is that we want to be able to deploy
machines that are scalable from 1 upwards, without having to completely
reconfigure the way services are started and stopped when the deployment
grows from a single node to multiple nodes for the same set of
services.  Again, having a single code path for all of this is desirable.

I noted this thread back in 2008:
http://oss.clusterlabs.org/pipermail/pacemaker/2008-August/000281.html

At that time it was theoretically possible with a "null" comm plugin
which would be used instead of talking to HB or openais.

Has anything changed on this front?

Has anyone used pacemaker in this configuration who might want to share
their recipe(s) with me?

Thanks and cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


[Pacemaker] property default-resource-stickiness vs. rsc_defaults resource-stickiness

2011-08-25 Thread Brian J. Murrell
I've seen both setting a default-resource-stickiness property (i.e.
http://www.howtoforge.com/installation-and-setup-guide-for-drbd-openais-pacemaker-xen-on-opensuse-11.1)
and setting a rsc_defaults option with resource-stickiness
(http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch05s03s02.html)
offered as solutions for preventing auto-failback.

I wonder what the difference between the two is and which one is
considered better/more correct.

Cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


[Pacemaker] resource stickiness and preventing stonith on failback

2011-08-23 Thread Brian J. Murrell
Hi All,

I am trying to configure pacemaker (1.0.10) to make a single filesystem
highly available by two nodes (please don't be distracted by the dangers
of multiply mounted filesystems and clustering filesystems, etc., as I
am absolutely clear about that -- consider that I am using a filesystem
resource as just an example if you wish).  Here is my filesystem
resource description:

node foo1
node foo2 \
attributes standby="off"
primitive OST1 ocf:heartbeat:Filesystem \
meta target-role="Started" \
operations $id="BAR1-operations" \
op monitor interval="120" timeout="60" \
op start interval="0" timeout="300" \
op stop interval="0" timeout="300" \
params device="/dev/disk/by-uuid/8c500092-5de6-43d7-b59a-ef91fa9667b9"
directory="/mnt/bar1" fstype="ext3"
primitive st-pm stonith:external/powerman \
params serverhost="192.168.122.1:10101" poweroff="0"
clone fencing st-pm
property $id="cib-bootstrap-options" \
dc-version="1.0.10-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
cluster-infrastructure="openais" \
expected-quorum-votes="1" \
no-quorum-policy="ignore" \
last-lrm-refresh="1306783242" \
default-resource-stickiness="1000"
rsc_defaults $id="rsc-options" \
resource-stickiness="100"

The two problems I have run into are:

1. preventing the resource from failing back to the node it was
   previously on after it has failed over and the previous node has
   been restored.  Basically what's documented at

http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch05s03s02.html

2. preventing the active node from being STONITHed when the resource
   is moved back to it's failed-and-restored node after a failover.
   IOW: BAR1 is available on foo1, which fails and the resource is moved
   to foo2.  foo1 returns and the resource is failed back to foo1, but
   in doing that foo2 is STONITHed.

For #1, as you can see, I tried setting the default resource stickiness
to 100.  That didn't seem to work.  When I stopped corosync on the
active node, the service failed over but it promptly failed back when I
started corosync again, contrary to the example on the referenced URL.

Subsequently I (think I) tried adding the specific resource stickiness
of 1000.  That didn't seem to help either.

As for #2, the issue with STONITHing foo2 when failing back to foo1 is
that foo1 and foo2 are an active/active pair of servers.  STONITHing
foo2 just to restore foo1's services puts foo2's services out of service,

I do want a node that is believed to be dead to be STONITHed before its
resource(s) are failed over though.

Any hints on what I am doing wrong?

Thanx and cheers,
b.



signature.asc
Description: OpenPGP digital signature
___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker