Hi Andrew,
have you had a chance to look at the pe-input-59.bz2 file yet?
Thank you.
Regards,
Thomas
On 2013-08-28 06:25, Thomas Schulte wrote:
Hi Andrew,
thank you! The latest file is attached (pe-input-59.bz2).
Regards,
Thomas
On 28.08.2013 at 01:40, Andrew Beekhof <[email protected]> wrote:
On 27/08/2013, at 8:54 PM, Thomas Schulte <[email protected]> wrote:
Hi list,
I'm experiencing a strange problem and I can't figure out what's wrong.
I'm running openSUSE 12.3 on a 2-node cluster with
pacemaker-1.1.9-55.2.x86_64.
Every 15 minutes the following messages are logged, and if stonith is
enabled, my hosts get fenced afterwards.
The messages appeared on both nodes until I temporarily enabled
stonith. After some fencing, only the second node "s00202" still keeps
logging them (both nodes are online again):
Aug 27 10:21:13 s00202 crmd[3390]: notice: do_state_transition: State
transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC
cause=C_TIMER_POPPED origin=crm_timer_popped ]
Aug 27 10:21:13 s00202 pengine[3389]: notice: unpack_config: On loss
of CCM Quorum: Ignore
Aug 27 10:21:13 s00202 pengine[3389]: error:
filter_colocation_constraint: pri_fs_vmail and ms_drbd_vmail are both
allocated but to different nodes: s00202 vs. n/a
Aug 27 10:21:13 s00202 pengine[3389]: error:
filter_colocation_constraint: pri_fs_www and ms_drbd_www are both
allocated but to different nodes: s00201 vs. n/a
Aug 27 10:21:13 s00202 pengine[3389]: error:
filter_colocation_constraint: pri_fs_www and ms_drbd_www are both
allocated but to different nodes: s00201 vs. n/a
Aug 27 10:21:13 s00202 pengine[3389]: error:
filter_colocation_constraint: pri_fs_mysql and ms_drbd_mysql are both
allocated but to different nodes: s00201 vs. n/a
Aug 27 10:21:13 s00202 pengine[3389]: error:
filter_colocation_constraint: pri_fs_redis and ms_drbd_redis are both
allocated but to different nodes: s00201 vs. n/a
Aug 27 10:21:13 s00202 pengine[3389]: error:
filter_colocation_constraint: pri_fs_bind and ms_drbd_bind are both
allocated but to different nodes: s00202 vs. n/a
Aug 27 10:21:13 s00202 pengine[3389]: error:
filter_colocation_constraint: pri_fs_squid and ms_drbd_squid are both
allocated but to different nodes: s00202 vs. n/a
Aug 27 10:21:13 s00202 pengine[3389]: error:
filter_colocation_constraint: pri_fs_www and ms_drbd_www are both
allocated but to different nodes: s00201 vs. n/a
Aug 27 10:21:13 s00202 pengine[3389]: last message repeated 3 times
Aug 27 10:21:13 s00202 crmd[3390]: notice: do_te_invoke: Processing
graph 61 (ref=pe_calc-dc-1377591673-1345) derived from
/var/lib/pacemaker/pengine/pe-input-53.bz2
Aug 27 10:21:13 s00202 crmd[3390]: notice: run_graph: Transition 61
(Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0,
Source=/var/lib/pacemaker/pengine/pe-input-53.bz2): Complete
Aug 27 10:21:13 s00202 crmd[3390]: notice: do_state_transition: State
transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS
cause=C_FSA_INTERNAL origin=notify_crmd ]
Aug 27 10:21:13 s00202 pengine[3389]: notice: process_pe_message:
Calculated Transition 61: /var/lib/pacemaker/pengine/pe-input-53.bz2
Could you attach this file, please?
Then I'll be able to see whether the current version behaves any better.
I believe that the errors about filter_colocation_constraint all have
the same cause, so I'm going to reduce the problem to the following:
Aug 27 10:21:13 s00202 pengine[3389]: error:
filter_colocation_constraint: pri_fs_vmail and ms_drbd_vmail are both
allocated but to different nodes: s00202 vs. n/a
This is my configuration (relevant extract):
# crm configure show
node s00201
node s00202
primitive pri_drbd_vmail ocf:linbit:drbd \
operations $id="pri_drbd_vmail-operations" \
op monitor interval="20" role="Slave" timeout="20" \
op monitor interval="10" role="Master" timeout="20" \
params drbd_resource="vmail"
primitive pri_fs_vmail ocf:heartbeat:Filesystem \
params device="/dev/drbd5" fstype="ext4" directory="/var/vmail" \
op monitor interval="30"
primitive pri_ip_vmail ocf:heartbeat:IPaddr2 \
operations $id="pri_ip_vmail-operations" \
op monitor interval="10s" timeout="20s" \
params ip="10.0.1.105" nic="br0"
primitive pri_svc_postfix ocf:heartbeat:postfix \
operations $id="pri_svc_postfix-operations" \
op monitor interval="60s" timeout="20s" \
params config_dir="/etc/postfix_vmail"
group grp_vmail pri_fs_vmail pri_ip_vmail pri_svc_postfix \
meta target-role="Started"
ms ms_drbd_vmail pri_drbd_vmail \
meta notify="true" target-role="Started" master-max="1" is-managed="true"
colocation col_grp_vmail_ON_drbd_vmail inf: grp_vmail:Started ms_drbd_vmail:Master
order ord_ms_drbd_vmail_BEFORE_grp_vmail inf: ms_drbd_vmail:promote grp_vmail:start
property $id="cib-bootstrap-options" \
no-quorum-policy="ignore" \
placement-strategy="balanced" \
dc-version="1.1.9-2db99f1" \
cluster-infrastructure="classic openais (with plugin)" \
expected-quorum-votes="2" \
last-lrm-refresh="1377585341" \
stonith-enabled="false"
rsc_defaults $id="rsc-options" \
resource-stickiness="100" \
migration-threshold="3"
op_defaults $id="op-options" \
timeout="600" \
record-pending="false"
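For reference, the policy-engine checks can also be run directly against the
live CIB; a quick sketch (crm_verify ships with pacemaker, -L checks the
running cluster, -V adds verbosity):
# crm_verify -L -V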
# crm_resource -L
Master/Slave Set: ms_drbd_vmail [pri_drbd_vmail]
    Masters: [ s00202 ]
    Slaves: [ s00201 ]
Resource Group: grp_vmail
    pri_fs_vmail     (ocf::heartbeat:Filesystem):    Started
    pri_ip_vmail     (ocf::heartbeat:IPaddr2):       Started
    pri_svc_postfix  (ocf::heartbeat:postfix):       Started
# ptest -Ls | egrep "pri_fs_vmail|ms_drbd_vmail|grp_vmail"
group_color: grp_vmail allocation score on s00201: 0
group_color: grp_vmail allocation score on s00202: 0
group_color: pri_fs_vmail allocation score on s00201: 0
group_color: pri_fs_vmail allocation score on s00202: 100
clone_color: ms_drbd_vmail allocation score on s00201: 0
clone_color: ms_drbd_vmail allocation score on s00202: 600
native_color: pri_fs_vmail allocation score on s00201: -INFINITY
native_color: pri_fs_vmail allocation score on s00202: 10700
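The scores above can also be replayed offline from one of the saved PE
inputs; a sketch (crm_simulate is the newer name for ptest; -S simulates the
transition, -s shows the allocation scores, and the path is the file named in
the log above):
# crm_simulate -S -s -x /var/lib/pacemaker/pengine/pe-input-53.bz2 | egrep "pri_fs_vmail|ms_drbd_vmail|grp_vmail"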
Apart from these error messages, both hosts are working fine (with stonith
disabled). Starting, stopping, and migrating resources also works without
problems; a typical check is shown below.
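Using the crmsh shell that openSUSE ships (the target node here is just an
example):
# crm resource migrate grp_vmail s00201
# crm resource unmigrate grp_vmail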
I have no idea what's wrong, and Google didn't help me in this case.
Maybe someone here is able to help me out?
Thank you.
Regards,
Thomas
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems