On 27/08/2013, at 8:54 PM, Thomas Schulte <[email protected]> wrote:
> Hi list,
>
> I'm experiencing a strange problem and I can't figure out what's wrong. I'm
> running openSUSE 12.3 on a 2-node-cluster with pacemaker-1.1.9-55.2.x86_64.
>
> Every 15 minutes the following messages are logged. If stonith is enabled, my
> hosts get fenced afterwards.
> The messages appeared on both nodes until I temporarily activated stonith.
> After some fencing, now only the second node "s00202" keeps logging these
> (both nodes online again):
>
> Aug 27 10:21:13 s00202 crmd[3390]: notice: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Aug 27 10:21:13 s00202 pengine[3389]: notice: unpack_config: On loss of CCM Quorum: Ignore
> Aug 27 10:21:13 s00202 pengine[3389]: error: filter_colocation_constraint: pri_fs_vmail and ms_drbd_vmail are both allocated but to different nodes: s00202 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]: error: filter_colocation_constraint: pri_fs_www and ms_drbd_www are both allocated but to different nodes: s00201 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]: error: filter_colocation_constraint: pri_fs_www and ms_drbd_www are both allocated but to different nodes: s00201 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]: error: filter_colocation_constraint: pri_fs_mysql and ms_drbd_mysql are both allocated but to different nodes: s00201 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]: error: filter_colocation_constraint: pri_fs_redis and ms_drbd_redis are both allocated but to different nodes: s00201 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]: error: filter_colocation_constraint: pri_fs_bind and ms_drbd_bind are both allocated but to different nodes: s00202 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]: error: filter_colocation_constraint: pri_fs_squid and ms_drbd_squid are both allocated but to different nodes: s00202 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]: error: filter_colocation_constraint: pri_fs_www and ms_drbd_www are both allocated but to different nodes: s00201 vs. n/a
> Aug 27 10:21:13 s00202 pengine[3389]: last message repeated 3 times
> Aug 27 10:21:13 s00202 crmd[3390]: notice: do_te_invoke: Processing graph 61 (ref=pe_calc-dc-1377591673-1345) derived from /var/lib/pacemaker/pengine/pe-input-53.bz2
> Aug 27 10:21:13 s00202 crmd[3390]: notice: run_graph: Transition 61 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-53.bz2): Complete
> Aug 27 10:21:13 s00202 crmd[3390]: notice: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
> Aug 27 10:21:13 s00202 pengine[3389]: notice: process_pe_message: Calculated Transition 61: /var/lib/pacemaker/pengine/pe-input-53.bz2

Could you attach this file please? I'll be able to see if the current version behaves any better.

> I believe that the errors about filter_colocation_constraint all have the same
> cause, so I'm going to reduce the problem to the following:
>
> Aug 27 10:21:13 s00202 pengine[3389]: error: filter_colocation_constraint: pri_fs_vmail and ms_drbd_vmail are both allocated but to different nodes: s00202 vs. n/a
>
>
> This is my configuration (relevant extract):
>
> # crm configure show
>
> node s00201
> node s00202
> primitive pri_drbd_vmail ocf:linbit:drbd \
>     operations $id="pri_drbd_vmail-operations" \
>     op monitor interval="20" role="Slave" timeout="20" \
>     op monitor interval="10" role="Master" timeout="20" \
>     params drbd_resource="vmail"
> primitive pri_fs_vmail ocf:heartbeat:Filesystem \
>     params device="/dev/drbd5" fstype="ext4" directory="/var/vmail" \
>     op monitor interval="30"
> primitive pri_ip_vmail ocf:heartbeat:IPaddr2 \
>     operations $id="pri_ip_vmail-operations" \
>     op monitor interval="10s" timeout="20s" \
>     params ip="10.0.1.105" nic="br0"
> primitive pri_svc_postfix ocf:heartbeat:postfix \
>     operations $id="pri_svc_postfix-operations" \
>     op monitor interval="60s" timeout="20s" \
>     params config_dir="/etc/postfix_vmail"
> group grp_vmail pri_fs_vmail pri_ip_vmail pri_svc_postfix \
>     meta target-role="Started"
> ms ms_drbd_vmail pri_drbd_vmail \
>     meta notify="true" target-role="Started" master-max="1" is-managed="true"
> colocation col_grp_vmail_ON_drbd_vmail inf: grp_vmail:Started ms_drbd_vmail:Master
> order ord_ms_drbd_vmail_BEFORE_grp_vmail inf: ms_drbd_vmail:promote grp_vmail:start
> property $id="cib-bootstrap-options" \
>     no-quorum-policy="ignore" \
>     placement-strategy="balanced" \
>     dc-version="1.1.9-2db99f1" \
>     cluster-infrastructure="classic openais (with plugin)" \
>     expected-quorum-votes="2" \
>     last-lrm-refresh="1377585341" \
>     stonith-enabled="false"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="100" \
>     migration-threshold="3"
> op_defaults $id="op-options" \
>     timeout="600" \
>     record-pending="false"
>
>
> # crm_resource -L
>
> Master/Slave Set: ms_drbd_vmail [pri_drbd_vmail]
>     Masters: [ s00202 ]
>     Slaves: [ s00201 ]
> Resource Group: grp_vmail
>     pri_fs_vmail (ocf::heartbeat:Filesystem): Started
>     pri_ip_vmail (ocf::heartbeat:IPaddr2): Started
>     pri_svc_postfix (ocf::heartbeat:postfix): Started
>
>
> # ptest -Ls | egrep "pri_fs_vmail|ms_drbd_vmail|grp_vmail"
>
> group_color: grp_vmail allocation score on s00201: 0
> group_color: grp_vmail allocation score on s00202: 0
> group_color: pri_fs_vmail allocation score on s00201: 0
> group_color: pri_fs_vmail allocation score on s00202: 100
> clone_color: ms_drbd_vmail allocation score on s00201: 0
> clone_color: ms_drbd_vmail allocation score on s00202: 600
> native_color: pri_fs_vmail allocation score on s00201: -INFINITY
> native_color: pri_fs_vmail allocation score on s00202: 10700
>
>
> Besides these error messages, both hosts are working fine (stonith disabled).
> Starting/stopping or migrating resources is possible without problems, too.
>
> I have no idea what's wrong and Google didn't help me in this case.
> Maybe someone here is able to help me out?
>
>
> Thank you.
>
> Regards,
> Thomas
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
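In case it's useful while you dig out that file: a saved transition input like pe-input-53.bz2 can be replayed offline with crm_simulate, which should reproduce the filter_colocation_constraint errors and the same allocation scores your ptest run shows. A sketch, assuming the file has been copied somewhere readable (the path below is the one from your log):

```
# Replay the saved policy-engine input instead of the live CIB:
#   -x  read the CIB/transition input from a file
#   -S  simulate the resulting transition
#   -s  print allocation scores (equivalent of ptest -Ls)
crm_simulate -x /var/lib/pacemaker/pengine/pe-input-53.bz2 -S -s
```

Running that against the same file on a newer Pacemaker build is also a quick way to check whether the errors are already fixed there.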
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
