hi,

after adding a second ring to corosync.conf the problem seems to be gone.
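for reference, the totem section i ended up with looks roughly like this
(just a sketch from my setup -- the bindnetaddr/mcast values are my own
subnets and rrp_mode "passive" is simply the mode i picked, so adjust
everything to your own network):

totem {
        version: 2
        secauth: off
        rrp_mode: passive
        interface {
                ringnumber: 0
                bindnetaddr: 192.168.20.0
                mcastaddr: 239.255.1.1
                mcastport: 5405
        }
        interface {
                ringnumber: 1
                bindnetaddr: 10.10.10.0
                mcastaddr: 239.255.2.1
                mcastport: 5405
        }
}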
after killing corosync the node is fenced by the other node, and after
the reboot the cluster is fully operational.

is it essential to have at least 2 rings? maybe there is a network
timing problem (but i can't see any error messages). the interface on
ring 0 (192.168.20.171) is a bridge, the interface on ring 1
(10.10.10.171) is a normal ethernet interface.

regards
ulrich

[root@pcmk1 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID -1424709440
RING ID 0
        id      = 192.168.20.171
        status  = ring 0 active with no faults
RING ID 1
        id      = 10.10.10.171
        status  = ring 1 active with no faults

On Tue, 2012-07-17 at 15:24 +0200, Ulrich Leodolter wrote:
> hi,
>
> i have set up a very basic 2-node cluster on RHEL 6.3.
> the first thing i tried was to set up a stonith/fence_ipmilan
> resource.
>
> fencing seems to work: if i kill corosync on one node,
> it is restarted (ipmi reboot) by the other node.
>
> but after the restart the cluster doesn't come back to normal
> operation; it looks like pacemakerd hangs and the
> node status is offline.
>
> i found only one way to fix the problem:
>
> killall -9 pacemakerd
> service pacemakerd start
>
> after that both nodes are online. below you can see my
> cluster configuration and the corosync.log messages which
> repeat forever while pacemakerd hangs.
>
> i am new to pacemaker and followed the "Clusters from Scratch"
> guide for the first setup. information about fence_ipmilan
> is from google :-)
>
> can you give me some tips? what is wrong with this basic cluster
> config? i don't want to add more resources (kvm virtual
> machines) until fencing is configured correctly.
>
> thx
> ulrich
>
>
> [root@pcmk1 ~]# crm configure show
> node pcmk1 \
>         attributes standby="off"
> node pcmk2 \
>         attributes standby="off"
> primitive p_stonith_pcmk1 stonith:fence_ipmilan \
>         params auth="password" ipaddr="192.168.120.171" passwd="xxx" lanplus="true" login="pcmk" timeout="20s" power_wait="5s" verbose="true" pcmk_host_check="static-list" pcmk_host_list="pcmk1" \
>         meta target-role="started"
> primitive p_stonith_pcmk2 stonith:fence_ipmilan \
>         params auth="password" ipaddr="192.168.120.172" passwd="xxx" lanplus="true" login="pcmk" timeout="20s" power_wait="5s" verbose="true" pcmk_host_check="static-list" pcmk_host_list="pcmk2" \
>         meta target-role="started"
> location loc_p_stonith_pcmk1_pcmk1 p_stonith_pcmk1 -inf: pcmk1
> location loc_p_stonith_pcmk2_pcmk2 p_stonith_pcmk2 -inf: pcmk2
> property $id="cib-bootstrap-options" \
>         expected-quorum-votes="2" \
>         dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
>         no-quorum-policy="ignore" \
>         cluster-infrastructure="openais"
> rsc_defaults $id="rsc-options" \
>         resource-stickiness="200"
>
>
> /var/log/cluster/corosync.log:
>
> Jul 13 11:29:41 [1859] pcmk2 crmd: info: do_dc_release: DC role released
> Jul 13 11:29:41 [1859] pcmk2 crmd: info: do_te_control: Transitioner is now inactive
> Jul 13 11:29:41 [1854] pcmk2 cib: info: set_crm_log_level: New log level: 3 0
> Jul 13 11:30:01 [1859] pcmk2 crmd: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped (20000ms)
> Jul 13 11:30:01 [1859] pcmk2 crmd: warning: do_log: FSA: Input I_DC_TIMEOUT from crm_timer_popped() received in state S_PENDING
> Jul 13 11:30:01 [1859] pcmk2 crmd: notice: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]
> Jul 13 11:30:01 [1859] pcmk2 crmd: info: do_election_count_vote: Election 8 (owner: pcmk1) lost: vote from pcmk1 (Uptime)
> Jul 13 11:30:01 [1859] pcmk2 crmd: notice: do_state_transition: State transition S_ELECTION -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_election_count_vote ]