On Fri, Jul 16, 2010 at 5:44 PM, Guillaume Chanaud <guillaume.chan...@connecting-nature.com> wrote:
> Hello,
[snip]
>         # Optionally assign a fixed node id (integer)
>         nodeid: 30283707487

In addition to changing the node's IP address, did you change this too?

>         interface {
>                 ringnumber: 0
>                 bindnetaddr: 192.168.0.60
>                 mcastport: 5406
>                 mcastaddr: 225.0.0.1
>         }
> }
> logging {
>         fileline: on
>         to_stderr: yes
>         to_logfile: yes
>         to_syslog: yes
>         logfile: /var/log/corosync.log
>         debug: off
>         timestamp: on
>         logger_subsys {
>                 subsys: AMF
>                 debug: off
>         }
> }
> amf {
>         mode: disabled
> }
>
> The "bindnetaddr" changes for every node, and the "nodeid" too (I tried to
> put a fixed value to see if it was not a matter of auto-generated node id
> value, but it didn't change anything).
> After some time, on the offline node, the only two processes running are:
> /usr/lib64/heartbeat/lrmd
> /usr/lib64/heartbeat/pengine

Looks like corosync crashed. Possibly related to the duplicate nodeid?
Was there a core file in /var/run/corosync? What does "ulimit -c" say?

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
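
One way to check the duplicate-nodeid question above is to compare the "nodeid" value in each node's corosync.conf. The following is only a minimal sketch in Python, assuming the config text from every node has been gathered in one place. The function names and the host names in the sample data are hypothetical and not part of corosync or Pacemaker.

```python
import re
from collections import defaultdict

# Matches an uncommented "nodeid: <integer>" line in corosync.conf text.
NODEID_RE = re.compile(r"^[ \t]*nodeid:[ \t]*(\d+)[ \t]*$", re.MULTILINE)


def extract_nodeid(conf_text):
    """Return the first uncommented nodeid in the config text, or None."""
    match = NODEID_RE.search(conf_text)
    return int(match.group(1)) if match else None


def find_duplicate_nodeids(configs):
    """Return {nodeid: [hosts]} for every nodeid used by more than one host.

    `configs` maps a host name to the text of that host's corosync.conf.
    Hosts without an explicit nodeid are skipped.
    """
    hosts_by_id = defaultdict(list)
    for host, text in configs.items():
        nodeid = extract_nodeid(text)
        if nodeid is not None:
            hosts_by_id[nodeid].append(host)
    return {
        nodeid: sorted(hosts)
        for nodeid, hosts in hosts_by_id.items()
        if len(hosts) > 1
    }


if __name__ == "__main__":
    # Hypothetical sample data: two hosts were left with the same nodeid.
    sample = {
        "node-a": "totem {\n\tnodeid: 1\n}\n",
        "node-b": "totem {\n\tnodeid: 1\n}\n",
        "node-c": "totem {\n\tnodeid: 2\n}\n",
    }
    for nodeid, hosts in find_duplicate_nodeids(sample).items():
        print(f"nodeid {nodeid} is shared by: {', '.join(hosts)}")
```

An empty result only means that no two of the supplied configs set the same explicit nodeid. It says nothing about auto-generated ids or about why corosync exited, so the core file and "ulimit -c" questions still apply.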