Yes, this is documented in various places, but I'll make sure it makes it into the man pages.

regards
-steve

On Mon, 2010-03-15 at 10:53 +0100, Colin wrote:
> Hi All,
>
> in a test that we started last week we have two Pacemaker+Corosync
> clusters, each with three hosts, where all six hosts are on the same
> network(s). The two clusters are identically configured, with one
> exception: the mcastport is 688 for one, and 689 for the other.
>
> This morning I found the clusters in a strange state, none of the
> hosts could see any of the others, i.e. Pacemaker output was "as if"
> Corosync wasn't running on the other nodes, although the network was
> fine, as I could easily verify with a ping etc.
>
> I then noticed in the lsof output that Corosync seems to also use the
> port below the configured mcastport, which leads me to my questions:
>
> Is this normal? It doesn't seem to be documented in
> http://corosync.org/doku.php?id=faq:configure_openais and
> corosync.conf(5).
> Is this overlap created by the additional port a likely cause for the
> cluster conking out?
>
> Thanks, Colin
>
>
> PS: I'm in the process of trying to revive the cluster;
> /etc/init.d/corosync stop didn't work, but a few "kill -9" and "rm -f
> /var/lib/heartbeat/crm/*" commands later I'm up-and-running again on
> 2x2 of the 2x3 nodes with the same config as previously, looking fine
> so far...
>
>
> r...@h001:~# dpkg -l | grep corosync
> ii  corosync      1.2.0-0ubuntu1  Standards-based cluster framework (daemon an
> ii  libcorosync4  1.2.0-0ubuntu1  Standards-based cluster framework (libraries
> r...@h001:~# cat /etc/corosync/corosync.conf
> totem {
>         version: 2
>         consensus: 1500
>         vsftype: none
>         clear_node_high_bit: yes
>         secauth: off
>         threads: 0
>         rrp_mode: passive
>         interface {
>                 ringnumber: 0
>                 bindnetaddr: 192.168.50.32
>                 broadcast: yes
>                 mcastport: 688   <=== 689 for the other cluster
>         }
>         interface {
>                 ringnumber: 1
>                 bindnetaddr: 192.168.52.32
>                 broadcast: yes
>                 mcastport: 688   <=== 689 for the other cluster
>         }
> }
> amf {
>         mode: disabled
> }
> service {
>         ver: 0
>         name: pacemaker
> }
> aisexec {
>         user: root
>         group: root
> }
> logging {
>         fileline: off
>         to_stderr: yes
>         to_logfile: no
>         to_syslog: yes
>         syslog_facility: daemon
>         debug: on
>         timestamp: on
>         logger_subsys {
>                 subsys: AMF
>                 debug: off
>                 tags: enter|leave|trace1|trace2|trace3|trace4|trace6
>         }
> }
> r...@h001:~# lsof -n | grep corosync | grep UDP
> corosync 17688 root  5u IPv4 89563 0t0 UDP 255.255.255.255:688
> corosync 17688 root  6u IPv4 89564 0t0 UDP 192.168.50.40:687
> corosync 17688 root  7u IPv4 89565 0t0 UDP 192.168.50.40:688
> corosync 17688 root  8u IPv4 89612 0t0 UDP 255.255.255.255:688
> corosync 17688 root  9u IPv4 89613 0t0 UDP 192.168.52.40:687
> corosync 17688 root 10u IPv4 89614 0t0 UDP 192.168.52.40:688
> r...@h001:~#
> _______________________________________________
> Openais mailing list
> [email protected]
> https://lists.linux-foundation.org/mailman/listinfo/openais
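
For anyone reading this thread later: the lsof output above shows Corosync bound to both the configured mcastport (688) and the port just below it (687), which is the behaviour confirmed in the reply. A rough Python sketch of what that means for two clusters sharing a network follows. The function names are made up for illustration and are not part of Corosync, and the check only flags an overlap in port usage; it does not prove that the overlap caused the outage described above.

```python
def corosync_udp_ports(mcastport):
    """Return the UDP ports one totem interface is expected to use.

    Assumption based on this thread: Corosync uses the configured
    mcastport and also the port directly below it (mcastport - 1).
    """
    return {mcastport, mcastport - 1}


def overlapping_ports(mcastport_a, mcastport_b):
    """Return the ports that two clusters on the same network would share."""
    return corosync_udp_ports(mcastport_a) & corosync_udp_ports(mcastport_b)


if __name__ == "__main__":
    # The configuration from this thread: 688 for one cluster, 689 for the other.
    print(sorted(overlapping_ports(688, 689)))  # [688]
    # Leaving a gap of at least one port between the two values avoids the overlap.
    print(sorted(overlapping_ports(688, 690)))  # []
```

With 688 and 689, both clusters end up using port 688, so choosing mcastport values that differ by at least two keeps the port sets disjoint.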
