Is "ttl 1" a good idea for a public network?

>>> Denis Gribkov <[email protected]> wrote on 21.02.2017 at 18:26 in message
<[email protected]>:
> Hi Everyone.
>
> I have 16-nodes asynchronous cluster configured with Corosync redundant
> ring feature.
>
> Each node has 2 similarly connected/configured NIC's. One NIC connected
> to the public network,
>
> another one to our private VLAN. When I checked Corosync rings
> operability I found:
>
> # corosync-cfgtool -s
> Printing ring status.
> Local node ID 1
> RING ID 0
>         id      = 192.168.1.54
>         status  = Marking ringid 0 interface 192.168.1.54 FAULTY
> RING ID 1
>         id      = 111.11.11.1
>         status  = ring 1 active with no faults
>
> After some time of digging into I identified that if I enable back the
> failed ring with command:
>
> # corosync-cfgtool -r
>
> RING ID 0 will be marked as "active" for few minutes, but after it
> marked permanently as faulty.
>
> Log has no any useful info, just single message:
>
> corosync[21740]: [TOTEM ] Marking ringid 0 interface 192.168.1.54 FAULTY
>
> And no any message like:
>
> [TOTEM ] Automatically recovered ring 1
>
>
> My corosync.conf looks like:
>
> compatibility: whitetank
>
> totem {
>     version: 2
>     secauth: on
>     threads: 4
>     rrp_mode: passive
>
>     interface {
>
>         member {
>             memberaddr: PRIVATE_IP_1
>         }
>
>         ...
>
>         member {
>             memberaddr: PRIVATE_IP_16
>         }
>
>         ringnumber: 0
>         bindnetaddr: PRIVATE_NET_ADDR
>         mcastaddr: 226.0.0.1
>         mcastport: 5505
>         ttl: 1
>     }
>
>     interface {
>
>         member {
>             memberaddr: PUBLIC_IP_1
>         }
>         ...
>
>         member {
>             memberaddr: PUBLIC_IP_16
>         }
>
>         ringnumber: 1
>         bindnetaddr: PUBLIC_NET_ADDR
>         mcastaddr: 224.0.0.1
>         mcastport: 5405
>         ttl: 1
>     }
>
>     transport: udpu
>
> logging {
>     to_stderr: no
>     to_logfile: yes
>     logfile: /var/log/cluster/corosync.log
>     logfile_priority: info
>     to_syslog: yes
>     syslog_priority: warning
>     debug: on
>     timestamp: on
> }
>
> I had tried to change rrp_mode, mcastaddr/mcastport for ringnumber: 0,
> but result was the similar.
>
> I checked multicast/unicast operability using omping utility and didn't
> found any issues.
>
> Also no errors on our private VLAN was found for network equipment.
>
> Why Corosync decided to disable permanently second ring? How I can debug
> the issue?
>
> Other properties:
>
> Corosync Cluster Engine, version '1.4.7'
>
> Pacemaker properties:
>   cluster-infrastructure: cman
>   cluster-recheck-interval: 5min
>   dc-version: 1.1.14-8.el6-70404b0
>   expected-quorum-votes: 3
>   have-watchdog: false
>   last-lrm-refresh: 1484068350
>   maintenance-mode: false
>   no-quorum-policy: ignore
>   pe-error-series-max: 1000
>   pe-input-series-max: 1000
>   pe-warn-series-max: 1000
>   stonith-action: reboot
>   stonith-enabled: false
>   symmetric-cluster: false
>
> Thank you.
>
> --
> Regards Denis Gribkov
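
On the "How I can debug the issue?" part: one option might be to poll the ring status and record exactly when ring 0 flips back to FAULTY after `corosync-cfgtool -r`, so the moment can be matched against switch and debug logs. Below is a minimal Python sketch, assuming the `corosync-cfgtool -s` output looks exactly like the sample pasted in the quoted message. The function names `parse_ring_status` and `faulty_rings` are made up for illustration and are not part of Corosync.

```python
import re

# Sample copied from the quoted message; real output may differ between
# Corosync versions, so treat the format as an assumption.
SAMPLE = """\
Printing ring status.
Local node ID 1
RING ID 0
        id      = 192.168.1.54
        status  = Marking ringid 0 interface 192.168.1.54 FAULTY
RING ID 1
        id      = 111.11.11.1
        status  = ring 1 active with no faults
"""


def parse_ring_status(text):
    """Return {ring_number: {"id": address, "status": message, "faulty": bool}}."""
    rings = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"RING ID (\d+)$", line)
        if match:
            # A "RING ID n" line starts a new ring section.
            current = int(match.group(1))
            rings[current] = {"id": None, "status": None, "faulty": False}
            continue
        if current is None or "=" not in line:
            # Skip the header lines that precede the first ring section.
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key in ("id", "status"):
            rings[current][key] = value
            if key == "status":
                rings[current]["faulty"] = "FAULTY" in value
    return rings


def faulty_rings(text):
    """Return the sorted list of ring numbers whose status mentions FAULTY."""
    return sorted(n for n, info in parse_ring_status(text).items() if info["faulty"])


if __name__ == "__main__":
    print(faulty_rings(SAMPLE))
```

To use it against a live node, the text could come from `subprocess.run(["corosync-cfgtool", "-s"], capture_output=True, text=True).stdout`, called in a loop with a timestamp written whenever the list of faulty rings changes.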
