According to https://access.redhat.com/solutions/638843, an interface that is defined in corosync.conf must be present in the system (see the "ROOT CAUSE" section at the bottom of the article). To confirm that, I ran a couple of tests.
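The requirement above (every address named in corosync.conf must exist on the node) can be checked before starting Corosync. The following is only a minimal sketch, not something taken from the Red Hat article or from Corosync itself: it assumes the ring addresses are IPv4 literals written as "ringX_addr: <address>" lines, and it treats an address as present if a UDP socket can be bound to it (this assumes net.ipv4.ip_nonlocal_bind is disabled). The function names and the sample addresses are hypothetical.

```python
import re
import socket

# Matches lines such as "ring0_addr: 10.0.0.1" (assumed corosync.conf layout).
RING_ADDR = re.compile(r"^\s*(ring\d+_addr)\s*:\s*(\S+)", re.MULTILINE)


def ring_addresses(conf_text):
    """Return (key, address) pairs for every ringX_addr line in the text."""
    return RING_ADDR.findall(conf_text)


def is_local_address(addr):
    """Return True if a UDP socket can be bound to addr on this host."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((addr, 0))  # port 0: let the kernel pick any free port
        return True
    except OSError:
        # Typically EADDRNOTAVAIL when the address is not assigned locally.
        return False
    finally:
        sock.close()


def rings_without_local_address(conf_text):
    """Return the ring keys for which no listed address is present locally."""
    found = {}
    for key, addr in ring_addresses(conf_text):
        found[key] = found.get(key, False) or is_local_address(addr)
    return sorted(key for key, ok in found.items() if not ok)


# Hypothetical sample: 192.0.2.10 is a documentation-only address.
SAMPLE = """
nodelist {
    node {
        ring0_addr: 127.0.0.1
        ring1_addr: 192.0.2.10
    }
}
"""

if __name__ == "__main__":
    print(rings_without_local_address(SAMPLE))
```

On a real node one would pass the contents of the actual corosync.conf instead of SAMPLE; any ring reported by rings_without_local_address would correspond to the "IP is not defined in the system" situation tested below.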
Here is the relevant part of the corosync.conf file (in free form; the original config file is also attached):
===============================
rrp_mode: passive
ring0_addr is defined in corosync.conf
ring1_addr is defined in corosync.conf
===============================

-------------------------------
Two-node cluster
-------------------------------

Test #1:
--------------------------------------------------
IP for ring0 is not defined in the system:
--------------------------------------------------
Start Corosync simultaneously on both nodes.
Corosync fails to start.
From the logs:
Jan 08 09:43:56 [2992] A6-402-2 corosync error [MAIN ] parse error in config: No interfaces defined
Jan 08 09:43:56 [2992] A6-402-2 corosync error [MAIN ] Corosync Cluster Engine exiting with status 8 at main.c:1343.
Result: Corosync and Pacemaker are not running.

Test #2:
--------------------------------------------------
IP for ring1 is not defined in the system:
--------------------------------------------------
Start Corosync simultaneously on both nodes.
Corosync starts.
Start Pacemaker simultaneously on both nodes.
Pacemaker fails to start.
From the logs, the last entries written by corosync:
Jan 8 16:31:29 daemon.err<27> corosync[3728]: [TOTEM ] Marking ringid 0 interface 169.254.1.3 FAULTY
Jan 8 16:31:30 daemon.notice<29> corosync[3728]: [TOTEM ] Automatically recovered ring 0
Result: Corosync and Pacemaker are not running.

Test #3:
"rrp_mode: active" leads to the same result, except that the Corosync and Pacemaker init scripts return the status "running".
Still, /var/log/cluster/corosync.log shows a lot of errors like:
Jan 08 16:30:47 [4067] A6-402-1 cib: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)
Result: Corosync and Pacemaker report their status as "running", but "crm_mon" cannot connect to the cluster database, and half of Pacemaker's services are not running (including the Cluster Information Base (CIB)).
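The three tests can be told apart by the error lines they leave in the logs. As a rough sketch only, the snippet below counts those signatures in a list of log lines. The patterns are copied from the messages quoted in the tests; the labels (no_interfaces, ring_faulty, cpg_failed) and the function names are made up for illustration and are not part of Corosync or Pacemaker.

```python
import re

# Signatures taken from the log excerpts in the tests; labels are hypothetical.
SIGNATURES = [
    ("no_interfaces", re.compile(r"parse error in config: No interfaces defined")),
    ("ring_faulty", re.compile(r"Marking ringid \d+ interface \S+ FAULTY")),
    ("cpg_failed", re.compile(r"Connection to the CPG API failed")),
]


def classify(line):
    """Return the label of the first signature found in the line, or None."""
    for name, pattern in SIGNATURES:
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines match each known signature."""
    counts = {}
    for line in lines:
        name = classify(line)
        if name is not None:
            counts[name] = counts.get(name, 0) + 1
    return counts


# Log lines as quoted in the tests.
SAMPLE_LOG = [
    "Jan 08 09:43:56 [2992] A6-402-2 corosync error [MAIN ] parse error in config: No interfaces defined",
    "Jan 08 09:43:56 [2992] A6-402-2 corosync error [MAIN ] Corosync Cluster Engine exiting with status 8 at main.c:1343.",
    "Jan 8 16:31:29 daemon.err<27> corosync[3728]: [TOTEM ] Marking ringid 0 interface 169.254.1.3 FAULTY",
    "Jan 8 16:31:30 daemon.notice<29> corosync[3728]: [TOTEM ] Automatically recovered ring 0",
    "Jan 08 16:30:47 [4067] A6-402-1 cib: error: pcmk_cpg_dispatch: Connection to the CPG API failed: Library error (2)",
]

if __name__ == "__main__":
    for name, count in sorted(summarize(SAMPLE_LOG).items()):
        print(name, count)
```

Feeding it the lines of /var/log/cluster/corosync.log would show which of the failure modes occurred on a node, even when the init scripts report "running" as in Test #3.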
-------------------------------
Single-node mode
-------------------------------
IP for ring0 is not defined in the system:
Corosync fails to start.

IP for ring1 is not defined in the system:
Corosync and Pacemaker start.
Several outcomes are possible: the configuration may be applied successfully (about 50% of the time); the cluster may not run any resources; the node may refuse to go into standby mode (it shows: communication error); or the cluster may run all resources while the applied configuration is not guaranteed to be fully loaded (some rules can be missing).

-------------------------------
Conclusions:
-------------------------------
It is possible that in some rare cases (see the comments on the bug) the cluster will work, but in that case its working state is unstable and the cluster can stop working at any moment.

So, is this correct? Do my assumptions make sense? I couldn't find any other explanation on the net.

Thank you,
Kostya

On Fri, Jan 9, 2015 at 11:10 AM, Kostiantyn Ponomarenko <konstantin.ponomare...@gmail.com> wrote:

> Hi guys,
>
> Corosync fails to start if there is no such network interface configured
> in the system.
> Even with "rrp_mode: passive" the problem is the same when at least one
> network interface is not configured in the system.
>
> Is this the expected behavior?
> I thought that when you use redundant rings, it is enough to have at least
> one NIC configured in the system. Am I wrong?
>
> Thank you,
> Kostya
>
corosync.conf
Description: Binary data
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org