07.09.2012 18:28, John White wrote:
> An odd update to this. We run in a stateless environment (nodes are
> pxe booted and have NFS roots, etc). Trying the same install on a VM
> works just fine. I wonder if anyone has experience with pacemaker and
> stateless nodes.

I run it with an ISO image loaded from a PXE server into RAM. State data
and cluster-wide configuration are on CIFS. Volatile read-write data is on
tmpfs.

You probably have some trouble with the communication paths used for
interconnection. Try mounting /var/run as tmpfs. Or where is that socket
on Linux?

    memset (&address, 0, sizeof (struct sockaddr_un));
    address.sun_family = AF_UNIX;
    #if defined(COROSYNC_LINUX)
    sprintf (address.sun_path + 1, "%s", socket_name);
    #else
    sprintf (address.sun_path, "%s/%s", SOCKETDIR, socket_name);
    #endif

>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info: init_ais_connection_classic: Connection to our AIS plugin (10) failed: Library error (2)

It is ENOENT (2) /* No such file or directory */

Could you provide the content of /proc/mounts?

Vladislav

> ----------------
> John White
> HPC Systems Engineer
> (510) 486-7307
> One Cyclotron Rd, MS: 50C-3209C
> Lawrence Berkeley National Lab
> Berkeley, CA 94720
>
> On Sep 6, 2012, at 2:49 PM, John White <jwh...@lbl.gov> wrote:
>
>> Hello Folks,
>> I'm having a very hard time getting a basic pacemaker setup going.
>> I've gotten corosync up and running just fine from what i can tell, but once
>> I start with pacemaker commands, I get CIB errors everywhere:
>>
>> -bash-4.1# crm configure
>> Signon to CIB failed: connection failed
>> Init failed, could not perform requested operations
>> ERROR: cannot parse xml: no element found: line 1, column 0
>> crm(live)configure#
>>
>> Digging deeper, I see both attrd and cib failing to connect to the AIS
>> plugin:
>>
>> Sep 06 14:42:52 n0014.lustre attrd: [13225]: notice: crm_cluster_connect:
>> Connecting to cluster infrastructure: classic openais (with plugin)
>> Sep 06 14:42:52 n0014.lustre attrd: [13225]: ERROR: main: HA Signon failed
>> Sep 06 14:42:52 n0014.lustre attrd: [13225]: ERROR: main: Aborting startup
>> -snip-
>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info: get_cluster_type: Cluster
>> type is: 'openais'
>> Sep 06 14:42:52 n0014.lustre cib: [13223]: notice: crm_cluster_connect:
>> Connecting to cluster infrastructure: classic openais (with plugin)
>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info:
>> init_ais_connection_classic: Creating connection to our Corosync plugin
>> Sep 06 14:42:52 n0014.lustre cib: [13223]: info:
>> init_ais_connection_classic: Connection to our AIS plugin (10) failed:
>> Library error (2)
>> Sep 06 14:42:52 n0014.lustre cib: [13223]: CRIT: cib_init: Cannot sign in to
>> the cluster… terminating
>>
>>
>> I'm really at a loss here after 3 days, any ideas or hints as to where I
>> might find a solution? More logging available upon request.
>>
>> ----------------
>> John White
>> HPC Systems Engineer
>> (510) 486-7307
>> One Cyclotron Rd, MS: 50C-3209C
>> Lawrence Berkeley National Lab
>> Berkeley, CA 94720
>>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
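
A note on the two checks raised in the reply above (what /var/run is mounted on, and where the socket lives). The quoted snippet zeroes the address and, under COROSYNC_LINUX, writes the name at sun_path + 1, so the first byte of sun_path stays NUL. On Linux that puts the socket in the abstract namespace, which has no file on disk and shows up in /proc/net/unix with a leading '@'. Below is a minimal Python sketch, not part of corosync or pacemaker, for inspecting both things on a node. The function names fs_type_for and abstract_socket_names are made up for illustration, and the parsing assumes the usual whitespace-separated layout of /proc/mounts and /proc/net/unix.

```python
import os


def fs_type_for(path, mounts_text):
    """Return (mount_point, fs_type) for the mount that holds `path`.

    `mounts_text` is the content of /proc/mounts: one mount per line with
    whitespace-separated fields (device, mount point, type, options, ...).
    The longest matching mount point wins; on a tie the later line wins,
    because a later mount shadows an earlier one on the same directory.
    Returns None if nothing matches.
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fs_type)
    return best


def abstract_socket_names(proc_net_unix_text):
    """List abstract-namespace socket names from /proc/net/unix content.

    The Path column is the 8th field; unnamed sockets have no Path at all,
    and abstract names are printed with a leading '@', stripped here.
    """
    names = []
    for line in proc_net_unix_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 8 and fields[7].startswith("@"):
            names.append(fields[7][1:])
    return names


if __name__ == "__main__":
    # Only meaningful on a Linux node; skipped where /proc is absent.
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as f:
            print("/var/run is on:", fs_type_for("/var/run", f.read()))
    if os.path.exists("/proc/net/unix"):
        with open("/proc/net/unix") as f:
            print("abstract sockets:", abstract_socket_names(f.read()))
```

If /var/run turns out to sit on the read-only NFS root, that supports the tmpfs suggestion. If the socket is listed as an abstract name while corosync is running, the filesystem layout is probably not what the clients are tripping over, and the ENOENT would have to come from somewhere else.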