On Mon, Nov 21, 2011 at 10:54 AM, Nick Khamis <[email protected]> wrote:
> Hello Andrew,
>
> Thank you so much for your response. I did manage to get an active/active
> cluster working using cman+pacemaker. Everything works fine except for the
> occasional error from fenced, and a kernel crash from ocfs2_controld.cman.
>
>>> Not true. SLES/openSUSE has supported cman-free clusters and cluster
>>> filesystems for many years.
>
> I believe I have most of the pieces needed to build a pcmk only active/active,
> (i.e., pcmk + corosync/openais, standard dlm_controld, and
> ocfs2_controld.pcmk).

Nope:

With pacemaker + cman: standard dlm_controld + (gfs|ocfs2)_controld
Without cman: dlm_controld.pcmk + (gfs|ocfs2)_controld.pcmk

>
> When attempting to start the cluster:
>
> aisexec
> /etc/init.d/pacemaker start
>
> root 1189 0.3 1.3 62980 3400 ? Ssl 17:56 0:06 corosync
> root 1205 0.0 0.6 13824 1668 pts/0 S 17:57 0:00 pacemakerd
> root 1209 0.0 0.9 11112 2484 ? Ss 17:57 0:00 \_ /usr/lib/heartbeat/stonithd
> 999 1210 0.0 1.5 12180 4020 ? Ss 17:57 0:00 \_ /usr/lib/heartbeat/cib
> root 1211 0.0 0.7 5444 1812 ? Ss 17:57 0:00 \_ /usr/lib/heartbeat/lrmd
> 999 1212 0.0 1.0 11444 2620 ? Ss 17:57 0:00 \_ /usr/lib/heartbeat/attrd
> 999 1213 0.0 0.8 7428 2120 ? Ss 17:57 0:00 \_ /usr/lib/heartbeat/pengine
> 999 1214 0.0 1.1 15612 2892 ? Ss 17:57 0:00 \_ /usr/lib/heartbeat/crmd
>
>
> # mount -t configfs none /sys/kernel/config
> # dlm_controld -D
> logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/dlm_controld.log
> dlm_controld 3.1.7 started
> cman_admin_init error 2
> /sys/kernel/config/dlm/cluster/comms: opendir failed: 2
> /sys/kernel/config/dlm/cluster/spaces: opendir failed: 2
>
> # ocfs2_controld.pcmk -D
>
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_clm' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_evt' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_ckpt' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_amf_v2' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_msg' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_lck' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_tmr' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: No additional configuration supplied for: service
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: No additional configuration supplied for: quorum
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: No default for option: provider
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_cluster_type: Detected an active 'corosync' cluster
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: init_ais_connection_once: Connection to 'corosync': established
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: crm_new_peer: Node astdrbd2 now has id: 6
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: crm_new_peer: Node 6 is now known as astdrbd2
> ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: crm_abort: send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais
> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: send_ais_text: Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: crm_abort: send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais
> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: send_ais_text: Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
> 1321832187 setup_stack@170: Cluster connection established. Local node id: 6
> 1321832187 setup_stack@174: Added Pacemaker as client 1 with fd -1
>
> It is the help of the LHA community that has enabled me to progress
> as much as I have, and I really do appreciate it. I wish I could disclose
> more information on why we don't just use the binaries included in a distro
> for a pcmk only active/active; however, all I am entitled to say is that it
> is a requirement to achieve this using source built from scratch.
>
> Kind Regards,
>
> Nick.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
