On Mon, Nov 21, 2011 at 10:54 AM, Nick Khamis <[email protected]> wrote:
> Hello Andrew,
>
> Thank you so much for your response. I did manage to get an active/active
> cluster working using cman+pacemaker. Everything works fine except for the
> occasional error from fenced, and a kernel crash from ocfs2_controld.cman.
>
>>> Not true. SLES/openSUSE has supported cman-free clusters and cluster
>>> filesystems for many years.
>
> I believe I have most of the pieces needed to build a pcmk-only
> active/active cluster (i.e., pcmk + corosync/openais, standard
> dlm_controld, and ocfs2_controld.pcmk).

Nope, the standard dlm_controld goes with cman. The pairings are:

With pacemaker + cman: standard dlm_controld + (gfs|ocfs2)_controld
Without cman:  dlm_controld.pcmk + (gfs|ocfs2)_controld.pcmk
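
To make the pairing concrete, here is a rough Python sketch of the rule
above. The function names (controld_daemons, mismatched) are made up for
illustration, and the daemon names simply follow the two lines above as
written; the actual binary names on a given build may differ (the quoted
message mentions ocfs2_controld.cman, for instance).

```python
# Sketch only: encodes the two pairings listed above, nothing more.
# Function names are hypothetical, not part of pacemaker or cman.

def controld_daemons(uses_cman: bool, filesystem: str) -> list[str]:
    """Return the control daemons to run for the given stack.

    `filesystem` is "gfs" or "ocfs2".
    """
    if filesystem not in ("gfs", "ocfs2"):
        raise ValueError(f"unsupported filesystem: {filesystem!r}")
    # With cman: the standard daemons. Without cman: the .pcmk variants.
    suffix = "" if uses_cman else ".pcmk"
    return [f"dlm_controld{suffix}", f"{filesystem}_controld{suffix}"]


def mismatched(uses_cman: bool, daemons: list[str]) -> list[str]:
    """List the daemons whose variant does not fit the stack."""
    # A .pcmk daemon is wrong with cman; a non-.pcmk daemon is wrong without.
    return [d for d in daemons if d.endswith(".pcmk") == uses_cman]


if __name__ == "__main__":
    # The combination from the quoted message: no cman, standard dlm_controld.
    print(mismatched(False, ["dlm_controld", "ocfs2_controld.pcmk"]))
```

Run against the combination in the quoted message, the check flags only
dlm_controld, which is the piece that has to be swapped for
dlm_controld.pcmk.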

>
> When attempting to start the cluster:
>
> aisexec
> /etc/init.d/pacemaker start
>
> root      1189  0.3  1.3  62980  3400 ?        Ssl  17:56   0:06 corosync
> root      1205  0.0  0.6  13824  1668 pts/0    S    17:57   0:00 pacemakerd
> root      1209  0.0  0.9  11112  2484 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/stonithd
> 999       1210  0.0  1.5  12180  4020 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/cib
> root      1211  0.0  0.7   5444  1812 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/lrmd
> 999       1212  0.0  1.0  11444  2620 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/attrd
> 999       1213  0.0  0.8   7428  2120 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/pengine
> 999       1214  0.0  1.1  15612  2892 ?        Ss   17:57   0:00  \_ /usr/lib/heartbeat/crmd
>
>
> # mount -t configfs none /sys/kernel/config
> # dlm_controld -D
> logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/dlm_controld.log
> dlm_controld 3.1.7 started
> cman_admin_init error 2
> /sys/kernel/config/dlm/cluster/comms: opendir failed: 2
> /sys/kernel/config/dlm/cluster/spaces: opendir failed: 2
>
> # ocfs2_controld.pcmk -D
>
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_clm' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_evt' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_ckpt' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_amf_v2' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_msg' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_lck' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: Processing additional service options...
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: Found 'openais_tmr' for option: name
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: No additional configuration supplied for: service
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: config_find_next: No additional configuration supplied for: quorum
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_config_opt: No default for option: provider
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: get_cluster_type: Detected an active 'corosync' cluster
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: init_ais_connection_once: Connection to 'corosync': established
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: crm_new_peer: Node astdrbd2 now has id: 6
> ocfs2_controld[1786]: 2011/11/20_18:36:27 info: crm_new_peer: Node 6 is now known as astdrbd2
> ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: crm_abort: send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais
> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: send_ais_text: Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: crm_abort: send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais
> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[1786]: 2011/11/20_18:36:27 ERROR: send_ais_text: Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
> 1321832187 setup_stack@170: Cluster connection established.  Local node id: 6
> 1321832187 setup_stack@174: Added Pacemaker as client 1 with fd -1
>
> It is the help of the LHA community that has enabled me to progress as
> far as I have, and I really do appreciate it. I wish I could disclose more
> about why we don't just use the binaries included in a distro for a
> pcmk-only active/active cluster; all I am entitled to say is that it is a
> requirement to achieve this using source built from scratch.
>
> Kind Regards,
>
> Nick.
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>