On Sat, Nov 12, 2011 at 12:06 AM, Nick Khamis <[email protected]> wrote:
> Hello Andrew,
>
> I do apologize for this, and really appreciate how far I have got into
> this project thanks to everyone's help. Just as a quick summary:
>
> the patch that you suggested did in fact fix the following (ais.c:346):
>
> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: crm_abort:
> send_ais_text: Triggered assert at ais.c:346 : dest != crm_msg_ais
> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: send_ais_text:
> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: crm_abort:
> send_ais_text: Triggered assert at ais.c:346 : dest != crm_msg_ais
> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: send_ais_text:
> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
> 1320247939 setup_stack@170: Cluster connection established. Local node id: 1
> 1320247939 setup_stack@174: Added Pacemaker as client 1 with fd -1
>
> The run-time error I am getting now is in (corosync.c:352):
>
> ocfs2_controld[6883]: 2011/11/03_16:34:20 info: crm_new_peer: Node 1
> is now known as astdrbd1
> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
> send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais
> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
> Sending message 0 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort:
> send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais
> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
> ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: send_ais_text:
> Sending message 1 via cpg: FAILED (rc=22): Message error: Success (0)
> 1320352460 setup_stack@170: Cluster connection established. Local node id: 1
> 1320352460 setup_stack@174: Added Pacemaker as client 1 with fd -1
>
>
> * The controld RA is using the standard dlm_controld, and this is now working.
> * The o2cb RA is using ocfs2_controld.pcmk, and this is where I am running
>   into the runtime error with corosync.c
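
Both quoted failures trip the same check, dest != crm_msg_ais, in send_ais_text; only the source file and line differ (ais.c:346 before the patch, corosync.c:352 after). A minimal sketch for pulling the assert sites out of a larger log follows. It assumes each log entry has been unwrapped onto a single line, and the pattern is derived only from the samples quoted above; the function name assert_sites is made up for illustration.

```python
import re
from collections import Counter

# Pattern for "Triggered assert" entries, inferred from the quoted samples only
# (an assumption, not a documented log format).
ASSERT_RE = re.compile(
    r"(?P<daemon>\S+)\[(?P<pid>\d+)\]:.*?"
    r"(?P<func>\w+): Triggered assert at "
    r"(?P<file>[\w.]+):(?P<line>\d+)\s*:\s*(?P<expr>.+)"
)


def assert_sites(lines):
    """Count each (file, line, expression) seen in 'Triggered assert' entries."""
    sites = Counter()
    for line in lines:
        m = ASSERT_RE.search(line)
        if m:
            sites[(m["file"], int(m["line"]), m["expr"].strip())] += 1
    return sites


if __name__ == "__main__":
    sample = [
        "ocfs2_controld[14698]: 2011/11/02_11:32:19 ERROR: crm_abort: "
        "send_ais_text: Triggered assert at ais.c:346 : dest != crm_msg_ais",
        "ocfs2_controld[6883]: 2011/11/03_16:34:20 ERROR: crm_abort: "
        "send_ais_text: Triggered assert at corosync.c:352 : dest != crm_msg_ais",
    ]
    for (src, lineno, expr), count in assert_sites(sample).items():
        print(f"{src}:{lineno} ({expr}) x{count}")
```

Seeing the same expression reported from a different file suggests the check itself is still being hit, which fits the daemon being the wrong one for this stack rather than a second, unrelated bug.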

As I mentioned in the last email, you're not supposed to use
ocfs2_controld.pcmk with cman. You must use the standard ocfs2_controld.

>
>>
>> IMO (and as Florian alluded to in another message), you'd probably save
>> yourself a lot of trouble taking prebuilt packages from a distro where
>> the pieces you need are known to work together.
>
>> Indeed.
>
> There is no resenting that! But I am so close. Actually, I do have things
> working without the o2cb primitive, i.e., pcmk is starting the dual primary
> drbd, cloned dlm, and mounting the cloned ocfs2 filesystem:
>
> root@astdrbd1:~# /etc/init.d/cman start
> Starting cluster:
>    Checking if cluster has been disabled at boot... [ OK ]
>    Checking Network Manager... [ OK ]
>    Global setup... [ OK ]
>    Loading kernel modules... [ OK ]
>    Mounting configfs... [ OK ]
>    Starting cman... [ OK ]
>    Waiting for quorum... [ OK ]
>    Starting fenced... [ OK ]
>    Starting dlm_controld... [ OK ]
>    Unfencing self... [ OK ]
>    Joining fence domain... [ OK ]
>
> root@astdrbd1:~# /etc/init.d/pacemaker start
> Starting Pacemaker Cluster Manager: touch: missing file operand
> Try `touch --help' for more information.
> [ OK ]
>
>
> ============
> Last updated: Fri Nov 11 07:36:11 2011
> Last change: Fri Nov 11 07:33:06 2011 via crmd on astdrbd1
> Stack: cman
> Current DC: astdrbd1 - partition with quorum
> Version: 1.1.6-2d8fad5
> 2 Nodes configured, 2 expected votes
> 7 Resources configured.
> ============
>
> Online: [ astdrbd1 astdrbd2 ]
>
> astIP (ocf::heartbeat:IPaddr2): Started astdrbd1
> Master/Slave Set: msASTDRBD [astDRBD]
>     Masters: [ astdrbd2 astdrbd1 ]
> Clone Set: astDLMClone [astDLM]
>     Started: [ astdrbd2 astdrbd1 ]
> Clone Set: astFilesystemClone [astFilesystem]
>     Started: [ astdrbd2 astdrbd1 ]
>
>
> Of course, o2cb is not pcmk cluster aware right now and needs to be
> started manually.
>
> Vladislav, if you are getting this I can test if the kernel bug that slows
> down ocfs2 reported by you earlier.
> Is there any test you would like me to perform?
>
>
> Kind Regards,
>
> Nick.
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
