This indicates you don't have a proper configuration. Please send your corosync-objctl output and config file as requested previously.

Thanks
-steve

On 03/08/2011 02:40 PM, ray klassen wrote:
> one other thing. in this configuration, corosync has to be shot in the head
> itself to stop. /etc/init.d/corosync stop results in something like
> "Waiting for corosync services to stop" and lines and lines of dots. Kill -9
> is the only way, it seems.
>
> ----- Original Message ----
> From: ray klassen <[email protected]>
> To: [email protected]
> Sent: Tue, 8 March, 2011 13:12:27
> Subject: Re: [Openais] firewire
>
> MCP is not really mentioned anywhere except ClusterGuy's blog (maybe you're
> him) but from that I'm assuming that you mean starting the pacemaker
> separately, as /etc/init.d/pacemaker. So I removed the
> /etc/corosync/services.d/pcmk file. I also (from ClusterGuy's page on 'MCP'
> http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for
> ) added 'cman' (yum install cman -- for mailing list readers yet to come)
> from the alternative 2.
>
> And it does work. I now can view a 'partition with quorum' with crm_mon, over
> firewire, with udpu.
>
> Just don't really know how it works. how does pacemaker communicate with the
> stack? etc.? unix sockets? shared memory? how does corosync communicate with
> the stack?
>
> ----- Original Message ----
> From: Steven Dake <[email protected]>
> To: ray klassen <[email protected]>
> Cc: [email protected]
> Sent: Tue, 8 March, 2011 10:02:28
> Subject: Re: [Openais] firewire
>
> First off, I'd recommend using the "MCP" process that is part of
> Pacemaker rather than the plugin.
>
> Second, if you could run corosync-objctl and put the output on the list,
> along with your /etc/corosync/corosync.conf, that would be helpful.
>
> Regards
> -steve
>
> On 03/08/2011 09:19 AM, ray klassen wrote:
>> what I'm finding on further investigation is that all the pacemaker
>> child processes are dying on startup
>>
>> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process lrmd exited (pid=6356, rc=100)
>> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child process lrmd no longer wishes to be respawned
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving born-on unset: 308
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update: id=168430090, born=0, seq=308
>> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now has process list: 00000000000000000000000000111302 (1118978)
>> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process cib exited (pid=6355, rc=100)
>> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child process cib no longer wishes to be respawned
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving born-on unset: 308
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update: id=168430090, born=0, seq=308
>> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now has process list: 00000000000000000000000000111202 (1118722)
>> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process crmd exited (pid=6359, rc=100)
>> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child process crmd no longer wishes to be respawned
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving born-on unset: 308
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update: id=168430090, born=0, seq=308
>> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now has process list: 00000000000000000000000000111002 (1118210)
>> Mar 08 08:15:28 corosync [TOTEM ] mcasted message added to pending queue
>> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process attrd exited (pid=6357, rc=100)
>> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child process attrd no longer wishes to be respawned
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving born-on unset: 308
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update: id=168430090, born=0, seq=308
>> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now has process list: 00000000000000000000000000110002 (1114114)
>> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process pengine exited (pid=6358, rc=100)
>> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child process pengine no longer wishes to be respawned
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving born-on unset: 308
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update: id=168430090, born=0, seq=308
>> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now has process list: 00000000000000000000000000100002 (1048578)
>> Mar 08 08:15:28 corosync [TOTEM ] mcasted message added to pending queue
>> Mar 08 08:15:28 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process stonith-ng exited (pid=6354, rc=100)
>> Mar 08 08:15:28 corosync [pcmk ] notice: pcmk_wait_dispatch: Child process stonith-ng no longer wishes to be respawned
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Leaving born-on unset: 308
>> Mar 08 08:15:28 corosync [pcmk ] debug: send_cluster_id: Local update: id=168430090, born=0, seq=308
>> Mar 08 08:15:28 corosync [pcmk ] info: update_member: Node wwww.com now has process list: 00000000000000000000000000000002 (2)
>> Mar 08 08:15:28 corosync [TOTEM ] mcasted message added to pending queue
>> Mar
>>
>> ------------------------------------------------------------------------
>> *From:* Dan Frincu <[email protected]>
>> *To:* [email protected]
>> *Sent:* Tue, 8 March, 2011 2:45:00
>> *Subject:* Re: [Openais] firewire
>>
>> On Tue, Mar 8, 2011 at 2:07 AM, ray klassen <[email protected]> wrote:
>>
>> well I have the 1.3.0 version of corosync seemingly happy with udpu and
>> firewire. The logs report connection back and forth between the two
>> boxes. But now crm_mon never connects. Does pacemaker not support udpu yet?
>>
>> Pacemaker is the Cluster Resource Manager, so it doesn't really care
>> about the underlying method that the Messaging and Membership layer uses
>> to connect between nodes.
>>
>> I've had this issue (crm_mon not connecting) when I performed an upgrade
>> from openais-0.80 to corosync-1.3.0 with udpu, I solved it by eventually
>> rebooting the servers. In your case I doubt it's an upgrade between
>> versions of software, since you've reinstalled.
>>
>> My 2 cents.
>>
>> pacemaker-1.1.4-5.fc14.i686
>> (I switched to fedora from debian to get the latest version of corosync)
>>
>> ----- Original Message ----
>> From: Steven Dake <[email protected]>
>> To: ray klassen <[email protected]>
>> Cc: [email protected]
>> Sent: Thu, 3 March, 2011 16:56:21
>> Subject: Re: [Openais] firewire
>>
>> On 03/03/2011 05:45 PM, ray klassen wrote:
>> > Has anyone had any success running corosync with the firewire-net
>> > module? I want to set up a two node router cluster with a dedicated
>> > link between the routers. Only problem is, I've run out of ethernet
>> > ports so I've got ip configured on the firewire ports. pinging's no
>> > problem between the addresses.. funny thing is, on one of them (and
>> > they're really identical) corosync starts up no problem at all and
>> > stays up. on the other one corosync fails with "ERROR: ais_dispatch:
>> > Receiving message body failed: (2) Library error: Resource temporarily
>> > unavailable (11)."
>> >
>> > Reading up on the firewire-net mailing outstanding issues turned up
>> > that multicast wasn't fully implemented so my corosync.conf files both
>> > say broadcast: yes. instead of mcast-addr
>> >
>> > Firewire-net was emitting fwnet_write_complete: failed: 10 errors so I
>> > pulled down the latest vanilla kernel 2.6.37.2 and am running that.
>> > with far fewer of that error..
>> >
>> > otherwise versions are
>> > Debian Squeeze
>> > Corosync Version: 1.2.1-4
>> > Pacemaker 1.0.9.1+hg15626-1
>> >
>> > Is this a hopeless case? I've got a debug log from corosync that
>> > doesn't seem that helpful. If you want I can post that as well
>> >
>> > Thanks
>> >
>>
>> I'm hesitant to suggest using firewire as a transport as you're the first
>> person that has ever tried it. If multicast is broken on your hardware,
>> you might try the "udpu" transport which uses UDP only (udp is the basis
>> for all network communication).
>>
>> Regards
>> -steve
>>
>> > _______________________________________________
>> > Openais mailing list
>> > [email protected]
>> > https://lists.linux-foundation.org/mailman/listinfo/openais
>>
>> --
>> Dan Frincu
>> CCNA, RHCE
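
A note on the update_member lines in the quoted log: the "process list" values look like a hexadecimal bitmask, with one bit dropping out each time a child is reported as exited. That reading is an inference from the numbers in the log, not something stated in the thread. The Python sketch below only checks that inference against the quoted values; the names PROCESS_LISTS, EXITED and decode are made up for illustration, and the pairing of a child name with a bit comes purely from the order of the log lines.

```python
# Process-list values copied from the update_member lines quoted in the thread,
# as (hex string, decimal value printed in parentheses in the log).
PROCESS_LISTS = [
    ("00000000000000000000000000111302", 1118978),
    ("00000000000000000000000000111202", 1118722),
    ("00000000000000000000000000111002", 1118210),
    ("00000000000000000000000000110002", 1114114),
    ("00000000000000000000000000100002", 1048578),
    ("00000000000000000000000000000002", 2),
]

# Child reported as exited immediately before each list after the first one.
EXITED = ["cib", "crmd", "attrd", "pengine", "stonith-ng"]


def decode(lists, exited):
    """Return (child, cleared bit) pairs for consecutive process lists."""
    rows = []
    for (prev_hex, _), (cur_hex, cur_dec), name in zip(lists, lists[1:], exited):
        prev, cur = int(prev_hex, 16), int(cur_hex, 16)
        # The decimal in parentheses should equal the hex field.
        assert cur == cur_dec, (cur_hex, cur_dec)
        # Bits set before the exit and clear afterwards.
        rows.append((name, hex(prev & ~cur)))
    return rows


if __name__ == "__main__":
    for name, bit in decode(PROCESS_LISTS, EXITED):
        print(f"{name:<12} cleared {bit}")
```

If the inference holds, each exit clears exactly one bit and the last list keeps only the lowest set bit (0x2), which matches all six pacemaker children being gone by the end of the log.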
