On 02/29/12 18:46, Marcus Bointon wrote:
> On 29 Feb 2012, at 17:43, Florian Haas wrote:
>
>> My hunch is that you never properly shut down corosync on that one.
>> Did you check your ps output to see if it was really down? Corosync
>> 1.2.x had some nasty shutdown issues when running with Pacemaker.
>
> I shut down or killed anything vaguely related to
> corosync/crm/heartbeat/crm/cib and restarted corosync and pacemaker.
>
> Now on www4 I can see a pacemaker process with crmd, pengine, lrmd and
> stonithd child processes, and on www5 I see those plus attrd and cib (which
> curiously are the same processes that were reporting segfaults when I was
> running the old version). www4 is correspondingly still failing to connect to
> cib.
>
> Starting corosync by itself appears to work correctly on both - the logs show
> they see each other, no errors.
>
> If on www4 I start attrd and cib manually (as root), they do run, and crm
> then manages to connect but reports no nodes. crm on www5 sees www4, but it's
> marked as 'pending'. pcmk on www4 logs that it can see www5.
>
> Marcus

And you're sure you've got a healthy Corosync membership?
"corosync-cfgtool -s" shows all rings healthy? "corosync-objctl | grep
member" shows 2 members?

Cheers,
Florian

--
Need help with High Availability?
http://www.hastexo.com/now
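
For convenience, the two checks asked about above could be wrapped in a small shell helper and run on each node. This is only a minimal sketch: the function name check_membership is hypothetical, it assumes the corosync 1.x command-line tools are on the PATH, and it simply runs the two quoted commands without interpreting their output.

```shell
# Hypothetical helper: run the two membership checks quoted in the message.
# Run it on every node (here www4 and www5) and compare the results.
check_membership() {
    # Bail out early if the corosync command-line tools are not installed.
    for tool in corosync-cfgtool corosync-objctl; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            return 1
        fi
    done

    # Ring status as seen by this node; every ring should be reported healthy.
    corosync-cfgtool -s

    # Membership entries in the object database; both nodes should show up.
    corosync-objctl | grep member
}
```

Calling check_membership on www4 and on www5 and comparing the output side by side should show whether both nodes agree on ring health and on the member list.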