Hi Mark,

There is a hostcache file which you can remove, located under /var/.
Stop Heartbeat, make a backup of your hostcache file, then remove the
hostcache file. Start Heartbeat and have a look again. Renaming your
machines will cause problems with Heartbeat.

Tim

On Fri, Oct 2, 2009 at 9:34 AM, Mark Hunting <[email protected]> wrote:

> Mark Hunting wrote:
> > Dejan Muhamedagic wrote:
> >
> >> Hi,
> >>
> >> On Thu, Oct 01, 2009 at 04:45:45PM +0200, Mark Hunting wrote:
> >>
> >>> Sorry for not mentioning I use Heartbeat 2.1.3 from the Debian Lenny
> >>> repository, in crm configuration.
> >>>
> >>> Mark Hunting wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I have set up a 3-node cluster. Works perfectly, but when I shut one
> >>>> node down the other two lose quorum, and shut down their resources (!)
> >>>> because no-quorum-policy is set to 'stop' like it should.
> >>>> I have no idea why the quorum is lost, this really should not happen as
> >>>> the remaining two nodes are still the majority. crm_mon shows them
> >>>> online and they can talk to each other. Only the quorum is lost,
> >>>> have_quorum is "false" until the third node comes up again.
> >>>> Can anybody tell me how this is possible, or give me some command that
> >>>> can help me investigate this?
> >>>>
> >> ccm_tool (or similar, can't recall the name exactly) can show you
> >> what a node thinks its partition looks like. Otherwise, look at
> >> the ccm lines in the logs, though they may be really hard to
> >> figure out.
> >>
> >> Thanks,
> >>
> >> Dejan
> >>
> > Thanks a lot! It just came to my mind that I changed the three node
> > names today in ha.cf, and this problem started to occur afterwards. I
> > think the cluster still remembers the three old names next to the new
> > ones. I guess it now 'thinks' it has six nodes instead of three, and
> > that may be an explanation for this behaviour I'm seeing (although then
> > with 3 of the 6 nodes online it also shouldn't get a quorum imo, but it
> > does). crm_admin shows only 3 nodes however, that's a bit strange. I
> > can't access the cluster right now, but I'll try to figure out more
> > tomorrow. There should be a way to force the removal of the old node
> > names (ideas anyone?)
>
> I know a bit more now. The cluster thinks it has 4 nodes instead of 3. I
> see this in my logs:
>
> ccm: [5131]: debug: total_node_count=4, total_quorum_votes=400
>
> But there are really only 3 nodes. Crmadmin, ccm_tool and the xml output
> from cibadmin all only show my existing 3 nodes. So I have no idea
> where this total_node_count of 4 comes from. How can I let Heartbeat
> stop thinking it has 4 nodes?
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems

--
Timothy Carr
Technical Specialist
University of Cape Town
Cell: +27834572568
Fax: +27865472190
Gtalk: [email protected]
Skype: timothy.carr.foxtrail
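
For what it's worth, the log line quoted above (total_node_count=4, total_quorum_votes=400) would explain the symptom if quorum is decided by a strict majority of votes. Below is a minimal Python sketch of that arithmetic. It is an illustration only: the function name has_quorum is made up, the 100 votes per node is inferred from 400/4 in the log, and the strict-majority rule is an assumption, since Heartbeat's own quorum plugins may treat an exact tie differently.

```python
def has_quorum(online_nodes: int, total_node_count: int,
               votes_per_node: int = 100) -> bool:
    """Strict-majority check (assumed rule, not Heartbeat's actual code).

    A partition has quorum only if it holds more than half of all votes.
    """
    total_votes = total_node_count * votes_per_node
    partition_votes = online_nodes * votes_per_node
    return 2 * partition_votes > total_votes


# Intended cluster: 3 nodes known, one shut down, 2 remain online.
print(has_quorum(online_nodes=2, total_node_count=3))

# With a stale fourth entry: 4 nodes counted, 2 online (200 of 400 votes).
print(has_quorum(online_nodes=2, total_node_count=4))

# All 3 real nodes online while 4 are counted (300 of 400 votes).
print(has_quorum(online_nodes=3, total_node_count=4))
```

Under these assumptions, two online nodes are a majority of three but only exactly half of four, which matches quorum being lost when one node goes down and regained when it returns.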
