Hi,

Zhen Huang <[EMAIL PROTECTED]> wrote:
> Hi,
>
> The DC node should try to connect to the quorumd server periodically.
> If not, it should be a bug.

Thanks for clarifying. I'll retest later today when I'm back at home; if I
can reproduce it, I'll open a Bugzilla entry.

Kind regards,
Sebastian

>
> Alan Robertson <[EMAIL PROTECTED]>
> 11/14/2007 03:13 AM
>
> To: Sebastian Reitenbach <[EMAIL PROTECTED]>
> cc: [email protected], Zhen Huang/China/[EMAIL PROTECTED]
> Subject: Re: [Linux-HA] question regarding quorumd
>
> Sebastian Reitenbach wrote:
> > Hi,
> >
> > Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> >> On Nov 13, 2007, at 11:13 AM, Sebastian Reitenbach wrote:
> >>
> >>> Hi,
> >>>
> >>> Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> >>>> On Nov 9, 2007, at 4:34 PM, Sebastian Reitenbach wrote:
> >>>>
> >>>>> Hi,
> >>>>>
> >>>>> I did some tests with a two node cluster and a third one
> >>>>> running a quorumd.
> >>>>>
> >>>>> I started the quorumd, and then the two cluster nodes.
> >>>>> The one that became DC started to communicate with the remote
> >>>>> quorumd.
> >>>> The CRM (and thus the "DC") doesn't know anything about quorumd.
> >>>> I believe this is purely the domain of the CCM and I've no idea
> >>>> how that works :-)
> >>>>
> >>>> We just consume membership data from it...
> >>>>
> >>>> So anyway, my point is that the fact that a node is the DC is
> >>>> irrelevant when it comes to quorumd.
> >>> But somehow the cluster knows, as only the DC is communicating
> >>> with the external quorumd.
> >> I think that it's just a coincidence that it happens to be the DC...
> >> at least I hope it is.
> > I thought I read somewhere that the DC is the one in charge of
> > communicating with the remote quorumd, but I may be wrong here.
> >
> >>> I just do not understand why the cluster does not try to
> >>> re-contact the quorumd after it lost the connection to it. What I
> >>> assumed was that, after a disconnect from the remote quorumd, the
> >>> cluster nodes should try to contact it, and when the contact is
> >>> there again, use it again.
> >> I agree - but I've never seen that code. You'll have to contact Alan
> >> or file a bug for him.
> > Alan, in case you think this is a bug, I'll go create a bug report
> > for it. Please let me know.
> >
> >>>>> I killed the DC, saw the other become DC and start communicating
> >>>>> with the remote quorumd; all fine, cluster still with quorum.
> >>>>> Then I killed the quorumd itself; the DC recognized this and
> >>>>> started to stop all resources, because of the quorum_policy, as
> >>>>> it lost quorum.
> >>>>>
> >>>>> Then I restarted the quorumd again, but the DC, still without
> >>>>> quorum, did not try to communicate with the quorumd again.
> >>>>> I'd expect the still living DC to try to contact the quorumd, in
> >>>>> case it comes back.
> >>>>>
> >>>>> If there is a good reason why the DC is not trying to reconnect
> >>>>> to the remote quorumd, I'd really like to get enlightened by
> >>>>> someone who knows.
>
> It should be trying to reconnect. It _does_ communicate w/quorumd from
> a single machine/cluster. I think that it's coincidence that it's the
> DC. Huang Zhen wrote the code. I've CCed him. I'm at the LISA
> conference this week - if HZ doesn't get back to you by next Monday,
> I'll look into it.
>
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
