Thank you for your input :) The nodes are synced using NTP, although I am unsure about the respective run levels.
I will look into this, thank you.

On Sun, Dec 11, 2011 at 7:16 AM, Dukhan, Meir <mduk...@nds.com> wrote:

>
> Are your nodes time synced and how?
>
> We ran into problems of nodes being fenced because of an NTP problem.
>
> The solution (AFAIR, from the Redhat knowledge base) was to start ntpd
> _before_ cman.
> I'm not sure but there could be an update of openais or ntpd re this issue.
>
> For those of you who have a RedHat account, see the RedHat KB article:
>
>    Does cman need to have the time of nodes in sync?
>    https://access.redhat.com/kb/docs/DOC-42471
>
> Hope this helps,
>
> Regards,
> -- Meir R. Dukhan
>
> |-----Original Message-----
> |From: linux-cluster-boun...@redhat.com
> |[mailto:linux-cluster-boun...@redhat.com] On Behalf Of Digimer
> |Sent: Sunday, December 11, 2011 0:23 AM
> |To: Matthew Painter
> |Cc: linux clustering
> |Subject: Re: [Linux-cluster] Nodes leaving and re-joining intermittently
> |
> |On 12/10/2011 05:00 PM, Matthew Painter wrote:
> |> The switch was our first thought, but that has been swapped, and while
> |> we are not having nodes fenced anymore (we were daily), this anomaly
> |> remains.
> |>
> |> I will ask for those logs and conf on Monday.
> |>
> |> I think it might be worth reinstalling corosync on this box anyway?
> |> Can't be healthy if it is exiting uncleanly. I have had reports of the
> |> rgmanager dying on this box. (pid file but not running) Could that be
> |> related?
> |>
> |> Thanks :)
> |
> |It's impossible to say without knowing your configuration. Please share the
> |cluster.conf (only obfuscate passwords, please) along with the log files.
> |The more detail, the better. Versions, distros, network config, etc.
> |
> |Uninstalling corosync is not likely to help. RGManager is something fairly
> |high up in the stack, so it's not likely the cause either.
> |
> |Did you configure the timeouts to be very high, by chance? I'm finding it
> |difficult to fathom how the node can withdraw without being fenced, short
> |of cleanly stopping the cluster stack. I suspect there is something
> |important not being said, which the configuration information, versions and
> |logs will hopefully expose.
> |
> |--
> |Digimer
> |E-Mail: digi...@alteeve.com
> |Freenode handle: digimer
> |Papers and Projects: http://alteeve.com
> |Node Assassin: http://nodeassassin.org
> |"omg my singularity battery is dead again.
> |stupid hawking radiation." - epitron
> |
> |--
> |Linux-cluster mailing list
> |Linux-cluster@redhat.com
> |https://www.redhat.com/mailman/listinfo/linux-cluster
>
> This message is confidential and intended only for the addressee. If you
> have received this message in error, please immediately notify the
> postmas...@nds.com and delete it from your system as well as any copies.
> The content of e-mails as well as traffic data may be monitored by NDS for
> employment and security purposes.
> To protect the environment please do not print this e-mail unless
> necessary.
>
> An NDS Group Limited company. www.nds.com
>
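
For anyone wanting to check the start order mentioned in the quoted reply (ntpd before cman), here is a minimal sketch. It assumes SysV-style init, where start links in /etc/rcN.d are named "S", a numeric priority, then the service name, and lower numbers start first. The function names and the `rc_root` parameter are my own and purely illustrative; the service names `ntpd` and `cman` are taken from the thread, and the actual priorities on any given node are not something I know.

```python
import os
import re

# SysV start links look like "S<priority><service>", e.g. "S20foo".
START_LINK = re.compile(r"^S(\d+)(.+)$")


def start_priorities(link_names):
    """Map service name -> start priority for the entries of one rcN.d directory."""
    priorities = {}
    for name in link_names:
        match = START_LINK.match(name)
        if match:
            priorities[match.group(2)] = int(match.group(1))
    return priorities


def starts_before(link_names, first, second):
    """True if `first` has a lower start priority than `second`.

    Returns None when either service has no start link in this runlevel.
    Equal priorities return False (init then falls back to name order).
    """
    priorities = start_priorities(link_names)
    if first not in priorities or second not in priorities:
        return None
    return priorities[first] < priorities[second]


def check_runlevel(runlevel, rc_root="/etc"):
    """Check one runlevel directory; rc_root is an assumption about the layout."""
    directory = os.path.join(rc_root, "rc%d.d" % runlevel)
    if not os.path.isdir(directory):
        return None
    return starts_before(os.listdir(directory), "ntpd", "cman")


if __name__ == "__main__":
    for level in (2, 3, 4, 5):
        print("runlevel", level, "ntpd before cman:", check_runlevel(level))
```

A result of None for a runlevel means one of the two services is not enabled there (or the directory does not exist), which is worth knowing in itself.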
--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster