On Tue 16 Oct 2012 23:44:15 CEST, Lars Marowsky-Bree wrote:
[...]
> Depending on what kind of problem this node has, it could be that it
> erratically affects timing of network messages, or even sends garbage,
> which has the potential to mess up the totem protocol pretty much.
>
> What corosync version do you have?
>
> And yes, this is impossible to diagnose without the full cluster logs
> etc. A good candidate for bugzilla.
>
> Regards,
> Lars

Hi Lars,
thank you for your answer. I know that a coherent analysis is impossible
without the full logs, but as you can imagine there are a lot of logs
about this problem, and yes, I will file a Bugzilla report as soon as
possible.

Some more information about the systems:

OS version: CentOS release 6.2 (Final)
Kernel version: 2.6.32-220.23.1.el6.x86_64
Corosync version: corosync-1.4.1-4.el6_2.3.x86_64

Digging deeper into the failed node, I also saw this message:

ERST: Can not request iomem region
<0xffff88103419be60-0xffff102068337cc0> for ERST.

According to the Red Hat Knowledge Base, the root cause seems to be a
kernel problem with ERST (Error Record Serialization Table) access. The
suggested resolution is to upgrade to kernel version 2.6.32-279.el6.

I just need to know whether this error is a consequence of the original
one (the NMI) or its cause. What I know is that it appeared after the
NMI error, so it may be a consequence.

As I said, I will file a Bugzilla report soon.

Thanks again,

-- 
RaSca
Mia Mamma Usa Linux: Nothing is impossible to understand, if you explain it well!
[email protected]
http://www.miamammausalinux.org

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
