On Tue, 16 Oct 2012 23:44:15 CEST, Lars Marowsky-Bree wrote:
[...]
> Depending on what kind of problem this node has, it could be that it
> erratically affects timing of network messages, or even sends garbage,
> which has the potential to mess up the totem protocol pretty much.
> What corosync version do you have?
> And yes, this is impossible to diagnose without the full cluster logs
> etc. A good candidate for bugzilla.
> Regards,
>     Lars

Hi Lars,
thank you for your answer. I know that a coherent analysis is impossible
without the full logs, but as you can imagine there are a lot of logs
about this problem. Yes, I will file a Bugzilla report as soon as
possible.

Some more information about the systems:

OS version: CentOS release 6.2 (Final)
Kernel version: 2.6.32-220.23.1.el6.x86_64
Corosync version: corosync-1.4.1-4.el6_2.3.x86_64

Digging deeper into the failed node, I also saw this message:

ERST: Can not request iomem region
<0xffff88103419be60-0xffff102068337cc0> for ERST.

According to the Red Hat Knowledge Base, the root cause seems to be a
kernel problem with ERST (Error Record Serialization Table) access.

The suggested resolution is to upgrade to kernel version 2.6.32-279.el6.
I just need to know whether this error is a consequence of the original
one (the NMI) or its cause. All I know is that it appeared after the NMI
error, so it may be a consequence.

As I said, I will file a Bugzilla report soon. Thanks again,

-- 
RaSca
Mia Mamma Usa Linux: Niente è impossibile da capire, se lo spieghi bene!
[email protected]
http://www.miamammausalinux.org
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
