On 2011-11-09 08:09, Ulrich Windl wrote:
>>>> Dejan Muhamedagic <[email protected]> wrote on 08.11.2011 at 16:34 in
> message <20111108153419.GB3575@squib>:
>> On Mon, Nov 07, 2011 at 12:02:48PM -0800, Robinson, Eric wrote:
>>>> As Florian mentioned, there's the debug option, but I doubt 
>>>> it is going to help. What may help is to take a look at 
>>>> the network traffic, but you'd need really good sight ;-)
>>>>
>>>> Thanks,
>>>>
>>>
>>> You're right, it didn't help. What helped was going back to the Linux
>>> bonding documentation, learning about /proc/net/bonding, and finding out
>>> that the bonded links were actually in rr mode instead of active-backup
>>> mode as I had thought, which in turn led to the discovery that I had a
>>> typo (BONDING_OPS instead of BONDING_OPTS) which was causing 50% dropped
>>> packets. Fixed that and the rings are very stable now. Still, it would
>>> have helped if the debug option gave more information. :-)
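
The check described above (reading /proc/net/bonding to see which mode the
bond is really running in) could be scripted roughly as follows. This is only
a sketch: the interface name bond0, the helper names and the sample status
text are assumptions for illustration, not taken from the poster's systems.

```python
import re
from pathlib import Path


def bonding_mode(status_text):
    """Return the text after 'Bonding Mode:' in the status text, or None."""
    match = re.search(r"^Bonding Mode:\s*(.+)$", status_text, re.MULTILINE)
    return match.group(1).strip() if match else None


def mode_matches(status_text, expected):
    """True if the reported bonding mode mentions the expected mode name."""
    mode = bonding_mode(status_text)
    return mode is not None and expected in mode


def read_status(iface="bond0"):
    """Read the kernel's status file for a bond (interface name is assumed)."""
    return Path("/proc/net/bonding", iface).read_text()


# Hypothetical excerpt of a status file for a bond running in round-robin mode.
SAMPLE = """Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: load balancing (round-robin)
MII Status: up
"""

# On a real host one would pass read_status("bond0") instead of SAMPLE.
if not mode_matches(SAMPLE, "active-backup"):
    print("unexpected bonding mode:", bonding_mode(SAMPLE))
```

A mismatch like this one would point back at the interface configuration,
which is where a misspelled option name such as BONDING_OPS would be found.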
>>
>> Well, I'm not going to argue that corosync's logging (or that of
>> any of our projects, really) is perfect, but in this case what can
>> it say apart from "token lost"?
> 
> Well, if the network is fine, and the implementation is correct, a token 
> cannot be "lost". It might arrive too late (when there's a misconfiguration).

Well, for a token, "arriving too late", as in "missing its timeout", is
equivalent to "getting lost". And I'm not following why you think that
that clearly points to a misconfiguration. The configuration could be
just fine, and you could be suffering from a real hardware issue.

> In other cases there should be a report of some network problem. As it stands 
> for now, a "lost token" can have a variety of reasons.

Hm. -ENOCRYSTALBALL. It's usually a poor idea for log messages to make
an assumption or suggestion as to the cause of the problem. DRBD at one
point had an error condition that included the log message "broken NICs"
-- which was fine until we found out that the condition could be
triggered by something other than broken NICs, too.

> Just my thoughts.

Mine too. :)

Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now