Re: [Linux-HA] Antw: Re: Massive amount of log messages after node failure

Ulrich Windl Wed, 18 May 2011 00:03:52 -0700

>>> Lars Marowsky-Bree <[email protected]> wrote on 17.05.2011 at 22:39 in message
<[email protected]>:
> On 2011-05-17T17:16:51, Ulrich Windl <[email protected]> 
> wrote:
> 
> > I think that pacemaker is logging too much all the time, so you can hardly
> > find out if there really is a problem. For example, external/sbd is logging a
> > message every time the shared disk is OK, that is, every 30s or so.
> 
> It should not - the external/sbd status code path doesn't have any log
> messages? What do you see?


Apr 28 17:10:11 host2 stonith: [7890]: info: external/sbd device OK.
Apr 28 17:10:42 host2 stonith: [7951]: info: external/sbd device OK.
Apr 28 17:11:13 host2 stonith: [8007]: info: external/sbd device OK.
Apr 28 17:11:44 host2 stonith: [8063]: info: external/sbd device OK.

> 
> In general, turning down logging is something that we do, but with care
> - disk space is cheap, missing the information to diagnose a problem
> after the first failure and needing to recreate it is not. I'd rather
> err on the conservative side. If you're looking for important bits,
> filtering for warn/crit/err/emerg should do.
> 
> Syslog has the advantage of seeing all messages in context, an
> incredibly valuable aspect.
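
As an illustration of the severity filtering suggested above, here is a minimal sketch. It assumes every line carries a "[pid]: severity:" tag as in the excerpt earlier in this message; the function name, the regular expression, and the exact set of severity spellings are my own assumptions, not anything pacemaker or syslog defines.

```python
import re

# Assumed layout, taken from the excerpt in this thread:
#   "Apr 28 17:10:11 host2 stonith: [7890]: info: external/sbd device OK."
# The severity is the word between "[pid]: " and the next colon.
SEVERITY_RE = re.compile(r"\[\d+\]:\s+(\w+):")

# Assumed spellings of the "important" severities (compared in lower case).
IMPORTANT = {"warn", "warning", "crit", "err", "error", "emerg"}


def important_lines(lines):
    """Yield only the lines whose severity tag is in IMPORTANT."""
    for line in lines:
        match = SEVERITY_RE.search(line)
        if match and match.group(1).lower() in IMPORTANT:
            yield line
```

Lines without a recognizable tag are dropped by this sketch; whether that is acceptable depends on what else ends up in the same syslog file.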

In some older software I wrote, I collected debug messages in a separate file, 
and when no errors occurred the file was deleted. In case of an error the file 
was mailed together with the error message. That kind of approach makes much 
more sense than creating megabytes of messages that nobody cares about.
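
A rough sketch of that approach, for illustration only. This is not the original software; the class and method names are hypothetical, and the mailing step is left to the caller, who receives the report text instead.

```python
import os
import tempfile


class DebugSpool:
    """Collect debug messages in a temporary file; keep them only on error."""

    def __init__(self):
        # Hypothetical file naming; any private temporary file would do.
        fd, self.path = tempfile.mkstemp(prefix="debug-", suffix=".log")
        self._file = os.fdopen(fd, "w+")

    def debug(self, message):
        """Append one debug message to the spool file."""
        self._file.write(message + "\n")

    def finish(self, error=None):
        """Return a report (error plus debug log) on error, else None.

        The spool file is removed in both cases; on error its contents
        live on in the returned report, which the caller could mail.
        """
        try:
            if error is None:
                return None
            self._file.seek(0)  # seek() also flushes pending writes
            return f"ERROR: {error}\n\n--- debug log ---\n{self._file.read()}"
        finally:
            self._file.close()
            os.remove(self.path)
```

A caller would create one DebugSpool per run, call debug() as it goes, and call finish() at the end, passing the error message if something went wrong.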

> 
> > And of course, I wouldn't complain if I hadn't done it better a long time
> > ago:
> 
> Ah, so you're offering patches! ;-) Excellent, we look forward to
> reviewing them - please post them on the respective development mailing
> lists.

Once I have set up our development system I will definitely have a look at all 
the stuff, but I'll be quite busy with more important tasks in the next weeks 
or months.

> 
> > Seeing pacemaker logs, I feel the programmers just left their personal
> > debugging messages in there which nobody really understands. An example:
> 
> Of course. Some of the messages are intended to be read by developers
> when we try to diagnose customer/user problems. They get very anxious
> when we can't. ;-)
> 
> Like I said, we're always tuning them down - you'll find that they are a
> lot quieter nowadays than they were 2 years ago, and in theory, a
> cluster that doesn't do anything won't log much. What you quoted was,
> however, from an active transition - the cluster was actively doing
> something anyway, and we'd rather be able to figure it out in
> retrospect.

I have a cluster that just has an SBD device configured (it's about to be 
completed soon). It's producing a lot of messages all the time. I'm still 
unsure whether there is a problem or not, but once I know better, I'll ask 
again.

Regards,
Ulrich


_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
