Re: [Dspam-devel] Improve logging

Tom Hendrikx Wed, 24 Aug 2011 15:56:18 -0700

On 24/08/11 23:40, Stevan Bajic wrote:
> On 24.08.2011 21:27, Tom Hendrikx wrote:
>> Hello,
> Hello,
> 
>> I'm currently looking into possibilities to improve regular logging
>> output for DSPAM. Currently, DSPAM does not log anything when running
>> normally. There is no easy way to see regular activity within a DSPAM
>> process. There are ways around this such as enabling TrackSources which
>> logs 1 line for each processed message (under most circumstances), but
>> this is far from ideal.
>>
>> The idea is to make DSPAM output some informal logging about its own
>> activity. To do this, my general idea is like this:
>>
>> - Add LOG(LOG_INFO, "informal message") to applicable places in the code
>>
>> - Add the actual strings to language.h in the same way current logging
>> (90% errors) is done.
>>
>> - Convert some of the existing debug messages to LOG(LOG_DEBUG...
>> statements and enable them even when --debug is disabled
> this will result in bigger binaries and more memory needed to run DSPAM. 
> Currently DSPAM does not compile some code if --debug is disabled.


This is meant for debugging output that is not inside ifdef blocks, also
see the part below about --debug etc.

Some (not all) of the current LOGDEBUG output is also valid for regular
end user debugging, so we could just convert them to LOG(LOG_DEBUG, ...
in stead of duplicating them.

> 
>> - Add a directive to dspam.conf that enables the user to change
>> verbosity: LogLevel {debug,info,notice,warn,critical,alert,emerg} and
>> log all messages above and including this level to the regular logging
>> channel (syslog or ./configure --with-logfile=<path>)
>>
>>
>> The general logging output should pass the following requirements:
>>
>> - Each (interesting) event should ad one (and only one) line to the log
>>
>> - A log entry should be short but informative, and contain as much
>> relevant information as possible.
>>
>> - LogLevel set to "info" should result in output that enables an
>> administrator (i.e. no developer) to see high level message handling on
>> a per-message basis.
>>
>> - LogLevel set above "info" should only output issues, as is currently
>> done (albeit because no messages of level "info" or "lower" are
>> available in the code).
>>
>> - LogLevel set to "debug" should add output that helps administrators to
>> find configuration issues.
>>
>> - Output that is only helpful for developers or significantly impacts
>> performance (because extra calculations/queries/etc are needed) should
>> be logged only when
>> --enable-debug/--enable-bnr-debug/--enable-verbose-debug options are
>> enabled at compile time. This is currently already being done.
>>
>> Other things to consider:
>>
>> We could write all the debug output to the regular logging channel,
>> thereby eliminating the current (IMHO annoying) practice of logging to
>> different files.
> What different files? Do you mean the SQL logging and the message 
> content logging?

Id like to eliminate the dspam.debug, bnr.log and sql.errors (which are
mostly not errors but queries) files as written to the logdir.

NB The files used for the webui (system.log and all other files in dspam
homedir) are not conventional logging and serve no actual logging
purpose: their intention is to provide data to the webui. They are not
subject to this conversation.

> 
>> Current error messages can be enhanced when possible. Depends on the
>> case, but adding debug/warning messages deeper inside the code should
>> reveal more information when someone is interested.
>>
>> If needed we could add some unique identifier for each message passing
>> though DSPAM, so log lines for each messages can be grouped together
>> easily (MTA queue id alike).
> Currently DSPAM does not have such code/id because it does not have a 
> queue management.

We could generate a semi-unique id for each message, not for queue
management, but for log file readability.

I have a low-traffic site so this would be not needed for me personally,
but i'd expect this to come in handy for larger sites that process
multiple messages at the same time. Sounds like good practice to me.

> 
>> Example
>> -------
>> We are running dspam as daemon and a message is processed. Loglevel is
>> set to "info". This means that a line is logged for each of the events:
>>
>> - external entity connects to dspam via LMTP or dspamc
>> - dspam classifies message, and decides what the result is
>> - dspam handles message according to result (quarantine/send over smtp/etc/)
>>
>> Thus, normal execution of a message results in 3 log lines. This seems
>> reasonable to be, since other applications (f.i. postfix) are able to
>> produce much more logging without harming performance.
> Without harming performance? No way! Each logging is by definition 
> harming performance.

Correction: harming performance not too much. Considering the amount of
work that DSPAM does, pushing a string towards the syslog socket (which
it is already connected to) is only minor overhead, I think. Anyway,
this is tunable when performance issues might arise.

I'll ask around tomorrow if any performance is gained (or needed) by
limiting log output on some high-traffic machines with talky software
(postfix is very talky).

For good measure: I believe in "Premature optimization is the root of
all evil" :)

> 
>> As I'm sure I'm not the only person that is annoyed by the lack of
>> regular processing output in DSPAM, I'd like to hear your ideas and
>> comments on the above.
> I am not against additional logging. The question that I have is: Why 
> the additional logging? What is the purpose of it? What goal do you try 
> to reach with more logging?

Ever since the first time I setup DSPAM this was a pita. Various things
come to mind:

- When setting up DSPAM, you are looking for issues in the logfile or
stdout while configuring. Having no output at all is confusing, as
no-one is telling that you are doing a good job.

- When setting up DSPAM (and you are doing it right), a message might
get lost somewhere in your MTA->DSPAM->MTA->IMAP pipeline (insert more
hops when needed). It's nice that you can easily determine that DSPAM is
working as intended, so you can concentrate on somewhere else. However,
DSPAM does nothing to inform you of correct processing.

- The 3 lines a week that are currently logged on my setup, could also
be passed to stderr, and logging infrastructure within DSPAM would no
longer be needed at all. DSPAM wants syslog or logfile specified at
compile time, so you expect it be used, but this not really done. This
is counterintuitive.

- More information == good. People can add logparsers to their
monitoring setup, and don't need to rely on half-fabricates such as
system.log, or the output of dspam_stats etc. I wrote working munin
plugins around both of them, but log parsing is more flexible, much
easier and more fun.

- You are saying 'additional logging' and 'more logging', but by default
there is no logging at all: a non-debug setup only logs startup and
shutdown messages. As an administrator, I value informative logging just
as much as a working setup: it helps you to find errors and anomalies,
and tells you what is happening.

- No informative logging output feels awkward: it seems as if DSPAM does
not really know what it is doing it self, since it won't tell about it.
(this is subjective and no hard fact, but still true)

And now, time for sleep.

--
Tom

------------------------------------------------------------------------------
EMC VNX: the world's simplest storage, starting under $10K
The only unified storage solution that offers unified management 
Up to 160% more powerful than alternatives and 25% more efficient. 
Guaranteed. http://p.sf.net/sfu/emc-vnx-dev2dev
_______________________________________________
Dspam-devel mailing list
Dspam-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-devel

Re: [Dspam-devel] Improve logging

Reply via email to