Re: [Dspam-devel] Improve logging

Stevan Bajić Thu, 25 Aug 2011 06:33:00 -0700

 On Thu, 25 Aug 2011 00:54:39 +0200, Tom Hendrikx wrote:
> On 24/08/11 23:40, Stevan Bajic wrote:
>> On 24.08.2011 21:27, Tom Hendrikx wrote:
>>> Hello,
>> Hello,
>>
>>> I'm currently looking into possibilities to improve regular logging
>>> output for DSPAM. Currently, DSPAM does not log anything when 
>>> running
>>> normally. There is no easy way to see regular activity within a 
>>> DSPAM
>>> process. There are ways around this such as enabling TrackSources 
>>> which
>>> logs 1 line for each processed message (under most circumstances), 
>>> but
>>> this is far from ideal.
>>>
>>> The idea is to make DSPAM output some informal logging about its 
>>> own
>>> activity. To do this, my general idea is like this:
>>>
>>> - Add LOG(LOG_INFO, "informal message") to applicable places in the 
>>> code
>>>
>>> - Add the actual strings to language.h in the same way current 
>>> logging
>>> (90% errors) is done.
>>>
>>> - Convert some of the existing debug messages to LOG(LOG_DEBUG...
>>> statements and enable them even when --debug is disabled
>> this will result in bigger binaries and more memory needed to run 
>> DSPAM.
>> Currently DSPAM does not compile some code if --debug is disabled.
>
> This is meant for debugging output that is not inside ifdef blocks, 
> also
> see the part below about --debug etc.
>
> Some (not all) of the current LOGDEBUG output is also valid for 
> regular
> end user debugging, so we could just convert them to LOG(LOG_DEBUG, 
> ...
> in stead of duplicating them.
>
 I was not expecting to double the code/output. All I was saying is that 
 a lot of debug code right now is NOT compiled into the binary if you DO 
 NOT enable --debug. So without --debug = less code. Your proposal will 
 remove that benefit and add additional code to the binary. And since you 
 propose to add different logging levels, each of those levels will add 
 more code to the binary.



>>
>>> - Add a directive to dspam.conf that enables the user to change
>>> verbosity: LogLevel {debug,info,notice,warn,critical,alert,emerg} 
>>> and
>>> log all messages above and including this level to the regular 
>>> logging
>>> channel (syslog or ./configure --with-logfile=<path>)
>>>
>>>
>>> The general logging output should pass the following requirements:
>>>
>>> - Each (interesting) event should ad one (and only one) line to the 
>>> log
>>>
>>> - A log entry should be short but informative, and contain as much
>>> relevant information as possible.
>>>
>>> - LogLevel set to "info" should result in output that enables an
>>> administrator (i.e. no developer) to see high level message 
>>> handling on
>>> a per-message basis.
>>>
>>> - LogLevel set above "info" should only output issues, as is 
>>> currently
>>> done (albeit because no messages of level "info" or "lower" are
>>> available in the code).
>>>
>>> - LogLevel set to "debug" should add output that helps 
>>> administrators to
>>> find configuration issues.
>>>
>>> - Output that is only helpful for developers or significantly 
>>> impacts
>>> performance (because extra calculations/queries/etc are needed) 
>>> should
>>> be logged only when
>>> --enable-debug/--enable-bnr-debug/--enable-verbose-debug options 
>>> are
>>> enabled at compile time. This is currently already being done.
>>>
>>> Other things to consider:
>>>
>>> We could write all the debug output to the regular logging channel,
>>> thereby eliminating the current (IMHO annoying) practice of logging 
>>> to
>>> different files.
>> What different files? Do you mean the SQL logging and the message
>> content logging?
>
> Id like to eliminate the dspam.debug, bnr.log and sql.errors (which 
> are
> mostly not errors but queries) files as written to the logdir.
>
 sql.errors is usually used for SQL errors. If you enable debugging then 
 the sql.errors is misused as a query log. dspam.debug is really just 
 debug and bnr.log is a very verbose log for BNR processing.


> NB The files used for the webui (system.log and all other files in 
> dspam
> homedir) are not conventional logging and serve no actual logging
> purpose: their intention is to provide data to the webui. They are 
> not
> subject to this conversation.
>
>>
>>> Current error messages can be enhanced when possible. Depends on 
>>> the
>>> case, but adding debug/warning messages deeper inside the code 
>>> should
>>> reveal more information when someone is interested.
>>>
>>> If needed we could add some unique identifier for each message 
>>> passing
>>> though DSPAM, so log lines for each messages can be grouped 
>>> together
>>> easily (MTA queue id alike).
>> Currently DSPAM does not have such code/id because it does not have 
>> a
>> queue management.
>
> We could generate a semi-unique id for each message, not for queue
> management, but for log file readability.
>
 Yeah. Without such a number and with the need to log more then one line 
 per message, we would need such a id in order to concatenate log entries 
 belonging to one message.


> I have a low-traffic site so this would be not needed for me 
> personally,
> but i'd expect this to come in handy for larger sites that process
> multiple messages at the same time. Sounds like good practice to me.
>
>>
>>> Example
>>> -------
>>> We are running dspam as daemon and a message is processed. Loglevel 
>>> is
>>> set to "info". This means that a line is logged for each of the 
>>> events:
>>>
>>> - external entity connects to dspam via LMTP or dspamc
>>> - dspam classifies message, and decides what the result is
>>> - dspam handles message according to result (quarantine/send over 
>>> smtp/etc/)
>>>
>>> Thus, normal execution of a message results in 3 log lines. This 
>>> seems
>>> reasonable to be, since other applications (f.i. postfix) are able 
>>> to
>>> produce much more logging without harming performance.
>> Without harming performance? No way! Each logging is by definition
>> harming performance.
>
> Correction: harming performance not too much. Considering the amount 
> of
> work that DSPAM does, pushing a string towards the syslog socket 
> (which
> it is already connected to) is only minor overhead, I think. Anyway,
> this is tunable when performance issues might arise.
>
 Sending those little strings to syslog socket sounds like nothing but 
 such little things can degenerate performance. I know it sounds 
 ridiculous but any output is harming performance.


> I'll ask around tomorrow if any performance is gained (or needed) by
> limiting log output on some high-traffic machines with talky software
> (postfix is very talky).
>
> For good measure: I believe in "Premature optimization is the root of
> all evil" :)
>
 Well... no optimization has been done. It just was never there.
 btw: is not the opposite gender the root of all evil? ;)


>>
>>> As I'm sure I'm not the only person that is annoyed by the lack of
>>> regular processing output in DSPAM, I'd like to hear your ideas and
>>> comments on the above.
>> I am not against additional logging. The question that I have is: 
>> Why
>> the additional logging? What is the purpose of it? What goal do you 
>> try
>> to reach with more logging?
>
> Ever since the first time I setup DSPAM this was a pita. Various 
> things
> come to mind:
>
> - When setting up DSPAM, you are looking for issues in the logfile or
> stdout while configuring. Having no output at all is confusing, as
> no-one is telling that you are doing a good job.
>
 This is configuration thing. Nothing that will be globally solved with 
 more logging on a per message basis.


> - When setting up DSPAM (and you are doing it right), a message might
> get lost somewhere in your MTA->DSPAM->MTA->IMAP pipeline (insert 
> more
> hops when needed). It's nice that you can easily determine that DSPAM 
> is
> working as intended, so you can concentrate on somewhere else. 
> However,
> DSPAM does nothing to inform you of correct processing.
>
 It does mention stuff in system.log and/or can do some logging with 
 TracSources. Anyway.... to take the example from above:
 -> MTA -> DSPAM -> MTA -> WHATEVER

 When the message leaves your MTA then the MTA is logging this and when 
 the message is reinjected back from DSPAM to the MTA then the MTA is 
 logging that too. Right? So you know that DSPAM is working.


> - The 3 lines a week
>
 A week? You mean per message?


> that are currently logged on my setup, could also
> be passed to stderr, and logging infrastructure within DSPAM would no
> longer be needed at all. DSPAM wants syslog or logfile specified at
> compile time, so you expect it be used, but this not really done. 
> This
> is counterintuitive.
>
> - More information == good.
>
 I am to long doing IT to know that this is not true. More information 
 is not imperatively good.
 The RIGHT information at the RIGHT time == good.
 All other (more) information is usually useless.
 

> People can add logparsers to their
> monitoring setup, and don't need to rely on half-fabricates such as
> system.log, or the output of dspam_stats etc. I wrote working munin
> plugins around both of them, but log parsing is more flexible, much
> easier and more fun.
>
> - You are saying 'additional logging' and 'more logging', but by 
> default
> there is no logging at all: a non-debug setup only logs startup and
> shutdown messages. As an administrator, I value informative logging 
> just
> as much as a working setup: it helps you to find errors and 
> anomalies,
> and tells you what is happening.
>
 Off course you are right.


> - No informative logging output feels awkward: it seems as if DSPAM 
> does
> not really know what it is doing it self,
>
 Just because DSPAM does not do that much logging does not mean that it 
 self does not know what it is doing.


> since it won't tell about it.
> (this is subjective and no hard fact, but still true)
>
 So a calm person is in your viewpoint a person not knowing what it does 
 it self? LOL :)


> And now, time for sleep.
>
> --
> Tom
>
-- 
 Kind Regards from Switzerland,

 Stevan Bajić

------------------------------------------------------------------------------
EMC VNX: the world's simplest storage, starting under $10K
The only unified storage solution that offers unified management 
Up to 160% more powerful than alternatives and 25% more efficient. 
Guaranteed. http://p.sf.net/sfu/emc-vnx-dev2dev
_______________________________________________
Dspam-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-devel

Re: [Dspam-devel] Improve logging

Reply via email to