Re: [Declude.JunkMail] Feature-itis

2004-02-23 Thread R. Scott Perry

> I want cleaner logs.  This has been discussed in the list before, and I'm
> pretty sure that Pete and Sandy agreed that they'd seen the behaviour
> elsewhere, i.e. that multiple processes writing to the same log file are
> garbling the text file, and that per se, the garbling wasn't strictly
> declude's doing.

We will be looking into different options for logging.

> I get pecked to death by ducks on the small-weight false positives I get on
> short text matches that are matching the encoded body of BASE64 attachments.
> I know that you've mentioned several times before that going beyond the
> current functionality would require a big leap in going to full MIME
> decoding, but I hope that my aim is lower: I want to skip matching the
> BASE64 encoding.

Unfortunately, this really would require full MIME decoding.  But the good
news is that we probably will have to add full MIME decoding for an
upcoming release.

   -Scott
---
Declude JunkMail: The advanced anti-spam solution for IMail mailservers 
since 2000.
Declude Virus: Catches known viruses and is the leader in mailserver 
vulnerability detection.
Find out what you've been missing: Ask for a free 30-day evaluation.

---
[This E-mail was scanned for viruses by Declude Virus (http://www.declude.com)]
---
This E-mail came from the Declude.JunkMail mailing list.  To
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
type unsubscribe Declude.JunkMail.  The archives can be found
at http://www.mail-archive.com.


Re: [Declude.JunkMail] Feature-itis

2004-02-23 Thread Matt
Andrew,

I just wanted to chime in and say that I of course would love to see 
non-text base64 stuff thrown out before scanning, and allow us to target 
only unencoded text strings.  The idea of scanning only the decoded text 
would also be a big processor saver and the primary method, so maybe you 
would make the decoded choice the BODY filter and have a BODYSOURCE 
filter for the encoded version.  It's important that a filter have the 
ability to scan MIME attachment descriptors though, so a method to 
provide for that would be necessary as well.  Maybe a BODYSOURCE check 
would match what a BODY filter does today even with the source of the 
attachments.

Regarding the logging, I'm not sure about the garbling, but streamed log 
files like Declude's, IMail's and all sorts of HTTP stuff are generally 
written on demand instead of grouped together.  I believe there are 
definite advantages beyond speed for doing this.  It does though lack a 
unique identifier as a single field which might be nice for log parsing, 
and maybe that's what's needed here.  Something like the spool file name 
with a -1 appended to it which increments and appears on each line 
would do the trick, am I right?  That would certainly make things easier 
to parse.
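
A minimal sketch of the per-line identifier idea, written in Python for illustration only (it is not Declude's code, and the spool names, the "<spool>-<n>" format, and the function names are all assumptions). Each log line carries the spool file name plus an incrementing counter, so a parser can regroup lines that were written interleaved:

```python
from collections import defaultdict
from itertools import count


def make_tagger(spool_name):
    """Return a function that prefixes each log line with '<spool>-<n>'."""
    counter = count(1)

    def tag(text):
        return f"{spool_name}-{next(counter)} {text}"

    return tag


def group_by_message(lines):
    """Regroup interleaved log lines by spool name, ordered by their counter."""
    groups = defaultdict(list)
    for line in lines:
        ident, _, text = line.partition(" ")
        spool, _, seq = ident.rpartition("-")
        groups[spool].append((int(seq), text))
    return {spool: [t for _, t in sorted(items)] for spool, items in groups.items()}


# Two hypothetical messages whose lines were written interleaved.
tag_a, tag_b = make_tagger("Qaaaa1111"), make_tagger("Qbbbb2222")
interleaved = [tag_a("start"), tag_b("start"), tag_b("test FAILED"), tag_a("done")]
grouped = group_by_message(interleaved)
```

Note that this only helps when whole lines are interleaved; it does not repair a line that was split in the middle by another process.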

Matt

Colbeck, Andrew wrote:

Far be it for me to halt progress...

Scott, I can't wait to put in the new TESTSFAILED logic.  I've wanted
exactly this to keep certain multi-answer ip4r tests in check, and Matt is
off to a great start in combining tests...
I also find that CMDSPACE is very handy and has low false positives.

Seamless decoding of BASE64 Subjects (and quoted printables?) is also a good
thing(tm).
SPF testing and the time-based DOW and HOUR features could be very handy.

But for my two cents, I have other priorities:

Priority #1 (by far)
===

I want cleaner logs.  This has been discussed in the list before, and I'm
pretty sure that Pete and Sandy agreed that they'd seen the behaviour
elsewhere, i.e. that multiple processes writing to the same log file are
garbling the text file, and that per se, the garbling wasn't strictly
declude's doing.

I find that I need to run at loglevel HIGH to get the reporting I need on
text filtering, which means bigger log files and presumably more time spent
by each instance of declude while it or Windows races to the end of the file
to append the text.

Without good logging, I'm very much put off my log analysis.  Filtering the
logs when I get a false positive during my mailserver's "morning rush" is a
major pain due to all the overlapping loglines.

I can think of a couple of techniques, and I'm a lousy programmer.  I don't
think you'll need my help there...

The simplest thing might be to give us a variable in the global.cfg to turn
on file locking, so that we can control whether the performance hit is
important in our environment.  I realize that would likely add a lot of
lines of code to your source, but it could also be trivial to implement
inside a function.

Sending to a syslog server might also be easy to implement, but the only
experience I have with using the logs in a resulting syslog server is with
Kiwi, and there, I was using the text log it creates rather than any kind of
interface to syslog (I don't know if that's the norm, nor what the IMail
users with syslog do with their logging.)

Ideally, the logs would be sent directly by declude.exe to an ODBC DSN and
the particular SQL database of our choosing, but I know that's really a
stretch.

Priority #2
===

I get pecked to death by ducks on the small-weight false positives I get on
short text matches that are matching the encoded body of BASE64 attachments.
I know that you've mentioned several times before that going beyond the
current functionality would require a big leap in going to full MIME
decoding, but I hope that my aim is lower: I want to skip matching the
BASE64 encoding.

Sure, it would also be great to skip decoding MIME attachments that aren't
text or HTML (I get false positives on the binary contents of decoded .zip
files, too), but that would probably be Priority #3.

I know that at least one person on the list relies on declude to match text
inside the BASE64 attachments to catch viruses, but perhaps matching that
could be toggled with a flag, or make it a new test, e.g. instead of
specifying
	BODY x CONTAINS abcdefghij

that this would be appropriate:

	BASE64CODE x CONTAINS s9Zci6Y4

I haven't thought through all the ways in which a decoder would be useful,
so that exact testname might not be appropriate, but hey, it's a start.
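
To make the distinction concrete, here is a rough sketch of what "match only the text, skip the BASE64 encoding" could look like. It uses Python's standard email package rather than anything in Declude, and the function name text_for_body_filter and the sample message are made up for illustration. The attachment bytes are chosen so that their BASE64 form is the s9Zci6Y4 string from the example above:

```python
import base64
from email import message_from_string, policy
from email.message import EmailMessage


def text_for_body_filter(raw_message):
    """Return the decoded text/* parts of a message; skip all other parts."""
    msg = message_from_string(raw_message, policy=policy.default)
    chunks = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers have no content of their own
        if part.get_content_maintype() == "text":
            chunks.append(part.get_content())  # decoded to str
    return "\n".join(chunks)


# Build a hypothetical message: one text part plus one binary attachment.
sample = EmailMessage()
sample["Subject"] = "Report"
sample.set_content("Quarterly report attached.")
sample.add_attachment(
    base64.b64decode("s9Zci6Y4"),  # six arbitrary bytes
    maintype="application",
    subtype="zip",
    filename="report.zip",
)
raw = sample.as_string()            # what a raw-source filter would see
filtered = text_for_body_filter(raw)  # what a text-only filter would see
```

Here a search for s9Zci6Y4 hits the raw source but not the filtered text, which is the behaviour being asked for. Walking the parts like this is already most of the way to full MIME decoding.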
Thanks for reading this all the way through,

Andrew 8)

Re: [Declude.JunkMail] Feature-itis

2004-02-23 Thread Matt




Ick :)

LOGLEVEL MID doesn't look nearly as bad, though there will be the
occasional series of line breaks with code appearing in it. I haven't
tried parsing the logs with anything but DLAnalyzer though.

Matt


Colbeck, Andrew wrote:

  Ahem, like I said, the attachment.

Andrew ;)

-----Original Message-----
From: Colbeck, Andrew 
Sent: Monday, February 23, 2004 12:58 PM
To: '[EMAIL PROTECTED]'
Subject: RE: [Declude.JunkMail] Feature-itis


Matt, it's not that I particularly want the log files to be neatly ordered
instead of interleaved.  Although it would be nice, I'm used to identifying
the Q... number first and then filtering out the one(s) I want for
examination.

My bugbear is that individual lines are garbled, the start of new log lines
appearing in the middle of other log lines.  See the accompanying snippet
for a sample.  I have not added or removed any CR/LF from that sample.

Back in the stone age, I implemented a very similar logging system that was
also multi-process, and locking was feared to be too much of a drain.  For
readability, I wanted the log to not be interleaved, and I also wanted to
avoid generating unique names for the logs, so instead of printing the lines
instantly, I appended to a string, complete with CR/LF, and at the end of
the task, I then implemented a lock, spin, write, unlock routine.
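
The routine described above might look roughly like the following. This is a Python sketch under stated assumptions, not the original implementation: the class name, the ".lock" file used as the lock, the retry settings, and the log file name are all hypothetical. Lines are collected in memory with their CR/LF, and only at the end of the task does the process take the lock, spin if it is busy, write the whole block, and release the lock:

```python
import os
import tempfile
import time


class BufferedMessageLog:
    """Collect one message's log lines, then append them as a single block."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.lock_path = log_path + ".lock"  # assumed lock-file convention
        self.buffer = []

    def add(self, line):
        self.buffer.append(line + "\r\n")

    def flush(self, retry_delay=0.01, max_tries=500):
        # Lock and spin: creating the lock file fails while another holds it.
        for _ in range(max_tries):
            try:
                fd = os.open(self.lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                break
            except FileExistsError:
                time.sleep(retry_delay)
        else:
            raise TimeoutError("could not acquire log lock")
        try:
            # Write: the whole block goes out in one append.
            with open(self.log_path, "a", newline="") as log:
                log.write("".join(self.buffer))
            self.buffer.clear()
        finally:
            # Unlock.
            os.close(fd)
            os.remove(self.lock_path)


# Two hypothetical messages logged at the same time to the same file.
log_file = os.path.join(tempfile.mkdtemp(), "example.log")
first, second = BufferedMessageLog(log_file), BufferedMessageLog(log_file)
first.add("Qaaaa1111 test one")
second.add("Qbbbb2222 test one")
first.add("Qaaaa1111 test two")
first.flush()
second.flush()
```

The trade-off is that nothing reaches the log until the task ends, so a process that dies mid-task leaves no lines behind, and a stale lock file would have to be cleaned up by hand.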

Andrew 8(



Re: [Declude.JunkMail] Feature-itis

2004-02-23 Thread DLAnalyzer Support
That was our log parsing tool (DLAnalyzer).  Our mail servers are very busy 
and we often see a lot of the lines intermixed during peak times.  We make 
every attempt to interpret mixed logging lines and extract as much information 
as possible from them, but sometimes it's so intermixed that it's impossible, 
so the information gets discarded as unusable.  It would be nice to have 
syslogging as a feature. 

However, it may not be very easy to integrate syslogging support into 
Declude.  I am curious to know if the majority of folks would prefer that 
the focus of the developer(s) be maintained on developing new spam features 
versus re-tooling Declude to work with a syslog daemon.  Depending on the 
amount of work, my preference (if a lot of work was required) would be to 
spend the time working on new spam detection features. 

Darrell

Check Out DLAnalyzer a comprehensive reporting tool for
Declude Junkmail Logs - http://www.dlanalyzer.com 

Charles Frolick writes: 

I ran across this when I wrote my graphing app, and someone else that
wrote a log parser said they had to skip those lines as well, so the end
result is lost information.  High volume servers are going to be a lot
more likely to suffer. I know there is a solution, what it is I don't
know.  The ability to write to a syslog daemon would help, it could
buffer the input data and commit to the log file with full locking.   

Just a thought, Scott, you already send log info to Declude Console, how
about using Declude console or some other helper app as the log writer,
keeps the conversation local and should resolve the whole two processes
write to the same line issue? 
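
As a rough illustration of the "one helper app does all the writing" idea, here is a small Python sketch. It is not how Declude Console works: threads stand in for the separate scanning processes, an in-memory queue stands in for whatever local channel (pipe, socket, syslog) would carry the lines, and all names are hypothetical. The point is only that the scanners hand over complete lines and a single writer is the only thing that touches the log:

```python
import io
import queue
import threading


def log_writer(inbox, sink):
    """Drain whole lines from the queue; only this function writes to sink."""
    while True:
        line = inbox.get()
        if line is None:  # sentinel: shut down
            break
        sink.write(line + "\n")


def scanner(name, inbox, n_lines):
    """Stand-in for one scanning process handing complete lines to the writer."""
    for i in range(n_lines):
        inbox.put(f"{name} line {i}")


inbox = queue.Queue()
sink = io.StringIO()  # stand-in for the real log file
writer = threading.Thread(target=log_writer, args=(inbox, sink))
writer.start()

workers = [
    threading.Thread(target=scanner, args=(f"Q{n:04d}", inbox, 50))
    for n in range(4)
]
for w in workers:
    w.start()
for w in workers:
    w.join()

inbox.put(None)  # all scanners are done; tell the writer to stop
writer.join()
written = sink.getvalue().splitlines()
```

Lines from different scanners may still alternate in the output, but each line arrives whole, because no two writers ever append to the file at once.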

Thanks,
Chuck Frolick
ArgoLink.net

RE: [Declude.JunkMail] Feature-itis

2004-02-23 Thread R. Scott Perry

> Just a thought, Scott, you already send log info to Declude Console, how
> about using Declude console or some other helper app as the log writer,
> keeps the conversation local and should resolve the whole two processes
> write to the same line issue?

The problem is that the code used to communicate with the Declude Console 
isn't written to effectively handle large quantities of log file 
entries.  In order to avoid the problem with Windows not properly saving 
files when multiple processes are saving to it, we would need to write code 
better than what Microsoft uses.  It certainly can be done, but would 
likely require quite a bit of work.

   -Scott


RE: [Declude.JunkMail] Feature-itis

2004-02-23 Thread Markus Gufler


> However, it may not be very easy to integrate syslogging 
> support into Declude.  I am curious to know if the majority 
> of folks would prefer that the focus of the developer(s) be 
> maintained on developing new spam features versus re-tooling 
> Declude to work with a syslog daemon.  Depending on the 
> amount of work my preference (if a lot of work was required) 
> would be to spend the time working on new spam detection features. 

In my opinion definitely on new or enhanced features. The todo list is long
enough and spammers don't wait until we have solved our logging problems.

Maybe the easiest solution would be to change logging as described by
Andrew Colbeck: Save all logfile lines into a temporary variable including
cr/lf and write them to the logfile after finishing all tests. 

We do this in SpamChk and have never had any problem with broken/mixed log
lines. In addition we have all logfile entries for one message in one block.
So it's easier to read.

Markus



Re: [Declude.JunkMail] Feature-itis

2004-02-23 Thread Bill Landry
----- Original Message -----
From: Markus Gufler [EMAIL PROTECTED]

> > However, it may not be very easy to integrate syslogging
> > support into Declude.  I am curious to know if the majority
> > of folks would prefer that the focus of the developer(s) be
> > maintained on developing new spam features versus re-tooling
> > Declude to work with a syslog daemon.  Depending on the
> > amount of work my preference (if a lot of work was required)
> > would be to spend the time working on new spam detection features.
>
> In my opinion definitely on new or enhanced features. The todo list is long
> enough and spammers don't wait until we have solved our logging problems.
>
> Maybe the easiest solution would be to change logging as described by
> Andrew Colbeck: Save all logfile lines into a temporary variable including
> cr/lf and write them to the logfile after finishing all tests.
>
> We do this in SpamChk and have never had any problem with broken/mixed log
> lines. In addition we have all logfile entries for one message in one block.
> So it's easier to read.

I agree, Markus, I have never seen any corrupted log entries in SpamChk and
all entries for a particular message are always together, never interspersed
with other message tests.  I would love it if Declude could do this, as
well.  Oh well, can't have everything, I guess - but it doesn't hurt to
dream...  ;-)

Bill
