Kevin wrote:
> David wrote:
>
>> Relying only on the website linked, I see 2 things that ASSP doesn't have:
>>
>> 1) "Images embedded in emails are scanned for spam, even if they are
>> contained in PDFs."
>> It could either use something like the MSRBL clamav defs to detect image
>> spam, or it appears to use OCR on images and PDFs, and then passes the
>> text on to the filters, which, depending on its implementation and
>> effectiveness, could be a pretty good feature
>>
>
> ASSP, being a proxy, could probably never handle this.
> It would be best handled in a full MTA with a queue.
>

The point of ASSP is to do all the spam filtering on the front end and pass only clean messages to the MTA. Getting the MTA to do additional analysis defeats the purpose and doubles the work involved, especially since the MTA has no way of passing the text from the queued emails back to ASSP for analysis. If OCR were ever to be done, it would be done on ASSP's end. Whether it is feasible within the timeframe that ASSP has to process a message, and whether anyone is willing to write the code, is another question.

>> 2) Logfiles processable through Sawmill. I would LOVE to see an
>> improvement in ASSPs logging feature. The raw log is nice, and the "Info
>> & Stats" page is good, but they are lacking. I'd love to see breakdowns
>> of the effectiveness of different filters, such as which ones are the
>> most effective and which ones generate the most false positives (by
>> looking at errors/notspam). I'd like to see how much email my clients
>> send and receive so I can see if anyone is abusing the service.
>>
>
> I believe Sawmill had a filter for ASSP at one point.
> http://www.sawmill.net/formats/anti_spam_smtpproxy.html (google!)
>
> Also the logging in ASSP is fine, what you want is log analysis.
>

Yes, please, let's play with semantics. In that case, what I am interested in is log analysis and fine-tuning the effectiveness of the filters.
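
To make the kind of breakdown I mean concrete, here is a rough sketch in Python, not anything ASSP ships. It assumes every message saved under errors/notspam carries a header naming the filter that flagged it. The header name `X-Example-Filter` and the function names are placeholders made up for illustration; ASSP's real headers differ and would need to be substituted.

```python
from collections import Counter
from email.parser import BytesHeaderParser
from pathlib import Path


def tally_false_positives(folder, header_name="X-Example-Filter"):
    """Count the messages in `folder` per value of `header_name`.

    Assumption: the header value is the name of the filter that flagged
    the message. The default header name is a placeholder, not ASSP's
    actual format.
    """
    parser = BytesHeaderParser()  # reads headers only, skips the body
    counts = Counter()
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue
        with path.open("rb") as handle:
            headers = parser.parse(handle)
        reason = headers.get(header_name)
        # Messages without the header are grouped so they stay visible.
        key = str(reason).strip() if reason else "(no header)"
        counts[key] += 1
    return counts


def print_report(counts):
    """Print filters ordered from most to fewest false positives."""
    for name, total in counts.most_common():
        print(f"{total:6d}  {name}")
```

Running `print_report(tally_false_positives("errors/notspam"))` would then list the filters responsible for the most false positives first, which is the number I would want before raising or lowering a filter's score.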
As far as log analysis goes, it kind of falls apart for two reasons: 1) there are so many logging options that any log analyzer needs the logging done in a very specific manner before it can parse it, and if anything changes the parsing goes to hell; and 2) it all breaks anyway when Fritz decides to change how the logging is done and how the mail headers are added. Something as small as a space or spelling change will completely befuddle an analyzer trying to parse plaintext. I've seen several iterations of the "X-ASSP-Spam:" header itself; I'm not sure what else fluctuates. Sawmill entirely choked on my logfiles and didn't know what to make of them. A move to something more standardized, like XML logging, would be a positive step. XSL can be used to make the XML human-readable, and a more standard logging format would make data analysis that much easier, without having to parse plaintext so laboriously.

>> Analyzing the errors folder could be a goldmine of useful information in
>> terms of which filters are the most effective and which ones aren't. If
>> I see that a certain filter is giving me 0 false positives (like say
>> Spam Helo or Forged Helo), then I might want to increase the scoring on
>> that filter. If I see that a certain filter is too aggressive and making
>> too many false positives, then I'd want to lower the scoring on that
>> one. If certain RBLs are making too many false positives, then I'd want
>> to remove them from the mix.
>>
>
> The CCSpam and a few grep searches would probably give you the info you
> want.
>

While manually grepping through a few thousand mail messages does seem like a good time, an automated system really would be better.

>> That sort of thing would be very useful in
>> fine-tuning ASSP's performance.
>>
>
> How does adding/removing a DNSBL or changing the scoring of a test
> "fine-tune" performance?
> You're talking about effectiveness. They are not the same.
>

Again, semantics.
I was talking about performance in the sense of how well a filter performs at stopping spam.

_______________________________________________
Assp-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/assp-user
