from:"Gary D. Margiotta"

RE: 0451.com

2006-08-07 Thread Gary D. Margiotta



On Mon, 7 Aug 2006, Sietse van Zanen wrote:


OK then, let's put this in another 'political' context:

Caring about 'legitimate' e-mail coming from those domains would be like 
caring for the few 'legitimate' bombs dropped over Iraq, Afghanistan or 
Lebanon.


It would indeed be better to have no bombs at all

-Sietse



First off, STOP top-posting.

Secondly, let's keep the political contexts, views, and any other 
personal beliefs off of this technical mailing list.  No, I am not saying 
this to express my beliefs on what you're talking about either way, this 
is no place for that type of discussion.


If you want to talk politics or whether your take on any conflict is 
right, just, legitimate, or whatever, then take it to a political 
discussion board and you can talk all day long.


Now, back on topic please.

-Gary





From: Tony Finch on behalf of Tony Finch
Sent: Mon 07-Aug-06 13:26
To: Sietse van Zanen
Cc: users@spamassassin.apache.org
Subject: RE: 0451.com



On Mon, 7 Aug 2006, Sietse van Zanen wrote:


Caring about 'legitimate' e-mail coming from these domains would be like
caring about the 'legitimate' claims of Bush saying he is a true
christian...


All-numeric domains are popular in China because they are easier for
people to deal with than alphabetic domains. For example, 263.com is
China's second-largest ISP. You can't just assume that an all-numeric
domain is necessarily abusive, any more so than Yahoo or Fastmail.

Tony.
--
f.a.n.finch  [EMAIL PROTECTED]  http://dotat.at/
FISHER: WEST OR NORTHWEST 4 OR 5 BECOMING VARIABLE 3 OR 4. FAIR. MODERATE OR
GOOD.

Re: collecting spam(maybe offtopic)

2006-07-31 Thread Gary D. Margiotta


Hello!
It may be a strange request, but I need to collect spam for a research 
project about the way spammers attack and the way they bypass antispam 
filters.
Obviously, for this project I need to collect spam in different ways and of 
different types. Also, my project can be conclusive only if the spam that I 
analyze is new and varied.
So, I would like to request your help with ways I can collect spam. I 
tried posting with this address on many Usenet groups and many mailing lists, 
but the results were not so good. Also, I can't abuse those mailing lists by 
posting, because it's not nice to make noise on mailing lists where people 
really need help.


   If you can tell me ways to get this address spammed, I would 
really appreciate it.


Just post in public forums, and sign up for all sorts of marketing 
materials or free promo accounts, and you'll get plenty.


In addition, I keep a nightly digest of all the spam we process if you'd 
like a copy.  We're up to over 7,000 spam messages per day on these 
accounts, roughly 3GB of gzip'd mbox-format mailboxes since March when we 
moved to the current servers.


-Gary




Thanks in advance,

--
Michael
[EMAIL PROTECTED]



P.S.: Please excuse my English. I'm not a native speaker.
P.S.2: I know my post is offtopic, but I hope that people who develop and 
use spamassassin will understand my request.

Re: (OT) RE: How do I assign a negative score to BAYES_00 ?

2006-07-31 Thread Gary D. Margiotta


Find a floppy disk. Format it. Move cpanel over to the floppy disk.
Remove the floppy disk from the system. Wrap the floppy in alternating
layers of foil, lead is best, and paraffin until it is about 6 thick.
Save it until the next full Moon. Take it to a graveyard. In a quiet
corner dig a hole about 6' deep with a post hole digger. Drop the
disk in making sure it lands flat. Drive a fire hardened oaken stake
through the disk and wrappings. Then backfill the hole.

Finally, edit the right files with vi or emacs.

{^_-}


Wow, you're a complete jackass.


Now *THAT* is the funniest thing I've seen on this list for quite some 
time!

Thanks... you'll never know how much I needed one today! :)


Oh, I know - this new form of image spam seems to have percolated up
to having my address on the initial deliveries. I've been trying to
nail it into a coffin and for some reason it's like nailing jello
to a tree.

(And negativescore - I did wink, ya know. Or can't you read an upside
up smiley? {^_-})

{o.o}



The subtleties are lost on the unedumacated.  People relying on cpanel 
obviously have no idea what ASCII art is, or how to comprehend it.


You were aware that if you had a gui management interface, you could put 
'sysadmin' in your title, right?  Reading through manuals and simple 
searches for already answered questions would just be too much work.  If 
you can't click on the solution, it mustn't exist, really.


I laughed myself into a coughing fit after reading this, after a long day, 
this was welcome.


:-D

-Gary

Re: spamassassin doing bad job filtering out spam

2006-07-20 Thread Gary D. Margiotta




I even lowered the required hits to 4.0 from 5.0.

for example, the latest batch of spams with your resume in the
subject:

X-Spam-Status: No, score=0.3 required=4.0 tests=BAYES_50,FORGED_RCVD_HELO,
   HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,MIME_HTML_ONLY autolearn=no
   version=3.1.3

and that's after using spamassassin -r on previous your resume spams.

Any ideas?



Check your tests, and feed more mail into Bayes, here's my scoring for a 
resume spam I just received:


X-Spam-Status: No, score=5.9 required=7.5 tests=BAYES_99,
HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,MIME_HTML_ONLY,
RCVD_IN_WHOIS_INVALID autolearn=no version=3.1.1

This was the first resume spam I've seen get through in the past couple 
days.  Mind you my threshold is much higher than yours, so at the default 
5.0 this would have been spam on an un-modified install.
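
To compare headers like the two above at different thresholds, the score and required values can be pulled out of X-Spam-Status with a few lines of shell. This is only a sketch: the function names spam_status_fields and would_flag are made up for illustration, and it assumes the header begins on a single line in the format shown above.

```shell
#!/bin/sh
# Print "score required" for each X-Spam-Status header read from stdin.
spam_status_fields() {
    sed -n 's/^X-Spam-Status: [A-Za-z]*, score=\([-0-9.]*\) required=\([-0-9.]*\).*/\1 \2/p'
}

# Succeed (exit 0) if score $1 meets or exceeds threshold $2.
would_flag() {
    awk -v s="$1" -v t="$2" 'BEGIN { exit !(s + 0 >= t + 0) }'
}
```

For instance, "would_flag 5.9 5.0" succeeds while "would_flag 5.9 7.5" fails, which matches the point that the same message is spam at the default threshold but not at a raised one.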


-Gary

Re: Bayes_00 on spam

2006-07-20 Thread Gary D. Margiotta


Hi all,
Bayes seems to be missing quite  a lot of spam. I'm getting these
results quite often:



snip

Email:63252  Autolearn: 26740  AvgScore:  14.53  AvgScanTime:  1.69 sec
Spam: 51232  Autolearn: 23252  AvgScore:  21.08  AvgScanTime:  1.68 sec
Ham:  12020  Autolearn:  3488  AvgScore: -13.40  AvgScanTime:  1.72 sec


TOP SPAM RULES FIRED
--------------------------------------------------------------
RANK  RULE NAME                COUNT  %OFMAIL  %OFSPAM  %OFHAM
--------------------------------------------------------------
   1  HTML_MESSAGE             36720    70.25    71.67   64.18
   2  BAYES_99                 35269    56.74    68.84    5.17
   3  URIBL_SBL                32502    54.28    63.44   15.22
   4  URIBL_JP_SURBL           31805    50.70    62.08    2.20
   5  URIBL_SC_SURBL           27524    43.83    53.72    1.65
   6  URIBL_OB_SURBL           22908    36.27    44.71    0.29
   7  RCVD_IN_BL_SPAMCOP_NET   22082    35.55    43.10    3.35
   8  URIBL_AB_SURBL           21789    34.63    42.53    0.96
   9  AWL                      19280    43.57    37.63   68.89
  10  RCVD_IN_XBL              17122    27.09    33.42    0.12
  11  FORGED_RCVD_HELO         15386    28.34    30.03   21.12
  12  RCVD_IN_SORBS_DUL        13501    21.49    26.35    0.74
  13  RCVD_IN_NJABL_DUL        10934    17.37    21.34    0.43
  14  BODY_GAPPY_TEXT          10888    22.04    21.25   25.40
  15  URIBL_WS_SURBL           10615    16.80    20.72    0.08
  16  NO_REAL_NAME              8883    22.63    17.34   45.18
  17  MIME_HTML_ONLY            8226    16.09    16.06   16.21
  18  MSGID_FROM_MTA_ID         7667    13.04    14.97    4.83
  19  BAYES_00                  7445    23.53    14.53   61.87
  20  SUBJ_SPAMWORD             7012    11.56    13.69    2.49
--------------------------------------------------------------




To me, it looks like Bayes_00 is hitting far too much spam.


snip

~ $ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0    2110713          0  non-token data: nspam
0.000          0     156758          0  non-token data: nham
0.000          0    1608693          0  non-token data: ntokens
0.000          0 1153323145          0  non-token data: oldest atime
0.000          0 1153446556          0  non-token data: newest atime
0.000          0 1153446557          0  non-token data: last journal sync atime
0.000          0 1153367234          0  non-token data: last expiry atime
0.000          0      43200          0  non-token data: last expire atime delta
0.000          0    1204872          0  non-token data: last expire reduction count



I have fed a large amount of mail into Bayes:


And I'm quite certain that it was fed correctly.
All of the misses I have checked have hit Bayes_00.

Any ideas why this is happening? I have toyed with the idea of lowering
the bayes_00 score. Anyone care to enlighten me on whether this would be
a bad idea and why?




Methinks you don't have enough mail trained in bayes... take a look at my 
numbers for hit count, then see how many spam and ham tokens I have in my 
bayes database.


If more training doesn't correct the scoring, you could lower the score 
for bayes_00, but mine's untouched.
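
A quick way to see how much has been trained on each side is to pull the nspam and nham counters out of the "sa-learn --dump magic" output. The sketch below assumes the usual layout, in which the value is the third whitespace-separated column; the function name bayes_counts is made up for illustration.

```shell
#!/bin/sh
# Read `sa-learn --dump magic` output on stdin and print the nspam and nham
# counters, plus the spam-to-ham ratio when nham is non-zero.
bayes_counts() {
    awk '
        /non-token data: nspam/ { nspam = $3 }
        /non-token data: nham/  { nham  = $3 }
        END {
            printf "nspam=%d nham=%d", nspam, nham
            if (nham > 0) printf " ratio=%.1f", nspam / nham
            printf "\n"
        }'
}
```

Usage would be "sa-learn --dump magic | bayes_counts". A very lopsided ratio is a hint that one side needs more training before touching the BAYES_00 score.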




Regards,
Leigh

Leigh Sharpe
Network Systems Engineer
Pacific Wireless
Ph +61 3 9584 8966
Mob 0408 009 502
email [EMAIL PROTECTED]
web www.pacificwireless.com.au






-Gary

Re: spamassassin on a mail relay

2006-06-19 Thread Gary D. Margiotta


Do any of you out there run spamassassin on a mail relay or pop/imap
server  to add the X-Spam headers to all mail that passes through your
gateway?


Yep, border MX servers which accept all mail for all domains we host, scan 
all the mail, then pass it along the line to the recipient servers.  Mail 
either gets tagged, or not, and continues on its way, no modification on 
the border machines.



If you do, how do you let individual users (who don't have accounts on
your relay) tweak their user_prefs file to whitelist things that are
not spam or otherwise tweak the rules?


Users can request a whitelisted address, we put it in the site-wide lists. 
There have been very few requests thanks to our scoring setup.  We have a 
higher scoring point (based on live testing prior to actual 
implementation) for spam, and tag it all and let it through.  We don't 
delete any mail at the gateway, that gets handled on down the line by the 
endpoint servers.



Do any of you who use spamassassin at the server level (as opposed to
the user level) use it to reject spam (versus just marking it up)?


All spam detected by SA first gets tagged by the border servers with the 
Subject: markup, as well as the X-Spam headers.  Then, depending on the 
destination server, multiple things happen.


For our mass hosting machines, all spam-tagged mail gets detected by 
Postfix header checks, and gets redirected to a set of e-mail addresses on 
our border servers for bayes training via nightly script.  Based upon 
feedback from our customers, this was the most effective way for dealing 
with the spam.  People were willing to deal with some possible FP's, as 
long as we killed most of the spam.  This is where our beta testing phase 
came in handy, so we could tweak the setup and scores, and it's been 
working like a charm since.
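
As a rough illustration of the header-check redirect described above: a Postfix regexp table can match a SpamAssassin header and use the REDIRECT action. The sketch below only prints such a rule. It assumes the tagged mail carries "X-Spam-Flag: YES", and the training address is a placeholder, not the address actually used here.

```shell
#!/bin/sh
# Print a Postfix regexp-table rule that redirects mail already tagged by
# SpamAssassin to a training mailbox.
#   $1  training address (hypothetical; substitute your own)
spam_redirect_rule() {
    printf '/^X-Spam-Flag: YES/ REDIRECT %s\n' "$1"
}
```

The output would be appended to a header_checks file that main.cf references with something like "header_checks = regexp:/etc/postfix/header_checks" (the path is an assumption and differs between systems).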


For our dedicated servers, the customer chooses the method of spam 
filtering.  Either they do the same redirect as above, they have us manage 
it via procmail rules, or they manage it internally with local mail client 
filters.  They also have the option to save mail into spam folders, and we 
routinely grab those folders, and send them over to the border servers as 
well for training.



I had this idea that something could add a url to the bottom of the
message that would let the user click on it and white/black list the
user back on the server.  Maybe something like this exists already?

I must say that in my own experience, I could not blindly reject mail
with Spamassassin because it has too many false positives with my
mail.


It all depends on your userbase, their tolerance levels, and the amount of 
training your filters get.  For us, our setup works darn near perfectly, 
and with the flexibility we have with how we handle the flow of mail, 
pretty much everyone is satisfied.




Michael Grant




-Gary

Re: Processing many mbox folders

2006-06-02 Thread Gary D. Margiotta



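#!/bin/sh
# Feed every mbox file in mail/Lists to sa-learn as ham.
cd mail/Lists || exit 1
for x in *
do
    [ -f "$x" ] || continue    # skip subdirectories
    sa-learn --ham --mbox "$x"
done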


-Gary

On Fri, 2 Jun 2006, Kenneth Porter wrote:

On Friday, June 02, 2006 9:47 PM -0400 JamesDR [EMAIL PROTECTED] 
wrote:



How many messages have you trained? You'll need 200 each to get it going,
and I recommend at least a thousand of each to really get it going.


I use procmail to distribute my mail to over a hundred folders in a large 
tree, mainly to deal with mailing lists and to separate mail from friends and 
coworkers. Has anyone come up with good tools for dealing with a hierarchy of 
mbox files when using SA? For instance, it would be convenient to have 
sa-learn start at the top of my mail/Lists hierarchy for ham training. I'd 
also like to run mass-checks against my hierarchy.

Re: Processing many mbox folders

2006-06-02 Thread Gary D. Margiotta


Gary, doesn't that presuppose that the mail/lists directory does not
contain a spam list?


Yep, but his original e-mail said mail/Lists was for ham training, nothing 
about spam, so that's why I put that in there.  It really was a quick and 
dirty answer, and in his other reply, there's more folders than just that.



I also sense a lack of spam training here. One sided Bayes training is
not a good thing.


Very true.



{^_^}



-Gary

Re: Processing many mbox folders

2006-06-02 Thread Gary D. Margiotta



Thanks, that handles the top level. ;)


Yeah, it was quick and simple for just the one scenario you had in your 
e-mail.


Me, I redirect mail using a combo of procmail and Postfix header checks to 
2 users on the border servers (hamfilter and spamfilter), then I do 2 
nightly script runs to sa-learn ham and spam.  I feed somewhere around 
6,000 spam e-mails alone nightly to sa-learn.  Maybe that's a bit much, 
but I get awesome results, and my FP rates are next to nil.  Mind you, I'm 
doing this site-wide on border servers, we pass 30k e-mails daily through 
those particular systems.



I figure I'll need to do something like:

find mail/Lists -type f -exec sa-learn --ham --mbox {} \;

(I'd need the same for mail/Friends and a few other top-level hierarchies, 
excluding my mail/Spam one. Within that tree, I need to put SpamAssassin and 
Uncaught under --spam and FalsePositives under --ham.)


Well, another way you could do it is just keep a text list of your spam 
and ham folders -


ham.txt:

mail/foo/hambox1
mail/bar/hambox1

spam.txt:

mail/foo/spambox1
mail/bar/spambox1

Then, the original for loop would work:

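#!/bin/sh
# Learn every mailbox listed in ham.txt as ham, collecting the output.
for x in `cat ham.txt`
do
    sa-learn --ham --progress --mbox "$x" >> outfile
done

cat outfile | mail [EMAIL PROTECTED]

#!/bin/sh
# Learn every mailbox listed in spam.txt as spam, collecting the output.
for y in `cat spam.txt`
do
    sa-learn --spam --progress --mbox "$y" >> outfile
done

cat outfile | mail [EMAIL PROTECTED]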

But I want to exclude my .imap folders created by the dovecot IMAP server to 
hold state data. I might also need to wrap sa-learn in a script to lock the 
mailboxes against modification by dovecot and procmail (my LDA).


To build the original text files, you could use find, or edit by hand. 
This way you could build a list of your mailboxes, and you can 
include/exclude whatever you want.
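
Building the list with find might look like the sketch below. It assumes the layout described in the question (dovecot state in .imap directories, spam under a directory named Spam); the function names and the SA_LEARN override are made up for illustration, and nothing here locks the mailboxes.

```shell
#!/bin/sh
# List every regular file under the given top-level directories, skipping
# anything inside a .imap directory (dovecot state) and anything under a
# directory named Spam. One path per line, suitable for ham.txt.
list_ham_mboxes() {
    find "$@" -type f ! -path '*/.imap/*' ! -path '*/Spam/*' | sort
}

# Feed each mailbox path read from stdin to the learn command.
#   $1  --ham or --spam
# SA_LEARN can be overridden (e.g. SA_LEARN=echo) for a dry run.
learn_from_list() {
    while IFS= read -r mbox; do
        ${SA_LEARN:-sa-learn} "$1" --mbox "$mbox" < /dev/null || return 1
    done
}
```

For example, "list_ham_mboxes mail > ham.txt" builds the list once, and "learn_from_list --ham < ham.txt" replays it. Reading line by line also copes with folder names that contain spaces, which the backtick loop does not.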


If you have those boxes as active, then yes.  But then again, if you learn 
a mailbox that you used to learn before, then it's a waste of cycles for 
the mails sa has already seen.



And what would be the equivalent for mass-checks?


Don't use those, sorry...

-Gary

Re: Managing Spamassassin Data

2006-04-17 Thread Gary D. Margiotta





2. Is there a way I can put the razor-agent.log into multilog? If not,
how do I rotate this log file?



For myself on FreeBSD, I installed by source, not by port, so adjust your 
configs as necessary, but I use the newsyslog facility (/etc/newsyslog.conf) to 
rotate the log files with the nightly checks:


The maillog is rotated nightly:
/var/log/maillog    640  120   *  @T00  JC

So, I added another entry for my spam log:
/var/log/spam.log   640  120   *  @T00  JC

I've added several logfiles to the file to auto-rotate, such as named, and 
it works like a charm.


My relevant config bits:

How I start spamd:
/usr/local/bin/spamd --daemonize --username spamd --max-children=20 
--min-spare=5 --pidfile /home/spamd/spamd.pid -s local5

(notice the local5 part at the end, which defines the local5 syslog 
identifier)


The relevant syslog config:
local5.*            /var/log/spam.log

Hope this helps.

-Gary
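
One way to double-check the syslog side of a setup like this is to look up where a selector is routed. The sketch below assumes a classic syslog.conf with whitespace-separated selector and destination columns; the function name facility_logfile is made up for illustration.

```shell
#!/bin/sh
# Print the destination configured for a selector in a syslog.conf-style file.
#   $1  selector, e.g. 'local5.*'
#   $2  configuration file (defaults to /etc/syslog.conf)
facility_logfile() {
    awk -v sel="$1" '$1 == sel { print $2 }' "${2:-/etc/syslog.conf}"
}
```

If "facility_logfile 'local5.*'" prints the expected file, sending a test line with "logger -p local5.info test" should make it show up there once syslogd has reloaded its configuration.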

Re: Postfix/SpamAssassin Integration

2006-04-07 Thread Gary D. Margiotta

Attached is what I use, found it on a webpage about installing SA when I 
was going through it.  Customized slightly for my local usernames and ways 
of doing things.


When spamd dies, all mail continues to come through, it just doesn't get 
analyzed by SA until spamd gets restarted.


Here's my config bits:

Postfix:

master.cf:

smtp      inet  n       -       n       -       -       smtpd
    -o content_filter=spamchk:dummy

spamchk   unix  -       n       n       -       20      pipe
    flags=Rq user=spamfilter argv=/usr/local/bin/spamchk -f ${sender} -- ${recipient}



attached files:

spamchk - the filter script that gets called, pushes messages over to 
spamc... note I commented out the bottom half of the script since I don't 
use that functionality currently, but may on other boxes in the future so 
I left it there for reference.
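
The attachment itself is not reproduced in the archive. As a rough idea of what a filter of this kind does, here is a minimal sketch: it is not the attached spamchk, and the paths, the function name filter_message and the overridable SPAMC and SENDMAIL variables are all assumptions. It pipes the message through spamc, then hands the result back to Postfix with the same arguments the pipe transport passed in.

```shell
#!/bin/sh
# Hypothetical stand-in for a spamc-based content filter (not the original).
SPAMC="${SPAMC:-/usr/local/bin/spamc}"
SENDMAIL="${SENDMAIL:-/usr/sbin/sendmail -i}"
EX_TEMPFAIL=75    # tells Postfix to keep the message queued and retry later

# Read a message on stdin, run it through spamc, and re-inject the result.
# Arguments are passed on to sendmail unchanged (-f sender -- recipients).
filter_message() {
    tmp=$(mktemp "${TMPDIR:-/tmp}/spamchk.XXXXXX") || return $EX_TEMPFAIL
    if ! $SPAMC > "$tmp"; then
        rm -f "$tmp"
        return $EX_TEMPFAIL
    fi
    $SENDMAIL "$@" < "$tmp"
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

Used as the script behind the master.cf entry, the last line would be filter_message "$@". As far as I know, spamc hands the message back unchanged when it cannot reach spamd, which fits the behaviour described above where mail keeps flowing while spamd is down.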


spamdcheck.sh - wrote this script to run every 5 minutes to check to see 
if spamd is running.  I've had instances where spamd just dies in the 
middle of the night, but leaves the pidfile there, so I wrote this to 
check and restart... might be crude, if anyone has suggestions on 
bettering it, please do (also it monitors the number of spamd children to 
tell me if I need to adjust child parameters if I'm running too many 
processes).
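
One possible refinement, offered only as a sketch: rather than counting ps lines (which also counts the grep itself), ask the kernel whether the process named in the pidfile is still alive. The function name pid_alive is made up for illustration, and the check has to run as root or as the user that owns spamd, since kill -0 fails on processes it is not allowed to signal.

```shell
#!/bin/sh
# Succeed if pidfile $1 exists and names a process that is still alive.
pid_alive() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;    # empty or not a number
    esac
    kill -0 "$pid" 2>/dev/null      # signal 0 tests for existence only
}
```

A stale pidfile left behind by a dead spamd then shows up as "pid_alive /home/spamd/spamd.pid" failing even though the file exists, which is the case the restart logic needs to catch.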


Any questions, let me know.

-Gary

On Fri, 7 Apr 2006, James Keating wrote:


Michael Monnerie wrote:

On Freitag, 7. April 2006 14:09 James Keating wrote:

Any other thoughts?


I just found this:
http://wiki.apache.org/spamassassin/IntegratePosfixViaSpampd

mfg zmi


I have already tried this script and it was very close to what I was wanting, 
but it does not connect to spamd in any manner.  It actually uses the perl 
libraries to interact with spamassassin in its own manner, plus it is not 
designed to use per user preferences/bayes/awl.


Thanks anyway Michael.

- James
#!/bin/sh

#
# SpamAssassin Spamd checking script
#
# Original script written by Gary Margiotta ([EMAIL PROTECTED]) 3/2006
#
# Run the check to see if spamd is running by running a ps and checking the
# number of lines returned.  If the test returns with less than 3 process
# lines, assume that spamd is not running, since there should be no less than
# 6 processes active at any given time.  In that case, check for a stale
# pidfile, remove it and then restart spamd with the usual startup parameters,
# and mail the output to the admin to let them know the process died and was
# restarted automatically.
#

PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin
export PATH

DATE=`date +%Y%m%d%H%M`
SPAMDHOME=/data/home/spamd
LOGFILE=/tmp/spamdrestart-${DATE}.txt
PIDFILE=spamd.pid
PSCHECK=`ps -ax | grep spamd | wc -l`
PSLOG=/tmp/pschecksa.log

# Running the check and outputting to logfile for testing purposes
if [ -f ${PSLOG} ];
then
    rm -f ${PSLOG}
fi

echo ${PSCHECK} > ${PSLOG}

#
# As an aside, check to see whether we need to adjust the number of child
# processes running.
#

if [ ${PSCHECK} -gt 16 ];
then
    echo "${DATE}" > ${LOGFILE}
    echo "" >> ${LOGFILE}
    echo "spamd children exceeded 15, consider bumping max" >> ${LOGFILE}
    echo "" >> ${LOGFILE}
    cat ${LOGFILE} | mail [EMAIL PROTECTED]
    exit 0
fi

# Here's the meat of it
if [ ${PSCHECK} -le 5 ];
then
    echo "${DATE}" > ${LOGFILE}
    echo "###########################################################" >> ${LOGFILE}
    echo "#                                                         #" >> ${LOGFILE}
    echo "# spamd doesn't appear to be running, attempting restart  #" >> ${LOGFILE}
    echo "#                                                         #" >> ${LOGFILE}
    echo "###########################################################" >> ${LOGFILE}

    #
    # Checking for an existing pidfile
    #
    if [ -f ${SPAMDHOME}/${PIDFILE} ];
    then
        echo "" >> ${LOGFILE}
        echo "Old pidfile found, removing..." >> ${LOGFILE}
        rm -f ${SPAMDHOME}/${PIDFILE}
        echo "${SPAMDHOME}/${PIDFILE} removed." >> ${LOGFILE}
        echo "" >> ${LOGFILE}
    fi

    echo "" >> ${LOGFILE}
    echo "Restarting spamd..." >> ${LOGFILE}
    spamd --daemonize --username spamd --max-children=20 --min-spare=5 \
        --pidfile ${SPAMDHOME}/${PIDFILE}

Re: Best way to send spam for learning from OE and Outlook

2006-04-06 Thread Gary D. Margiotta


On Thu, 6 Apr 2006, Patrick Sherrill wrote:

What is the best way to send spam candidates from Outlook and Outlook Express 
to spamassassin for learning?


Here, I have a generic spam address on my border servers running SA.

For the users, I have them set up a rule to send tagged spam to that 
account (it's aliased from a base address, so if the backend ever changes, 
it's a simple edit to the alias, and all is well again), and then I run a 
nightly script to process the spam mailbox for auto-learning.  I also have 
the same setup for ham, in case anyone gets an FP, or just wants to help 
train SA for good mail.


Currently, I'm averaging slightly over 4,000 messages per night that end 
up in the spam mailbox, less than 10 in the ham mailbox.  Some of it is 
auto-redirected by some of the customer servers, the rest is being fed in 
by customers through this process.


Works quite well, as the FP rate is next to nil here, so we don't worry 
too much about mis-training SA.  As part of the script I archive the 
nightly mailboxes, so if a user encounters an FP, it can easily be 
re-processed as ham if needed.  This also helps if I need to bring up a 
new border server, I can run all the archived mailboxes into it to train 
it so that it gets up to speed much quicker.
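
For anyone wanting to try the same learn-then-archive routine, a sketch might look like the following. It is not the nightly script mentioned here: the function name, the archive naming scheme and the SA_LEARN override are assumptions, and it does no mailbox locking, so it should run when deliveries to the training mailbox are quiet.

```shell
#!/bin/sh
# Learn one mbox as spam or ham, archive it compressed under a dated name,
# then empty the live mailbox.
#   $1  --spam or --ham
#   $2  path to the mbox file
#   $3  archive directory
# SA_LEARN can be overridden (e.g. SA_LEARN=echo) for a dry run.
learn_and_archive() {
    mode=$1 mbox=$2 archive_dir=$3
    [ -s "$mbox" ] || return 0          # nothing collected, nothing to do
    stamp=$(date +%Y%m%d)
    name=$(basename "$mbox")
    ${SA_LEARN:-sa-learn} "$mode" --mbox "$mbox" || return 1
    gzip -c "$mbox" > "$archive_dir/$name-$stamp.gz" || return 1
    : > "$mbox"                         # truncate only after a good archive
}
```

Keeping the dated archives is what makes the two recovery cases above cheap: a false positive can be pulled out and re-learned as ham, and a new server can be seeded by replaying the archives through sa-learn.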


If you'd like more info, including a copy of my nightly scripts, let me 
know.


-Gary


TIA.
Pat...

RE: Which Operating Systems Do You Use and Why?

2006-04-06 Thread Gary D. Margiotta


On Thu, 6 Apr 2006, Gustafson, Tim wrote:


I have been using FreeBSD in a production environment for almost 10
years now (since version 2.2.5!) and have absolutely NO complaints about
it.  I've regularly had servers with uptimes in excess of 6 months, and
even those were just rebooted for kernel updates and the like.

The ports tree is excellent, well-maintained and can be used as either
binary packages or source code updates.

Tim Gustafson
MEI Technology Consulting, Inc
[EMAIL PROTECTED]
(516) 379-0001 Office
(516) 908-4185 Fax
http://www.meitech.com/




^^^ What he said...

I started with 2.1.5, and haven't looked back.  I use some linux boxes for 
mostly workstation type use, in-house server here and there, but really no 
production servers of mine run Linux (couple customers do, but not for my 
stuff).  Also run some Solaris boxes, Sparcs, no Solaris i386, hardware 
support was atrocious in earlier versions, might be better now, but if I'm 
running x86 (or x64), it's BSD or Linux.  Was never a huge fan of redhat, 
will one day try some other distros, when I have time (yeah, right), but 
with FreeBSD, It Just Works, and no need to change.


The answer tho is use what you know, and feel confident working with.  Use 
what you know will get the job done, done right, time and again, and give 
you and your customers the least amount of headaches.


FreeBSD is mainly more geared towards server use (IMO), set it and forget 
it in the closet.  It just chugs along, you never know it's there.  My 
uptimes are ridiculous, and they only go down when I upgrade system pieces 
like the kernel or for critical security patches.  Never had a base system 
compromise (user installed software excluded) in over 10 years, never had 
a system crash unless it was hardware or admin error (i.e., servers never 
brought to their knees by attacks), and I'll swear by its reliability.


And the answer to other posts, FreeBSD has both source and binary upgrades 
for both packages, and base system and security parts to my knowledge, 
though I've only used the binary packages sparingly here and there, 
everything else is source-built, including world (which is FreeBSD's 
way of upgrading the system in place).


-Gary

Re: Best Practices: SpamAssassin

2006-03-30 Thread Gary D. Margiotta


(sorry for the top-post)

Ryan,

I use SA with Postfix on FreeBSD in a border MX gateway solution for our 
customers, which would serve your store and forward requirement to 3 
geographic locations, with some nightly scripts to do auto-learning.  The 
border servers accept all mail for our domains, process through SA, then 
tag and forward through transport maps, and the host servers finish 
processing and delivery for the users.


This doesn't cover all of your requirements, but it may be useful combined 
with other input.  Message me privately for more detail and discussion if 
you'd like.


-Gary

On Thu, 30 Mar 2006, Ryan Kather wrote:

I am about to evaluate SpamAssassin as a replacement in my environment 
for our present spam solution (Symantec Mail Security for SMTP without 
the BrightMail add-on).


I wish to compare SpamAssassin's performance directly with DSPAM, 
Brightmail, and a Barracuda Spam Filtering Appliance.  I also intend to 
publish my findings and test configurations to help other people make a 
decision.


So I'm writing to ask if anyone would like to provide some insight into 
the best practices for making SpamAssassin as effective as possible.


Environment Details:
Users: 4000
Mail System: GroupWise 6.0.4 (LDAP enabled)
Domains: 3
Replication: 3 Geographically Dispersed Locations
Spam Filter: Symantec Mail Security for SMTP sans Brightmail
Configuration: Spam Filter Store and Forward Gateway (non authenticated)
User Proficiency: Some Power Users, Many Non-Technical Users
User Mood: Very Impatient and Demanding


Ideas:  Postfix- I would prefer to use SpamAssassin as a store 
and forward mail filtering relay appliance.  It seems if I place a 
Postfix Linux MTA in front of my existing spam solution I could setup 
test groups.  100 users could be forwarded to the SpamAssassin test box 
and passed internally to GroupWise.  100 users could be forwarded to the 
DSPAM test box and passed internally to GroupWise.  The rest of the 
users would be forwarded to the Symantec Mail Security Gateway and 
passed internally to GroupWise (until such time that a selected solution 
can be enabled and Symantec disabled).  I would prefer to use LDAP to 
validate recipients for SpamAssassin and DSPAM which should be possible 
with Postfix.


I think I could accomplish this scenario with Postfix Transports, though 
I may need to run multiple instances of Postfix.  Does anyone see a flaw 
in this?


SpamAssassin- Now here is where I need the help (assuming my postfix 
section was sound).  I want to make sure this is as optimized as 
possible to provide a fair performance picture versus DSPAM and 
Barracuda.


It appears many seem to be using the Amavisd-new + Postfix + 
SpamAssassin configuration.  Is there a reason not to use this design? 
I have had good luck with this in the past.


I also have read a lot where people are improving accuracy by increasing 
the scoring of the Bayesian database (which needs training).  What would 
the optimal training method be, given my environment?  I could create a 
shared GroupWise IMAP folder for unclassified spam with a cron job to 
read this into sa-learn.  I cannot have a central IMAP folder for false 
positives, however, as other users must not be able to view the email 
for other users.  How can I ensure user false positives are easily 
reportable?  What do others do to train the Bayesian database? 
Maia-Mailguard?


I could pretty much trust a small subset of users to be fairly regular 
in their training.  There is a somewhat larger portion of users who 
would train here and there.  Lastly, the largest portion of users may 
never train.  We also do not know which user belongs to which group 
(yet).  With this scenario it seems that I will have to use some kind of 
common database.  In the default configuration SA uses one Bayesian 
database for all users.  Is there a reason to change this?  What is the 
consensus on a shared ruleset versus individual rulesets?


It also seems that there is a falling out between pyzor, dcc, razor, and 
the community.  Is it simply a licensing issue (with legal 
implications), or are these systems flawed otherwise?  What alternatives 
are there?  Do I even need this functionality?  Has anyone seen a 
detriment to SpamAssassin's performance without DCC, Pyzor, or Razor?


What about an initial corpus to train the Bayesian database?  Will this 
hurt my accuracy in the long term?  What corpuses are being used?  Am I 
better off letting the Bayesian autolearn gradually perform this 
function?


SpamAssassin is typically represented as a magic dance of tweaking 
rules.  Are the default rule thresholds good values to start at?  How 
can I adequately decide which rules to tweak and how much to tweak them 
by?  In other words, how do you manage your adjustments without users 
noticing wide spam classifying variations?


Also, in regards to rules.  What is the preferred method for
