On 3/23/2014 8:05 AM, Angus McIntyre wrote:
>
> On Mar 20, 2014, at 6:34 PM, Eric Broch <[email protected]> wrote:
>> Your welcome. Since November, I've created a much easier automated
>> install here <ftp://ftp.whitehorsetc.com/pub/dspam/>. Be sure to look
>> at the Readme file. And, as always, check the script.
>
> Hmm. That seems to be an FTP link. I tried logging on as 'guest', but
> it doesn't seem to want to talk to me.
>
> I'm not really convinced by dspam yet. Untrained, it classifies
> everything as 'Innocent'. I fed it a massive corpus of spam and it
> then classified everything as 'Spam'. So I blew everything away and
> started over. This time, I've been feeding it an unrecognized spam
> (which is to say, all of it) in correction mode (i.e. --source=error).
> This is having a limited effect. After feeding it many hundreds of
> spams, it still believes that all my spam is actually 'Innocent', but
> at least I've shaken its confidence a bit - it's now only 85%
> convinced that 'Pro Viagra for Men' is a valid message.
>
> It looks like I will have a lot more training to do before I can
> persuade it to successfully recognize any spam at all … and then only
> for the particular user that I've trained. I'm also concerned that
> many of the messages I see are filled with hash buster text, which is
> designed specifically to dodge and poison statistical filters like dspam.
>
> Apologies if this is slightly off-topic, but given that dspam is under
> consideration for future QMT releases, I felt that I should share my
> experience. It's certainly not looking like a magic bullet to me at
> the moment.
>
> Angus

Angus,
The FTP site should work now. My firewall was blocking it for some reason (I was testing fail2ban).

Anyway, for my setups I did not train with the 'corpus' setting, only with 'error'. To train on 'error', the message must carry a DSPAM header, which I configured to be placed in the email header, NOT in the message body.

On my own machine I trained (as error) about 30 spam messages that DSPAM had marked as Innocent, and now I get 1 spam a month, if that. On my client's site, which receives about 60,000 emails a month on average, I trained between 100 and 200 messages the same way, with similar results.

I read through the user's email directory and catenate (cat) each spam marked as Innocent through the dspam client program as follows:

    cat "$email" | dspamc --user user@domain --mode=teft --class=spam --source=error

The results have been excellent.

EricB
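P.S. For anyone who wants to run that over a whole folder, here is a minimal sketch of the loop, not the exact script I use. It assumes things not stated above: that you have first moved hand-confirmed spam into a directory of its own, that DSPAM writes its verdict in an `X-DSPAM-Result` header, and that the helper names and the `DSPAMC` override variable are just illustrative. The dspamc flags are the same ones shown in the command above.

```shell
#!/bin/sh
# Sketch: retrain DSPAM on spam it misclassified as Innocent.
# Assumptions (not from the original post): the directory argument,
# the X-DSPAM-Result header name, and the DSPAMC override variable.

DSPAMC="${DSPAMC:-dspamc}"   # client binary; overridable for dry runs

# is_marked_innocent FILE
# Succeeds if the header block (everything up to the first blank line)
# contains a DSPAM result header saying Innocent.
is_marked_innocent() {
    sed '/^$/q' "$1" | grep -qi '^X-DSPAM-Result: *Innocent'
}

# retrain_spam_dir USER DIR
# Feeds every message in DIR that DSPAM marked Innocent back as a spam
# error. DIR should hold only messages confirmed by hand to be spam,
# because legitimate mail carries the same Innocent marking.
retrain_spam_dir() {
    user="$1"
    dir="$2"
    count=0
    for email in "$dir"/*; do
        [ -f "$email" ] || continue
        is_marked_innocent "$email" || continue
        "$DSPAMC" --user "$user" --mode=teft --class=spam --source=error < "$email" \
            && count=$((count + 1))
    done
    echo "retrained $count message(s)"
}
```

Usage would look something like `retrain_spam_dir user@domain /path/to/confirmed-spam`, where the path is a placeholder for wherever you collect the missed spam. Setting `DSPAMC=true` before calling it gives a dry run that counts matching messages without touching the DSPAM database.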
