Eric Shubert wrote:
Jon Ernster wrote:
Eric Shubert wrote:
Jon Ernster wrote:
Figured I'd share something I wrote that others here might use since
many of you have shared or helped me out as well.
I wrote this because when I go on vacation I usually shut down the
laptop that holds my mail rules, which send all my spam to the spam
folder. By the time I get back, the shell script that I run on a daily
basis (courtesy of Jake Vickers) fails with this error because there
are too many spam files:
/root/learn-spam: /usr/bin/sa-learn: /usr/bin/perl: bad interpreter: Argument list too long
/root/learn-spam: line 10: /bin/rm: Argument list too long
So I just wrote this to process the files individually. Not the fastest
script in the world (because of SpamAssassin, not because of my code,
obviously) ;), but it works.
J.
------------------------------------------------------------------------
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Std;
#~ $Id: learn-spam.pl 15 2008-07-03 17:27:09Z jernster $
my %opts = ();
getopts( 'd:u:', \%opts );
my ( $domain, $user ) = @opts{ qw( d u ) };
my $usage =<<EOF;
Usage:
$0 -d example.com -u user
EOF
die "$usage" unless ( $domain && $user );
my $dir = "/home/vpopmail/domains/$domain/$user/Maildir/cur";
my $starttime = time;
my $count = 0;
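# A directory handle must be closed with closedir, not close.
opendir(my $dh, $dir) or die "Cannot open $dir: $!\n";
my @files = readdir($dh);
closedir($dh);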
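foreach my $file ( @files )
{
    if ( $file =~ /^\./ )
    {
        # Skip ".", ".." and any other dot files.
        next;
    }
    else
    {
        my $fpfile = "$dir/$file";
        $count++;
        print "Learning SPAM - $file\n";
        # List form keeps the shell away from odd characters in Maildir names.
        system( '/usr/bin/sa-learn', '--spam', $fpfile );
        print "Deleting $file\n";
        unlink($fpfile);
    }
}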
print "Syncing databases...\n";
system("/usr/bin/sa-learn --sync");
print "De-linting files...\n";
system("/usr/bin/spamassassin --lint");
system("chown vpopmail:vchkpw /home/vpopmail/.spamassassin/*");
system("/usr/bin/qmail-spam restart");
print "Done!\n";
my $duration = time - $starttime;
print "\nTotal duration: $duration seconds\n";
print "Processed $count SPAM files.\n";
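A middle ground between one huge argument list and one sa-learn process per message would be to hand sa-learn the files in fixed-size batches. The following is only a sketch of that idea, not part of the script above: batches and learn_spam_in_batches are hypothetical names, the batch size is whatever the caller picks, and it assumes sa-learn is given several message files per invocation.

```perl
use strict;
use warnings;

# Hypothetical helper: split a list of paths into groups of at most $size.
sub batches {
    my ( $size, @paths ) = @_;
    die "batch size must be positive\n" unless $size > 0;
    my @groups;
    while (@paths) {
        push @groups, [ splice( @paths, 0, $size ) ];
    }
    return @groups;
}

# Hypothetical helper: learn one batch at a time and delete a batch only
# if sa-learn exited successfully. List-form system() bypasses the shell.
sub learn_spam_in_batches {
    my ( $size, @paths ) = @_;
    for my $group ( batches( $size, @paths ) ) {
        if ( system( '/usr/bin/sa-learn', '--spam', @$group ) == 0 ) {
            unlink @$group;
        }
        else {
            warn "sa-learn failed for a batch, leaving its files in place\n";
        }
    }
}
```

With the variables from the script, a call might look like learn_spam_in_batches( 200, map { "$dir/$_" } grep { !/^\./ } @files ). Each command line stays short, and far fewer Perl interpreters get started than with one process per message.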
Thanks, Jon. I wish we had a little more of this.
Observations:
.) I like the way you've handled parameters
.) Is this learning everything in the user's cur directory as spam? Doesn't
seem appropriate to me
.) all sa-learn and spamassassin commands need to be run as user vpopmail.
How is that happening?
.) qmail-spam is usually in /usr/sbin, not /usr/bin
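On the third point, one way to have a root-run script execute sa-learn as vpopmail is to prefix each command with sudo. A minimal sketch, assuming sudo is installed and root is allowed to run commands as vpopmail; as_user is a hypothetical helper and /path/to/message is a placeholder, neither comes from Jon's script.

```perl
use strict;
use warnings;

# Hypothetical helper: prefix a command with sudo so it runs as $user.
# -H sets HOME to the target user's home directory, which is where
# SpamAssassin keeps its per-user ~/.spamassassin state by default.
sub as_user {
    my ( $user, @cmd ) = @_;
    return ( 'sudo', '-H', '-u', $user, @cmd );
}

# Build (but do not run) the command; /path/to/message is a placeholder.
my @cmd = as_user( 'vpopmail', '/usr/bin/sa-learn', '--spam', '/path/to/message' );
print "@cmd\n";
# Passing the list to system(@cmd) would run it without going through a shell.
```

Done this way, the Bayes files would be written by vpopmail in the first place, so the chown at the end of the script should no longer be needed.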
Eric,
I created a user named spam and I forward everything to that
user/directory. You're right that it wouldn't be a good idea for someone
to run this against their personal mailbox. Alternatively, it could
easily be modified to learn email as HAM instead of SPAM.
I just run the script as root - seems to work.
Run as root, it would update root's bayes database ok (the database is
created if it doesn't exist). That's not the same database that is used
on incoming mail, though.
[EMAIL PROTECTED] ~]# perl learn-spam.pl -d dumbfounded.net -u spam
Learning SPAM -
1215351471.M922793P3482V000000000000004AI098D13EC_417.vps.dumbfounded.net,S=2432:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215351471.M922793P3482V000000000000004AI098D13EC_417.vps.dumbfounded.net,S=2432:2,
Learning SPAM -
1215370267.M407813P30062V000000000000004AI098D1994_2.vps.dumbfounded.net,S=2260:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215370267.M407813P30062V000000000000004AI098D1994_2.vps.dumbfounded.net,S=2260:2,
Learning SPAM -
1215344032.M28632P3482V000000000000004AI098D1066_416.vps.dumbfounded.net,S=3416:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215344032.M28632P3482V000000000000004AI098D1066_416.vps.dumbfounded.net,S=3416:2,
Learning SPAM -
1215381127.M194384P30062V000000000000004AI098D1A54_5.vps.dumbfounded.net,S=1986:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215381127.M194384P30062V000000000000004AI098D1A54_5.vps.dumbfounded.net,S=1986:2,
Learning SPAM -
1215361414.M459353P3482V000000000000004AI098D161C_419.vps.dumbfounded.net,S=2887:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215361414.M459353P3482V000000000000004AI098D161C_419.vps.dumbfounded.net,S=2887:2,
Learning SPAM -
1215366247.M890863P30062V000000000000004AI098D1972_0.vps.dumbfounded.net,S=20120:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215366247.M890863P30062V000000000000004AI098D1972_0.vps.dumbfounded.net,S=20120:2,
Learning SPAM -
1215343789.M956277P3482V000000000000004AI098D083E_415.vps.dumbfounded.net,S=3323:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215343789.M956277P3482V000000000000004AI098D083E_415.vps.dumbfounded.net,S=3323:2,
Learning SPAM -
1215336171.M990235P3482V000000000000004AI098D105C_414.vps.dumbfounded.net,S=2384:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215336171.M990235P3482V000000000000004AI098D105C_414.vps.dumbfounded.net,S=2384:2,
Learning SPAM -
1215379388.M717118P30062V000000000000004AI098D1A26_4.vps.dumbfounded.net,S=15039:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215379388.M717118P30062V000000000000004AI098D1A26_4.vps.dumbfounded.net,S=15039:2,
Learning SPAM -
1215368887.M166353P30062V000000000000004AI098D1980_1.vps.dumbfounded.net,S=2380:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215368887.M166353P30062V000000000000004AI098D1980_1.vps.dumbfounded.net,S=2380:2,
Learning SPAM -
1215374947.M536410P30062V000000000000004AI098D19C6_3.vps.dumbfounded.net,S=2662:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215374947.M536410P30062V000000000000004AI098D19C6_3.vps.dumbfounded.net,S=2662:2,
Learning SPAM -
1215359554.M445829P3482V000000000000004AI098D108A_418.vps.dumbfounded.net,S=6187:2,
Learned tokens from 1 message(s) (1 message(s) examined)
Deleting
1215359554.M445829P3482V000000000000004AI098D108A_418.vps.dumbfounded.net,S=6187:2,
Syncing databases...
De-linting files...
Use of uninitialized value in concatenation (.) or string at
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/Scalar/Util.pm line 30.
Use of uninitialized value in concatenation (.) or string at
/usr/lib/perl5/5.8.8/i386-linux-thread-multi/Scalar/Util.pm line 30.
Restarting spamd....
/var/qmail/supervise/spamd: up (pid 9686) 2 seconds
/var/qmail/supervise/spamd/log: up (pid 9687) 2 seconds
Done!
Total duration: 83 seconds
Processed 12 SPAM files.
[EMAIL PROTECTED] ~]# ll /home/vpopmail/.spamassassin/
total 24984
-rw------- 1 vpopmail vchkpw 20967424 Jul 6 15:00 auto-whitelist
-rw------- 1 vpopmail vchkpw 5 Jun 9 2007 bayes.mutex
-rw------- 1 vpopmail vpopmail 24648 Jul 6 15:00 bayes_journal
-rw------- 1 vpopmail vchkpw 5214208 Jul 6 15:00 bayes_seen
-rw------- 1 vpopmail vchkpw 5398528 Jul 6 15:00 bayes_toks
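In the listing above, bayes_journal has group vpopmail while the other files have vchkpw. A small check along these lines could flag files whose owner or group differs from what is expected. It is a sketch only: mismatched_owners is a hypothetical name, and the expected owner and group would be the ones from the chown line in the script.

```perl
use strict;
use warnings;

# Hypothetical helper: return the names of files in $dir whose owner or
# group name differs from the expected ones.
sub mismatched_owners {
    my ( $dir, $want_user, $want_group ) = @_;
    opendir( my $dh, $dir ) or die "Cannot open $dir: $!\n";
    my @names = grep { !/^\.\.?$/ } readdir($dh);
    closedir($dh);

    my @bad;
    for my $name ( sort @names ) {
        my @st = stat("$dir/$name") or next;    # skip files that vanished
        my $user  = getpwuid( $st[4] );
        my $group = getgrgid( $st[5] );
        $user  = $st[4] unless defined $user;     # fall back to the numeric id
        $group = $st[5] unless defined $group;
        push @bad, $name if $user ne $want_user || $group ne $want_group;
    }
    return @bad;
}
```

Calling mismatched_owners( '/home/vpopmail/.spamassassin', 'vpopmail', 'vchkpw' ) after a learning run would return the names of any files that ended up with a different owner or group.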