Re: sa-learn question
28.10.2011 4:38, Ricardo Ardila Vetrovec wrote:

> Greetings list! Excuse my English, it is not so good.
>
> I have a standalone SpamAssassin server; my MX is another server. Postfix queries SpamAssassin for a score, and that works great.
>
> My question is about spam that is not recognized. On the SpamAssassin server I created a mailbox so users can redirect the emails they consider spam, and I run:
>
> sa-learn --spam -u spamd --mbox /var/mail/spam
>
> Does this method work? With the redirection of the mail all headers change; that's my doubt. Any help on this topic?

That is not optimal; it may even be bad.

How are the users accessing their mail? If with POP, I can't figure out a solution. If with IMAP, you can create a Confirmed-SPAM folder for them, and they can drag the spam to that folder. Then use a cron job to learn those as spam... No extra headers, nothing.

--
Best of all is never to have been born. Second best is to die soon.
Re: sa-learn question
On Fri, 28 Oct 2011 18:24:47 +0300 Jari Fredriksson wrote:

> 28.10.2011 4:38, Ricardo Ardila Vetrovec wrote:
>> on the spamassassin server i create a mail box so users can redirect the emails they consider spam ... question is: is this method works? with the redirection of the mail all headers change, that's my doubt
>
> That is not optimal, it may be even bad. How are the users accessing their mail? If with POP, I can't figure out a solution. If with IMAP, you can create a Confirmed-SPAM folder for them, and they can drag the spam to that folder. Then use a cron job to learn those as spam... No extra headers, nothing.

There are two classic solutions. One is to have learning folders (IMAP or webmail); the other is to forward as an attachment and have a script that extracts the original from the MIME.

A simple redirect may work well enough, but there are (at least) a couple of problems.

Firstly, sa-learn should ideally be able to find the trusted and internal networks to reproduce the same tokenization that it does on classification. It would be useful to strip any Received headers that break this.

Secondly, if you use autolearning then sa-learn must be able to identify whether a mail has been previously learned, and this requires that all additional Received headers be stripped.
Re: Sa-learn question
On Thu, June 5, 2008 12:01, alexpacio wrote:

> Hello, does anybody know whether, when I teach spam to SpamAssassin through the Bayesian trainer sa-learn, I need to delete the X-Spam-Status: and X-Spam-Checker-Version: lines from the header for SpamAssassin to learn well?

No. Training needs unmodified spam/ham mails, with headers as you got them:

sa-learn --spam --showdots /tmp/spammail.msg
sa-learn --ham --showdots /tmp/hammail.msg

SpamAssassin automatically removes the headers that it put in the mails itself.

Benny Pedersen
Re: Sa-learn question
alexpacio wrote:

> Hello, does anybody know whether, when I teach spam to SpamAssassin through the Bayesian trainer sa-learn, I need to delete the X-Spam-Status: and X-Spam-Checker-Version: lines from the header for SpamAssassin to learn well?

SpamAssassin will remove any headers that it added itself prior to learning the message, including those. So, it's fine to leave them in.

However, SpamAssassin won't remove any nonstandard headers added by other spam scanning tools, or by wrappers for SpamAssassin that add their own headers (e.g. MailScanner). For those, you'll need to use a bayes_ignore_header directive, or strip them before feeding SA.
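For the "strip them before feeding SA" option, a minimal shell sketch follows. The function name strip_header, the header name X-Other-Scanner and the file paths are made up for illustration; a bayes_ignore_header line in local.cf is the alternative that leaves the message itself untouched.

```shell
#!/bin/sh
# strip_header NAME: copy a message from stdin to stdout, dropping every
# header field called NAME (case-insensitive) together with its folded
# continuation lines. The body is passed through unchanged.
# Hypothetical helper, not part of SpamAssassin.
strip_header() {
    awk -v name="$1" '
        BEGIN { pat = tolower(name) ":"; inhdr = 1; skip = 0 }
        inhdr && /^$/ { inhdr = 0; skip = 0 }      # blank line ends the headers
        inhdr {
            if ($0 !~ /^[ \t]/)                    # a new header field starts here
                skip = (index(tolower($0), pat) == 1)
            if (skip) next                         # drop the field and its continuations
        }
        { print }
    '
}

# Example (header name and paths are assumptions):
#   strip_header X-Other-Scanner < /tmp/spammail.msg > /tmp/spammail.clean
#   sa-learn --spam --showdots /tmp/spammail.clean
```

Only the header section is filtered, so a line in the body that happens to start with the same name is kept.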
Re: sa-learn question
Hungry Snail wrote:

> Site-wide is what i'm trying to setup, I guess i need to do some more googling :)

Assuming you're using db_file, not SQL:

First create a path where you want your bayes DB to live, and make that directory world RWX (i.e. chmod 0777).

In your /etc/mail/spamassassin/local.cf:

bayes_path your directory/bayes
bayes_file_mode 0777

Gotchas people often run into:

1) DO NOT use that directory for anything else. If there are any other files starting with bayes_ it will screw up the file locking.

2) bayes_path doesn't actually specify a path; it's a path plus a partial filename. You NEED the extra /bayes on the end, because it is part of the filenames SA uses to create its database files. SA will append _seen, _toks, etc. as needed to create bayes_seen (seen message database) and bayes_toks (token database).

3) Yes, the mode needs to be 0777, not 0666, as it is sometimes used in creating directories. Really, bayes_file_mode is a mask, not an explicit mode. SA will not create its db files with the X bit, even if this is set to 0777.
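The setup steps above can be sketched as a small shell helper. The function name make_bayes_dir and the example path are assumptions for illustration; the two config lines are the ones given in the message.

```shell
#!/bin/sh
# make_bayes_dir DIR: create a dedicated, world-writable directory for the
# site-wide Bayes DB and print the matching local.cf lines on stdout.
# Hypothetical helper; the directory should hold nothing but the Bayes files.
make_bayes_dir() {
    dir=$1
    mkdir -p "$dir" || return 1
    chmod 0777 "$dir" || return 1
    # bayes_path is a directory plus a filename prefix, hence the trailing /bayes
    printf 'bayes_path %s/bayes\n' "$dir"
    printf 'bayes_file_mode 0777\n'
}

# Example (the path is an assumption):
#   make_bayes_dir /var/spool/spamassassin/bayesdb >> /etc/mail/spamassassin/local.cf
```

Printing the config lines instead of editing local.cf directly keeps the helper safe to run more than once; appending is left to the caller.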
Re: sa-learn question
Matt Kettler wrote:

> Hungry Snail wrote:
>> Site-wide is what i'm trying to setup, I guess i need to do some more googling :)

Also, I've updated the wiki article on sitewide bayes. It is now at least technically correct.

http://wiki.apache.org/spamassassin/SiteWideBayesSetup

Previously it had several bits of bad advice:

Don't use /etc/mail/spamassassin to store your bayes DB.

Don't specify -C on the sa-learn command line. You REALLY don't want to use that option on any SA tool unless you know exactly what you're doing. (This option is really mostly for testers and developers.)

Use init scripts to restart spamd.
Re: sa-learn question
Matt Kettler-3 wrote: previously it had several bits of bad advice: Don't /etc/mail/spamassassin to store your bayes DB Don't specify -C on the sa-learn command-line. You REALLY don't want to use that option on any SA tool unless you know exactly what you're doing. (This option is really mostly for testers and developers.) Use init scripts to restart spamd Thanks for all your advice Matt, much appreciated.
Re: sa-learn question
Hungry Snail wrote: Hi Guys, I am using spam/notspam via Squirrelmail If I mark an email as spam squirrelmail send this command. COMMAND USED TO REPORT: /usr/bin/sa-learn --spam --configpath=/etc/mail/spamassassin --showdots /var/spool/squirrelmail/attach//sb_tmp_174_1205370641 The result I get is.. [0] = Learned tokens from 0 message(s) (1 message(s) examined) Does the result look correct? I was just wondering why is has 0 learned tokens from 0 messages. That generally suggests the message was already learned as spam, therefore no action was needed for the 1 message it examined, and no learning was performed.
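When sa-learn is called from a webmail plugin or cron, it can help to check the summary line programmatically. A minimal sketch, assuming the summary format shown in the message above; the function name learned_count is made up for illustration.

```shell
#!/bin/sh
# learned_count: read sa-learn output on stdin and print
# "<learned> <examined>" taken from its summary line.
# Hypothetical helper; prints nothing if no summary line is found.
learned_count() {
    sed -n 's/.*Learned tokens from \([0-9][0-9]*\) message(s) (\([0-9][0-9]*\) message(s) examined).*/\1 \2/p'
}

# Example: report when a message was examined but nothing was learned
#   set -- $(sa-learn --spam "$msg" | learned_count)
#   [ "$1" -eq 0 ] && [ "$2" -gt 0 ] && echo "nothing new learned from $msg"
```

A result of "0 1" matches the case discussed here: one message examined, none learned.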
Re: sa-learn question
Hungry Snail wrote:

> Hi Guys, I am using spam/notspam via Squirrelmail. If I mark an email as spam squirrelmail sends this command.
> COMMAND USED TO REPORT: /usr/bin/sa-learn --spam --configpath=/etc/mail/spamassassin --showdots /var/spool/squirrelmail/attach//sb_tmp_174_1205370641
> The result I get is..
> [0] = Learned tokens from 0 message(s) (1 message(s) examined)
> Does the result look correct? I was just wondering why it has 0 learned tokens from 0 messages. Regards

That's what I thought, but I forwarded the message to myself and it didn't get flagged as spam; it was also a message that was received before SpamAssassin was set up.

I did sa-learn --dump magic and this is what I got back:

0.000 0 3 0 non-token data: bayes db version
0.000 0 0 0 non-token data: nspam
0.000 0 0 0 non-token data: nham
0.000 0 0 0 non-token data: ntokens
0.000 0 0 0 non-token data: oldest atime
0.000 0 0 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count

Is the command I'm using correct? I want the spam/ham rules to apply to everyone and not have it on a per-user basis.

Regards
Re: sa-learn question
Hungry Snail wrote:

> Hungry Snail wrote:
>> Hi Guys, I am using spam/notspam via Squirrelmail If I mark an email as spam squirrelmail send this command.
>> COMMAND USED TO REPORT: /usr/bin/sa-learn --spam --configpath=/etc/mail/spamassassin --showdots /var/spool/squirrelmail/attach//sb_tmp_174_1205370641
>> The result I get is..
>> [0] = Learned tokens from 0 message(s) (1 message(s) examined)
>> Does the result look correct? I was just wondering why is has 0 learned tokens from 0 messages. Regards
>
> Thats what I thought, but I forwarded the message to myself and it didnt get flagged as spam, it was also a message that was received before spamassassin was setup.

Why would forwarding a message to yourself be a valid test? Or do you mean something different, like resubmitting the raw message to your mail queue?

Generally speaking, forwarded messages generated by a mail client are *COMPLETELY* different from the original. New headers, new Received path, new body encoding, possibly removal of the text/plain section of a multipart/alternative message, probably new line wrapping.

Forwarding doesn't forward the same message. It forwards some rendering of the text parts; the rest is mangled by your MUA.

Try redirecting or piping the raw message to spamassassin -t, like you did with sa-learn.

> I did sa-learn --dump magic and this is what I got back.
> 0.000 0 3 0 non-token data: bayes db version
> 0.000 0 0 0 non-token data: nspam
> 0.000 0 0 0 non-token data: nham
> 0.000 0 0 0 non-token data: ntokens
> 0.000 0 0 0 non-token data: oldest atime
> 0.000 0 0 0 non-token data: newest atime
> 0.000 0 0 0 non-token data: last journal sync atime
> 0.000 0 0 0 non-token data: last expiry atime
> 0.000 0 0 0 non-token data: last expire atime delta
> 0.000 0 0 0 non-token data: last expire reduction count
> is the command im using correct?

That's quite suspect. Did you run it as the same user as the sa-learn? You might want to try the sa-learn again with -D to see what the debugging has to say.

> I want the spam/hame rules to apply to everyone and not have it on a per user basis. Regards
Re: sa-learn question
Matt Kettler-3 wrote:

> Did you run it as the same user as the sa-learn? You might want to try the sa-learn again with -D to see what the debugging has to say.

Bah, it works fine if I issue the command via ssh:

sa-learn --spam --configpath=/etc/mail/spamassassin --showdots /var/vmail/mydomain.tld/user/cur/1205274708.P15843Q0M957724.host:2,
.
Learned tokens from 1 message(s) (1 message(s) examined)

My issues seem to relate to squirrelmail issuing the command when I click the spam button. I wonder if it is trying to issue the command as the www-data user.
Re: sa-learn question
Hungry Snail wrote: Matt Kettler-3 wrote: Did you run it as the same user as the sa-learn? You might want to try the sa-learn again with -D to see what the debugging has to say. Bah, it works fine if I issue the command via the ssh. sa-learn --spam --configpath=/etc/mail/spamassassin --showdots /var/vmail/mydomain.tld/user/cur/1205274708.P15843Q0M957724.host:2, . Learned tokens from 1 message(s) (1 message(s) examined) My issues seem to be relating to squirrelmail issuing the command when I click the spam button, I wonder if it is trying to issue the command via the www-data user. Quite likely. Regardless, unless you've got a site-wide single bayes db, it needs to run as whatever user gets used when email comes in, which may not be the same as the recipient..
Re: sa-learn question
Site-wide is what i'm trying to setup, I guess i need to do some more googling :)
Re: sa-learn question about number of messages processed
Matt Kettler wrote:

> Mário Gamito wrote:
>> Hi, How can i know how many messages did already sa-learn processed ?
>
> You mean the total number of messages learned in the bayes database (includes sa-learn and autolearn)?
>
> sa-learn --dump magic

and how do I read this information ?

# sa-learn --dump magic
0.000 0 3 0 non-token data: bayes db version
0.000 0 569 0 non-token data: nspam
0.000 0 7 0 non-token data: nham
0.000 0 53898 0 non-token data: ntokens
0.000 0 987802486 0 non-token data: oldest atime
0.000 0 1176482771 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count
Re: sa-learn question about number of messages processed
PakOgah wrote:

> Matt Kettler wrote:
>> Mário Gamito wrote:
>>> Hi, How can i know how many messages did already sa-learn processed ?
>> You mean the total number of messages learned in the bayes database (includes sa-learn and autolearn)?
>> sa-learn --dump magic
>
> and how do I read this information ?
>
> # sa-learn --dump magic
> 0.000 0 3 0 non-token data: bayes db version

Bayes DB is in the version 3 format. (It's changed a couple of times in history, but hasn't changed recently.)

> 0.000 0 569 0 non-token data: nspam

You have trained 569 nonspam messages

> 0.000 0 7 0 non-token data: nham

You have trained 7 spam messages, which is very few, not enough for SA to be willing to start using the bayes database to rate mail yet. By default you need 200 (and I do not recommend changing it to anything lower except in lab tests to study bayes errors in under-trained databases).

> 0.000 0 53898 0 non-token data: ntokens

There are 53,898 total tokens in the bayes database. (Small, but not absurdly so. By default SA aims to keep it between 150k and 100k. Looking above, you've not trained enough emails for SA to start considering throwing out old tokens to keep it under 150k.)

> 0.000 0 987802486 0 non-token data: oldest atime
> 0.000 0 1176482771 0 non-token data: newest atime

The least-recently used token in the database was last accessed 987802486 seconds after January 1st, 1970, and the most recent was accessed at 1176482771. (Not very interesting except to compare against each other.)

> 0.000 0 0 0 non-token data: last journal sync atime
> 0.000 0 0 0 non-token data: last expiry atime
> 0.000 0 0 0 non-token data: last expire atime delta
> 0.000 0 0 0 non-token data: last expire reduction count

There's never been a journal sync or expiration of old tokens. In a young database this is reasonably normal, although I'd eventually expect a journal sync after you've got enough nonspam for your bayes to become actively used by SA. Also, you'll never get expiry until your database is a bit larger. Expiry doesn't kick in until you've got 150,000 tokens, and you've got about a third of that.
Re: sa-learn question about number of messages processed
On Mon, 16 Apr 2007, Matt Kettler wrote:

> 0.000 0 569 0 non-token data: nspam
> You have trained 569 nonspam messages

That should be: 569 spams (Number SPAM)

> 0.000 0 7 0 non-token data: nham
> You have trained 7 spam messages

And: 7 hams (Number HAM)

Pakogah: you need to train 193 more ham emails before Bayes will start scoring.

--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
FALaholic #11174
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
Ten-millimeter explosive-tip caseless, standard light armor piercing rounds. Why?
---
3 days until The 232nd anniversary of The Shot Heard 'Round The World
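The "193 more ham" arithmetic can be reproduced from the dump itself. A minimal sketch, assuming the column layout shown in this thread (the count is the third field) and the default threshold of 200 quoted above; the function name bayes_status is made up for illustration.

```shell
#!/bin/sh
# bayes_status [MIN]: read `sa-learn --dump magic` output on stdin, print the
# spam/ham message counts and how many more of each are needed before the
# MIN threshold (200 unless given) is met. Hypothetical helper.
bayes_status() {
    awk -v min="${1:-200}" '
        /non-token data: nspam/ { nspam = $3 }
        /non-token data: nham/  { nham  = $3 }
        END {
            need_spam = (nspam < min) ? min - nspam : 0
            need_ham  = (nham  < min) ? min - nham  : 0
            printf "nspam=%d nham=%d\n", nspam, nham
            printf "need %d more spam, %d more ham\n", need_spam, need_ham
        }
    '
}

# Example:
#   sa-learn --dump magic | bayes_status
```

With the dump from this thread (nspam 569, nham 7) the second line reports 0 more spam and 193 more ham, matching the figure above.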
Re: sa-learn question about number of messages processed
Mário Gamito wrote: Hi, How can i know how many messages did already sa-learn processed ? You mean the total number of messages learned in the bayes database (includes sa-learn and autolearn)? sa-learn --dump magic
Re: sa-learn question
Russell Jones wrote: If I have multiple sa-learn processes going at the same time, can that corrupt the database and/or cause some other problem that I don't want to happen? Or is it safe to have the following in crontab for example: @daily sa-learn --spam /home/eggycrew/imap/eggycrew.com/rjones/Maildir/.INBOX.spam @daily sa-learn --ham /home/eggycrew/imap/eggycrew.com/rjones/Maildir/cur @daily sa-learn --ham /home/eggycrew/imap/eggycrew.com/rjones/Maildir/new Well, nothing bad will happen, but they'll all effectively get run one at a time. Since only one process can have the R/W lock on the bayes DB, one of them will get the lock and the others will go to sleep waiting for the lock to be released.
Re: sa-learn question
On Fri, 22 Sep 2006, Russell Jones wrote:

> @daily sa-learn --spam /home/eggycrew/imap/eggycrew.com/rjones/Maildir/.INBOX.spam
> @daily sa-learn --ham /home/eggycrew/imap/eggycrew.com/rjones/Maildir/cur
> @daily sa-learn --ham /home/eggycrew/imap/eggycrew.com/rjones/Maildir/new

Put all your learns in a single shell script, and run that. I also age the learn mailbox files to keep their sizes down when they are learned, and I only learn if the file has been modified in the last day or two. Attached is the script I have in my cron.daily directory...

--
John Hardin KA7OHZ ICQ#15735746 http://www.impsec.org/~jhardin/
FALaholic #11174
key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
False is the idea of utility that sacrifices a thousand real advantages for one imaginary or trifling inconvenience; that would take fire from men because it burns, and water because one may drown in it; that has no remedy for evils except destruction. The laws that forbid the carrying of arms are laws of such a nature. They disarm only those who are neither inclined nor determined to commit crime.
-- Cesare Beccaria, quoted by Thomas Jefferson
---

#!/bin/bash
#
# Train spamassassin global bayes filter
#
# learn from folders in user home dirs
#
echo Learning from user local mailboxes

# Spam mailboxes modified within the last 3 days
for SPAM in $(find /home/*/[Mm]ail -type f \( -name 'SpamAssassin-SPAM*' -or -name spambox \) -mtime -3)
do
  if [ -s "$SPAM" ]
  then
    echo SPAM from "$SPAM"
    MBTYPE=--mbox
    # Switch to --mbx when file(1) reports an MBX-format mailbox
    if file "$SPAM" | grep -q ' MBX mail '
    then
      MBTYPE=--mbx
    fi
    /usr/bin/sa-learn --spam -C /etc/mail/spamassassin $MBTYPE "$SPAM"
  fi
done
echo

# Ham mailboxes modified within the last 3 days
for HAM in $(find /home/*/[Mm]ail -type f \( -name 'SpamAssassin-HAM*' -or -name hambox \) -mtime -3)
do
  if [ -s "$HAM" ]
  then
    echo HAM from "$HAM"
    MBTYPE=--mbox
    if file "$HAM" | grep -q ' MBX mail '
    then
      MBTYPE=--mbx
    fi
    /usr/bin/sa-learn --ham -C /etc/mail/spamassassin $MBTYPE "$HAM"
  fi
done

# Report status
echo
echo Bayes Statistics:
/usr/bin/sa-learn --dump magic
chmod a+r /etc/mail/spamassassin/bayes_seen /etc/mail/spamassassin/bayes_toks
Re: sa-learn question
On Thu, Sep 07, 2006 at 02:19:25PM -0500, EviL_SmUrF wrote:

> Quick question about spamassassin's sa-learn feature. I am running spamassassin on a semi-large webhosting server, and I can't seem to find out whether, when I run sa-learn, what it learns will apply only to the email address it was run on, or to the entire domain, or to all of the domains hosted on the box. Example of what I am running:

It doesn't quite work like that. sa-learn updates a database; the recipient information doesn't really matter. The tokens that are learned will be used by whatever or whoever you have configured to use that database for scanning. That is, if you have individual DBs per user, then the learning applies to the user whose database you updated. If you have a sitewide DB config, then it'll be for all users.

--
Randomly Generated Tagline: My wife and I were happy for years. Then we met.
Re: SA-LEARN Question
Hello Christopher,

Tuesday, August 22, 2006, 3:21:36 PM, you wrote:

> Hi,
> We have over 100 domains on a server, all of which are getting junk mail. SA 3.1.4 installed, but I don't think it's properly trained yet (even though I did upgrade from an earlier version).
> If I set up a [EMAIL PROTECTED] address and tell all my customers to forward the junk mail they get to that address, then run sa-learn on that mailbox, will that help, or, will it train SA that the users that forwarded the junk ARE the spammers and start to assign higher scores to legitimate customers?

Hi, I have qmail and SA, and my MUA is The Bat! I found that redirecting email is not good, as SA thinks of me as the sender, but forwarding spam to a junk account is OK: it strips the forwarded-by headers and learns it.

--
Best regards, Miki
Re: SA-LEARN Question
Christopher Mills wrote: Hi, We have over 100 domains on a server, all of which are getting junk mail. SA 3.1.4 installed, but I don't think it's properly trained yet (even though I did upgrade from an earlier version). If I set up a [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] address and tell all my customers to forward the junk mail they get to that address, then run sa-learn on that mailbox, will that help, or, will it train SA that the users that forwarded the junk ARE the spammers and start to assign higher scores to legitimate customers? If you forward the emails, this process will not work. You must either forward it as an attachment and then strip the attachment and run sa-learn on that or use some other method which preserves the original headers. How you do this depends largely on your setup. -Jim
RE: SA-LEARN Question
Christopher Mills wrote: Hi, We have over 100 domains on a server, all of which are getting junk mail. SA 3.1.4 installed, but I don't think it's properly trained yet (even though I did upgrade from an earlier version). If I set up a [EMAIL PROTECTED] address and tell all my customers to forward the junk mail they get to that address, then run sa-learn on that mailbox, will that help, or, will it train SA that the users that forwarded the junk ARE the spammers and start to assign higher scores to legitimate customers? No, SA will learn that messages forwarded from your users are spam. As someone else pointed out, you need to find a method that preserves the original headers of the message. Forwarding the spam as an attachment and then stripping it out or copying it to a shared imap folder are two of the more common options. -- Bowie
Re: SA-LEARN Question
Jim Maul wrote: Christopher Mills wrote: Hi, We have over 100 domains on a server, all of which are getting junk mail. SA 3.1.4 installed, but I don't think it's properly trained yet (even though I did upgrade from an earlier version). If I set up a [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] address and tell all my customers to forward the junk mail they get to that address, then run sa-learn on that mailbox, will that help, or, will it train SA that the users that forwarded the junk ARE the spammers and start to assign higher scores to legitimate customers? If you forward the emails, this process will not work. You must either forward it as an attachment and then strip the attachment and run sa-learn on that or use some other method which preserves the original headers. How you do this depends largely on your setup. Here's a link describing how I use maildrop to deliver emails to special maildirs for processing by sa-learn. http://www.arda.homeunix.net/spamassassin.html#bayesian Andrew
RE: SA-LEARN Question
Wouldnt forwarding strip away header info that is used to train spam? From: Christopher Mills [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 22, 2006 9:22 AM To: users@spamassassin.apache.org Subject: SA-LEARN Question Hi, We have over 100 domains on a server, all of which are getting junk mail. SA 3.1.4 installed, but I don't think it's properly trained yet (even though I did upgrade from an earlier version). If I set up a [EMAIL PROTECTED] address and tell all my customers to forward the junk mail they get to that address, then run sa-learn on that mailbox, will that help, or, will it train SA that the users that forwarded the junk ARE the spammers and start to assign higher scores to legitimate customers?
Re: SA-LEARN Question
Bowie Bailey wrote:

> Christopher Mills wrote:
>> Hi, We have over 100 domains on a server, all of which are getting junk mail. SA 3.1.4 installed, but I don't think it's properly trained yet (even though I did upgrade from an earlier version). If I set up a [EMAIL PROTECTED] address and tell all my customers to forward the junk mail they get to that address, then run sa-learn on that mailbox, will that help, or, will it train SA that the users that forwarded the junk ARE the spammers and start to assign higher scores to legitimate customers?
>
> No, SA will learn that messages forwarded from your users are spam. As someone else pointed out, you need to find a method that preserves the original headers of the message. Forwarding the spam as an attachment and then stripping it out or copying it to a shared imap folder are two of the more common options.

I have a similar, albeit smaller, environment. What I've done is ask my users who want to help to have a ConfirmedSpam folder in their IMAP directory. Every night I cron-job a LOCATE for that folder and then tell sa-learn to learn those emails. Then I empty the mail dir to start fresh for the next day. It works like a charm.

--
--Michel Vaillancourt
Wolfstar Systems
www.wolfstar.ca
Re: SA-LEARN Question
On Tuesday 22 August 2006 16:31, Jean-Paul Natola took the opportunity to say: Wouldn't forwarding strip away header info that is used to train spam? It depends on the MUA. Some MUAs, like MS Outlook (who would've guessed?) (at least Outlook 2000), mangle the mail even when forwarding as an attachment. Well-behaved MUAs preserve everything when forwarding as an attachment, but then you need to extract that attachment. -- Magnus Holmgren (No Cc of list mail needed, thanks)
Re: SA-LEARN Question
On 22-Aug-06, at 1:57 PM, Magnus Holmgren wrote: On Tuesday 22 August 2006 16:31, Jean-Paul Natola took the opportunity to say: Wouldn't forwarding strip away header info that is used to train spam? It depends on the MUA. Some MUAs, like MS Outlook (who would've guessed?) (at least Outlook 2000), mangle the mail even when forwarding as an attachment. Well-behaved MUAs preserve everything when forwarding as an attachment, but then you need to extract that attachment. I've been told to, and do use, Redirect instead of Forward when sending spam to a common mailbox for sa-learn. -- Gino Cerullo Pixel Point Studios 21 Chesham Drive Toronto, ON M3M 1W6 416-247-7740
RE: SA-LEARN Question
Michel Vaillancourt wrote:

> Bowie Bailey wrote:
>> Christopher Mills wrote:
>>> Hi, We have over 100 domains on a server, all of which are getting junk mail. SA 3.1.4 installed, but I don't think it's properly trained yet (even though I did upgrade from an earlier version). If I set up a [EMAIL PROTECTED] address and tell all my customers to forward the junk mail they get to that address, then run sa-learn on that mailbox, will that help, or, will it train SA that the users that forwarded the junk ARE the spammers and start to assign higher scores to legitimate customers?
>>
>> No, SA will learn that messages forwarded from your users are spam. As someone else pointed out, you need to find a method that preserves the original headers of the message. Forwarding the spam as an attachment and then stripping it out or copying it to a shared imap folder are two of the more common options.
>
> I have similar, albeit smaller, environment. What I've done is asked my users who want to help to have a ConfirmedSpam folder in their IMAP directory. Every night I cron-job a LOCATE for that folder and then tell sa-learn to learn those emails. Then I empty the mail dir to start fresh for the next day. It works like a charm.

For balanced learning, you should also have a ConfirmedHam folder so that you can learn from both ham and spam.

--
Bowie
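The nightly job described above, extended with the ConfirmedHam folder, could look roughly like this. It is a sketch under assumptions: the mail root /var/vmail, the Maildir-style folder names with a leading dot, and the variable and function names are all illustrative and not taken from either poster's setup.

```shell
#!/bin/sh
# learn_folders TYPE NAME: run sa-learn TYPE (--spam or --ham) on the cur/
# directory of every Maildir folder called NAME below $MAIL_ROOT.
# MAIL_ROOT, SA_LEARN and the folder names are assumptions for this sketch.
MAIL_ROOT=${MAIL_ROOT:-/var/vmail}
SA_LEARN=${SA_LEARN:-sa-learn}

learn_folders() {
    find "$MAIL_ROOT" -type d -name "$2" | while IFS= read -r folder; do
        # Messages the user has filed end up in cur/; skip anything else
        [ -d "$folder/cur" ] || continue
        # SA_LEARN is left unquoted so it may carry extra words (e.g. a dry-run echo)
        $SA_LEARN "$1" "$folder/cur"
    done
}

# Nightly cron usage:
#   learn_folders --spam .ConfirmedSpam
#   learn_folders --ham  .ConfirmedHam
```

Setting SA_LEARN="echo sa-learn" prints the commands instead of running them, which is a cheap way to check which folders would be picked up. Emptying the folders afterwards, as described above, is deliberately left out of the sketch.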
Re: sa-learn question
Drew Burchett wrote:

> Does sa-learn read subdirectories?

If you mean maildir folders, yes.
Re: sa-learn question
At 01:35 AM 4/3/2005, Roman Serbski wrote: There are some spam messages being not blocked by SA so as far as I understood I can teach Bayes to learn them? But is it worth to feed sa-learn with junk messages that already have headers modified? Yes, that's fine.. sa-learn is smart enough to undo any changes that the spamassassin configuration made.
Re: sa-learn question
I think you should check the SpamAssassin wiki for the solution to your problem http://wiki.apache.org/spamassassin/BayesInSpamAssassin Rakesh Lance wrote: Alright, we're running courier IMAP along with pop3 but our spool is all Maildir format. I've got a public spam folder for certain people so what would the sa-learn command be? sa-learn --spam /var/spool/mail/unixvault.net/shared/.Spam/cur/* or do I need to insert something in there? --mbx/--mbox? I'm not sure if there's a difference on how it learns or not or if it could result in false positives if its not learning correctly. lance