Re: Bayes learning for legitimate users
On 14.03.2015 at 16:45, Matus UHLAR - fantomas uh...@fantomas.sk wrote:
> ...but as I mentioned before, training spam from mail to non-existent recipients may be even a good thing…

I would not train from mail to non-existent recipients, but would restrict training to a defined set of spamtraps (which may have been non-existent addresses at some point in the past…).

Matthias
Re: Bayes learning for legitimate users
I manage email through ISPConfig; I think no wildcard is set for any domain.

On 13.3.2015 at 16:02, Matus UHLAR - fantomas wrote:
> Filip Havlíček wrote:
>> I would like to ask you, how can I *allow only legitimate* email addresses (existing users) for bayes learning?
>
> On 13.03.15 14:54, Filip Havlíček wrote:
>> there is my configuration:
>> /etc/spamassassin/local.cf: http://pastebin.com/PM5jN8wi
>> /etc/postfix/main.cf: http://pastebin.com/KWN7Ebyi
>> /etc/amavis/conf.d/50-user: http://pastebin.com/ijSaqhuJ
>
> you have virtual domains set up. Did you set up wildcard in any of them?
Re: Bayes learning for legitimate users
On 14.03.15 15:00, Filip Havlíček wrote:
> I manage email through ISPConfig, I think wildcard for any domain is not set.

It seems you have relay_recipient_maps set; isn't your domain listed there?

Note that Postfix rejects non-existing recipients by default (http://www.postfix.org/postconf.5.html#smtpd_reject_unlisted_recipient), and in that case mail to non-existing recipients should not get to the proxy, filter or milter where it could be learned from.

...but as I mentioned before, training spam from mail to non-existent recipients may be even a good thing...

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
WinError #98652: Operation completed successfully.
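The check described above can be sketched in shell. This is only an illustration: `recipient_validation_enabled` is a hypothetical helper name, and it looks at nothing but the `smtpd_reject_unlisted_recipient` parameter. Even when that parameter is enabled, Postfix can only reject unknown recipients if the recipient maps (for example relay_recipient_maps) actually list the valid addresses and contain no catch-all entry.

```shell
#!/bin/sh
# Sketch: decide from `postconf` output whether Postfix rejects mail
# for recipients it does not know. The function name is hypothetical.

# Reads "name = value" lines (as printed by `postconf NAME`) on stdin.
# Succeeds unless smtpd_reject_unlisted_recipient is explicitly "no".
recipient_validation_enabled() {
    awk -F' *= *' '
        $1 == "smtpd_reject_unlisted_recipient" && $2 == "no" { off = 1 }
        END { exit (off ? 1 : 0) }
    '
}

# Typical use on a mail host (requires Postfix to be installed):
#   postconf smtpd_reject_unlisted_recipient | recipient_validation_enabled \
#       && echo "unlisted recipients are rejected at SMTP time"
```

If the helper reports that rejection is enabled and unknown addresses still reach SpamAssassin, the recipient maps themselves are the next place to look.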
Re: Bayes learning for legitimate users
Filip Havlíček wrote:
> I would like to ask you, how can I *allow only legitimate* email addresses (existing users) for bayes learning?

On 13.03.15 14:54, Filip Havlíček wrote:
> there is my configuration:
> /etc/spamassassin/local.cf: http://pastebin.com/PM5jN8wi
> /etc/postfix/main.cf: http://pastebin.com/KWN7Ebyi
> /etc/amavis/conf.d/50-user: http://pastebin.com/ijSaqhuJ

You have virtual domains set up. Did you set up a wildcard in any of them?

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Re: Bayes learning for legitimate users
Hi,

here is my configuration:
/etc/spamassassin/local.cf: http://pastebin.com/PM5jN8wi
/etc/postfix/main.cf: http://pastebin.com/KWN7Ebyi
/etc/amavis/conf.d/50-user: http://pastebin.com/ijSaqhuJ

So, what should I modify?

Thanks

On 4.3.2015 at 20:39, Reindl Harald wrote:
> in other words he is a backscatter and part of the spam-problem because if you don't know your own valid users where do your MTA deliver to and what happens with mail not rejected but not deliverable?
Re: Bayes learning for legitimate users
On Wed, 4 Mar 2015, Filip Havlíček wrote:
> I would like to ask you, how can I *allow only legitimate* email addresses (existing users) for bayes learning?

Reject invalid users at the MTA level during SMTP before the message even hits SA.

-- 
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
Failure to plan ahead on someone else's part does not constitute an emergency on my part. -- David W. Barts in a.s.r
---
4 days until Daylight Saving Time begins in U.S. - Spring Forward
Re: Bayes learning for legitimate users
On Wed, 04 Mar 2015 13:35:55 +0100 Filip Havlíček wrote:
> I would like to ask you, how can I *allow only legitimate* email addresses (existing users) for bayes learning?

On 04.03.15 14:37, RW wrote:
> Why send them through SpamAssassin in the first place?

He apparently wants to filter mail for spam but can't reject nonexistent recipients :-)

However, that would also mean that spam going to random, nonexistent users will NOT be trained even if it scores damn high.

> Don't turn-off auto-training unless you have a strategy for replacing it.

agreed.

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Re: Bayes learning for legitimate users
On 04.03.2015 at 19:57, Matus UHLAR - fantomas wrote:
> On Wed, 04 Mar 2015 13:35:55 +0100 Filip Havlíček wrote:
>> I would like to ask you, how can I *allow only legitimate* email addresses (existing users) for bayes learning?
>
> On 04.03.15 14:37, RW wrote:
>> Why send them through SpamAssassin in the first place?
>
> He apparently wants to filter mail for spam but can't reject nonexistent users recipients :-)

In other words, he is a backscatter source and part of the spam problem: if you don't know your own valid users, where does your MTA deliver to, and what happens with mail that is not rejected but not deliverable?

> However, that would also mean that spam going to random - nonexistent users will NOT be trained even if it scores damn high

And luckily the same goes for all the spam that was not caught and got a damned low score.
Re: Bayes learning for legitimate users
On Wed, 04 Mar 2015 13:35:55 +0100 Filip Havlíček wrote:
> Hi, I would like to ask you, how can I *allow only legitimate* email addresses (existing users) for bayes learning?

Why send them through SpamAssassin in the first place?

> Table bayes_token grow up to 0,5GB right now, because there are thousands of unknown email addresses like:

That table shouldn't grow without limit; you can run sa-learn --force-expire from cron to prevent this. You may want to increase bayes_expiry_max_db_size first to prevent the size plummeting.

Alternatively, you can expire directly using SQL based on time. Some people add a timestamp field to the bayes_seen table to expire entries from SQL. Alternatively, you can simply empty that table occasionally; the information is only needed to reverse or forget training.

Don't turn off auto-training unless you have a strategy for replacing it.
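The cron suggestion above could look roughly like the following. It is a sketch under assumptions: the script name, install path and schedule are invented for illustration, and `SA_LEARN` is a hypothetical override that exists only so the job can be dry-run with `SA_LEARN=echo`. Only `sa-learn --force-expire` itself comes from the advice above.

```shell
#!/bin/sh
# Sketch of a cron-driven Bayes expiry job. SA_LEARN is a hypothetical
# override so the script can be dry-run with SA_LEARN=echo.
SA_LEARN="${SA_LEARN:-sa-learn}"

# Ask SpamAssassin to run its token expiry now.
expire_bayes() {
    "$SA_LEARN" --force-expire
}

# Only act when explicitly asked to, so sourcing the file is harmless.
if [ "${1:-}" = "--run" ]; then
    expire_bayes
fi

# Example crontab entry (path and schedule are assumptions), installed
# for the user that owns the Bayes database:
#   30 3 * * *  /usr/local/bin/expire-bayes.sh --run
```

Running it as the same user that SpamAssassin scans as matters, because otherwise the expiry may act on a different Bayes database than the one that is growing.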
Re: Bayes learning for legitimate users
don't reply offlist!

On 04.03.2015 at 14:13, Filip Havlíček wrote:
> So you recommend setting the parameter *bayes_auto_learn* to *0*? I truncated the tables and tried setting bayes_auto_learn 0 in /etc/spamassassin/local.cf, but it does not work: hundreds of new records for unknown email addresses appeared in the tables *bayes_vars*, *bayes_token* and *bayes_seen* :-(. Any ideas?

Is /etc/spamassassin/local.cf really the correct file? If it is, are the permissions correct? Here, /etc/mail/spamassassin/local.cf is the correct path.

How is your SA called? Look for user_prefs in ~/.spamassassin/.

No idea what bayes_vars is.
Re: Bayes learning for legitimate users
On 04.03.2015 at 13:35, Filip Havlíček wrote:
> I would like to ask you, how can I *allow only legitimate* email addresses (existing users) for bayes learning?
> Table bayes_token has grown to 0.5 GB right now, because there are thousands of unknown email addresses like:
> a...@hotmail.com
> ablewi...@hotmail.com
> abl...@hotmail.com

Don't use auto-learning, or at least adjust the scores that trigger autolearning. SpamAssassin can't know whether an address exists, while you could use http://www.postfix.org/ADDRESS_VERIFICATION_README.html on the MTA level. *But* be careful with sender verification: you need to place a lot of DNSWL checks in front of it so that you don't get blacklisted yourself.

I guess your main problem is that way too much mail makes it to SA at all. Instead, block it by RBL scoring and other MTA restrictions long before. See the example below; everything listed before the Bayes stats never touched SpamAssassin.

Connections:      314179
Postscreen:       171577
Helo:               1435
Subject:             187
Attachment:           29
Header Length:         8
Sender Regex:        263
Sender Blocked:      174
Sender Verify:       301
Sender Invalid:     1622
Sender Spoofed:       10
Sender Parked:        10
PTR Missing:         227
PTR Generic:         447
SPF:                 709

BAYES_00     46223    77.63 %
BAYES_05       733     1.23 %
BAYES_20       894     1.50 %
BAYES_40       957     1.60 %
BAYES_50      6463    10.85 %
BAYES_60       641     1.07 %
BAYES_80       472     0.79 %
BAYES_95       344     0.57 %
BAYES_99      2814     4.72 %
BAYES_999     2452     4.11 %
Re: Bayes learning for legitimate users
Sorry for replying only to you by mistake.

How can I find out the right path for the config file local.cf? Maybe the config is loaded from another path.

I used the MySQL structure from this file: http://spamassassin.apache.org/full/3.0.x/dist/sql/bayes_mysql.sql

Thanks

On 4.3.2015 at 14:43, Reindl Harald wrote:
> is /etc/spamassassin/local.cf really correct? if it is, are the permissions correct? /etc/mail/spamassassin/local.cf is the correct path here
>
> how is your SA called? look for user_prefs in ~/.spamassassin/
>
> no idea what bayes_vars is
Re: Bayes learning differences: v3.3.2 to v3.4.0
On 11/4/2014 6:06 PM, John Woods wrote:
> Everyone,
>
> We're having problems with auto learning on v3.4.0 that we aren't having on v3.3.2. The number of spam e-mails being auto-learned has dropped significantly, and the amount of spam being let through (false negatives) is higher as well. After looking through the wiki and the code, I'm pretty sure this change is related to the rule that says you must have 3 body only points and 3 header only points, which are hardcoded values in Mail::SpamAssassin::Plugin::AutoLearnThreshold. In 3.3.2, it looks like body-points equals the head-points, and in 3.4.0, they are changed.

You are correct. There were changes and bugs found in the logic that were resolved on 3.4.0. See https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5503

> I've got a few questions:
> 1) How does SpamAssassin derive and sum the body_only and head_only points? It doesn't look like the body_only points correspond to any scores from individual tests.

There is a test_type flag. It was sometimes lost in previous parsing of messages.

> 2) How can we affect the configuration, to increase the number of spam e-mails being auto-learned?
> 3) Instead, do we need to completely change our strategy for how we're using Bayes?

I will leave Bayes comments to other experts, but in general I believe you will find that some sort of NON-automated learning will produce better results. My concern with auto-learning is that you are just self-perpetuating any flaws in the current classification, not really helping to stop new and different spam.

I will likely set off a flamewar if I continue discussing Bayes. Perhaps you can buy a six pack for AXB and convince him to add his $0.04 on Bayes. He's the resident expert.

regards,
KAM
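To make the "3 body points and 3 header points" rule concrete, here is a deliberately simplified model of the spam side of the auto-learn decision. It is not SpamAssassin code: `would_autolearn_spam` and `SPAM_THRESHOLD` are hypothetical names, the 12.0 default mirrors the documented default of bayes_auto_learn_threshold_spam, and the real plugin applies further conditions that are ignored here.

```shell
#!/bin/sh
# Simplified model of the spam auto-learn decision discussed above.
# would_autolearn_spam is a hypothetical helper, not SpamAssassin code.
# Arguments: autolearn score, header-only points, body-only points.
would_autolearn_spam() {
    awk -v score="$1" -v head="$2" -v body="$3" \
        -v threshold="${SPAM_THRESHOLD:-12.0}" 'BEGIN {
            # The total must reach the threshold AND both the header
            # and the body part must contribute at least 3 points.
            ok = (score + 0 >= threshold + 0) && (head + 0 >= 3) && (body + 0 >= 3)
            exit (ok ? 0 : 1)
        }'
}

# Example: a high score that comes almost entirely from header rules
# is not learned, because the body contributes fewer than 3 points.
#   would_autolearn_spam 15.2 14.0 1.2 || echo "not auto-learned"
```

This shows why a drop in auto-learned spam can follow from a change in how points are attributed to header or body, even when total scores stay the same.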
Re: Bayes learning differences: v3.3.2 to v3.4.0
On Tue, 04 Nov 2014 17:06:54 -0600 John Woods wrote:
> 1) How does SpamAssassin derive and sum the body_only and head_only points? It doesn't look like the body_only points correspond to any scores from individual tests.

Scoring uses one of four score sets, chosen according to whether Bayes and network tests are off or on. Auto-training uses the score set that you would have with Bayes turned off. Also, rules marked noautolearn, learn and userconf are ignored.
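The four score sets are numbered 0 to 3 in the SpamAssassin configuration documentation: 0 is neither Bayes nor network tests, 1 is network tests only, 2 is Bayes only, and 3 is both. A small sketch of that mapping, with hypothetical function names, shows which set auto-training ends up using:

```shell
#!/bin/sh
# Sketch of the score-set numbering; function names are hypothetical.

# $1 = Bayes enabled (0 or 1), $2 = network tests enabled (0 or 1)
scoreset_index() {
    echo $(( $1 * 2 + $2 ))
}

# Auto-training scores the message as if Bayes were turned off,
# so only the network-test setting selects the set.
# $1 = network tests enabled (0 or 1)
autolearn_scoreset() {
    scoreset_index 0 "$1"
}

# With Bayes and network tests both on, mail is scored with set 3,
# but the auto-learn decision uses set 1:
#   scoreset_index 1 1      # prints 3
#   autolearn_scoreset 1    # prints 1
```

This is why the points used for auto-learning need not match the scores shown in the X-Spam-Status header of the same message.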
Re: Bayes learning differences: v3.3.2 to v3.4.0
Kevin,

I did skim bug 5503 earlier, but didn't understand it at first. Knowing the history now, it makes a little more sense, although I'm still fuzzy on why the value of 3 for the body and head points is important.

It might be nice to have local.cf directives that let admins set $required_body_points and $required_head_points in AutoLearnThreshold.pm. That way, admins could tune this behavior to allow more or less auto-learning (e.g. 1 body point and 2.5 head points). Thoughts?

As for Bayes strategies (and without starting a flamewar), we just started implementing an IMAP folder in everyone's mailbox called Learn As Spam, which gets processed through sa-learn --spam. It sounds like we may need to leave auto-learning at SA's defaults, and ask users to put e-mails in Learn As Spam and Learn As Non-Spam folders. Perhaps relying on out-of-the-box auto-learning, and tempering Bayes with user-based learning, may yield positive results.

Thanks again, Kevin and RW, for your input.

Sincerely,
John

On 11/05/14 06:40, Kevin A. McGrail wrote:
> You are correct. There were changes and bugs found in the logic that were resolved on 3.4.0. See https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5503
Re: Bayes learning differences: v3.3.2 to v3.4.0
On Wed, 5 Nov 2014, John Woods wrote:
> As for Bayes strategies (and without starting a flamewar), we just started implementing an IMAP folder in everyone's mailbox called Learn As Spam, that gets processed through sa-learn --spam. It sounds like we may need to leave auto-learning to SA's defaults, and ask users to put e-mails in Learn As Spam and Learn As Non-Spam folders.

A warning: you should not blindly accept the training data from users, apart from a (likely small) group of users whose judgement and responsibility you trust. As a general rule, a mail admin or other skilled person should vet all user-submitted training data. Also, the training messages should be retained so that they can be correctly retrained if the user misclassified them or reported them improperly.

Too many users will use the "learn as spam" folder as a substitute for unsubscribing from valid newsletters and the like that they voluntarily subscribed to but are no longer interested in.

-- 
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
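One way to follow this advice is to separate collecting user submissions from training on them, keeping a copy of every message. The sketch below assumes that workflow; the function names, the review-queue layout and the `SA_LEARN` dry-run override are all hypothetical, and only `sa-learn --spam` and `sa-learn --ham` on a directory are taken from the tool itself.

```shell
#!/bin/sh
# Sketch: retain user-submitted messages and train only on vetted ones.
# All names and paths are hypothetical; SA_LEARN=echo gives a dry run.
SA_LEARN="${SA_LEARN:-sa-learn}"

# Copy submissions into a review queue without touching the originals.
# $1 = user's "Learn As Spam" (or non-spam) folder, $2 = review queue
collect_submissions() {
    mkdir -p "$2" || return 1
    for msg in "$1"/*; do
        [ -f "$msg" ] || continue
        cp -p "$msg" "$2"/
    done
}

# Train from a directory that an admin has already reviewed.
# $1 = spam or ham, $2 = vetted directory (kept for later retraining)
learn_vetted() {
    "$SA_LEARN" "--$1" "$2"
}
```

Because the vetted directories are kept, a message that a user reported wrongly can later be moved to the other class and learned again there.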
Re: Bayes learning differences: v3.3.2 to v3.4.0
On 11/5/2014 2:12 PM, John Woods wrote:
> I did skim bug 5503 earlier, but didn't understand it at first. Knowing the history now, it makes a little more sense, although I'm still fuzzy on why the value of 3 for the body and head points is important.

Can't disagree. I don't know the history either. I just know that 3 was the magic number and the code did not work as logically documented.

> It might be nice to have local.cf directives to allow admins to be able to affect the $required_body_points and $required_head_points in AutoLearnThreshold.pm. That way, admins could tune tweak this behavior to allow more/less auto-learning... (i.e. 1 body points, and 2.5 head points) Thoughts?

Agreed. Can you work on a patch to provide this?

> As for Bayes strategies (and without starting a flamewar), we just started implementing an IMAP folder in everyone's mailbox called Learn As Spam, that gets processed through sa-learn --spam. It sounds like we may need to leave auto-learning to SA's defaults, and ask users to put e-mails in Learn As Spam and Learn As Non-Spam folders. Perhaps relying on out-of-the-box auto-learning, and tempering Bayes with user-based learning, may yield positive results.

Agreed. Hand-sorted corpora for spam and ham will lead to the best Bayes results, and the system you are implementing is the closest practical method to achieve that.

Regards,
KAM
AW: SpamAssassin and Bayes learning
> One crucial thing you didn't post: you ran the learning as root. Is the user that spamd is running as also root? The bayes database is user-specific, and a common problem is to train the database as a different user than the MTA+spamd is running under.

The owner and group of the folder .spamassassin and of the files in it are both amavis. I hope this is the right user.

But I made a big mistake: I recently changed servers, and all the learned spam was on the old server. This is why there was so little spam (only 801). So now I have copied the old learned spam and ham files to the new server, and now I have more (6225):

root@example:~# sa-learn --dbpath /var/lib/amavis/.spamassassin/bayes --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0       6225          0  non-token data: nspam
0.000          0      52634          0  non-token data: nham
0.000          0    1884302          0  non-token data: ntokens
0.000          0 1279163247          0  non-token data: oldest atime
0.000          0 1338890042          0  non-token data: newest atime
0.000          0 1338889064          0  non-token data: last journal sync atime
0.000          0 1284701742          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0       2438          0  non-token data: last expire reduction count

So maybe this was the real problem.
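When comparing dumps like these, it can help to pull out just the nspam and nham counters. The sketch below does that from `sa-learn --dump magic` output on stdin. The helper names and the `BAYES_MIN` override are hypothetical; the default of 200 mirrors the documented defaults of bayes_min_spam_num and bayes_min_ham_num, below which SpamAssassin does not apply Bayes scores at all. Both dumps in this thread are above that minimum, so the check is informational here.

```shell
#!/bin/sh
# Sketch: summarise `sa-learn --dump magic` output. Helper names are
# hypothetical; BAYES_MIN defaults to SpamAssassin's documented 200.

# Reads the dump on stdin and prints "nspam nham".
bayes_counts() {
    awk '/non-token data: nspam/ { spam = $3 }
         /non-token data: nham/  { ham  = $3 }
         END { print spam + 0, ham + 0 }'
}

# Succeeds if both counters reach the minimum needed for Bayes scoring.
bayes_is_active() {
    set -- $(bayes_counts)
    [ "$1" -ge "${BAYES_MIN:-200}" ] && [ "$2" -ge "${BAYES_MIN:-200}" ]
}

# Typical use, run as the user that owns the database:
#   sa-learn --dbpath /var/lib/amavis/.spamassassin/bayes --dump magic | bayes_counts
```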
SpamAssassin and Bayes learning
Hello,

I use SpamAssassin 3.3.1 on Ubuntu 12.04 with Postfix 2.9.1-4 and AMaViS 2.6.5.

Whenever I get spam, I move it to my Spam folder, where I have collected some spam over the last two years. Every night I run the script salearn-from-mails to learn from that spam:

    #!/bin/bash -e
    SADIR=/var/lib/amavis/.spamassassin
    DBPATH=/var/lib/amavis/.spamassassin/bayes
    SPAMFOLDERS="\
    /home/vmail/example.org/franc/.Spam/new \
    /home/vmail/example.org/franc/.Spam/cur \
    "
    HAMFOLDERS="\
    /home/vmail/example.org/franc/cur \
    "
    # word splitting of the folder lists is intentional
    for spamfolder in $SPAMFOLDERS ; do
        echo "Learning spam from $spamfolder"
        nice sa-learn --spam --showdots --dbpath $DBPATH $spamfolder
    done
    for hamfolder in $HAMFOLDERS ; do
        echo "Learning ham from $hamfolder"
        nice sa-learn --ham --showdots --dbpath $DBPATH $hamfolder
    done
    chown -R amavis:amavis $SADIR

When I look at what has been learned, I get these results:

    root@example:~# sa-learn --dbpath /var/lib/amavis/.spamassassin/bayes --dump magic
    0.000          0          3          0  non-token data: bayes db version
    0.000          0        801          0  non-token data: nspam
    0.000          0       5585          0  non-token data: nham
    0.000          0     127343          0  non-token data: ntokens
    0.000          0 1332999307          0  non-token data: oldest atime
    0.000          0 1338539336          0  non-token data: newest atime
    0.000          0 1338535082          0  non-token data: last journal sync atime
    0.000          0 1338524715          0  non-token data: last expiry atime
    0.000          0    5529600          0  non-token data: last expire atime delta
    0.000          0       1989          0  non-token data: last expire reduction count

In my /etc/spamassassin/local.cf I have:

    use_bayes 1
    bayes_auto_learn 1
    bayes_auto_expire 0
    bayes_path /var/lib/amavis/.spamassassin/bayes

But when I send an email with the content and subject of an old spam mail, it passes without much of a Bayes score:

    ...
    X-Virus-Scanned: Debian amavisd-new at ew6.org
    X-Spam-Flag: NO
    X-Spam-Score: 2.49
    X-Spam-Level: **
    X-Spam-Status: No, score=2.49 required=5 tests=[BAYES_50=0.8,
        FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001,
        T_RP_MATCHES_RCVD=-0.01, URIBL_DBL_SPAM=1.7] autolearn=no
    ...
What am I doing wrong? Thanks in advance, frank
Re: SpamAssassin and Bayes learning
On Fri, 1 Jun 2012 10:52:05 +0200 francwal...@gmx.net wrote:
> But when I send an email with the content and Subject of an old spam-mail this passes without much bayes-score:
> What am I doing wrong?

You are testing a message that's part spam and part non-spam and expecting BAYES to detect it as spam. What happens with actual spam?
Re: SpamAssassin and Bayes learning
There is very little spam in the spam folder, and those mails have a very low Bayes score (e.g. 0.8). But there is more spam in the inbox.

I thought that once I put a mail into the spam folder and sa-learn had learned it, there would be no question that the Bayes score for this mail would be high and the mail would be detected as spam. But I often get this kind of spam mail again.

Are the settings I posted all right?

RW rwmailli...@googlemail.com wrote:
> You are testing a message that's part spam and part non-spam and expecting BAYES to detect it as spam. What happens with actual spam?
Re: SpamAssassin and Bayes learning
On Fri, 1 Jun 2012, Frank Walter wrote:
> There is very few spam in the spam folder and then these mails have a very small Bayes score (e.g. 0.8). But there is more spam in the inbox. I thought, if I put a mail into the spam folder and after sa learned it, there would be no question that the Bayes score for this mail would be high, the mail would be detected as spam. But it happens often that I get this kind of spam mail again. Are the settings I posted all right?

One crucial thing you didn't post: you ran the learning as root. Is the user that spamd is running as also root? The bayes database is user-specific, and a common problem is to train the database as a different user than the MTA+spamd is running under.

-- 
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
Re: SpamAssassin and Bayes learning
Hello John,

Friday, June 1, 2012, 3:31:23 PM, you wrote:

JH> One crucial thing you didn't post: you ran the learning as root. Is the
JH> user that spamd is running as also root? The bayes database is
JH> user-specific, and a common problem is to train the database as a
JH> different user than the MTA+spamd is running under.

Not always true: I run sa-learn as root, but it updates the database for the user spamtest.

-- 
Best regards,
Niamh    mailto:ni...@fullbore.co.uk
Re: SpamAssassin and Bayes learning
On Fri, 01 Jun 2012 14:52:45 +0200 Frank Walter wrote:
> There is very few spam in the spam folder and then these mails have a very small Bayes score (e.g. 0.8). But there is more spam in the inbox. I thought, if I put a mail into the spam folder and after sa learned it, there would be no question that the Bayes score for this mail would be high, the mail would be detected as spam.

That's a false assumption. If you learn a spam and retest the exact same spam, it's very likely to hit BAYES_99, but that doesn't mean that similar spams will be caught. Some types of spam are very resistant to learning. Most ham is usually learned easily; check that most of it is hitting BAYES_00.

> But it happens often that I get this kind of spam mail again. Are the settings I posted all right?

If I were you, I'd increase the bayes_expiry_max_db_size to 50 which is about the maximum you can have without the expiry algorithm failing to find a solution due to its hard-coded 256 day limit. If 801 is the total spams from two years, that's about one a day; with a 64 day token retention you are probably not retaining enough spammy tokens.

If a lot of spam isn't being caught, make sure you have network tests running and that the trusted and/or internal networks are set up properly.
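The two figures in this reply can be checked against the numbers posted in the thread: the "last expire atime delta" of 5529600 seconds in the `--dump magic` output corresponds to the 64-day retention, and 801 spams over two years is roughly one a day. A small worked example, with hypothetical helper names:

```shell
#!/bin/sh
# Worked arithmetic for the figures above; helper names are hypothetical.

# Convert a number of seconds into whole days.
seconds_to_days() {
    echo $(( $1 / 86400 ))
}

# Average messages per day: $1 = message count, $2 = number of days.
per_day() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f\n", n / d }'
}

seconds_to_days 5529600    # prints 64
per_day 801 730            # prints 1.1
```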
Re: Apply Bayes learning to all users?
On Fri, 16 Dec 2011 08:54:36 +0100 Benny Pedersen wrote:
> On Fri, 16 Dec 2011 06:30:31 +, Martin Hepworth wrote:
>> Created a shared iMap or similar email account with a spam and ham folder for users to drag email into (not forward as that breaks headers in thing like outlook)
>
> yes, here i found that dovecot-antispam helpfull in the way

I think you've both misread the question. The OP wants to use spamtrap mail to train the individual user Bayes accounts.

The best way to do this would be to use the global database to adjust the probabilities for low count tokens in the user database. Nothing like that is supported.

Doing it via sa-learn sounds like more trouble than it's worth. It's probably a good thing for high volume accounts, but swamping low volume accounts may make things worse.
Re: Apply Bayes learning to all users?
On 12/16/11 05:53, RW wrote:
> I think you've both misread the question. The OP wants to use spamtrap mail to train the individual user Bayes accounts.
>
> The best way to do this would be to use the global database to adjust the probabilities for low count tokens in the user database. Nothing like that is supported.
>
> Doing it via sa-learn sounds like more trouble than it's worth. It's probably a good thing for high volume accounts, but swamping low volume accounts may make things worse.

Thanks RW, you understood the question correctly. I'll take a look at those suggestions.

Steve
Apply Bayes learning to all users?
Hi all, I have some spamtraps which get lots of spam. After a few precautions, I use sa-learn to train a single Bayes profile. This profile is used for many of my users. A significant amount of other users maintain their own Bayes profiles, and I'd like to make this training apply to their profiles as well. Is there an efficient way to do this? Repeatedly doing sa-learn for every user in my system doesn't seem like a good way to go about it. Thanks, Steve
Re: Apply Bayes learning to all users?
Create a shared IMAP or similar email account with a spam and a ham folder for users to drag email into (not forward, as that breaks headers in things like Outlook).

Then find one of the many Perl scripts lying about the net to grab this email and sa-learn it into the main Bayes db.

Martin

On Friday, 16 December 2011, Steve Freitas sfl...@ihonk.com wrote:
> Hi all, I have some spamtraps which get lots of spam. After a few precautions, I use sa-learn to train a single Bayes profile. This profile is used for many of my users. A significant amount of other users maintain their own Bayes profiles, and I'd like to make this training apply to their profiles as well. Is there an efficient way to do this? Repeatedly doing sa-learn for every user in my system doesn't seem like a good way to go about it.
Re: Apply Bayes learning to all users?
On Fri, 16 Dec 2011 06:30:31 +, Martin Hepworth wrote:
> Create a shared IMAP or similar email account with a spam and a ham folder for users to drag email into (not forward, as that breaks headers in things like Outlook)

Yes, here I found dovecot-antispam helpful: users just move spam into a spam folder, and dovecot-antispam learns it as spam. If mail is moved out of that folder it is learned as ham; if users delete it in the spam folder, nothing happens.

Neat solution IMHO, since it supports every client via the IMAP protocol. No bug, no problem, even for deprecated clients like Outlook Express :-)
Can bayes learning be turned on and off in one procmailrc
I've been thinking about using bayes in learning mode, but I want to do it without disturbing my current mail setup.

I thought I might (using procmail) channel a copy of all incoming mail through spamassassin with bayes learning turned on. I'd want bayes learning off in the main mail setup.

So I wondered if one could turn bayes learning on by way of a call to spamassassin in .procmailrc but have bayes learning turned off in a different call to spamassassin?

I'm thinking something along this line (but turning bayes learning off/on as needed):

.procmailrc:

  :0 c
  {
    :0fw  ## How to turn bayes learning on here?
    | /usr/bin/spamc

    :0:
    * ^X-Spam-Status: Yes
    tspama_spam.in

    :0  ## Here I can check results without involving main setup
    post_tspama_spam.in
  }

  [...]

Then after my other recipes the 2nd call

  :0fw  ## How to turn bayes learning off here?
  | /usr/bin/spamc

  :0:
  * ^X-Spam-Status: Yes
  spama_spam_.in

[end .procmailrc]

Is there some way at those calls to turn bayes learning off or on?
Re: is bayes learning?
On 18.02.10 09:56, tonjg wrote:
> well this has certainly thrown a spanner in the works and I don't know what to do next. I was under the impression that sa was scanning my mail and red flagging any spams, then mimedefang would kick in rejecting the email at smtp. I'm completely confused now

It's apparent that it is mimedefang that takes care of spam checking, and so it's mimedefang's business to take care of autolearn. Now, search the mimedefang FAQ, mailing list, forum or any similar place for autolearning info.

>> $ grep add_header 10_default_prefs.cf
> # grep add_header 10_default_prefs.cf
> grep: 10_default_prefs.cf: No such file or directory

First try switching to the directory where the rules files are stored!

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Depression is merely anger without enthusiasm.
Re: is bayes learning?
Jari Fredriksson wrote: That is not the recipe I meant. That calls SA yes, but does not reject. I can't provide a recipe for procmail as I personally use maildrop, but the recipe that is needed is one filing the spam to a spam folder (or /dev/null). the golden rule for my server is that spam is not diverted to any folder. Spam gets rejected at smtp. Is that what you meant by overkill? -- View this message in context: http://old.nabble.com/is-bayes-learning--tp27616380p27652339.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: is bayes learning?
On 19.2.2010 12:42, tonjg wrote: Jari Fredriksson wrote: That is not the recipe I meant. That calls SA yes, but does not reject. I can't provide a recipe for procmail as I personally use maildrop, but the recipe that is needed is one filing the spam to a spam folder (or /dev/null). the golden rule for my server is that spam is not diverted to any folder. Spam gets rejected at smtp. Is that what you meant by overkill? No, if you want it really rejected at smtp time, the solution is OK for it. My lighter approach would have been a spam folder for spam but it does not serve your purposes. -- http://www.iki.fi/jarif/ One of the most striking differences between a cat and a lie is that a cat has only nine lives. -- Mark Twain, Pudd'nhead Wilson's Calendar signature.asc Description: OpenPGP digital signature
Re: is bayes learning?
Matus UHLAR - fantomas wrote: you may have autolearn plugin not active. What does X-Spam-Status header in your mail say? On 17.02.10 05:48, tonjg wrote: it says: X-Spam-Score: 4.463 () BAYES_60,HTML_IMAGE_ONLY_24,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY X-Scanned-By: MIMEDefang 2.67 on 172.16.1.36 I don't know what BAYES_60 means. you seem to be running mimedefang which takes care about the e-mail. I have no idea how does mimedefang interact with spamassassin, but I think you should ask your question in mimedefang mailing list, or at least search the web for mimedefang and auto-learn. -- Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/ Warning: I wish NOT to receive e-mail advertising to this address. Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu. Remember half the people you know are below average.
Re: is bayes learning?
Matus UHLAR - fantomas wrote: you seem to be running mimedefang which takes care about the e-mail. I have no idea how does mimedefang interact with spamassassin, but I think you should ask your question in mimedefang mailing list, or at least search the web for mimedefang and auto-learn. thanks but I'm only using mimedefang to reject email recognised by spamassassin, I'm not using md to scan for spam. -- View this message in context: http://old.nabble.com/is-bayes-learning--tp27616380p27638511.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: is bayes learning?
On 18.2.2010 18:16, tonjg wrote:
> Matus UHLAR - fantomas wrote:
>> you seem to be running mimedefang which takes care about the e-mail. I have no idea how does mimedefang interact with spamassassin, but I think you should ask your question in mimedefang mailing list, or at least search the web for mimedefang and auto-learn.
> thanks but I'm only using mimedefang to reject email recognised by spamassassin, I'm not using md to scan for spam.

How does MimeDefang reject anything if it does not scan it? Your log header sample looked like it was scanned by MimeDefang. Probably MD calls SpamAssassin in its scan process, just like amavisd does.

Using a Perl package just to reject spam would be overkill. A simple procmail recipe would do it without any extra process.

--
http://www.iki.fi/jarif/

Unless hours were cups of sack, and minutes capons, and clocks the tongues of bawds, and dials the signs of leaping houses, and the blessed sun himself a fair, hot wench in flame-colored taffeta, I see no reason why thou shouldst be so superfluous to demand the time of the day. I wasted time and now doth time waste me. -- William Shakespeare

signature.asc Description: OpenPGP digital signature
Re: is bayes learning?
On Thu, 2010-02-18 at 08:16 -0800, an anonymous Nabble user wrote: Matus UHLAR wrote: you seem to be running mimedefang which takes care about the e-mail. I have no idea how does mimedefang interact with spamassassin, but I think you should ask your question in mimedefang mailing list, or at least search the web for mimedefang and auto-learn. thanks but I'm only using mimedefang to reject email recognised by spamassassin, I'm not using md to scan for spam. Let's have a look again at the headers you posted before. X-Spam-Score: 4.463 () BAYES_60,HTML_IMAGE_ONLY_24,HTML_MESSAGE, HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY X-Scanned-By: MIMEDefang 2.67 on 172.16.1.36 Are these not added by mimedefang? Specifically the first one. That's not a standard SA header. If it's NOT mimedefang, you changed the configuration. The default Status header includes auto-learning info *always*, whether it's enabled or not. $ grep add_header 10_default_prefs.cf Besides, SA does not allow removing the Checker-Version header. Thus, if the above are all X-Spam headers in your mail, it was not SA adding them. But some other tool in your mail processing chain. guenther -- char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: is bayes learning?
well this has certainly thrown a spanner in the works and I don't know what to do next. I was under the impression that sa was scanning my mail and red flagging any spams, then mimedefang would kick in rejecting the email at smtp. I'm completely confused now $ grep add_header 10_default_prefs.cf # grep add_header 10_default_prefs.cf grep: 10_default_prefs.cf: No such file or directory -- View this message in context: http://old.nabble.com/is-bayes-learning--tp27616380p27642949.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: is bayes learning?
Jari Fredriksson wrote: How does MimeDefang reject anything if it does not scan it? Your log header sample looked like it was scanned by MimeDefang. Propably MD calls SpamAssassin in it's scan process just like amavisd does. Using a perl package just to reject spam would be an overkill. A simple procmail recipe would do it without any extra process. but md is a mail filter designed to process mail, how is that an overkill? and where would one find a simple procmail recipe? -- View this message in context: http://old.nabble.com/is-bayes-learning--tp27616380p27642991.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: is bayes learning?
On Thu, 2010-02-18 at 09:56 -0800, an anonymous Nabble user wrote:
> well this has certainly thrown a spanner in the works and I don't know what to do next. I was under the impression that sa was scanning my mail and red flagging any spams, then mimedefang would kick in rejecting the email at smtp. I'm completely confused now

Well, yes -- according to the rules in your headers, SA is scanning the messages. However, SA does not talk SMTP itself, and thus needs some glue to be integrated. In your case, it is likely mimedefang which calls SA to scan the message.

IMHO, you should get an overview of the mail flow on your system first. Further debugging after that.

>> $ grep add_header 10_default_prefs.cf
> # grep add_header 10_default_prefs.cf
> grep: 10_default_prefs.cf: No such file or directory

Ahem. You do understand what that mysterious 'grep' command does, don't you? As you can easily deduce from the error message, the second argument is a file -- and it doesn't exist in the dir where you ran the command...

--
char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4; main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1: (c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: is bayes learning?
On Thu, 2010-02-18 at 09:59 -0800, tonjg wrote: Jari Fredriksson wrote: How does MimeDefang reject anything if it does not scan it? Your log header sample looked like it was scanned by MimeDefang. Propably MD calls SpamAssassin in it's scan process just like amavisd does. Using a perl package just to reject spam would be an overkill. A simple procmail recipe would do it without any extra process. but md is a mail filter designed to process mail, how is that an overkill? and where would one find a simple procmail recipe? This is what I use, may not be the greatest but it's been working for years: :0 fw : $ASSASSINLOCK * 50 | /usr/local/bin/spamc -f -- KeyID 0xE372A7DA98E6705C signature.asc Description: This is a digitally signed message part
Re: is bayes learning?
On 19.2.2010 1:48, Chris wrote: On Thu, 2010-02-18 at 09:59 -0800, tonjg wrote: Jari Fredriksson wrote: How does MimeDefang reject anything if it does not scan it? Your log header sample looked like it was scanned by MimeDefang. Propably MD calls SpamAssassin in it's scan process just like amavisd does. Using a perl package just to reject spam would be an overkill. A simple procmail recipe would do it without any extra process. but md is a mail filter designed to process mail, how is that an overkill? and where would one find a simple procmail recipe? This is what I use, may not be the greatest but it's been working for years: :0 fw : $ASSASSINLOCK * 50 | /usr/local/bin/spamc -f That is not the recipe I meant. That calls SA yes, but does not reject. I can't provide a recipe for procmail as I personally use maildrop, but the recipe that is needed is one filing the spam to a spam folder (or /dev/null). -- http://www.iki.fi/jarif/ Your sister swims out to meet troop ships. signature.asc Description: OpenPGP digital signature
Re: is bayes learning?
On 19.2.2010 1:48, Chris wrote: On Thu, 2010-02-18 at 09:59 -0800, tonjg wrote: Jari Fredriksson wrote: How does MimeDefang reject anything if it does not scan it? Your log header sample looked like it was scanned by MimeDefang. Propably MD calls SpamAssassin in it's scan process just like amavisd does. Using a perl package just to reject spam would be an overkill. A simple procmail recipe would do it without any extra process. but md is a mail filter designed to process mail, how is that an overkill? and where would one find a simple procmail recipe? This is what I use, may not be the greatest but it's been working for years: :0 fw : $ASSASSINLOCK * 50 | /usr/local/bin/spamc -f I wonder the lock file used in procmail scripts... I do not use one in my maildrop and I see no use to a lock file in when using spamc. -- http://www.iki.fi/jarif/ Your sister swims out to meet troop ships. signature.asc Description: OpenPGP digital signature
Re: is bayes learning?
On Fri, 19 Feb 2010, Jari Fredriksson wrote: On 19.2.2010 1:48, Chris wrote: On Thu, 2010-02-18 at 09:59 -0800, tonjg wrote: Jari Fredriksson wrote: How does MimeDefang reject anything if it does not scan it? Your log header sample looked like it was scanned by MimeDefang. Propably MD calls SpamAssassin in it's scan process just like amavisd does. Using a perl package just to reject spam would be an overkill. A simple procmail recipe would do it without any extra process. but md is a mail filter designed to process mail, how is that an overkill? and where would one find a simple procmail recipe? This is what I use, may not be the greatest but it's been working for years: :0 fw : $ASSASSINLOCK * 50 | /usr/local/bin/spamc -f That is not the recipe I meant. That calls SA yes, but does not reject. You _can't_ SMTP-time reject if you're using procmail, as procmail is a delivery agent. The message has already been accepted by the MTA by the time procmail sees it. Is that what you meant? I can't provide a recipe for procmail as I personally use maildrop, but the recipe that is needed is one filing the spam to a spam folder (or /dev/null). Take a look in http://www.impsec.org/~jhardin/antispam/ -- John Hardin KA7OHZhttp://www.impsec.org/~jhardin/ jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79 --- The first time I saw a bagpipe, I thought the player was torturing an octopus. I was amazed they could scream so loudly. -- cat_herder_5263 on Y! SCOX --- 4 days until George Washington's 278th Birthday
Re: is bayes learning?
On Tue, 2010-02-16 at 15:22 -0800, tonjg wrote: I've got a feeling that the spamassassin on my machine is improving in the way it recognises spam but I'd like to be sure it's not just my imagination. I did my first manual bayes learn about 2 weeks ago using 200 spams and 200 hams, the process appeared to go properly. I read that autolearn is enabled by default and kicks in after 200 emails learnt, but is there a way to tell whether bayes is actually learning? In addition to what the other respondents to this thread have said (sa-learn --dump magic) you should also bear in mind the fact that autolearn only works within set parameters. These are configurable, but I forget what the default is for the moment. What this means is, that if the threshold for autolearning spam is set at 12, spam that is correctly identified as such and scores about 6 - 11 points in SA will not be autolearned. By the same token there is a maximum threshold for autolearning ham. I believe this is done for safety to prevent learning FPs and FNs inappropriately. What this means is that you must still continue to train bayes manually with those mails close to the threshold. I have a nightly cron job set up to read all my verified mail from spam and ham folders and learn as ham or spam respectively. It doesn't matter if the mail has already been learned - sa-learn will work that out for itself. See man sa-learn for the most comprehensive help you will ever find in a man page! HTH
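The nightly job described above could look roughly like the sketch below. It is only an illustration: the folder paths in the usage note are hypothetical, and the SA_LEARN variable is an addition made here so the function can be dry-run with a stand-in command. The only sa-learn switches used are --spam and --ham, both of which appear elsewhere in this thread.

```shell
#!/bin/sh
# Sketch of a nightly retraining job, assuming one message per file
# (e.g. Maildir "cur" directories). SA_LEARN is a hypothetical override
# hook, not something sa-learn itself knows about.
SA_LEARN=${SA_LEARN:-sa-learn}

# learn_folders SPAM_DIR HAM_DIR
# Feeds verified spam and ham to the Bayes database. Re-learning mail
# that was already learned is harmless, as noted in the message above.
learn_folders() {
    spam_dir=$1
    ham_dir=$2
    "$SA_LEARN" --spam "$spam_dir" &&
    "$SA_LEARN" --ham "$ham_dir"
}
```

A crontab entry would then source this file and call something like `learn_folders "$HOME/Maildir/.Spam/cur" "$HOME/Maildir/.Ham/cur"`; both paths are placeholders for wherever the verified mail actually lives.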
Re: is bayes learning?
Mikael Syska wrote:
> [r...@freebsd /]# date -r 1266318121
> Tue Feb 16 12:02:01 CET 2010
>
> newest atime should tell you when it last learned from a message.

thanks for your response, I ran sa-learn --dump magic:

0.000          0          3          0  non-token data: bayes db version
0.000          0        234          0  non-token data: nspam
0.000          0        280          0  non-token data: nham
0.000          0      28982          0  non-token data: ntokens
0.000          0 1048982400          0  non-token data: oldest atime
0.000          0 1266390928          0  non-token data: newest atime
0.000          0 1266379330          0  non-token data: last journal sync atime
0.000          0 1264788275          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

but I don't get the same results as you. I get:

[r...@home admin]# date -r 1266390928
date: 1266390928: No such file or directory

--
View this message in context: http://old.nabble.com/is-bayes-learning--tp27616380p27622857.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: is bayes learning?
RW-15 wrote: On Wed, 17 Feb 2010 00:29:38 +0100 Mikael Syska mik...@syska.dk wrote: Watching nham, nspam counts is more meaningful. my nspam and nham counts look the same as they were two weeks ago without change, which makes me think that bayes isn't learning... -- View this message in context: http://old.nabble.com/is-bayes-learning--tp27616380p27622878.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: is bayes learning?
On Wed, 2010-02-17 at 04:16 -0800, tonjg wrote:
> Mikael Syska wrote:
>> [r...@freebsd /]# date -r 1266318121
>> Tue Feb 16 12:02:01 CET 2010
>>
>> newest atime should tell you when it last learned from a message.
>
> thanks for your response, I ran sa-learn --dump magic:
>
> 0.000          0          3          0  non-token data: bayes db version
> 0.000          0        234          0  non-token data: nspam
> 0.000          0        280          0  non-token data: nham
> 0.000          0      28982          0  non-token data: ntokens
> 0.000          0 1048982400          0  non-token data: oldest atime
> 0.000          0 1266390928          0  non-token data: newest atime
> 0.000          0 1266379330          0  non-token data: last journal sync atime
> 0.000          0 1264788275          0  non-token data: last expiry atime
> 0.000          0          0          0  non-token data: last expire atime delta
> 0.000          0          0          0  non-token data: last expire reduction count
>
> but I don't get the same results as you. I get:
>
> [r...@home admin]# date -r 1266390928
> date: 1266390928: No such file or directory

Try

# date -d @1266390928

or go to http://www.epochconverter.com/
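The two date invocations in this exchange differ because BSD date takes an epoch value with -r, while GNU date treats the argument to -r as a file name and wants -d @SECONDS instead. A small wrapper can hide the difference. This is a minimal sketch; the function name epoch_to_utc is made up for illustration, and it prints UTC in a fixed format so the result does not depend on the local time zone or locale.

```shell
#!/bin/sh
# epoch_to_utc SECONDS
# Prints an epoch timestamp as "YYYY-MM-DD HH:MM:SS" in UTC.
# Tries the GNU syntax first and falls back to the BSD syntax.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S' 2>/dev/null ||
    date -u -r "$1" '+%Y-%m-%d %H:%M:%S'
}
```

For the newest atime quoted above, `epoch_to_utc 1266390928` prints 2010-02-17 07:15:28.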
Re: is bayes learning?
RW-15 wrote: On Wed, 17 Feb 2010 00:29:38 +0100 Mikael Syska mik...@syska.dk wrote: Watching nham, nspam counts is more meaningful. On 17.02.10 04:18, tonjg wrote: my nspam and nham counts look the same as they were two weeks ago without change, which makes me think that bayes isn't learning... you may have autolearn plugin not active. What does X-Spam-Status header in your mail say? -- Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/ Warning: I wish NOT to receive e-mail advertising to this address. Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu. I just got lost in thought. It was unfamiliar territory.
Re: is bayes learning?
Arthur Dent-6 wrote: Try # date -d @1266390928 ah yes thanks Arthur that worked: [r...@home admin]# date -d @1266390928 Wed Feb 17 07:15:28 GMT 2010 [r...@home admin]# -- View this message in context: http://old.nabble.com/is-bayes-learning--tp27616380p27623785.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: is bayes learning?
Matus UHLAR - fantomas wrote: you may have autolearn plugin not active. What does X-Spam-Status header in your mail say? it says: X-Spam-Score: 4.463 () BAYES_60,HTML_IMAGE_ONLY_24,HTML_MESSAGE,HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY X-Scanned-By: MIMEDefang 2.67 on 172.16.1.36 I don't know what BAYES_60 means. -- View this message in context: http://old.nabble.com/is-bayes-learning--tp27616380p27623876.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
is bayes learning?
I've got a feeling that the spamassassin on my machine is improving in the way it recognises spam but I'd like to be sure it's not just my imagination. I did my first manual bayes learn about 2 weeks ago using 200 spams and 200 hams, the process appeared to go properly. I read that autolearn is enabled by default and kicks in after 200 emails learnt, but is there a way to tell whether bayes is actually learning? -- View this message in context: http://old.nabble.com/is-bayes-learning--tp27616380p27616380.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: is bayes learning?
Hi,

[r...@freebsd ]# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0          0          0  non-token data: nspam
0.000          0         22          0  non-token data: nham
0.000          0        793          0  non-token data: ntokens
0.000          0 1266272147          0  non-token data: oldest atime
0.000          0 1266318121          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

[r...@freebsd /]# date -r 1266318121
Tue Feb 16 12:02:01 CET 2010

newest atime should tell you when it last learned from a message.

Yes, my system is new and not yet using bayes ...

Regards

On Wed, Feb 17, 2010 at 12:22 AM, tonjg t...@freeuk.com wrote:
> I've got a feeling that the spamassassin on my machine is improving in the way it recognises spam but I'd like to be sure it's not just my imagination. I did my first manual bayes learn about 2 weeks ago using 200 spams and 200 hams, the process appeared to go properly. I read that autolearn is enabled by default and kicks in after 200 emails learnt, but is there a way to tell whether bayes is actually learning?
> --
> View this message in context: http://old.nabble.com/is-bayes-learning--tp27616380p27616380.html
> Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: is bayes learning?
On Wed, 17 Feb 2010 00:29:38 +0100 Mikael Syska mik...@syska.dk wrote: newsest atime should tell you when it last learned from a message. Token atimes get updated when you scan a mail. Watching nham, nspam counts is more meaningful.
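Since the message counts are the more meaningful signal, it can help to pull just those two numbers out of the dump and compare them from day to day. The sketch below assumes the column layout shown in the dumps posted in this thread (count in the third field, label at the end of the line); the function name bayes_counts is invented for the example.

```shell
#!/bin/sh
# bayes_counts
# Reads "sa-learn --dump magic" output on stdin and prints only the
# learned-message counters, one per line, as name=value.
bayes_counts() {
    awk '$NF == "nspam" || $NF == "nham" { print $NF "=" $3 }'
}
```

Typical use would be `sa-learn --dump magic | bayes_counts`, run now and again after some mail has arrived. If neither number moves, nothing is being learned, whatever the atime values say.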
Re: is bayes learning?
On Tue, 2010-02-16 at 15:22 -0800, tonjg wrote:
> I've got a feeling that the spamassassin on my machine is improving in the way it recognises spam but I'd like to be sure it's not just my imagination. I did my first manual bayes learn about 2 weeks ago using 200 spams and 200 hams, the process appeared to go properly. I read that autolearn is enabled by default and kicks in after 200 emails learnt, but is there a way to tell whether bayes is actually learning?

Look at X-Spam-Status message headers:

X-Spam-Status: No, score=1.2 required=6.0 tests=BAYES_00,HELO_LOCALHOST,
    RCVD_IN_BSP_OTHER autolearn=ham version=3.2.5

or scan /var/log/maillog for spamd messages that report the results for each message:

Feb 13 04:51:07 zoogz spamd[8924]: spamd: result: Y 15 - BAYES_80,
    EMPTY_MESSAGE,HELO_LOCALHOST,MG_IMAGEATT,MG_IMAGESUS,MG_JPEG,MG_VIAUKFSN,
    MISSING_SUBJECT,RCVD_IN_BL_SPAMCOP_NET,SHORT_HELO_AND_INLINE_IMAGE,TVD_SPACE_RATIO
    scantime=2.0,size=17758,user=getmail,uid=522,required_score=6.0,
    rhost=localhost.localdomain,raddr=127.0.0.1,rport=41130,
    mid=20100213044404.5698549saliva...@zavodzpr-sa.ba,bayes=0.873808,autolearn=spam

In both places the autolearn clause tells you what, if any, learning was done from the message. The possible answers are ham, spam or no. The latter applies to messages with scores that are fairly close to zero and so were not automatically learned.

Martin
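To get an overall picture rather than checking one message at a time, the autolearn clauses can be tallied across a whole log or a folder of saved messages. This is a rough sketch that relies only on the autolearn=VALUE text shown in the examples above; the function name count_autolearn is made up, and it counts at most one occurrence per input line.

```shell
#!/bin/sh
# count_autolearn
# Reads log lines or message headers on stdin and prints how many
# times each autolearn result occurred, as "value count", sorted.
count_autolearn() {
    awk '
        match($0, /autolearn=[a-z]+/) {
            # "autolearn=" is 10 characters; keep only the value.
            seen[substr($0, RSTART + 10, RLENGTH - 10)]++
        }
        END {
            for (v in seen) print v, seen[v]
        }' | sort
}
```

For example, `count_autolearn < /var/log/maillog` summarises the spamd result lines. A log showing nothing but "no" would suggest that scores rarely reach the autolearn thresholds.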
Re: bayes learning '0 messages found'
John Hardin wrote:
> On Sat, 13 Feb 2010, smfabac wrote:
>> Is there a message size limit for sa-learn?
> Yes, there is, and sadly sa-learn does not explicitly tell you a message has been skipped because it's too large. If there's a non-text attachment try deleting it and re-learning the message.
> --
> John Hardin KA7OHZ http://www.impsec.org/~jhardin/
> jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
> key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
> ---
> End users want eye candy and the ooo's and hhh's experience when reading mail. To them email isn't a tool, but an entertainment form. -- Steve Lake
> ---
> 9 days until George Washington's 278th Birthday

Ok. It's a size problem: I edited the notspam message and deleted 1000 lines from line 3000 to 4000, saved the file and then reprocessed notspam. I continued getting 0 messages examined until I had deleted 3000 lines of the message.

Message size as received:

$ wc -l notspam
6408 notspam     -- sa-learn --ham failed on notspam folder with one message of 6000+ lines
$

After deleting 3003 lines:

$ wc -l notspam
3405 notspam
$ vi notspam
   1 ^A^A^A^A
   2 From smf Thu Feb 11 01:30:02 2010
   3 From: Boyd Lynn Gerber gerb...@zenez.com
   4 To: distribut...@registry.ca
   5 Subject: Quarterly ASCII posting of SCO UnixWare 7/OpenUNIX 8/OpenServer6 FAQ
   6 Date: Thu, 11 Feb 2010 00:05:18 -0700 (MST)
   7 Message-Id: ou8faqqt_1265871...@news.xmission.com

3395
3396 filepriv -f setuid programfile.exe
3397
3398 --
3399 Boyd Gerber gerb...@zenez.com 801 849-0213
3400 ZENEZ 1042 East Fort Union #135, Midvale Utah 84047
3401
3402
3403 =_4B73B21B.8398EDEC--
3404
3405 ^A^A^A^A

$ sa-learn --showdots --ham --mbox notspam
.
Learned tokens from 1 message(s) (1 message(s) examined)
$
$ wc notspam
lines: 3405 words: 18735 characters: 130876 notspam

So, does the documentation on sa-learn indicate that there is a size limit on the message to be processed?

--
View this message in context: http://old.nabble.com/bayes-learning-%270-messages-found%27-tp27358517p27590620.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
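Because sa-learn skips an oversized message without saying so, it may be worth listing the large files in a training folder before learning from it. The sketch below is only that, a pre-check: it assumes one message per file, the function name list_oversized is invented, and the byte limit is left to the caller because this thread never establishes the actual cutoff. The only data points given are that a 130876-character file was learned and a 249046-byte file was not.

```shell
#!/bin/sh
# list_oversized DIR LIMIT_BYTES
# Prints every regular file under DIR that is larger than LIMIT_BYTES,
# so that messages likely to be skipped can be inspected by hand.
list_oversized() {
    dir=$1
    limit=$2
    find "$dir" -type f -size +"${limit}c"
}
```

Running `list_oversized ham 130876` would flag anything bigger than the one size that is known, from the experiment above, to have been accepted.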
Re: bayes learning '0 messages found'
Smfabac wrote on Mon, 15 Feb 2010 00:20:06 -0800 (PST): So, does the documentation on sa-learn indicate that there is a size limit on the message to be processed? Why not check yourself? Kai -- Get your web at Conactive Internet Services: http://www.conactive.com
Re: bayes learning '0 messages found'
Kai Schaetzl wrote: Smfabac wrote on Mon, 15 Feb 2010 00:20:06 -0800 (PST): So, does the documentation on sa-learn indicate that there is a size limit on the message to be processed? Why not check yourself? Kai -- Get your web at Conactive Internet Services: http://www.conactive.com Thanks for your help Kai. After checking http://spamassassin.apache.org/full/3.0.x/dist/doc/sa-learn.html I see that there is no official answer to the question. what is the message size limit where sa-learn fails. The question So, does the documentation on sa-learn indicate that there is a size limit on the messages to be processed? is a veiled request to the SA developers/maintainers that people may be interested in that information. -- View this message in context: http://old.nabble.com/bayes-learning-%270-messages-found%27-tp27358517p27595445.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: bayes learning '0 messages found'
Smfabac wrote on Mon, 15 Feb 2010 07:27:19 -0800 (PST): The question So, does the documentation on sa-learn indicate that there is a size limit on the messages to be processed? is a veiled request to the SA developers/maintainers that people may be interested in that information. If you want to ask for better documentation of this for instance in the man file or even an option to override the default size limit you should ask on https://issues.apache.org/SpamAssassin/ Kai -- Get your web at Conactive Internet Services: http://www.conactive.com
Re: bayes learning '0 messages found'
On Mon, 2010-02-15 at 07:27 -0800, smfabac wrote:
> I see that there is no official answer to the question. what is the message size limit where sa-learn fails.

If you use something like spamc rather than sa-learn you gain some flexibility, because of the places and hosts where you can run spamc, plus you get the ability to set the max message size yourself. Here's an extreme example:

    for f in spam/*
    do
        # third field of wc output is the byte count
        l=$(wc "$f" | gawk '{ print $3 }')
        # spamc reads the message on standard input
        spamc --learntype=spam --max-size=$l < "$f"
    done

where the limit is set to the size of each spam message in turn.

Martin
Re: bayes learning '0 messages found'
RW-15 wrote:
> On Fri, 12 Feb 2010 17:51:12 + RW rwmailli...@googlemail.com wrote:
>> On Fri, 12 Feb 2010 09:17:54 -0800 (PST) smfabac smfa...@att.net wrote:
>>> Mark, On UNIX any file is a mbox file if it contains mail messages in the form:
>>> ^A^A^A^A
>>> mail headers
>>> mail body
>>> ^A^A^A^A
>>> ^A^A^A^A
>>> Next Message mail headers
>>> mail body
>>> ^A^A^A^A
>> I don't know what that is, but it's not a standard mbox format. In mbox format the emails all start with a blank line and a From.
> It appears to be mmdf format http://www.washington.edu/imap/documentation/formats.txt.html

Ok, now that we're all on the same page: how do I find out why sa-learn is not processing the legal not-spam file? To recap, sa-learn --spam --mbox isspam works but sa-learn --ham --mbox not-spam is not working. The sa-learn --dump magic shows that messages have been added by the sa-learn command:

$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      12551          0  non-token data: nspam
0.000          0      68020          0  non-token data: nham
0.000          0     143948          0  non-token data: ntokens
0.000          0 1260104403          0  non-token data: oldest atime
0.000          0 1266048014          0  non-token data: newest atime
0.000          0 1266049794          0  non-token data: last journal sync atime
0.000          0 1265630710          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0      19095          0  non-token data: last expire reduction count
$ sa-learn --spam --mbox isspam
Learned tokens from 1 message(s) (1 message(s) examined)
$
$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      12552          0  non-token data: nspam
0.000          0      68020          0  non-token data: nham
0.000          0     144608          0  non-token data: ntokens
0.000          0 1260104403          0  non-token data: oldest atime
0.000          0 1266048014          0  non-token data: newest atime
0.000          0 1266049794          0  non-token data: last journal sync atime
0.000          0 1265630710          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0      19095          0  non-token data: last expire reduction count
$

As you can see the nspam has incremented by 1.

$ sa-learn --ham --mbox not-spam
Learned tokens from 0 message(s) (0 message(s) examined)
$

Read Create Save Delete Undelete Print Folder Options Quit
Set mail options and preferences
Folder: not-spam                    Saturday February 13, 2010 2:34
--
[1] Message 1 gerb...@zenez.co 11 Feb 10 6404 Quarterly ASCII posting of SCO Uni

Is there a message size limit for sa-learn? The message in not-spam is plain ascii, no html.

$ wc -l not-spam
6408 not-spam     -- sa-learn --ham failed on not-spam folder with one message
$
$ wc -l isspam
1039 isspam       -- sa-learn --spam worked on isspam folder with one message
$

--
View this message in context: http://old.nabble.com/bayes-learning-%270-messages-found%27-tp27358517p27573012.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: bayes learning '0 messages found'
On Sat, 13 Feb 2010, smfabac wrote: Now that we're all on the same page. How do I find out why sa-learn is not processing the legal not-spam file? To re-cap, sa-learn --spam --mbox isspam works but sa-learn --ham --mbox not-spam is not working. Well, I would expect if this suggestion were right you would have had all sorts of warning messages about syntax, but just in case Maybe linux is interpreting the dash in the filename as a switch indicator? Try enclosing the file name in single quotes or use a filename without a dash... - C
Re: bayes learning '0 messages found'
Charles Gregory wrote:
> Maybe linux is interpreting the dash in the filename as a switch indicator? Try enclosing the file name in single quotes or use a filename without a dash...

$ ls -lt | head -3
total 15868
-rw-------   1 smf   group   249046 Feb 13 02:37 not-spam
-rw-rw-rw-   1 smf   group    94762 Feb 13 02:29 isspam
$ mv not-spam notspam
$ ls -lt | head -3
total 15868
-rw-------   1 smf   group   249046 Feb 13 02:37 notspam
-rw-rw-rw-   1 smf   group    94762 Feb 13 02:29 isspam
$ sa-learn --showdots --ham --mbox notspam
Learned tokens from 0 message(s) (0 message(s) examined)
$

On the off chance that the permissions on the file are an issue:

$ chmod 666 notspam
$ ls -lt | head -3
total 15868
-rw-rw-rw-   1 smf   group   249046 Feb 13 02:37 notspam
-rw-rw-rw-   1 smf   group    94762 Feb 13 02:29 isspam
$ sa-learn --showdots --ham --mbox notspam
Learned tokens from 0 message(s) (0 message(s) examined)

Still no luck.
Re: bayes learning '0 messages found'
On 12.02.10 09:17, smfabac wrote:
> On UNIX any file is a mbox file if it contains mail messages in the form:
>
> ^A^A^A^A
> mail headers
> mail body
> ^A^A^A^A
> ^A^A^A^A
> Next Message mail headers
> mail body
> ^A^A^A^A

That is mmdf, not mbox.

> And my not-spam file meets this requirement: ^A^A^A^A

sa-learn apparently does not support mmdf. When sa-learn does not recognize the format of a file, it does not learn from it.

> Also, reading the file with the command mail -f not-spam launches the UNIX mail reader showing that the file is legal mbox file.

Your mail command supports mmdf. Save the message in mbox format (saving it to a single file without the ^A's could work) and run sa-learn on that.

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
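Editor's sketch of the suggestion above (saving the message without the ^A's): the shell function below strips every Ctrl-A (octal 001) byte from an MMDF-style file. The name mmdf_to_mbox and the output file name are made up for this illustration. It assumes each message already begins with a "From " line, as the not-spam file quoted in this thread does, and it would also remove Ctrl-A bytes inside message bodies, so it is a rough workaround rather than a general converter.

```shell
# Hypothetical helper: copy an MMDF-style file ($1) to a new file ($2)
# with every Ctrl-A (octal 001) byte removed. The existing "From " lines
# are left in place to act as the mbox message separators.
mmdf_to_mbox() {
    tr -d '\001' < "$1" > "$2"
}
```

Possible use, with the file names from this thread: mmdf_to_mbox not-spam notspam.mbox, followed by sa-learn --showdots --ham --mbox notspam.mbox.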
Re: bayes learning '0 messages found'
On Sat, 13 Feb 2010, smfabac wrote:
> Is there a message size limit for sa-learn?

Yes, there is, and sadly sa-learn does not explicitly tell you when a message has been skipped because it is too large. If there's a non-text attachment, try deleting it and re-learning the message.

-- 
John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
Re: bayes learning '0 messages found'
On Sat, 13 Feb 2010, smfabac wrote:
> $ sa-learn --showdots --ham --mbox notspam
> Learned tokens from 0 message(s) (0 message(s) examined)
> Still no luck.

Are we sure the notspam file is clean? Try trimming it down to just one or two messages, and see how it goes.

- C
Re: bayes learning '0 messages found'
tonjg wrote:
> raq550 server
> OS: strongbolt2
> spamassassin.i386 0:3.2.5-1.el4
>
> I'm trying to run:
> sa-learn --spam --showdots --dir /path/to...mbox
> but it fails with:
> 'Learned tokens from 0 message(s) (0 messages examined)'
>
> my spam mail is in a file called mbox but when I run the above command to the directory containing mbox it always fails with the '0 messages examined' error. I've also tried copying the mbox file to another location, removing all the restrictions on it but I still get '0 messages learned'. I know the sa-learn command is working properly because I previously pointed it to a wrong location and it picked up 3 tokens but it won't pick up anything from the mbox file. I've even tried renaming the (copied) mbox file and restarting spamassassin but no joy. The mbox file contains about 200 spam mails and is 3.5Mb. Thanks for any help.

I am having a similar problem to the poster's, but I have run spamassassin successfully for several years. Today, when I used the sa-learn command to process the mailbox where I moved the mis-classified mail message (not-spam), I got:

$ sa-learn --showdots --ham --mbox not-spam
Learned tokens from 0 message(s) (0 message(s) examined)
$

Check the mail folder not-spam:

$ mail -f not-spam
SCO OpenServer Mail Release 5.0.7 Type ? for help.
not-spam: 1 message
 1 gerb...@zenez.co Thu Feb 11 01:30 6405/248986 Quarterly ASCII posting of

And reading the message:

Message 1:
From smf Thu Feb 11 01:30:02 2010
From: Boyd Lynn Gerber gerb...@zenez.com
To: distribut...@registry.ca
Subject: Quarterly ASCII posting of SCO UnixWare 7/OpenUNIX 8/OpenServer 6 FAQ
Date: Thu, 11 Feb 2010 00:05:18 -0700 (MST)
Message-Id: ou8faqqt_1265871...@news.xmission.com
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on unix.smfabac.com
X-Spam-Level: ***
X-Spam-Status: Yes, score=3.4 required=3.0 tests=HEADER_SPAM autolearn=unavailable version=3.2.5
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary=--=_4B73B21B.8398EDEC
Status: RO

This is a multi-part message in MIME format.

=_4B73B21B.8398EDEC
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

Spam detection software, running on the system unix.smfabac.com, has

And sa-learn --dump magic shows:

$ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      12551          0  non-token data: nspam
0.000          0      67987          0  non-token data: nham
0.000          0     143194          0  non-token data: ntokens
0.000          0 1260104403          0  non-token data: oldest atime
0.000          0 1265990403          0  non-token data: newest atime
0.000          0 1265991303          0  non-token data: last journal sync atime
0.000          0 1265630710          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0      19095          0  non-token data: last expire reduction count
$

I have successfully run sa-learn --ham --mbox not-spam in the past, so why is it failing me now? How do I determine why the message is not being processed by sa-learn?
Re: bayes learning '0 messages found'
tonjg wrote:
> I'm trying to run:
> sa-learn --spam --showdots --dir /path/to...mbox
> but it fails with:
> 'Learned tokens from 0 message(s) (0 messages examined)'
> my spam mail is in a file called mbox but when I run the above command to the directory containing mbox it always fails with the '0 messages examined' error.

If your messages are in a mbox *file*, you need the option --mbox, not --dir.

smfabac wrote:
> I am having a similar problem as the poster but I have successfully run spamassassin for several years and today when I used the sa-learn command to process the mailbox where I moved the mis-classified mail message (not-spam) I get:
> $ sa-learn --showdots --ham --mbox not-spam
> Learned tokens from 0 message(s) (0 message(s) examined)
> Check the mail folder not-spam:

If not-spam is a folder (not a mbox file), you must not use the option --mbox.

Mark
Re: bayes learning '0 messages found'
Mark Martinec wrote:
> If not-spam is a folder (not a mbox file), you must not use the option --mbox.

Mark,

On UNIX any file is a mbox file if it contains mail messages in the form:

^A^A^A^A
mail headers
mail body
^A^A^A^A
^A^A^A^A
Next Message mail headers
mail body
^A^A^A^A

And my not-spam file meets this requirement:

^A^A^A^A
From smf Thu Feb 11 01:30:02 2010
From: Boyd Lynn Gerber gerb...@zenez.com
To: distribut...@registry.ca
... stuff deleted ...
=_4B73B21B.8398EDEC--
^A^A^A^A

Also, reading the file with the command mail -f not-spam launches the UNIX mail reader, showing that the file is a legal mbox file.
Re: bayes learning '0 messages found'
On Fri, 12 Feb 2010 09:17:54 -0800 (PST) smfabac smfa...@att.net wrote:
> Mark, On UNIX any file is a mbox file if it contains mail messages in the form:
>
> ^A^A^A^A
> mail headers
> mail body
> ^A^A^A^A
> ^A^A^A^A
> Next Message mail headers
> mail body
> ^A^A^A^A

I don't know what that is, but it's not a standard mbox format. In mbox format the emails all start with a blank line and a From.
Re: bayes learning '0 messages found'
On Fri, 12 Feb 2010 17:51:12 RW rwmailli...@googlemail.com wrote:
> I don't know what that is, but it's not a standard mbox format. In mbox format the emails all start with a blank line and a From.

It appears to be mmdf format: http://www.washington.edu/imap/documentation/formats.txt.html
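Editor's sketch: the distinction drawn in the messages above (an mbox message opens with a "From " line, while the file in this thread opens with four Ctrl-A bytes) can be checked from the shell before running sa-learn. The function name mailbox_format is made up for this illustration, and it only looks at the first line of the file, so it is a quick heuristic and not a full format validator.

```shell
# Hypothetical helper: guess a mailbox file's format from its first line.
# Prints "mbox" for a leading "From " line, "mmdf" for a leading run of
# four Ctrl-A (octal 001) bytes, and "unknown" for anything else.
mailbox_format() {
    first_line=$(head -n 1 "$1")
    ctrl_a=$(printf '\001\001\001\001')
    case "$first_line" in
        "From "*)   echo mbox ;;
        "$ctrl_a"*) echo mmdf ;;
        *)          echo unknown ;;
    esac
}
```

For example, mailbox_format not-spam would be expected to print mmdf for the file described in this thread.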
Re: bayes learning '0 messages found'
On Thursday 28 January 2010 17:16:04 tonjg wrote:
> spamassassin.i386 0:3.2.5-1.el4
> I'm trying to run:
> sa-learn --spam --showdots --dir /path/to...mbox
> but it fails with:
> 'Learned tokens from 0 message(s) (0 messages examined)'
> my spam mail is in a file called mbox but when I run the above command to the directory containing mbox it always fails with the '0 messages examined' error.

If the argument is a single mbox file, precede it with the --mbox option, not with --dir.

Mark
Re: bayes learning '0 messages found'
It's okay, I found the solution at: http://spamassassin.apache.org/full/3.1.x/doc/sa-learn.html

The command needed --mbox to be included. I added this and the learning worked.
Re: bayes learning '0 messages found'
Mark Martinec wrote:
> If the argument is a single mbox file, precede it with a --mbox option, not with --dir .

Thanks for your response, but I've got a further problem now (I think). I'm trying to do the same thing with the ham command:

# sa-learn --showdots --mbox --ham

but nothing's happening. When I ran the spam command it showed a progression of dots and ended with a confirmation message of tokens found and 216 emails scanned. With the ham command nothing happens: the cursor just dropped to the next line and it's been there for half an hour now. Is this normal?
Re: bayes learning '0 messages found'
If what you presented in your message is actually the command you used, then it might be waiting for input from the keyboard. You don't show the file you want it to use after the '--mbox' option; you have --ham in that position on the line. I have not done any testing, so I can't say exactly how it would behave in that situation.

tonjg t...@freeuk.com 01/28/10 2:02 PM wrote:
> I'm trying to do the same thing with the ham command
> # sa-learn --showdots --mbox --ham
> but nothing's happening. When I did the spam command it showed a progression of dots and ended with a confirmation message of tokens found and 216 emails scanned. But with the ham command there's nothing happening - the cursor just dropped to the next line and it's been there for half an hour now. Is this normal?
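Editor's sketch: one way to avoid the situation described above is to refuse to call sa-learn unless an existing mbox file is named on the command line. The wrapper name learn_ham_mbox and its usage message are made up for this illustration; the sa-learn options are the ones already used in this thread.

```shell
# Hypothetical wrapper: learn ham from exactly one mbox file.
# Returns 2 without calling sa-learn when the file argument is missing
# or does not name a regular file.
learn_ham_mbox() {
    if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
        echo "usage: learn_ham_mbox MBOX_FILE" >&2
        return 2
    fi
    sa-learn --showdots --ham --mbox "$1"
}
```

Called as learn_ham_mbox /path/to/ham-mbox (a placeholder path), it runs the same command as in the thread with the file in the final position.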
Re: Bayes learning trusted networks mailing list email
On Fri, 05 Jun 2009 10:24:31 -0400 Micah Anderson mi...@riseup.net wrote:
> If I understand things properly, because I've got these setup in my trusted_networks, then these previous hops will be checked in RBLs, so the spam is more detectable.

That doesn't really help. If you think about it, tests that run on untrusted headers will run whether or not you put the list servers into your trusted network. The tests that run on the trusted boundary are whitelisting rules (plus a few rules that will soon get moved to the internal boundary). You might get some benefit from putting the list servers into the internal network, but the chances are that the list is already blocking on zen, and maybe DUL lists and SPF.

> What I am unsure of is if I am poisoning my bayes by reporting these messages that make it through as spam. Should I be just deleting them? The tokens that are legitimate that will end up as collateral damage are going to be the list footers, the list administration messages, and potentially other pieces. I'm hoping I can identify why my bayes database is so bad (it thinks everything is BAYES_00 now), and if this is why I will want to change my training behavior.

It's really hard for BAYES to work on in-list spams because they contain so many strong ham tokens. What I would suggest is to use a separate address and Bayes database for the lists and train it on all spam, but only learn ham that doesn't hit BAYES_00. I use sieve to select some in-list candidates for learning (with dspam rather than SA). You might also configure BAYES to ignore some of the list headers.

Things like challenge-response messages and out-of-office replies are best handled with simple filtering or custom SA tests.
Re: Bayes Learning with Analysis Attached
On Tue, Apr 29, 2008 at 11:08:22AM -0700, Matt Florido wrote:
> feature. However, I'm wondering if this impacts sa-learn? Can I simply run sa-learn on mails that have the analysis attached? I also noticed

Yes. sa-learn removes markup before doing the processing.

> I'm not seeing Bayes participating in the scoring. Is this because it's new and my Bayes db hasn't been fully trained?

Yes. You need 200 each ham and spam.

> Also, is adding additional rulesets from rulesemporium.com still necessary for added value? And if so, do I just add them to my /etc/spamassassin directory?

First, use sa-update and get the SA updated rules. Then, if you want to add third-party rulesets, you could also look at using sa-update for that. There are docs on the wiki, or you can just search the list archives.
Re: Bayes Learning with Analysis Attached
Theo Van Dinter wrote:
> Matt Florido wrote:
>> I'm not seeing Bayes participating in the scoring. Is this because it's new and my Bayes db hasn't been fully trained?
> Yes. You need 200 each ham and spam.

You can use sa-learn to dump the database stats and see how many of each have been learned, along with other information:

sa-learn --dump magic

Bob
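Editor's sketch: the two counters that matter for the "200 each" rule can be pulled out of the dump with awk. The function name bayes_counts is made up for this illustration. It assumes the dump layout that sa-learn --dump magic commonly prints, with the count in the third whitespace-separated column and the label ("non-token data: nspam" or "non-token data: nham") at the end of the line.

```shell
# Hypothetical helper: read "sa-learn --dump magic" output on stdin,
# print the learned spam and ham counts, and exit 0 only when both
# have reached the 200-message minimum mentioned above.
bayes_counts() {
    awk '
        /non-token data: nspam$/ { spam = $3 }
        /non-token data: nham$/  { ham  = $3 }
        END {
            printf "nspam=%d nham=%d\n", spam, ham
            exit !(spam >= 200 && ham >= 200)
        }
    '
}
```

Typical use would be sa-learn --dump magic | bayes_counts, checking the exit status to see whether Bayes has enough training data to start scoring.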
Re: Bayes Learning with Analysis Attached
Bob wrote:
> You can use sa-learn to dump the database stats and see how many of each have been learned and other information.
> sa-learn --dump magic

I wonder why it is called magic. "dump statistics" would be much better. Dumping numbers from a database is not rocket science, nor magic...
Re: Bayes Learning with Analysis Attached
On Wed, Apr 30, 2008 at 03:23:38AM +0300, Jari Fredriksson wrote:
> I wonder why it is called magic.

Because the data that is being dumped is from the metadata in the DB, which we store using magic tokens, since they're tokens that can't possibly exist in the DB through normal means.
Re: spamc/spamd bayes learning question
On Saturday 24 March 2007 23:04, Marc Perkel wrote:
> The learn-spam script looks like this:
>
> /usr/bin/spamc -d euclid.ctyme.com -x -t 15 -L spam > /dev/null 2> /dev/null
> /bin/echo > /dev/null
>
> The echo command is just there so it returns a 0 and exim doesn't complain. Probably a better way to do that.

It's common to put || true at the end of a command whose exit status you don't care about. Or you could just exit 0.

-- 
Magnus Holmgren (No Cc of list mail needed, thanks)
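Editor's sketch of the "|| true" suggestion above, applied to the learn-spam script quoted in this thread. The function name learn_spam and the SPAMC override variable are made up for this illustration (the variable only exists so the command path can be swapped out); the spamc options and host name are copied from the original script. The message to learn is expected on standard input, as with the exim pipe transport.

```shell
# Hypothetical rewrite of the learn-spam script as a function.
# "|| true" makes the exit status 0 whether or not spamc succeeds,
# which replaces the trailing /bin/echo workaround.
learn_spam() {
    "${SPAMC:-/usr/bin/spamc}" -d euclid.ctyme.com -x -t 15 -L spam \
        >/dev/null 2>&1 || true
}
```

The trade-off is the one implied in the thread: the caller never sees a failure, so a broken connection to spamd goes unnoticed unless it is logged somewhere else.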
Re: spamc/spamd bayes learning question
Marc Perkel wrote:
> Trying to set up spamc/spamd learning. Have a dedicated spamd server that is fed from several MTA machines running exim.
> [...]
> But - over on the spamd server side I'm getting:
>
> Mar 24 15:01:30 euclid spamd[2870]: spamd: Tell: Setting local for mail:11 in 0.1 seconds, 1512 bytes
> Mar 24 15:01:30 euclid spamd[5417]: spamd: Tell: Did nothing for mail:11 in 0.1 seconds, 13139 bytes
>
> The Did Nothing doesn't look good. I'm doing something wrong? is it the user? Should I try to force it to be root? Is it permissions? Trying to feed mysql bayes.

I'm certainly no expert at this config, but did you start spamd with the -l (aka --allow-tell) option?
spamc/spamd bayes learning question
Trying to set up spamc/spamd learning. I have a dedicated spamd server that is fed from several MTA machines running exim. On the exim side I'm piping messages into spamc as follows:

unseen pipe /etc/exim/scripts/learn-spam

The learn-spam script looks like this:

/usr/bin/spamc -d euclid.ctyme.com -x -t 15 -L spam > /dev/null 2> /dev/null
/bin/echo > /dev/null

The echo command is just there so it returns a 0 and exim doesn't complain. There is probably a better way to do that.

But over on the spamd server side I'm getting:

Mar 24 15:01:30 euclid spamd[2870]: spamd: Tell: Setting local for mail:11 in 0.1 seconds, 1512 bytes
Mar 24 15:01:30 euclid spamd[5417]: spamd: Tell: Did nothing for mail:11 in 0.1 seconds, 13139 bytes

The "Did nothing" doesn't look good. Am I doing something wrong? Is it the user? Should I try to force it to be root? Is it permissions? I'm trying to feed MySQL bayes.

Thanks in advance.
Re: Bayes learning email address
John D. Hardin wrote:
> On Sat, 15 Apr 2006, mouss wrote:
>> - you are trusting your users to make the right decision. The problem is that different people have different opinions of what is spam and what is not. Things get even worse if one user isn't honest...
>
> That's a problem with *any* scheme for allowing the users to train Bayes themselves. In practice, however, I think you'll see much more apathy than stupidity or malice. My problem was with getting my users to even *look at* their marginal-spams folder and classify the messages. Ever.

You should check for things like your own quota notification messages in the spam folder. If you send a boilerplate email in response to someone sending an email to your abuse or postmaster address, check for that too. I used to work for a fairly large ISP and we got these sorts of things sent to us all the time.

Andrew
Re: Bayes learning email address
On Sat, 15 Apr 2006, mouss wrote:
> - you are trusting your users to make the right decision. The problem is that different people have different opinions of what is spam and what is not. Things get even worse if one user isn't honest...

That's a problem with *any* scheme for allowing the users to train Bayes themselves. In practice, however, I think you'll see much more apathy than stupidity or malice. My problem was with getting my users to even *look at* their marginal-spams folder and classify the messages. Ever.

-- 
John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
Re: Bayes learning email address
Owen Mehegan wrote:
> To make it easier for my users to train my server's Bayes database, I set up a user with the following procmail recipe in its .procmailrc:
>
> :0
> * < 256000
> {
>   :0c: spamassassin.spamlock
>   | sa-learn --spam
>
>   :0: spamassassin.filelock
>   spam
> }
>
> The idea is for people to redirect (not forward) uncaught spam to that address and have it added to our Bayes system. I suppose I could also --report those messages to the various reporting systems. Will this work, or are there pitfalls I haven't thought of?

- You are trusting your users to make the right decision. The problem is that different people have different opinions of what is spam and what is not. Things get even worse if one user isn't honest.
- You must protect this address from getting mail from untrusted sources (from outside, for example). Otherwise, anyone can pollute your Bayes database.
- How about reporting false positives?
Problem with Bayes learning
Greetings!

I have a problem when I try to feed Bayes a large number of emails (over 1500). It just hangs there, and I get the following error message in the maillog file:

.bayes: cannot open bayes databases /spamassassin/bayes_* R/W: lock failed: File exists

Does anyone know how to fix it? Thanks.

Jonathan
Re: Problem with Bayes learning
On Tuesday 28 February 2006 05:06 pm, Jonathan Nie wrote:
> I got a problem when I try to feed Bayes with large number of emails (over 1500). It just hang there and I got the following error messages from maillog file:
> .bayes: cannot open bayes databases /spamassassin/bayes_* R/W: lock failed: File exists
> Does anyone know how to fix it?

The bayes section of my spamassassin setup in local.cf looks like this:

#--
bayes_path /etc/mail/spamassassin/bayes/bayes
bayes_file_mode 0777
use_bayes 1
#bayes_use_hapaxes 1
# Enable Bayes auto-learning
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam 0.1
bayes_auto_learn_threshold_spam 9.0
#--

For me, that would mean I'd cd to /etc/mail/spamassassin/bayes and, as root, run:

chmod 666 bayes*

to allow any process with access to those bayes files to open them and read or write. Nobody else who can log in to the system has access to the files on that file system, so it should be safe (at least for me) to do this without it being a breach of security.

I'm pretty sure that the owner/group of the bayes files is spamd so that it can access the files as it needs, and when I run sa-learn to harvest other tokens, I run sa-learn as the same user as well.

-- 
Tyler Nally
Re: Problem with Bayes learning
Tyler Nally wrote:
> The bayes section of my spamassassin setup in local.cf looks like this:
> #--
> bayes_path /etc/mail/spamassassin/bayes/bayes
> bayes_file_mode 0777
> snip
> Which for me.. would mean that I'd cd to: /etc/mail/spamassassin/bayes .. and do a (as root): chmod 666 bayes* ... to allow anyprocess with access to those bayes files an opportunity to either open and read/write to it.

OUCH! Bad advice. DO NOT DO THIS.

Note that SA is not complaining that it cannot access the files. It is complaining that the bayes database is already locked. This means that SA believes another process is ACTIVELY writing to the bayes database.

If you wind up doing a chmod 666 on your bayes lock file (which the above command WILL do), and then invoke sa-learn while another process is accessing the database, you will corrupt your ENTIRE bayes database beyond recovery.
Re: Problem with Bayes learning
Jonathan Nie wrote:
> I got a problem when I try to feed Bayes with large number of emails (over 1500). It just hang there and I got the following error messages from maillog file:
> .bayes: cannot open bayes databases /spamassassin/bayes_* R/W: lock failed: File exists
> Does anyone know how to fix it?

SA believes another process is currently writing to the bayes database. This would be quite normal if a bayes expiry run was going on at the time.

Wait a while and see if it still happens. If it still fails, shut down ALL spamassassin operations and try again. If it *still* fails, manually delete the bayes lock file. (It will be in your bayes directory. I think it's called bayes.mutex.)
Re: Problem with Bayes learning
Hi Matt,

I am new to spamassassin. Thank you so much for your help, and Tyler too.

Bayes autolearn was enabled while I fed Bayes the 1500 emails manually using the sa-learn command. Does that cause the problem?

I also checked the Bayes database directory and found two stale lock files named bayes.lock. One is pretty old, almost 4 months, and the other was created while I was feeding bayes this time. Could I delete them?

Thanks again.

Jonathan
Re: Problem with Bayes learning
On Tuesday 28 February 2006 10:46 pm, you wrote:
> I am new to spamassassin. Thank you so much for your help and Tyler too.

Thanks. I'm not the expert, I just use it!

> Bayes autolearn is enabled when I feed Bayes with the 1500 emails manually using the sa-learn command. Does it cause the problem?

I think that sa-learn probably creates a lock file. Assuming that sa-learn exits normally, I would think it removes the lock file when it's done. I assume it works this way because, while sa-learn is running, the auto-learn feature is unavailable for spamd to record the bayes tokens (I think): it can't get a lock on the bayes structures to record them. Once sa-learn finishes and removes the lock, auto-learn should be available again.

> I also checked the Bayes database directory and found two stale lock files bayes.lock One is pretty old, almost 4 months and the other was created during I feed bayes this time. Could I delete them?

I'd say you can toast the 4-month-old one rather easily. Watch for when sa-learn finishes, and you should see the newer lock file go away after its completion. If it doesn't, then remove that one as well.

I don't know whether, in the normal operation of spamassassin, an auto-learn *write* puts a lock file on the bayes structures. I've never explicitly watched the bayes directory to see if a lock file appears and disappears quickly. I do know that if I invoke *sa-learn*, a lock file exists while it is learning and goes away afterwards. While sa-learn is running, the SpamAssassin header tags show that autolearn is unavailable, because it can't open the bayes structures to write the tokens.

-- 
Tyler Nally
Re: Problem with Bayes learning
Stop receiving emails. Stop the SpamAssassin service once the incoming mail spool is empty. Then kill all vestiges of spamd or spamassassin that might still be running from previously improperly terminated sessions. Then run sa-learn. If it STILL hangs with this lock, you have a problem somewhere for sure.

Once sa-learn has run, restart spamassassin and restart your email reception process.

Do NOT kill lockfiles while SpamAssassin is running. That invites database corruption.

Is it possible that 1500 messages all at once triggers a Bayes database expiration about halfway through the pass, and that is what is getting it hung up? I'll leave it to the authors to address that potential. It seems unlikely.

{^_^}

----- Original Message -----
From: Jonathan Nie
> Bayes autolearn is enabled when I feed Bayes with the 1500 emails manually using the sa-learn command. Does it cause the problem?
> I also checked the Bayes database directory and found two stale lock files bayes.lock One is pretty old, almost 4 months and the other was created during I feed bayes this time. Could I delete them?
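Editor's sketch: before deleting anything, it may help to list which lock files in the Bayes directory are actually old. The function name stale_bayes_locks and the 60-minute default are made up for this illustration; the file names bayes.lock and bayes.mutex are the ones mentioned in this thread, and the -maxdepth and -mmin options assume GNU or BSD find. The function only lists files and never removes them, in line with the warning above about not touching lock files while SpamAssassin is running.

```shell
# Hypothetical helper: list lock files in Bayes directory $1 that were
# last modified more than $2 minutes ago (default 60). Lists only.
stale_bayes_locks() {
    find "$1" -maxdepth 1 -type f \
        \( -name 'bayes.lock*' -o -name 'bayes.mutex' \) \
        -mmin "+${2:-60}"
}
```

For example, stale_bayes_locks /etc/mail/spamassassin/bayes 120 would list candidate lock files older than two hours in the directory used earlier in this thread.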
Per-User - Bayes Learning
Hello All,

I have e-mail accounts that have been sending spam, as attachments, to a specific e-mail address for some time now. Previously these were gone through manually, as I didn't have anything set up on a per-account basis. Now that I have SA on our Win2K server storing everything in a MySQL schema, I would like to automate the process more.

I have written a script that strips out each attached message and passes it to sa-learn. However, sa-learn seems to be time consuming (at minimum, 9 seconds per attached message submitted). Is there anything that can be done to speed up the process?
Re: Per-User - Bayes Learning
At 10:00 PM 1/1/2006, Duane Hill wrote:
> I have a script that I wrote that will take and strip out any attached message and uses sa-learn. However, sa-learn seems to be time consuming (at minimum, 9 seconds per attached message submitted). Is there anything that can be done to speed up the process?

Are you using the mysql.pm bayes store module, or the default generic one? If you're using the generic sql.pm, I'd suggest switching. The learning time is cut by more than half.

http://wiki.apache.org/spamassassin/BayesBenchmarkResults (1a and 1b are learning).

Also, if you're using SA 3.1.0 you can learn using spamc -L, which takes advantage of spamd instead of spawning a whole new perl instance. This is very useful if you do a lot of learning, but I'll warn you that it is a newish feature and might have some growing pains (I've not used it).

http://spamassassin.apache.org/full/3.1.x/dist/doc/spamc.html
RE: Bayes learning error
Hi Robert, You need to install the DB_File perl module. Start the CPAN shell:

    perl -MCPAN -e shell

and then, at the cpan prompt, run:

    install DB_File

Cheers, Chris

From: Robert Swan [mailto:[EMAIL PROTECTED]
Sent: 20 June 2005 14:53
To: users@spamassassin.apache.org
Subject: Bayes learning error

I am getting an error when I run manual learning (sa-learn ham .). Has anyone seen this before or have a clue how to fix it?

debug: bayes: DB_File module not installed, cannot use Bayes

I am using Redhat, spamassassin 3.03, spamd, spamc, postfix. Thanks, Robert Swan

Peace he would say instead of goodbye, peace my brother.
Detailed directions for using IMAP for Bayes learning and configuring webuserprefs
Some folks might be interested in the updated detailed install instructions on the wiki. I've added sections on setting up a LearnAsSpam IMAP folder that's remotely processed. This is the best solution I've seen for integrating SpamAssassin with end-users on an Exchange server.

http://wiki.apache.org/spamassassin/SingleUserUnixInstall#head-bea6b8dc4f219edd3b9976e8f922a8f1c0603125

I've also added a section on configuring webuserprefs to give a friendly web user interface for end-users wanting to edit their whitelists and other settings.

http://wiki.apache.org/spamassassin/SingleUserUnixInstall#head-1dd15c06b7e645638def3d2ed2ef31557d853659

Please let me know if you have any comments, or just fix them on the wiki directly.

- dan -- Dan Kohn mailto:[EMAIL PROTECTED] http://www.dankohn.com/ tel:+1-650-327-2600
Do you use MS Exchange public folders for bayes learning?
Hi all, I would like to throw out a request for admins that are using, have tried, or want to use MS Exchange public folders to gather messages that will be fed back to sa-learn.

Background: Since there are not many (any?) good ways to retrieve email messages out of an Exchange/Outlook system in order to feed the messages back to a SA server to be run through sa-learn, the best option for a large number of users is to set up a public folder and have users drop their messages in the PFs, then run an automated script on the SA server to pull those messages in using a script like Nick Burch's power-imap-sa-learn.pl [1]. However, there seems to be an issue with MS Exchange public folders and IMAP. When an email message is placed in a public folder and then retrieved via IMAP, Exchange strips out some of the SMTP headers and inserts some custom MS headers. This is clearly a non-optimal method due to the loss of some great spam/ham signs. While the messages are fairly close to their original forms, they could be much better.

Request: I currently have a ticket open with MS Premier support due to a bug in MS's implementation of IMAP public folders. At this point in time, MS support confirms that they can replicate the behavior, and they have escalated the issue and sent it off to the Exchange development team for a RFC. My best guess is that they will confirm the issue, but say that there is not enough reason to develop a fix or a patch for it. I would like to gather a list of admins who would like this issue to be resolved, so that I can have a little more push with MS. It may help if I can let them know that I have discussed this with X number of admins, who have X numbers of servers and users and who would like to see this issue resolved. If you use SpamAssassin as a filter for an Exchange system and would like to add your voice, please contact me off-list.

Thanks for your time, Matt Yackley

[1] http://tirian.magd.ox.ac.uk/~nick/code/
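To see which headers are lost or added on the way through a public folder, one can compare a copy of a message saved before it went into Exchange with the copy retrieved over IMAP. The following is a small sketch under that assumption: both copies are available as raw bytes, and the function name header_diff is made up here. It compares header names only, not their values.

```python
import email
from typing import List, Tuple


def header_diff(original: bytes, retrieved: bytes) -> Tuple[List[str], List[str]]:
    """Return (dropped, added) header names, lowercased and sorted.

    `dropped` are names present in the original but not in the retrieved copy;
    `added` are names that only appear in the retrieved copy.
    """
    orig_names = {n.lower() for n in email.message_from_bytes(original).keys()}
    got_names = {n.lower() for n in email.message_from_bytes(retrieved).keys()}
    return sorted(orig_names - got_names), sorted(got_names - orig_names)
```

Running this over a handful of message pairs would give a concrete list of affected headers to attach to a support ticket.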
Re: Bayes learning
Lisa Casey wrote:

Hi All, I'm still fairly new to Spamassassin. I have a question regarding Bayes learning in Spamassassin. I'm running Spamassassin 3.0.1 on Redhat Linux. I have one mailbox on this server that receives nothing but spam, and quite a lot of it. I decided that would be a good mailbox for spamassassin to learn spam via Bayes. So I set up a script in /etc/cron.daily that looks like this:

    sa-learn --spam -C /etc/mail/spamassassin --showdots --dir /var/mail/netlinkspam
    sa-learn --sync
    rm /var/mail/netlinkspam

The idea being that it would learn from all messages in that mailbox on a daily basis, then delete that mail so it isn't just learning the same thing over and over (made sense to me...?). I set this up a couple of weeks ago. From the volume of mail this mailbox receives, the bayes database should be well over 200 spams by now. But when I do a spamassassin --lint --debug, I see this in regards to Bayes:

    debug: bayes: 23170 tie-ing to DB file R/O /var/spool/spamassassin/bayes_toks
    debug: bayes: 23170 tie-ing to DB file R/O /var/spool/spamassassin/bayes_seen
    debug: bayes: found bayes db version 3
    debug: bayes: Not available for scanning, only 8 spam(s) in Bayes DB < 200
    debug: bayes: 23170 untie-ing
    debug: bayes: 23170 untie-ing db_toks
    debug: bayes: 23170 untie-ing db_seen

I've gotta be doing something wrong here. Any suggestions?

Sounds like you are sa-learn'ing the messages as a different user than mail processing is running as. This will cause bayes to set up two completely different and separate databases. Check to make sure you are running sa-learn as the same user as mail processing. -Jim
Re: Bayes learning
- Original Message -
From: [EMAIL PROTECTED]
To: Lisa Casey [EMAIL PROTECTED]
Cc: users@spamassassin.apache.org
Sent: Monday, November 29, 2004 4:21 PM
Subject: Re: Bayes learning

Make sure the user you are running the script as is the same user that spamassassin runs as and that you are logged in as that same user when you run spamassassin --lint --debug. You're probably training a different database file than the one that's getting used when you run the --lint check.

Since two folks have come up with this same answer (and my cron script is running as root) I'm sure this is probably the problem. OK, now here's a dumb question (and I apologise for that) but I'm not sure which user Spamassassin is actually running as. In my setup, spamassassin is being called by Mimedefang, which is running as the user defang. Is this then the user that Spamassassin is running as?

Thanks, Lisa Casey
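One quick way to check the "same user" advice is to compare the account running the learning script with the owner of the Bayes files named in the --lint output. The sketch below assumes a Unix system; the helper names are made up for illustration, and file ownership is only a hint, since it says nothing about which database path a given user's configuration resolves to.

```python
import os
import pwd


def _name_for_uid(uid: int) -> str:
    """Map a numeric uid to a login name, falling back to the number itself."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)


def file_owner(path: str) -> str:
    """Return the login name (or uid) that owns `path`."""
    return _name_for_uid(os.stat(path).st_uid)


def current_user() -> str:
    """Return the login name (or uid) this process is effectively running as."""
    return _name_for_uid(os.geteuid())


def owned_by_current_user(path: str) -> bool:
    """True if `path` is owned by the effective user of this process."""
    return os.stat(path).st_uid == os.geteuid()
```

For example, running file_owner on /var/spool/spamassassin/bayes_toks from the cron script and from the mail-processing account, and comparing each result with current_user(), would show whether the two are working with files owned by the same account.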