BAYES_999 strange behavior

2014-02-17 Thread Ian Zimmerman
Hello. This is the first time SA is giving me enough trouble that I need to ask for help. I hope I get this right. I observed a marked increase in false negatives in the last few weeks. Only today I had enough sense to look at the detailed scores. And, all the escaped spams have hit the

Re: BAYES_999 strange behavior

2014-02-17 Thread Ian Zimmerman
On Mon, 17 Feb 2014 16:05:23 -0500 Kevin A. McGrail kmcgr...@pccc.com wrote: Kevin BAYES_999 is just a finer gradient on BAYES_99 allowing for a Kevin higher score on the top .001% of Bayes hits. Thanks for your reply. Could you explain in a bit more detail what gradient on top (of another

Re: sa-learn from a cronjob?

2014-04-23 Thread Ian Zimmerman
On Sun, 20 Apr 2014 12:14:37 -0700 (PDT) Dan Mahoney, System Admin d...@prime.gushi.org wrote: Most of my users aren't command-line friendly. I'd like to basically have my IMAP server default to handing out two imap mailboxes that get auto-crontabbed to training bayes. Here is my cronjob for

Re: sa-learn from a cronjob?

2014-04-24 Thread Ian Zimmerman
On Thu, 24 Apr 2014 15:07:32 +0100 RW rwmailli...@googlemail.com wrote: RW I don't think it will work for the purpose mentioned, and if it's RW working properly for you, there's a lot you're not mentioning. RW It's only looking for mail in the immediate post-delivery state RW after it's been put

Re: Bayes refinement

2014-05-16 Thread Ian Zimmerman
On Fri, 16 May 2014 07:22:56 -0400 David F. Skoll d...@roaringpenguin.com wrote: James Is there any way to limit Bayes content checking to only the James first X characters of the message body? I ask this because it is James clear that the spam messages getting through contain text meant James

Re: SPAM from a registrar

2014-05-16 Thread Ian Zimmerman
On Thu, 15 May 2014 09:45:21 -0800 Kevin Miller kevin_mil...@ci.juneau.ak.us wrote: Have you looked into Day old bread? http://wiki.apache.org/spamassassin/Rules/URIBL_RHS_DOB Just for the fun of it, I did a manual whois on the domain of one random spam I got today which was not killed by SA.

Re: SPAM from a registrar

2014-05-16 Thread Ian Zimmerman
On Sat, 17 May 2014 01:34:58 +0200 Karsten Bräckelmann guent...@rudersport.de wrote: I don't know whether DOB limits DNS queries of a single host. However, if you *never* get that rule firing, the NXDOMAIN result may indicate exceeding a query limit. Do you use a local caching DNS resolver,

Re: Bayes refinement

2014-05-16 Thread Ian Zimmerman
On Fri, 16 May 2014 16:20:21 -0400 Bowie Bailey bowie_bai...@buc.com wrote: Keep in mind that BAYES_50 and BAYES_60 still contribute positive scores by default. Though it is technically a neutral result, it still adds a point or two to the score. Rather than messing with Bayes, I would

Re: SPAM from a registrar

2014-05-19 Thread Ian Zimmerman
On Mon, 19 May 2014 10:46:25 -0800 Kevin Miller kevin_mil...@ci.juneau.ak.us wrote: Ian Excellent point. I _used to_ run a local DNS cache, but got rid of Ian it a few months ago, in the name of simplicity. Was that a good or Ian bad thing to do in the current context? Kevin That's a bad thing

Re: Bayes refinement

2014-05-21 Thread Ian Zimmerman
On Thu, 15 May 2014 12:18:25 -0800 Kevin Miller kevin_mil...@ci.juneau.ak.us wrote: I implemented a rule that looks for multiple breaks for just that reason. Can't remember where I stole it from - probably some folks here helped me with it a few years ago. Can't remember who, but

Re: Bayes refinement

2014-05-21 Thread Ian Zimmerman
On Wed, 21 May 2014 19:08:51 +0100 Martin Gregorie mar...@gregorie.org wrote: rawbody __LOCAL_MUCHO_BLANKS /\n{10,}/m Martin Looking for newlines rather than whitespace? Does /\s{10,}/m Martin work any better? Nope, it doesn't :-( Anyway, looking for newlines was my intention, sorry for the

Matching multiple newlines [Was: Bayes refinement]

2014-05-21 Thread Ian Zimmerman
On Wed, 21 May 2014 11:50:15 -0700 (PDT) John Hardin jhar...@impsec.org wrote: rawbody __LOCAL_MUCHO_BLANKS /\n\n\n\n\n\n\n\n\n\n/m Hmmm, no, your version doesn't work, either. Would this be of any import? [24+0]~$ perl --version This is perl 5, version 14, subversion 2 (v5.14.2) built

Re: Bayes refinement

2014-05-21 Thread Ian Zimmerman
On Wed, 21 May 2014 22:26:41 +0200 Karsten Bräckelmann guent...@rudersport.de wrote: Karsten Seriously, the above rule, the shorter /\n{10}/, as well as the Karsten variant posted by John without quantifier do exactly what you Karsten asked for. They match 10 consecutive \n newline chars in the
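
For readers following this thread, here is a minimal Python sketch of the pattern under discussion. It only shows that the quantified form `\n{10,}` and ten spelled-out `\n` escapes both match a run of ten or more newlines in a plain string. It says nothing about how SpamAssassin splits a message before running `rawbody` rules. The sample strings and the function name are made up for illustration.

```python
import re

# The quantified form and the spelled-out form from the thread.
QUANTIFIED = re.compile(r"\n{10,}")
SPELLED_OUT = re.compile(r"\n" * 10)

def has_blank_run(text: str) -> bool:
    """Return True if text contains at least ten consecutive newlines."""
    return QUANTIFIED.search(text) is not None

# Hypothetical sample bodies, not taken from any real message.
padded = "visible text" + "\n" * 12 + "hidden text"
normal = "line one\n\nline two\n"

print(has_blank_run(padded))                     # True
print(has_blank_run(normal))                     # False
print(SPELLED_OUT.search(padded) is not None)    # True
```

If both forms match here but the rule still does not fire in SpamAssassin, the difference most likely lies in what text the rule is actually handed, not in the regular expression.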

autolearn_force

2014-05-21 Thread Ian Zimmerman
I don't understand this setting, and reading the documentation doesn't help. It seems it should make bayes learn spam whenever the total score surpasses the value of bayes_auto_learn_threshold_spam, and not require 3 points from header and body each; that would make it a global setting similar in

Re: autolearn_force

2014-05-22 Thread Ian Zimmerman
On Thu, 22 May 2014 15:54:42 +0100 RW rwmailli...@googlemail.com wrote: Ian I don't understand this setting, and reading the documentation Ian doesn't help. Ian It seems it should make Bayes learn spam whenever the total score Ian surpasses the value of bayes_auto_learn_threshold_spam, and not

Re: Blank line rules

2014-05-22 Thread Ian Zimmerman
On Thu, 22 May 2014 13:47:04 -0700 (PDT) John Hardin jhar...@impsec.org wrote: John Regular expressions by default only consider a single line of John text. You need to provide an option to say treat multiple lines John as a single line. Try this: rawbody RAW_BLANK_LINES_05

lint versus spamd log

2014-05-23 Thread Ian Zimmerman
I have diligently used spamassassin --lint after every edit to my user_prefs file, and made sure there was no output. This morning, in the course of the ongoing battle against enom related spam, I looked in /var/log/mail.log, and imagine my surprise when I found this logged with every delivery:

Re: lint versus spamd log

2014-05-23 Thread Ian Zimmerman
On Fri, 23 May 2014 20:35:26 +0200 Karsten Bräckelmann guent...@rudersport.de wrote: Ian spamassassin --lint Ian after every edit to my user_prefs file, and made sure there was no Ian output. This morning, in the course of the ongoing battle against Ian enom related spam, I looked in

Re: lint versus spamd log

2014-05-23 Thread Ian Zimmerman
On Sat, 24 May 2014 00:51:38 +0200 Karsten Bräckelmann guent...@rudersport.de wrote: Ian I mostly get the rest of your answer, but this is incorrect. Same Ian user, I'm 100% sure. Unless you count spamd checking on my behalf Ian as different user - do you? Karsten Yes. Karsten user_prefs are

Re: autolearn_force

2014-05-24 Thread Ian Zimmerman
On Thu, 22 May 2014 15:54:42 +0100 RW rwmailli...@googlemail.com wrote: Ian But in fact this is a per-test setting, a subcategory of tflags. Ian Do I have to specify it separately for every test? Why? RW The point is to set it for a small number of rules that are RW sufficiently strong as to

Re: autolearn_force

2014-05-24 Thread Ian Zimmerman
So, now I am really confused. I think I did everything right in user_prefs: bayes_auto_learn 1 bayes_auto_learn_threshold_nonspam -2.00 bayes_auto_learn_threshold_spam 6.00 bayes_auto_learn_on_error 0 [snip] tflags URIBL_DBL_SPAM autolearn_force tflags URIBL_JP_SURBL

Re: autolearn_force

2014-05-25 Thread Ian Zimmerman
On Sun, 25 May 2014 16:40:44 +0200 Axb axb.li...@gmail.com wrote: Axb URIBL rules are not set to use 'userconf' (user configuration) Axb so entries in user_prefs shouldn't affect the results Axb if anything it should go in a system wide rule (ie: local.cf) (not Axb user_prefs) Axb your: tflags

Re: autolearn_force

2014-05-25 Thread Ian Zimmerman
On Sun, 25 May 2014 20:06:22 +0200 Axb axb.li...@gmail.com wrote: Axb Yes, when it reached certain conditions and a score above 15.0 Axb you can tune that score via local.cf entries: Axb bayes_auto_learn_threshold_nonspam bayes_auto_learn_threshold_spam Please see the prefs in my post upthread

Re: Capture vs non-capture groups

2014-05-28 Thread Ian Zimmerman
On Wed, 28 May 2014 10:47:35 -0700 (PDT) John Hardin jhar...@impsec.org wrote: John The only place I've found backreferences useful is when writing a John header rule that is looking for the same string in multiple John headers. John Other than that, captures are very rare. There was a pattern

Re: SA without procmail?

2014-06-20 Thread Ian Zimmerman
On Wed, 18 Jun 2014 15:24:36 +0200 Axb axb.li...@gmail.com wrote: Axb Dovecot's Sieve is your friend. (replaces procmail) Not really, not in this context. OP is using procmail merely as an LDA. And in that capacity, it is replaced by the LDA that comes with Dovecot. On my Debian system, it is

Re: SA without procmail?

2014-06-20 Thread Ian Zimmerman
On Fri, 20 Jun 2014 14:05:04 +0100 Timothy Murphy gayle...@eircom.net wrote: Is there something similar I could append instead to use dovecot-lda? Yes. mailbox_command = /usr/libexec/dovecot/dovecot-lda or mailbox_command = /usr/libexec/dovecot/dovecot-lda -m INBOX I don't know postfix, so

Re: SA and Ubuntu 14.04 LTS

2014-07-16 Thread Ian Zimmerman
On Wed, 16 Jul 2014 06:09:08 +0200 Karsten Bräckelmann guent...@rudersport.de wrote: And to really include *local* plugins, provide a relative path (to the current site-wide configuration dir, without a leading slash) as optional second argument to the loadplugin statement. There's hardly

Re: Ready to throw in the towel on email providing...

2014-07-28 Thread Ian Zimmerman
On Mon, 28 Jul 2014 12:57:38 -0400 David F. Skoll d...@roaringpenguin.com wrote: David 1) Gmail is actually pretty good at filtering spam. I can't David speak for MSFT since I don't use it. David 2) Especially in North America, companies are short-sighted and David go for quick fixes and things

Mojibake alert [Was: Advice sought on how to convince irresponsible Megapath ISP]

2014-08-18 Thread Ian Zimmerman
On Sun, 17 Aug 2014 07:37:36 -0700, Linda Walsh sa-u...@tlinx.org wrote: Karsten Brmojibake elided/ wrote: In addition to other problems with your posts (which experts here have already pointed out), your scripts clearly do not handle non-ASCII emails well, as you have completely mangled

Re: Bayes training via inotify (incron)

2014-08-22 Thread Ian Zimmerman
On Fri, 22 Aug 2014 08:34:34 +, Eric Wong e...@80x24.org wrote: Eric I always thought inotify was an obvious way to train for anybody Eric using Maildirs on Linux, so I set it up for my server and Eric basically forgot about it since it worked well. Fast forward to Eric 2014 and I realize

Learning both spam and ham, edge case

2014-08-22 Thread Ian Zimmerman
I know that if you misclassify a mail as spam with sa-learn --spam /path/to/ham you can later run sa-learn --ham /path/to/ham to correct the mistake, and SA will do the right thing (ie. forget the wrong classification). And conversely, with ham - spam. My question is, what happens if you

Re: drop of score after update tonight

2014-08-25 Thread Ian Zimmerman
I definitely have FNs today (about 10 by now today, normally 0). Looks like some/all RBLs tests are not working. I have not changed my configuration at all. Sample here: http://pastebin.com/dsqaVA9Z -- Please *no* private copies of mailing list or newsgroup messages. Local Variables:

Re: drop of score after update tonight

2014-08-25 Thread Ian Zimmerman
On Mon, 25 Aug 2014 19:50:20 +, David Jones djo...@ena.com wrote: Ian I definitely have FNs today (about 10 by now today, normally 0). Ian Looks like some/all RBLs tests are not working. I have not changed Ian my configuration at all. Ian Sample here: Ian http://pastebin.com/dsqaVA9Z

Re: drop of score after update tonight

2014-08-26 Thread Ian Zimmerman
On Tue, 26 Aug 2014 08:10:23 +0200, Matus UHLAR - fantomas uh...@fantomas.sk wrote: Ian Isn't it a bit odd that SA has rules for all these other Bayes Ian powered backends? Why not give a bit more weight to its own Bayes Ian instead, rather than make users forage for other tools that do Ian

Re: Give a penalty to messages with non latin UTF-8 characters?

2014-08-31 Thread Ian Zimmerman
On Sat, 30 Aug 2014 06:44:39 -0600, LuKreme krem...@kreme.com wrote: LuKreme I would welcome rules that would reliably penalize messages LuKreme that use chinese, japanese, korean, thai, or any other LuKreme characters in the UTF-8 address space that I don’t read. I LuKreme would put them in

Re: bayes scroing too low

2014-08-31 Thread Ian Zimmerman
On Sun, 31 Aug 2014 12:20:41 +0200, Axb axb.li...@gmail.com wrote: Axb Bayes scores are *not* set to be a sole indicator of spam/ham. Axb They're supposed to be yet another indicator. FWIW, I use both Razor and Pyzor, and there are times when they seem to be just asleep. Or maybe a particular

Re: sa-learn and find

2014-08-31 Thread Ian Zimmerman
On Sat, 30 Aug 2014 19:59:53 -0600, LuKreme krem...@kreme.com wrote: RW This may run into shell argument limits if you have to learn a lot RW of spam. Consider piping the output of find to xargs, or using -exec RW ...{} + in find. LuKreme Yes, I tried to do that, but as I said in my first post,

Re: SA works great!

2014-08-31 Thread Ian Zimmerman
On Sun, 31 Aug 2014 16:55:50 +0200, Axb axb.li...@gmail.com wrote: Axb During the last +-4 years, scores have been set by the masscheck GA Axb system. IF more ppl would contribute with masschecks and rules, Axb detection could be better, but the lack of volunteers doing this Axb shows that

Re: sa-learn and find

2014-08-31 Thread Ian Zimmerman
On Sun, 31 Aug 2014 17:37:50 -0600, LuKreme krem...@kreme.com wrote: Ian xargs (the GNU one at least) has an option to not run the inferior Ian when there are no args to give it. LuKreme The interior is the find: _Inferior_ which is GNU speak for subprocess. I should have tried to be less

Re: bayes scroing too low

2014-09-01 Thread Ian Zimmerman
On Sun, 31 Aug 2014 12:20:41 +0200, Axb axb.li...@gmail.com wrote: Axb get the source from http://razor.sourceforge.net/ I don't recommend Axb installing via some rpm. The last version mentioned on that site is 2.84, from May 2007. Strangely, the version in current Debian packages is 2.85.

Re: large spam messages

2014-09-06 Thread Ian Zimmerman
On Thu, 4 Sep 2014 12:52:34 -0400 (EDT), Jude DaShiell jdash...@panix.com wrote: Jude Since spamassassin cannot handle large spam over 2MB in size, what Jude can be used to handle that class of junk? I use a script on the MX host to MIME reshape all large messages, dropping all non-text

Reply versus new thread [Was: Dumping email with blank To: header ?]

2014-09-06 Thread Ian Zimmerman
Others have gracefully answered as to the substance of your message. I'll have to be a pest and ask that you please do not use Reply or Followup when you're starting a new topic. For list readers with user agents that thread the standard (RFC standard) way, that breaks threading. The way to

Re: sa-learn from a remote imap folder

2014-09-12 Thread Ian Zimmerman
On Fri, 12 Sep 2014 07:45:22 -0500, Dave Pooser dave...@pooserville.com wrote: Marcus spamassassin and imap (cyrus) are running on different Marcus boxes. What is best practice to learn spam from a remote imap Marcus folder? Dave At $DAYJOB we export the spam folder (and a ham folder for FPs)

KAM_BODY_URIBL_PCCC misfire

2014-09-15 Thread Ian Zimmerman
I have just had a false positive due to KAM_BODY_URIBL_PCCC (good for 5 pts.), for no apparent reason whatsoever. There are no URIs in the body. Sample here: http://pastebin.com/6kaxtNcq

Re: more_spam_from like more_spam_to

2014-09-18 Thread Ian Zimmerman
On Wed, 17 Sep 2014 13:43:49 +0100, RW rwmailli...@googlemail.com wrote: RW A lot of people don't put mailing lists through Spamassassin, most RW of them have already been spam filtered, and to get the best results RW you have to extend your internal network and maintain it. Do you mean the

Re: more_spam_from like more_spam_to

2014-09-19 Thread Ian Zimmerman
On Fri, 19 Sep 2014 08:37:45 +0200, Matus UHLAR - fantomas uh...@fantomas.sk wrote: RW A lot of people don't put mailing lists through Spamassassin, most RW of them have already been spam filtered, and to get the best results RW you have to extend your internal network and maintain it. Ian Do

Re: Non-English spam

2014-09-27 Thread Ian Zimmerman
On Thu, 25 Sep 2014 13:13:07 -0400, dar...@chaosreigns.com wrote: To enable TextCat to flag everything that's not English, in local.pre I have: loadplugin Mail::SpamAssassin::Plugin::TextCat And in local.cf I have: ok_languages en I have done this too, but I live in an English speaking

Re: spam - why spam score is low,

2014-09-28 Thread Ian Zimmerman
On Fri, 26 Sep 2014 17:07:31 +0200, Antony Stone antony.st...@spamassassin.open.source.it wrote: motty Received: from maria.fqdn.com ([127.0.0.1]) Antony That won't be helping - it means you're not basing any tests on Antony the sending server. can you run SA on your inbound MX instead Antony

Re: what's wrong

2014-10-01 Thread Ian Zimmerman
On Tue, 30 Sep 2014 09:47:41 +0200, Matus UHLAR - fantomas uh...@fantomas.sk wrote: Do you trust smtp.cesky-hosting.cz? Even if it's open socks and http proxy server? I wonder if slovensky-hosting.sk does better :-P

Re: Local URL blocking based on NS records?

2014-10-06 Thread Ian Zimmerman
On Fri, 03 Oct 2014 00:08:49 +0200, Axb axb.li...@gmail.com wrote: Axb What's wrong with running rbldnsd? It's the tool all BLs use for Axb mirroring BL data. It's so stable and simple to use nothing can Axb beat it. From the website: There is no config file, rbldnsd accepts all configuration

Re: Regarding mass-check access

2014-10-11 Thread Ian Zimmerman
On Fri, 10 Oct 2014 16:19:39 -0400, staticsafe m...@staticsafe.ca wrote: I sent an email to priv...@spamassassin.apache.org regarding access to mass-check back on the first of September. Is anybody out there? :) So did I, on August 31, to be precise. Crickets for me, too.

Re: procmail (was Re: Spam messages bypassing SA)

2014-10-28 Thread Ian Zimmerman
On Fri, 24 Oct 2014 08:43:41 -0400, David F. Skoll d...@roaringpenguin.com wrote: David Procmail is also unmaintained abandonware, as far as I can tell. David If you use SpamAssassin, you probably like Perl, so I would David recommend Email::Filter instead. It's far more flexible than David

Re: procmail

2014-10-28 Thread Ian Zimmerman
On Tue, 28 Oct 2014 11:43:04 -0700 jdow j...@earthlink.net wrote: jdow That is hardly a compelling reason to change from procmail to jdow perl, for me or others with working procmail systems. You seem to jdow be advocating handing me perl and turning me loose after ripping jdow procmail out of my

Re: SOUGHT 2.0 ?

2014-11-12 Thread Ian Zimmerman
On Sat, 01 Nov 2014 10:06:57 -, Kevin Golding k...@caomhin.org wrote: Kevin So anyone else want to raise their hands? It depends. Would I mind a bit of regular maintenance work? No, I wouldn't mind. Would I mind a major change in how I run my server - for instance, run a virus checker, or

Re: SOUGHT 2.0 ?

2014-11-13 Thread Ian Zimmerman
On Thu, 13 Nov 2014 09:28:30 -, Kevin Golding k...@caomhin.org wrote: Kevin The main thing that's going to be needed is good, reliable, Kevin data. We'll only get good rules with good feeds. That should be Kevin fairly low impact for people in many respects. Kevin Obviously there's always

Re: SOUGHT 2.0

2014-12-04 Thread Ian Zimmerman
On Thu, 04 Dec 2014 22:41:13 +0100, Axb axb.li...@gmail.com wrote: Axb To be able to create usable rules, several times/day I need feeds Axb to spit *at least* +150k/day. As I don't have the data 150k of what? Bytes? Emails? Tokens?

whitelist_from_rcvd not working, WAIDW

2015-02-27 Thread Ian Zimmerman
Header of test message, massaged for privacy, is here: http://pastebin.com/EV6g15aN I have this in user_prefs: trusted_networks 198.1.2.3/32 [...lots snipped...] whitelist_from_rcvd *@wetransfer.com *.wetransfer.com Why is the whitelist not firing?

Re: whitelist_from_rcvd not working, WAIDW

2015-02-28 Thread Ian Zimmerman
On Sat, 28 Feb 2015 13:37:29 +0100, Mark Martinec mark.martinec...@ijs.si wrote: Ian trusted_networks 198.1.2.3/32 Ian [...lots snipped...] Ian whitelist_from_rcvd *@wetransfer.com *.wetransfer.com Mark It seems the: Mark Received: (from itz@localhost) Mark by myalias.trusted.mx

Confused about Bayes expiry

2015-05-24 Thread Ian Zimmerman
I am very confused by the various features involving expiry from Bayes. perldoc Mail::SpamAssassin::Conf : bayes_expiry_max_db_size (default: 15) What should be the maximum size of the Bayes tokens database? When expiry occurs, the Bayes system will keep

Re: Confused about Bayes expiry

2015-05-24 Thread Ian Zimmerman
On 2015-05-24 23:25 +0200, Mark Martinec wrote: Mark With other bayes back-ends the traditional expiration mechanisms Mark need to be used, either auto-expiration runs triggered from time Mark to time by SpamAssassin, or explicit expiration runs, e.g. from a Mark cron job. With these traditional

Re: Confused about Bayes expiry

2015-05-25 Thread Ian Zimmerman
On 2015-05-25 09:43 +0200, Matus UHLAR - fantomas wrote: Ian But, in fact I already have a cronjob running sa-learn Ian --force-expire. The reason I would prefer to remove it (and so Ian the reason for my original post) is that it does a journal sync as Ian well, which I didn't intend and which

Re: no reporting methods available

2015-07-31 Thread Ian Zimmerman
On 2015-07-31 18:28 -0500, David B Funk wrote: Reporting is separate from learning. It is the case that spamassassin -r is supposed to report and learn. However it isn't quite the same as sa-learn --spam in that unlike sa-learn --spam it won't override the spam learn prohibition of

no reporting methods available

2015-07-31 Thread Ian Zimmerman
I run spamassassin -r from cron nightly. Last night I got this output: Jul 30 23:00:11.830 [31065] warn: reporter: no reporting methods available, so couldn't report Jul 30 23:00:11.830 [31065] warn: spamassassin: warning, unable to report message Jul 30 23:00:11.830 [31065] warn: spamassassin:

bayes expiry not happening when it should

2015-08-05 Thread Ian Zimmerman
~$ grep '^bayes_expiry_max_db_size' ~/.spamassassin/user_prefs | awk '{print $2}' 200 ~$ sa-learn --force-expire bayes: synced databases from journal in 0 seconds: 2784 unique entries (2805 total entries) ~$ sa-learn --dump magic 0.000 0 3 0 non-token data: bayes

Re: bayes expiry not happening when it should

2015-08-05 Thread Ian Zimmerman
On 2015-08-05 12:58 +0100, RW wrote: The number of tokens is within 0.5% of the configured value. It's designed to produce a value between 75% and roughly 150%. I can't quite parse that answer, so let's be more specific. Doc says: bayes_expiry_max_db_size (default: 15) What

Live upgrade safe?

2015-08-14 Thread Ian Zimmerman
Can I safely upgrade SA from 3.4.0 to 3.4.1 without changing any local configuration files, and without regenerating the Bayes database? (I use the default bdb Bayes store.)

Re: bayes expiry not happening when it should

2015-08-05 Thread Ian Zimmerman
On 2015-08-05 19:34 +0100, RW wrote: What it actually does is estimate a cut-off time and then delete all tokens older than that. How it gets the cut-off time is described the next two sections: EXPIRE LOGIC and ESTIMATION PASS LOGIC. OMG. For one thing, are the clauses in the definition of

another bayes oddity

2015-07-23 Thread Ian Zimmerman
I have bayes_auto_learn 0 bayes_auto_expire 0 bayes_learn_to_journal 0 add_header all Autolearn _AUTOLEARN_ and indeed, all messages are tagged with X-Spam-Autolearn: disabled Nevertheless, the mtime _and_ size of ~/.spamassassin/bayes_journal inches forward with every delivery. Why?

Re: Large spam

2015-07-15 Thread Ian Zimmerman
On 2015-07-15 20:12 +, Zinski, Steve wrote: We're starting to see a lot of spam in the 800KB to 1.2MB size range. I’m running MIMEdefang and it’s configured to skip messages larger than 100KB (and I hesitate to increase the limit due to performance issues). I read somewhere that there’s a

Re: Debian jessie - new setup, missing data directory

2015-11-09 Thread Ian Zimmerman
On 2015-11-09 16:42 +0100, Antony Stone wrote: > What did Jessie install it as? > > > > /var/mail/.spamassassin/user_prefs This is very strange. Are you really sure it is not operator error? I run wheezy, so I can't flat out exclude it, but it flies in the face of too much Debian tradition.

Re: Checking if sa-learn is actually learning

2015-10-16 Thread Ian Zimmerman
On 2015-10-16 20:59 -0500, Ryan Coleman wrote: > sa-learn commands: > [scans domains for specified folders and scans them] > > /usr/bin/find /var/mail/vhosts/ -name '*.Spam.New*' -type d -exec > > /usr/bin/sa-learn --no-sync --spam --progress {}* \; > > /usr/bin/find /var/mail/vhosts/ -name

Return Path (TM) whitelists

2015-07-09 Thread Ian Zimmerman
I just got in my inbox what I consider spam from the Belgian domain selling Japanese copiers and printers (you probably know which one). What made it pass through SA were RCVD_IN_RP_CERTIFIED and RCVD_IN_RP_SAFE. Together they account for a whopping -5 points - a poison antidote pill! Isn't that a

Re: Return Path (TM) whitelists

2015-07-09 Thread Ian Zimmerman
On 2015-07-09 16:58 +, David Jones wrote: Did the email have a valid unsubscribe link/process? It is in Dutch, and I can't read Dutch. (Yes, I do use the language plugin.) I shortcircuit as ham for these two rule hits and never have had a report of spam that couldn't be reliably/safely

Re: Return Path (TM) whitelists

2015-07-10 Thread Ian Zimmerman
On 2015-07-10 13:54 +0100, RW wrote: I don't get any spam at all in the return-path lists. ... I don't doubt that there's some abuse, but I also find it hard to believe that the accuracy of the return-path rules isn't dominated by user behaviour. Can you specify user behaviour in more

Re: Return Path (TM) whitelists

2015-07-10 Thread Ian Zimmerman
On 2015-07-10 16:36 +0200, Reindl Harald wrote: most users enable checkboxes which are needed to get random forms submitted, even if they say i agree to get mails from here and there and are missing the context when that mails are coming later You don't know me, so you can hardly claim a

Re: Live upgrade safe?

2015-09-11 Thread Ian Zimmerman
On 2015-09-11 17:35 +0200, Reindl Harald wrote: > >>>Can I safely upgrade SA from 3.4.0 to 3.4.1 without changing any local > >>>configuration files, and without regenerating the Bayes database? (I > >>>use the default bdb Bayes store.) > >> > >>yes, but you need to run "sa-update" before

Re: [Announce] SA-Plugins: RedisAWL, RuleTimingRedis

2015-09-15 Thread Ian Zimmerman
On 2015-06-09 17:57 +0200, Benning, Markus wrote: > RuleTimingRedis - collect SA rule timings in redis I'm trying this out. I have a little annoying problem: the logs beginning on line 178 seem to go to stdout or stderr as well as syslog. The result is that cron sends me email every time spamd

Re: Live upgrade safe?

2015-09-11 Thread Ian Zimmerman
On 2015-08-14 17:45 +0200, Reindl Harald wrote: > >Can I safely upgrade SA from 3.4.0 to 3.4.1 without changing any local > >configuration files, and without regenerating the Bayes database? (I > >use the default bdb Bayes store.) > > yes, but you need to run "sa-update" before restart to fetch

Re: best way to whitelist this list?

2015-09-19 Thread Ian Zimmerman
On 2015-09-19 20:12 +0200, A. Schulze wrote: > today I was notified by ezmlm that my MTA rejected messages to > me. Messages to this list were classified as spam by .. spamassassin. All of today's messages here scored around -7.5 for me, with no special handling.

Re: A Plan to Stop Violence on Social Media

2015-12-16 Thread Ian Zimmerman
On 2015-12-16 14:21 -0800, jdow wrote: > One thing worth pointing out is if this CAN be done refusing to do it > yourself is a shallow gesture. No, it is not. Refusing to take part in what you believe is wrong, even if you know the wrong will be done eventually because the Zeitgeist favors it,

Re: Trying Bayes / Redis

2015-12-11 Thread Ian Zimmerman
On 2015-12-11 14:29 -0800, Marc Perkel wrote: > Anyone using this rule timing plugin? Having trouble getting it to > work. Just wondering if it's worth it? > > Mail::SpamAssassin::Plugin::RuleTimingRedis I use it and I have no trouble now. But I remember I had to disable the LUA scripting

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Ian Zimmerman
On 2015-12-29 20:41 -0500, Bill Cole wrote: > Neither su nor sudo magically changes the permissions or ownership of > files. If you pass filenames as arguments they must be readable by the > user actually running sa-learn, which is the *unprivileged* user > handling the system-wide BayesDB

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Ian Zimmerman
On 2015-12-29 19:44 -0500, Bill Cole wrote: > On 29 Dec 2015, at 18:54, Ian Zimmerman wrote: > > >In fact sa-learn accepts multiple named arguments on the command line, > >so the alternative I use is to go through the spambox N files at a time > >in a shell loop. (I
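
The "N files at a time" loop mentioned above can be sketched in Python roughly as follows. This is an illustration under assumptions: the batch size, the file names, and the helper names are invented, and the commands are only built, not executed. The only sa-learn option used is `--spam`, which appears elsewhere in these threads.

```python
from itertools import islice
from typing import Iterable, Iterator, List

def batches(paths: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` paths."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(paths)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def learn_commands(paths: Iterable[str], size: int = 50) -> List[List[str]]:
    """Build one `sa-learn --spam` argument list per batch (not executed)."""
    return [["sa-learn", "--spam", *chunk] for chunk in batches(paths, size)]

# Hypothetical file names, for illustration only.
files = [f"spam/msg{i}" for i in range(5)]
for cmd in learn_commands(files, size=2):
    print(" ".join(cmd))
```

Each argument list could then be handed to `subprocess.run`. An empty input produces no commands at all, so sa-learn is never invoked without file arguments, and the per-call argument count stays bounded regardless of how large the spam folder grows.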

Re: Is BAYES filtering working? Having doubts.

2015-12-29 Thread Ian Zimmerman
On 2015-12-29 17:50 -0500, Bill Cole wrote: > Yes, with the advantage of using Mail::SpamAssassin::Util::secure_tmpfile() > rather > than whatever I happen to roll up in a bit of Q shell that I never get > around to > reviewing for edge cases... > > The main reason to do something like that is

Bayes expiry vs. sync, again

2016-03-15 Thread Ian Zimmerman
I am sorry to return to this horse which has perhaps been beaten enough. But I still don't know and don't understand (_after_ reading the docs) if I can, at the same time: 1. completely disable expiry 2. force a sync of the journal I just saw with my own eyes that passing --sync to sa-learn

Re: Interesting rule combo results

2016-03-09 Thread Ian Zimmerman
On 2016-03-09 07:12 -0800, Marc Perkel wrote: > >>HAM RULES: > >>... > >> 80056 HTML_MESSAGE > > > >What's happening here? This seems to imply that HTML_MESSAGE only > >appears in ham. > > > > > > I think my results are a little strange in that I might not be > training off all the data

Re: Disabling spamcop plugin

2016-04-07 Thread Ian Zimmerman
On 2016-04-07 14:37 +0100, RW wrote: > What exactly are you trying to do here? > > The pyzor plugin does testing and reporting, use_pyzor is mostly there > to control the test. The spamcop plugin does reporting only. So, if I don't do any explicit reporting (neither spamc -C nor spamassassin

Disabling spamcop plugin

2016-04-06 Thread Ian Zimmerman
Is there any way to disable the spamcop plugin for an individual user (i.e. from ~/.spamassassin/user_prefs) if the plugin is loaded by /etc/spamassassin/*.pre ? By comparison, I seem to be able to disable pyzor even if it is loaded, by writing use_pyzor 0 in my user_prefs.

[OT] still configuring [Was: Disabling spamcop plugin]

2016-04-12 Thread Ian Zimmerman
On 2016-04-12 10:57 -0400, David Niklas wrote: > You could use Gentoo, you get to configure it all yourself! Funny you'd say that, I _am_ actually switching to it - on my "workstation" role computers. I'm already over 50% over the hump, I think. But on "server type" computers, I just cannot

Re: [OT] still configuring [Was: Disabling spamcop plugin]

2016-04-13 Thread Ian Zimmerman
On 2016-04-13 09:12 -0400, Michael Orlitzky wrote: > package will be recompiled automatically as part of the updates. Any > packages *depending on* that package (like, if they're statically linked > to it) will also be recompiled. But also _direct_ dependencies of the affected package, if the

Re: sa-update through proxy

2016-05-04 Thread Ian Zimmerman
On 2016-05-04 08:13 -0700, John Hardin wrote: > > alias sa-update='env http_proxy=http://myserver:myport/ > > https_proxy=http://myserver:myport/ sa-update' > > Lose the "env"? Why? Apart from using an extra process, this should work exactly the same.

Reporting [Was: Disabling spamcop plugin]

2016-04-21 Thread Ian Zimmerman
On 2016-04-07 13:55 -0700, Ian Zimmerman wrote: > sa-learn doesn't do any reporting, right? [snip snip] > By the way, manpage for spamc says: > >-C report type, --reporttype=type >Report or revoke a message to one of the configured >colla

Re: Childish actions of Harald Reindl

2016-08-05 Thread Ian Zimmerman
On 2016-08-05 09:46 +0100, Martin wrote: > The biggest reason is the way this mailing list is set up, when you > click reply it replies to the poster not the list, this has always > been a bugbear of mine and something that probably should be > addressed. Then don't "click reply" but use a

Re: Issue on disable ipv6

2016-07-01 Thread Ian Zimmerman
On 2016-07-01 20:25 +0200, Massimo Sandolo wrote: > Hi, > I have an issue when I try to disable ipv6. > I'm running Debian 8.3 with SpamAssassin version 3.4.0 (running on Perl > version 5.20.2). > In /etc/default/spamassassin the options line is the following: > OPTIONS="-4 --create-prefs

New type of monstrosity

2017-02-06 Thread Ian Zimmerman
Last couple of weeks I saw some messages whose entire contents is in the Subject. They have both a text/plain and text/html part but both are empty (in the case of html, there is some markup but no character data). The Subject is maybe 400 or 500 chars long. Needless to say, this is a 100% spam

Re: New type of monstrosity

2017-02-06 Thread Ian Zimmerman
On 2017-02-06 20:06, Kevin A. McGrail wrote: > > Last couple of weeks I saw some messages whose entire contents is in > > the Subject. > never seen such a monster. likely killed by some other piece in the > puzzle. Throw it up on pastebin? http://pastebin.com/PYaMcZa7 (I was wrong, the

Re: New type of monstrosity

2017-02-07 Thread Ian Zimmerman
On 2017-02-07 09:37, Matus UHLAR - fantomas wrote: > 11.5 - 3.5 = 8.0 And of course 1.2.3.x is not the true relay address, so > 1.5 BOTNET Relay might be a spambot or virusbot > [botnet0.8,ip=1.2.3.12,rdns=disorder.censored.net,maildomain=outlook.fr,baddns] this goes out of the

Re: RFC compliance pedantry (was Re: New type of monstrosity)

2017-02-07 Thread Ian Zimmerman
On 2017-02-07 18:33, Ruga wrote: > I follow the actual RFC standard, not the proposed revisions. The To > From and Cc fields are defined by a grammar AND a natural language > description. Such fields MUST hold addresses, where an address is a > username, the "@" symbol, and a domain name. The string

Re: Ignore third-party SA headers

2017-01-25 Thread Ian Zimmerman
On 2017-01-26 01:03, RW wrote: > Probably what's happening is that these are emails over 500 kB which > by default are just passed through by spamc without sending them to > spamd. If they don't get sent to spamd the existing SA headers don't > get stripped. > > You can set the -s parameter

Re: Fastest listing RBL ?

2017-02-15 Thread Ian Zimmerman
On 2017-02-15 16:30, Tom Hendrikx wrote: > Note that the period that you describe as 'seen by SA a bit later' is > typically less than a second. Not in my case. I have a custom Exim configuration where I intentionally wait for a period of time (currently 4 minutes) between SMTP acceptance and
