Re: [POSSIBLE SPAM] Re: a concept for spam filter

2012-11-11 Thread Russell L. Harris
* Cameron Simpson c...@zip.com.au [121103 09:12]:
... 
   - anything else lands in my UNKNOWN folder; it is 99% spam
 I always sort that folder on subject when I visit it;
 it makes tossing it much easier because a lot of spam gets repeated
 in big chunks
...

Cameron,

I thank you for the detailed reply; I still am digesting it, and I
hope very soon to begin using the method which you described.

But your reply provided an unexpected and immediate benefit: I have
used Mutt for years, but I never knew that the index could be sorted.
I learned how through a quick search with Google for "mutt sort
index", so now handling unknown mail is much easier.  Thanks!

RH


Re: a concept for spam filter

2012-11-11 Thread Cameron Simpson
On 11Nov2012 09:02, Russell L. Harris rlhar...@oplink.net wrote:
| * Cameron Simpson c...@zip.com.au [121103 09:12]:
|- anything else lands in my UNKNOWN folder; it is 99% spam
|  I always sort that folder on subject when I visit it;
|  it makes tossing it much easier because a lot of spam gets repeated
|  in big chunks
| 
| I thank you for the detailed reply; I still am digesting it, and I
| hope very soon to begin using the method which you described.
| 
| But your reply provided an unexpected and immediate benefit: I have
| used Mutt for years, but I never knew that the index could be sorted.
| I learned how through a quick search with Google for "mutt sort
| index", so now handling unknown mail is much easier.  Thanks!

FYI, my sort settings are thus:

  folder-hook . set sort=reverse-threads
  folder-hook '/(spam|junk|U|UNKNOWN)$' set sort=subject
  folder-hook tozap set sort=threads
  set sort_aux=last-date

Cheers,
-- 
Cameron Simpson c...@zip.com.au

Microsoft - where cross platform means runs in both Win95 and WinNT.
- Andy Newman a...@research.canon.com.au


Re: a concept for spam filter

2012-11-04 Thread Jamie Paul Griffin
/ Cameron Simpson wrote on Sun  4.Nov'12 at  9:04:16 +1100 /

 It also parses each message header just once, on demand, so to test
 hundreds of rules the parsing happens only once. And of course the rules
 are parsed when I start mailfiler, not for each message. The other
 upside of extracting the core address part is that you can do this:
 
   friends Friends from:(FRIENDS)
 
 which means match if the address in the From: header is in my friends
 group, a set of addresses pulled in from a text db. Again, parsed just
 at load time. So very fast. When I was using procmail I actually had
 code that generated an enormous regexp with tens of addresses in it.
 Ghastly!
 
   :0
   * ^(to|cc):.*\(cameron\.simpson@gmail\.com|cameron\.simpson@me\.com|cs@zip\.com\.au|...
   * ^from:.*(huge regexp for family etc kilobytes long...
 
 My now obsolete .procmailrc for the spool-in folder is 1036401 bytes
 long. Nasty!

I see your point. I've not looked into other filtering methods a great deal but 
I intend to. Mind you, I don't have many friends so my procmail rules are 
pretty basic ATM lol.


Re: a concept for spam filter

2012-11-04 Thread Cameron Simpson
On 04Nov2012 07:31, Jamie Paul Griffin ja...@kode5.net wrote:
| I see your point. I've not looked into other filtering methods a great
| deal but I intend to. Mind you, I don't have many friends so my procmail
| rules are pretty basic ATM lol.

I'm on a lot of mailing lists, and several have multiple rules.
-- 
Cameron Simpson c...@zip.com.au

This is not a bug. It's just the way it works, and makes perfect sense.
- Tom Christiansen tchr...@jhereg.perl.com
I like that line. I hope my boss falls for it.
- Chaim Frenkel cha...@cris.com


Re: a concept for spam filter

2012-11-03 Thread Cameron Simpson
On 02Nov2012 20:15, Russell L. Harris rlhar...@oplink.net wrote:
| * Jamie Paul Griffin ja...@kode5.net [121102 19:36]:
|  I have set up macros that bind keys to pass messages to spamassassin
|  using sa-learn and then puts the message into the spam mailbox. Is
|  this the type thing you mean? The spam mailbox can later be used to
|  train spamassassin for future filtering, using procmail.
| 
| Is the return path the true address of the sender?

No.

In legitimate messages it is where errors should go. For a person, that
would normally be themselves. But for a mailing list it would be the
list admin.

| If so, I would
| like to blacklist such addresses.

Not a lot of point.

| The From: field shown in the
| message index of Mutt almost always is rewritten -- sometimes to an
| address from which valid messages may originate -- so I hesitate to
| blacklist such addresses.

Indeed.

I've just implemented a script for this (adding a message subject as a
spam filter rule), and I have this macro:

  macro index,pager SS "<save-message>+spool-spam-subj<enter><next-undeleted>" "delete message as spam by subject"

I have arranged that messages saved to +spool-spam-subj get their subjects
saved to my spam-subj mail filing rules file. Details below.

I am not using spamassassin myself, but have a fairly effective
strategy:

  - I catch important messages as being to me and from a
    whitelist of known addresses (actually a whitelist of address
    groups - the SO, family, friends and an assortment of business
    entities like my credit union)

  - I catch and alert on a short list of very specific messages from
    monitoring systems, based on to: and from: _and_ subject:

  - an ad hoc list of other special rules

  - I have a zillion rules for various mailing lists, generally based on
    to/cc: or sender:

  - anything else lands in my UNKNOWN folder; it is 99% spam.
    I always sort that folder on subject when I visit it;
    it makes tossing it much easier because a lot of spam gets repeated
    in big chunks

This keeps my inbox fairly clean without nebulous bayesian filters etc.

The SO also keeps a blacklist of subject lines; I've been meaning to do
the same and your post has kicked me to do so.

How it works:

My mail setup is as follows:

  - I fetch with getmail, delivering to my spool maildir

  - I filter messages using my mailfiler program, which monitors
    a list of maildirs once a second, thus:

      mailfiler monitor -d 1 ~/mail/spool ~/mail/spool-in ~/mail/spool-out ~/mail/spool-spam-subj

The spool rules are meant to divert the spam and deposit probable
nonspam into the spool-in maildir, which runs all the rules for the
mailing lists etc.

The rules file for spool sources (includes) my spam-subj rule file,
which currently says:

  # spam matching by subject line
  =spam SPAM-SUBJ subject:/^Anz e-banking alert

which files messages to the spam folder with the X-Label SPAM-SUBJ
if they match my (sole, so far) common spam subject.

The reason it is a separate file is that the script to add a new subject
line rewrites that file.

The script itself is here:

  
https://bitbucket.org/cameron_simpson/css/src/tip/bin-cs/email-add-spam-subject

The filter rules for the spool-spam-subj maildir (monitored above by the
mailfiler) read:

   env
  # add this message's subject to the spam subject list
  |email-add-spam-subject . .
  spam. .

It pipes the message to the email-add-spam-subject script (which
rewrites the spam-subj rules file) and saves the message in my spam
folder where it all accumulates, like a septic tank.

email-add-spam-subject emits a new rule and sorts it into the existing
rules, rewriting the file if it changes.
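
As a rough illustration of that "emit, sort in, rewrite only on change"
step, here is a minimal Python sketch. It is not the
email-add-spam-subject script linked above; the function name and the
header handling are assumptions, and the rule text simply mirrors the
one-line example shown earlier.

```python
import os
import tempfile

# Assumed layout, copied from the single example rule quoted in the post.
HEADER = "# spam matching by subject line"
RULE_PREFIX = "=spam SPAM-SUBJ subject:/^"


def add_subject_rule(rules_path, subject):
    """Add a rule for `subject`; return True only if the file was rewritten.

    The subject is used literally. A real tool would probably need to
    escape regexp metacharacters; that is left out of this sketch.
    """
    new_rule = RULE_PREFIX + subject.strip()
    try:
        with open(rules_path, encoding="utf-8") as f:
            lines = f.read().splitlines()
    except FileNotFoundError:
        lines = []

    # Keep existing comment lines on top; fall back to a default header.
    comments = [ln for ln in lines if ln.startswith("#")] or [HEADER]
    rules = {ln for ln in lines if ln.strip() and not ln.startswith("#")}
    if new_rule in rules:
        return False  # nothing changed, so leave the file untouched

    rules.add(new_rule)
    # Write to a temporary file in the same directory, then rename it into
    # place, so a filter watching the file never sees a half-written copy.
    directory = os.path.dirname(os.path.abspath(rules_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write("\n".join(comments + sorted(rules)) + "\n")
    os.replace(tmp_path, rules_path)
    return True
```

Leaving the file alone when nothing changed matters here because the
filter reloads its rules whenever the file is updated.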

Cheers,
-- 
Cameron Simpson c...@zip.com.au

Is it true, Sen. Bedfellow, that your wife rides with bikers?   - Milo Bloom


Re: a concept for spam filter

2012-11-03 Thread Jamie Paul Griffin
/ Cameron Simpson wrote on Sat  3.Nov'12 at 20:08:03 +1100 /

 On 02Nov2012 20:15, Russell L. Harris rlhar...@oplink.net wrote:
 | * Jamie Paul Griffin ja...@kode5.net [121102 19:36]:
 |  I have set up macros that bind keys to pass messages to spamassassin
 |  using sa-learn and then puts the message into the spam mailbox. Is
 |  this the type thing you mean? The spam mailbox can later be used to
 |  train spamassassin for future filtering, using procmail.
 | 
 | Is the return path the true address of the sender?
 
 No.
 
 In legitimate messages it is where errors should go. For a person, that
 would normally be themselves. But for a mailing list it would be the
 list admin.
 
 | If so, I would
 | like to blacklist such addresses.
 
 Not a lot of point.
 
 | The From: field shown in the
 | message index of Mutt almost always is rewritten -- sometimes to an
 | address from which valid messages may originate -- so I hesitate to
 | blacklist such addresses.
 
 Indeed.
 
 I've just implemented a script for this (adding a message subject as a
 spam filter rule), and I have this macro:
 
   macro index,pager SS "<save-message>+spool-spam-subj<enter><next-undeleted>" "delete message as spam by subject"
 
 I have arranged that messages saved to +spool-spam-subj get their subjects
 saved to my spam-subj mail filing rules file. Details below.
 
 I am not using spamassassin myself, but have a fairly effective
 strategy:

See this is where experience prevails. I would not have thought of something 
like this. Thanks Cameron for sharing, it looks like an effective method. 

I just have my mail delivered by smtp so use OpenBSD spamd and spamassassin as 
well as clamav and unofficial sigs, with procmail sorting as i mentioned. So my 
set up is different of course. 


Re: a concept for spam filter

2012-11-03 Thread Cameron Simpson
On 03Nov2012 11:03, Russell L. Harris rlhar...@broadcaster.org wrote:
| I thank you for taking the trouble to give a detailed reply, Cameron.
| I have printed it out, and I plan to study it carefully in the
| morning.

On 03Nov2012 09:53, Jamie Paul Griffin ja...@kode5.net wrote:
| I just have my mail delivered by smtp so use OpenBSD spamd and
| spamassassin as well as clamav and unofficial sigs, with procmail sorting
| as i mentioned. So my set up is different of course.

I should have made it clear that my setup is a bit roundabout.

The natural macro for this would simply pipe the message to
email-add-spam-subject and then delete it or save it to the known spam
bucket.

I save it to a special spool folder for two reasons:

  - it is quicker to just save a message to a folder than to pipe the
    message to a program which does some work, making for a snappier user
    experience; I don't care that the subject line isn't in my rules until a
    few seconds later (mailfiler will pick it up as part of its regular
    scan)

  - I've already got my system monitoring maildirs as spools with simple
    rules, so folding this in was very easy

The important parts are the script to add a new rule and telling your
filtering software about the rule update.

Procmail rereads (and therefore recompiles, alas) the rules file
every time you fire it up; my mailfiler notices rule file changes and
reloads the rules when they are updated.
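
A long-running filter can get that behaviour by remembering the rule
file's modification time and reparsing only when it changes. The Python
sketch below shows the general technique only; the class name and the
parse callback are made up for illustration and are not how mailfiler
is written.

```python
import os


class ReloadingRules:
    """Cache parsed rules; reparse only when the file on disk changes."""

    def __init__(self, path, parse=str.splitlines):
        self.path = path
        self.parse = parse      # turns the file text into rule objects
        self._stamp = None      # (mtime, size) seen at the last load
        self._rules = []

    def current(self):
        st = os.stat(self.path)
        stamp = (st.st_mtime_ns, st.st_size)
        if stamp != self._stamp:
            # File is new or has been rewritten since the last check.
            with open(self.path, encoding="utf-8") as f:
                self._rules = self.parse(f.read())
            self._stamp = stamp
        return self._rules
```

Calling current() once per scan costs one stat() when nothing has
changed, instead of a full reparse per message.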

I outlined my setup to give background and to show that a small leading
blacklist and an UNKNOWN folder for messages matching no filing rule
diverts most stuff away from your inbox fairly effectively without
spamassassin et al.

Regarding filing tools:

I used to use procmail. At some point I decided its rule syntax was
too painful, especially if you want to do a few things with _every_
filing, like X-Labels, log lines and so forth, so some years ago I
wrote cats2procmailrc to take a simple rule syntax and transcribe
a procmailrc. And I finally decided to write something that directly
understood my rule syntax, which has several advantages: reads the rules
once (more performant!), doesn't need a wrapper script to watch maildirs,
leaves me free to make the rules say what I want instead of what can be
said to procmail.

My core gripe with procmail, aside from the from-scratch startup per
message thing, is that it works entirely off regexps. This is not a good
way to parse email addresses. These are all equivalent:

  c...@zip.com.au
  Cameron Simpson <c...@zip.com.au>
  (Cameron Simpson) c...@zip.com.au

Matching that while not matching:

  c...@zip.com.au
  foo.cs.zip.com.au

and so forth just does not work reliably. A mailfiler rule like this:

  me to-me c...@zip.com.au

files to the folder me with the tag/x-label to-me if the to/cc/bcc
contains c...@zip.com.au in the address component as extracted by a
proper RFC2822 parser. No regexps, just string equality tests.
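
For comparison, here is a minimal Python sketch of the same idea. It is
not mailfiler's code: it leans on the standard library's
email.utils.getaddresses to extract the core address from each
recipient header and then tests by plain string equality. The function
names and the example.com addresses are placeholders.

```python
from email.utils import getaddresses


def core_addresses(header_values):
    """Return the lowercased core address of every address found."""
    return {addr.lower() for _name, addr in getaddresses(header_values) if addr}


def addressed_to(headers, target):
    """True if `target` is a recipient in To, Cc or Bcc.

    `headers` maps lowercased header names to lists of raw header values.
    No regexps are involved, just a parse followed by set membership.
    """
    values = []
    for name in ("to", "cc", "bcc"):
        values.extend(headers.get(name, []))
    return target.lower() in core_addresses(values)


# The equivalent spellings all match after parsing...
for raw in ("cs@example.com",
            "Cameron Simpson <cs@example.com>",
            "(Cameron Simpson) cs@example.com"):
    assert addressed_to({"to": [raw]}, "cs@example.com")

# ...while a lookalike address does not.
assert not addressed_to({"cc": ["foo-cs@example.com"]}, "cs@example.com")
```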

It also parses each message header just once, on demand, so to test
hundreds of rules the parsing happens only once. And of course the rules
are parsed when I start mailfiler, not for each message. The other
upside of extracting the core address part is that you can do this:

  friends Friends from:(FRIENDS)

which means match if the address in the From: header is in my friends
group, a set of addresses pulled in from a text db. Again, parsed just
at load time. So very fast. When I was using procmail I actually had
code that generated an enormous regexp with tens of addresses in it.
Ghastly!

  :0
  * ^(to|cc):.*\(cameron\.simpson@gmail\.com|cameron\.simpson@me\.com|cs@zip\.com\.au|...
  * ^from:.*(huge regexp for family etc kilobytes long...

My now obsolete .procmailrc for the spool-in folder is 1036401 bytes
long. Nasty!
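
The group test can be pictured like this in Python. This is an
illustration only, assuming a hypothetical text db with one address per
line (the real db format is not shown in this thread): the group is
parsed once into a set, so each From: check is a single set lookup
rather than a match against a kilobytes-long regexp.

```python
from email.utils import parseaddr


def load_group(lines):
    """Parse a group once, at load time, into a set of lowercased addresses.

    Assumed format: one address per line, '#' starts a comment.
    """
    group = set()
    for line in lines:
        line = line.split("#", 1)[0].strip()
        addr = parseaddr(line)[1]
        if addr:
            group.add(addr.lower())
    return group


def from_in_group(from_header, group):
    """True if the core address of a From: header value is in the group."""
    addr = parseaddr(from_header)[1]
    return bool(addr) and addr.lower() in group
```

Adding a friend then means adding one line to the db, not regenerating
a pattern.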

Cheers,
-- 
Cameron Simpson c...@zip.com.au

Very few things happen at the right time, and the rest do not happen at all.
The conscientious historian will correct these defects.
- Mark Twain, _A Horse's Tale_


a concept for spam filter

2012-11-02 Thread Russell L. Harris
Has anyone devised a spam filter into which an address and subject
line could be entered simply by pressing a key while viewing the mutt
index?

That way, whenever I go down the index pressing the d key to
mark spam items to be deleted, perhaps by pressing some other key I
could mark spam items to be both deleted and added to a spam filter,
so that neither the same sender nor the same subject line ever again
would clutter my screen.

RLH


Re: a concept for spam filter

2012-11-02 Thread Jamie Paul Griffin
/ Russell L. Harris wrote on Fri  2.Nov'12 at  9:21:20 + /

 Has anyone devised a spam filtering into which an address and subject
 line could be entered simply by pressing a key while viewing the mutt
 index?
 
 That way, whenever I go down the index pressing the d key to
 mark spam items to be deleted, perhaps by pressing some other key I
 could mark spam items to be both deleted and added to a spam filter,
 so that neither the same sender nor the same subject line ever again
 would clutter my screen.
 
 RLH

I have set up macros that bind keys to pass messages to spamassassin using 
sa-learn and then put the message into the spam mailbox. Is this the type 
of thing you mean? The spam mailbox can later be used to train spamassassin for 
future filtering, using procmail. 


Re: a concept for spam filter

2012-11-02 Thread Russell L. Harris
* Jamie Paul Griffin ja...@kode5.net [121102 19:36]:
 
 I have set up macros that bind keys to pass messages to spamassassin
 using sa-learn and then puts the message into the spam mailbox. Is
 this the type thing you mean? The spam mailbox can later be used to
 train spamassassin for future filtering, using procmail.

Is the return path the true address of the sender?  If so, I would
like to blacklist such addresses.  The From: field shown in the
message index of Mutt almost always is rewritten -- sometimes to an
address from which valid messages may originate -- so I hesitate to
blacklist such addresses.

At present, I am using mailfilter to delete messages from the POP3
server of my ISP before download.  This helps greatly; but the
spammers keep changing subjects and addresses, so maintenance of the
.mailfilterrc file is becoming an unreasonable burden.

Then I am using getmail to download messages and maildrop to sort
them.

How much trouble is it to set up SpamAssassin for a setup such as
mine, in which cron is used to pull down messages every ten or fifteen
minutes?  Must I switch from getmail and maildrop to procmail?

Ultimately, the solution is to make spamming a crime for which the
punishment is a suspended sentence.

RLH




Re: Which spam filter do you use?

2007-10-05 Thread M. Fioretti
On Fri, Oct 05, 2007 00:23:06 AM +0200, Eyolf Østrem
([EMAIL PROTECTED]) wrote:

 So, should I switch? I'm quite happy with bogo, especially with the
 current setup with some macros I borrowed from an article in linux
 journal (I think it was), but I would very much like to hear what
 your experiences are in this respect.

Besides some MTA-level filtering, I am using bogo instead of
spamassassin because of one simple reason: lack of maintenance.

I've read several times that SA rules must be constantly updated and
added, otherwise they get half-useless every few weeks. Is this still
true?

Bogofilter, in any case, _does_ let a few spam go through every
day. But once you have installed it and integrated it with Mutt,
procmail, whatever, it just works by itself. Upgrading SA can be
automated, of course, but is one more thing that must work fine all
the time, so I still prefer the other approach: 2/3 minutes every day
to retrain bogofilter on missed spam (via mutt macros and scripts, of
course) versus keeping a constant eye on what's happening in the SA
world.

Of course, if one has to set things up for many people, it is an
entirely different story.

One thing I plan to do some day, when I have time, is to try something
I've read online: install _another_ bayesian filter and run it from
procmail _only_ on those messages which bogofilter classified as
almost spam. This would minimize resource consumption, as the
lighter filter is enough to trap almost everything, but increase
accuracy while still being zero-maintenance or so.

HTH,
Marco
-- 
The one book on software and digital technologies that no parent or
teacher can ignore:  http://digifreedom.net/node/84


Re: Which spam filter do you use?

2007-10-05 Thread Wilkinson, Alex
0n Fri, Oct 05, 2007 at 08:06:55AM +0200, M. Fioretti wrote: 

On Fri, Oct 05, 2007 00:23:06 AM +0200, Eyolf Østrem
([EMAIL PROTECTED]) wrote:

 So, should I switch? I'm quite happy with bogo, especially with the
 current setup with some macros I borrowed from an article in linux
 journal (I think it was), but I would very much like to hear what
 your experiences are in this respect.

Besides some MTA-level filtering, I am using bogo instead of
spamassassin because of one simple reason: lack of maintenance.

Have you got any documents describing the process of integrating bogofilter 
with mutt ?

 -aW




Re: Which spam filter do you use?

2007-10-05 Thread Kyle Wheeler
On Friday, October  5 at 08:06 AM, quoth M. Fioretti:
Besides some MTA-level filtering, I am using bogo instead of
spamassassin because of one simple reason: lack of maintenance.

I've read several times that SA rules must be constantly updated and
added, otherwise they get half-useless every few weeks. Is this still
true?

Bah; SpamAssassin is the swiss-army-knife of spam filters. It includes 
a bayesian filter (not to mention things like razor and dcc, which are 
constantly up-to-date), and as such, does not require updating the 
rules. Updating the rules can *help*, but it is not required to 
continue functioning at a reasonably high level.

The real criticism I'd level against SpamAssassin as compared to 
bogofilter is that SpamAssassin's bayesian classifier is relatively 
simple. To my knowledge, it doesn't tokenize word-pairs and phrases, 
but just single words; thus, something that uses more advanced 
bayesian techniques (I presume bogofilter fits this description?) may 
well beat it at that particular game---which is where updating the 
rules can help as a compensating factor. It's not like a virus-scanner 
where an out-of-date database is worthless.
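
To make the single-word versus word-pair distinction concrete, here is
a toy Python tokenizer. It illustrates the general idea only and is not
the tokenizer of SpamAssassin, bogofilter or any other filter; the
function name and the max_words parameter are invented for this sketch.

```python
def tokens(text, max_words=1):
    """Return single words plus, if max_words > 1, runs of adjacent words."""
    words = text.lower().split()
    out = []
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            out.append(" ".join(words[i:i + n]))
    return out


# Single words only: "free" and "money" are scored separately.
print(tokens("Free money now"))
# With pairs, "free money" becomes a token of its own, which can be far
# more telling than either word alone.
print(tokens("Free money now", max_words=2))
```

The price of the extra tokens is a larger word database.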

~Kyle
-- 
Just because you do not take an interest in politics doesn't mean 
politics won't take an interest in you.
   -- Pericles (430 BC)


Re: Which spam filter do you use?

2007-10-05 Thread M. Fioretti
On Fri, Oct 05, 2007 07:56:44 AM +0200, Christian Kuka ([EMAIL PROTECTED]) 
wrote:
 
 At the moment I'm using bogofilter, razor, pyzor, dcc, spamassassin 
 and clamassassin and the following procmail rules:

This (cascading several filters, lightest to heaviest) is what in my
other reply I said I'd like to try someday. Do you have any _measure_
of how much better it is wrt a bogofilter-only setup?

Something like "bogo alone stops 80% of spam, razor 80% of what bogo
missed, SA 80% of what is still left" etc.

Thanks,
Marco
-- 
Your own civil rights and the quality of your life heavily depend on
how software is used *around* you:http://digifreedom.net/node/84


Re: Which spam filter do you use?

2007-10-05 Thread M. Fioretti
On Fri, Oct 05, 2007 14:25:43 PM +0800, Wilkinson, Alex wrote:
 Have you got any documents describing the process of integrating
 bogofilter with mutt ?

I just used these:
http://www.linuxjournal.com/article/6439
http://www.linuxjournal.com/article/7436

Another approach, valid also on IMAP and compatible with webmail, is
to just move all the spam missed by bogofilter into a predefined folder
with a macro, and then have a cron shell script which runs the same
bogofilter command as those mutt macros on each file in that
directory and then removes the files.
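
A cron job along those lines could look roughly like the Python sketch
below (Python rather than shell, purely for illustration). Several
things are assumptions: the folder is a flat directory with one message
per file, "bogofilter -s" (register as spam) is the command the macros
use, and the command exits 0 on success. Check the bogofilter man page
and your own macros before relying on any of that.

```python
import subprocess
from pathlib import Path


def retrain_missed_spam(folder, command=("bogofilter", "-s")):
    """Feed each message file in `folder` to `command` on stdin.

    A file is removed only if the command exits with status 0, so a
    failed run leaves the message in place for the next attempt.
    Returns the number of messages processed successfully.
    """
    trained = 0
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories such as Maildir's cur/new/tmp
        with path.open("rb") as message:
            result = subprocess.run(list(command), stdin=message)
        if result.returncode == 0:
            path.unlink()
            trained += 1
    return trained
```

From cron this would be called with the path of the missed-spam folder,
for example every few hours.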

HTH,
Marco
-- 
The Family Guide to Digital Freedom:  http://digifreedom.net/node/99


Re: Which spam filter do you use?

2007-10-05 Thread Michael Tatge
* On Fri, Oct 05, 2007 M. Fioretti ([EMAIL PROTECTED]) muttered:
 On Fri, Oct 05, 2007 14:25:43 PM +0800, Wilkinson, Alex wrote:
 Another approach, valid also on Imap and compatible with webmail, is
 to just move all the spam missed by bogofilter in a predefined with a
 macro,

When imap is involved sieve really is the way to go.
e.g.

require ["fileinto"];

# spam level 5 * or more
if header :contains "X-Spam-Level" "*****" {
   fileinto "INBOX/spam";
   stop;
}



HTH,

Michael
-- 
Excusing bad programming is a shooting offence, no matter _what_ the
circumstances.
-- Linus Torvalds, to the linux-kernel list

PGP-Key-ID: 0xDC1A44DD
Jabber: [EMAIL PROTECTED]


Re: Which spam filter do you use?

2007-10-05 Thread Eyolf Østrem
On 05.10.2007 (02:08), Kyle Wheeler wrote:
 On Friday, October  5 at 08:06 AM, quoth M. Fioretti:
 Besides some MTA-level filtering, I am using bogo instead of
 spamassassin because of one simple reason: lack of maintenance.
 
 I've read several times that SA rules must be constantly updated and
 added, otherwise they get half-useless every few weeks. Is this still
 true?
 
 Bah; SpamAssassin is the swiss-army-knife of spam filters. It includes 
 a bayesian filter (not to mention things like razor and dcc, which are 
 constantly up-to-date), and as such, does not require updating the 
 rules. Updating the rules can *help*, but it is not required to 
 continue functioning at a reasonably high level.
 
 The real criticism I'd level against SpamAssassin as compared to 
 bogofilter is that SpamAssassin's bayesian classifier is relatively 
 simple. To my knowledge, it doesn't tokenize word-pairs and phrases, 
 but just single words; thus, something that uses more advanced 
 bayesian techniques (I presume bogofilter fits this description?) may 
 well beat it at that particular game---which is where updating the 
 rules can help as a compensating factor. It's not like a virus-scanner 
 where an out-of-date database is worthless.

So this does in fact mean that without that extra time tending the
rules, SA may actually let more spam through? 

It's not that I'm dissatisfied with my current situation. My #1
concern is actually with false negatives; I've since long given up
browsing through the CapturedSpam folder to check for them before I
delete everything. I haven't had any complaints from people who think
I neglect them (unless the complaints also end up in the filter...),
but in one comparison I read, it seemed that SA had absolutely 0 of
that, whereas bogo might have one or two (out of I don't remember how
many thousand).

All in all, it may not be such a bad thing; if there IS some mail I
haven't answered but should have, I can always say sorry, never got
your mail, it must have been trapped in the filter. I'd hate to lose
that excuse... :-)

eyolf

-- 
And 1.1.81 is officially BugFree(tm), so if you receive any bug-reports
on it, you know they are just evil lies.
(By Linus Torvalds, [EMAIL PROTECTED])


Re: Which spam filter do you use?

2007-10-05 Thread Michal Vitecek
 hi,

Eyolf Østrem wrote:
I've been using bogofilter ever since I first installed KDE/Kmail and the
potentially hassle-free configuration of SpamAssassin led to constant
crashes. I believe it has been solved by now, and in any case Kmail is
ancient history, but I was wondering what experiences the list people
have with various filters.

 we're using qsf (http://www.ivarch.com/programs/qsf/) which i personally
 find perfect. moreover it's fast so we can use it on our mail server
 without any problems (tried spamassassin, spambayes with bad
 experience).

-- 
fuf ([EMAIL PROTECTED])


Re: Which spam filter do you use?

2007-10-05 Thread Holger Weiss
* Eyolf Østrem [EMAIL PROTECTED] [2007-10-05 13:22]:
 On 05.10.2007 (02:08), Kyle Wheeler wrote:
  The real criticism I'd level against SpamAssassin as compared to
  bogofilter is that SpamAssassin's bayesian classifier is relatively
  simple.

Yes.  Also, SpamAssassin is terribly slow.

  To my knowledge, it doesn't tokenize word-pairs and phrases,
  but just single words; thus, something that uses more advanced
  bayesian techniques (I presume bogofilter fits this description?) may
  well beat it at that particular game---which is where updating the
  rules can help as a compensating factor. It's not like a virus-scanner
  where an out-of-date database is worthless.

 So this does in fact mean that without that extra time tending the
 rules, SA may actually let more spam through? 

Yes, even if you do take that time :-)  From my experience, which
includes a few benchmarks I did, Bogofilter's accuracy is way better than
SpamAssassin's, even when enabling SpamAssassin's bayesian classifier,
Razor, and a few other non-default modules.

Bogofilter's accuracy highly depends on good training, though.  It's
critical to not only train Bogofilter with misclassified messages but
also with messages it's unsure about.  Fine-tuning the configuration
might also increase Bogofilter's accuracy[*].

IMO, SpamAssassin is only useful if you don't want or cannot train your
spam filter for some reason (e.g., if you're an ISP, though in this case
SpamAssassin's bad performance can be a real drawback).

 It's not that I'm dissatisfied with my current situation. My #1
 concern is actually with false negatives; I've since long given up
 browsing through the CapturedSpam folder to check for them before I
 delete everything.

So you mean false positives :-)  In any case, any serious spam filter
allows for adjusting the spam/ham thresholds, so you can always buy
reducing the number of false positives to almost zero at the cost of
increasing the number of false negatives.  SpamAssassin's default
configuration does just that (which makes sense, of course).
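
The trade-off is easy to see with a toy example in Python. The scores
below are invented, and the single cutoff is a simplification that is
not tied to any particular filter's configuration options.

```python
def error_counts(scored, spam_cutoff):
    """Count (false_positives, false_negatives) for one cutoff.

    `scored` is a list of (score, is_spam) pairs; a message is filed as
    spam when its score is at or above `spam_cutoff`.
    """
    false_positives = sum(1 for s, is_spam in scored
                          if s >= spam_cutoff and not is_spam)
    false_negatives = sum(1 for s, is_spam in scored
                          if s < spam_cutoff and is_spam)
    return false_positives, false_negatives


# Hypothetical scores: three legitimate mails, three spams.
mail = [(0.05, False), (0.30, False), (0.70, False),
        (0.60, True), (0.90, True), (0.99, True)]

print(error_counts(mail, 0.50))  # (1, 0): one good mail lost, no spam passes
print(error_counts(mail, 0.95))  # (0, 2): no good mail lost, two spams pass
```

Raising the cutoff moves errors from the "lost mail" column to the
"spam in inbox" column; it does not make them disappear.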

 it seemed that SA had absolutely 0 of that, whereas bogo might have
 one or two (out of I don't remember how many thousand).

Depends on the configuration.

Holger

[*] You could try bogotune(1) and/or increasing multi-token-count,
though this increases the database size and decreases Bogofilter's
performance.  It _should_ increase Bogofilter's accuracy, though for
me it had far less effect than described in the following posting
(mainly because I get better results with multi-token-count=1 than
described in the posting):

http://www.bogofilter.org/pipermail/bogofilter-dev/2006-August/003357.html


Re: Which spam filter do you use?

2007-10-05 Thread Holger Weiss
* M. Fioretti [EMAIL PROTECTED] [2007-10-05 08:19]:
 On Fri, Oct 05, 2007 07:56:44 AM +0200, Christian Kuka ([EMAIL PROTECTED]) 
 wrote:
  At the moment I'm using bogofilter, razor, pyzor, dcc, spamassassin 
  and clamassassin and the following procmail rules:

 This (cascading several filters, lightest to heaviest) is what inmy
 other reply I said I'd like to try someday. Do you have any _measure_
 of how better it is wrt a bogofilter-only setup?

 Something like bogo alone stops 80% of spam, razor 80% of bogo
 missed, SA 80% of what is still left etc.

I did that myself for quite some time, but note that while this of
course reduces the number of false negatives, it also increases the
number of false positives, so you won't necessarily increase the global
accuracy with such a setup.  Giving useful _measures_ is really hard
though as they will heavily depend on the configuration/training of the
filters and of course also on the actual mail corpus they operate on.
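
The arithmetic behind both effects can be sketched in Python, under the
strong (and probably unrealistic) assumption that the filters err
independently of each other. The 80% catch rate is the figure from the
question; the 0.1% false positive rate per filter is made up for
illustration.

```python
def cascade_rates(stages):
    """Combine filters where a message is flagged if ANY stage flags it.

    `stages` is a list of (catch_rate, false_positive_rate) pairs.
    Returns (overall_catch_rate, overall_false_positive_rate), assuming
    the stages behave independently.
    """
    spam_passing = 1.0   # fraction of spam that slips past every stage
    ham_passing = 1.0    # fraction of good mail that survives every stage
    for catch_rate, fp_rate in stages:
        spam_passing *= 1.0 - catch_rate
        ham_passing *= 1.0 - fp_rate
    return 1.0 - spam_passing, 1.0 - ham_passing


catch, false_pos = cascade_rates([(0.8, 0.001)] * 3)
print(f"caught {catch:.1%} of spam, lost {false_pos:.2%} of good mail")
```

Three 80% stages leave 0.2 * 0.2 * 0.2 = 0.8% of the spam, so about
99.2% is caught, but the chance of losing a good mail roughly triples.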

Holger


Re: Which spam filter do you use?

2007-10-05 Thread M. Fioretti
On Fri, Oct 05, 2007 02:08:26 AM -0500, Kyle Wheeler
([EMAIL PROTECTED]) wrote:

 Bah; SpamAssassin is the swiss-army-knife of spam filters. It includes 
 a bayesian filter (not to mention things like razor and dcc, which are 
 constantly up-to-date), and as such, does not require updating the 
 rules.

But razor and dcc consume bandwidth, so to figure out if SA is
worthwhile maybe one should ask:

0) Do I have a flat rate fast connection, where I wouldn't notice SA
   constantly doing network checks?

1) after the whole message has been downloaded anyway, does SA block
   a LOT more spam than bogofilter or qsf? if the answer is yes,
   (and that is quite a big if, judging from both online literature
   and other answers in this thread...) is the difference big enough
   to justify the extra CPU and/or bandwidth consumption, plus keeping
   the rules updated?

I mean sure, SA has tons of extra non-bayesian tricks to catch spam,
but if the bayesian algorithms in bogofilter or qsf catch almost all
of it anyway without those extra tricks, bandwidth, cpu cycles and
manual maintenance... do I need to bother (this *is* a serious
question, I'm really trying to figure out if the _possibility_ to go
from just 1/2 spam messages a day in my inbox to 0 is worth the extra
effort)??

Of course, the answer depends on one's needs, how much mail he or she
receives and much other stuff. And on whether one has full control of the
MTA, where lots of spam can and should be recognized and blocked before
ever starting SA or any other content filter.

Ciao,   
Marco
-- 
Help your relatives, friends and partners love Free Standards and Free
Software!   http://digifreedom.net/node/84


Re: Which spam filter do you use?

2007-10-05 Thread M. Fioretti
On Fri, Oct 05, 2007 14:48:09 PM +0200, Michal Vitecek ([EMAIL PROTECTED]) 
wrote:
  hi,
 
 Eyolf Østrem wrote:
 I've been using bogofilter ever since I first installed KDE/Kmail and the
 potentially hassle-free configuration of SpamAssassin led to constant
 crashes. I believe it has been solved by now, and in any case Kmail is
 ancient history, but I was wondering what experiences the list people
 have with various filters.
 
  we're using qsf (http://www.ivarch.com/programs/qsf/) which i personally

Now I've remembered! The page I mentioned earlier in the thread, where
I had read of cascading different bayesian filters was centered just
around bogofilter and qsf, and is this one:

http://www.acme.com/mail_filtering/bayesian_frameset.html

where the author also justifies why he put the filters in that
particular order. It is an interesting read, and what I plan to try.

HTH
Marco
-- 
Help *everybody* love Free Standards and Free Software!
http://digifreedom.net


Re: Which spam filter do you use?

2007-10-05 Thread Eyolf Østrem
On 05.10.2007 (15:30), Holger Weiss wrote:
 * eyolf østrem [EMAIL PROTECTED] [2007-10-05 13:22]:
 
  It's not that I'm dissatisfied with my current situation. My #1
  concern is actually with false negatives; I've since long given up
  browsing through the CapturedSpam folder to check for them before I
  delete everything.
 
 So you mean false positives :-) 

Uhm... yes... positive, negative, who's to know what's what in the
long run?  

Anyway, thanks for all the feedback (I sortof knew when I sent it off
that I'm starting a long thread here). I think I'll start by
training mr bogo and ask him to be careful with those false
whatever.

e


-- 
The seven eyes of Ningauble the Wizard floated back to his hood as he
reported to Fafhrd: I have seen much, yet cannot explain all.  The Gray
Mouser is exactly twenty-five feet below the deepest cellar in the palace
of Gilpkerio Kistomerces.  Even though twenty-four parts in twenty-five of
him are dead, he is alive.
Now about Lankhmar.  She's been invaded, her walls breached
everywhere and desperate fighting is going on in the streets, by a fierce
host which out-numbers Lankhamar's inhabitants by fifty to one -- and
equipped with all modern weapons.  Yet you can save the city.
How? demanded Fafhrd.
Ningauble shrugged.  You're a hero.  You should know.
-- Fritz Leiber, The Swords of Lankhmar


Re: Which spam filter do you use?

2007-10-05 Thread Holger Weiss
* Christian Kuka [EMAIL PROTECTED] [2007-10-05 07:56]:
 I also read about a scanner called crm114 in the Linux magazine that
 should be really good, but never checked that.

I use OSBF-Lua[1] which is a port of CRM114.  It's a Bayesian classifier
which uses a more sophisticated algorithm than Bogofilter and others,
and which indeed performs even better than Bogofilter for me, regarding
both classification accuracy and speed.  It gives me a global accuracy
of about 99.9% (about 1 out of 1000 e-mails is misclassified).  Also,
it learns a lot quicker than Bogofilter; usually there's no need to do
any pre-training[2].  And it has far fewer configuration knobs to play
with, which is also good :-)  It just works out-of-the-box.

I put the Mutt (1.5.x) macros I use for training online:

ftp://ftp.in-berlin.de/pub/users/weiss/spamfilter/README.osbf4mutt
ftp://ftp.in-berlin.de/pub/users/weiss/spamfilter/osbf4mutt-0.2.tar.gz

Holger

[1] http://osbf-lua.luaforge.net/
[2] http://page.mi.fu-berlin.de/~siefkes/software/trainfilter/#section_1_4


Re: Which spam filter do you use?

2007-10-05 Thread Kyle Wheeler
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Friday, October  5 at 01:22 PM, quoth Eyolf Østrem:
 So this does in fact mean that without that extra time tending the 
 rules, SA may actually let more spam through? 

As in most cases, the answer is it depends. It *may* let more spam 
through. In my case, I use SA for everything, and I train it on 
everything I can lay hands on---and have for years. I didn't notice a 
difference in performance (for my personal mail) when I added 
sa-update to the crontab. On the other hand, not everyone on my server 
uses the training folders, and so their spamassassin configurations 
don't adapt as quickly to new spam trends. That is where, on my 
system, updating the rules files really helps.

But everyone has different experiences.

I will say, though, it is true that SpamAssassin is really slow, but 
not for the reason many people claim. We have a pretty standard 
spamc/spamd setup, and if I turn off all the network tests, it flies. 
But, because I find the extra tests SA performs valuable, I have it 
crunch through all the various DNS, DKIM, and remote-spam-database 
tests, which adds significantly to its average processing time. 
According to my logs, it takes anywhere from 4 to 16 seconds to 
process each message, which is pretty hefty! Then again, it's 
processing mail at delivery time so most folks don't notice, and it's 
doing it all in parallel... but because of those network tests, it's 
unlikely that I could improve performance significantly by putting it 
on a faster machine. If my email server was under heavier load, I'd 
probably have to look into changing my filtering setup (put SA on a 
different machine, boost my dns cache size, maybe add a prefilter of 
some kind, etc.).

As for impact, it chunks through on the order of 5000 emails a day, of 
which over 86% is spam (according to my logs), and I think the last 
time I got a message that was incorrectly classified was probably... 
maybe a week ago, or so.

Some of my users have better experiences, some worse. Usually, if one  
of my users complains that they're getting too much spam, it's because 
they haven't been using the training features.

I think the real question you need to ask is: does my current spam 
system work sufficiently well for my taste, and what am I willing to 
pay to get better accuracy?

~Kyle
- -- 
To believe in God is impossible---to not believe in him is absurd.
-- Voltaire
-BEGIN PGP SIGNATURE-
Comment: Thank you for using encryption!

iD8DBQFHBkxSBkIOoMqOI14RAqP4AKCDDa6efM5yi54+HFsZHCxn1atlcQCfcHui
8QCYRTaN6F3hnUPsXpYAhj0=
=211r
-END PGP SIGNATURE-


Re: Which spam filter do you use?

2007-10-05 Thread Kyle Wheeler
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Friday, October  5 at 04:02 PM, quoth M. Fioretti:
 0) Do I have a flat rate fast connection, where I wouldn't notice SA
   constantly doing network checks?

Indeed! Not all solutions are perfect for all situations.

 1) after the whole message has been downloaded anyway, does SA block
   a LOT more spam than bogofilter or qsf? if the answer is yes,
   (and that is quite a big if, judging from both online literature
   and other answers in this thread...) is the difference big enough
   to justify the extra CPU and/or bandwidth consumption, plus keeping
   the rules updated?

Unfortunately, that's a tough one to answer. How much CPU is a spam 
worth? Does the answer change depending on whether you get 5 spams a 
day, or 5000? There's no right answer to that one, it's all personal 
preference.

 And if one has full control of the MTA, where lots of spam can and 
 should be recognized and blocked before ever starting SA or any 
 other content filter.

That depends on what you're willing to put up with. For example, many 
of my users have learned to distrust things like DNS blacklists, and 
by extension any completely blocking spam mechanism. ANY anti-spam 
technique will have false-positives, and sometimes users aren't 
willing to put up with that, which changes the requirements of your 
antispam solution.

~Kyle
- -- 
Every gun that is made, every warship launched, every rocket fired 
signifies, in the final sense, a theft from those who hunger and are 
not fed, those who are cold and are not clothed.
-- Dwight D. Eisenhower
-BEGIN PGP SIGNATURE-
Comment: Thank you for using encryption!

iD8DBQFHBk5+BkIOoMqOI14RAsN2AKDJS8mF2Fi6dMZqzB1UVJPsUdb6nwCguz69
6DlFQihv3TkWUgUxX+oJKSw=
=8blO
-END PGP SIGNATURE-


Re: Which spam filter do you use?

2007-10-05 Thread Holger Weiss
* Kyle Wheeler [EMAIL PROTECTED] [2007-10-05 09:38]:
 I will say, though, it is true that SpamAssassin is really slow, but
 not for the reason many people claim.

My claim is that spamd requires an order of magnitude more CPU and
memory resources than Bogofilter and others do.  Of course, that's
usually not an issue on single-user systems.

 We have a pretty standard spamc/spamd setup, and if I turn off all the
 network tests, it flies.

At my workplace, for about 40.000 users, we need six dedicated servers
for spamd and one which does Bogofilter and other stuff.

 As for impact, it chunks through on the order of 5000 emails a day, of 
 which over 86% is spam (according to my logs), and I think the last 
 time I got a message that was incorrectly classified was probably... 
 maybe a week ago, or so.

So you get about 1 out of 35.000 messages misclassified (which would be
an accuracy of about 99.997%)?  I cannot quite believe that :-)  In any
case, it does sound as if you do get significantly better results out of
SpamAssassin than I ever did, despite a lot of tweaking I tried.

 I think the real question you need to ask is: does my current spam
 system work sufficiently well for my taste, and what am I willing to
 pay to get better accuracy?

Yes.

Holger


Re: Which spam filter do you use?

2007-10-05 Thread Kyle Wheeler
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Friday, October  5 at 05:14 PM, quoth Holger Weiss:
 As for impact, it chunks through on the order of 5000 emails a day, 
 of which over 86% is spam (according to my logs), and I think the 
 last time I got a message that was incorrectly classified was 
 probably... maybe a week ago, or so.

 So you get about 1 out of 35.000 messages misclassified (which would be 
 an accuracy of about 99.997%)?  I cannot quite believe that :-)  In any 
 case, it does sound as if you do get significantly better results out of 
 SpamAssassin than I ever did, despite a lot of tweaking I tried.

Well, the twist to that statistic is that not all 5000 of those emails 
are for *me*. I'm only one of the users on my system that uses SA 
(though I'm certainly the biggest user of email on the system). Based 
on yesterday's logs, I personally get around 10,521 emails (including 
spam) a week. Assuming an oversized fudge factor of maybe ten 
misclassified messages that I either didn't notice or don't remember, 
that puts my accuracy around 99.9%.
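The two accuracy figures traded in this exchange can be checked with a few lines of arithmetic. This is only a sketch: the function name `accuracy_percent` is made up for illustration, and the inputs are simply the numbers quoted in the thread (5000 messages a day for a week with one error, versus about 10,521 messages a week with roughly ten errors).

```python
def accuracy_percent(total_messages, misclassified):
    """Share of messages classified correctly, as a percentage."""
    return 100.0 * (1 - misclassified / total_messages)


# Holger's reading: 5000 messages a day for a week, one misclassified.
whole_server = accuracy_percent(5000 * 7, 1)

# Kyle's correction: about 10,521 messages a week, roughly ten misclassified.
one_user = accuracy_percent(10521, 10)

print(f"{whole_server:.3f}% vs {one_user:.1f}%")
```

The gap between the two results comes entirely from the choice of denominator and error count, not from the filter.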

I think this really does, though, boil down to: different filters do 
better for different people.

~Kyle
- -- 
I beseech you, in the bowels of Christ, think it possible you may be 
mistaken.
 -- Oliver Cromwell
-BEGIN PGP SIGNATURE-
Comment: Thank you for using encryption!

iD8DBQFHBly8BkIOoMqOI14RAsAUAJ9Ub6P2lQdUHXWK0aTq4EeaasKeSwCgxFfz
y7saTUN2IJvuttxUNpnPWE8=
=4sAa
-END PGP SIGNATURE-


Re: Which spam filter do you use?

2007-10-05 Thread M. Fioretti
On Fri, Oct 05, 2007 09:47:26 AM -0500, Kyle Wheeler
([EMAIL PROTECTED]) wrote:

 On Friday, October  5 at 04:02 PM, quoth M. Fioretti:
  0) Do I have a flat rate fast connection, where I wouldn't notice SA
constantly doing network checks?
 
 Indeed! Not all solutions are perfect for all situations.

Yes, that's the same thing I had said, and it applies to all the pieces of
the puzzle, including this:

is the difference big enough
to justify the extra CPU and/or bandwidth consumption, plus keeping
the rules updated?
 
  And if one has full control of the MTA, where lots of spam can and 
  should be recognized and blocked before ever starting SA or any 
  other content filter.
 
 That depends on what you're willing to put up with.

Of course. The MTA can block many surely spammish messages (those
pretending to come from inside your network, for example). At the same
time, DNS blacklists as a completely blocking mechanism make sense
only _if_ their maintainer is inhumanly perfect. Otherwise, it comes
too close to censorship too often (when the one deciding what you will
not receive is somebody ELSE, of course: any individual filtering
exclusively his or her own mail must remain free to shoot himself in
the foot).

Ciao,
Marco
-- 
Your own civil rights and the quality of your life heavily depends on
how software is used *around* you: http://digifreedom.net/node/84


Which spam filter do you use?

2007-10-04 Thread Eyolf Østrem
I've been using bogofilter ever since I first installed KDE/Kmail and the
potentially hassle-free configuration of SpamAssassin led to constant
crashes. I believe it has been solved by now, and in any case Kmail is
ancient history, but I was wondering what experiences the list people
have with various filters.

From what I've read, bogo is quicker than the other contenders, but
lets more spam through. While the speed was a concern in Kmail, since
the filtering was done in the app itself, which meant that it was
unresponsive for a while while the filtering was going on, that is not
so much of a concern now, when that is taken care of by procmail. That
leaves me with c. 10-15 spam mails a day that slip through (out of c.
150-200). 

So, should I switch? I'm quite happy with bogo, especially with the
current setup with some macros I borrowed from an article in linux
journal (I think it was), but I would very much like to hear what your
experiences are in this respect.

Eyolf

-- 
Unceasing warfare gives rise to its own social conditions which have been 
similar in all epochs. People enter a permanent state of alertness to ward
off attacks. You see the absolute rule of the autocrat. All new things become 
dangerous frontier districts -- new planets, new economic areas to exploit, new 
ideas or new devices, visitors -- everything suspect. Feudalism takes firm hold, 
sometimes disguised as a politbureau or similar structure, but always present. 
Hereditary succession follows the lines of power. The blood of the powerful 
dominates. The vice regents of heaven or their equivalent apportion the wealth. 
And they know they must control inheritance or slowly let the power melt away. 
Now, do you understand Leto's Peace?

  -- The Stolen Journals


Re: Which spam filter do you use?

2007-10-04 Thread Kyle Wheeler
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Friday, October  5 at 12:23 AM, quoth Eyolf Østrem:
So, should I switch? I'm quite happy with bogo

If you're satisfied with your current spam prevention technique, 
there's absolutely no reason to switch. At best, you'll get fewer 
spams. If the spam level you currently receive is acceptable: count 
your blessings.

For what it's worth, all the domains I administer have used 
spamassassin for several years, and the accuracy is simply stunning.

~Kyle
- -- 
Of course it's the same old story. Truth usually is the same old 
story.
   -- Margaret Thatcher
-BEGIN PGP SIGNATURE-
Comment: Thank you for using encryption!

iD8DBQFHBZHMBkIOoMqOI14RAtUbAJ91DOTtzyh1sI8hagsTxF0Y5Y0eawCgwIrI
yAw6/xg2FPtkzaN8yqiijxs=
=nnJ7
-END PGP SIGNATURE-


Re: Which spam filter do you use?

2007-10-04 Thread Christian Kuka
Hi,

Thus wrote Eyolf Østrem ([EMAIL PROTECTED]) [07.10.05 06:54]:
 I've been using bogofilter ever since I first installed KDE/Kmail and the
 potentially hassle-free configuration of SpamAssassin led to constant
 crashes. I believe it has been solved by now, and in any case Kmail is
 ancient history, but I was wondering what experiences the list people
 have with various filters.

At the moment I'm using bogofilter, razor, pyzor, dcc, spamassassin 
and clamassassin and the following procmail rules:

-snip--

SPAM=$MAILDIR/.Junk/
VIRUS=$MAILDIR/.Virus/

DCCPROC=/usr/bin/dccproc
BOGOFILTER=/usr/bin/bogofilter
PYZOR=/usr/bin/pyzor
RAZOR=/usr/bin/razor-check
SPAMC=/usr/bin/spamc
CLAM=/usr/bin/clamassassin

:0 fw
| $BOGOFILTER -u -e -p

:0 e
{ EXITCODE=75 HOST }

:0:
* ^X-Bogosity:.(Yes|Spam)
$SPAM

# Razor
:0 Wc
| $RAZOR -conf=$HOME/.razor/razor-agent.conf

:0 Wa:
$SPAM

# Pyzor
:0 Wc
| $PYZOR check

:0 Wa:
$SPAM

# DCC
:0 fw 
| $DCCPROC  -ERw whiteclnt -ccmn,10

:0 e:
$SPAM

# Spamassassin
:0 fw: $PMDIR/spamassassin.db
| $SPAMC

:0:
* ^X-Spam-Status: Yes
$SPAM

:0 fw: $PMDIR/clamassassin.db
| $CLAM 

:0:
* ^X-Virus-Status: Yes
$VIRUS

-snip--

At the moment I don't get any spam/junk mail, but sometimes mails from
mailing lists (especially from the Debian list) end up in the junk folder.
I also have to say that it really takes some time until I get a mail.
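The recipes above run the filters one after another and file the message into the junk folder at the first hit. That control flow can be sketched as follows; the filter names and the toy lambdas are hypothetical stand-ins for the real programs (bogofilter, razor, pyzor, dcc, spamassassin), not their actual interfaces.

```python
def first_spam_verdict(message, filters):
    """Run filters in order; return the name of the first one that flags spam.

    `filters` is a list of (name, callable) pairs. Each callable takes the
    raw message text and returns True for spam. Returns None if no filter
    objects, which corresponds to normal delivery.
    """
    for name, is_spam in filters:
        if is_spam(message):
            return name
    return None


# Hypothetical stand-ins: a cheap local check first, a slower one after it.
filters = [
    ("cheap-bayes", lambda msg: "viagra" in msg.lower()),
    ("slow-network-check", lambda msg: "click here" in msg.lower()),
]

print(first_spam_verdict("Cheap VIAGRA now", filters))
print(first_spam_verdict("Minutes of Tuesday's meeting", filters))
```

Putting the cheapest filter first means the slow network checks only run for mail the earlier stages let through, which may help with the delivery delay mentioned above.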


 From what I've read, bogo is quicker than the other contenders, but
 lets more spam through. While the speed was a concern in Kmail, since
 the filtering was done in the app itself, which meant that it was
 unresponsive for a while while the filtering was going on, that is not
 so much of a concern now, when that is taken care of by procmail. That
 leaves me with c. 10-15 spam mails a day that slip through (out of c.
 150-200). 
 
 So, should I switch? I'm quite happy with bogo, especially with the
 current setup with some macros I borrowed from an article in linux
 journal (I think it was), but I would very much like to hear what your
 experiences are in this respect.

I also read about a scanner called crm114 in the Linux magazine that
should be really good, but never checked that.

 Eyolf
 

Christian
-- 
---
PGP/OpenPGP/GnuPG encrypted mail preferred in all private communication.
Key ID: 0x61E7150B - 4EFC 3FA6 FB8E 2BD5 CA11  6F15 F557 6B5D 61E7 150B

Christian Kuka
[EMAIL PROTECTED] 




Re: spam filter

2002-08-16 Thread Andre Berger

* Rob 'Feztaa' Park [EMAIL PROTECTED], 2002-07-29 19:08 -0400:
 Alas! Andre Berger spake thus:
  By the way, what would an example
  procmail rule to add a sender to the spamassassin blacklist look
  like?
 
 Probably something along the lines of this (but I'm a little rusty;
 the flags are probably wrong):
 
 :0 Wh:
 * some spam heuristic, like all caps subject lines
 |grep ^From: |some sed to extract info from the header >> killfile
 :0 a:
 spamfolder

This didn't work as expected,

:0 c:
* ^X-Spam-Status: Yes
| grep ^From: | grep @ | grep -v lists.debian | grep -v uzscd5 \
  | grep -v berger.150 | grep -v andre.berger | grep -v -e ^$ \
  | sed -e 's/ *(.*)//; s/>.*//; s/.*[:<] *//' >> $HOME/.procmail/blacklist

:0:
* ^X-Spam-Status: Yes
spamblock

does, at least for me...
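For comparison, the same extraction can be sketched outside procmail. This is an illustration under assumptions, not the recipe's exact behaviour: the function name `sender_for_blacklist`, the `skip` parameter and the sample message are all made up, and the header parsing relies on Python's standard `email` package instead of grep and sed.

```python
from email import message_from_string
from email.utils import parseaddr


def sender_for_blacklist(raw_message, skip=()):
    """Return the bare From: address of a spam-tagged message, or None.

    `skip` holds substrings (own addresses, mailing lists) that must never
    be blacklisted, like the `grep -v` stages in the recipe.
    """
    msg = message_from_string(raw_message)
    if not str(msg.get("X-Spam-Status", "")).startswith("Yes"):
        return None
    _, address = parseaddr(msg.get("From", ""))
    if "@" not in address or any(s in address for s in skip):
        return None
    return address


# Made-up sample message for illustration only.
raw = (
    "From: Cheap Pills <seller@spam.example>\n"
    "X-Spam-Status: Yes, hits=12.3\n"
    "Subject: BUY NOW\n"
    "\n"
    "body\n"
)
print(sender_for_blacklist(raw, skip=("lists.debian",)))
```

The returned address could then be appended to the blacklist file with an ordinary `open(path, "a")`.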

-Andre





Re: spam filter

2002-08-16 Thread Mike Erickson

* Andre Berger ([EMAIL PROTECTED]) wrote:
 * Rob 'Feztaa' Park [EMAIL PROTECTED], 2002-07-29 19:08 -0400:
  Alas! Andre Berger spake thus:
   By the way, what would an example
   procmail rule to add a sender to the spamassassin blacklist look
   like?

Read `perldoc Mail::SpamAssassin::Conf`. There are blacklist entries you
can put in your ~/.spamassassin/user_prefs file that do this much more
elegantly and easily. An excerpt:

   blacklist_from [EMAIL PROTECTED]
   Used to specify addresses which send mail that is
   often tagged (incorrectly) as non-spam, but which the
   user doesn't want.  Same format as whitelist_from.

Note that this is per-user, which I think is what you want.

hth,

mike

  Probably something along the lines of this (but I'm a little rusty;
  the flags are probably wrong):
  
  :0 Wh:
  * some spam heuristic, like all caps subject lines
  |grep ^From: |some sed to extract info from the header >> killfile
  :0 a:
  spamfolder
 
 This didn't work as expected,
 
 :0 c:
 * ^X-Spam-Status: Yes
 | grep ^From: | grep @ | grep -v lists.debian | grep -v uzscd5 \
   | grep -v berger.150 | grep -v andre.berger | grep -v -e ^$ \
   | sed -e 's/ *(.*)//; s/>.*//; s/.*[:<] *//' >> $HOME/.procmail/blacklist
 
 :0:
 * ^X-Spam-Status: Yes
 spamblock
 
 does, at least for me...








Re: spam filter

2002-07-29 Thread Andre Berger


* Rob 'Feztaa' Park [EMAIL PROTECTED], 2002-07-29 00:31 -0400:
 Alas! Iain Truskett spake thus:
  * Patrick ([EMAIL PROTECTED]) [29 Jul 2002 12:02]:
   * Andre Berger [EMAIL PROTECTED] [07-28-02 20:46]:
  [...]
   This would be better accomplished by procmail, since this is one of
   its intended uses.  Use mutt to read/respond to email.

  If one is adding to a kill file, I personally would prefer it to be done
  in mutt (e.g. piped to another program while reading) just in case of
  false positives.

 If you're worried about false positives, have it add the names to a
 'dormant' killfile, ie one that is not in use. Then, periodically, you
 can check the 'dormant' killfile for innocents, and if there aren't any,
 you can merge it into the real killfile that is actually in use on your
 system.

Spamassassin does this for me by sorting it into a spam folder. I use
a macro to add the entries to a proto black- or whitelist
respectively, which both mutt and spamassassin share. More exactly,
the respective rules for scoring (mutt) and black- or
whitelist (spamassassin) are generated from it.

As I use mutt with nntp patch, the whole usenet (well, my view on it
:) is not in control of spamassassin - but in the range of my
frontend (mutt's) scoring.

I'm just too lazy to add every spam entry manually to my blacklist; I
would rather define the few false positives!

 But this is not mutt's job, either way.

The original idea behind it is to synchronize mutt's scoring
mechanism with spamassassin. By the way, what would an example
procmail rule to add a sender to the spamassassin blacklist look
like?

-Andre

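The shared proto-list idea Andre describes in the message above, one list of addresses from which both mutt scoring rules and SpamAssassin black/whitelist entries are generated, could be sketched roughly like this. The helper names, the sample addresses and the score values of -10 and 10 are assumptions for illustration; only mutt's `score` command with a `~f` pattern and SpamAssassin's `blacklist_from` / `whitelist_from` directives are taken as given.

```python
def mutt_score_lines(addresses, points):
    """Build mutt `score` commands matching each sender with ~f."""
    return [f'score "~f {addr}" {points}' for addr in sorted(addresses)]


def spamassassin_lines(addresses, directive):
    """Build SpamAssassin user_prefs lines, e.g. blacklist_from."""
    return [f"{directive} {addr}" for addr in sorted(addresses)]


# Hypothetical proto lists; in practice they would be read from files.
proto_black = {"seller@spam.example"}
proto_white = {"friend@good.example"}

muttrc = mutt_score_lines(proto_black, -10) + mutt_score_lines(proto_white, 10)
user_prefs = (spamassassin_lines(proto_black, "blacklist_from")
              + spamassassin_lines(proto_white, "whitelist_from"))

print("\n".join(muttrc + user_prefs))
```

Note that mutt treats the `~f` argument as a regular expression, so the dots in an address match any character; for a personal kill list that is usually close enough.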


Re: spam filter

2002-07-29 Thread Rob 'Feztaa' Park


Alas! Andre Berger spake thus:
 By the way, what would an example
 procmail rule to add a sender to the spamassassin blacklist look
 like?

Probably something along the lines of this (but I'm a little rusty;
the flags are probably wrong):

:0 Wh:
* some spam heuristic, like all caps subject lines
|grep ^From: |some sed to extract info from the header >> killfile
:0 a:
spamfolder

-- 
Rob 'Feztaa' Park
http://members.shaw.ca/feztaa/
--
 No manual is ever necessary.
May I politely interject here: BULLSHIT.  That's the biggest Apple lie of all!
(Discussion in comp.os.linux.misc on the intuitiveness of interfaces.)



Re: spam filter

2002-07-29 Thread Rob Reid

At  4:49 PM EDT on July 29 Rob 'Feztaa' Park sent off:
 Alas! Andre Berger spake thus:
  By the way, what would an example
^^
It's not mutt, but since I don't have time to read the procmail list...

  procmail rule to add a sender to the spamassassin blacklist look
  like?
 
 Probably something along the lines of this (but I'm a little rusty;
 the flags are probably wrong):
 
 :0 Wh:
 * some spam heuristic, like all caps subject lines
*X-Spam-Flag: Yes

(unchecked) would use the result of all applied spamassassin tests.

 |grep ^From: |some sed to extract info from the header >> killfile

I think I got this addysort tidbit from Gilbert linuxbrit somebody, to use
instead of sed.

#!/usr/bin/perl -wn 
# Picks out the actual address from the From: line 

unless (/\</) { print; } else { print /<([^>]+)>/, "\n"; }

 :0 a:
 spamfolder
 

But what's the point?  I love spamassassin because it lets me *avoid*
blacklists, and their maintenance, and filter on the spamminess of the message
itself.  (unlike, say, twerp /. moderators... :-P ) I suppose you could check
the blacklist first, and skip spamassassin if it matches, to save some
computation, but my preSA experience with a personal blacklist is that there's
only a handful of spammers* that this is worthwhile for, because they don't
change their address.

*and I'm not sure they're even spammers (so I haven't razored** them) or just
mailing lists with overly open subscription policies.  But they smell spammy
enough that I'm too chicken to unsubscribe.

** An interesting alternative to blacklisting.

-- 
Minds that have nothing to confer find little to perceive. - Wordsworth
Robert I. Reid [EMAIL PROTECTED] http://astro.utoronto.ca/~reid/
PGP Key: http://astro.utoronto.ca/~reid/pgp.html





spam filter

2002-07-28 Thread Andre Berger


Hi!

I wonder if it is possible to have mutt execute a command based on
spam subject lines created by spamassassin to automatically add the
corresponding email addresses to a shared kill file of mine?

-Andre



Re: spam filter

2002-07-28 Thread Patrick

* Andre Berger [EMAIL PROTECTED] [07-28-02 20:46]:
 Hi!
 
 I wonder if it is possible to have mutt execute a command based on
 spam subject lines created by spamassassin to automatically add the
 corresponding email addresses to a shared kill file of mine?

This would be better accomplished by procmail, since this is one of
its intended uses.  Use mutt to read/respond to email.
-- 
Patrick Shanahan
Registered Linux User #207535 
  @ http://counter.li.org



Re: spam filter

2002-07-28 Thread Iain Truskett

* Patrick ([EMAIL PROTECTED]) [29 Jul 2002 12:02]:
 * Andre Berger [EMAIL PROTECTED] [07-28-02 20:46]:
[...]
 This would be better accomplished by procmail, since this is one of
 its intended uses.  Use mutt to read/respond to email.

If one is adding to a kill file, I personally would prefer it to be done
in mutt (e.g. piped to another program while reading) just in case of
false positives.


cheers,
-- 
Iain 'Spoon' Truskett. http://eh.org/~koschei/



Re: spam filter

2002-07-28 Thread Rob 'Feztaa' Park

Alas! Iain Truskett spake thus:
 * Patrick ([EMAIL PROTECTED]) [29 Jul 2002 12:02]:
  * Andre Berger [EMAIL PROTECTED] [07-28-02 20:46]:
 [...]
  This would be better accomplished by procmail, since this is one of
  its intended uses.  Use mutt to read/respond to email.
 
 If one is adding to a kill file, I personally would prefer it to be done
 in mutt (e.g. piped to another program while reading) just in case of
 false positives.

If you're worried about false positives, have it add the names to a
'dormant' killfile, ie one that is not in use. Then, periodically, you
can check the 'dormant' killfile for innocents, and if there aren't any,
you can merge it into the real killfile that is actually in use on your
system.
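The 'dormant' killfile review could be sketched as below. It is a slight variation on what is described: entries the reviewer marks as innocent are dropped before the merge, instead of merging only when no innocents are found. The function name `merge_dormant` and the addresses are hypothetical.

```python
def merge_dormant(active, dormant, innocents):
    """Return (new_active, new_dormant) after reviewing the dormant list.

    Entries the reviewer marked as innocent are dropped; everything else
    moves from the dormant killfile into the active one.
    """
    confirmed = set(dormant) - set(innocents)
    return sorted(set(active) | confirmed), []


active = ["old@spam.example"]
dormant = ["new@spam.example", "aunt@family.example"]
new_active, new_dormant = merge_dormant(active, dormant, {"aunt@family.example"})
print(new_active)
```

The dormant list comes back empty so that the next review period starts from scratch.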

But this is not mutt's job, either way.

-- 
Rob 'Feztaa' Park
http://members.shaw.ca/feztaa/
--
Democracy is the worst form of government except all those other
forms that have been tried from time to time.
-- Winston Churchill





Re: Offline SPAM-filter with mutt?

2002-02-21 Thread Dr. Sharukh K. R. Pavri.

On Thu, 21 Feb 2002, Marco Fioretti wrote:

 Hello,

Hi !
 
 I've been following this discussion with great interest, and looked
 (very shortly, I confess) to the tools that were mentioned. I have the
 impression that they require you to be online to work. Do they still
 work if you dial up, run fetchmail and hang off immediately via
 scripts?

Take a look at mailfilter.sourceforge.net.

It can be called from within fetchmail and best of all, it deletes spam
*on the server* so you don't need to download it at all. I had it set up
and working in less than 15 mins. It's a 150 kb download. Get version
0.3.2; though it is a development version, it works like a charm.

regards,

Sharukh.
-- 
Dr. Sharukh K. R. Pavri
Mumbai, India.



Re: SPAM-filter with mutt

2002-02-20 Thread Mark J. Reed

On Tue, Feb 19, 2002 at 02:22:08AM -0800, Will Yardley wrote:
 hrmm... i found it fairly easy to install - perl Makefile.pl ; make ;
 make install.
Yup.  That part worked fine.  Passed all the tests, etc.

 you need to setup your own copies of the config and user prefs files,
 which is perhaps what it's complaining about.
Indeed.  It could not find the master copies of those files. 
Despite installing itself according to perl's Config.pm - which
in my case meant it wound up in /opt/perl/bin, /opt/perl/share, etc
- it was looking for the files under /usr/share/spamassassin and in a
handful of other hard-coded places that have nothing to do with the
actual installation directories.

I did finally get it configured properly, and so far it seems to be
working very well.  In the past day it appears to have caught all of
the spam sent to me, along with three false positives - all of which
were cron job output that uses a spammer-like attention-getting subject line.

-- 
Mark J. REED[EMAIL PROTECTED]



Re: SPAM-filter with mutt

2002-02-20 Thread Tony Godshall

On Tue, Feb 19, 2002 at 02:22:08AM -0800, Will Yardley wrote:
 Mark J. Reed wrote:
 
  If you're not comfortable digging around your Perl install
  and manually tweaking files, I recommend looking elsewhere.
 
 hrmm... i found it fairly easy to install - perl Makefile.pl ; make ;
 make install.

On debian it was just 
'apt-get update && apt-get install spamassassin'

 you need to setup your own copies of the config and user prefs files,
 which is perhaps what it's complaining about.
 
 i've used spambouncer for a long time, and like it, but i think
 spamassassin is much easier to configure, and seems to catch as much or
 more spam, while catching much less legitimate mail.

I agree.  Spamassassin uses multiple criteria, assigning
each one a point value, and messages that exceed a
threshold are marked as spam.  If you have Vipul's Razor
installed too, a hit there is worth 3 points with 5 as the 
threshold. (These are the default values (I don't know if
they are debian or upstream defaults); you can configure 
them to be whatever you want).  I read someplace that they
choose the points per criterion by running a sophisticated 
AI against a large base of legit vs. spam email.  (I think
it was a genetic algorithm.)
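The point-and-threshold scheme can be sketched in a few lines. Only the Razor value of 3 points and the threshold of 5 come from the message above; the other rule names and weights are invented for illustration and are not SpamAssassin's real rule set.

```python
def classify(hits, scores, threshold=5.0):
    """Sum the points of every rule that fired and compare to the threshold."""
    total = sum(scores[rule] for rule in hits)
    return total, total >= threshold


# Hypothetical rule names and weights, except the Razor value of 3 points
# and the threshold of 5 mentioned in the message above.
scores = {"RAZOR_HIT": 3.0, "ALL_CAPS_SUBJECT": 1.5, "MISSING_DATE": 1.0}

print(classify({"RAZOR_HIT"}, scores))
print(classify({"RAZOR_HIT", "ALL_CAPS_SUBJECT", "MISSING_DATE"}, scores))
```

A Razor hit alone stays under the threshold, so no single test decides the outcome; it takes a few independent signs of spamminess to tip a message over.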

###



Offline SPAM-filter with mutt?

2002-02-20 Thread Marco Fioretti

Hello,

I've been following this discussion with great interest, and looked
(very briefly, I confess) at the tools that were mentioned. I have the
impression that they require you to be online to work. Do they still
work if you dial up, run fetchmail and hang up immediately via
scripts?

OR maybe, do they work simultaneously with fetchmail, so it just makes
the phone call some seconds longer?

Marco
(still living with *one* phone to share with
family, and no 56K solution in the neighborhood
yet...)

RULE: Run Up2date Linux Everywhere
savannah.gnu.org/projects/rule/
http://www.freesoftware.fsf.org/rule/

-- 
Real leaders are ordinary people with extraordinary determination



Re: Offline SPAM-filter with mutt?

2002-02-20 Thread Justin R. Miller

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Said Marco Fioretti on Thu, Feb 21, 2002 at 12:28:19AM +0100:

 I've been following this discussion with great interest, and looked
 (very shortly, I confess) to the tools that were mentioned. I have the
 impression that they require you to be online to work. Do they still
 work if you dial up, run fetchmail and hang off immediately via
 scripts?
 
 OR maybe, do they work simultaneously with fetchmail, so it just makes
 the phone call some seconds longer?

The way you typically set up Spamassassin is as a pipe via procmail for
each message as it comes in.  You can optionally disable network-based
checks if you want to speed things up (such as sender domain MX
checking, etc.).  

- -- 
[!] Justin R. Miller [EMAIL PROTECTED]
PGP 0xC9C40C31 -=- http://codesorcery.net

http://www.cnn.com/2002/US/02/19/gen.strategic.influence/index.html

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD4DBQE8dDAc94d6K8nEDDERAnl2AJUcDIfc4GwVlrffTMkGgvHLJqo8AJ9qT5dt
NEHVjw27lfC0ptieiuN49Q==
=g+yr
-END PGP SIGNATURE-



Re: SPAM-filter with mutt

2002-02-19 Thread Thorsten Haude

Hi,

* Jobst Landgrebe [EMAIL PROTECTED] [02-02-18 14:10]:
 I'm looking for a SPAMfilter that I could combine with mutt. Does anyone
 have a good suggestion which filter is easy to set up together with mutt
 and how this must be done? Is there a filter one can call from the
 .muttrc-file?

You should have a look at Maildrop, which is powerful while using a
very straightforward syntax.
http://www.flounder.net/~mrsam/maildrop/

Filters are not called from Mutt though. They usually act before Mutt
touches the mail. See http://www.vranx.de/mail/mail.html

Thorsten
-- 
The only reason to be alive is to enjoy it.



Re: SPAM-filter with mutt

2002-02-19 Thread Thorsten Haude

Hi,

* Jobst Landgrebe [EMAIL PROTECTED] [02-02-18 14:10]:
 I'm looking for a SPAMfilter that I could combine with mutt. Does anyone
 have a good suggestion which filter is easy to set up together with mutt
 and how this must be done? Is there a filter one can call from the
 .muttrc-file?

Sorry for misreading your mail. Yes, Maildrop is a general-purpose
mail filter, which could be used to reduce spam if you put in the work.
A specialized program like SpamAssassin is probably what you need.

Thorsten
-- 
Guns don't protect freedom, people protect freedom.



Re: SPAM-filter with mutt

2002-02-19 Thread Mark J. Reed

Well, I tried Spam Assassin, but have yet to get it to work.
It installed its files in one directory tree, but looks for them in
a completely different one.  The test works great when run from the
build directory, with all the files it needs in $PWD, but from any
other directory it can't find its rules, even though I think I've
manually corrected all of the paths in SpamAssassin.pm and
SpamAssassin/Conf.pm.

If you're not comfortable digging around your Perl install
and manually tweaking files, I recommend looking elsewhere.


-- 
Mark J. REED [EMAIL PROTECTED]



Re: SPAM-filter with mutt

2002-02-19 Thread Tony Godshall

On Mon, Feb 18, 2002 at 09:01:46AM -0600, johnathan spectre wrote:
 Some people have mentioned spamassassin, you could also try Razor at
 http://razor.sourceforge.net. Pretty simple to set up, requires perl and
 procmail to be useful, includes documentation for getting it to work
 with mutt for reporting.
 

I use spamassassin with Vipul's Razor (spamassassin, IIRC, uses
Razor if it's installed).

It does a good job, with few false-positives (I put the
false-positives in my whitelist).

I have it in my .procmailrc; not directly in mutt.
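
A sketch of what such a .procmailrc entry might look like (the folder
name is hypothetical, and it assumes an earlier recipe has already
passed the message through SpamAssassin, which marks suspected spam
with an X-Spam-Flag: YES header):

```procmail
# File anything SpamAssassin has flagged into a separate mbox folder.
# The trailing colon on ":0:" asks procmail to use a local lockfile.
# "caughtspam" is a hypothetical folder name, relative to $MAILDIR.
:0:
* ^X-Spam-Flag: YES
caughtspam
```

Mutt can then open that folder like any other mailbox to check for
false positives.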

--
Tony Godshall



Re: SPAM-filter with mutt

2002-02-19 Thread Will Yardley

Mark J. Reed wrote:

 If you're not comfortable digging around your Perl install
 and manually tweaking files, I recommend looking elsewhere.

hrmm... i found it fairly easy to install - perl Makefile.PL ; make ;
make install.

you need to set up your own copies of the config and user prefs files,
which is perhaps what it's complaining about.

i've used spambouncer for a long time, and like it, but i think
spamassassin is much easier to configure, and seems to catch as much or
more spam, while catching much less legitimate mail.

-- 
William Yardley
GnuPG public key: http://infinitejazz.net/will/pgp/gpg.asc




Re: SPAM-filter with mutt

2002-02-19 Thread David Rock

On Tue, Feb 19, 2002 at 10:56:59AM +0100, Thorsten Haude wrote:
 
 * Jobst Landgrebe [EMAIL PROTECTED] [02-02-18 14:10]:
 I'm looking for a SPAMfilter that I could combine with mutt. Does anyone
 have a good suggestion which filter is easy to set up together with mutt
 and how this must be done? Is there a filter one can call from the
 .muttrc-file?

I am using junkfilter and have been pretty happy with it.

http://junkfilter.zer0.org/

-- 
David Rock
[EMAIL PROTECTED]





SPAM-filter with mutt

2002-02-18 Thread Jobst Landgrebe

Dear List,

I'm looking for a SPAMfilter that I could combine with mutt. Does anyone
have a good suggestion which filter is easy to set up together with mutt
and how this must be done? Is there a filter one can call from the
.muttrc-file?

Thanks in advance,

Jobst

Dr. Jobst Landgrebe
AG Wurst (Molecular Neurogenetics)
MPI of Psychiatry
Kraepelinstr. 2-10
D-80804 Munich
phone +49 89 30622-626 or -252
fax +49 89 30622642



Re: SPAM-filter with mutt

2002-02-18 Thread Gerhard Siegesmund

 I'm looking for a SPAMfilter that I could combine with mutt. Does anyone
 have a good suggestion which filter is easy to set up together with mutt
 and how this must be done? Is there a filter one can call from the
 .muttrc-file?

I think it can't be done with mutt alone, as mutt is just a program for
reading the mail in your folders. To detect spam and delete it or file
it to another folder you have to use something like procmail and e.g.
spamassassin, which is great. Take a look at

http://spamassassin.org/

and decide for yourself.

-- 
cu
  --== Jerri ==--
Homepage: http://www.jerri.de/   ICQ: 54160208





Re: SPAM-filter with mutt

2002-02-18 Thread Mads Martin Jørgensen

* Jobst Landgrebe [EMAIL PROTECTED] [Feb 18. 2002 14:59]:
 Dear List,
 
 I'm looking for a SPAMfilter that I could combine with mutt. Does anyone
 have a good suggestion which filter is easy to set up together with mutt
 and how this must be done? Is there a filter one can call from the
 .muttrc-file?

One alternative to look at would be:

http://www.spamassassin.org

-- 
Mads Martin Jørgensen, http://mmj.dk
Why make things difficult, when it is possible to make them cryptic
 and totally illogic, with just a little bit more effort?
-- A. P. J.



Re: SPAM-filter with mutt

2002-02-18 Thread Thomas Huemmler

* Jobst Landgrebe [EMAIL PROTECTED] [02/02/18 15:00]:
 I'm looking for a SPAMfilter that I could combine with mutt. Does anyone

Spamblock 
(http://www.belwue.de/wwwservices/hilfestellungen/spamblock.html) or
(ftp://ftp.belwue.de/belwue/software/spamblock) works fine for me.

Thomas

-- 
Thomas Hümmler * [EMAIL PROTECTED] * http://www.huemmler.de



Re: SPAM-filter with mutt

2002-02-18 Thread Justin R. Miller

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Said Jobst Landgrebe on Mon, Feb 18, 2002 at 02:10:52PM +0100:

 I'm looking for a SPAMfilter that I could combine with mutt. Does
 anyone have a good suggestion which filter is easy to set up together
 with mutt and how this must be done? Is there a filter one can call
 from the .muttrc-file?

Spamassassin has been mentioned, and I've done a small writeup on it
here: 

http://codesorcery.net/docs/spamtricks.html

- -- 
[!] Justin R. Miller [EMAIL PROTECTED]
PGP 0xC9C40C31 -=- http://codesorcery.net

http://www.american-partisan.com/cols/mcelroy/102399.htm

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8cTX794d6K8nEDDERAmuVAJ9U4x+pIJf/Tz4aewtfeyGFewAIJACcDNhd
KCKHycXF+GWaaA+TsIoPwIM=
=c3tx
-END PGP SIGNATURE-



Re: SPAM-filter with mutt

2002-02-18 Thread Artem Okounev

On Mon, Feb 18, 2002 at 02:10:52PM +0100, Jobst Landgrebe wrote:

 I'm looking for a SPAMfilter that I could combine with mutt. Does anyone
 have a good suggestion which filter is easy to set up together with mutt
 and how this must be done? Is there a filter one can call from the
 .muttrc-file?

I've been using the SpamBouncer for a few months and am quite happy with
it. It's just a set of Procmail rules, small and easy to configure. For
more info check out www.spambouncer.org.

--
Regards,
Artem Okounev.


