Re: bayes learning '0 messages found'

2010-02-13 Thread smfabac


RW-15 wrote:
 
 On Fri, 12 Feb 2010 17:51:12 +
 RW rwmailli...@googlemail.com wrote:
 
 On Fri, 12 Feb 2010 09:17:54 -0800 (PST)
 smfabac smfa...@att.net wrote:
 
  
 
  Mark, 
  
  On UNIX any file is a mbox file if it contains mail messages in the
  form:
  
  ^A^A^A^A
  mail headers
  mail body
  ^A^A^A^A
  ^A^A^A^A
  Next Message mail headers
  mail body
  ^A^A^A^A
 
 I don't know what that is, but it's not a standard mbox format.
 
 In mbox format the emails all start with a blank line and a From.
 
 
 It appears to be mmdf format
 
 http://www.washington.edu/imap/documentation/formats.txt.html
 
 

Ok, 

Now that we're all on the same page. How do I find out why sa-learn
is not processing the legal not-spam file?  To re-cap, sa-learn --spam
--mbox isspam works but sa-learn --ham --mbox not-spam is not
working.  

The sa-learn --dump magic shows that messages have been 
added by the sa-learn command:

$ sa-learn --dump magic
0.000  0  3  0  non-token data: bayes db version
0.000  0  12551  0  non-token data: nspam
0.000  0  68020  0  non-token data: nham
0.000  0 143948  0  non-token data: ntokens
0.000  0 1260104403  0  non-token data: oldest atime
0.000  0 1266048014  0  non-token data: newest atime
0.000  0 1266049794  0  non-token data: last journal sync atime
0.000  0 1265630710  0  non-token data: last expiry atime
0.000  0    5529600  0  non-token data: last expire atime delta
0.000  0  19095  0  non-token data: last expire reduction count

$ sa-learn --spam --mbox isspam
Learned tokens from 1 message(s) (1 message(s) examined)
$

$ sa-learn --dump magic
0.000  0  3  0  non-token data: bayes db version
0.000  0  12552  0  non-token data: nspam
0.000  0  68020  0  non-token data: nham
0.000  0 144608  0  non-token data: ntokens
0.000  0 1260104403  0  non-token data: oldest atime
0.000  0 1266048014  0  non-token data: newest atime
0.000  0 1266049794  0  non-token data: last journal sync atime
0.000  0 1265630710  0  non-token data: last expiry atime
0.000  0    5529600  0  non-token data: last expire atime delta
0.000  0  19095  0  non-token data: last expire reduction count
$ 

As you can see the nspam has incremented by 1.
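
A minimal sketch, assuming Python is at hand, of how two `sa-learn --dump magic` outputs like the ones above could be compared automatically. The function names `parse_dump_magic` and `counter_changes` are made up for this illustration; the parser only assumes that each counter's value sits in the third column, as in the output shown.

```python
def parse_dump_magic(text):
    """Return {counter name: value} for the 'non-token data' lines."""
    counters = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, name = line.split("non-token data:", 1)
        parts = fields.split()
        # In the dumps shown above the counter value is the third column.
        if len(parts) >= 3 and parts[2].isdigit():
            counters[name.strip()] = int(parts[2])
    return counters


def counter_changes(before, after):
    """Return {name: delta} for every counter whose value changed."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] != before[name]
    }
```

Feeding it the dumps taken before and after a learning run shows at a glance whether `nspam` or `nham` moved at all.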

$ sa-learn --ham --mbox not-spam
Learned tokens from 0 message(s) (0 message(s) examined)
$ 

Read Create Save Delete Undelete Print Folder Options Quit
Set mail options and preferences
Folder: not-spam                    Saturday February 13, 2010 2:34
-- [1] Message 

  1 gerb...@zenez.co  11 Feb 10 6404  Quarterly ASCII posting of SCO Uni


Is there a message size limit for sa-learn?  The message in not-spam is 
plain ascii, no html.

$ wc -l not-spam
   6408 not-spam  -- sa-learn --ham failed on not-spam folder with one message
$ 
$ wc -l isspam
   1039 isspam   -- sa-learn --spam worked on isspam folder with one message
$ 
-- 
View this message in context: 
http://old.nabble.com/bayes-learning-%270-messages-found%27-tp27358517p27573012.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Re: MTX plugin created (Re: Spam filtering similar to SPF, less breakage)

2010-02-13 Thread Per Jessen
Justin Mason wrote:

 On Thu, Feb 11, 2010 at 03:00,  dar...@chaosreigns.com wrote:
 http://www.chaosreigns.com/mtx/
 
 
 It might be useful to compare with MTA MARK and see what the status of
 that proposal currently is:
 
 http://tools.ietf.org/draft/draft-stumpf-dns-mtamark/
http://tools.ietf.org/draft/draft-stumpf-dns-mtamark/draft-stumpf-dns-mtamark-04.txt

Amazing.  Justin, you must have known about that one - you can't
possibly have just googled it?


/Per Jessen, Zürich




Re: bayes learning '0 messages found'

2010-02-13 Thread Charles Gregory

On Sat, 13 Feb 2010, smfabac wrote:

Now that we're all on the same page. How do I find out why sa-learn
is not processing the legal not-spam file?  To re-cap, sa-learn --spam
--mbox isspam works but sa-learn --ham --mbox not-spam is not
working.


Well, I would expect if this suggestion were right you would have had all 
sorts of warning messages about syntax, but just in case


Maybe linux is interpreting the dash in the filename as a switch 
indicator? Try enclosing the file name in single quotes or use a filename 
without a dash...


- C




Re: MTX plugin created (Re: Spam filtering similar to SPF, less breakage)

2010-02-13 Thread Charles Gregory

On Sat, 13 Feb 2010, Per Jessen wrote:

Justin Mason wrote:

It might be useful to compare with MTA MARK and see what the status of
that proposal currently is:
http://tools.ietf.org/draft/draft-stumpf-dns-mtamark/

Amazing.  Justin, you must have known about that one - you can't
possibly have just googled it?


Well, I certainly had never heard of this one. And I think that with one 
minor variation in concept it could be useful to scoring systems like 
SA...


Because of the threat of hacks, any system that 'favors' an MTA is simply 
giving spammers a target for exploitation. But an explicit 'disallow' 
record (MTA=0) created by the sysadmin would have a similar impact to 
deliberately naming PTR records as 'dynamic'. SA could 'detect' the 
explicit MTA=0 and add a score (or block outright at MTA level) The 
only thing I would *not* do, given the general laziness of the internet, 
is apply any default meaning to the absence of this TXT record. Only 
explicit identification of an IP or subnet as 'not permitted to send mail' 
would have significance to SA or a blocking MTA.


Hmm. Could work. No impact for non-implementation. Disables an 
unauthorized IP for any case where it is used. I like it...


- C


Re: X-Spam-Languages always blank?

2010-02-13 Thread Robert Nicholson
I still need to do some debugging as it works sometimes so it looks like it's 
setup properly.

just don't understand why when it said possibly en in the debug log why the 
language wasn't populated but in some other cases it is being populated.

On Feb 12, 2010, at 10:48 PM, Matt Kettler wrote:

 On 2/12/2010 10:50 PM, Robert Nicholson wrote:
 I have 
 
 Feb 12 19:35:31.669 [81642] dbg: textcat: X-Languages: en, 
 X-Languages-Length: 424
 
 in my testing
 
 but the X-Spam-Languages ends up with nothing
 
 I have in my user_prefs 
 
 add_header all Languages _LANGUAGES_
 
 
 
 Is the X-Spam-Languages header being added, with no text, or is not
 appearing at all?
 
 What version of SA are you using? some versions (IIRC early 3.1.x
 members) did not support the _LANGUAGES_ meta-tag.
 
 



Re: X-Spam-Languages always blank?

2010-02-13 Thread Robert Nicholson
I'm going thru debug with version 3.3.0 and I see it definitely puts the 
X-Languages metadata in place but even with the add_header line that I have it 
leaves X-Spam-Languages unpopulated

500:@matches = classify(\$body, $opts->{conf}, %skip);
  DB<2> n
Feb 13 08:30:23.876 [56907] dbg: textcat: language possibly: en
Mail::SpamAssassin::Plugin::TextCat::extract_metadata(/home/elastica/SALOCAL-3.3.0/lib/perl5/site_perl/5.8.8/Mail/SpamAssassin/Plugin/TextCat.pm:507):
507:  undef $body;
  DB<2> x @matches
0  'en'

but all I get for this example is blank for the header value

ok looks like my problem is because I'm reprocessing a message that already has 
an X-Spam-Languages: nothing header in it.

I expected that it wouldn't matter that it was there as it would be removed and 
populated accordingly but that is only happening if I remove
the original X-Spam-Languages header from the message.

is this a bug? 

My understanding is that it should remove all X-Spam headers when reprocessing 
the message? 
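
A minimal sketch of the workaround described above: dropping any existing X-Spam-* headers from a message before feeding it back through the scanner. This uses only Python's standard `email` package; the function name `strip_spam_headers` is hypothetical, and nothing here reflects how SpamAssassin itself treats old headers.

```python
from email import message_from_string


def strip_spam_headers(raw):
    """Return the message text with every X-Spam-* header removed."""
    msg = message_from_string(raw)
    # Collect the names first; deleting while iterating would be fragile.
    names = {name for name in msg.keys() if name.lower().startswith("x-spam-")}
    for name in names:
        del msg[name]  # removes all occurrences of that header
    return msg.as_string()
```

Running a test message through this first makes it easier to tell whether a blank header comes from the current run or was left over from an earlier one.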

On Feb 12, 2010, at 10:48 PM, Matt Kettler wrote:

 On 2/12/2010 10:50 PM, Robert Nicholson wrote:
 I have 
 
 Feb 12 19:35:31.669 [81642] dbg: textcat: X-Languages: en, 
 X-Languages-Length: 424
 
 in my testing
 
 but the X-Spam-Languages ends up with nothing
 
 I have in my user_prefs 
 
 add_header all Languages _LANGUAGES_
 
 
 
 Is the X-Spam-Languages header being added, with no text, or is not
 appearing at all?
 
 What version of SA are you using? some versions (IIRC early 3.1.x
 members) did not support the _LANGUAGES_ meta-tag.
 
 



Re: SA 3.30 question: redundant index in bayes?

2010-02-13 Thread David Morton
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Matt Kettler wrote:

 A quick diff of the 3.2 and 3.3 versions of these files shows this table
 was changed:
 
 
 CREATE TABLE bayes_token (
   id int(11) NOT NULL default '0',
   token char(5) NOT NULL default '',
   spam_count int(11) NOT NULL default '0',
   ham_count int(11) NOT NULL default '0',
   atime int(11) NOT NULL default '0',
   PRIMARY KEY  (id, token),
   INDEX bayes_token_idx1 (token),      <- deleted
   INDEX bayes_token_idx2 (id, atime)   <- renamed idx1
 ) TYPE=MyISAM;
 
 
 
 So token was both a primary key, and an index, which is redundant.

How is that redundant?  If you search for only a token, it would not be
indexed, and would perform very poorly.

In section 7.4.4 of the mysql docs:
http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html

 If the table has a multiple-column index, any leftmost prefix of the
index can be used by the optimizer to find rows. For example, if you
have a three-column index on (col1, col2, col3), you have indexed search
capabilities on (col1), (col1, col2), and (col1, col2, col3).

MySQL cannot use an index if the columns do not form a leftmost prefix
of the index. Suppose that you have the SELECT statements shown here:

SELECT * FROM tbl_name WHERE col1=val1;
SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;

SELECT * FROM tbl_name WHERE col2=val2;
SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3;

If an index exists on (col1, col2, col3), only the first two queries use
the index. The third and fourth queries do involve indexed columns, but
(col2) and (col2, col3) are not leftmost prefixes of (col1, col2, col3).





- --
David Morton morto...@dgrmm.net

Morton Software  Design  http://www.dgrmm.net - Ruby on Rails
 PHP Applications
Maia Mailguard http://www.maiamailguard.com- Spam management
 for mail servers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iD8DBQFLdtRDUy30ODPkzl0RAhuUAKDLlkErP+nXtPQ6gfHQwOQpBw5e7wCgvn+n
1VsGlPGW6GW9GoJrwE3cTgw=
=PJew
-----END PGP SIGNATURE-----


Re: bayes learning '0 messages found'

2010-02-13 Thread smfabac


Charles Gregory wrote:
 
 On Sat, 13 Feb 2010, smfabac wrote:
 Now that we're all on the same page. How do I find out why sa-learn
 is not processing the legal not-spam file?  To re-cap, sa-learn --spam
 --mbox isspam works but sa-learn --ham --mbox not-spam is not
 working.
 
 Well, I would expect if this suggestion were right you would have had all 
 sorts of warning messages about syntax, but just in case
 
 Maybe linux is interpreting the dash in the filename as a switch 
 indicator? Try enclosing the file name in single quotes or use a filename 
 without a dash...
 
 - C
 
 
 
 

$ ls -lt | head -3
total 15868
-rw-------   1 smf  group 249046 Feb 13 02:37 not-spam
-rw-rw-rw-   1 smf  group  94762 Feb 13 02:29 isspam
$ mv not-spam notspam
$ ls -lt | head -3
total 15868
-rw-------   1 smf  group 249046 Feb 13 02:37 notspam
-rw-rw-rw-   1 smf  group  94762 Feb 13 02:29 isspam

$ sa-learn --showdots --ham --mbox notspam

Learned tokens from 0 message(s) (0 message(s) examined)
$

On the off chance that permissions on the file are an issue:

$ chmod 666 notspam
$ ls -lt | head -3
total 15868
-rw-rw-rw-   1 smf  group 249046 Feb 13 02:37 notspam
-rw-rw-rw-   1 smf  group  94762 Feb 13 02:29 isspam

$ sa-learn --showdots --ham --mbox notspam

Learned tokens from 0 message(s) (0 message(s) examined)

Still no luck.




Re: SA 3.30 question: redundant index in bayes?

2010-02-13 Thread Henrik K
On Sat, Feb 13, 2010 at 10:33:08AM -0600, David Morton wrote:
 -----BEGIN PGP SIGNED MESSAGE-----
 Hash: SHA1
 
 Matt Kettler wrote:
 
  A quick diff of the 3.2 and 3.3 versions of these files shows this table
  was changed:
  
  
  CREATE TABLE bayes_token (
id int(11) NOT NULL default '0',
token char(5) NOT NULL default '',
spam_count int(11) NOT NULL default '0',
ham_count int(11) NOT NULL default '0',
atime int(11) NOT NULL default '0',
PRIMARY KEY  (id, token),
INDEX bayes_token_idx1 (token),      <- deleted
INDEX bayes_token_idx2 (id, atime)   <- renamed idx1
  ) TYPE=MyISAM;
  
  
  
  So token was both a primary key, and an index, which is redundant.
 
 How is that redundant?  If you search for only a token, it would not be
 indexed, and would perform very poorly.

As you didn't bother to check SpamAssassin sources, let me clarify it for
you. All the SA queries use id=? AND token=?. If something is changed,
it's usually for a reason. But thanks for the effort anyway.

https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5659



Re: bayes learning '0 messages found'

2010-02-13 Thread Matus UHLAR - fantomas
On 12.02.10 09:17, smfabac wrote:
 On UNIX any file is a mbox file if it contains mail messages in the form:
 
 ^A^A^A^A
 mail headers
 mail body
 ^A^A^A^A
 ^A^A^A^A
 Next Message mail headers
 mail body
 ^A^A^A^A

mmdf, not mbox.

 And my not-spam file meets this requirement:
 
 ^A^A^A^A

sa-learn apparently does not support mmdf. when sa-learn does not recognize
the format of the file, it does not learn from it.

 Also, reading the file with the command mail -f not-spam launches 
 the UNIX mail reader showing that the file is legal mbox file.

your mail command supports mmdf.

save the message to mbox format (saving it to a single file without the ^A's
could work) and try sa-learn from it.
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Linux - It's now safe to turn on your computer.
Linux - Teraz mozete pocitac bez obav zapnut.
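
A minimal sketch of the conversion suggested above, assuming the MMDF folder brackets each message between lines of four Control-A characters, as in the layout quoted earlier in the thread. The function name `mmdf_to_mbox`, the default sender and the fixed timestamp on the generated "From " separator line are placeholders invented for this illustration.

```python
MMDF_DELIM = "\x01\x01\x01\x01"


def mmdf_to_mbox(text, sender="MAILER-DAEMON", stamp="Thu Jan  1 00:00:00 1970"):
    """Convert MMDF-style folder text into mbox-style text."""
    messages, current = [], None
    for line in text.splitlines():
        if line == MMDF_DELIM:
            if current is None:
                current = []              # opening delimiter
            else:
                messages.append(current)  # closing delimiter
                current = None
        elif current is not None:
            current.append(line)

    out = []
    for lines in messages:
        if lines and lines[0].startswith("From "):
            out.append(lines[0])          # keep an existing separator line
            lines = lines[1:]
        else:
            out.append(f"From {sender} {stamp}")  # placeholder separator
        for line in lines:
            # Quote lines that would otherwise look like a new message.
            out.append(">" + line if line.startswith("From ") else line)
        out.append("")                    # blank line between messages
    return "\n".join(out) + ("\n" if out else "")
```

Writing the result to a new file and pointing `sa-learn --ham --mbox` at that file is then a quick way to test whether the folder format was the problem.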


Re: X-Spam-Languages always blank?

2010-02-13 Thread Matus UHLAR - fantomas
On 13.02.10 08:08, Robert Nicholson wrote:
 I still need to do some debugging as it works sometimes so it looks like
 it's setup properly.
 
 just don't understand why when it said possibly en in the debug log why
 the language wasn't populated but in some other cases it is being
 populated.

How do you call spamassassin?

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
We are but packets in the Internet of life (userfriendly.org)


Re: SA 3.30 question: redundant index in bayes?

2010-02-13 Thread David Morton
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Henrik K wrote:
 So token was both a primary key, and an index, which is redundant.
 How is that redundant?  If you search for only a token, it would not be
 indexed, and would perform very poorly.
 
 As you didn't bother to check SpamAssassin sources, let me clarify it for
 you. All the SA queries use id=? AND token=?. If something is changed,
 it's usually for a reason. But thanks for the effort anyway.

My bad, I was thinking of id as an autoincrement type... not with its
own semantic meaning.

- --
David Morton morto...@dgrmm.net

Morton Software  Design  http://www.dgrmm.net - Ruby on Rails
 PHP Applications
Maia Mailguard http://www.maiamailguard.com- Spam management
 for mail servers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iD8DBQFLdvEoUy30ODPkzl0RAiqSAJoDB1ARJ1vGajHwE1pdEFHCbBAhogCgidNP
j86yRAfmlVfXPzPnx0n+bnk=
=9eEV
-----END PGP SIGNATURE-----


Re: MTX plugin functionally complete? Re: Spam filtering similar to SPF, less breakage

2010-02-13 Thread Matus UHLAR - fantomas
 On 02/11, Matus UHLAR - fantomas wrote:
  So you define the IP 64.71.152.40 as OK when sending mail from
  @panic.chaosreigns.com. address.
  
  so it's the exactly same as
  
  panic.chaosreigns.com. IN SPF v=spf1 a:64.71.152.40 -all

On 12.02.10 22:24, dar...@chaosreigns.com wrote:
 No.  MTX defines 64.71.152.40 as a legitimate transmitting mail server,
 regardless of the domain in the envelope from, From: header, etc..
 Popular misconception, it seems.

So the only effect of MTX should be confirmation that a machine may send
mail? So why the complicated check for DNS record combining DNS name and IP?
Why not simply requesting that machine has a mail or smtp in its DNS
name? 
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I'm not interested in your website anymore.
If you need cookies, bake them yourself.


Re: MTX plugin functionally complete? Re: Spam filtering similar to SPF, less breakage

2010-02-13 Thread Darxus
On 02/13, Matus UHLAR - fantomas wrote:
 So the only effect of MTX should be confirmation that a machine may send
 mail? 

Yes.

 So why the complicated check for DNS record combining DNS name and IP?
 Why not simply requesting that machine has a mail or smtp in its DNS
 name? 

I answered that recently.  

(I need to state that such a method would require a full circle DNS check.
Not a problem)

1) I am not comfortable requiring people to modify existing host names to
   participate.

2) Probably more importantly, I am concerned about the possibility of
   spammers tricking DNS maintainers into giving them such host names.

These two problems are handled by
http://tools.ietf.org/draft/draft-stumpf-dns-mtamark/draft-stumpf-dns-mtamark-04.txt
which was recently mentioned by Justin Mason.


The advantage MTX has over mtamark, which I believe is important, is that
MTX ties the spam to a domain name, which is tied to a registrar, which can
be subpoenaed for the identity of the spammer.  mtamark leaves the spam
still only tied to the transmitting IP, which I believe is less convenient
to track.  Especially given IP hijacking via BGP.  Nasty.

-- 
Of course there's strength in numbers. But there's strength in sharp
weaponry too. Ironically, this lead to what we call 'civilization'.
- spore
http://www.ChaosReigns.com


Re: MTX plugin functionally complete? Re: Spam filtering similar to SPF, less breakage

2010-02-13 Thread mouss
dar...@chaosreigns.com a écrit :
 On 02/13, Matus UHLAR - fantomas wrote:
 So the only effect of MTX should be confirmation that a machine may send
 mail? 
 
 Yes.
 
 So why the complicated check for DNS record combining DNS name and IP?
 Why not simply requesting that machine has a mail or smtp in its DNS
 name? 
 
 I answered that recently.  
 
 (I need to state that such a method would require a full circle DNS check.
 Not a problem)
 
 1) I am not comfortable requiring people to modify existing host names to
participate.
 

fully agreed. an IP is not necessarily dedicated to mail, so there is no
reason to force people to put mail in it.

and snow shoe spammers already use names that people want...

 2) Probably more importantly, I am concerned about the possibility of
spammers tricking DNS maintainers into giving them such host names.
 
 These two problems are handled by
 http://tools.ietf.org/draft/draft-stumpf-dns-mtamark/draft-stumpf-dns-mtamark-04.txt
 which was recently mentioned by Justin Mason.
 
 
 The advantage MTX has over mtamark, which I believe is important, is that
 MTX ties the spam to a domain name, which is tied to a registrar, which can
 be subpoenaed for the identity of the spammer.  mtamark leaves the spam
 still only tied to the transmitting IP, which I believe is less convenient
 to track.  Especially given IP hijacking via BGP.  Nasty.
 

did you take a look at CSA

http://mipassoc.org/csv/draft-ietf-marid-csv-csa-02.txt

it uses an SRV record instead of the so-much-abused reverse dns hack.


Anyway, such approaches are only helpful if widely adopted. otherwise,
the overhead is not worth the pain.

At this time, just register your IP in DNSWL.




Outbound SPAM filter

2010-02-13 Thread shawnbor

Hello all, 

I'm having trouble getting my mail server to scan outbound emails for spam,
incoming works fine (in as much as that it captures spammy email). Any
feedback would be appreciated. My setup as follows: 

FreeBSD 6 
postfix 2.3.2 
amavisd-new 2.4.2 
spamassassin 3.1.3 
mysql 5 

conf files: 

amavisd.conf 

# @bypass_virus_checks_maps = (1); # uncomment to DISABLE anti-virus code 
# @bypass_spam_checks_maps = (1); # uncomment to DISABLE anti-spam code 

$max_servers = 2; # num of pre-forked children (2..15 is common), -m 
$daemon_user = 'vscan'; # (no default; customary: vscan or amavis), -u 
$daemon_group = 'vscan'; # (no default; customary: vscan or amavis), -g 

$mydomain = 'somedomain.com'; # a convenient default for other settings 

$MYHOME = '/var/amavis'; # a convenient default for other settings, -H 
$TEMPBASE = "$MYHOME/tmp"; # working directory, needs to exist, -T 
$ENV{TMPDIR} = $TEMPBASE; # environment variable TMPDIR 
$QUARANTINEDIR = '/var/virusmails'; # -Q 
# $quarantine_subdir_levels = 1; # add level of subdirs to disperse quarantine 

# $daemon_chroot_dir = $MYHOME; # chroot directory or undef, -R 

# $db_home = "$MYHOME/db"; # dir for bdb nanny/cache/snmp databases, -D 
# $helpers_home = "$MYHOME/var"; # working directory for SpamAssassin, -S 
# $lock_file = "$MYHOME/var/amavisd.lock"; # -L 
# $pid_file = "$MYHOME/var/amavisd.pid"; # -P 
#NOTE: create directories $MYHOME/tmp, $MYHOME/var, $MYHOME/db manually 

# @local_domains_maps = ( [".$mydomain"] ); 
# @mynetworks = qw( 127.0.0.0/8 [::1] [FE80::]/10 [FEC0::]/10 
# 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 ); 

@local_domains_maps = ( [".$mydomain"] ); 
@mynetworks = qw( 127.0.0.0/8 x.x.x.x/24 x.x.x.x/24 x.x.x.x/24 x.x.x.x x.x.x.x/24 ); 

$log_level = 5; # verbosity 0..5, -d 
$log_recip_templ = undef; # disable by-recipient level-0 log entries 
$DO_SYSLOG = 1; # log via syslogd (preferred) 
$syslog_facility = 'mail.debug'; # Syslog facility as a string 
# e.g.: mail, daemon, user, local0, ... local7 
$syslog_priority = 'debug'; # Syslog base (minimal) priority as a string, 
# choose from: emerg, alert, crit, err, warning, notice, info, debug 

#$enable_db = 1; # enable use of BerkeleyDB/libdb (SNMP and nanny) 
#$enable_global_cache = 1; # enable use of libdb-based cache if $enable_db=1 

$inet_socket_port = 10024; # listen on this local TCP port(s) (see $protocol) 
$unix_socketname = "$MYHOME/amavisd.sock"; # amavisd-release or amavis-milter 
# option(s) -p overrides $inet_socket_port and $unix_socketname 

$interface_policy{'SOCK'}='AM.PDP-SOCK'; # only relevant with $unix_socketname 
# Use with amavis-release over a socket or with Petr Rehor's amavis-milter.c 
# (with amavis-milter.c from this package or old amavis.c client use 'AM.CL'): 
$policy_bank{'AM.PDP-SOCK'} = { protocol=>'AM.PDP' }; 

$sa_tag_level_deflt = 2.0; # add spam info headers if at, or above that level 
$sa_tag2_level_deflt = 6.31; # add 'spam detected' headers at that level 
$sa_kill_level_deflt = 6.31; # triggers spam evasive actions 
$sa_dsn_cutoff_level = 10; # spam level beyond which a DSN is not sent 
#$sa_debug 
# $sa_quarantine_cutoff_level = 20; # spam level beyond which quarantine is off 
# $penpals_bonus_score = 4; # (no effect without a @storage_sql_dsn database) 
# $penpals_threshold_high = $sa_kill_level_deflt; # don't waste time on hi spam 

$sa_mail_body_size_limit = 512*1024; # don't waste time on SA if mail is larger 
$sa_local_tests_only = 0; # only tests which do not require internet access? 

# @lookup_sql_dsn = 
# ( ['DBI:mysql:database=mail;host=127.0.0.1;port=3306', 'user1', 'passwd1'], 
# ['DBI:mysql:database=mail;host=host2', 'username2', 'password2'], 
# ["DBI:SQLite:dbname=$MYHOME/sql/mail_prefs.sqlite", '', ''] ); 
# @storage_sql_dsn = @lookup_sql_dsn; # none, same, or separate database 

# $timestamp_fmt_mysql = 1; # if using MySQL *and* msgs.time_iso is TIMESTAMP; 
# defaults to 0, which is good for non-MySQL or if msgs.time_iso is CHAR(16) 

$virus_admin = postmast...@$mydomain; # notifications recip. 

$mailfrom_notify_admin = postmast...@$mydomain; # notifications sender 
$mailfrom_notify_recip = postmast...@$mydomain; # notifications sender 
$mailfrom_notify_spamadmin = postmast...@$mydomain; # notifications sender 
$mailfrom_to_quarantine = ''; # null return path; uses original sender if undef 

@addr_extension_virus_maps = ('virus'); 
@addr_extension_spam_maps = ('spam'); 
@addr_extension_banned_maps = ('banned'); 
@addr_extension_bad_header_maps = ('badh'); 
# $recipient_delimiter = '+'; # undef disables address extensions altogether 
# when enabling addr extensions do also Postfix/main.cf: recipient_delimiter=+ 

$path = '/usr/local/sbin:/usr/local/bin:/usr/sbin:/sbin:/usr/bin:/bin'; 
# $dspam = 'dspam'; 

$MAXLEVELS = 14; 
$MAXFILES = 1500; 
$MIN_EXPANSION_QUOTA = 100*1024; # bytes (default undef, not enforced) 
$MAX_EXPANSION_QUOTA = 300*1024*1024; # bytes (default undef, not enforced) 

$sa_spam_subject_tag = '***SPAM*** '; 

Re: SA 3.30 question: redundant index in bayes?

2010-02-13 Thread Matt Kettler
On 2/13/2010 11:33 AM, David Morton wrote:


  So token was both a primary key, and an index, which is redundant.

 How is that redundant?  If you search for only a token, it would not be
 indexed, and would perform very poorly.

Because it is the primary key, which is by definition, an index!!! (it
is the fastest index too, so any other index on the same column will
just be slower (if used at all))






Re: MTX plugin functionally complete? Re: Spam filtering similar to SPF, less breakage

2010-02-13 Thread Darxus
On 02/13, mouss wrote:
 dar...@chaosreigns.com a écrit :
 did you take a look at CSA
   
 http://mipassoc.org/csv/draft-ietf-marid-csv-csa-02.txt

I had not, thanks.

Looks like it ties the helo domain to the delivering IP, breaking (broken)
forwarding just like SPF?

 Anyway, such approaches are only helpful if widely adopted. otherwise,
 the overhead is not worth the pain.

I disagree.  But I think you have probably already read my reasons.

 At this time, just register your IP in DNSWL.

I have provided a server since 2007, and been an admin longer.  And wrote
some stuff.  I have assigned a minor penalty to emails not matching DNSWL
for years.  A significant part of my motivation for creating MTX is the
difficulty of maintaining that list.  MTX is very much inspired by DNSWL
- it's the same, except the domain that hosts the records (and omitting
the host category in the third octet).  SPF and DNSWL were the two
things in my head at the time that MTX occurred to me.  The bottom of
my MTX page credits them.  http://www.chaosreigns.com/mtx/background/
goes into detail.

-- 
I'd rather be happy than right any day.
- Slartiblartfast, The Hitchhiker's Guide to the Galaxy
http://www.ChaosReigns.com
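
A minimal sketch of the kind of DNS name such a check might query, assuming the record is built DNSWL-style from the reversed IPv4 octets, a fixed `mtx` label, and the host name of the sending machine. That layout is an assumption made for this illustration and should be verified against http://www.chaosreigns.com/mtx/; the function name `mtx_query_name` is likewise hypothetical.

```python
import ipaddress


def mtx_query_name(ip, hostname, label="mtx"):
    """Build a reversed-octet lookup name for an IPv4 address and host name."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{label}.{hostname.rstrip('.')}"
```

With the address and host from earlier in the thread, `mtx_query_name("64.71.152.40", "panic.chaosreigns.com")` yields a name under the sender's own domain, which is the point made above: the record lives with the domain owner rather than on a central list.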


Re: SA 3.30 question: redundant index in bayes?

2010-02-13 Thread John Hardin

On Sat, 13 Feb 2010, Matt Kettler wrote:


On 2/13/2010 11:33 AM, David Morton wrote:



So token was both a primary key, and an index, which is redundant.


How is that redundant?  If you search for only a token, it would not be
indexed, and would perform very poorly.


Because it is the primary key, which is by definition, an index!!! (it
is the fastest index too, so any other index on the same column will
just be slower (if used at all))



  PRIMARY KEY  (id, token),


token is _not_ the primary key. It is a primary key _member_, but it is 
not the entire primary key.


David's comment is precisely correct. The PK index would not help a search 
on token by itself, as it is not the first member of the PK.


The fact that token is never queried by itself means an index on token is 
not needed, but such an index is _not_ redundant.


--
 John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
 jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  End users want eye candy and the ooo's and hhh's experience
  when reading mail. To them email isn't a tool, but an entertainment
  form. -- Steve Lake
---
 9 days until George Washington's 278th Birthday
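
A minimal sketch of the leftmost-prefix point made above, using Python's built-in `sqlite3` module rather than MySQL. SQLite is swapped in only because it needs no server; its planner is not MySQL's, and the column types are simplified, so this illustrates the principle rather than reproducing the SpamAssassin schema.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # detail is the last column


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bayes_token (
        id INTEGER NOT NULL DEFAULT 0,
        token TEXT NOT NULL DEFAULT '',
        spam_count INTEGER NOT NULL DEFAULT 0,
        ham_count INTEGER NOT NULL DEFAULT 0,
        atime INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (id, token)
    )
""")

# token alone is not a leftmost prefix of (id, token).
token_only = query_plan(
    conn, "SELECT * FROM bayes_token WHERE token = ?", ("abcde",))
# id plus token matches the whole primary key.
id_and_token = query_plan(
    conn, "SELECT * FROM bayes_token WHERE id = ? AND token = ?", (1, "abcde"))

print(token_only)
print(id_and_token)
```

The first plan should report a full scan and the second an index search, which matches the distinction drawn above: an index on token alone would not be redundant, merely unused by queries that always supply id as well.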


Re: bayes learning '0 messages found'

2010-02-13 Thread John Hardin

On Sat, 13 Feb 2010, smfabac wrote:


Is there a message size limit for sa-learn?


Yes, there is, and sadly sa-learn does not explicitly tell you a message 
has been skipped because it's too large.


If there's a non-text attachment try deleting it and re-learning the 
message.


--
 John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
 jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  End users want eye candy and the ooo's and hhh's experience
  when reading mail. To them email isn't a tool, but an entertainment
  form. -- Steve Lake
---
 9 days until George Washington's 278th Birthday


sa-update fails: daryl.dostech...404

2010-02-13 Thread jidanni
$ sa-update
http: GET http://daryl.dostech.ca/sa-update/asf/909775.tar.gz request failed: 
404 Not Found


spamassassin script is v3.003000, but using modules v3.004000

2010-02-13 Thread jidanni
Help, I dared to update from SVN, and now spamassassin refuses to run:
$ svn update
$ make install
$ sa-update --install by hand the newest files I could find on 
http://daryl.dostech.ca/sa-update/asf/
$ spamassassin
spamassassin: spamassassin script is v3.003000, but using modules v3.004000


Re: spamassassin script is v3.003000, but using modules v3.004000

2010-02-13 Thread Mark Martinec
On Sunday February 14 2010 01:02:10 jida...@jidanni.org wrote:
 Help, I dared to update from SVN, and now spamassassin refuses to run:
 $ svn update
 $ make install

The usual procedure is:
  perl Makefile.PL; make; make test; make install

 $ sa-update --install by hand the newest files I could find on
  http://daryl.dostech.ca/sa-update/asf/
 $ spamassassin
 spamassassin: spamassassin script is v3.003000, but using modules v3.004000

It should be alright to use the SVN trunk version, I'm using it here.

 spamassassin: spamassassin script is v3.003000, but using modules v3.004000

Apparently you updated the modules but somehow the spamassassin program
was not updated, or was installed in a different location than your
old 3.3.0 spamassassin. Please check your paths, perhaps you have
the two versions installed at different locations, or perhaps the
'make install' failed to install the 'spamassassin' program.

  Mark
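
A minimal sketch of the path check suggested above: listing every executable of a given name found along PATH, so that a stale copy shadowing the new one is easy to spot. The function name `find_all_on_path` is made up for this illustration, and POSIX-style permissions are assumed.

```python
import os


def find_all_on_path(name, path=None):
    """Return every executable called `name` on the search path, in order."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory or ".", name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits
```

Calling `find_all_on_path("spamassassin")` and seeing more than one entry would suggest that the shell is picking up the first, possibly older, script while Perl loads the newer modules.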


Re: spamassassin script is v3.003000, but using modules v3.004000

2010-02-13 Thread Robert Nicholson
How many spamassassin's do you have?

Isn't it saying the script in bin isn't matching the modules in lib?

On Feb 13, 2010, at 6:02 PM, jida...@jidanni.org wrote:

 Help, I dared to update from SVN, and now spamassassin refuses to run:
 $ svn update
 $ make install
 $ sa-update --install by hand the newest files I could find on 
 http://daryl.dostech.ca/sa-update/asf/
 $ spamassassin
 spamassassin: spamassassin script is v3.003000, but using modules v3.004000



Re: X-Spam-Languages always blank?

2010-02-13 Thread Robert Nicholson
I think I've solved this issue already.

I had the add_header etc but I hadn't yet enabled the TextCat plugin.

On Feb 13, 2010, at 12:34 PM, Matus UHLAR - fantomas wrote:

 On 13.02.10 08:08, Robert Nicholson wrote:
 I still need to do some debugging as it works sometimes so it looks like
 it's setup properly.
 
 just don't understand why when it said possibly en in the debug log why
 the language wasn't populated but in some other cases it is being
 populated.
 
 How do you call spamassassin?
 
 -- 
 Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
 Warning: I wish NOT to receive e-mail advertising to this address.
 Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
 We are but packets in the Internet of life (userfriendly.org)



Why does svn update pull in Mail-SpamAssassin-3.4.0.tar.gz?

2010-02-13 Thread jidanni
Why does svn update pull in Mail-SpamAssassin-3.4.0.tar.gz?
That's not how things work with Mediawiki.


Re: bayes learning '0 messages found'

2010-02-13 Thread Charles Gregory

On Sat, 13 Feb 2010, smfabac wrote:

$ sa-learn --showdots --ham --mbox notspam
Learned tokens from 0 message(s) (0 message(s) examined)
Still no luck.


Are we sure the notspam file is clean? Try trimming it down to just one or 
two messages, and see how it goes


- C
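
A minimal sketch of a quick sanity check on a folder file, assuming the only two layouts of interest are the ones discussed in this thread: mbox (first non-blank line starts with "From ") and MMDF (first non-blank line is four Control-A characters). The function name `guess_mailbox_format` is hypothetical, and this says nothing about what sa-learn itself checks.

```python
def guess_mailbox_format(path):
    """Guess the folder format from the first non-blank line of the file."""
    with open(path, "rb") as fh:
        for line in fh:
            stripped = line.rstrip(b"\r\n")
            if not stripped:
                continue  # skip leading blank lines
            if stripped.startswith(b"From "):
                return "mbox"
            if stripped == b"\x01\x01\x01\x01":
                return "mmdf"
            return "unknown"
    return "empty"
```

Running it against both the isspam and notspam files would show whether the two folders were actually written in the same format.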


Re: spamassassin script is v3.003000, but using modules v3.004000

2010-02-13 Thread jidanni
Maybe they changed something. In the past
perl Makefile.PL PREFIX=$HOME/.spamassassin-tree
also took care of where bin/spamassassin went. Now it seems left behind,
due to this suspicious commented out code?

# needs to be added to MY::install if used
#bin__install: $(INST_SCRIPT)/sa-filter
## $(RM_F) $(B_SCRIPTDIR)/spamassassin
## $(SYMLINK) $(INST_SCRIPT)/sa-filter $(B_SCRIPTDIR)/spamassassin