Re: URIBL/DNSBL from a database

2016-03-02 Thread Alex
Hi,

>> Is there any reason to not use the bl.score.senderscore.com with
>> postscreen? I don't understand the distinction
>
> why?
>
> postscreen is supposed to be configured with sensible scoring to reject most
> spam without false positives long before it reaches smtpd or even expensive
> contentfilters
>
> hence the scoring and any sensible setup would use postscreen combined with
> several whitelists
>
> that way your contentfilter has only to deal with the remaining 10% of junk
> and when you optimize postscreen to use a honeypot-MX (backup mx on a second
> IP with a postscreen whitelist_veto) and enforce pre-greet tests with a
> larger wait time there is not much for SpamAssassin to deal with

No, no, no. That's not at all what I mean. I know what the purpose and
benefit of postscreen is.

My issue relates to why is score.senderscore.com used with postscreen,
and not bl.score.senderscore.com as it is with SA?

Perhaps it should be as well?

The postscreen weights for score.senderscore.com are such that they
are relative to the threshold, so a reputation of say, 70 would
receive a higher score than a reputation of say, 90. In fact, 90
removes points.

And why is only bl.score.senderscore.com used with SA, and not the
reputation system?

Thanks,
Alex


Re: URIBL/DNSBL from a database

2016-03-02 Thread Reindl Harald



On 03.03.2016 at 02:44, Alex wrote:

Is there any reason to not use the bl.score.senderscore.com with
postscreen? I don't understand the distinction


why?

postscreen is supposed to be configured with sensible scoring to reject 
most spam without false positives long before it reaches smtpd or even 
expensive contentfilters


hence the scoring and any sensible setup would use postscreen combined 
with several whitelists


that way your contentfilter has only to deal with the remaining 10% of 
junk and when you optimize postscreen to use a honeypot-MX (backup mx on 
a second IP with a postscreen whitelist_veto) and enforce pre-greet 
tests with a larger wait time there is not much for SpamAssassin to deal with






Re: URIBL/DNSBL from a database

2016-03-02 Thread Alex
Hi,

Some time ago, David Jones wrote:
> In a related note, I have found that using the senderscore.org score combined
> with postscreen's weighting is very effective in quickly catching new 
> spammers.
>
> postscreen_dnsbl_sites =
>   score.senderscore.com=127.0.4.[60..69]*2
>   score.senderscore.com=127.0.4.[50..59]*4
>   score.senderscore.com=127.0.4.[30..49]*6
>   score.senderscore.com=127.0.4.[0..29]*8
>   score.senderscore.com=127.0.4.[90..100]*-6
>   score.senderscore.com=127.0.4.[80..89]*-4
>   score.senderscore.com=127.0.4.[70..79]*-2

This has been quite effective, but there have also been some
false-positives which I've had to whitelist. I've lowered the 0-29
result a bit so as to not make it a poison pill in my case.

I also probably should have asked at the time what your
postscreen_dnsbl_threshold is? Mine is 8.
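A rough sketch of the arithmetic behind that list, in Python. It assumes the last octet of the 127.0.4.N reply is the reputation value (which is what the ranges in the config imply) and uses the threshold of 8 mentioned above. The names senderscore_weight and would_reject are made up for illustration; this is not how postscreen itself is implemented.

```python
# Weights copied from the postscreen_dnsbl_sites list quoted above:
# (lowest reputation, highest reputation, weight).
WEIGHTS = [
    (0, 29, 8),
    (30, 49, 6),
    (50, 59, 4),
    (60, 69, 2),
    (70, 79, -2),
    (80, 89, -4),
    (90, 100, -6),
]

THRESHOLD = 8  # the postscreen_dnsbl_threshold mentioned in this message


def senderscore_weight(reputation):
    """Return the weight the list above assigns to a reputation value."""
    for low, high, weight in WEIGHTS:
        if low <= reputation <= high:
            return weight
    return 0  # no matching range, so no contribution


def would_reject(reputation, other_dnsbl_score=0):
    """True if the combined DNSBL score reaches the threshold."""
    return senderscore_weight(reputation) + other_dnsbl_score >= THRESHOLD


if __name__ == "__main__":
    for rep in (10, 45, 70, 90):
        print(rep, senderscore_weight(rep), would_reject(rep))
```

With the unmodified weights, a reputation in the 0-29 range reaches a threshold of 8 on its own, which is why it behaves like a poison pill unless that weight is lowered.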

Can someone explain how this differs from the bl.score.senderscore.com
that's used in the RCVD_IN_RP_RNBL rule?

Is there any reason to not use the bl.score.senderscore.com with
postscreen? I don't understand the distinction.

Does anyone know where the return result codes are defined? I've
looked all over the senderscore website and can't find them.

Thanks,
Alex


Re: dcc checks

2016-03-02 Thread Reindl Harald



On 03.03.2016 at 00:07, Roman Gelfand wrote:

On Wed, Mar 2, 2016 at 1:54 PM RW wrote:

On Wed, 2 Mar 2016 12:48:18 -0500
Roman Gelfand wrote:

 > I have awl disabled and dcc checks configured.  Why, sometimes,
 > spamassassin doesn't do dcc checks?

What makes you think that it doesn't?

this

X-Spam-Status: No, score=4.4 required=5.0 tests=BAYES_99,BAYES_999,
HTML_MESSAGE,MIME_HTML_ONLY autolearn=no version=3.3.2
X-Spam-Pyzor: Reported 0 times.


as opposed to

X-Spam-Status: Yes, score=6.3 required=5.0 tests=BAYES_99,BAYES_999,DCC_CHECK,
DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS
autolearn=no version=3.3.2


DCC_CHECK means it *hit*, and the same goes for RAZOR2_CHECK and PYZOR_CHECK
the *_CHECK rules are re-used by the DIGEST_MULTIPLE meta rule
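
A small Python sketch of how the tests= list in such a header can be read. The function name rules_hit is made up for illustration, and the unfolding assumes the rule list is only ever wrapped after a comma, as in the two headers quoted above. A rule missing from the list only shows that it did not hit, not that the check was skipped.

```python
import re


def rules_hit(status_header):
    """Return the set of rule names listed in tests= of an X-Spam-Status header."""
    # Join the wrapped rule list: drop any whitespace that follows a comma.
    unfolded = re.sub(r",\s+", ",", status_header)
    # Collapse the remaining line breaks and runs of spaces to single spaces.
    unfolded = re.sub(r"\s+", " ", unfolded)
    match = re.search(r"tests=(\S*)", unfolded)
    if not match:
        return set()
    names = match.group(1).split(",")
    return {name for name in names if name and name != "none"}


if __name__ == "__main__":
    header = (
        "X-Spam-Status: No, score=4.4 required=5.0 tests=BAYES_99,BAYES_999,\n"
        "HTML_MESSAGE,MIME_HTML_ONLY autolearn=no version=3.3.2"
    )
    print("DCC_CHECK" in rules_hit(header))
```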

why don't you just read the rule descriptions which are also part of the 
report-headers?


cat /var/lib/spamassassin/3.004001/updates_spamassassin_org/*.cf | grep DCC_CHECK

#DATE_IN_PAST_12_24,DCC_CHECK,DRASTIC_REDUCED,FROM_HAS_MIXED_NUMS
meta     DIGEST_MULTIPLE  RAZOR2_CHECK + DCC_CHECK + PYZOR_CHECK > 1
full     DCC_CHECK  eval:check_dcc()
describe DCC_CHECK  Detected as bulk mail by DCC (dcc-servers.net)
tflags   DCC_CHECK  net
reuse    DCC_CHECK
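
The meta line in that output can be read as plain arithmetic. A minimal Python sketch, assuming each of the three rules counts as 1 when it hit and 0 when it did not (the function name digest_multiple is made up for illustration):

```python
def digest_multiple(razor2_check, dcc_check, pyzor_check):
    """Mirror of: RAZOR2_CHECK + DCC_CHECK + PYZOR_CHECK > 1."""
    # Each rule contributes 1 when it hit and 0 otherwise, so the
    # meta rule fires once at least two of the three digest checks hit.
    return int(razor2_check) + int(dcc_check) + int(pyzor_check) > 1


if __name__ == "__main__":
    # DCC alone is not enough for DIGEST_MULTIPLE.
    print(digest_multiple(False, True, False))
    # DCC plus Pyzor is.
    print(digest_multiple(False, True, True))
```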






Re: dcc checks

2016-03-02 Thread Roman Gelfand
On Wed, Mar 2, 2016 at 2:50 PM Matus UHLAR - fantomas 
wrote:

> On 02.03.16 12:48, Roman Gelfand wrote:
> >I have awl disabled and dcc checks configured.  Why, sometimes,
> >spamassassin doesn't do dcc checks?
>
> that has nothing to do with AWL.
>
> You have already asked on the DCC mailing list (and I have replied), so why
> did you interrupt the conversation and bring it here?
> --
> Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
> Warning: I wish NOT to receive e-mail advertising to this address.
> Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
> Enter any 12-digit prime number to continue.
>

I misunderstood your response.  I thought it was a SpamAssassin thing.


Re: dcc checks

2016-03-02 Thread Roman Gelfand
On Wed, Mar 2, 2016 at 1:54 PM RW  wrote:

> On Wed, 2 Mar 2016 12:48:18 -0500
> Roman Gelfand wrote:
>
> > I have awl disabled and dcc checks configured.  Why, sometimes,
> > spamassassin doesn't do dcc checks?
>
> What makes you think that it doesn't?
>

this

X-Spam-Status: No, score=4.4 required=5.0 tests=BAYES_99,BAYES_999,
HTML_MESSAGE,MIME_HTML_ONLY autolearn=no version=3.3.2
X-Spam-Pyzor: Reported 0 times.


as opposed to

X-Spam-Status: Yes, score=6.3 required=5.0 tests=BAYES_99,BAYES_999,DCC_CHECK,
DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS
autolearn=no version=3.3.2


Re: RCVD_NUMERIC_HELO

2016-03-02 Thread Reindl Harald



On 02.03.2016 at 23:13, RW wrote:

On Wed, 2 Mar 2016 22:45:15 +0100
Reindl Harald wrote:


On 02.03.2016 at 22:12, RW wrote:

The only argument you have made against these rules is that they
don't work for you. They do work on the corpus that generates the
rule scores, so clearly the corpus does matter


VERY_LONG_REPTO_SHORT_MSG with a poison-pill score showed how much
you can trust that in real life


You just provided another example of why the corpus does matter


the corpus is no magic which solves every problem

a misguided rule written with wrong expectations can only be mitigated 
by the corpus and mass-tests, but it will never get fixed by them when the 
rule should not exist in that form to begin with


your expectation that the mass-test corpus can reproduce the whole real 
world is fundamentally broken






Re: RCVD_NUMERIC_HELO

2016-03-02 Thread RW
On Wed, 2 Mar 2016 22:45:15 +0100
Reindl Harald wrote:

> On 02.03.2016 at 22:12, RW wrote:
> > The only argument you have made against these rules is that they
> > don't work for you. They do work on the corpus that generates the
> > rule scores, so clearly the corpus does matter  
> 
> VERY_LONG_REPTO_SHORT_MSG with a poison-pill score showed how much
> you can trust that in real life

You just provided another example of why the corpus does matter.


Re: RCVD_NUMERIC_HELO

2016-03-02 Thread Reindl Harald



On 02.03.2016 at 22:12, RW wrote:

The only argument you have made against these rules is that they don't
work for you. They do work on the corpus that generates the rule scores,
so clearly the corpus does matter


VERY_LONG_REPTO_SHORT_MSG with a poison-pill score showed how much you 
can trust that in real life


it also doesn't work for others, but most people just don't look at their 
report-headers as closely as i do, and i can assure you nobody has looked 
as deeply into their overall setup and results as i did in the past year, 
because only a few people are perfectionists and most are satisfied as long 
as things are working somehow


most only cry out when the damage happened or just silently disable 
nonsense to save their own precious time







Re: RCVD_NUMERIC_HELO

2016-03-02 Thread RW
On Wed, 2 Mar 2016 15:58:10 +0100
Reindl Harald wrote:

> On 02.03.2016 at 14:12, RW wrote:
> > The FSL_HELO_BARE_IP_* rules were a bit broken until the end of
> > January, but for the last month they have been proper mutually
> > exclusive, deep and last-external tests. Any problems with the deep
> > hits are down to the rule generation corpus not matching your mail
> > rather than poor rule design. The ideal way to fix this is to
> > contribute to the QA process  
> 
> no, such tests are a matter of what they are doing 

Why's that? 

There are some tests that aren't done deep for good logical reasons. For
example dynamic-pool rDNS is a spam sign in the last-external
Received header, but is perfectly normal in a submission Received header
or a webmail originating-ip header. 

RFC violations are not normal practice anywhere, so there's no
intrinsic reason not to test for them on deep headers. It's all about
results.


> and no auto-scoring / corpus will be able to change that at all

The only argument you have made against these rules is that they don't
work for you. They do work on the corpus that generates the rule scores,
so clearly the corpus does matter.

 


Re: Redis Bayes Expire

2016-03-02 Thread Reindl Harald



On 02.03.2016 at 16:52, Marc Perkel wrote:

My Redis bayes keeps growing. It acts like it's not expiring like it
should. Do I need to do something to force expire? Also - anything else
I should set?


Google "spamassassin redis expire" and hit number 2:
http://www.gossamer-threads.com/lists/spamassassin/users/191555





Re: dcc checks

2016-03-02 Thread Matus UHLAR - fantomas

On 02.03.16 12:48, Roman Gelfand wrote:

I have awl disabled and dcc checks configured.  Why, sometimes,
spamassassin doesn't do dcc checks?


that has nothing to do with AWL.

You have already asked on the DCC mailing list (and I have replied), so why did
you interrupt the conversation and bring it here?
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Enter any 12-digit prime number to continue.


Re: dcc checks

2016-03-02 Thread RW
On Wed, 2 Mar 2016 12:48:18 -0500
Roman Gelfand wrote:

> I have awl disabled and dcc checks configured.  Why, sometimes,
> spamassassin doesn't do dcc checks?

What makes you think that it doesn't?


dcc checks

2016-03-02 Thread Roman Gelfand
I have awl disabled and dcc checks configured.  Why, sometimes,
spamassassin doesn't do dcc checks?


Re: Redis Bayes Expire

2016-03-02 Thread Axb

On 03/02/2016 05:32 PM, Marc Perkel wrote:



On 03/02/16 08:02, Axb wrote:

On 03/02/2016 04:52 PM, Marc Perkel wrote:

My Redis bayes keeps growing. It acts like it's not expiring like it
should. Do I need to do something to force expire? Also - anything else
I should set?

Here's my settings.

bayes_sql_dsn  server=localhost:6379
use_bayes 1
use_bayes_rules 1

# Your choice if you want to use auto_learn
bayes_auto_learn  1

use_learner 1
bayes_learn_to_journal 0

# THIS IS MANDATORY - You do NOT need to run sa-learn to expire tokens
# *_ttl below takes care of it.
bayes_auto_expire  1

# You will need to change this according to your needs
# This replaces sa-learn's sql/file based expire routines.
bayes_token_ttl 3d
bayes_seen_ttl  1d




run
"redis-cli info" and see

"expired_keys"


expired_keys:0

This doesn't look right.

db0:keys=56725213,expires=3,avg_ttl=257005915


send me your redis.conf and the full "redis-cli info" output OFFLIST



Re: Redis Bayes Expire

2016-03-02 Thread Marc Perkel



On 03/02/16 08:02, Axb wrote:

On 03/02/2016 04:52 PM, Marc Perkel wrote:

My Redis bayes keeps growing. It acts like it's not expiring like it
should. Do I need to do something to force expire? Also - anything else
I should set?

Here's my settings.

bayes_sql_dsn  server=localhost:6379
use_bayes 1
use_bayes_rules 1

# Your choice if you want to use auto_learn
bayes_auto_learn  1

use_learner 1
bayes_learn_to_journal 0

# THIS IS MANDATORY - You do NOT need to run sa-learn to expire tokens
# *_ttl below takes care of it.
bayes_auto_expire  1

# You will need to change this according to your needs
# This replaces sa-learn's sql/file based expire routines.
bayes_token_ttl 3d
bayes_seen_ttl  1d




run
"redis-cli info" and see

"expired_keys"


expired_keys:0

This doesn't look right.

db0:keys=56725213,expires=3,avg_ttl=257005915
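
To put numbers on that line, here is a rough Python sketch that parses a Keyspace line from "redis-cli info" and reports what fraction of keys carry an expiry. It assumes avg_ttl is reported in milliseconds; parse_keyspace and expiry_summary are names made up for illustration. The second sample is the Keyspace line Axb posted in this thread.

```python
def parse_keyspace(line):
    """Parse a line such as 'db0:keys=10,expires=9,avg_ttl=1000' into a dict."""
    db, _, fields = line.strip().partition(":")
    stats = {"db": db}
    for field in fields.split(","):
        name, _, value = field.partition("=")
        stats[name] = int(value)
    return stats


def expiry_summary(line):
    """Return (fraction of keys with an expiry set, average TTL in days)."""
    stats = parse_keyspace(line)
    with_expiry = stats["expires"] / stats["keys"] if stats["keys"] else 0.0
    # Assumption: avg_ttl is in milliseconds.
    avg_ttl_days = stats["avg_ttl"] / 1000 / 86400
    return with_expiry, avg_ttl_days


if __name__ == "__main__":
    samples = [
        "db0:keys=56725213,expires=3,avg_ttl=257005915",
        "db0:keys=28302508,expires=28302504,avg_ttl=256529266",
    ]
    for sample in samples:
        fraction, days = expiry_summary(sample)
        print(f"{fraction:.6%} of keys have an expiry, avg_ttl {days:.2f} days")
```

If the millisecond assumption holds, both avg_ttl values are a little under three days, which fits bayes_token_ttl 3d. The difference is in "expires": here almost none of the keys have a TTL set, so nothing can expire.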



--
Marc Perkel - Sales/Support
supp...@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400



Re: Redis Bayes Expire

2016-03-02 Thread Axb

On 03/02/2016 05:02 PM, Axb wrote:

On 03/02/2016 04:52 PM, Marc Perkel wrote:

My Redis bayes keeps growing. It acts like it's not expiring like it
should. Do I need to do something to force expire? Also - anything else
I should set?

Here's my settings.

bayes_sql_dsn  server=localhost:6379
use_bayes 1
use_bayes_rules 1

# Your choice if you want to use auto_learn
bayes_auto_learn  1

use_learner 1
bayes_learn_to_journal 0

# THIS IS MANDATORY - You do NOT need to run sa-learn to expire tokens
# *_ttl below takes care of it.
bayes_auto_expire  1

# You will need to change this according to your needs
# This replaces sa-learn's sql/file based expire routines.
bayes_token_ttl 3d
bayes_seen_ttl  1d




run
"redis-cli info" and see

"expired_keys"



Also see "Keyspace"

# Keyspace
db0:keys=28302508,expires=28302504,avg_ttl=256529266
...
db0:keys=28312796,expires=28312792,avg_ttl=245602093

these numbers should not be static..



Re: Redis Bayes Expire

2016-03-02 Thread Axb

On 03/02/2016 04:52 PM, Marc Perkel wrote:

My Redis bayes keeps growing. It acts like it's not expiring like it
should. Do I need to do something to force expire? Also - anything else
I should set?

Here's my settings.

bayes_sql_dsn  server=localhost:6379
use_bayes 1
use_bayes_rules 1

# Your choice if you want to use auto_learn
bayes_auto_learn  1

use_learner 1
bayes_learn_to_journal 0

# THIS IS MANDATORY - You do NOT need to run sa-learn to expire tokens
# *_ttl below takes care of it.
bayes_auto_expire  1

# You will need to change this according to your needs
# This replaces sa-learn's sql/file based expire routines.
bayes_token_ttl 3d
bayes_seen_ttl  1d




run
"redis-cli info" and see

"expired_keys"




Redis Bayes Expire

2016-03-02 Thread Marc Perkel
My Redis bayes keeps growing. It acts like it's not expiring like it 
should. Do I need to do something to force expire? Also - anything else 
I should set?


Here's my settings.

bayes_sql_dsn  server=localhost:6379
use_bayes 1
use_bayes_rules 1

# Your choice if you want to use auto_learn
bayes_auto_learn  1

use_learner 1
bayes_learn_to_journal 0

# THIS IS MANDATORY - You do NOT need to run sa-learn to expire tokens
# *_ttl below takes care of it.
bayes_auto_expire  1

# You will need to change this according to your needs
# This replaces sa-learn's sql/file based expire routines.
bayes_token_ttl 3d
bayes_seen_ttl  1d


--
Marc Perkel - Sales/Support
supp...@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400



RE: CHARSET_FARAWAY and other charsets

2016-03-02 Thread MAYER Hans

Dear All,

Many thanks for your reply.

> It may not be set where you think it is. IIWY I'd do a recursive grep

Yes, that is true. There is an 'ok_locales en' in the file "sa-mimedefang.cf".
I had overlooked it, and that explains the hits.

>  the SA can use users' ~/.spamassassin/user_prefs where ...

I tried this but it does NOT work for me. Is it possible it only works when 
"spamd" is running ? 

> So, how do you call spamassassin? 

I am using "mimedefang" as INPUT_MAIL_FILTER in my sendmail configuration. 
And mimedefang is calling SA as a perl script in perl's bin directory. 
There is no "spamd" running. 


Kind regards 
Hans

-- 





quite probably, but it highly depends on how you filter the spam.
For example, using spamassassin/spamc from spamass-milter or per-user 
procmail/maildrop filters, the SA can use users' ~/.spamassassin/user_prefs 
where the directives are configured. 

So, how do you call spamassassin?
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
10 GOTO 10 : REM (C) Bill Gates 1998, All Rights Reserved!



-----Original Message-----
From: RW [mailto:rwmailli...@googlemail.com] 
Sent: Wednesday, March 2, 2016 2:42 PM
To: users@spamassassin.apache.org
Subject: Re: CHARSET_FARAWAY and other charsets

On Wed, 2 Mar 2016 14:12:18 +0100
MAYER Hans wrote:


> pts rule name              description
> ---- ---------------------- --------------------------------------------------
> 3.2 CHARSET_FARAWAY_HEADER A foreign language charset used in headers
> 3.2 CHARSET_FARAWAY        BODY: Character set indicates a foreign language
> 2.5 MIME_CHARSET_FARAWAY   MIME character set indicates foreign language
> ...
> Looking into my configuration I didn't set "ok_languages" and I didn't 
> take "ok_locales". So I assume it will take the defaults from 
> "10_default_prefs.cf" with a value of "all".

"ok_locales" is the relevant setting and those rules shouldn't fire if it's set 
to "all". 

It may not be set where you think it is. IIWY I'd do a recursive grep
for ok_locales and see if it turns anything up.   


Re: RCVD_NUMERIC_HELO

2016-03-02 Thread Reindl Harald



On 02.03.2016 at 14:12, RW wrote:

The FSL_HELO_BARE_IP_* rules were a bit broken until the end of
January, but for the last month they have been proper mutually
exclusive, deep and last-external tests. Any problems with the deep
hits are down to the rule generation corpus not matching your mail
rather than poor rule design. The ideal way to fix this is to
contribute to the QA process


no, such tests are a matter of what they are doing and no auto-scoring / 
corpus will be able to change that at all






Re: CHARSET_FARAWAY and other charsets

2016-03-02 Thread RW
On Wed, 2 Mar 2016 14:12:18 +0100
MAYER Hans wrote:


> pts rule name              description
> ---- ---------------------- --------------------------------------------------
> 3.2 CHARSET_FARAWAY_HEADER A foreign language charset used in headers
> 3.2 CHARSET_FARAWAY        BODY: Character set indicates a foreign language
> 2.5 MIME_CHARSET_FARAWAY   MIME character set indicates foreign language
> ...
> Looking into my configuration I didn't set "ok_languages" and I
> didn't take "ok_locales". So I assume it will take the defaults from
> "10_default_prefs.cf" with a value of "all".

"ok_locales" is the relevant setting and those rules shouldn't fire if
it's set to "all". 

It may not be set where you think it is. IIWY I'd do a recursive grep
for ok_locales and see if it turns anything up.   
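
For anyone who prefers a script to grep, a rough Python equivalent of that recursive search follows. The directories in the usage part are hypothetical; point it at wherever your SpamAssassin and MIMEDefang configuration actually lives. The function name find_ok_locales is made up for illustration.

```python
import os
import re

# Matches an active (not commented out) ok_locales line and captures its value.
SETTING = re.compile(r"^\s*ok_locales\s+(.*)$")


def find_ok_locales(*roots):
    """Yield (path, line_number, value) for every ok_locales line under roots."""
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                try:
                    with open(path, encoding="utf-8", errors="replace") as handle:
                        for number, line in enumerate(handle, start=1):
                            match = SETTING.match(line)
                            if match:
                                yield path, number, match.group(1).strip()
                except OSError:
                    continue  # unreadable file, skip it


if __name__ == "__main__":
    # Hypothetical locations; adjust to your own installation.
    for hit in find_ok_locales("/etc/mail", "/etc/spamassassin"):
        print(*hit)
```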


Re: CHARSET_FARAWAY and other charsets

2016-03-02 Thread Matus UHLAR - fantomas

On 02.03.16 14:12, MAYER Hans wrote:

We are an international institute with employees from around the world. 
Therefore we send and receive regular e-mails in almost all languages.
Our setup: sendmail, mimedefang and SA version 3.3.2

Generally our spam detection rate is quite OK (I would say). But one of our users is complaining 
that "all" his e-mails in Russian get marked as spam. I would say "some few" 
are marked.
Looking for one of his spam marked e-mails I see the following:

pts rule name              description
---- ---------------------- --------------------------------------------------
3.2 CHARSET_FARAWAY_HEADER A foreign language charset used in headers
3.2 CHARSET_FARAWAY        BODY: Character set indicates a foreign language
2.5 MIME_CHARSET_FARAWAY   MIME character set indicates foreign language

Beside several other entries with positive and negative points.
This e-mail was written in Cyrillic with a Cyrillic subject.

Looking into my configuration I didn't set "ok_languages" and I didn't take 
"ok_locales".
So I assume it will take the defaults from "10_default_prefs.cf" with a value of 
"all".


quite probably, but it highly depends on how you filter the spam.
For example, using spamassassin/spamc from spamass-milter or per-user
procmail/maildrop filters, the SA can use users' ~/.spamassassin/user_prefs
where the directives are configured. 


So, how do you call spamassassin?
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
10 GOTO 10 : REM (C) Bill Gates 1998, All Rights Reserved!


Re: RCVD_NUMERIC_HELO

2016-03-02 Thread RW
On Tue, 01 Mar 2016 23:39:58 +0100
Benny Pedersen wrote:

> That one should not trigger in deep header tests

RCVD_NUMERIC_HELO doesn't do what the description says. It's actually a
*bare* IP address test (i.e. for an RFC violation rather than simply
an IP address), something that is better covered by the
FSL_HELO_BARE_IP_[12] rules. 
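
To illustrate the distinction, a small Python sketch, not the actual rule's regex: RFC 5321 wants an IP address in HELO/EHLO written as a bracketed address literal, so "[192.0.2.1]" is fine while a bare "192.0.2.1" is the violation. The function name helo_kind is made up, and only IPv4 is handled here.

```python
import ipaddress


def is_ipv4(text):
    """True if text is a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def helo_kind(helo):
    """Classify a HELO/EHLO argument as 'bare-ip', 'address-literal' or 'other'."""
    if helo.startswith("[") and helo.endswith("]"):
        # Bracketed form, e.g. [192.0.2.1]: a valid address literal.
        return "address-literal" if is_ipv4(helo[1:-1]) else "other"
    # No brackets: an IP address here is the bare form the rules look for.
    return "bare-ip" if is_ipv4(helo) else "other"


if __name__ == "__main__":
    for helo in ("192.0.2.1", "[192.0.2.1]", "mail.example.com"):
        print(helo, helo_kind(helo))
```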

IMO it should go because it's a near duplicate of __FSL_HELO_BARE_IP_2.
Once it's gone the  FSL_HELO_BARE_IP_* rules will increase their
scores, and can be sensibly capped. 


The FSL_HELO_BARE_IP_* rules were a bit broken until the end of
January, but for the last month they have been proper mutually
exclusive, deep and last-external tests. Any problems with the deep
hits are down to the rule generation corpus not matching your mail
rather than poor rule design. The ideal way to fix this is to
contribute to the QA process.


CHARSET_FARAWAY and other charsets

2016-03-02 Thread MAYER Hans


Dear All,

We are an international institute with employees from around the world. 
Therefore we send and receive regular e-mails in almost all languages. 
Our setup: sendmail, mimedefang and SA version 3.3.2

Generally our spam detection rate is quite OK (I would say). But one of our 
users is complaining that "all" his e-mails in Russian get marked as spam. I 
would say "some few" are marked. 
Looking for one of his spam marked e-mails I see the following: 

pts rule name              description
---- ---------------------- --------------------------------------------------
3.2 CHARSET_FARAWAY_HEADER A foreign language charset used in headers
3.2 CHARSET_FARAWAY        BODY: Character set indicates a foreign language
2.5 MIME_CHARSET_FARAWAY   MIME character set indicates foreign language

Beside several other entries with positive and negative points.
This e-mail was written in Cyrillic with a Cyrillic subject. 

Looking into my configuration I didn't set "ok_languages" and I didn't take 
"ok_locales".
So I assume it will take the defaults from "10_default_prefs.cf" with a value 
of "all".

So my question is: how is it possible that the character set is detected as 
"far away"? 

Kind regards 
Hans