Re: yahoo rcvd bug?

2014-10-22 Thread Quinn Comendant
So apparently on my system—qmail as per qmailtoaster.com—it is by design not to 
include the rDNS hostname in the Received: … header. See my discussion on the 
QMT list: 
http://www.mail-archive.com/qmailtoaster-list@qmailtoaster.com/msg38313.html.

So that degrades SA performance huh? 

At least it's configurable, I'm going to enable lookups and see what happens.

Quinn



Re: General rules for training bayes

2014-10-22 Thread Axb

On 10/22/2014 03:29 AM, Alex Regan wrote:

I have the database in a replicated mysql database for now. I'd like to
go to redis, but it's not quite ready for distributed configurations,
correct?


What do you mean by distributed configurations?

- many clients querying a central Redis DB?

- real clustering?

- something like mysql dual master?


Redis *does* support master/slaves config.
plus failover handling via Sentinel
(http://redis.io/topics/sentinel)

Redis does *not* support full clustering. (atm...Redis cluster is RC1)

all this you can read on http://redis.io

h2h
Axb





Hacked sites: dropbox V.2

2014-10-22 Thread Axb



uri    AXB_URI_MLW_DROPBOX  /\/dropbox\/doc\.php$/
score  AXB_URI_MLW_DROPBOX  25.0


this rule will probably lose its teeth pretty fast

enjoy
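
A minimal sketch of what the pattern in the rule above does and does not catch, assuming Python's re module behaves like SA's Perl regex for this simple pattern. The sample URLs are made up for illustration. Because the pattern is anchored at the end of the URI, a query string or a renamed script slips past it, which may be one way such a rule loses its teeth.

```python
import re

# Same pattern as the AXB_URI_MLW_DROPBOX rule body, written as a Python regex.
DROPBOX_DOC = re.compile(r"/dropbox/doc\.php$")

# Hypothetical URLs, invented for this example only.
samples = [
    "http://hacked.example.com/wp-content/dropbox/doc.php",
    "http://hacked.example.com/dropbox/doc.php?id=1",
    "http://hacked.example.com/dropbox/document.php",
]

# Keep only the URLs the pattern would fire on.
hits = [url for url in samples if DROPBOX_DOC.search(url)]

for url in samples:
    print("HIT " if url in hits else "miss", url)
```

Only the first URL matches; the other two show the kind of small change that would need a new rule.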


Re: tag DKIMDOMAIN is still blocking action 0

2014-10-22 Thread Mark Martinec

Chris,


Ran some spam and ham through 'spamassassin -D -t' today mainly looking
to see if there were any mention of dns issues as I had reported
earlier. At the end of the run I see this whether it's ham or spam:

Oct 21 19:30:09.086 [31076] dbg: check: tagrun - tag DKIMDOMAIN is still blocking action 0


If a message does not contain a *valid* DKIM signature, then the
tag DKIMDOMAIN won't be set, so any rules that depend on this tag
will not be activated. So this is a normal situation for unsigned
or forged mail.

The rules in question are probably
  DKIMDOMAIN_IN_DWL and __DKIMDOMAIN_IN_DWL_ANY


20_dnsbl_tests.cf (wrapped for clarity):

  askdns   DKIMDOMAIN_IN_DWL
    _DKIMDOMAIN_._vouch.dwl.spamhaus.org TXT
    /^([a-z]+ )*(transaction|list|all)( [a-z]+)*$/

  askdns   __DKIMDOMAIN_IN_DWL_ANY
    _DKIMDOMAIN_._vouch.dwl.spamhaus.org TXT

So these rules would launch a DNS query against _vouch.dwl.spamhaus.org
if and only if a message would contain a valid DKIM signature.
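
As a rough illustration of the TXT pattern in DKIMDOMAIN_IN_DWL, here is a minimal sketch that runs the same regex over a few candidate answers. The sample strings are invented for the example and are not taken from real DWL responses; only the regex comes from the rule.

```python
import re

# Regex copied from the DKIMDOMAIN_IN_DWL rule (no /i flag, so case matters).
TXT_OK = re.compile(r"^([a-z]+ )*(transaction|list|all)( [a-z]+)*$")

# Hypothetical TXT answers, made up to exercise the pattern.
samples = ["all", "list transaction", "marketing list", "none", "ALL", ""]

# Map each candidate answer to whether the rule's pattern would accept it.
matches = {s: bool(TXT_OK.match(s)) for s in samples}

for s, ok in matches.items():
    print(repr(s), "matches" if ok else "does not match")
```

The pattern accepts a space-separated list of lowercase words as long as at least one of them is transaction, list or all.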

  Mark


Re: General rules for training bayes

2014-10-22 Thread Matus UHLAR - fantomas

On 21.10.14 21:29, Alex Regan wrote:
I'm having some trouble with my bayes database, and thought it would 
be a good time to just rebuild it. I'm wondering if anyone has any 
good suggestions for the type of mail that should be used for 
training.


be careful about forwarded mail, if possible. if you get a lot of spam from your
old account, it may start to classify ALL mail forwarded through that
account as spam.

I understand individually-crafted emails would make the best ham, 


crafted?


but do you train mail from mass-mailers?  Staples?  Facebook?  Banks?


why not? of course I train if I want such mail to be properly classified
later.

The main problem I'd like to avoid is the emails that are 
questionable as to whether they were opt-in and something the user 
actually wants, or those that are probably spam.


agreed.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I'm not interested in your website anymore.
If you need cookies, bake them yourself.


Re: General rules for training bayes

2014-10-22 Thread Benny Pedersen
On October 22, 2014 1:08:45 PM Matus UHLAR - fantomas uh...@fantomas.sk 
wrote:



be careful about forwarded mail, if possible. if you get many spam from your
old account, it may start to classify ALL mail forwarded through that


This is only correct if internal_networks and/or trusted_networks are not 
configured correctly


Re: General rules for training bayes

2014-10-22 Thread Reindl Harald


Am 22.10.2014 um 13:15 schrieb Benny Pedersen:

On October 22, 2014 1:08:45 PM Matus UHLAR - fantomas:


be careful about forwarded mail, if possible. if you get many spam
from your
old account, it may start to classify ALL mail forwarded through that


This only correct if internal networks and or trusted networks is not
configured correct


what has a forwarding from @gmx.net or so to do with trusted_networks?

the topic was about training bayes on the Received headers, with not a 
single word about internal hops







Re: General rules for training bayes

2014-10-22 Thread Matus UHLAR - fantomas

be careful about forwarded mail, if possible. if you get many spam from your
old account, it may start to classify ALL mail forwarded through that


On 22.10.14 13:15, Benny Pedersen wrote:
This only correct if internal networks and or trusted networks is not 
configured correct


oh, does BAYES take care about these?

we are still talking about manually feeding BAYES, aren't we?
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
M$ Win's are shit, do not use it !


Re: General rules for training bayes

2014-10-22 Thread RW
On Wed, 22 Oct 2014 13:30:44 +0200
Matus UHLAR - fantomas wrote:

 be careful about forwarded mail, if possible. if you get many spam
 from your old account, it may start to classify ALL mail forwarded
 through that
 
 On 22.10.14 13:15, Benny Pedersen wrote:
 This only correct if internal networks and or trusted networks is
 not configured correct
 
 oh, does BAYES take care about these?

To a limited extent. It affects the contents of some metadata, but
I don't think it affects which headers are tokenized.


Re: General rules for training bayes

2014-10-22 Thread Benny Pedersen
On October 22, 2014 1:30:44 PM Matus UHLAR - fantomas uh...@fantomas.sk 
wrote:



oh, does BAYES take care about these?
we are still talking about manually feeding BAYES, aren't we?


Sorry, yes, bayes can ignore all headers if one doesn't want it to track 
origin senders or IPs


Re: General rules for training bayes

2014-10-22 Thread Reindl Harald


Am 22.10.2014 um 14:30 schrieb Benny Pedersen:

On October 22, 2014 1:30:44 PM Matus UHLAR - fantomas
uh...@fantomas.sk wrote:


oh, does BAYES take care about these?
we are still talking about manually feeding BAYES, aren't we?


Sorry, yes bayes can be ignore all headers if one dont like it to track
origin senders or ips


again: what has that to do with trusted_networks?

back to what you said above: it doesn't by default, so your response 
was completely OT, as was yesterday's "the focus should just be reversed 
to allow IP ranges, not deny IP ranges" in the context of fail2ban


if you want to do that, just remove fail2ban and open your ports only for 
specific IPs and you are done - but please try to stay on topic







Re: General rules for training bayes

2014-10-22 Thread Matus UHLAR - fantomas

On October 22, 2014 1:30:44 PM Matus UHLAR - fantomas
uh...@fantomas.sk wrote:

oh, does BAYES take care about these?
we are still talking about manually feeding BAYES, aren't we?



Am 22.10.2014 um 14:30 schrieb Benny Pedersen:

Sorry, yes bayes can be ignore all headers if one dont like it to track
origin senders or ips


On 22.10.14 14:44, Reindl Harald wrote:

again: what has that to do with trusted_networks?


seems that Benny just missed the fact that we are talking about BAYES.
I think it's clear now...

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
The 3 biggets disasters: Hiroshima 45, Tschernobyl 86, Windows 95


Re: General rules for training bayes

2014-10-22 Thread Benny Pedersen
On October 22, 2014 3:05:56 PM Matus UHLAR - fantomas uh...@fantomas.sk 
wrote:



On October 22, 2014 1:30:44 PM Matus UHLAR - fantomas
uh...@fantomas.sk wrote:
oh, does BAYES take care about these?
we are still talking about manually feeding BAYES, aren't we?

Am 22.10.2014 um 14:30 schrieb Benny Pedersen:
Sorry, yes bayes can be ignore all headers if one dont like it to track
origin senders or ips

On 22.10.14 14:44, Reindl Harald wrote:
again: what has that to do with trusted_networks?

seems that Benny just missed the fact that we are talking about BAYES.
I think it's clear now...


It's all independent but related


Re: General rules for training bayes

2014-10-22 Thread RW
On Wed, 22 Oct 2014 14:44:24 +0200
Reindl Harald wrote:

 
 Am 22.10.2014 um 14:30 schrieb Benny Pedersen:
  On October 22, 2014 1:30:44 PM Matus UHLAR - fantomas
  uh...@fantomas.sk wrote:
 
  oh, does BAYES take care about these?
  we are still talking about manually feeding BAYES, aren't we?
 
  Sorry, yes bayes can be ignore all headers if one dont like it to
  track origin senders or ips
 
 again: what has that to do with trusted_networks?

His original point was not irrelevant. Trusted and internal networks
settings affect how the Received headers are normalized into
metadata. Extending the trusted networks doesn't eliminate
tokens from irrelevant headers added in the trusted path, but it does
cause them to produce separate tokens. 


Re: Hacked sites: dropbox V.2

2014-10-22 Thread Reindl Harald


Am 22.10.2014 um 11:47 schrieb Axb:

uri    AXB_URI_MLW_DROPBOX  /\/dropbox\/doc\.php$/
score  AXB_URI_MLW_DROPBOX  25.0



this rule will probably lose its teeth pretty fast


thanks, the same applies to googlebox

uri   RH_URI_MLW_GOOGLEBOX1 /\/googlebox\/document\.php$/
score RH_URI_MLW_GOOGLEBOX1 100
uri   RH_URI_MLW_GOOGLEBOX2 /\/googlebox\/doc\.php$/
score RH_URI_MLW_GOOGLEBOX2 100





Re: tag DKIMDOMAIN is still blocking action 0

2014-10-22 Thread Chris
On Wed, 2014-10-22 at 12:25 +0200, Mark Martinec wrote:
 Chris,
 
  Ran some spam and ham through 'spamassassin -D -t' today mainly looking
  to see if there were any mention of dns issues as I had reported
  earlier. At the end of the run I see this whether it's ham or spam:
  
  Oct 21 19:30:09.086 [31076] dbg: check: tagrun - tag DKIMDOMAIN is 
  still
  blocking action 0
 
 If a message does not contain a *valid* DKIM signature, then the
 tag DKIMDOMAIN won't be set, so any rules that depend on this tag
 will not be activated. So this is a normal situation for unsigned
 or forged mail.
 
 The rules in question are probably
DKIMDOMAIN_IN_DWL and __DKIMDOMAIN_IN_DWL_ANY
 
 
 20_dnsbl_tests.cf : (wrapped for clarity):
 
askdns   DKIMDOMAIN_IN_DWL
  _DKIMDOMAIN_._vouch.dwl.spamhaus.org TXT
   /^([a-z]+ )*(transaction|list|all)( [a-z]+)*$/
 
askdns   __DKIMDOMAIN_IN_DWL_ANY
  _DKIMDOMAIN_._vouch.dwl.spamhaus.org TXT
 
 So these rules would launch a DNS query against _vouch.dwl.spamhaus.org
 if and only if a message would contain a valid DKIM signature.
 
Mark

Got it, thanks Mark

-- 
Chris
KeyID 0xE372A7DA98E6705C
31.11°N 97.89°W (Elev. 1092 ft)
09:12:18 up 1 day, 13:45, 1 user, load average: 0.24, 0.32, 0.31
Ubuntu 14.04.1 LTS, kernel 3.13.0-38-generic



SOUGHT 2.0 ?

2014-10-22 Thread Axb


As most have probably noticed, the SOUGHT rules are not being published/ 
updated anymore. (You can shut down your updates.)

The reasons for this are beyond this msg.

An option was to run such a project under the Apache umbrella but it 
makes it a VERY complicated process.


Thanks to Justin Mason, who helped me a lot getting the bits and pieces 
glued together, I've been successfully autogenerating rules for $dayjob 
on a pretty regular basis.


It would be nice to be able to use this experience to replace the SOUGHT 
rules for everyone BUT:


- it's too much of a project for me to run on my own.

- I don't really want to be the single point of failure.

- there's need of certain bits (DNS, CDN, etc) which are either above me 
or outside my interest.


- my spam data isn't enough to publish rules of global relevance

Question is: Can we (the SA users) get a project together with enough 
members taking care of different tasks to ensure that the project 
doesn't die when one person decides to step out?


What I think would be required:

- a project coordinator in charge of banging on tables and documenting 
the processes (I don't want this task).


- a closed mailing list for project members.

- lots more trap data / domains

- a DNS admin with a highly available DNS system to publish the update 
records.


- a packager which takes care of signing, and passing over updates to 
the CDN,etc.


- a CDN system for the sa-update clients to pickup the updates

I can offer 16 cores of fat iron to handle trap/spam/ham data and to 
run/babysit the rule generation process.


There's no way I can or want to handle all this on my own.

If you're interested, find you could cover one or more tasks, and are willing 
to provide long-term commitment, please speak up.


This is initial brainstorming...

Please post comments, etc on the SA list till we have a closed list.

And if you think I'm nuts.. I agree...

Axb





Re: tag DKIMDOMAIN is still blocking action 0

2014-10-22 Thread Chris
On Wed, 2014-10-22 at 12:25 +0200, Mark Martinec wrote:
 Chris,
 
  Ran some spam and ham through 'spamassassin -D -t' today mainly looking
  to see if there were any mention of dns issues as I had reported
  earlier. At the end of the run I see this whether it's ham or spam:
  
  Oct 21 19:30:09.086 [31076] dbg: check: tagrun - tag DKIMDOMAIN is 
  still
  blocking action 0
 
 If a message does not contain a *valid* DKIM signature, then the
 tag DKIMDOMAIN won't be set, so any rules that depend on this tag
 will not be activated. So this is a normal situation for unsigned
 or forged mail.
 
 The rules in question are probably
DKIMDOMAIN_IN_DWL and __DKIMDOMAIN_IN_DWL_ANY
 
 
 20_dnsbl_tests.cf : (wrapped for clarity):
 
askdns   DKIMDOMAIN_IN_DWL
  _DKIMDOMAIN_._vouch.dwl.spamhaus.org TXT
   /^([a-z]+ )*(transaction|list|all)( [a-z]+)*$/
 
askdns   __DKIMDOMAIN_IN_DWL_ANY
  _DKIMDOMAIN_._vouch.dwl.spamhaus.org TXT
 
 So these rules would launch a DNS query against _vouch.dwl.spamhaus.org
 if and only if a message would contain a valid DKIM signature.
 
Mark
Mark, now I'm confused. As you can see the 'action 0 .' takes place
before the DKIM lookup

Oct 22 09:16:14.220 [8459] dbg: check: tagrun - action 0 blocking on
tags DKIMDOMAIN
Oct 22 09:16:14.477 [8459] dbg: dkim: using Mail::DKIM version 0.4
Oct 22 09:16:14.482 [8459] dbg: dkim: performing public key lookup and
signature verification
Oct 22 09:16:14.623 [8459] dbg: dkim: DKIM, i=@shop.identitydirect.com,
d=shop.identitydirect.com, s=neolane, a=rsa-sha256, c=relaxed/relaxed,
fail, matches author domain
Oct 22 09:16:14.623 [8459] dbg: dkim: DK,
i=en...@shop.identitydirect.com, d=shop.identitydirect.com, s=neolane,
a=rsa-sha1, c=nofws, fail, matches author domain
Oct 22 09:16:14.623 [8459] dbg: dkim: signature verification result:
FAIL (BODY HAS BEEN ALTERED)
Oct 22 09:16:14.623 [8459] dbg: dkim: FAILED signature by
shop.identitydirect.com, author en...@shop.identitydirect.com, no valid
matches
Oct 22 09:16:14.624 [8459] dbg: dkim: FAILED signature by
shop.identitydirect.com, author en...@shop.identitydirect.com, no valid
matches
Oct 22 09:16:14.624 [8459] dbg: dkim: author
en...@shop.identitydirect.com, not in any dkim whitelist

The tests show that the DKIM test failed yet the SA headers show AFAICT
it's good.

,DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1,DKIM_VALID_AU=-0.1

And of course at the end it shows

Oct 22 09:16:15.611 [8459] dbg: check: tagrun - tag DKIMDOMAIN is still
blocking action 0

So I guess I'm just a confused old man but something doesn't make sense
or I'm just not getting it and I need a 2X4 about the head.

Chris



-- 
Chris
KeyID 0xE372A7DA98E6705C
31.11°N 97.89°W (Elev. 1092 ft)
09:19:18 up 1 day, 13:52, 2 users, load average: 0.39, 0.28, 0.29
Ubuntu 14.04.1 LTS, kernel 3.13.0-38-generic



Re: tag DKIMDOMAIN is still blocking action 0

2014-10-22 Thread Mark Martinec

Chris,


Mark, now I'm confused. As you can see the 'action 0 .' takes place
before the DKIM lookup

Oct 22 09:16:14.220 [8459] dbg: check: tagrun - action 0 blocking on
tags DKIMDOMAIN


Yes, that's normal. It happens immediately after basic information
has been extracted from a mail header ('extract_metadata' plugins hook).

In case of this dependency on DKIMDOMAIN it is a direct consequence
of having rules DKIMDOMAIN_IN_DWL and __DKIMDOMAIN_IN_DWL_ANY,
regardless of the actual message.

The 'check: tagrun - action ... blocking on ...' log message
just says that a callback routine has been provided, which is
to be called at some point later if/when a tag value for a
tag DKIMDOMAIN becomes available.


Oct 22 09:16:14.623 [8459] dbg: dkim: signature verification result:
FAIL (BODY HAS BEEN ALTERED)
Oct 22 09:16:14.623 [8459] dbg: dkim: FAILED signature by
shop.identitydirect.com, author en...@shop.identitydirect.com, no valid
matches
Oct 22 09:16:14.624 [8459] dbg: dkim: FAILED signature by
shop.identitydirect.com, author en...@shop.identitydirect.com, no valid
matches
Oct 22 09:16:14.624 [8459] dbg: dkim: author
en...@shop.identitydirect.com, not in any dkim whitelist

The tests show that the DKIM test failed



Right. So the DKIM signature was not valid, so the tag
DKIMDOMAIN never got its value assigned, so a callback routine
attached to DKIMDOMAIN tag was never called, which yields the:

  dbg: check: tagrun - tag DKIMDOMAIN is still blocking action 0

at the end of mail processing. Normal.
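
The callback behaviour described above can be sketched as a toy model. This is only an illustration of the idea, not SpamAssassin's code: the class TagRunner, its method names and the run() helper are all hypothetical, and the log strings merely imitate the debug lines quoted in this thread.

```python
class TagRunner:
    """Toy model: run an action once all the tags it needs have values."""

    def __init__(self):
        self.tags = {}       # tag name -> value
        self.blocked = {}    # action id -> (set of missing tags, callback)
        self.next_id = 0
        self.log = []

    def depends_on(self, tag_names, callback):
        """Register a callback; it stays blocked until every tag is set."""
        action_id = self.next_id
        self.next_id += 1
        missing = {t for t in tag_names if t not in self.tags}
        if not missing:
            callback(self.tags)
            return action_id
        self.blocked[action_id] = (missing, callback)
        self.log.append(
            f"action {action_id} blocking on tags {' '.join(sorted(missing))}"
        )
        return action_id

    def set_tag(self, name, value):
        """Assign a tag value and fire any action that is now unblocked."""
        self.tags[name] = value
        for action_id in list(self.blocked):
            missing, callback = self.blocked[action_id]
            missing.discard(name)
            if not missing:
                del self.blocked[action_id]
                callback(self.tags)

    def finish(self):
        """At end of processing, report actions whose tags never arrived."""
        for action_id, (missing, _) in self.blocked.items():
            for tag in sorted(missing):
                self.log.append(f"tag {tag} is still blocking action {action_id}")


def run(dkim_domain):
    """Process one message; dkim_domain is None when no valid signature exists."""
    queries = []
    runner = TagRunner()
    runner.depends_on(
        ["DKIMDOMAIN"],
        lambda tags: queries.append(tags["DKIMDOMAIN"] + "._vouch.dwl.spamhaus.org"),
    )
    if dkim_domain is not None:
        runner.set_tag("DKIMDOMAIN", dkim_domain)
    runner.finish()
    return queries, runner.log


unsigned = run(None)            # no valid signature: the tag is never set
signed = run("example.com")     # valid signature: the callback fires once
print(unsigned)
print(signed)
```

In the unsigned case the log ends with the "still blocking" line and no query is built; in the signed case the query name is built and nothing is left blocked.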



yet the SA headers show AFAICT it's good.

,DKIM_SIGNED=0.1, DKIM_VALID=-0.1,DKIM_VALID_AU=-0.1


You are probably looking at the report header from a previous
spamassassin run. The message that you provided in your test run
was somehow clobbered, invalidating a DKIM signature.

  Mark


Re: General rules for training bayes

2014-10-22 Thread Alex Regan

Hi,


I'm having some trouble with my bayes database, and thought it would
be a good time to just rebuild it. I'm wondering if anyone has any
good suggestions for the type of mail that should be used for training.


be careful about forwarded mail, if possible. if you get many spam from
your
old account, it may start to classify ALL mail forwarded through that
account as spam.


After reading the rest of the comments on this, the point is to make 
sure trusted_networks is properly configured, correct? That's been done 
for me long ago.



I understand individually-crafted emails would make the best ham,


crafted?


I meant specifically business-related email. Correspondence between 
co-workers, clients/customers, etc, as the main focus of what should be 
bayes00.



but do you train mail from mass-mailers?  Staples?  Facebook?  Banks?


why not? of course I train if I want such mail to be properly classified
later.


The problem I've had with doing this is that it's often so difficult to 
determine which bulk message should be considered ham and which were 
not. This would somewhat raise the burden on the sender instead of 
automatically giving them a -1.9 pass.


Thanks,
Alex



Re: .link TLD spammer haven?

2014-10-22 Thread Jesse Stroik
I noticed URLs from the TLD .link aren't properly classified on my mail 
server. I wrote a simple URI rule to recognize that TLD which never 
matched. I wrote a similar body rule, which did properly match. 
Interestingly, I do see DNS queries going out for the URLs in question.


This is sa 3.3.2-4 -- is it a known issue? The URL in question is on a 
single line and is easily pulled out with egrep and properly parsed with 
the body rule.


Best,
Jesse Stroik


On 10/13/2014 2:53 PM, Dave Funk wrote:

On Mon, 13 Oct 2014, Philip Prindeville wrote:


Every connection I’ve gotten from a hostname resolving to *.link or
saying helo *.link has been spam (I block the connections with
MIMEDefang).

Has anyone actually seen a legitimate email from a host in the .link TLD?

I’ve seen (last week alone):

bgo.blc-onlineconsumer140.link
ratio.allgiftcardsonlinefriendly.link
ratio.autodealersstarted.link

[snip..]


Is it worth having a rule that triggers on the relay’s hostname being *.link?

Also, I noticed that every message we saw was missing a Received: header…

-Philip


I'll second that and add a similar comment about .link URLs inside the
message. Last week I created a uri rule to fire on any .link hosted URL
and so far haven't seen a single FP.



Re: .link TLD spammer haven?

2014-10-22 Thread Ken Bass


On 10/22/2014 2:40 PM, Jesse Stroik wrote:
I noticed URLs from the TLD .link aren't properly classified on my 
mail server. I wrote a simple URI rule to recognize that TLD which 
never matched. I wrote a similar body rule, which did properly match. 
Interestingly, I do see DNS queries going out for the URLs in question.


This is sa 3.3.2-4 -- is it a known issue? The URL in question is on a 
single line and is easily pulled out with egrep and properly parsed 
with the body rule.




3.3.2 does not work with TLDs that are not hardcoded into the software. 
I signed up on this list last week with the same complaint (.link and 
.website are the latest spam havens).
Apparently even 3.4 does not address this yet, but it is being addressed in 
the future. Since I use CentOS 7, which ships with 3.3.2, it creates a 
problem for me, meaning unless it's backported, I'm kinda stuck.


What is a bit frustrating is that the URI rules will work for emails 
that are HTML encoded, but not for plain text emails. So I was pulling 
my hair out trying to figure out why my rules were working sometimes and 
not others.


Re: .link TLD spammer haven?

2014-10-22 Thread Joolee
You can try replacing your RegistrarBoundaries.pm file with the one from
trunk. It should be kept up-to-date with the latest TLD craze. As far as I
know, it hasn't been tested with 3.3.2 but should work nonetheless.

http://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/Util/RegistrarBoundaries.pm?revision=1633582&view=markup

Kind regards,
Peter Overtoom

On 22 October 2014 20:46, Ken Bass kb...@kenbass.com wrote:


 On 10/22/2014 2:40 PM, Jesse Stroik wrote:

 I noticed URLs from the TLD .link aren't properly classified on my mail
 server. I wrote a simple URI rule to recognize that TLD which never
 matched. I wrote a similar body rule, which did properly match.
 Interestingly, I do see DNS queries going out for the URLs in question.

 This is sa 3.3.2-4 -- is it a known issue? The URL in question is on a
 single line and is easily pulled out with egrep and properly parsed with
 the body rule.


 3.3.2 does not work with tlds that are not hardcoded into the software. I
 signed up on this list last week with the same complaint (.link and
 .website) are the latest spam havens.
 Apparently even 3.4 does not address this yet, but is being address in the
 future. Since I use Centos 7 which ships with 3.3.2, it creates a problem
 for me, meaning unless backported, I'm kinda stuck.

 What is a bit frustrating is that the URI rules will work for emails that
 are HTML encoded, but not for plain text emails. So I was pulling my hair
 out trying to figure out why my rules were working sometimes and not others.



Re: .link TLD spammer haven?

2014-10-22 Thread Martin Gregorie
On Wed, 2014-10-22 at 13:40 -0500, Jesse Stroik wrote:
 I noticed URLs from the TLD .link aren't properly classified on my mail 
 server. I wrote a simple URI rule to recognize that TLD which never 
 matched. I wrote a similar body rule, which did properly match. 
 Interestingly, I do see DNS queries going out for the URLs in question.
 
 This is sa 3.3.2-4 -- is it a known issue? The URL in question is on a 
 single line and is easily pulled out with egrep and properly parsed with 
 the body rule.
 
As others have already said, URI body rules use a list of valid TLDs to
help with recognising URIs embedded in body text and this list is
currently hardcoded into SA. 

However, this doesn't affect any rules you might write to match domain
names in headers, so rules that use a regex to look for .link domains
in, for instance, Received or Reply-to headers will work as you'd expect
them to. So, if you don't want to mess around with replacing the
RegistrarBoundaries.pm file in your installation, you may care to write
a few rules that work with the headers and use them until a version of
SA with a configurable TLD list is released. I'm currently using this
meta-rule:

describe MG_LINK_TLD Messages from or containing a URL with the .link TLD
uri      __MG_LTD1   /\.link/i
header   __MG_LTD2   From =~ /\.link/
header   __MG_LTD3   Received =~ /from.*\.link/
header   __MG_LTD4   Return-Path =~ /\.link/
meta     MG_LINK_TLD (__MG_LTD1 || __MG_LTD2 || __MG_LTD3 || __MG_LTD4)
score    MG_LINK_TLD 7.5

which I've tested fairly carefully. All the subrules except __MG_LTD1
work exactly as I wanted them to. I can live with __MG_LTD1 not working
until either a current SA maintenance version is released with an
extended list of hardcoded TLDs or a version using a configurable list
appears.
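
One thing that may be worth checking with header patterns like these is that an unanchored /\.link/ also matches hostnames that merely start with "link" after a dot. A minimal sketch, assuming Python's re module as a stand-in for SA's Perl regex; the TIGHT pattern is a hypothetical variant and the mail addresses are made up (only the .link hostname appears earlier in this thread).

```python
import re

LOOSE = re.compile(r"\.link")      # pattern used by __MG_LTD2 and __MG_LTD4
TIGHT = re.compile(r"\.link\b")    # hypothetical variant with a word boundary

# Sample From headers; the addresses are invented for illustration.
headers = [
    "From: Offers <deals@ratio.autodealersstarted.link>",
    "From: LinkedIn <messages-noreply@linkedin.com>",
    "From: Updates <news@e.linkedin.com>",
]

# Collect the headers each pattern would fire on.
loose_hits = [h for h in headers if LOOSE.search(h)]
tight_hits = [h for h in headers if TIGHT.search(h)]

print("loose:", loose_hits)
print("tight:", tight_hits)
```

The loose pattern also fires on the e.linkedin.com sender, while the word-boundary variant only fires on the .link hostname. Whether that matters depends on the mail you actually receive.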

HTH

Martin








spamc causing Duplicate emails

2014-10-22 Thread LuKreme
I am seeing duplicate emails when saved off into my Maildirs. My normal mail 
application ignores these duplicates, but iOS 8 does not, so I need to figure 
out what's going on.


 1412808979.M904650P22299.mail.covisp.net,S=65189,W=66526:2,S
 1412808979.M904651P22299.mail.covisp.net,S=65197,W=66534:2,S

 $ diff 1412808979.M904651P22299.mail.covisp.net\,S\=65197\,W\=66534\:2\,S 
1412808979.M904650P22299.mail.covisp.net\,S\=65189\,W\=66526\:2\,S 
7c7
   RP_MATCHES_RCVD,SPF_HELO_PASS,SPF_PASS,URIBL_GREY autolearn=unavailable
---
   RP_MATCHES_RCVD,SPF_HELO_PASS,SPF_PASS,URIBL_GREY autolearn=ham
9a10,11
   *  0.4 URIBL_GREY Contains an URL listed in the URIBL greylist
   *  [URIs: mailchimp.com]
13,14d14
   *  0.4 URIBL_GREY Contains an URL listed in the URIBL greylist
   *  [URIs: mailchimp.com]

Does this indicate that it's spamassassin that is somehow creating a duplicate?

There are no 'c' flags in my procmailrc:

$ grep :0 .procmailrc 
:0
:0fw
:0E
:0
:0
:0
:0 hf
:0 fw
:0
:0 
  :0
  :0
  :0
:0
   :0
   :0
   :0

Looking through my mailspool it looks like this started Sep 25, but I last 
updated SA on 30 August.

my local.cf is (no comments)
allow_user_rules 1
rewrite_header Subject (Spam? _SCORE(0)_)
report_safe 1
add_header all Report _REPORT_
report_contact ad...@covisp.net
trusted_networks 75.148.37.66
trusted_networks 75.148.37.67
trusted_networks 75.148.37.68
trusted_networks 75.148.37.69
lock_method flock
required_score 5.0
use_bayes 1
bayes_auto_learn 1
bayes_store_module Mail::SpamAssassin::BayesStore::SQL
bayes_sql_dsn DBI:mysql:bayes:localhost:3306
bayes_sql_username bayesuser
bayes_sql_password 1vJWe4ms0a23EGRpM
bayes_sql_override_username bayesuser
score DKIM_ADSP_CUSTOM_HIGH 10
score DKIM_ADSP_DISCARD 5
score DKIM_ADSP_ALL 3
 ... a bunch of ads overrides ... 
bayes_ignore_header X-Spam-Flag
bayes_ignore_header X-Spam-Status
score HABEAS_ACCREDITED_COI 0.1
score HABEAS_ACCREDITED_SOI 0.5
score HABEAS_CHECKED 0
score BAYES_99 4.0
score BAYES_95 2.5
score BAYES_80 2
score BAYES_60 1.00
score BAYES_50 0.50
score BAYES_40 -0.50
score BAYES_20 -2.50
score BAYES_05 -3.50
score BAYES_00 -4.00
score USER_IN_DEF_DKIM_WL -0.3
score DKIM_VERIFIED -0.1
score DKIM_SIGNED 0.1
score URIBL_DBL_SPAM 3.1
score DCC_CHECK 2.0
rawbody LOCAL_U_UNESCAPE /[+=(]\s*unescape\s*\(\s*[']%(6[1-9A-F]|7[0-9A])/
describe LOCAL_U_UNESCAPE Suspicious use of JS unescape function
score LOCAL_U_UNESCAPE 2.8
rawbody LOCAL_U_STRCONCAT /[+=(]\s*(['])[a-zA-Z0-9\.]{1,16}\1 ?\+?\1[a-zA-Z0-9\.]{0,16}\1/
describe LOCAL_U_STRCONCAT Suspicious unnecessary string concatenation
score LOCAL_U_STRCONCAT 2.7
rawbody LOCAL_HIDE_FROMCHARCODE /=\s*String\.fromCharCode\b/
describe LOCAL_HIDE_FROMCHARCODE Obfuscated used of JS fromCharCode function
score LOCAL_HIDE_FROMCHARCODE 0.6
rawbody LOCAL_HIDE_URL /[+=(]\s*(['])(?!http)h(\1 ?\+ ?\1)?t(\1 ?\+?\1)?t(\1 ?\+ ?\1)?p(\1 ?\+ ?\1)?(?!:\/\/):(\1 ?\+ ?\1)?\/(\1 ?\+ ?\1)?\//
describe LOCAL_HIDE_URL Obfuscated HTTP link eg. 'ht'+'tp:'+'//'
score LOCAL_HIDE_URL 1.9
rawbody LOCAL_JS_REDIR1 /[Ss][Cc][Rr][Ii][Pp][Tt]\s*(type=[^]+\s*)?\s*(window|self|(var\s+)?([a-z]+)\s*=\s*window\s*;?\s*\4)?\.?(location|\[[']location[']\])(\.href)?\s*[=(]/
describe LOCAL_JS_REDIR1 Code for a JS redirect
score LOCAL_JS_REDIR1 0.5
body LOCAL_FILLER_TEXT /([A-Z][a-z]*(\s[a-z]+){4,6}\.?\s?){18}/
describe LOCAL_FILLER_TEXT Long sequence of 5-7 word sentences with capital only at start
score LOCAL_FILLER_TEXT 1.4
score RP_MATCHES_RCVD -0.1
score RCVD_IN_BRBL_LASTEXT 2.7
score DCC_CHECK 3.0
report BAYES_HT _HAMMYTOKENS(50)_
report BAYES_ST _SPAMMYTOKENS(50)_
... a bunch of blacklist_from ...

spamassassin -D --lint output is very long


-- 
ALL WORK AND NO PLAY MAKES BART A DULL BOY ALL WORK AND NO PLAY MAKES
BART A DULL BOY ALL WORK AND NO PLAY MAKES BART A DULL BOY Bart
chalkboard Ep. 1F07



Re: spamc causing Duplicate emails

2014-10-22 Thread John Hardin

On Wed, 22 Oct 2014, LuKreme wrote:

I am seeing duplicate emails when saved off into my Maildirs. My normal 
mail application ignores these duplicates, but iOS 8 does not, so I need 
to figure out what's going on.


1412808979.M904650P22299.mail.covisp.net,S=65189,W=66526:2,S 
1412808979.M904651P22299.mail.covisp.net,S=65197,W=66534:2,S


How separated in time are the two message files?

Do you have any kind of procmail logging turned on?

Are all messages duplicated, or only some?

Is the message addressed to you and also to an alias that also resolves to 
you, or something else that would cause the system to duplicate the 
message upstream of procmail?


Does this indicate that it's spamassassin that is somehow creating a 
duplicate?


Doubtful. SA only scores and may rewrite the headers a bit. It's vaguely 
possible that the glue is doing it somehow. Is procmail your glue, or 
something else upstream (a milter or some such)?


--
 John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
 jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  ...in the 2nd amendment the right to arms clause means you have
  the right to choose how many arms you want, and the militia clause
  means that Congress can punish you if the answer is none.
-- David Hardy, 2nd Amendment scholar
---
 874 days since the first successful private support mission to ISS (SpaceX)


Re: spamc causing Duplicate emails

2014-10-22 Thread LuKreme

 On 22 Oct 2014, at 19:38 , John Hardin jhar...@impsec.org wrote:
 
 On Wed, 22 Oct 2014, LuKreme wrote:
 
 I am seeing duplicate emails when saved off into my Maildirs. My normal mail 
 application ignores these duplicates, but iOS 8 does not, so I need to 
 figure out what's going on.
 
 1412808979.M904650P22299.mail.covisp.net,S=65189,W=66526:2,S 
 1412808979.M904651P22299.mail.covisp.net,S=65197,W=66534:2,S
 
 How separated in time are the two message files?

They aren't. The first blob is the epoch timestamp, so they are in the same 
second.
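
A minimal sketch of pulling the two filenames apart, assuming the usual Maildir naming convention (seconds since the epoch, then a delivery identifier in which M is commonly a microsecond value and P a process id, then the hostname). The parse() helper and its field names are made up for this example.

```python
from datetime import datetime, timezone

names = [
    "1412808979.M904650P22299.mail.covisp.net,S=65189,W=66526:2,S",
    "1412808979.M904651P22299.mail.covisp.net,S=65197,W=66534:2,S",
]


def parse(name):
    """Split a Maildir filename into its parts (hypothetical helper)."""
    base, _, flags = name.partition(":2,")        # flags follow ':2,'
    unique, *attrs = base.split(",")              # S= and W= size attributes
    seconds, delivery_id, host = unique.split(".", 2)
    return {
        "time": datetime.fromtimestamp(int(seconds), tz=timezone.utc),
        "delivery_id": delivery_id,
        "host": host,
        "attrs": dict(a.split("=", 1) for a in attrs),
        "flags": flags,
    }


first, second = (parse(n) for n in names)
same_second = first["time"] == second["time"]

print(first["time"].isoformat(), first["delivery_id"], first["attrs"])
print(second["time"].isoformat(), second["delivery_id"], second["attrs"])
print("same second:", same_second)
```

Both names decode to the same second and carry the same P value, with the M value differing by one, which would be consistent with a single delivering process writing the message twice.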

 Do you have any kind of procmail logging turned on?

Yes. All I see is that when the message comes in to my procmailrc, it comes in 
twice, so the duplication is happening upstream (which probably means dovecot, 
but it looked like spamc initially, so I posted here first).

 Are all messages duplicated, or only some?

All of them across multiple accounts.

 Is the message addressed to you and also to an alias that also resolves to 
 you, or something else that would cause the system to duplicate the message 
 upstream of procmail?
 
 Does this indicate that it's spamassassin that is somehow creating a 
 duplicate?
 
 Doubtful. SA only scores and may rewrite the headers a bit. It's vaguely 
 possible that the glue is doing it somehow. Is procmail your glue, or 
 something else upstream (a milter or some such)?

The more I look at it, the more it looks like it must be dovecot somehow.

Thanks, the questions help me focus on what is really happening.

-- 
Let the Wookiee win.



Re: spamc causing Duplicate emails

2014-10-22 Thread John Hardin

On Wed, 22 Oct 2014, LuKreme wrote:


Thanks, the questions help me focus on what is really happening.


Happy to help.

--
 John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
 jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Windows Genuine Advantage (WGA) means that now you use your
  computer at the sufferance of Microsoft Corporation. They can
  kill it remotely without your consent at any time for any reason;
  it also shuts down in sympathy when the servers at Microsoft crash.
---
 874 days since the first successful private support mission to ISS (SpaceX)


Re: spamc causing Duplicate emails

2014-10-22 Thread LuKreme

 On 22 Oct 2014, at 20:39 , John Hardin jhar...@impsec.org wrote:
 
 On Wed, 22 Oct 2014, LuKreme wrote:
 
 Thanks, the questions help me focus on what is really happening.
 
 Happy to help.

Aha. It was procmail, but it was /usr/local/etc/procmailrc

:0c
/backups/imap.backups

if that FAILS, the duplicate message falls through, and that folder was moved 
but procmailrc was not updated. doh!


-- 
...but the senator, while insisting he was not intoxicated, could not
explain his nudity.