Re: relay not detected

2016-11-21 Thread Pedro David Marco
Thanks Bill,
> I don't know why some spammers do this sort of lame 
> Received fakery, since it fingerprints their mail as spam, but it has 
> been a fairly common practice for a long time.
But SA did not trigger any rule about the forgery, and debug mode does not 
show any message about unparseable lines. The header seems to be silently ignored, so the relay 
remains unchecked against RBLs.
-Pedro.




   

Re: relay not detected

2016-11-21 Thread Bill Cole

On 21 Nov 2016, at 17:54, Pedro David Marco wrote:


Hi,
I have spam emails with a Received line like this:
Received: by 9-30-239-23.uocdn.net (Postfix) with ESMTPSA id 693A0C56B 
with (unknown [158.69.130.12]) ; Sun, 20 Nov 2016 21:06:55 -0300
There is no Perl parsing code for lines like this in the Received.pm 
module, so the relay 158.69.130.12 is never checked.

Is this normal? 


Yes.  Why would anyone want SA to attempt to parse an intentionally 
deceptive Received header?


Unadulterated Postfix does not now generate (and never has generated) 
Received headers like that. The queue id is too short and the header 
would start with 'from' not 'by' if it was actually Postfix generating 
it as claimed. That looks like some moron spammer tried to weld together 
the 2-part mutant qmail Received format and label it as Postfix for 
obfuscation. I don't know why some spammers do this sort of lame 
Received fakery, since it fingerprints their mail as spam, but it has 
been a fairly common practice for a long time.
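
For illustration only: the 'by'-instead-of-'from' tell mentioned above, and the relay IP hidden in the trailing parenthetical, can both be picked out mechanically. The sketch below is not the logic in Received.pm; the helper names and the DNSBL zone `dnsbl.example.org` are hypothetical placeholders. It only shows how the bracketed IP could be extracted and turned into a DNSBL query name.

```python
import re

# Hypothetical helpers for illustration; this is not code from Received.pm.
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def starts_with_by(received_value: str) -> bool:
    """True if the header body begins with 'by' instead of 'from'."""
    return received_value.lstrip().lower().startswith("by ")


def bracketed_ips(received_value: str) -> list[str]:
    """Return every [a.b.c.d] literal found anywhere in the header body."""
    return BRACKETED_IPV4.findall(received_value)


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Reverse the octets and append the zone, as DNSBL lookups expect."""
    return ".".join(reversed(ip.split("."))) + "." + zone


# Header body copied from the message under discussion.
header = (
    "by 9-30-239-23.uocdn.net (Postfix) with ESMTPSA id 693A0C56B "
    "with (unknown [158.69.130.12]) ; Sun, 20 Nov 2016 21:06:55 -0300"
)
ips = bracketed_ips(header)
# 'dnsbl.example.org' is a placeholder zone, not a real blocklist.
queries = [dnsbl_query_name(ip, "dnsbl.example.org") for ip in ips]
```

Whether an IP taken from a header that is known to be forged deserves a lookup at all is a separate policy question; the sketch only covers the mechanics.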


Re: Bayes scoring and role accounts

2016-11-21 Thread Joe Quinn

On 11/21/2016 11:27 AM, Karl Denninger wrote:


On 11/21/2016 10:12, Karl Denninger wrote:
I'm using SpamAssassin on a system that uses Postfix as the MTA and 
Dovecot for handling final delivery.  SpamAssassin is being called 
via Postfix through spamd with:


#
# Spam Assassin bayesian filter updaters
#
sa-spam   unix  -       n       n       -       -       pipe
  user=spamd:spamd argv=/usr/local/bin/sa-wrapper.pl spam ${sender}
sa-ham    unix  -       n       n       -       -       pipe
  user=spamd:spamd argv=/usr/local/bin/sa-wrapper.pl ham ${sender}


I have a material number of role accounts on the box that are all 
aliased to the various places they need to go.  Most of these do not 
have entries in /etc/passwd, that is, they're not real login accounts.


The issue is that if I am reading the code correctly my particular 
Bayes database (for "karl") is not being consulted, and can't be, for 
anything that comes into a role account since the user side of the 
email address is (obviously) not altered in the message.  As a result 
I have the rulesets, but none of the "training" that individual Bayes 
recognition would provide, nor is there any way for that training to 
take place since none of these accounts are "real".


sa-learn --dump magic -u karl shows the expected (large) number of 
tokens in the database, but the same command targeting any of the 
role account names shows nearly nothing (which isn't surprising since 
they're role accounts and not real user logins.)


How have people dealt with this -- or do they?


To add to this, the Bayes database gets built (other than via 
auto-add) from anything that a user puts in the "Junk" folder.  
An hourly cron job runs sa-learn against that folder and then moves 
anything it finds there to a "Junk-Saved" folder, expiring anything 
older than 14 days from that folder (so spam emails are held for 
2 weeks).  Dovecot is configured to deliver confirmed spam to the 
"Junk" folder as well.


Is the best way to handle role accounts to (1) create a "dummy" user 
account for them and (2) have the script that runs sa-learn add spam 
to not only the target's account but also, if the target is a role 
account, to each of the role account's database entries as well?  
That's a somewhat-messy maintenance job if/when role accounts are 
added/removed/changed, but it appears to be the only way to accomplish 
the goal.


--
Karl Denninger
k...@denninger.net 
/The Market Ticker/
/[S/MIME encrypted email preferred]/


I can't speak for specifically making it work with Postfix, but you 
usually want a site-wide Bayes database. No matter what (real or fake) 
user is receiving the message, it would get trained as the spamd user, 
or whatever ends up running SA. That same user runs SA and reads that 
appropriate database, which gets training from everyone and classifies 
based on a much more statistically useful volume of data.
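
As a rough sketch of the site-wide idea, a training wrapper could ignore the recipient entirely and always build the same sa-learn invocation for one shared Bayes user. The function name is hypothetical and the default user "spamd" is an assumption taken from the description above. Whether `-u` actually selects the database depends on how Bayes storage is configured, so treat this as an outline rather than a drop-in script.

```python
def build_learn_command(kind: str, message_path: str,
                        bayes_user: str = "spamd") -> list[str]:
    """Build an sa-learn argv that trains one shared Bayes database.

    The recipient of the message is deliberately not a parameter: real and
    role accounts alike train the same site-wide user.
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", f"--{kind}", "-u", bayes_user, message_path]


# Hypothetical message path, for illustration only.
command = build_learn_command("spam", "/tmp/example-message.eml")
```

The resulting list can be handed to `subprocess.run` by whatever script currently performs the per-user training.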




relay not detected

2016-11-21 Thread Pedro David Marco
Hi,
I have spam emails with a Received line like this:
Received: by 9-30-239-23.uocdn.net (Postfix) with ESMTPSA id 693A0C56B with 
(unknown [158.69.130.12]) ; Sun, 20 Nov 2016 21:06:55 -0300
There is no Perl parsing code for lines like this in the Received.pm module, so the 
relay 158.69.130.12 is never checked.
Is this normal? 
-Pedro.


Re: Best place to filter spam (x-original-to, no_address_mappings)

2016-11-21 Thread @lbutlr
On Nov 18, 2016, at 10:18 PM, MRob  wrote:
> I am looking at a system where SpamAssassin is called out from the delivery 
> agent. I know there will be a difference here in terms of the envelope 
> information but I'm not familiar enough to know the pitfalls of this versus 
> calling SA from the postfix content_filter.

It’s unclear why you are doing this, but if you want to run SA after delivery 
then the time to do that is in your LDA. *HOW* to do that, depends on your LDA. 
If you are using dovecot, then you can call SA from sieve. If not, you can 
setup procmail as an LDA (or others), and call SA from there.

A quick google on setting up SA with procmail or sieve or maildrop should lead 
to profit.

(I use procmail, but do not recommend it because it is no longer actively developed. 
It still works fine, but maildrop is probably a better choice.)




Re: .info TLD gives 2.1?

2016-11-21 Thread Kevin Golding

On Mon, 21 Nov 2016 19:00:59 -, Alex  wrote:


The part I was unsure of was whether those 2.1 points were warranted,
because I've only ever seen it in ham. Now I understand that they are.


http://ruleqa.spamassassin.org/ is a very good source for understanding  
how rules get the scores they do.


It can also be a good source for deciding if you need to make local  
adjustments that better suit your mailflow.


Re: .info TLD gives 2.1?

2016-11-21 Thread Alex
Hi,


On Mon, Nov 21, 2016 at 1:07 PM, Bill Cole wrote:
> On 21 Nov 2016, at 3:18, Matus UHLAR - fantomas wrote:
>
>> On 20.11.16 19:46, Alex wrote:
>>>
>>> Am I reading this rule wrong, or is the presence of a .info domain
>>> enough to warrant a 2.1 score?
>>>
>>> *  2.1 URI_NO_WWW_INFO_CGI URI: CGI in .info TLD other than third-level "www"
>>>
>>>
>>> >> EDC1D180ACD125901ADFBE7BB3D38714D4CF371647BF8D90DDD78032>*
>>>
>>> uri URI_NO_WWW_INFO_CGI
>>> /^(?:https?:\/\/)?[^\/]+(?
>>>
>>> This particular email was scored at 5.30, and wouldn't have hit if it
>>> didn't also hit SORBS, but such a score seemed quite high for just the
>>> presence of a type of TLD.
>>
>>
>> it's not based only on .info tld:
>>
>> 1. TLD .info
>> 2. no 'www'
>> 3. third level domain
>> 4. at least 6 characters 2nd-level domain
>
>
> That's a 7 not a 6 :)
>
> The RE says a bit more, and is maybe clearer using words:
>
> http[s]://<third-level label other than "www">.<7 or more non-dots>.info/<15 or
> more non-whitespace characters including a literal ?>
>
> Note that the trailing '\?' in the RE means a literal '?' indicating that
> the URI has a CGI-style query string. That makes this a very specific URI
> pattern. There's nothing "wrong" with such a URI except for the fact that
> objectively the frequency of that uncommon pattern is much higher in spam
> than non-spam.
>
> I *suspect* that the pattern could be tightened a bit to reduce false
> positives without missing the spam that hits this rule, but I don't have any
> data to support that.

Thank you all for your explanations. I understood that it also
involved a CGI-style query string, but just didn't mention it.

I have a handful of other non-spam URIs that hit this rule, if that
would help tighten it up a bit.

The part I was unsure of was whether those 2.1 points were warranted,
because I've only ever seen it in ham. Now I understand that they are.


Re: Best place to filter spam (x-original-to, no_address_mappings)

2016-11-21 Thread MRob

Can anyone help with this please?

On 2016-11-18 21:18, MRob wrote:

Hello,

I posted this question to the Postfix list and it occurred to me that
the SA community could be just as (or more) informative:

I am looking at a system where SpamAssassin is called out from the
delivery agent. I know there will be a difference here in terms of the
envelope information but I'm not familiar enough to know the pitfalls
of this versus calling SA from the postfix content_filter.

Specifically, I believe it's recommended to call SA in context of
receive_override_options=no_address_mappings but this wouldn't be the
case when we are in the delivery agent I think. What are the effects
of this?

Also, if it's possible to have LMTP send the original envelope recipient
(x-original-to?), would that help? And is that possible yet in the
newest version of Postfix?


Re: .info TLD gives 2.1?

2016-11-21 Thread Bill Cole

On 21 Nov 2016, at 3:18, Matus UHLAR - fantomas wrote:


On 20.11.16 19:46, Alex wrote:

Am I reading this rule wrong, or is the presence of a .info domain
enough to warrant a 2.1 score?

*  2.1 URI_NO_WWW_INFO_CGI URI: CGI in .info TLD other than third-level "www"


*

uri URI_NO_WWW_INFO_CGI
/^(?:https?:\/\/)?[^\/]+(?

This particular email was scored at 5.30, and wouldn't have hit if it
didn't also hit SORBS, but such a score seemed quite high for just the
presence of a type of TLD.


it's not based only on .info tld:

1. TLD .info
2. no 'www'
3. third level domain
4. at least 6 characters 2nd-level domain


That's a 7 not a 6 :)

The RE says a bit more, and is maybe clearer using words:

http[s]://<third-level label other than "www">.<7 or more non-dots>.info/<15 or more non-whitespace characters including a literal ?>


Note that the trailing '\?' in the RE means a literal '?' indicating 
that the URI has a CGI-style query string. That makes this a very 
specific URI pattern. There's nothing "wrong" with such a URI except for 
the fact that objectively the frequency of that uncommon pattern is much 
higher in spam than non-spam.


I *suspect* that the pattern could be tightened a bit to reduce false 
positives without missing the spam that hits this rule, but I don't have 
any data to support that.
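
To make the description in words concrete, here is a small Python sketch of a pattern with the same general shape. It is an illustrative approximation built only from that description, not the rule's actual RE, and the name `looks_like_rule_hit` is hypothetical. It assumes exactly three host labels and reads "15 or more non-whitespace characters including a literal ?" as a path of at least 15 non-whitespace characters that contains a '?'.

```python
import re

# Illustrative approximation only; NOT the real URI_NO_WWW_INFO_CGI pattern.
APPROX_PATTERN = re.compile(
    r"^https?://"
    r"(?!www\.)[^./\s]+\."   # third-level label that is not exactly "www"
    r"[^./\s]{7,}\.info/"    # second-level label of 7 or more non-dot characters
    r"(?=\S{15,})\S*\?"      # 15+ non-whitespace characters containing a literal '?'
)


def looks_like_rule_hit(uri: str) -> bool:
    """True if the URI has the shape described in words above."""
    return APPROX_PATTERN.match(uri) is not None


# Made-up example URIs, not taken from real mail.
examples = [
    "http://abc.example1.info/path/to/script.cgi?id=1",      # matches the shape
    "http://www.example1.info/path/to/script.cgi?id=1",      # third level is "www"
    "http://abc.short.info/path/to/script.cgi?id=1",         # second level too short
    "http://abc.example1.info/path/to/some/long/page.html",  # no query string
]
results = [looks_like_rule_hit(u) for u in examples]
```

A pattern like this is handy for checking local ham URIs against the described shape before deciding whether a local score adjustment is worthwhile.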


Re: Bayes scoring and role accounts

2016-11-21 Thread Karl Denninger

On 11/21/2016 10:12, Karl Denninger wrote:
> I'm using SpamAssassin on a system that uses Postfix as the MTA and
> Dovecot for handling final delivery.  SpamAssassin is being called via
> Postfix through spamd with:
>
> #
> # Spam Assassin bayesian filter updaters
> #
> sa-spam   unix  -       n       n       -       -       pipe
>   user=spamd:spamd argv=/usr/local/bin/sa-wrapper.pl spam ${sender}
> sa-ham    unix  -       n       n       -       -       pipe
>   user=spamd:spamd argv=/usr/local/bin/sa-wrapper.pl ham ${sender}
>
> I have a material number of role accounts on the box that are all
> aliased to the various places they need to go.  Most of these do not
> have entries in /etc/passwd, that is, they're not real login accounts.
>
> The issue is that if I am reading the code correctly my particular
> Bayes database (for "karl") is not being consulted, and can't be, for
> anything that comes into a role account since the user side of the
> email address is (obviously) not altered in the message.  As a result
> I have the rulesets, but none of the "training" that individual Bayes
> recognition would provide, nor is there any way for that training to
> take place since none of these accounts are "real".
>
> sa-learn --dump magic -u karl shows the expected (large) number of
> tokens in the database, but the same command targeting any of the role
> account names shows nearly nothing (which isn't surprising since
> they're role accounts and not real user logins.)
>
> How have people dealt with this -- or do they?
>
>
To add to this, the Bayes database gets built (other than via
auto-add) from anything that a user puts in the "Junk" folder.
An hourly cron job runs sa-learn against that folder and then moves
anything it finds there to a "Junk-Saved" folder, expiring anything
older than 14 days from that folder (so spam emails are held for
2 weeks).  Dovecot is configured to deliver confirmed spam to the
"Junk" folder as well.
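
The actual cron script is not shown in this message. Purely as an illustration of the 14-day retention step, a minimal Python sketch could look like the following; the function name and the in-memory mapping of message names to save times are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Retention period stated in the message: saved spam is kept for 14 days.
RETENTION = timedelta(days=14)


def select_expired(saved: dict[str, datetime], now: datetime) -> list[str]:
    """Return the names of saved messages older than the retention window."""
    cutoff = now - RETENTION
    return sorted(name for name, saved_at in saved.items() if saved_at < cutoff)


# Hypothetical folder contents, for illustration only.
now = datetime(2016, 11, 21, 12, 0)
saved = {
    "old-message": datetime(2016, 11, 1, 9, 30),
    "recent-message": datetime(2016, 11, 20, 8, 0),
}
expired = select_expired(saved, now)
```

In a real job the mapping would come from file modification times in the "Junk-Saved" folder, and the selected files would then be deleted.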

Is the best way to handle role accounts to (1) create a "dummy" user
account for them and (2) have the script that runs sa-learn add spam to
not only the target's account but also, if the target is a role account,
to each of the role account's database entries as well?  That's a
somewhat-messy maintenance job if/when role accounts are
added/removed/changed, but it appears to be the only way to accomplish
the goal.
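
A rough sketch of option (2), assuming a hand-maintained mapping from each role account to the real users it is aliased to. The mapping contents and the function name are hypothetical; only "karl" comes from this thread. Given the user whose Junk folder is being processed, it lists every Bayes user that should receive the training.

```python
def training_targets(user: str, role_members: dict[str, list[str]]) -> list[str]:
    """Bayes users to train when `user` files a message as junk.

    The user's own database comes first, followed by every role account
    that is aliased to that user, in alphabetical order.
    """
    roles = sorted(role for role, members in role_members.items() if user in members)
    return [user] + roles


# Hypothetical alias table, for illustration only.
role_members = {
    "postmaster": ["karl"],
    "abuse": ["karl", "ops"],
    "sales": ["ops"],
}
targets = training_targets("karl", role_members)
```

The maintenance burden mentioned above sits in keeping `role_members` in step with the alias file, so generating it from the aliases rather than editing it by hand would likely be worth the effort.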

-- 
Karl Denninger
k...@denninger.net 
/The Market Ticker/
/[S/MIME encrypted email preferred]/




Re: .info TLD gives 2.1?

2016-11-21 Thread John Hardin

On Mon, 21 Nov 2016, Matus UHLAR - fantomas wrote:


On 20.11.16 19:46, Alex wrote:

Am I reading this rule wrong, or is the presence of a .info domain
enough to warrant a 2.1 score?

 *  2.1 URI_NO_WWW_INFO_CGI URI: CGI in .info TLD other than third-level "www"

*

uri URI_NO_WWW_INFO_CGI
/^(?:https?:\/\/)?[^\/]+(?

it's not based only on .info tld:

1. TLD .info
2. no 'www'
3. third level domain
4. at least 6 characters 2nd-level domain


5. CGI script parameters.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---


Bayes scoring and role accounts

2016-11-21 Thread Karl Denninger
I'm using SpamAssassin on a system that uses Postfix as the MTA and Dovecot
for handling final delivery.  SpamAssassin is being called via Postfix
through spamd with:

#
# Spam Assassin bayesian filter updaters
#
sa-spam   unix  -       n       n       -       -       pipe
  user=spamd:spamd argv=/usr/local/bin/sa-wrapper.pl spam ${sender}
sa-ham    unix  -       n       n       -       -       pipe
  user=spamd:spamd argv=/usr/local/bin/sa-wrapper.pl ham ${sender}

I have a material number of role accounts on the box that are all
aliased to the various places they need to go.  Most of these do not
have entries in /etc/passwd, that is, they're not real login accounts.

The issue is that if I am reading the code correctly my particular Bayes
database (for "karl") is not being consulted, and can't be, for anything
that comes into a role account since the user side of the email address
is (obviously) not altered in the message.  As a result I have the
rulesets, but none of the "training" that individual Bayes recognition
would provide, nor is there any way for that training to take place
since none of these accounts are "real".

sa-learn --dump magic -u karl shows the expected (large) number of
tokens in the database, but the same command targeting any of the role
account names shows nearly nothing (which isn't surprising since they're
role accounts and not real user logins.)

How have people dealt with this -- or do they?

-- 
Karl Denninger
k...@denninger.net 
/The Market Ticker/
/[S/MIME encrypted email preferred]/




Re: version.h.pl show stopper

2016-11-21 Thread Dan Jacobson
You people were right about
$ mount|grep noexec
And the tests really should check for that. I'll submit a bug.
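
A check of that kind might look roughly like the sketch below. It is not part of any existing test suite. It assumes Linux-style `/proc/mounts` text (device, mount point, filesystem type, options, then two numeric fields), and the function name and sample data are hypothetical.

```python
def noexec_mount_points(mounts_text: str) -> list[str]:
    """Return the mount points whose option list contains 'noexec'."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = fields[1], fields[3].split(",")
        if "noexec" in options:
            found.append(mount_point)
    return found


# Made-up sample in /proc/mounts format, for illustration only.
sample = """\
/dev/sda1 / ext4 rw,relatime 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev,noexec 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
"""
flagged = noexec_mount_points(sample)
```

A test harness could read the real `/proc/mounts`, compare the flagged mount points with the directory it is about to execute from, and warn before running anything.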

That apparently was the problem on one of my machines.
But for all the rest, this still fails:

env - HOME=$HOME LOGNAME=$LOGNAME PATH=/usr/bin:/bin USER=$USER sh -uxe 

Re: .info TLD gives 2.1?

2016-11-21 Thread Matus UHLAR - fantomas

On 20.11.16 19:46, Alex wrote:

Am I reading this rule wrong, or is the presence of a .info domain
enough to warrant a 2.1 score?

*  2.1 URI_NO_WWW_INFO_CGI URI: CGI in .info TLD other than third-level "www"

*

uri URI_NO_WWW_INFO_CGI
/^(?:https?:\/\/)?[^\/]+(?

it's not based only on .info tld:

1. TLD .info
2. no 'www'
3. third level domain
4. at least 6 characters 2nd-level domain

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
42.7 percent of all statistics are made up on the spot.