Re: Differing scores on spamassassin checks

2018-04-17 Thread John Hardin

On Tue, 17 Apr 2018, John Hardin wrote:


> On Tue, 17 Apr 2018, Computer Bob wrote:
>
>> In this way, any user can move a mail to their .SpamLearn folder and it
>> will get learned.
>
> It is a very bad idea to do that without review unless you *strongly* trust
> the judgement and responsibility of your users.


Exception: per-user Bayes. If you're doing that you can let them suffer 
the wages of their own folly without negatively impacting other users.


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Our government should bear in mind the fact that the American
  Revolution was touched off by the then-current government
  attempting to confiscate firearms from the people.
---
 2 days until the 243rd anniversary of The Shot Heard 'Round The World


Re: Differing scores on spamassassin checks

2018-04-17 Thread RW
On Tue, 17 Apr 2018 11:19:57 -0500
Computer Bob wrote:


> The problem I immediately see is that I get one big bayes of everyone 
> and a 'one for all, all for one' bayes config.
> I would like to configure SA to be able to deal with the virtual
> users individually somehow but don't know if it can (and requires
> source analysis).

There are two ways of doing this. One is to set up virtual home
directories by adding the following to the spamd options:


-x  -c --virtual-config-dir=

How to set the pattern is described in the spamd documentation.

The other way is to store everything in an SQL database.
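
As a rough illustration of the first approach, the sketch below previews how a --virtual-config-dir pattern would resolve for one virtual user. It assumes the %u (full user name), %l (local part) and %d (domain) escapes described in the spamd documentation. The function name expand_vdir and the /var/vmail path are hypothetical, chosen only for this example; spamd performs the real expansion itself.

```shell
#!/bin/sh
# Preview how a --virtual-config-dir pattern would expand for one user.
# Assumed escapes: %u = full user name, %l = local part, %d = domain.
# expand_vdir is a hypothetical helper, not part of SpamAssassin.
expand_vdir() {
    pattern=$1
    user=$2
    local_part=${user%%@*}
    case $user in
        *@*) domain=${user#*@} ;;
        *)   domain= ;;
    esac
    # "|" is used as the sed delimiter, so user names containing "|" or "&"
    # are not handled by this sketch.
    printf '%s\n' "$pattern" | sed \
        -e "s|%u|$user|g" \
        -e "s|%l|$local_part|g" \
        -e "s|%d|$domain|g"
}

# Example (hypothetical path):
#   expand_vdir '/var/vmail/%d/%l/spamassassin' 'bob@example.com'
```

With a pattern like this, each virtual user would get a separate directory for user_prefs and Bayes files, which is what keeps one user's training from affecting another's.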


Re: Differing scores on spamassassin checks

2018-04-17 Thread John Hardin

On Tue, 17 Apr 2018, Computer Bob wrote:

> In this way, any user can move a mail to their .SpamLearn folder and it
> will get learned.


It is a very bad idea to do that without review unless you *strongly* 
trust the judgement and responsibility of your users.


Allowing training without review may be suitable for a small subset of 
trusted users, but in general, users will classify as spam "anything I 
don't want" even if it's something they *did* subscribe to from a vendor 
they *do* have a business relationship with.


The "learn as spam" folder will be treated as an easier alternative to 
hitting the "unsubscribe" link in emails, in part because we've been 
training users to *not* click on unsubscribe links in emails from 
businesses they don't have any legitimate interaction with, and all they 
hear is the "don't click on unsubscribe links" part - the other part 
requires actual *judgement*.


Also: for performance reasons you really should relocate those messages 
once they've been learned, but do keep those messages permanently as your 
Bayes training corpus, so that (1) you can review the users' 
classifications and correct any mistraining, and (2) you can easily 
rebuild Bayes from scratch if it goes off the rails.
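
A minimal sketch of the "rebuild from scratch" step, assuming the kept corpus has been sorted into spam and ham subdirectories after review. That layout, the function name rebuild_bayes and the SA_LEARN override are assumptions made for this example; the --clear, --spam and --ham options are standard sa-learn options. It should be run as the account that owns the Bayes database, not as root.

```shell
#!/bin/sh
# Rebuild a Bayes database from a reviewed, permanently kept corpus.
# Assumed (hypothetical) layout: $corpus/spam and $corpus/ham, one message per file.
# SA_LEARN may be overridden, e.g. SA_LEARN='echo sa-learn' for a dry run.
rebuild_bayes() {
    corpus=$1
    learner=${SA_LEARN:-sa-learn}
    $learner --clear || return 1                 # wipe the existing database
    $learner --spam "$corpus/spam" || return 1   # relearn reviewed spam
    $learner --ham "$corpus/ham" || return 1     # relearn reviewed ham
}

# Dry run that only prints the commands:
#   SA_LEARN='echo sa-learn' rebuild_bayes /path/to/corpus
```

Keeping the corpus on disk is what makes this cheap: a mistrained database can be thrown away and regenerated instead of being repaired token by token.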





Re: Differing scores on spamassassin checks

2018-04-17 Thread Computer Bob

I would like to thank everyone for your responses; they have been great.
This mailing list has not failed to help me improve things every time I use it.

So this particular server has virtual domains and virtual users in a
folder hierarchy, all owned by the 'vmail' user.

I have done the following:
1)  Installed a SiteWideBayesSetup config _without_ the 0777 setting, which
seems to work for all virtual users regardless of their virtual domain.
2)  Configured a .SpamLearn mail folder, with a .Learned subfolder, to be
created under each user in the mail folder hierarchy.
3)  Set up a cron job that runs periodically as user 'vmail', scans all
.SpamLearn folders, runs sa-learn on the messages it finds, and then
moves them to the corresponding .Learned folders.


In this way, any user can move a mail to their .SpamLearn folder and it 
will get learned.
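
The cron job in step 3 could look roughly like the sketch below. It is not the actual script from this server: the directory layout (messages under .SpamLearn/cur, learned mail filed under .SpamLearn/.Learned/cur), the function name learn_spam_folders and the SA_LEARN override are all assumptions for illustration. Only "sa-learn --spam" is a real SpamAssassin command.

```shell
#!/bin/sh
# Learn every message dropped into a .SpamLearn folder, then file it away.
# Assumed (hypothetical) layout: <mailroot>/.../.SpamLearn/cur/<message files>,
# with learned mail moved to .SpamLearn/.Learned/cur.
# SA_LEARN may be overridden for testing; it defaults to the real sa-learn.
learn_spam_folders() {
    mailroot=$1
    learner=${SA_LEARN:-sa-learn}
    find "$mailroot" -type d -name .SpamLearn | while IFS= read -r folder; do
        mkdir -p "$folder/.Learned/cur"
        for msg in "$folder"/cur/*; do
            [ -f "$msg" ] || continue    # skip when the folder is empty
            # Move the message only if learning succeeded.
            $learner --spam "$msg" >/dev/null && mv "$msg" "$folder/.Learned/cur/"
        done
    done
}

# Example cron usage, run as the 'vmail' user (path is hypothetical):
#   learn_spam_folders /var/vmail
```

Moving rather than deleting the learned messages keeps them available for later review.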

Have I had too many beers ? or not enough ?
The problem I immediately see is that I get one big bayes of everyone 
and a 'one for all, all for one' bayes config.
I would like to configure SA to be able to deal with the virtual users 
individually somehow but don't know if it can (and requires source 
analysis).


In any event, it seems to be working pretty well and most all of the 
spam is apparently getting caught.

And no 'root' involvement...
Thanks to all respondents.


Re: Differing scores on spamassassin checks

2018-04-17 Thread RW
On Tue, 17 Apr 2018 15:44:25 +0200
Matus UHLAR - fantomas wrote:

> >> On 15.04.18 20:04, RW wrote:  
> >> >All setting bayes_path buys you here is the ability to run
> >> >sa-learn and spamassassin as root, something you should *never*
> >> >do anyway.  
> 
> >On Tue, 17 Apr 2018 13:55:13 +0200
> >Matus UHLAR - fantomas wrote:  
> >> it's the only way to use per-user settings and bayes DB on system
> >> with unix users.  
> 
> On 17.04.18 13:43, RW wrote:
> >spamd does, but not sa-learn or spamassassin.  
> 
> sa-learn and spamassassin use current user. this way people can use
> either, but spamd is most effective

I'm not sure what you're trying to say here. My original point was that
sa-learn and spamassassin shouldn't be run as root (to access global
databases owned by the unprivileged user running spamd).

spamd is different because it never processes the contents of an email
as root.


Re: Differing scores on spamassassin checks

2018-04-17 Thread Matus UHLAR - fantomas

On 15.04.18 20:04, RW wrote:
>All setting bayes_path buys you here is the ability to run sa-learn
>and spamassassin as root, something you should *never* do anyway.



On Tue, 17 Apr 2018 13:55:13 +0200
Matus UHLAR - fantomas wrote:

it's the only way to use per-user settings and bayes DB on system
with unix users.


On 17.04.18 13:43, RW wrote:

spamd does, but not sa-learn or spamassassin.


sa-learn and spamassassin use the current user. This way people can use
either, but spamd is most effective
(and great combined with sa-milter)
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I intend to live forever - so far so good. 


Re: Differing scores on spamassassin checks

2018-04-17 Thread RW
On Tue, 17 Apr 2018 13:55:13 +0200
Matus UHLAR - fantomas wrote:


> On 15.04.18 20:04, RW wrote:

> >All setting bayes_path buys you here is the ability to run sa-learn
> >and spamassassin as root, something you should *never* do anyway.  
> 
> it's the only way to use per-user settings and bayes DB on system
> with unix users.

spamd does, but not sa-learn or spamassassin.


Re: Differing scores on spamassassin checks

2018-04-17 Thread Matus UHLAR - fantomas

>> On Sun, 15 Apr 2018 13:39:31 -0500
>> Computer Bob wrote:
>>
>> Update:
>> For this location, it is ok to have a central bayes database, so I
>> turned off AWL, adjusted local.cf to contain:
>> bayes_path /Central_Path/bayes_db/bayes
>> bayes_file_mode 0777

On 15.04.18 20:04, RW wrote:
> Don't set 0777. If that's still in the wiki someone with access should
> remove it.
>
> All setting bayes_path buys you here is the ability to run sa-learn and
> spamassassin as root, something you should *never* do anyway.

It's the only way to use per-user settings and a Bayes DB on a system with
unix users.

> If you run spamd as the unix user spamd, with "-u spamd", then spamd
> looks for files in ~spamd which is where it was finding them when you
> (correctly) ran spamassassin as spamd.

That is quite possibly what is happening now, which is why the spamd user's
Bayes DB gets used.

In such a case bayes_path is not needed.

spamassassin and sa-learn just need to be run as the spamd user.
--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
I drive way too fast to worry about cholesterol. 


Re: Differing scores on spamassassin checks

2018-04-16 Thread Bill Cole

On 16 Apr 2018, at 19:01 (-0400), John Hardin wrote:

> On Mon, 16 Apr 2018, Computer Bob wrote:
>
>> Why should sa-learn not be run as root ?
>
> That's a general safe practice. Do as little as root as you possibly
> can. Why risk a root crack from an unknown bug in sa-learn that
> somebody has discovered and figured out how to exploit via email?


Right: don't let malicious strangers talk to root, even via email.

ALSO: sa-learn itself won't stop you from running it as root. Without a 
global bayes_path, it will learn into ~root/.spamassassin/bayes_* files 
which no other user can access and spamd can't even TRY to use because 
it refuses to run as root and drops to 'nobody' if run by root. With a 
global bayes_path, the bayes_* files will become owned by root and 
everything else trying to use them (i.e. everything) will fail.
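
One way to avoid the root-owned bayes_* files described above is a guard at the top of any training script. The sketch below is only an illustration: refuse_root is a hypothetical helper, and its optional argument exists solely so the check can be tried with a made-up UID.

```shell
#!/bin/sh
# Refuse to train Bayes as root; meant to sit at the top of a training script.
# By default the real UID from `id -u` is checked; the optional argument
# lets the check be exercised with an arbitrary UID.
refuse_root() {
    uid=${1:-$(id -u)}
    if [ "$uid" -eq 0 ]; then
        echo "refusing to run sa-learn as root; switch to the mail-scanning user" >&2
        return 1
    fi
    return 0
}

# Typical use in a script:
#   refuse_root || exit 1
#   sa-learn --spam /path/to/reviewed/spam
```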


--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Currently Seeking Steady Work: https://linkedin.com/in/billcole


Re: Differing scores on spamassassin checks

2018-04-16 Thread John Hardin

On Mon, 16 Apr 2018, Computer Bob wrote:

> Why should sa-learn not be run as root ?


That's a general safe practice. Do as little as root as you possibly can. 
Why risk a root crack from an unknown bug in sa-learn that somebody has 
discovered and figured out how to exploit via email?




Re: Differing scores on spamassassin checks

2018-04-16 Thread Computer Bob

Well, now I am more thoroughly confused than usual. #:)

On 4/15/18 2:04 PM, RW wrote:

> On Sun, 15 Apr 2018 13:39:31 -0500
> Computer Bob wrote:
>
>> Update:
>> For this location, it is ok to have a central bayes database, so I
>> turned off AWL, adjusted local.cf to contain:
>> bayes_path /Central_Path/bayes_db/bayes
>> bayes_file_mode 0777
>
> Don't set 0777. If that's still in the wiki someone with access should
> remove it.

So is the SiteWideBayesSetup ok to run without the 0777?

> All setting bayes_path buys you here is the ability to run sa-learn and
> spamassassin as root, something you should *never* do anyway.

This seems contrary to
https://wiki.apache.org/spamassassin/SiteWideBayesSetup, does it not?

Why should sa-learn not be run as root?

> If you run spamd as the unix user spamd, with "-u spamd", then spamd
> looks for files in ~spamd which is where it was finding them when you
> (correctly) ran spamassassin as spamd.

The /etc/init.d/spamassassin init script is not starting spamd with -u,
only with -D, but the mail processing in the logs clearly shows:

Apr 16 17:31:13 M1-2 spamd[3926]: spamd: connection from localhost [127.0.0.1]:49938 to port 783, fd 5
Apr 16 17:31:13 M1-2 spamd[3926]: spamd: setuid to spamd succeeded   <---changing here***
Apr 16 17:31:13 M1-2 spamd[3926]: spamd: processing message  for spamd:1001
Apr 16 17:31:13 M1-2 postfix/smtpd[4248]: disconnect from mail.microcenter.com[66.194.187.30] ehlo=2 starttls=1 mail=1 rcpt=1 data=1 quit=1 commands=7
Apr 16 17:31:19 M1-2 spamd[3926]: spamd: clean message (1.7/4.0) for spamd:1001 in 6.0 seconds, 30321 bytes.


This setup is running all virtual users and virtual domains via mysql 
and the logs show mail traversing the spamd daemon.
The spamd daemon is running as user spamd and does seem to be using the 
SiteWide files specified.
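
To confirm from the logs which account spamd actually scans as, the "setuid to ... succeeded" lines can be tallied. This is a small sketch that relies only on the log line format shown above; the function name spamd_setuid_users and the log path in the usage comment are hypothetical.

```shell
#!/bin/sh
# From a mail log on stdin, count which accounts spamd switched to before scanning.
# Matches lines of the form "spamd: setuid to <user> succeeded".
spamd_setuid_users() {
    sed -n 's/.*spamd: setuid to \([^ ]*\) succeeded.*/\1/p' | sort | uniq -c
}

# Example (log path is an assumption and varies by system):
#   spamd_setuid_users < /var/log/mail.log
```

If every line reports the same account, all mail is being scored against that one account's preferences and Bayes files.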




Re: Differing scores on spamassassin checks

2018-04-16 Thread Amir Caspi
On Apr 16, 2018, at 11:15 AM, RW  wrote:
> 
> You seem to be confusing unix and virtual users.

Sorry, I was confusing "virtual hosting" with "virtual users."  Oops.

Ignore me!

--- Amir



Re: Differing scores on spamassassin checks

2018-04-16 Thread RW
On Mon, 16 Apr 2018 10:34:41 -0600
Amir Caspi wrote:

> > On Apr 15, 2018, at 12:39 PM, Computer Bob 
> > wrote:
> > 
> > I still am a bit puzzled how bayes db gets handled when using
> > virtual users and domains. I see no trace of bayes or .spamassassin
> > files in any of the virtual locations or in the sql databases.  
> 
> If you want Bayes to run per-user with virtual hosts then you need to
> use some sort of glue for each user to invoke spamd as their own
> user.  This is typically done by running spamd as root (without the
> -u flag) and enabling per-user settings (-cH) and then using global
> (or per-user) procmail line to invoke spamc with the -u flag.  But
> that's not the default behavior for SA, unless it was packaged that
> way by your virtual hosting software (e.g., Parallels Pro née Ensim
> did it that way).

You seem to be confusing unix and virtual users.


Re: Differing scores on spamassassin checks

2018-04-16 Thread Amir Caspi
> On Apr 15, 2018, at 12:39 PM, Computer Bob  wrote:
> 
> I still am a bit puzzled how bayes db gets handled when using virtual users 
> and domains. I see no trace of bayes or .spamassassin files in any of the 
> virtual locations or in the sql databases.

If you want Bayes to run per-user with virtual hosts then you need to use some 
sort of glue for each user to invoke spamd as their own user.  This is 
typically done by running spamd as root (without the -u flag) and enabling 
per-user settings (-cH) and then using global (or per-user) procmail line to 
invoke spamc with the -u flag.  But that's not the default behavior for SA, 
unless it was packaged that way by your virtual hosting software (e.g., 
Parallels Pro née Ensim did it that way).

But if you're trying to use Bayes with mySQL or Redis, that can't be done 
per-user AFAIK.

Cheers.

--- Amir



Re: Differing scores on spamassassin checks

2018-04-15 Thread RW
On Sun, 15 Apr 2018 13:39:31 -0500
Computer Bob wrote:

> Update:
> For this location, it is ok to have a central bayes database, so I 
> turned off AWL, adjusted local.cf to contain:
> bayes_path /Central_Path/bayes_db/bayes
> bayes_file_mode 0777

Don't set 0777. If that's still in the wiki someone with access should
remove it.
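
The concern with mode 0777 is that any local account can then overwrite, and so poison, the shared database. A quick way to check an existing setup is to look for world-writable Bayes files in the directory that bayes_path points into. The sketch below is an illustration only; world_writable_bayes is a hypothetical helper name.

```shell
#!/bin/sh
# List Bayes files that any local user could overwrite (the effect of mode 0777).
# Pass the directory that bayes_path points into as the argument.
world_writable_bayes() {
    find "$1" -type f -name 'bayes*' -perm -0002
}

# Example, using the directory from the config quoted above:
#   world_writable_bayes /Central_Path/bayes_db
```

An empty result means no Bayes file in that directory is writable by "other".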

All setting bayes_path buys you here is the ability to run sa-learn and
spamassassin as root, something you should *never* do anyway. 

If you run spamd as the unix user spamd, with "-u spamd", then spamd
looks for files in ~spamd which is where it was finding them when you
(correctly) ran spamassassin as spamd.



On Sun, 15 Apr 2018 13:39:46 -0500
Computer Bob wrote:

> I still am a bit puzzled how bayes db gets handled when using virtual 
> users and domains. I see no trace of bayes or .spamassassin files in
> any of the virtual locations or in the sql databases.

It doesn't do that by default.


Re: Differing scores on spamassassin checks

2018-04-15 Thread RW
On Sun, 15 Apr 2018 11:08:35 -0700 (PDT)
John Hardin wrote:

> On Sun, 15 Apr 2018, Matus UHLAR - fantomas wrote:
> 
> > On 15.04.18 11:55, Computer Bob wrote:  
> >> Here is a root scan:  https://pastebin.com/qdXMRzKb  
> >
> > X-Spam-Status: Yes, score=10.2 required=4.0 tests=HTML_MESSAGE,
> >RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_SBL_CSS,SPF_HELO_PASS,
> >URIBL_DBL_SPAM autolearn=no autolearn_force=no version=3.4.1
> >  
> >> Here is the same run under spamd: https://pastebin.com/SvvYptYv  
> >
> > X-Spam-Status: No, score=2.5 required=4.0
> > tests=AWL,BAYES_00,HTML_MESSAGE,
> > RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_SBL_CSS,SPF_HELO_PASS
> > autolearn=no autolearn_force=no version=3.4.1
> >
> > the main two differences are AWL and BAYES_00 which means
> >
> > 1. your spamd's bayes database is mistrained
> > 2. you apparently should disable AWL at least until you train bayes
> > properly.  
> 
> Actually, it's using user-specific (vs. global) bayes databases, and 
> apparently only root's database is being trained.

No that's not correct. The version run as spamd is using files under
~spamd and has BAYES_00, the version run as root is using files under
~root and hasn't been trained.
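
A way to see this for yourself is to compare the message counters each account's database reports. The sketch below summarises the nspam and nham lines from "sa-learn --dump magic" output; bayes_counts is a hypothetical helper, and the sudo invocation in the comment is an assumption about how you switch accounts. By default SpamAssassin does not apply Bayes rules until 200 spam and 200 ham have been learned, which would be consistent with the root run showing no BAYES_* rule at all.

```shell
#!/bin/sh
# Summarise the nspam/nham counters from `sa-learn --dump magic` output on stdin.
# The count is the third whitespace-separated field on those lines.
bayes_counts() {
    awk '/non-token data: nspam/ { spam = $3 }
         /non-token data: nham/  { ham  = $3 }
         END { printf "nspam=%d nham=%d\n", spam, ham }'
}

# Example, run once per account to compare (sudo usage is an assumption):
#   sudo -u spamd -H sa-learn --dump magic | bayes_counts
```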



Re: Differing scores on spamassassin checks

2018-04-15 Thread Matus UHLAR - fantomas

On 15.04.18 11:55, Computer Bob wrote:

Here is a root scan:  https://pastebin.com/qdXMRzKb



On Sun, 15 Apr 2018, Matus UHLAR - fantomas wrote:

X-Spam-Status: Yes, score=10.2 required=4.0 tests=HTML_MESSAGE,
  RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_SBL_CSS,SPF_HELO_PASS,
  URIBL_DBL_SPAM autolearn=no autolearn_force=no version=3.4.1



Here is the same run under spamd: https://pastebin.com/SvvYptYv



X-Spam-Status: No, score=2.5 required=4.0 tests=AWL,BAYES_00,HTML_MESSAGE,
  RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_SBL_CSS,SPF_HELO_PASS
  autolearn=no autolearn_force=no version=3.4.1



the main two differences are AWL and BAYES_00 which means

1. your spamd's bayes database is mistrained
2. you apparently should disable AWL at least until you train bayes
properly.
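
Comparing two X-Spam-Status headers by eye gets tedious, so here is a rough sketch that lists the rules present in only one of them. It assumes each header has been saved to its own file; spam_tests and diff_spam_tests are hypothetical helper names.

```shell
#!/bin/sh
# Print the rule names from an X-Spam-Status header (read on stdin), one per line, sorted.
spam_tests() {
    tr -d '\n\t ' |                      # unfold the header and drop whitespace
        sed -e 's/.*tests=//' -e 's/autolearn=.*//' |
        tr ',' '\n' | LC_ALL=C sort
}

# Show rules that fired in only one of two saved headers.
# Unindented lines: only in the first file; tab-indented lines: only in the second.
diff_spam_tests() {
    a=$(mktemp) && b=$(mktemp) || return 1
    spam_tests < "$1" > "$a"
    spam_tests < "$2" > "$b"
    LC_ALL=C comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Example (file names are hypothetical):
#   diff_spam_tests root-header.txt spamd-header.txt
```

Applied to the two headers quoted above, with the root run first, this would list AWL and BAYES_00 as unique to the spamd run and URIBL_DBL_SPAM as unique to the root run.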



On Sun, 15 Apr 2018, John Hardin wrote:
Actually, it's using user-specific (vs. global) bayes databases, 
and apparently only root's database is being trained.


Define a shared Bayes database that all users can read and use that.


On 15.04.18 11:13, John Hardin wrote:

...or train as spamd rather than as root...


the root's BAYES DB seems untrained.
the spamd's is trained, but badly (re-training should help there).

the question is:

how is spamassassin used? running spamd? does spamd run with "-u" option?

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Depression is merely anger without enthusiasm. 


Re: Differing scores on spamassassin checks

2018-04-15 Thread John Hardin

On Sun, 15 Apr 2018, John Hardin wrote:


On Sun, 15 Apr 2018, Matus UHLAR - fantomas wrote:


On 15.04.18 11:55, Computer Bob wrote:

Here is a root scan:  https://pastebin.com/qdXMRzKb


X-Spam-Status: Yes, score=10.2 required=4.0 tests=HTML_MESSAGE,
   RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_SBL_CSS,SPF_HELO_PASS,
   URIBL_DBL_SPAM autolearn=no autolearn_force=no version=3.4.1


Here is the same run under spamd: https://pastebin.com/SvvYptYv


X-Spam-Status: No, score=2.5 required=4.0 tests=AWL,BAYES_00,HTML_MESSAGE,
   RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_SBL_CSS,SPF_HELO_PASS
   autolearn=no autolearn_force=no version=3.4.1

the main two differences are AWL and BAYES_00 which means

1. your spamd's bayes database is mistrained
2. you apparently should disable AWL at least until you train bayes
properly.


Actually, it's using user-specific (vs. global) bayes databases, and 
apparently only root's database is being trained.


Define a shared Bayes database that all users can read and use that.


...or train as spamd rather than as root...


Re: Differing scores on spamassassin checks

2018-04-15 Thread John Hardin

On Sun, 15 Apr 2018, Matus UHLAR - fantomas wrote:


On 15.04.18 11:55, Computer Bob wrote:

Here is a root scan:  https://pastebin.com/qdXMRzKb


X-Spam-Status: Yes, score=10.2 required=4.0 tests=HTML_MESSAGE,
   RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_SBL_CSS,SPF_HELO_PASS,
   URIBL_DBL_SPAM autolearn=no autolearn_force=no version=3.4.1


Here is the same run under spamd: https://pastebin.com/SvvYptYv


X-Spam-Status: No, score=2.5 required=4.0 tests=AWL,BAYES_00,HTML_MESSAGE,
   RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_SBL_CSS,SPF_HELO_PASS
   autolearn=no autolearn_force=no version=3.4.1

the main two differences are AWL and BAYES_00 which means

1. your spamd's bayes database is mistrained
2. you apparently should disable AWL at least until you train bayes
properly.


Actually, it's using user-specific (vs. global) bayes databases, and 
apparently only root's database is being trained.


Define a shared Bayes database that all users can read and use that.

--
 John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
 jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Our government should bear in mind the fact that the American
  Revolution was touched off by the then-current government
  attempting to confiscate firearms from the people.
---
 4 days until the 243rd anniversary of The Shot Heard 'Round The World

Re: Differing scores on spamassassin checks

2018-04-15 Thread Matus UHLAR - fantomas

On 15.04.18 11:55, Computer Bob wrote:

Here is a root scan:  https://pastebin.com/qdXMRzKb


X-Spam-Status: Yes, score=10.2 required=4.0 tests=HTML_MESSAGE,
RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_SBL_CSS,SPF_HELO_PASS,
URIBL_DBL_SPAM autolearn=no autolearn_force=no version=3.4.1


Here is the same run under spamd: https://pastebin.com/SvvYptYv


X-Spam-Status: No, score=2.5 required=4.0 tests=AWL,BAYES_00,HTML_MESSAGE,
RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,RCVD_IN_SBL_CSS,SPF_HELO_PASS
autolearn=no autolearn_force=no version=3.4.1

the main two differences are AWL and BAYES_00 which means

1. your spamd's bayes database is mistrained
2. you apparently should disable AWL at least until you train bayes
properly.



On 4/15/18 11:34 AM, Computer Bob wrote:

Greetings all,
I have had some issues with spam getting low scores, and in
troubleshooting I have found that if I run a command-line check
with "spamassassin -D -x  < test" on a mail in question, I get a
very high score when run under user root. When run under user spamd
it gets a low passing score. This is on obvious spam mail. Any
advice on how to determine what the difference is?




--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Atheism is a non-prophet organization. 


Re: Differing scores on spamassassin checks

2018-04-15 Thread Computer Bob

Here is a root scan:  https://pastebin.com/qdXMRzKb
Here is the same run under spamd: https://pastebin.com/SvvYptYv



On 4/15/18 11:34 AM, Computer Bob wrote:

Greetings all,
I have had some issues with spam getting low scores, and in
troubleshooting I have found that if I run a command-line check with
"spamassassin -D -x  < test" on a mail in question, I get a very high
score when run under user root. When run under user spamd it gets a
low passing score. This is on obvious spam mail. Any advice on how to
determine what the difference is?