spam assassin management or hosting

2014-09-03 Thread Adam Moffett
I've been thinking it could easily be a full time job to read spam, 
write sa rules, test sa rules, etc.


There isn't enough time in my day for that, so I'm pretty much running 
SA un-customized.  I do have Bayes, which I do train with my own spam and 
ham, but I don't have a good population of users that I would trust to 
determine what is spam.  So I don't think I have a very good 
representative sample.  It works, but it could be better.


Is there a paid service that will take care of this for me, but that 
also directly benefits the Spam Assassin project?  I'm aware of services 
like Barracuda (and many others), but I would prefer to know that our 
payments would help keep the project alive and healthy and up to date.  
I feel that if we're paying somebody, we might as well pay somebody 
whose work benefits the entire spam fighting world.





Re: Spam Assassin - does it work or not? - LONG HEADERS URL

2014-08-07 Thread Adam Moffett

i wouldnt know what to do with it. and have no time or desire to learn
about it.

i think you might have accidentally put your finger on the problem ive
been having with them. i think maybe they seem to expect that most their
customers have dedicated IT departments that can deal with something like
this. maybe i'm just an anomaly or dying breed of small business that
doesn't have the time or resources for dealing with things like this.
although i've certainly spent way too much time talking to them about it!

To be perfectly honest, there's no reason a toy maker (or most any 
small business owner) should have to learn how to use spamassassin.  If 
obvious spam is consistently getting negative scores then whoever set up 
spamassassin did something wrong.  If they haven't figured it out in as 
many months as you're saying then you might just want to move your email 
elsewhere.  Or, as you said, maybe they expect you to get in there and 
configure it yourself. Either way, there's no reason to keep it there.


If you're otherwise happy with the hosting, or don't want to deal with 
moving it, then it's ok to just move the email and leave the website 
where it is.




Re: high cpu load

2014-04-22 Thread Adam Moffett


Have you tried compiling the rules with sa-compile?  It speeds up 
rule matching.

I'm afraid I don't know the answer to what that specific test does.


Hi,

I use SpamAssassin version 3.3.1 running on Perl version 5.10.1, 
amavisd-new-2.8.0-8.el6 as before-queue filter.
Today for unknown reason i noticed high load on my server. Mail flow 
is as usual. About 8k in hour checked by amavisd.


Here is timing from amavis. tests_pri_0 is around 90% all the time:
amavis[26002]: (26002-05) TIMING-SA total 759 ms - parse: 1.92 
(0.3%), extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6 
(0.8%), tests_pri_-1000: 17 (2.2%), tests_pri_-950: 0.93 (0.1%), 
tests_pri_-900: 1.00 (0.1%), tests_pri_-400: 0.79 (0.1%), 
*tests_pri_0: 668 (88.0%)*, check_dkim_signature: 9 (1.2%), check_spf: 
364 (47.9%), poll_dns_idle: 338 (44.5%), tests_pri_500: 7 (1.0%), 
get_report: 0.99 (0.1%)


In my another test i see async completed from 0.016 s till 0.174 s.

How can i disable/speed up tests_pri_0 ?
What is inside tests_pri_0?
Thanks.
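
A timing line like the one quoted above can be ranked mechanically. The following is a minimal sketch, assuming every entry has the shape "name: milliseconds (percent%)" as in that sample; the function names are made up for illustration and are not part of amavisd or SpamAssassin. Note that the percentages in the sample add up to well over 100%, which suggests some timers are nested inside others, so the ranking is only a rough guide. In this sample the two DNS-related timers come right after tests_pri_0, which may point at slow DNS lookups rather than rule matching; that is an inference from the numbers, not something stated in the thread.

```python
import re

# Sample line copied from the post above (amavis prefix removed).
TIMING = (
    "TIMING-SA total 759 ms - parse: 1.92 (0.3%), "
    "extract_message_metadata: 31 (4.1%), get_uri_detail_list: 6 (0.8%), "
    "tests_pri_-1000: 17 (2.2%), tests_pri_-950: 0.93 (0.1%), "
    "tests_pri_-900: 1.00 (0.1%), tests_pri_-400: 0.79 (0.1%), "
    "tests_pri_0: 668 (88.0%), check_dkim_signature: 9 (1.2%), "
    "check_spf: 364 (47.9%), poll_dns_idle: 338 (44.5%), "
    "tests_pri_500: 7 (1.0%), get_report: 0.99 (0.1%)"
)

# Assumed entry shape: "<name>: <milliseconds> (<percent>%)".
ENTRY = re.compile(r"([\w-]+): ([\d.]+) \(([\d.]+)%\)")


def parse_timing(line):
    """Return {phase_name: milliseconds} for every entry found in the line."""
    return {name: float(ms) for name, ms, _pct in ENTRY.findall(line)}


def slowest(line, count=3):
    """Return the `count` slowest phases as (name, ms), slowest first."""
    phases = parse_timing(line)
    ranked = sorted(phases.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]


if __name__ == "__main__":
    for name, ms in slowest(TIMING):
        print(f"{name:20s} {ms:8.2f} ms")
```
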




Re: discard high scoring spam?

2014-01-06 Thread Adam Moffett



SpamAssassin cannot drop a mail - it will always produce its output, and not
producing it is treated as an error.

If you call SpamAssassin directly from within Postfix, you can use a
header check to discard the message on a high score.

Yeah, I'm surprised how many people answered no to this question. See 
link:

http://wiki.apache.org/spamassassin/IntegratePostfixViaSpampd

That's not to say that it's wrong to use an additional milter e.g. 
Amavis.  You will likely get additional capabilities by doing so.
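
The decision a header check makes can be sketched in a few lines. This is only an illustration of the logic in Python, not Postfix configuration: it assumes the message already carries an X-Spam-Status header containing "score=<number>", and the threshold of 15 and both function names are hypothetical.

```python
import re

# Hypothetical cut-off chosen for illustration; pick your own.
DISCARD_AT = 15.0

# Matches the "score=4.603" or "score=-1.2" token in an X-Spam-Status value.
SCORE = re.compile(r"score=(-?\d+(?:\.\d+)?)")


def spam_score(header_value):
    """Extract the numeric score from an X-Spam-Status value, or None."""
    match = SCORE.search(header_value)
    return float(match.group(1)) if match else None


def should_discard(header_value, threshold=DISCARD_AT):
    """True only when a score is present and at or above the threshold."""
    score = spam_score(header_value)
    return score is not None and score >= threshold


if __name__ == "__main__":
    for value in ("Yes, score=22.1 required=5.0", "No, score=-1.2 required=5.0"):
        print(value, "->", "discard" if should_discard(value) else "keep")
```

A header without a parsable score is kept rather than discarded, so a formatting surprise fails safe.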




Re: Feedback on blacklist rule I plan to write

2013-10-30 Thread Adam Moffett


I'm reasonably sure that user@ip makes a valid address, but even if it 
is I don't think I've ever observed it anywhere.


I'm certain that the double @ format you mention is invalid unless one of 
the @'s is inside quotation marks or parentheses, e.g. 
"Ihave@inMyUsername"@somewhere.com or 
MyUsername(Ihave@inAComment)@somewhere.com.  If you're literally seeing 
user@domain@ip then I think you could safely reject that simply for 
having a malformed From: address.


I wouldn't hesitate to reject a message for either reason...but I may 
tend to shoot first and question later.
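
The address shapes discussed here can be told apart with a short check. This is a rough sketch, not a full address parser: it deliberately ignores quoted local parts and comments (so the two valid examples above would be misreported), it only understands IPv4, and the function name and category labels are made up for illustration. The bracketed form is the address-literal syntax from RFC 5321.

```python
import ipaddress


def classify_sender(address):
    """Classify an unquoted, comment-free address by the shape of its domain.

    Returns "malformed", "address-literal", "other-literal", "bare-ip"
    or "domain". Quoted local parts and comments are not handled.
    """
    if address.count("@") != 1:
        return "malformed"  # e.g. user@domain@ip
    local, domain = address.split("@")
    if not local or not domain:
        return "malformed"
    if domain.startswith("[") and domain.endswith("]"):
        try:
            ipaddress.IPv4Address(domain[1:-1])
        except ValueError:
            return "other-literal"  # IPv6 or general literals, not checked here
        return "address-literal"  # user@[1.2.3.4]
    try:
        ipaddress.IPv4Address(domain)
    except ValueError:
        return "domain"  # ordinary host name
    return "bare-ip"  # user@1.2.3.4


if __name__ == "__main__":
    for addr in ("user@1.2.3.4", "user@[1.2.3.4]",
                 "user@domain@1.2.3.4", "user@somewhere.com"):
        print(addr, "->", classify_sender(addr))
```
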




I have a brief question. I'll provide my setup though isn't applicable.

I'm using SpamAssassin version 3.3.1, on FreeBSD 8.1-RELEASE

I'm using Sendmail for my MTA

I'm using Procmail for my local

My question is: The SMTP protocol allows a return address to be
'u...@ip-address.com' and 'u...@ip-address.com' and some other variations,
I'll assume.

My background knows that nearly _all_ mail when transferred uses
'u...@domain-name.com' or  'u...@domain-name.com'

This AM I was researching an email (spam) that I received and the actual
(hard-core) email-header and noticed they're using something similar to:

user@some-domain@ip-address and it's getting through.

My _real_ question is:

Can't I simply blacklist all/any emails that arrive where they're using
'user@ip-address' - while that's a rhetorical question, (I know I can) but
I'm looking for feedback as to why this would not be a good idea. ANY
respectable/legitimate MTA uses their domain-name as the latter part of the
return address, correct?

Feedback is more what I'm looking for on my question versus an answer to
'can I?' do this.

I will not care if there is that small percentage of MTA's that are/do
legitimately send using the IP address method. (Another discussion,
perhaps?)

If this is logical, how would I enter that in my local.cf ??

' blacklist_from  @[0..255].[0..255] .[0..255] .[0..255]
(With/WithOUT quotes?)

...or is this REALLY a very bad idea?

Thanks so much for your assistance in advance.







Re: Feedback on blacklist rule I plan to write

2013-10-30 Thread Adam Moffett


Absolutely right, thanks for jogging my memory.

My reading of RFC5321 seems to indicate that user@1.2.3.4 is NOT a
valid address.  It should instead be written as user@[1.2.3.4]




Re: Feedback on blacklist rule I plan to write

2013-10-30 Thread Adam Moffett


On 10/30/2013 3:07 PM, RW wrote:

On Wed, 30 Oct 2013 20:13:40 +0100
Matus UHLAR - fantomas wrote:


On Wed, 30 Oct 2013 11:44:19 -0400
David F. Skoll wrote:


On Wed, 30 Oct 2013 11:06:35 -0500
Adam Moffett adamli...@plexicomm.net wrote:


I'm reasonably sure that user@ip makes a valid address, but even
if it is I don't think I've ever observed it anywhere.

My reading of RFC5321 seems to indicate that user@1.2.3.4 is NOT
a valid address.  It should instead be written as user@[1.2.3.4]

On 30.10.13 17:53, RW wrote:

If you are referring to 4.1.3. I would say it's defining a routing
mechanism rather than limiting what a valid address is.

4.1.2 defines it as part of mail address:

 address-literal  = "[" ( IPv4-address-literal /
                          IPv6-address-literal /
                          General-address-literal ) "]"

 Mailbox          = Local-part "@" ( Domain / address-literal )

But does anything actually say that 1.2.3.4 can't be treated as a
hostname?  Isn't the point of the [] to be a hint to the server that it
can treat the contents as an IP address and deliver to that address?  I
don't see anything obviously wrong with something like no-reply@1.2.3.4

Are we splitting hairs? Does it matter either way?  I think it's a safe 
assumption that none of my users are going to expect to receive an email 
to or from an address formatted that way, so scoring it higher would 
hurt no one.  It's also not certain that a spammer is sending it that 
way to begin with, so *not* scoring it high also probably hurts no one.


I do enjoy a good educational argument though.



Re: Dear Dfs (was Re: Question about rule: 2.0 DEAR_SOMETHING BODY: Contains 'Dear (something)')

2012-10-26 Thread Adam Moffett

Here's an argument for *not* making your email address fi...@example.com,
l...@example.com or something like that.

I believe an email to me starting Dear Dfs, has 100% probability of
being spam.  If my email address were da...@roaringpenguin.com
instead, I'd get a lot of FPs on Dear David,

So there you go... use obscure local parts in your email addresses. ;)

I couldn't agree more.  I've always used my name and my wife has always 
used a very long word from a fictional language.  She gets almost no 
spam even without filtering.  I think she's somewhat immune to her 
address being discovered by dictionary attacks.


spam in foreign characters

2012-08-21 Thread Adam Moffett
I have a user who seems to get 4-5 messages per day with Chinese 
characters for the subject and body.  They come from a variety of 
domains and IPs so I guess she somehow got onto a list used to spam 
Chinese-speaking people.


If I paste them into Google Translate they seem to be roughly the same 
kind of junk as our English spam: work from home, buy our drugs, 
etc.  The handful that I looked at closely had scores of 2.0-3.0.


Are there existing SpamAssassin rules that work on non-English 
characters?  Is there maybe something extra I should enable or install 
that would score these higher?


I'm sorry if it's an ignorant question, but the issue hasn't really come 
up here before.


Thanks.



Re: spam in foreign characters

2012-08-21 Thread Adam Moffett

Awesome, thanks for the tip.

Any guess how this affects messages with mixed character sets?  One of 
our users definitely emails with Chinese vendors.  I'm sure they 
correspond in English, but I'm guessing the Chinese folks might have 
Chinese characters in their signature line or some such.


Thanks.


SpamAssassin has an ok_locales thing that allows you to specify basically
languages you want to accept.  But it has problems:
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=4078

I don't believe anybody has created rules to match these kinds of spams.
A big part of the problem is lacking examples of non-English non-spam
to verify the rules don't hit them.

So, you should probably try using ok_locales, and if it doesn't work,
create your own rules to match these spams, if you can find good common
patterns that don't seem likely to match non-spams (or match all Chinese
email if that's what you want).  And please share what works.

ok_locales is defined in the Mail::SpamAssassin::Conf main page which can
also be found here:
http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Conf.html

Hmm, ok_locales may actually work on Chinese, I don't see examples of
problems with that language.

On 08/21, Adam Moffett wrote:

I have a user who seems to get 4-5 messages per day with Chinese
characters for the subject and body.  They come from a variety of
domains and IP's so I guess she somehow got onto a list used to spam
Chinese speaking people.

If I paste them into Google Translate they seem to be roughly the
same kind of junk as our English spam: work from home, buy our
drugs, etc.  The handful that I looked at closely had scores of
2.0-3.0.

Are there existing SpamAssassin rules that work on non english
characters?  Is there maybe something extra I should enable or
install that would score these higher?

I'm sorry if it's an ignorant question, but the issue hasn't really
come up here before.

Thanks.





Re: spam in foreign characters

2012-08-21 Thread Adam Moffett

I think I'd have to read Chinese to tackle that accurately.


So, you should probably try using ok_locales, and if it doesn't work,
create your own rules to match these spams, if you can find good common
patterns that don't seem likely to match non-spams (or match all Chinese
email if that's what you want).  And please share what works.




Re: new paradigm

2011-11-23 Thread Adam Moffett




An interesting idea.  Sort of a challenge and response with the onus 
on the recipient.  But I think this is handled by auto whitelist which 
SpamAssassin was one of the first to implement.


Regards,
KAM


I don't think AWL does what the original poster is describing, but 
implementation would be trivial in the MTA without spamassassin involved 
at all.


If the user expects to receive mail from a limited number of people like 
only their relatives (m...@myhome.com) then this actually might make sense 
for them, but if they expect to receive email from any random person who 
might be a potential customer (sa...@mybusiness.com) then they would 
have a problem with this.


I might try this or something like it for my own use.  I would simply 
tag as [spam] any message whose From:, Reply-To:, or envelope sender 
didn't match my whitelist.  Then I would populate the whitelist with the 
envelope recipient on any message sent by an authenticated user.   You 
could do the whole thing in the Exim config file without invoking 
spamassassin at all.  In fact I don't think it would be hard to keep a 
separate whitelist file for each user.  If I'm going to get a 
confirmation email or some such from some random address then I can look 
in my spam folder.  If I expect to get future emails from the same 
sender I'll just reply to their message.  It doesn't matter if it's a 
DoNotReply@ address because they'd still be added to the whitelist when 
I hit send.  The fact that they blackhole or bounce my reply won't 
affect anything.
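
The bookkeeping described above is small enough to sketch. The following is an illustration in Python rather than Exim configuration, and every name in it is hypothetical. One assumption to flag: the text says to tag a message whose From:, Reply-To:, or envelope sender "didn't match", and this sketch reads that strictly, so every sender address that is present must be on the user's list.

```python
class ReplyWhitelist:
    """Per-user whitelist populated from the recipients of outgoing mail."""

    def __init__(self):
        self._allowed = {}  # user -> set of lower-cased addresses

    def record_outgoing(self, user, recipients):
        """Call for each message sent by an authenticated user."""
        known = self._allowed.setdefault(user, set())
        known.update(addr.lower() for addr in recipients)

    def is_known(self, user, *sender_addrs):
        """True if every sender address that is present is whitelisted."""
        known = self._allowed.get(user, set())
        present = [addr.lower() for addr in sender_addrs if addr]
        return bool(present) and all(addr in known for addr in present)

    def tag_subject(self, user, subject, *sender_addrs):
        """Prefix the subject with [spam] unless the senders are known."""
        if self.is_known(user, *sender_addrs):
            return subject
        return "[spam] " + subject


if __name__ == "__main__":
    wl = ReplyWhitelist()
    # Replying to a DoNotReply@ address still whitelists it.
    wl.record_outgoing("adam", ["DoNotReply@shop.example"])
    print(wl.tag_subject("adam", "Order", "donotreply@shop.example"))
    print(wl.tag_subject("adam", "Hi", "stranger@elsewhere.example"))
```

Keeping one set per user mirrors the idea of a separate whitelist file for each user.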


I have worked at ISPs for the past 12 years and I 100% wholeheartedly 
agree with David Skoll's observations about the general mass of users, 
but I think there is still a subset of people who would benefit from 
doing it this way.




Re: new paradigm

2011-11-23 Thread Adam Moffett
Undoubtedly it is *easier*, just as I can easily eliminate all my spam 
by unplugging the ethernet cable.  Just keep in mind this method would 
only be useful for people who already know who they want to talk to.




The idea is as simple as: past days was easier to blacklist...nowdays
is easier to whitelist !




Re: new paradigm

2011-11-23 Thread Adam Moffett

On 11/23/2011 02:22 PM, Christian Grunfeld wrote:

Undoubtedly it is *easier*, just as I can easily eliminate all my spam by
unplugging the ethernet cable.  Just keep in mind this method would only be
useful for people who already know who they want to talk to.

And that is the big % of what people do or want to do ! most people
wants to comunicate with who they want to talk to !
I think you are defining people in the wrong way. Do not assume by
default that people want spam !



If you described your idea to a bunch of average internet users and 
surveyed them about it, you'd find that a big % of them probably think 
they agree with you.  When you go to implement it for real you'll 
find that the percentage of users who actually stick with the idea in 
the long run will get smaller and smaller.  It's hard for an idea that 
requires users to change their behavior to gain traction and keep it.


Please prove me wrong.  I would be super happy if you're right because 
it would make my job tremendously simpler.


I do like the idea, and I would try it for my own personal email because 
it would be easy to do and there truly aren't many people I want to talk 
to once I get home from the office and I think I'd be done whitelisting 
people by the second day.


Re: pilot error? or idiots at microsoft?

2011-08-10 Thread Adam Moffett
AFAIK, 169.254/16 is the link-local autoconfiguration range (RFC 3927) 
that hosts fall back to on networks that don't have a DHCP server.


That said, I have seen people use it for other internal purposes and it 
isn't usually an issue.
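
For anyone who wants to check which category an address falls into, Python's standard ipaddress module already knows the reserved ranges. A minimal sketch, using the addresses from the Received headers quoted in this thread; the describe function is made up for illustration and has nothing to do with how SpamAssassin or amavisd classify networks.

```python
import ipaddress


def describe(ip_text):
    """Label an address as loopback, link-local, private, or public."""
    ip = ipaddress.ip_address(ip_text)
    if ip.is_loopback:
        return "loopback"
    if ip.is_link_local:  # 169.254.0.0/16 for IPv4
        return "link-local"
    if ip.is_private:  # includes the RFC 1918 ranges
        return "private"
    return "public"


if __name__ == "__main__":
    for addr in ("127.0.0.1", "172.20.128.25", "169.254.1.69", "169.254.2.63"):
        print(addr, "->", describe(addr))
```
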


so, what brain decided it would be ok to use 169.* addresses for their 
internal ip's?


was it microsoft? (var says that ms uses these for their internal 
clustering ip's for clustered exchange servers)


so, either ms is really being stupid, or the var has something set up 
wrong.


and.. guess what,  SA doesn't know that 169* addresses are 'internal'

here is a outbound email (note: yes, this is amavisd, so, if you 
reply, trim your cc to the group you subscribe to, thanks).


but our 'outbound' policy maps required a 9+ before its marked spam, 
so, amavisd doesn't know this is outbound email. based on these silly 
169.254.* ip's..


so, anyone ever heard of something so stupid?

x-spam-status:Yes, score=4.603 tag=-999 tag2=4 kill=4 
tests=[APOSTROPHE_FROM=0.545, BAYES_40=-0.001, DCC_REPUT_00_12=-0.4, 
HTML_MESSAGE=0.001, LOCAL_1UB_FORGED=2, RDNS_NONE=0.793, 
SARE_GIF_ATTACH=1.42, SPF_SOFTFAIL=0.665, ST_CREDIT_FOR_TWO=-1.42, 
ST_INLINE_IMAGE=1] autolearn=no


received:from spamtrap2.client.local ([127.0.0.1]) by 
spamtrap2.client.local (spamtrap2.client.local [127.0.0.1]) 
(SpammerTrap(r) SME-500, port 10024) with LMTP id QxTwPcYqMh-9 for 
u...@example.com; Wed, 10 Aug 2011 09:57:53 -0400 (EDT)


received:from MBX2.client.local (unknown [172.20.128.25]) (using TLSv1 
with cipher AES128-SHA (128/128 bits)) (No client certificate 
requested) by spamtrap2.client.local (Postfix) with ESMTPS id 
6773561C0F5 for u...@example.com; Wed, 10 Aug 2011 09:57:53 -0400 (EDT)


received:from MBX1.client.local ([169.254.1.69]) by MBX2.client.local 
([169.254.2.63]) with mapi id 14.01.0289.001; Wed, 10 Aug 2011 
09:57:51 -0400

--
Michael Scheidell, CTO
o: 561-999-5000
d: 561-948-2259
SECNAP Network Security Corporation








Re: Bayes Apache James server

2011-07-29 Thread Adam Moffett

On 07/29/2011 02:13 PM, Kelson Vibber wrote:

  Also, to complete the system, I recall there were some AV-mailets at the age. 
If possible use  them before SA to catch message carrying viruses.

Absolutely - we've got ClamAV running first, before anything touches SA, and 
using some of the SaneSecurity signature sets to catch additional malware.

I've often mused about which should run first, but never did any sort of 
testing.  Is it pretty much the general consensus that it's less 
wasteful for the AV to scan the spam than to have SA scan the malware?





Re: Nearly 200.000 Spams today from coolserver.info and starsweet.info

2011-06-16 Thread Adam Moffett

That's interesting.

I'm pretty sure one of my users was getting those same emails.  One user 
out of several thousand, but she was getting hundreds of messages per day.


They were coming from different IPs, but they were all in the same /23:
Inmotion, Inc. INMOTION-173-245-203-0-23 (NET-173-245-204-0-1) 
173.245.204.0 - 173.245.205.255


I blocked that /23 in our MTA.  I don't know if Inmotion Inc got its 
address pool stolen or if their workstations are infected by spambots or 
if they themselves are a spammer...but I also don't really care.  
They're dead to me now.
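
For reference, a /23 is 512 addresses, and membership is easy to verify before committing a block to the MTA. A small sketch with Python's ipaddress module, using the range quoted above; the helper name is hypothetical and the MTA's own access-list syntax is not shown.

```python
import ipaddress

# The range quoted above: 173.245.204.0 - 173.245.205.255.
BLOCKED = ipaddress.ip_network("173.245.204.0/23")


def is_blocked(ip_text, network=BLOCKED):
    """True if the address falls inside the blocked network."""
    return ipaddress.ip_address(ip_text) in network


if __name__ == "__main__":
    print(BLOCKED.num_addresses, "addresses from", BLOCKED[0], "to", BLOCKED[-1])
    for addr in ("173.245.205.17", "173.245.206.0"):
        print(addr, "->", "blocked" if is_blocked(addr) else "allowed")
```
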



Hello *,

since some days my servers are hit by 50.000-80.000 Spams a day and for
some minutes they have spamed today 18 accounts out of 98.000 with MORE then
100.000 spams.

All spams coming from the same network:

  xxx.root.static.coolserver.info
  xxx.root.static.starsweet.info

where xxx change every time and the servers IP too  (they resolv)

In the body of the messages I found those domains:

advocatebuying.info     aidpurchase.info        encouragebuying.info
ensurepurchase.info     guidebuying.info        motivatebuying.info
providebuying.info      purchaseadvocate.info   purchaseaid.info
purchaseassist.info     purchasecoach.info      purchaseguide.info
purchasesimplify.info   purchasesupport.info    simplifybuying.info
supportbuying.info      techsweet.info          topsweet.info
tradesweet.info         travelsweet.info        videosweet.info
visionsweet.info        volunteerbuying.info    websweet.info
yousweet.info

maybe there are some more, but these are those which I was able to grep.
However, I have tried to train Spamassassin but it give only a score  of
2-4.

Does someone know more about this crap?

Thanks, Greetings and nice Day/Evening
 Michelle Konzack





Re: sa-updates

2011-03-10 Thread Adam Moffett


Discussion on the dev list points to a lack of sufficient ham in the 
corpus which is necessary to generate score updates and publish new 
rules.  There was a recent drive for new submitters, but I'm still 
trying to figure out how I can rearrange my configuration in order to 
help.


http://wiki.apache.org/spamassassin/NightlyMassCheck




Interesting.  I'd be happy to contribute, but we bounce or outright 
delete high scoring spam.

After Reading these wiki articles:
http://wiki.apache.org/spamassassin/HandClassifiedCorpora
http://wiki.apache.org/spamassassin/CorpusCleaning
I get the impression that they want a representative sample of your 
spam, and I will skew things in a bad way if I only submit the spam that 
spamassassin already scored low.


What if I submit only ham?


Re: low score for ($1.5Million)

2011-03-04 Thread Adam Moffett



while the OP uses  OP means ?

Original Poster.


Re: Should Emails Have An Expiration Date

2011-03-02 Thread Adam Moffett

I think this entire thread should be expired.

It's already specified in RFCs 1327 and 2076 and no MUA supports it 
because it's a dumb idea.
I can't believe 3 days later it's still being talked about like it's a 
serious thing.


Regarding the side discussion on copyrights:  If anybody went into a 
court room and argued to the judge that they were compelled to delete an 
email by copyright law due to an expires: header they would be laughed 
out of the room.




Re: RFC-Ignorant (was Re: Irony)

2011-02-03 Thread Adam Moffett



That's good.  The only useful list (BogusMX) can be discovered without
querying rfc-ignorant anyway.  Just get the MX records for the sending
domain (which are almost certainly in cache) and make sure they resolve
to real IP addresses.

We reject domains that publish MX records in 127/8 or the RFC 1918
networks.  Out of 3.7 million recent messages, we have rejected just
over 26,000 for this reason.  There may be FPs, but no-one has
complained and anyone who publishes such an MX record IMO deserves
to be banned.

Regards,

David.
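
The check described above reduces to a range test once the MX hosts have been resolved. A minimal sketch, assuming the caller has already done the DNS lookups and passes in the resulting IPv4 addresses; the function names are made up, and rejecting when any one MX is bogus (rather than all of them) is an assumption, since the post does not say which policy is used.

```python
import ipaddress

# 127/8 plus the three RFC 1918 networks.
BOGUS_NETS = [
    ipaddress.ip_network(net)
    for net in ("127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def bogus_mx_addresses(mx_ips):
    """Return the already-resolved MX addresses that fall in a bogus range."""
    return [
        ip for ip in mx_ips
        if any(ipaddress.ip_address(ip) in net for net in BOGUS_NETS)
    ]


def should_reject(mx_ips):
    """Assumed policy: reject if any published MX resolves to a bogus address."""
    return bool(bogus_mx_addresses(mx_ips))


if __name__ == "__main__":
    print(should_reject(["198.51.100.7", "192.168.1.10"]))
    print(should_reject(["198.51.100.7"]))
```

Under the "any" policy, a domain that publishes a fake secondary MX in one of these ranges would be rejected even if its primary MX is fine.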


That's an interesting point of view.  It was suggested on this list 
fairly recently to publish a fake secondary MX as a way to reduce spam.  
The stated reason being that some spamming software hits the backup MX 
first and if that doesn't work will give up without trying any others.


I realize that can be done without using a 127 or RFC 1918 address, but 
some people are doing it that way.


Out of curiosity, did you start blocking those because you saw that as a 
pattern in spam email or is it more a matter of principle?




Re: List Policy Question: Why no reply-to: header?

2011-01-31 Thread Adam Moffett



no thunderbird need a plugin to do this

However, the original poster (adfam moffett) uses thunderbird 3.1.7 too, so
he can have the list-reply function


Yes I have a reply list button, but this is the only list I'm on where 
I have to use it.  I have gotten into the habit of just hitting 
reply.  So I sometimes accidentally reply to the poster instead of the 
list.


It's annoying to me, but it sounds like it's an issue like abortion 
where everyone's made up their mind already and there isn't much point 
arguing about it.


People can believe I'm dumb for thinking that adding/modifying the 
reply-to: header is a simpler and cleaner solution, and I can believe 
people are dumb for thinking there should be multiple reply buttons in 
MUA's.  I suppose we'll just have to each agree that the other party is 
dumb.




List Policy Question: Why no reply-to: header?

2011-01-28 Thread Adam Moffett

I looked at a few messages and didn't see any reply-to: header.

When I click reply on someone's message here I am replying to them 
only since they're obviously the sender.


Is there any particular reason there can't be a reply-to: header added 
by the listserv?





Re: Understanding TrustPath

2011-01-11 Thread Adam Moffett

On 01/11/2011 03:24 PM, Jari Fredriksson wrote:

On 11.1.2011 21:24, Mauricio Tavares wrote:

Am I correct? What would stop someone from trying to fake the
originating IP to fit the ones in the above list?

If I am not mistaken, the IP protocol and SMTP. Someone might fake the
address when sending to you MTA, but your MTA's response would go to
wrong address, the fake one. There would be no session and talk between
the hosts to create a SPAM


Right, it's kind of difficult to fake your source IP in a TCP session.  
But if I read the manual correctly the whitelist_from_rcvd that he's 
asking about does lookups on hosts in the Received: headers in 
the message...which would be trivial to fake.




Re: Single dot PTR

2010-12-29 Thread Adam Moffett
The PTR is set by the ISP, not the spammer.  My guess would be that the 
period for a PTR would be a policy of a particular network operator or 
group of operators.  So matching it in SpamAssassin would be scoring 
messages on the ISP they came from rather than their spamminess.






I'm starting to see a (new to me) pattern of spam, and only spam, with 
PTR records consisting of a single dot, such as:


Received: from ejru38.pindmosel.info (. [184.154.78.38] (may be forged))

It doesn't appear that there is a stock rule yet to identify this 
particular case.  RDNS_NONE matches, but I believe a more specific 
rule may be in order, or maybe even something at the MTA level if this 
pattern proves reliable.  Has anyone else identified this pattern in 
their mail flow?






Re: Does SpamAssassin perform tests/scans on attachments?

2010-07-21 Thread Adam Moffett
I've seen people post in the past that SA will demime text attachments, 
and now someone says it won't.


What's the real story?





It doesn't.  At least, not like what you are thinking.

As you know an encoded attachment is a series of lines like:

XXHUBKJVHLSJFWSJNDL:SANFKJHSBFSLJRWKSBF
DSKJNBFSHNF:LSJFLKSNFLKJSBFLK:SNFLKSNFS
FJSHBFLKSHNFLKNSFL:SF:LSNFLKSNFLK:SNFL:
KFSLKHFDSHNFKDNFLDKNFLKDNFLKJHDBIAVFBUB

SA scans that.  Of course, there is nothing there that matches
anything. 






Re: Does SpamAssassin perform tests/scans on attachments?

2010-07-21 Thread Adam Moffett

On 7/21/2010 12:45 PM, Karsten Bräckelmann wrote:

On Wed, 2010-07-21 at 12:25 -0400, Adam Moffett wrote:
   

I've seen people post in the past that SA will demime text attachments,
and now someone says it won't.
 

Ted was answering a question about binary attachments, not text.

   

What's the real story?
 

It depends. On the rule definition, as explained in the M::SA::Conf
docs.

'body' rules are applied against the textual parts [1], decoded from
Quoted Printable or Base 64 if necessary, rendered and normalized.

'rawbody' is like the above, but without rendering and normalization.
That means, HTML tags and whitespace is preserved as-is.

'full' rules are applied against the raw, pristine, verbatim original
message, as fed to SA.


[1] Textual parts depends on the MIME type, not content.
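
The difference between what 'full' rules and 'body' rules get to see can be illustrated outside SpamAssassin. The sketch below uses Python's standard email package as a stand-in for SpamAssassin's own parser, which is an assumption made purely for illustration; it also skips the rendering and normalization step mentioned above. The raw message only contains Base 64 text, while the decoded textual part contains the readable phrase a rule would match.

```python
import base64
from email import message_from_string

# A tiny message whose text/plain body is Base 64 encoded.
RAW = (
    "From: sender@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + base64.b64encode(b"Work from home, buy our drugs").decode("ascii")
    + "\n"
)


def decoded_text_parts(raw_message):
    """Return the decoded text of every part whose MIME type is text/*."""
    msg = message_from_string(raw_message)
    texts = []
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True)  # undoes Base 64 / QP
            charset = part.get_content_charset() or "utf-8"
            texts.append(payload.decode(charset, errors="replace"))
    return texts


if __name__ == "__main__":
    print("phrase in raw message:", "buy our drugs" in RAW)
    print("decoded parts:", decoded_text_parts(RAW))
```
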

   

So the answer is all of the above :) Thanks.


Flagged as spam but accepted

2010-07-08 Thread Adam Moffett
2010-07-08 09:05:01 1OWqmi-0005N3-JU SA: Action: flagged as Spam but 
accepted: score=4.0 required=4.0 (scanned in 0/0 secs | Message-Id: 
20100708130436.52c7d1cb1...@mail.microton.com.br). From 
care...@habitat.com (host=NULL [189.26.124.122]) for a...@plexicomm.net


The above is a line from my Exim log file.  Does anyone know under what 
conditions a message can be flagged as spam but accepted?


Thanks,
Adam



Re: Basic Setup Questions

2010-06-28 Thread Adam Moffett



My default config does not appear to be using bayes. How do I enable
it?
 

use_bayes and bayes_auto_learn are on by default.
   


I think using the packages on a Ubuntu system they'll default to off.  
There could be others that do that.



The documentation simply says run sa-learn. Does the creation of
the bayes db files effectively enable bayes?
 

No. You also need to teach enough ham and spam tokens to Bayes. By
default, you should train bayes with at least 200 ham messages and 200 spam
messages. At that point, you should start seeing bayes scoring your
messages.
   
I actually relied exclusively on auto learning for a while.  Mostly 
because I didn't know how to do the manual training.  Bayes does seem to 
have a positive effect without manual training although I know it's 
recommended to supplement the auto learning with manual training for 
better accuracy.




Re: TMPDIR as a tmpfs

2010-06-22 Thread Adam Moffett
I don't know if it is safe.  I suspect it will function normally, but I 
think you'd be in danger of losing a few messages on an unexpected reboot.


I had a very dramatic performance improvement by switching bayes and awl 
databases to MySQL instead of the default BerkeleyDB.  It costs more 
RAM, CPU, and disk space, but scan times reduced dramatically.  I'm 
certain we were I/O bound before this change because we had plenty of 
RAM and CPU available.




It is safe to use spamassassin tmpdir on a tmpfs mounted system ?

And if its safe it would have a better performance ?

Here where i work we have big problems with the hard drives, because 
we basically are sharing virtual machines disk over nfs. and 
spamassasin is a virtual machine.


Any other tips for better performance ?




[]'sf.rique




Worthwhile to scan outgoing?

2010-06-21 Thread Adam Moffett
My philosophy in the past has always been not to scan outgoing emails 
because my users are not likely to be spamming.


However, a couple of issues recently with spambots and SMTP AUTH with 
weak passwords has me reconsidering that stance.


Is anyone here currently scanning their outgoing mail with SA?  Good 
results?  Bad results?




A few questions

2010-06-10 Thread Adam Moffett
These issues came up when I was trying to address performance problems, 
I hope they aren't major RTFM items.


1) I used sa-compile as suggested by the FAQ and the CPU load dropped 
*dramatically*.  The question is do I have to run that every time I 
sa-update or will it happen automatically?


2) I disabled the auto whitelist module, and got scan times down from 
200+ secs to ~40 secs.  The AWL db file was over 2.5 GB.  The FAQ 
implies that I don't really need AWL; is this the general consensus?  If 
I keep using it, is there an easy automatic way to prune the AWL db of 
old or seldom-used entries?


3) I disabled Bayes and now scan times are down to 1 or 2 secs.  That's 
great, but I think bayes really helps so I'd rather keep it.  The 
bayes_toks db is 162MB...that seems like a pretty big db to scan for 
every message.  I know it does auto expire because I have a multitude of 
bayes_toks.expire files ranging from 40-80MB in size.  Can I tune what 
gets expired to reduce the size of the db?  Is there another solution?  
We are definitely I/O bound when bayes is enabled because we have long 
scan times but CPU usage stays in the 8-10% range.


-Adam