Re: Migrating bayes to mysql fails with parsing errors

2011-06-23 Thread Yet Another Ninja

On 2011-06-23 18:40, Dave Wreski wrote:

Hi,


since so many have problems I share my mysql schemas :=)
`token` binary(5) NOT NULL,


Yes, the binary or varbinary is the key to a solution here.
Mucking with utf-8 vs latin-1 is just papering over, not solving,
the most glaring problem here, namely that a token must not be
associated with any character set, as it does not obey any
such rules, nor should it be treated case-insensitively
(as char is, which is possibly a reason for more than two
record changes as reported by Dave). Will take a closer look...
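
For illustration only, the relevant part of a bayes_token definition with a binary token column might look roughly like this (column list abridged from the stock MySQL schema; untested):

CREATE TABLE bayes_token (
  id         int(11)   NOT NULL default '0',
  token      binary(5) NOT NULL,            -- raw bytes: no charset or collation applied
  spam_count int(11)   NOT NULL default '0',
  ham_count  int(11)   NOT NULL default '0',
  atime      int(11)   NOT NULL default '0',
  PRIMARY KEY (id, token)
) ENGINE=InnoDB;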


I changed the Type=MyISAM at the end of each CREATE statement in the
original schema and replaced it with the following from Benny's schema:

ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

It's now working, but is excruciatingly slow. Is this also just covering
the problem, or will this be a usable solution when it finally finishes?


Just being curious: are you using

bayes_store_module  Mail::SpamAssassin::BayesStore::MySQL
or
bayes_store_module   Mail::SpamAssassin::BayesStore::SQL

the latter is VERY slow with MySQL
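
For reference, the matching local.cf bits for the MySQL store look roughly like this (DSN, username and password are placeholders):

bayes_store_module  Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn       DBI:mysql:sa_bayes:localhost
bayes_sql_username  sa_user
bayes_sql_password  some_password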

Axb



Re: High Performance Bayes Database Configuration?

2011-06-21 Thread Yet Another Ninja

On 2011-06-21 16:30, Marc Perkel wrote:



On 6/21/2011 7:23 AM, David F. Skoll wrote:

On Tue, 21 Jun 2011 07:06:11 -0700
Marc Perkel <supp...@junkemailfilter.com> wrote:


Trying to get MySQL bayes working in a high volume environment.
Dedicated MySQL server with SSD drives. Can someone send me a sample
my.cnf file and make other suggestions to keep it running without
database corruption and other MySQL features? Or - should I be
using some other DB?

We've tried various ways of storing Bayes data (we have our own Bayes
implementation, so this discussion may not correspond exactly with the
SA implementation.) After trying Berkeley DB files and PostgreSQL---we
would never use MySQL for any data we care about---we finally settled
on Dan Bernstein's CDB format. It has by far the best performance.
See: http://www.dmo.ca/blog/benchmarking-hash-databases-on-large-data/
Take a look at the Random Reads timings. CDB is 6 times faster than
Berkeley DB!

CDB is read-only, which means when you want to do Bayes training, you
have to rewrite the entire database. This is not an issue for our
system because of how we do Bayes training, but it may be an issue
with the standard sa-learn.

Thanks David but I need real time updating and it's spread across
multiple servers. So need PostgreSQL or MySQL.


I settled on per-server SDBM.
Under high traffic, MySQL produced too much lag - no matter how fast the 
DB server was.


Re: SA filters lists

2011-06-16 Thread Yet Another Ninja

On 2011-06-16 9:44, Cédric Jeanneret wrote:

Hello,

I just read that SARE shouldn't be used anymore[1] (not maintained
anymore, and many false positives reported). Is that true?


Yes.. 100% true.

If so,  which list can you suggest? For now, I don't have any problem 
with FPs,

but...


the usable SARE rules were incorporated into SA
the rest was trashed.

Look up SOUGHT rules usage in the SA wiki.





Re: Sought rules

2011-06-11 Thread Yet Another Ninja

On 2011-06-11 3:38, Warren Togami Jr. wrote:

On 6/10/2011 3:34 PM, John Hardin wrote:

On Fri, 10 Jun 2011, Lawrence @ Rogers wrote:


On 10/06/2011 10:24 PM, Warren Togami Jr. wrote:

On 6/10/2011 2:01 PM, Karsten Bräckelmann wrote:
IFF you use the sought channel with SA 3.3.x, you will need the reorder
hack to bend the alphabet.

It is not entirely clear to me, what exactly are you supposed to rename
for the reorder hack? You have to do it every time you sa-update?


Would renaming 20_sought_fraud.cf to 99_sought_fraud.cf, putting
20_sought_fraud.cf (from the yerp.org channel) after 72_active.cf (the
default and assumed older SA rules) solve this problem?


Or symlinks from your local configs directory to the SOUGHT channel
directory files. That would probably be easier to not forget about when
things get fixed.



Is Lawrence's suggestion something we can do upstream to fix this problem?

Alternatively, I think it is a mistake for us to ship SOUGHT rules at
all in the standard sa-update channel. That is, unless we plan on
updating the patterns and scores of SOUGHT on a daily basis. I highly
doubt we will do that.


Agreed, I'm +1 on removing SOUGHT rules from the mainstream sa-update
channel and keeping them separate.





Re: Rule to match X-Spam-Flag

2011-06-09 Thread Yet Another Ninja

On 2011-06-09 11:46, Mark Martinec wrote:

Sandro,


I find a lot of spam that has already passed other spam-filters with
spamassassin better tuned than mine and already has an X-Spam-Flag set to YES.

I tried to add a rule to match that case:

   header CUSTOM_X_SPAM_FLAG X-Spam-Flag =~ /\bYES\b/i
   score CUSTOM_X_SPAM_FLAG 5

But spamassassin -t < /tmp/spam does not show any hit on that rule.
Moreover using flag -D I don't see it being called. I set it in
/etc/spamassassin/local.cf

Is it at all possible to match on that header?


It is an unfortunate consequence of M::S::PerMsgStatus::check()
removing any 'x-spam-*' header fields _before_ performing any checks.
It would probably make more sense to do so after checks but before
collecting a report or a rewritten message. I'm just not sure what
other code or rules depend on this, so fixing your case might
break something else (or may not, needs investigating).
You may open a problem report.

As a workaround, you may add some header rewrite rule to your MTA
which could rewrite an X-Spam-Flag header to something else, like X-X-Spam-Flag.


or if you want to be rather radical, reject at MTA level with a header 
check.
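
A rough, untested sketch of both variants for Postfix header_checks (assumes a pcre map; path, wording and the choice to reject are yours):

# main.cf:  header_checks = pcre:/etc/postfix/header_checks
/^X-Spam-Flag:(.*)/     REPLACE X-X-Spam-Flag:$1
# ...or, the radical variant:
# /^X-Spam-Flag:\s*YES/ REJECT we do not accept pre-tagged spam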


Motto: Dear Sender: if you pre-tag your mail as spam, keep it




OT: Haraka - plugin capable SMTP server

2011-06-03 Thread Yet Another Ninja

for those looking for new tools for their arsenal..
take a look at the relatively new Haraka.

running it as a proxy, I'm impressed.

enjoy!

https://github.com/baudehlo/Haraka

Haraka - a Node.js Mail Server

Haraka is a plugin capable SMTP server. It uses a highly scalable event 
model to be able to cope with thousands of concurrent connections. 
Plugins are written in Javascript using Node.js, and as such perform 
extremely quickly.


Haraka can be used either as an inbound SMTP server, and is designed 
with good anti-spam protections in mind (see the plugins directory), or 
it can be used as an outbound mail server (run it on port 587 with an 
auth plugin to authenticate your users). Or of course it can function 
as both.


What Haraka doesn't do is fully replace your mail system (yet). It 
currently has no built-in facilities for mapping email addresses to user 
accounts and delivering them to said accounts. For that we expect you to 
keep something like postfix, exim or any other user-based mail system, 
and have Haraka deliver mail to those systems for that mapping. However 
nothing is stopping someone writing a plugin which replicates that 
facility - it just has yet to be done.


Haraka does have a scalable outbound mail delivery engine in the deliver 
plugin, which should work well for most sites.

Why Use Haraka?

Haraka's primary purpose is to provide you with a much easier to extend 
mail server than most available SMTP servers out there such as Postfix, 
Exim or Microsoft Exchange, yet while still running those systems for 
their excellent ability to deliver mail to users.


The plugin system makes it trivial to code new features. A typical 
example might be to provide qmail-like extended addresses to an Exchange 
system, whereby you could receive mail as user-anywordsh...@example.com, 
and yet still have it correctly routed to u...@domain.com. This is a few 
lines of code in Haraka, or maybe someone has already written this plugin.


Plugins are already provided for running mail through SpamAssassin, 
checking for known bad HELO patterns, checking DNS Blocklists, and 
watching for violators of the SMTP protocol via the early_talker plugin.


Re: No imageinfo.pm score

2011-06-01 Thread Yet Another Ninja

On 2011-06-01 12:42, Barry Kwok wrote:

I just found out that there is no ImageInfo plugin score in one of my
server. Spamassassin debug show:

Jun  1 17:46:50.229 [29338] dbg: config: fixed relative path: /var/lib/spamassassin/3.003001/updates_spamassassin_org/20_imageinfo.cf
Jun  1 17:46:50.229 [29338] dbg: config: using /var/lib/spamassassin/3.003001/updates_spamassassin_org/20_imageinfo.cf for included file
Jun  1 17:46:50.229 [29338] dbg: config: read file /var/lib/spamassassin/3.003001/updates_spamassassin_org/20_imageinfo.cf
[...]
Jun  1 17:47:08.197 [29338] dbg: imageinfo: image ratio=0.000254277605525206, min=0.000 max=0.008
Jun  1 17:47:08.197 [29338] dbg: rules: ran eval rule __DC_IMG_TEXT_RATIO ======> got hit (1)



But there is no DC_IMG_TEXT_RATIO score.

I test the same spam in other server and have DC_IMG_TEXT_RATIO score.
I compare the debug log and can't see the difference.


__DC_IMG_TEXT_RATIO is used by the meta DC_IMAGE_SPAM_TEXT

By default there is no DC_IMG_TEXT_RATIO score unless you've set one
manually.
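
If you do want it to score, one untested way (rule name and score made up) is to promote the sub-rule via a meta in local.cf:

meta     LOCAL_DC_IMG_TEXT_RATIO  __DC_IMG_TEXT_RATIO
describe LOCAL_DC_IMG_TEXT_RATIO  Image-to-text ratio in the suspicious range
score    LOCAL_DC_IMG_TEXT_RATIO  0.5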


Pls paste the sample message on pastebin.com and let others compare results


Re: FW: Mit unseren Tabs kannst Du viel mehr im Bett

2011-05-31 Thread Yet Another Ninja

put ow.ly in a URI rule
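
For example (untested sketch; rule name and score made up):

uri      LOCAL_URI_OWLY  /^https?:\/\/(?:www\.)?ow\.ly\//i
describe LOCAL_URI_OWLY  Contains an ow.ly shortened link
score    LOCAL_URI_OWLY  2.0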



On 2011-05-31 8:58, Lars Jørgensen wrote:

Hi,

We don't get much spam through the spamassassin filter, but we do get a bit of 
German spam which only seems to trigger RCVD_IN_XBL and thus not get a high 
enough score to be discarded. I have included a sample below (hoping that I 
don't offend anyone and that it'll get through people's spam filters). Has 
anybody seen these and know of a good rule to catch them?


Lars

From: Brenton Hanna [mailto:currycomb...@polysto.com]
Sent: Tuesday, May 31, 2011 7:33 AM
Subject: Mit unseren Tabs kannst Du viel mehr im Bett

Mit unseren Tabs kannst Du viel mehr im Betthttp://ow.ly/4OsQ9

Das geht wirklich jetzt und hier. Nur legale Plilen jetzt und hier bestlelen. 
Sparen ohne Ende.
Jetzt ist die Zeit, um bei Kauf der Plilen richtig zu sparen. Jetzt 
ausprobieren, sehr schnelle Lieferung.



If you would, however, prefer not to receive these mailings in the future, you can 
unsubscribe herehttp://ow.ly/4OsQ9  or update your email preferences.








Re: RCVD_IN_SORBS_DUL on my own emails to self

2011-04-05 Thread Yet Another Ninja

On 2011-04-05 12:08, rstarkov wrote:


Like so many people, I get a dynamic IP from my ISP. Right now, any emails I
send to myself show up as RCVD_IN_SORBS_DUL. Somehow I thought that as
long as my SMTP server isn't blacklisted, something like this wouldn't
happen.

The exact message is: RCVD_IN_SORBS_DUL RBL: SORBS: sent directly from
dynamic IP address
*  [82.6.105.32 listed in dnsbl.sorbs.net]

So what should I do now, just keep resetting my modem until I get an IP that
isn't blacklisted? Doesn't that make this rule completely useless, blocking
email from a lot of legitimate users?


your IP isn't blacklisted. It's listed as a DUL, which is correct:


-  32.105.6.82.in-addr.arpa.
type = PTR, class = 1, ttl = 604800, dlen = 50
host = cpc1-cmbg8-0-0-cust287.5-4.cable.virginmedia.co

That IP is also listed in Spamhaus's PBL:

http://www.spamhaus.org/query/bl?ip=82.6.105.32

which is also correct.

You'll have to use a smarthost or get a static/business connection.


Re: mail spam not catched

2011-04-05 Thread Yet Another Ninja

On 2011-04-05 17:44, Salvatore wrote:

To stop this spam, must I modify my spamassassin configuration?
What steps can I take to resolve my problem?
Thanks and sorry for my banal question.
Thanks in advance.


help us help you and post the sample in http://pastebin.com




Re: autolearn=ham was wrong, howto retrain ?

2011-04-04 Thread Yet Another Ninja

On 2011-04-04 9:54, Andreas Schulze wrote:

Hello

Im using spamassassin inside amavisd-new to filter mails.

Today I noticed a mail with these headers:
X-Spam-Flag: NO
X-Spam-Score: -0.007
X-Spam-Level:
X-Spam-Status: No, score=-0.007 tagged_above=-999 required=5
 tests=[HTML_IMAGE_ONLY_32=0.001, HTML_MESSAGE=0.001, MTX_NONE=0.001,
 T_RP_MATCHES_RCVD=-0.01] autolearn=ham
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on andreasschulze.de

How can I tell SA this was spam ?
I would try: sa-learn --spam <messagefile>

But does this let SA really forget the previous state ham ?




http://spamassassin.apache.org/full/3.3.x/doc/sa-learn.txt

--forget  Forget a message
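
In practice that would be something like (message path is a placeholder):

sa-learn --forget /path/to/message
sa-learn --spam   /path/to/message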


Re: The one year anniversary of the Spamhaus DBL brings a new zone

2011-03-08 Thread Yet Another Ninja

On 2011-03-08 21:24, dar...@chaosreigns.com wrote:

Looks like that would be something like this?

urirhssub URIBL_DBL_REDIRECTOR  dbl.spamhaus.org.  A  127.0.1.3
body      URIBL_DBL_REDIRECTOR  eval:check_uridnsbl('URIBL_DBL_REDIRECTOR')
describe  URIBL_DBL_REDIRECTOR  Contains a URL listed in the DBL as a spammed redirector domain
tflags    URIBL_DBL_REDIRECTOR  net domains_only
score     URIBL_DBL_REDIRECTOR  0.1


Anybody know of a domain that hits this?



tried to post a list of the domains but Apache's infra rejected it with:

Delivery to the following recipient failed permanently:

 users@spamassassin.apache.org

Technical details of permanent failure:
Google tried to deliver your message, but it was rejected by the 
recipient domain. We recommend contacting the other email provider for 
further information about the cause of this error. The error that the 
other server returned was: 552 552 spam score (13.3) exceeded threshold 
(FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,SPF_PASS,T_SURBL_MULTI1,T_SURBL_MULTI2,T_TO_NO_BRKTS_FREEMAIL,T_URIBL_BLACK_OVERLAP,URIBL_AB_SURBL,URIBL_BLACK,URIBL_JP_SURBL,URIBL_PH_SURBL,URIBL_WS_SURBL 
(state 18).



pretty amazing...


Re: The one year anniversary of the Spamhaus DBL brings a new zone

2011-03-08 Thread Yet Another Ninja

On 2011-03-08 20:58, Bill Landry wrote:

FYI: Spamhaus created a new URL shortener/redirector zone in the
DBL. See:

http://www.spamhaus.org/news.lasso?article=667

Will Spamassassin be adding support for this new DBL
shortener/redirector response code?:

127.0.1.3 spammed redirector domain

For details, see:

http://www.spamhaus.org/faq/answers.lasso?section=Spamhaus%20DBL#291

Regards,

Bill


http://pastebin.com/CdDPHnTX




Re: The one year anniversary of the Spamhaus DBL brings a new zone

2011-03-08 Thread Yet Another Ninja

On 2011-03-08 22:12, Warren Togami Jr. wrote:

On 3/8/2011 9:58 AM, Bill Landry wrote:

FYI: Spamhaus created a new URL shortener/redirector zone in the
DBL. See:

http://www.spamhaus.org/news.lasso?article=667

Will Spamassassin be adding support for this new DBL
shortener/redirector response code?:

127.0.1.3 spammed redirector domain

For details, see:

http://www.spamhaus.org/faq/answers.lasso?section=Spamhaus%20DBL#291

Regards,

Bill


OK, so this is meant to be used as a URIBL. I don't see this as anything
special because there is no way to query the pathname portion of a URI
which would allow more fine-grained detection of spammy URI's even on a
non-evil shortening service.


it's nothing more than what it says it is.


Is this new DBL return code meant to be a lower score than ordinary
URIBL's that often choose to list evil shortener domains?


I'd say, depends on your traffic.


My point is this is no different than an ordinary URIBL listing.

nope.. just a separate return code for a small data subset.




Re: The one year anniversary of the Spamhaus DBL brings a new zone

2011-03-08 Thread Yet Another Ninja

On 2011-03-08 22:28, Joseph Brennan wrote:



http://www.spamhaus.org/faq/answers.lasso?section=Spamhaus%20DBL#291



quote,


One way to address this problem would have been to treat URL shortener
domains the same way as any other spammed domain and include them in our
main DBL zone. But, as mentioned, most of these URL shortener serve a
legitimate purpose and are used in non-spam emailings. Spamhaus has
always worked to avoid the blocklisting of assets that would cause
unjustified false positives.


I'll never grasp why one would use one of those in mail.
I thought there was a consensus to educate users *not* to visit links they
don't know, and now we hear that something which hides potential danger
is OK to use?




In fact Spamhaus *has* been treating them the same as other domains,
and causing false positives on shorteners widely used for legitimate
purposes, such as bit.ly, tinyurl.com, fly2.ws, and is.gd.


ever thought of doing uridnsbl_skip_domain for those Open Relay 2.0s you
consider legitimate?



But they've been doing it on the IP-based SBL, not the domain name
list. I hope the IP blocks for shorteners will stop. It's degraded
the quality of SBL, which in the past was so accurate on spammer-owned
hosts that we were willing to 550 based on a match.



I could see scoring for shorteners. So this is good news.


As soon as this data is widespread, they won't be abused any longer and
new, smaller sites will take over. In fact, they already are.


At the moment it's owl.ly and durl.me being massively hammered. As soon
as they get their homework done, it'll be someone else.







Re: using spamhaus droplist with sa ?

2011-02-17 Thread Yet Another Ninja

On 2011-02-17 15:23, Andreas Schulze wrote:

Hello,

http://www.spamhaus.org/faq/answers.lasso?section=DROP FAQ
mention as very last point to use the Spamhaus Drop list with SA.

is anybody doing this and can explain it in detail ?

Thanks
Andreas



DROP is a tiny subset of the SBL designed for use by firewalls and 
routing equipment.


Using it post-queue is pretty pointless as it's basically a safe subset
of the SBL.




Re: using spamhaus droplist with sa ?

2011-02-17 Thread Yet Another Ninja

On 2011-02-17 16:40, RW wrote:

On Thu, 17 Feb 2011 15:29:07 +0100
Yet Another Ninja <axb.li...@gmail.com> wrote:


On 2011-02-17 15:23, Andreas Schulze wrote:

Hello,

http://www.spamhaus.org/faq/answers.lasso?section=DROP FAQ
mention as very last point to use the Spamhaus Drop list with SA.

is anybody doing this and can explain it in detail ?

Thanks
Andreas



DROP is a tiny subset of the SBL designed for use by firewalls and
routing equipment.

Using it post-queue is pretty pointless as it's basically a safe
subset of the SBL


The suggestion is that it be scored higher for that reason.


if that is what you wish, you can set up a local rbldnsd zone and query that.
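
An untested sketch of the SA side of that (zone name, set name and score made up; assumes rbldnsd serves the DROP CIDRs as an IP-based zone):

header   RCVD_IN_LOCAL_DROP  eval:check_rbl('localdrop-lastexternal', 'drop.dnsbl.local.')
describe RCVD_IN_LOCAL_DROP  Last external relay listed in local DROP zone
tflags   RCVD_IN_LOCAL_DROP  net
score    RCVD_IN_LOCAL_DROP  4.0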



Re: Match pseudoheaders only in message body?

2011-02-03 Thread Yet Another Ninja

On 2011-02-03 17:53, Kris Deugau wrote:

I've been adding local rules to catch otherwise legitimate headers from
popular sites in the message body (ie, where they would appear in
postmaster mail that should never ever arrive at an account outside of
that site).

Unfortunately I've had to use mimeheader to trigger a match with some
messages... and mimeheader blindly goes ahead and matches the main
message header as well when there are no MIME parts.

This causes misfires on the *legitimate* mail from these sites.

Is there any better way to make sure that the rule only matches in the
message body other than creating a meta sub to match the main message
header and make the rule a meta like so:

# match all legit Facebook mail
header __FBMAILER X-Facebook =~ /from zuckmail/
# match all postmaster bounces from fake Facebook mail, *and*
# (sometimes) legitimate Facebook mail
mimeheader __T_YOUR_ORDER_VIRUS_P X-Facebook =~ /from zuckmail/
meta T_YOUR_ORDER_VIRUS_P __T_YOUR_ORDER_VIRUS_P && !__FBMAILER



could you pastebin a sample msg?


Re: lots of freemail spam

2011-01-02 Thread Yet Another Ninja

On 2011-01-02 13:59, Warren Togami Jr. wrote:

I've been thinking, perhaps we should consider making a Freemail Realtime
BL that lists not IP addresses, but rather ID's at the Freemail provider.


Search the list archives for emailbl


1) I am assuming that IDs you see in headers of mail from Yahoo are always
from an authenticated user?
2) Traps and user reports can quickly list a new Freemail user ID.
3) Subsequent spam from that user ID is more easily blocked because the RBL
has the ID listed.
4) The RBL feed can be automated to be sent to the provider (like Yahoo) so
they can more quickly enforce locking down compromised accounts or enforce
their ToS.


Search the list archives for emailbl


Re: A new paradigm for DNS based lists

2010-12-29 Thread Yet Another Ninja

On 2010-12-29 20:50, Marc Perkel wrote:



On 12/29/2010 11:10 AM, David F. Skoll wrote:

On Wed, 29 Dec 2010 09:33:25 -0800
Marc Perkel <supp...@junkemailfilter.com> wrote:


Yes - there's no point in doing DNS blacklist lookups on yahoo,
hotmail, and gmail as well as thousands of other mixed source
providers.

I disagree. I have a strong feeling that some of those providers
route less-trustworthy mail through certain IP addresses and
more-trustworthy mail through others. For example, some of Yahoo's
servers are listed in our good list while others are listed in our
bad list. The difference in observed behaviour between the two sets of
Yahoo servers is very dramatic.

We don't outright block hosts in the bad list, but we do add points.

Regards,

David.



Hi David,

My idea doesn't preclude you from having a bad yahoo list and adding
points. I'm just saying that when it comes to checking other blacklists
to see if any yahoo server is listed it's a waste of resources. If it's
a yahoo server of any flavor, why look it up on the blacklists?


coz we can't be bothered to do otherwise?


Re: My attempt at re-calculating test scores

2010-12-24 Thread Yet Another Ninja

On 2010-12-24 12:37, Warren Togami Jr. wrote:

You have the option of uploading your corpus to the central server to
process every night.  But most people have privacy concerns about that if it
is their own personal ham.  For this reason you have the option of running
the masscheck script yourself every night on your own server and to rsync
upload the logs only to the spamassassin central server.

https://fedorahosted.org/auto-mass-check/
I run this script every night from cron on my corpora.  I wrote this as a
friendlier wrapper script around spamassassin's confusing and difficult to
configure scripts.

And yes, a ham only corpus is extremely useful.  You must confirm that it is
100% human verified.  Start small, make sure the script is working properly,
and sort more ham into that folder.

Warren


FWIW:

http://git.fedoraproject.org/git/?p=auto-mass-check.git;a=summary

git.fedoraproject.org is MIA


Re: Additional sa-update channels

2010-12-16 Thread Yet Another Ninja

On 2010-12-15 19:00, Lawrence @ Rogers wrote:

massive_snip


90_2tld.cf.sare.sa-update.dostech.net


this has been deprecated and replaced with SA's default  20_aux_tlds.cf

See in: 20_aux_tlds.cf

# This file replaces the SARE http://www.rulesemporium.com/rules/90_2tld.cf
# which will be deprecated as from 2010-05-01


Re: Additional sa-update channels

2010-12-16 Thread Yet Another Ninja

On 2010-12-15 21:41, Lawrence @ Rogers wrote:

On 15/12/2010 3:51 PM, Bowie Bailey wrote:

The khop rules are good. I thought the 2tld stuff had been pulled into
SA as 20_aux_tlds.cf?

It has, but the Daryl edited one has some additional stuff (I think)
that isn't in there. There is conditional code that enables certain
rules in the file depending on what version of SA you are running.


I really doubt this is the case, as the files were being pulled from
http://www.rulesemporium.com/, which is history.



90_2tld.cf and 20_aux_tlds.cf were created/updated by the same person.


Re: DNSBL for email addresses?

2010-12-14 Thread Yet Another Ninja

On 2010-12-14 15:28, Marc Perkel wrote:

Are there any DNSBLs out there based on email addresses?


nope


Is there a standard?


nope


Re: Do we need a new SMTP protocol? (OT)

2010-12-01 Thread Yet Another Ninja

On 2010-12-01 17:13, Martin Gregorie wrote:

On Wed, 2010-12-01 at 07:27 -0800, Marc Perkel wrote:

I've been thinking about what it would take to actually eliminate spam
or reduce it to less than 10% of what it is now. One of the problems is
the SMTP protocol itself. And a big problem with that is that mail
servers talk to each other using the same protocol as users use to talk
to servers.


I don't think that would help at all. Bots would just pretend to be mail
servers and use SMTP. Any other form of spam could be circumvented by
setting up spammer-owned MTAs that spammers would use to inject spam.

IMO the best solution would have been a charge per e-mail provided it
was universally enforced. A small charge, e.g. $0.001 to $0.01 per
addressee per message would be almost unnoticable to a normal user or
business while still being enough to discourage volume spammers by
wiping out their profits. Another benefit would be that the bill
received by a bot-infected user would serve as a powerful wake-up call
to get disinfected.


could we move this dead horse out of the house?

the SDLU list may be a better place for this topic


Re: sa-update changelog

2010-11-16 Thread Yet Another Ninja

On 2010-11-16 11:41, Alvaro Marin wrote:

Hi,

Is there any way to see the changes made by sa-update when I execute it?
I want to see which rules and scores are modified since the last update.
Thanks!

Regards,



sa-update -D
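
sa-update doesn't print a rules diff itself, so a crude approach is to snapshot the rules directory and diff it after updating (path as in debug output quoted elsewhere in this archive; adjust to your layout):

cp -a /var/lib/spamassassin/3.003001 /tmp/sa-rules.before
sa-update -D
diff -ru /tmp/sa-rules.before /var/lib/spamassassin/3.003001 | less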


Re: Only running network tests when necessary - feature request

2010-10-30 Thread Yet Another Ninja

On 2010-10-30 9:56, RW wrote:

On Sat, 30 Oct 2010 02:23:00 -0400
dar...@chaosreigns.com wrote:



But the total amount of bandwidth and processing time saved on the
internet from not running unnecessary tests on every instance of
spamassassin seems worth doing.


You are also wasting resources by putting the round-trips on the
critical path and tying-up a child process. And if you are
checking a lot of mail you are presumably using rsync. 


rsync? to check mail?


Re: Help! Filter spam with less than symbol in recipient

2010-10-15 Thread Yet Another Ninja

On 2010-10-15 12:58, Niente0 wrote:



Giles Coochey wrote:


Have you tried escaping it with \x3c ?




Thanks for your suggestion, I tried it now but with no success. Here's my
rule:

header  TO1 To:name =~ /\x3c/i
score   TO1 100

I have received other less than spam just now. :-(


pls post a spam sample on pastebin.com and send the link to the list


Re: Help! Filter spam with less than symbol in recipient

2010-10-15 Thread Yet Another Ninja

On 2010-10-15 14:18, Niente0 wrote:


Yet Another Ninja wrote:

On 2010-10-15 12:58, Niente0 wrote:
pls post a spam sample on pastebin.com and send the link to the list



Hi, I tried with 3 different browsers but pastebin.com shows only a blank
page after submitting text. So I posted it here:

http://snipt.org/koRn/


Untested:

# To: <i...@aags.com>
header TO1  To =~  /^</


Re: Help! Filter spam with less than symbol in recipient

2010-10-15 Thread Yet Another Ninja

On 2010-10-15 14:49, Niente0 wrote:


Yet Another Ninja wrote:

On 2010-10-15 14:18, Niente0 wrote:

Untested:

# To: <i...@aags.com>
header TO1  To =~  /^</




Thank you!
I tested it but it still doesn't work. :-(

For testing purposes, I created a fake user in my Outlook address book, with
name  and email equal to an alias of my email. I sent him (myself) a test
message and it passed. I examined the header of the incoming message and
there are spamassassin infos, so it passed through SA rules and ignored the
less than filtering rule...


works for me

X-Spam-Report:
*  0.0 HAS_SHORT_URL Message contains one or more shortened URLs
*  1.0 SHORT_URL_404 Message has short URL that returns 404
*  3.0 SHORT_URL_CHAINED Message has shortened URL chained to other shorteners
*  0.0 SHORT_URL_LOOP Message has short URL that loops back to itself
*  5.0 SHORT_URL_MAXCHAIN Message has shortened URL that causes more than 10 redirections
*  1.0 TO1 TO1
*  0.0 HTML_MESSAGE BODY: HTML included in message
*  1.8 MIME_QP_LONG_LINE RAW: Quoted-printable line longer than 76 chars
*  0.1 RDNS_NONE Delivered to trusted network by a host with no rDNS





Bbedit SA syntax highlighting

2010-10-09 Thread Yet Another Ninja
Does anybody have or know of an SA syntax (highlighting) definition for
BBEdit (Mac)?


If yes, would you share?

Thanks


Re: Whitelist questions

2010-10-05 Thread Yet Another Ninja

On 2010-10-05 22:16, John Hardin wrote:

On Tue, 5 Oct 2010, Karsten Bräckelmann wrote:


If there really is no way to use whitelist_from_rcvd, you of course
always can write custom header rules, matching against the pseudo header
X-Spam-Relays-Internal or friends, carefully constructing the RE to
match a specific Received header by constraining it with the square
brackets surrounding each relay.


Perhaps whitelist_from_rcvd should recognize IP syntax and ignore the
rDNS, so this would work:


   whitelist_from_rcvd u...@lanyon.com [209.16.192.170]

not that I'd want to maintain IP-based whitelists...


wasn't there a whitelist_fromip plugin floating around some time ago?
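
For the archive, an untested sketch of the pseudo-header approach Karsten mentions (rule names and score made up; IP and domain taken from John's example):

header  __RCVD_FROM_209_16_192_170  X-Spam-Relays-External =~ /^\[ ip=209\.16\.192\.170 /
header  __FROM_LANYON               From:addr =~ /\@lanyon\.com$/i
meta    LOCAL_WL_LANYON             __RCVD_FROM_209_16_192_170 && __FROM_LANYON
score   LOCAL_WL_LANYON             -5.0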



Re: New plugin: DecodeShortURLs

2010-10-05 Thread Yet Another Ninja

On 2010-10-05 22:35, Brent Gardner wrote:

Steve Freegard wrote:

Hi All,

On 17/09/10 14:11, Steve Freegard wrote:

Hi All,

Recently I've been getting a bit of filter-bleed from a bunch of spams
injected via Hotmail/Yahoo that contain shortened URLs e.g. bit.ly/foo
that upon closer inspection would have been rejected with a high score
if the real URL had been used.

To that end - it annoyed me enough to write a plug-in that decodes the
shortened URL using an HTTP HEAD request to extract the location header
sent by the shortening service and to put this into the list of
extracted URIs for other plug-ins to find (such as URIDNSBL).

On the messages I tested it with - it raised the scores from 5 to 10
based on URIDNSBL hits which is just what I wanted.

Hopefully it will be useful to others; you can grab it from:

http://www.fsl.com/support/DecodeShortURLs.pm
http://www.fsl.com/support/DecodeShortURLs.cf



I've just put up a new version at the above URLs (v0.3) which adds the 
following new features:


- Now follows 'chained' short URLs (e.g. shortURL -> shortURL -> real)

When chained URLs are detected the rule 'SHORT_URL_CHAINED' is fired.
If a chained loop is detected the rule 'SHORT_URL_LOOP' is fired.
If more than 10 chained URLs are found 'SHORT_URL_MAXCHAIN' is fired 
and no further redirections are checked.


- If the shortener returns 404 (e.g. not found) for the short URL then 
'SHORT_URL_404' is fired.


- Prevent amavis from die'ing on eval block tests by adding local 
$SIG{'__DIE__'} to each block.


- Added option to allow logging to syslog (mail.info).

Kind regards,
Steve.

I've been testing this plugin, version 0.5.  I'm running SpamAssassin 
v3.2.5 on CentOS v5.5 32-bit, Perl v5.8.8.  I've been testing using a 
test message and changing out the URLs it contains.


Using URLs like these:

http://goo.gl/foo
http://bit.ly/foo
http://2chap.it/foo

I consistently hit on these rules:

HAS_SHORT_URL
SHORT_URL_404
SHORT_URL_CHAINED
SHORT_URL_LOOP
SHORT_URL_MAXCHAIN


I can understand hitting on HAS_SHORT_URL and SHORT_URL_404, but why is 
-every- test hitting SHORT_URL_CHAINED, SHORT_URL_LOOP, SHORT_URL_MAXCHAIN?


I bet *none* of the /foo targets exist.
Could that be confusing the plugin when /foo redirects back to home,
Steve?



Re: Free SURBL sources + rbldnsd extensive docs + configuring spamassin with new surbl source?

2010-09-28 Thread Yet Another Ninja

On 2010-09-28 9:28, selven wrote:

Hi, i wanted to set up my own surbl server, unfortunately, not much
information is available around, most of the time am bumping into this
http://www.surbl.org/public-dns.html, but well, getting rsync data feed
access from surbl.org is way too expensive for a bunch of kids at school. Is
there some sort of free list out there that i can rsync from and then if
there's any guide/docs that i could follow to get my spamassassin to query
my local surbl server.


What's wrong with querying the public servers?

SURBL/URIBL & DBL are free if you remain below their heavy traffic usage 
policies. The Invaluement.com lists are not free for public querying but 
an interesting alternative as well.


If, as you say, you only cater for a bunch of kids at school you 
shouldn't be hitting the BL's thresholds.




Re: Blacklist for spam-words

2010-09-16 Thread Yet Another Ninja

On 2010-09-16 12:29, franc wrote:

You may setup a regexp rule in the /etc/local.cf file of your SA
installation


Could you give me an example, or where to find one? In the local.cf i don't
find RegExp-sections.


see http://wiki.apache.org/spamassassin/WritingRules


Re: The most amazing spam ...

2010-09-16 Thread Yet Another Ninja

On 2010-09-16 13:36, Giles Coochey wrote:

On Thu, September 16, 2010 13:28, Martin Gregorie wrote:

On Thu, 2010-09-16 at 07:28 +0200, Per Jessen wrote:

http://public.jessen.ch/files/mazeweb-spam.jpeg



A cynic might wonder whether it also harvests valid e-mail addresses.



Appears to be a perfectly reputable service to me... what makes you think
there is anything untoward?



I'd say: a spam filter service spamming the competition :-)


419er honesty

2010-09-15 Thread Yet Another Ninja

Received: from 41.155.23.91
(SquirrelMail authenticated user spam)
by 71.4.72.28 with HTTP;


sometimes I wonder



--
If you haven't received my email please tell me and I will resend it to 
you again. (Anna Masekela)


Re: scantime=249.2; scantime=175.0; scantime=190.9; scantime=68.9

2010-09-06 Thread Yet Another Ninja

On 2010-09-05 0:00, Chris wrote:

On Sat, 2010-09-04 at 08:42 -0500, Chris wrote:

I'm trying to figure out why I'm having ridiculous scan times such as
the above examples. Lower scan times such as in the 20 second range are
the exception rather than the rule. I'm running bind as a local caching
nameserver and it seems to be working correctly. I've just seen a ham
that has a scantime=172.2. Could there be something else on the system
that is affecting this? 


Any advice as to troubleshooting would be appreciated.



I've started SA now with -D

OPTIONS=-d -D -c -H -m 4  --max-conn-per-child=3 --min-children=1

While looking at my syslog I noticed the following:

Sep  4 16:21:46 localhost spamd[15797]: prefork: periodic ping from
spamd parent
Sep  4 16:21:46 localhost spamd[15800]: prefork: periodic ping from
spamd parent
Sep  4 16:21:46 localhost spamd[15800]: prefork: sysread(9) not ready,
wait max 300 secs
Sep  4 16:21:46 localhost spamd[15797]: prefork: sysread(8) not ready,
wait max 300 secs

I've got the debug output on a ham, just waiting for a spam to come
through then I'll post both to pastebin but the above doesn't look good.
When this is happening my drive light seems to stay on forever and the
system seems close to being unresponsive. Checking cpu usage when this
is happening it stays around 4% for user and 3-4% for system. Link for a
ham - http://pastebin.com/k55D79TL
spam - http://pastebin.com/28qW2nga



Sep  4 16:32:31 localhost spamd[15797]: ClamAV: invoking 
File::Scan::ClamAV, port/socket: /var/lib/clamav/clamd.socket


You're using the SA ClamAV plugin, which isn't the most efficient method
to do AV checks.


There are more efficient methods to interface with Clamd.

You may also want to remove legacy or inefficient rule files.






Re: scantime=249.2; scantime=175.0; scantime=190.9; scantime=68.9

2010-09-06 Thread Yet Another Ninja

On 2010-09-06 12:49, RW wrote:

On Mon, 06 Sep 2010 12:26:08 +0200
Yet Another Ninja <sa-l...@alexb.ch> wrote:



You're using the SA ClamAV plugin, which isn't the most efficient
method to do AV checks.


What's wrong with it?


nothing wrong, but my first choice would be to reject infected files at
MTA level (via milter, proxy, etc.) instead of parsing with SA and tagging
it... imo, unnecessary overhead.


Re: Spamassassin not checking user provided RBLs

2010-09-02 Thread Yet Another Ninja

On 2010-09-01 22:47, Chris Datfung wrote:

I'm running spamassassin version 3.3.1-1 from the Debian  package. I added
several RBLs to /etc/mail/spamassassin/init.pre but spamassassin only
queries its built in RBLs and not the ones I added. An example RBL entry to
init.pre is shown below:

header   IN_NJABL_ORG  eval:check_rbl('njabl', 'dnsbl.njabl.org.')
describe IN_NJABL_ORG  Received via a relay in dnsbl.njabl.org
tflags   IN_NJABL_ORG  net
score    IN_NJABL_ORG  5

I also find messages that aren't tagged as being in an RBL that are listed
in cbl.abuseat.org and zen.spamhaus.org which should be automatically
checked by spamassassin using the default configuration. As mentioned before
other (built-in) RBL checks work. Any hints as to why my custom RBL checks
added to init.pre (and also tried local.cf) aren't queried after restarting
spamassassin?

Thanks,
 Chris


You don't EVER add rules to a .pre file.

Only .cf files are rules files;
use local.cf for custom rules.


Re: Expiring Beyes

2010-08-26 Thread Yet Another Ninja

On 2010-08-26 16:11, Grant Peel wrote:

Hi all,

I have several hundred domains on a box. Each domain's mail is 
controlled by a specific UNIX user.


Inside every user's directory, they have a user_prefs file.

While I have use_bayes 0 in the main config, some users have opted to 
turn on bayes in their user_prefs.


This morning I noticed that one particular ~/.spamassassin/bayes* files 
had grown to 1.5 GB.


I have put:

use_bayes 0
bayes_auto_learn0
bayes_auto_expire   1
bayes_expiry_max_db_size 5

in the local.cf file, and restarted spamd.

The database did not appear to trim, so I tried:

sa-learn -u user -D --force-expire

and the database is still 1.5 GB.

I know I am doing something(s) incorrect, but can't figure out what.

How do I properly trim the offending file(s)?

Is there a command to trim all databases (users) on the box?

Any advice would be appreciated.


I bet the biggest is bayes_seen.
You can safely delete the bayes_seen file (unless you plan to unlearn
msgs). It will start growing again, fast.

The bayes_toks file is the one which gets trimmed by expiration.
bayes_seen is what I call a parasite :-)

On a busy box, to avoid freezes I'd recommend setting
bayes_auto_expire   0

and doing a cron'd force-expire during low traffic hours, either daily or
weekly, depending on the bayes_toks size.
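
A rough sketch of that setup (paths and times are placeholders; with per-user bayes, run sa-learn per user, e.g. with -u):

# local.cf
bayes_auto_expire   0

# nightly crontab entry, during low traffic:
30 3 * * *  /usr/bin/sa-learn --force-expire >/dev/null 2>&1

# bayes_seen can simply be deleted:
rm /home/someuser/.spamassassin/bayes_seen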





h2h


Re: query own sbl

2010-08-25 Thread Yet Another Ninja

On 2010-08-25 13:44, Christian Scholz wrote:

 Hello together,

I've set up my own sbl and want spamassassin to check this rbl but it 
doesn't work.

My rule is

IN_SBL_OOS_ORG rbleval:check_rbl('oos', 'sbl.o-o-s.de.')
describe IN_SBL_OOS_ORG Received via a blocked site in sbl.o-o-s.de
tflags IN_SBL_OOS_ORG net
score IN_SBL_OOS_ORG 5.0

is there anything wrong? My Spamassassin Version is spamassassin 
3.2.5-2+lenny2


Chris


first thought...
seems there's something missing in

IN_SBL_OOS_ORG rbleval:check_rbl('oos', 'sbl.o-o-s.de.')

header  IN_SBL_OOS_ORG rbleval:check_rbl('oos', 'sbl.o-o-s.de.')






Re: two SA folders and sa-updates

2010-08-18 Thread Yet Another Ninja

On 2010-08-18 14:05, Matus UHLAR - fantomas wrote:



/etc/mail/spamassassin/sare-sa-update-channels.txt


Be careful about SARE rules. They are often obsolete, have false positives,
and many of them are already incorporated in stock SA; some have better
alternatives (URI blacklists vs. hardcoded lists of spam domains).


better - *don't even think of using them* - they are not being updated
and never will be.


Anything worthy has already been migrated to SA mainstream and the few 
SARE survivors are also SA committers so they'll commit to SA instead of 
SARE.


Anybody hammering the rulesemporium with lwp/wget on a regular basis is 
advised to stop unless in need of surprises when the files are zeroed out.


Re: sa-compile has no effect (under Windows.......)

2010-07-30 Thread Yet Another Ninja

On 2010-07-30 21:26, Bowie Bailey wrote:

 On 7/30/2010 3:08 PM, Emin Akbulut wrote:

Simply disable regular ruleset and test again. If it takes 6.93-5.78
seconds or
something similar, you are right.


I'm actually having the same issue on my new home server.  I set up SA
and got it working.  Then I ran sa-compile, enabled the plugin in
v320.pre, and restarted.  The logs show that it is using the compiled
rules, but there is no difference in scan speeds at all.

How would I go about disabling the regular ruleset?



compiled rules only affect body & rawbody rules.
Network tests won't be affected and are probably the reason for the lack 
of a massive difference.


Re: [sa-list] Re: Autoreplies from RT are hitting on ANY_BOUNCE_MESSAGE

2010-06-29 Thread Yet Another Ninja

On 2010-06-29 10:39, Dan Mahoney, System Admin wrote:

On Mon, 28 Jun 2010, Yet Another Ninja wrote:


On 2010-06-28 11:33, Dan Mahoney, System Admin wrote:

Hey there,

Perhaps this is by design, but rt replies are, strictly speaking, not
bounce messages.

Message attached, let me know if it looks normal.

-Dan

from what I see it looks normal if someone really makes an effort to 
tune SA scores.



my 50_scores.cf default says:

score ANY_BOUNCE_MESSAGE 0.1
score SHORTCIRCUIT 0


Even so, why is it matching when it's not a bounce? It's either
something inaccurate in spamassassin, or something RT is doing that it
shouldn't be. If it's the latter, I'll attempt to fix rt. If the
former, perhaps SA should.


Either I'm blind or your sample is missing important header information.

suggest you use:

report_safe 0

in local.cf and post a new sample in pastebin




Re: Autoreplies from RT are hitting on ANY_BOUNCE_MESSAGE

2010-06-28 Thread Yet Another Ninja

On 2010-06-28 11:33, Dan Mahoney, System Admin wrote:

Hey there,

Perhaps this is by design, but rt replies are, strictly speaking, not 
bounce messages.


Message attached, let me know if it looks normal.

-Dan



from what I see it looks normal if someone really makes an effort to 
tune SA scores.



my 50_scores.cf default says:

score ANY_BOUNCE_MESSAGE 0.1
score SHORTCIRCUIT 0






Re: Nonsense spam

2010-06-24 Thread Yet Another Ninja

On 2010-06-24 21:51, Ned Slider wrote:

Michael Scheidell wrote:

On 6/24/10 1:18 PM, Randy Ramsdell wrote:
   Yet spamassassin scores it with a .9. I have been reluctant to 
block and

this is compounded by spamassassin scoring it low as if it weren't as
accurate as you state.

   
again, look at the circumstances.  the SA scoring might be crippled 
due to the issue of a lack of these ip's in spam corpus since most 
people use that as a hard mta rbl.


(chime in, anyone who uses it)




I use PBL to block at the MTA level. I agree the FP rate is near 
non-existent. So long as you're *only* scanning the --lastexternal IP in 
SA then I'd personally score the rule well above the spam threshold level.


Interesting what Michael says about the reason for a low score in SA. I 
don't know enough about the weighting of the scoring system, but it 
sounds like a reasonable argument to me to explain the low scoring. If 
you're not convinced, grep your own inbox for hits against PBL for FPs. 
The danger comes when people use the PBL incorrectly and deep parse all 
headers which *will* lead to copious FPs.


Either way, I'd have no hesitation blocking outright on PBL or scoring 
very highly in SA.


me too
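
(For the archive: bumping it up is a one-liner in local.cf, assuming the stock RCVD_IN_PBL rule name; the value is arbitrary and only sane if you only scan the last external relay:)

score RCVD_IN_PBL 5.5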


Re: Should Spamhaus default to disabled?

2010-06-12 Thread Yet Another Ninja

On 2010-06-12 15:20, Andy Dills wrote:

300,000 queries per day...per server? per CIDR? What is the delimiter?

Because there is certainly no single IP generating 300,000 queries per 
day.


That is probably your problem... use a central DNS resolver and your 
query count will instantly decrease


I bet you're querying from:

216.127.136.200 dns02.xecu.net
216.127.136.247 mail-out07.xecu.net
216.127.136.242 mail-out02.xecu.net
216.127.136.246 mail-out06.xecu.net
216.127.136.196 mg6.xecu.net
216.127.136.241 mail-out01.xecu.net
216.127.136.245 mail-out05.xecu.net
216.127.136.243 mail-out03.xecu.net
216.127.136.244 mail-out04.xecu.net


Re: Should Spamhaus default to disabled?

2010-06-11 Thread Yet Another Ninja

On 2010-06-11 16:42, Andy Dills wrote:
After recently upgrading to a new mail cluster with SA 3.3.1, we were 
contacted (at every imaginable POC address) with a solicitation to 
purchase access to utilize the Spamhaus blacklists, or they'll stop 
answering our queries.


We felt the amount of money being asked for was unreasonable, as we felt 
we likely wouldn't see an increase in spam if we turned them off.


So, local.cf got:

score URIBL_DBL_SPAM 0
score URIBL_DBL_ERROR 0
score RCVD_IN_ZEN 0

I think those are the only queries that generate lookups against Spamhaus, 
but I'm not positive.


Regardless, we noticed no increase in spam after disabling these tests. 
I imagine there's lots of overlap on the blacklists.


I think the maintainers of SA should strongly consider defaulting Spamhaus 
to off. At the very least, it should be better documented how to entirely
disable Spamhaus queries.


They have the right to charge for their data, but I question whether it's 
appropriate for an open-source project to generate sales leads in this 
manner.


this horse is very dead...  Your traffic generated the sales lead, not SA.






Re: How to remove a domain from a stock or third-party 2tld ruleset?

2010-05-28 Thread Yet Another Ninja

On 2010-05-28 23:57, Kris Deugau wrote:

Karsten Bräckelmann wrote:

On Wed, 2010-05-26 at 11:35 -0400, Kris Deugau wrote:
Is there any way to take a domain listed with util_rb_2tld, and 
un-2tld it (similar to how you can unwhitelist stock whitelist 
entries if they don't work well with your mail)?


IIRC this is not possible. Well, possible, but there's just no code to
handle it. ;)


Didn't think so, but...

I recently came across a free-subsite domain that seems to be part 
of a cluster of **very** similar sites which I've given up listing 
subdomains for locally;  instead I've added the TLDs to a local 
blacklist.


For now I've just added a regular uri rule, but I'm pretty sure that 
won't scale, and it doesn't help with some of the automation I've 
been using to extract URIs not listed on any DNSBL yet from 
missed-spam reports.


uri rules should work. I wouldn't worry about scaling too much, because
the number of util_rb_2tld listings is limited.

Another approach, since I understand you want to query against a local
URI DNSBL, is simply to use wildcard DNS entries. Thus, regardless of a
2tld listing and the resulting DNS lookup, it would return the same
listing for the pure TLD and a second level TLD.


Hmm.  I hadn't thought of this, I'll give it a try and see if something 
chokes.  Thanks!


let me guess... .co.cc ?
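
For the archive, Karsten's wildcard idea in a BIND-style zone for a local URI DNSBL might look like this (zone name and listed domain are placeholders), so lookups for both the 2tld and anything under it return the same answer:

co.cc.uribl.local.     IN A   127.0.0.2
*.co.cc.uribl.local.   IN A   127.0.0.2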






Re: Problem matching newline in body

2010-05-21 Thread Yet Another Ninja

On 2010-05-21 15:40, John Horne wrote:

Hello,

Can you tell it's Friday afternoon? What should be a simple problem
always seems to become a nightmare on Friday afternoons! :-)

Using SA 3.3.1 I have the following simple rule:

 body   LOCAL_JH /userid:\s*\n/i

which should look for 'userid:', any number of spaces and then a NL
character (that is, there is nothing following the spaces on the same
line).

If I send a message containing:

some textNL
userid: NL
some more text

it fails. If I insert a NL before 'some more text', then it works.

I tried using '/userid:\s*$/mi', but that too didn't work.


Can someone show me how to match a newline character in the above rule
please?


can you post a spam sample @ pastebin?


Re: Spamassasin as a gateway filter for Exchange

2010-05-19 Thread Yet Another Ninja

On 2010-05-19 23:26, Karsten Bräckelmann wrote:

On Wed, 2010-05-19 at 23:13 +0200, Mikael Syska wrote:

Not to highjack the thread, but there are also other things to consider.

I have no idea how on Postfix, but this could help you too Scott Lavoie.

If there are multiple exchange backends for postfix/spamasassin
gateway ... how could one validate that users exists, given that you
only have a list of valid users for some of the exchange servers and
the mailahead/milterahead/smtp are not an option?


Don't think you're hijacking the thread -- you just stated exactly, what
I mentioned in my previous post.

The only real problem, validating recipients at the front MX, based on
the data in the backend Exchange servers. Everything else is not a
problem, even though managing a Linux server might seem to be one from
the point of view of a Windows admin... ;)


This can be done VERY easily and safely with milter-ahead 
(Sendmail/Postfix).


If you don't need the nifty extras milter-ahead supplies, Postfix has
built-in rcpt address verification which, imo, works well and is also
well documented.
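
A minimal sketch of the Postfix variant in main.cf (assumes the Exchange backends will answer the verification probes):

smtpd_recipient_restrictions =
    permit_mynetworks,
    reject_unauth_destination,
    reject_unverified_recipient
unverified_recipient_reject_code = 550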





Re: Spamassasin as a gateway filter for Exchange

2010-05-19 Thread Yet Another Ninja

On 2010-05-19 23:57, Karsten Bräckelmann wrote:

snipped


What I can do, however, is to split up the original question into
manageable chunks, as unrelated as possible. SA and postfix on Debian?


I can highly recommend FuGlu: http://sourceforge.net/apps/trac/fuglu/



Re: new PDF Launch malware exploit (with sample)

2010-04-28 Thread Yet Another Ninja

On 2010-04-28 20:01, Chip M. wrote:

I haven't seen any since the first blast, so I suspect their
signatures were widely distributed by most anti-virus orgs.

I'm mainly publishing this for all of us who like to have backup
rules, and are willing to be more general than the sometimes too
tightly focused malware sigs.

For example, I've added script.vbs to my instant-death PDF word
scans.


If you still have PDFinfo in your plugin collection:

https://svn.apache.org/repos/asf/spamassassin/trunk/rulesrc/sandbox/axb/20_axb_pdf.cf

should hit on these in case AVs don't






Re: How to configure spamassassin

2010-04-09 Thread Yet Another Ninja

On 2010-04-09 17:31, hateSpam wrote:

Thanks a lot for the replies. Do I have to install Amavisd-new and ClamAV to get
spamassassin working? Is there any other way to configure spamassassin with
postfix without installing additional software?


See: http://wiki.apache.org/spamassassin/IntegratedInMta

also:
http://wiki.apache.org/spamassassin/StartUsing

h2



Ned Slider wrote:

Birta Levente wrote:

On 09/04/2010 13:43, hateSpam wrote:

Dear All,
I have Spamassassin on my Centos 5.4. For send and receive email I use
postfix and Dovecot and Sendmail version 8.13.8. Since I have 
You seem a little confused - are you running postfix or sendmail as your 
MTA?



spamassassin I have not configured it. We are getting about 20 spams per
day. I want to configure it and get it working. I did google it there
are
some information but all in different server, some I tried did not work.

I will appreciate if anyone know how to configure it from scratch after
installing it.

Thanks in advance
Hatspam
   

Look at this cool howto:

http://postfixmail.com/blog/index.php/clamav-and-spamassassin-on-centos-5-postfix/ 



Or refer to the CentOS documentation here:

http://wiki.centos.org/HowTos#head-0facb50d5796bee0bd394636c32ffa9a997a6ab5

Specifically:

http://wiki.centos.org/HowTos/postfix
http://wiki.centos.org/HowTos/Amavisd

Hope that helps.







Re: sa-update channels

2010-03-18 Thread Yet Another Ninja

On 2010-03-18 15:02, Jason Bertoch wrote:

On 2010/03/17 6:20 PM, Micah Anderson wrote:


I'm trying to find out what the current state of the art is for plugins
and channel updates.

For channels I've been using:

updates.spamassassin.org
sought.rules.yerp.org
saupdates.openprotect.com

But I wonder if the last two are still relevant, or if there are other
lists to use instead?



My update channels include:

updates.spamassassin.org
sought.rules.yerp.org
90_2tld.cf.sare.sa-update.dostech.net
90_3tld.cf.sare.sa-update.dostech.net

The sought rules are still relevant, though the update server has been 
hit and miss lately.


If I recall correctly, 90_2tld.cf can be used by everyone, while 
90_3tld.cf applies to SA 3.3.x only.  There has been talk on the dev 
list about incorporating both into a future SA version, though.


Pls stop using
 90_2tld.cf.sare.sa-update.dostech.net
 90_3tld.cf.sare.sa-update.dostech.net

These files have been merged into the upcoming SA 3.3.1 and are released as
20_aux_tlds.cf


With a bit of luck there will be a sa-update release for 3.2.x and 3.3.0
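
So an update cron line would only keep the remaining channels, e.g. (GPG key handling for the sought channel omitted):

sa-update --channel updates.spamassassin.org --channel sought.rules.yerp.org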




Re: bayes, numbers of tokens and performance

2010-03-18 Thread Yet Another Ninja

On 2010-03-18 16:36, tonjg wrote:

update: after doing some reading on google I found init.pre and added:
loadplugin Mail::SpamAssassin::Plugin::Razor2
and
loadplugin Mail::SpamAssassin::Plugin::Pyzor
and restarted spamassassin.



Did you also install the plugins?
These two are not delivered with SA.


Re: Whitelist isn't working

2010-03-16 Thread Yet Another Ninja

QUICK FIX!
borked FH_DATE_PAST_20XX is your problem.

set in local.cf

score FH_DATE_PAST_20XX 0

and then read up about this rule in the list archive



On 2010-03-16 12:26, Phill Edwards wrote:

I'm running Spamassassin 3.2.5. I'm getting masses and masses of false
positives. I trashed my Bayes DB the other day and rebuilt it from
scratch with sa-learn but I'm still getting false positives. One
particularly troublesome one is a Freecycle mailing list that I
subscribe to. I have put this in the config file but it still keeps
getting marked as spam:

def_whitelist_from_rcvd *...@posts.freecycle.org posts.freecycle.org

The message headers of one of these emails that got falsely tagged as
spam look like this:

Return-path: post-1601702-2890...@bounces.freecycle.org
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on ash.edwards.home
X-Spam-Level: RR
X-Spam-Status: Yes, score=6.6 required=5.0 tests=BAYES_00,DATE_IN_FUTURE_06_12,

DKIM_SIGNED,DKIM_VERIFIED,FH_DATE_PAST_20XX,FROM_STARTS_WITH_NUMS,SPF_FAIL,
TVD_RCVD_IP autolearn=no version=3.2.5
X-Spam-Report:
*  1.9 TVD_RCVD_IP TVD_RCVD_IP
*  3.2 FH_DATE_PAST_20XX The date is grossly in the future.
*  1.5 FROM_STARTS_WITH_NUMS From: starts with many numbers
*  1.9 DATE_IN_FUTURE_06_12 Date: is 6 to 12 hours after Received: date
*  0.7 SPF_FAIL SPF: sender does not match SPF record (fail)
*  [SPF failed: Please see
http://www.openspf.org/Why?s=mfrom;id=post-1601702-2890135%40bounces.freecycle.org;ip=220.233.2.146;r=ash.edwards.home]
* -0.0 DKIM_VERIFIED Domain Keys Identified Mail: signature passes
*  verification
*  0.0 DKIM_SIGNED Domain Keys Identified Mail: message has a signature
* -2.6 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
*  [score: 0.]
Envelope-to: myn...@exemail.com.au
Delivery-date: Tue, 16 Mar 2010 17:51:22 +1100
Received: from 146.2.233.220.static.exetel.com.au ([220.233.2.146]
helo=mscip02.mailsentry.net.au)
by chestnut2.exetel.com.au with esmtp (Exim 4.68)
(envelope-from post-1601702-2890...@bounces.freecycle.org)
id 1NrQcc-PC-Us
for myn...@exemail.com.au; Tue, 16 Mar 2010 17:51:22 +1100
Received: from bulkmail2.freecycle.org ([95.172.20.170])
  by mscip02.mailsentry.net.au with ESMTP; 16 Mar 2010 17:51:21 +1100
Received: from localhost ([127.0.0.1] helo=freecycle.org)
by bulkmail2.freecycle.org with esmtp (Exim 4.69)
(envelope-from post-1601702-2890...@bounces.freecycle.org)
id 1NrQcZ-0001Df-Ct
for myn...@exemail.com.au; Tue, 16 Mar 2010 06:51:19 +
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=freecycle.org; h=
content-type:content-transfer-encoding:mime-version:list-id
:list-archive:list-unsubscribe:sender:subject:list-help
:list-post:date:list-owner:list-subscribe:from:to; s=dkim; bh=LS
8YK/tV+qiYlNx3atLWbnpUECc=; b=UQ3qhcXpAOSfz4+PHNWPKGKVNxumuqWq7f
E0ChhlyH0km2Yr6oca4q+jPMXbkVoKKE41IV309Z7nedXeXsUMorRSm5Bz0+PmJt
WI+riErLsOK+/8r5wi5P1ZCjYBrHn4Ozm4NiEkL/OrOVNlnSBMayjgZBbE1nZ6z0
Um2MxdIXU=
DomainKey-Signature: a=rsa-sha1; c=nofws; d=freecycle.org; h=content-type
:content-transfer-encoding:mime-version:list-id:list-archive
:list-unsubscribe:sender:subject:list-help:list-post:date
:list-owner:list-subscribe:from:to; q=dns; s=dkim; b=GLdug+LLz4R
ZmFtMl21GJB+VmyTaecD6N63kWNZnTDEvugWXEBNktE8h2Q4x2FidlH2Ioklhckw
xeR2PoqD4knlbQjNjDfVu6th+vA9CgqZ5cKK5VHd3lR/RS0GGQxPa1HuMyKhMXP5
Fd5LZ8mx39XxQq46VovNYomEPQFTHNvo=
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Mailer: My Freecycle (http://my.freecycle.org)
List-ID: WilloughbyFreecycle.groups.freecycle.org
X-TFN-Group: WilloughbyFreecycle
X-TFN-Postid: 2890135
List-Archive: 
http://www.freecycle.orghttp://groups.freecycle.org/WilloughbyFreecycle
List-Unsubscribe: http://my.freecycle.org/home/groups/,
 mailto:willoughbyfreecy...@mods.freecycle.org?subject=please unsubscribe
 me
Sender: My Freecycle rw_boun...@freecycle.org
Subject: {SPAM 06.6} [WilloughbyFreecycle] OFFER: 'Bycol Clear' (Longueville)
List-Help: 
http://www.freecycle.orghttp://groups.freecycle.org/WilloughbyFreecycle,
 mailto:willoughbyfreecy...@mods.freecycle.org?subject=help (Group
 ModTeam)
List-Post: mailto:willoughbyfreecy...@groups.freecycle.org
Date: Tue, 16 Mar 2010 17:51:13 -
List-Owner: mailto:willoughbyfreecy...@mods.freecycle.org (Group ModTeam)
List-Subscribe: http://my.freecycle.org/home/groups/,
mailto:willoughbyfreecy...@mods.freecycle.org
From: frances.dejong 2890...@posts.freecycle.org
To: myname myn...@exemail.com.au
Message-Id: e1nrqcz-0001df...@bulkmail2.freecycle.org
X-Spam-Prev-Subject: [WilloughbyFreecycle] OFFER: 'Bycol Clear' (Longueville)



Can anyone explain why the whitelist entry isn't 
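
For reference, a minimal sketch of the directive involved (the address is
unmangled here only for readability; whitelist_from_rcvd and its def_ variant
only fire when the reverse DNS of the relay that handed the mail to your
network matches the second argument, so a third-party filtering hop with
different rDNS - like the mailsentry relay above - can keep it from ever
matching):

def_whitelist_from_rcvd  *@posts.freecycle.org  posts.freecycle.org
# the rDNS compared is that of the host at the trust boundary; extending
# trusted_networks to cover the filtering appliance lets SA look one hop
# further back (hypothetical value - only if you really trust that host)
trusted_networks  220.233.2.146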

Re: URIBL Notice

2010-03-12 Thread Yet Another Ninja

On 2010-03-12 16:48, Ray Dzek wrote:

I just received the dreaded URIBL "You send us too many DNS queries"
notice.  This is fine.  We have been growing and I am sure our
queries have gone up.  But when looking at their data feed service
options the first thing I noticed was that there is no fee structure.
I don't know about you, but that is always a red flag in my world.
Before I even get past the first paragraph it already smells like a
shakedown.

But...

My real question is how badly is my SA environment going to be
impacted by turning URIBL off?  What increase in spam should I
expect?


These stats are for a small trap box which only accepts mail from bots and
rejects stuff listed by DNSWL and other public WLs. Since midnight CET:


These are only URI BL stats - so you won't see other dnsbls like
Spamcop, etc.


Some zones may sound familiar - others are private.

RANK  RULE NAME                       COUNT  %OFMAIL  %OFSPAM  %OFHAM

   1  URI_IN_MSG                      47943    93.48    95.22   53.92
   2  ANY_URIBL_COM                   46460    88.38    92.27    0.09
   3  URIBL_BLACK                     45186    85.95    89.74    0.00
   4  ANY_SPAMHAUS                    42299    80.46    84.01    0.05
   5  URIBL_DBL                       39555    75.24    78.56    0.00
   6  HTML_MESSAGE                    39405    76.83    78.26   44.46
   7  URIBL_SBL                       38680    73.58    76.82    0.00
   8  CM_URI_DNSBL                    38636    73.49    76.73    0.00
   9  AXB_BLACK_NSIP                  38439    73.12    76.34    0.00
  10  URIBL_SPAMEATINGMONKEY_RED      38250    72.76    75.97    0.00
  11  ANY_SURBL                       33742    64.24    67.01    1.31
  12  URIBL_SC_SWINOG                 33265    63.28    66.07    0.00
  13  URIBL_IVMURI                    32987    62.75    65.52    0.00
  14  RDNS_NONE                       31733    62.82    63.02   58.11
  15  URIBL_JP_SURBL                  30408    57.84    60.39    0.00
  16  URIBL_SPAMEATINGMONKEY_BLACK    30338    57.71    60.25    0.00
  17  URIBL_DRS_BLACK                 30272    57.58    60.12    0.00
  18  MIME_HTML_ONLY                  29485    56.13    58.56    1.08
  19  URIBL_WS_SURBL                  28959    55.14    57.52    1.31
  20  URIBL_AB_SURBL                  27200    51.74    54.02    0.00
  21  AXB_BLACK_NS                    25076    47.70    49.80    0.00
  22  HK_NAME_DRUGS                   16027    30.49    31.83    0.00
  23  RDNS_DYNAMIC                    13334    26.09    26.48   17.25
  24  URIBL_OB_SURBL                  12701    24.16    25.23    0.00
  25  FSL_HELO_NON_FQDN_1             12681    25.47    25.19   31.89



Re: URIBL Notice

2010-03-12 Thread Yet Another Ninja

On 2010-03-12 20:23, Rob McEwen wrote:

Yet Another Ninja wrote:

These stats are for a small trap box which only accepts mail from bots
and rejects stuff listed by DNSWL and other public WLs. Since midnight
CET:
These are only URI BL stats - so you won't see other dnsbls like
Spamcop, etc.


Alex,

about those stats...

(1) Do those include spams sent to non-existent users (i.e. dictionary
attack spams)?


there are no users - it's trap domains which have never had any real
users - ever.



(2) Was pre-filtering done, such as collecting stats only on messages
which made it past zen.spamhaus.org (etc.)? Or was there no pre-filtering?


no prefiltering except rejecting potential bounces and stuff leaking
from whatever may be on DNSWL and a couple of other WLs.





Re: URIBL Notice

2010-03-12 Thread Yet Another Ninja

On 2010-03-13 0:50, Rob McEwen wrote:

Yet Another Ninja wrote:

there are no users - it's trap domains which have never had any real
users - ever.

no prefiltering except rejecting potential bounces and stuff leaking
from whatever may be on DNSWL and a couple of other WLs.


Alex,

Your stats are certainly valuable and illustrative... but not reflective
of the stats one would see in MOST real world mail streams where:


was not the point, as your real world is yours, and not somebody else's.

I specified what those stats showed.. only bot spam. There is no ham, no
users, no ESP traffic, no bounces, just trash > /dev/null





Re: [Emerging-Sigs] SIG: SpamAssassin Milter Plugin Remote Arbitrary Command Injection Attempt

2010-03-09 Thread Yet Another Ninja

On 2010-03-09 13:51, Brian wrote:

On Tue, 2010-03-09 at 13:17 +0100, Ralf Hildebrandt wrote:

* Brian brel.astersik100...@copperproductions.co.uk:


In the year 2010 it is not unreasonable to expect the MTA that takes
responsibility for accepting a message to make reasonable checks about
the validity or content of that message. 

Postfix can do this either via the milter interface OR the
smtpd_proxy_filter

It's very easy.


GROAN *** WE KNOW THAT!
Look at the title and read the post Ralf. The point is you need to use a
milter or proxy/policy daemon to do this with Postfix. The point being
'Why does it not natively support this functionality in the year 2010?' 


Answer: Because Weitse (AKA 'God') says so, so you all jump and say 'yes
sir, no sir, three bags full sir'.

So Ralf - author of 'The Postfix Book', can you please now tell me how
to get Postfix to reject mail before it accepts it and gives a 250 -
When Spamassassin tags it as spam? 


It's 2010, spam accounts for 9x% of mail so please share with me how you
can do this with just a minor config change with Postfix. The caveat you
can't use the milter, you can't use 'amavis-crashalot' and a 250 must
not be given if Spamassassin marks it as spam. I can't find it in your
book anywhere old chap..

I'm happy to stay on the Postfix 'merry-go-round' for an answer, or we
can just agree Postfix can't easily do this and move on and stop
flogging this dead horse :-)


good idea -

Here, its totally off topic.

Move it to Postfix lists



Re: 90_sare_freemail.cf.sare.sa-update.dostech.net

2010-03-09 Thread Yet Another Ninja

On 2010-03-09 15:48, Rosenbaum, Larry M. wrote:

From: Yet Another Ninja [mailto:sa-l...@alexb.ch]

On 3/4/2010 7:34 PM, Rosenbaum, Larry M. wrote:

From: Karsten Bräckelmann [mailto:guent...@rudersport.de]

On Thu, 2010-03-04 at 00:12 +0100, Yet Another Ninja wrote:

On 3/3/2010 10:09 PM, Karsten Bräckelmann wrote:

On Wed, 2010-03-03 at 15:38 -0500, Rosenbaum, Larry M. wrote:

Is there still a reason for this update channel?

90_sare_freemail.cf.sare.sa-update.dostech.net

Or is it now built in to SA v3.3.0?

  ^

20_freemail.cf and 20_freemail_domains.cf ?

90_sare_freemail.cf is still supported for ppl who haven't upgraded
to SA 3.3.x

Thanks for that addition and confirmation of status. :)

The original question and hence my answer was specifically about 3.3.x,
though, and whether it still is needed from external sources with that
version.


I'm doing the same additions to 20_freemail_domains.cf

Later this year, 90_sare_freemail.cf, will become unsupported.

Anybody using SA 3.3.x should drop 90_sare_freemail.cf usage.

Thanks, but I'm confused, as there are domains in 90_sare_freemail.cf

that are not currently in 20_freemail_domains.cf.

Hi Larry...

Never got around to do the diff... your msg triggered :-)
Unless I borked it, it should now included the missing from
90_sare_freemail.cf


I still don't see the domains in 20_freemail_domains.cf.


If you've done the diff, pls post the missing domains.



Re: How to find where email server has been blacklisted

2010-03-08 Thread Yet Another Ninja

On 2010-03-08 1:24, Rops wrote:

Hello

I'm trying to figure out why some emails get lost, which most likely is due
to emails killed by ISP spam filter due to high spam score these lost email
have.

How to find out if some mail server is blacklisted and where?
Is there any central database for queries from all different blacklists?
Also IP based search is required and data when and why.


IP based search may be needed, as the server in question has its mailbox
hosted with the ISP, but I believe that the virtual server can still be
blacklisted separately based on its static IP and not the whole ISP mail
server.

Additional side effect is that emails sent inside company get lost more
often - I believe because  they virtual server is blacklisted somewhere and
therefore emails sent always gather higher spam score.
So the question is to find out where it's blacklisted?


Thanks for any help and guidelines how and where to continue!


http://www.robtex.com/


Re: Zen.spamhous.org score for spam assassin...

2010-03-08 Thread Yet Another Ninja

On 2010-03-08 12:29, Dhaval Soni wrote:

Dear All,

I want to use zen.spamhous.org for spam check. So we need to do entry in
spam.lists.conf file. But do we need to mention score for it? If yes, where
to do it?


spam.lists.conf is not part of Spamassassin (sounds like MailScanner)

Pls see:
http://www.spamhaus.org/faq/answers.lasso?section=Spamhaus%20DBL
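
(For what it's worth, stock SpamAssassin already queries Zen through its
default Received-header DNSBL rules, so normally nothing needs adding. If you
did want a local rule anyway, it would look roughly like this sketch - the
rule name, set name and score are placeholders, not shipped rules:)

header    LOCAL_RCVD_IN_ZEN  eval:check_rbl('zenlocal-lastexternal', 'zen.spamhaus.org.')
describe  LOCAL_RCVD_IN_ZEN  Last external relay listed in Spamhaus Zen
tflags    LOCAL_RCVD_IN_ZEN  net
score     LOCAL_RCVD_IN_ZEN  2.0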


Re: 90_sare_freemail.cf.sare.sa-update.dostech.net

2010-03-03 Thread Yet Another Ninja

On 3/3/2010 10:09 PM, Karsten Bräckelmann wrote:

On Wed, 2010-03-03 at 15:38 -0500, Rosenbaum, Larry M. wrote:
Is there still a reason for this update channel? 


90_sare_freemail.cf.sare.sa-update.dostech.net

Or is it now built in to SA v3.3.0?


20_freemail.cf and 20_freemail_domains.cf ?



90_sare_freemail.cf is still supported for ppl who haven't upgraded
to SA 3.3.x


I'm doing the same additions to 20_freemail_domains.cf

Later this year, 90_sare_freemail.cf, will become unsupported.

Anybody using SA 3.3.x should drop 90_sare_freemail.cf usage.



Re: .pn TLDs not recognized for util_rb_2tld?

2010-02-25 Thread Yet Another Ninja

On 2/25/2010 11:41 PM, Daniel McDonald wrote:

config: SpamAssassin failed to parse line, co.at.pn is not valid for
util_rb_2tld, skipping: util_rb_2tld co.at.pn
config: SpamAssassin failed to parse line, co.uk.pn is not valid for
util_rb_2tld, skipping: util_rb_2tld co.uk.pn
config: SpamAssassin failed to parse line, com.au.pn is not valid for
util_rb_2tld, skipping: util_rb_2tld com.au.pn
channel: lint check of update failed, channel failed


$ dig +short 5.2.3.90_2tld.cf.sare.sa-update.dostech.net txt
201002251100

Shouldn't those have util_rb_3tld?



thx for spotting - fixed copy/paste goof

Pls allow some time for updates to be spread


Re: 90_2tld.cf / / 90_3tld.cf

2010-02-02 Thread Yet Another Ninja

On 2/1/2010 10:50 PM, Karsten Bräckelmann wrote:

On Mon, 2010-02-01 at 22:33 +0100, Yet Another Ninja wrote:
- If someone knows how to put these two rule sets in one file and 
activate according to SA version, pls let me know... I'm stumped.


Preprocessing Options [1] in the SA Conf documentation. :)

if (version >= 3.003000)
  # util_rb_3tld blob goes here
endif

Hmm, doesn't mention >= specifically. Guess it's supported, though,
otherwise you'd need a minor hack like > 3.002999.

  guenther


[1] 
http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html#preprocessing_options


Thanks for the hint.

I've updated http://www.rulesemporium.com/rules/90_2tld.cf to check for 
version.


Should be visible as soon as SARE mirrors are in sync.
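
For the archives, a minimal sketch of what the combined, version-guarded file
can look like (the util_rb_3tld entries are borrowed from the .pn thread
above; the util_rb_2tld line is just a placeholder, not the full SARE list):

# 2-level TLD entries - parse on any SA version
util_rb_2tld  co.nr  za.net

# 3-level TLD entries - only parse on SA 3.3.0 and later
if (version >= 3.003000)
  util_rb_3tld  co.at.pn  co.uk.pn  com.au.pn
endif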


Re: 90_2tld.cf / / 90_3tld.cf

2010-02-02 Thread Yet Another Ninja

On 2/2/2010 1:03 PM, Randal, Phil wrote:

There's an extraneous linebreak or two in there:

#

SA >= 3.3.0
if (version >= 3.003000)


SA >= 3.3.0 was missing a comment...
fixed
thx


90_2tld.cf / / 90_3tld.cf

2010-02-01 Thread Yet Another Ninja

For those using SA 3.3.x I've split the tld files :

SA >= 3.3.x  ONLY!
http://www.rulesemporium.com/rules/90_3tld.cf


SA <= 3.2.4
http://www.rulesemporium.com/rules/90_2tld.cf


SA 3.3.x users will require both files.

- If someone knows how to put these two rule sets in one file and 
activate according to SA version, pls let me know... I'm stumped.


- If someone thinks this should be added to mainstream SA, collect votes 
and submit a bug.


enjoy


Re: [Sare-users] painting everybody in Taiwan with the same brush

2010-01-28 Thread Yet Another Ninja


On 1/28/2010 5:23 PM, Adam Katz wrote:

However, as you noted earlier:

It's all because
http://www.rulesemporium.com/rules/70_sare_header1.cf
header   SARE_RECV_SPAM_DOMN0b Received =~ 
/\bdynamic.hinet\.(?:com|net|org|info)/
describe SARE_RECV_SPAM_DOMN0b Email passed through apparent spammer domain
scoreSARE_RECV_SPAM_DOMN0b 1.666


This rule is poorly written as it does not limit its examination to
the last external relay.  Were SARE accepting revisions (and assuming
I've read the intent right), it should be reworked so as to be defined
as (be wary of mail agent rewrapping):

header SARE_RECV_SPAM_DOMN0b X-Spam-Relays-External =~ /[^\]]+
rdns=[^ ]{0,25}\bdynamic.hinet\.(?:com|net|org|info)(?:\.tw)? /


the rule has been scored 0.0

It can be replaced by a SA rule if desired.


Dear Santa

2009-12-16 Thread Yet Another Ninja

Dear Santa,

 SA users hope Justin Mason has moved into his newly renovated home and 
he finds the time & energy to bring the SOUGHT rule magic back to us.

 As this is an Xmas wish, we hope you, Santa Claus, will help him.

Axb
PS: If JM posts a link to his Amazon wishlist, maybe we can all help him 
decorate the new place :-)





Re: emailreg.org - tainted white list

2009-12-16 Thread Yet Another Ninja

On 12/16/2009 3:23 PM, LuKreme wrote:

On 16-Dec-2009, at 07:12, Bowie Bailey wrote:

uses.  The only thing that really matters is how effective they are.  If
a blacklist blocks spammers without blocking too many legitimate mails,
use it.  If a whitelist allows legitimate mail without sending through
too many spams, use it.  Even lists that have a fair number of false
hits are useful in SA -- just with lower scores.



The trouble with this is how often are these rules being re-examined and 
re-evaluated?


blabber... checkout SVN - follow dev list... HABEAS is history...





Re: emailreg.org - tainted white list

2009-12-16 Thread Yet Another Ninja

On 12/16/2009 6:16 PM, Charles Gregory wrote:

On Wed, 16 Dec 2009, Yet Another Ninja wrote:

blabber... checkout SVN - follow dev list... HABEAS is history...


I believe the *point* here is that HABEAS is NOT 'history' for ordinary 
systems running ordinary sa-update on 3.2.5.


they can adjust scores if they don't approve of what has been delivered, 
right? If they don't it means they're ok, don't care or can't be 
bothered, pick what fits.


My rules (in /var/lib/spamassassin) still include the strong negative 
scores for HABEAS, as discussed here.


funny.. my rules show a 0 score for HABEAS stuff, same with all the 
other certification services... oh wait!! I adjusted the scores myself
coz I didn't want them in my way.
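
(For anyone who wants the same: a minimal local.cf sketch - the rule names are
the ones shipped with 3.2.x, check your own ruleset before pasting:)

score HABEAS_ACCREDITED_COI  0
score HABEAS_ACCREDITED_SOI  0
score HABEAS_CHECKED         0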


So cool that I can do stuff like that without depending and/or waiting 
for a minor fix via Windows Update.


BIG thanks to Daniel Quinlan, Justin and all the others who came up with 
such a nifty system.

Also thanks to McAfee for your dev support.

I respect the freedom and privileges of developers who are not being 
paid for all their hard work, but I would appreciate it if statements 
like the one above could be more accurately phrased, to at least say 
HABEAS will be history after {date}, at which time sa-update channels 
will be updated


when SA 3.3.0 is released... when? when it's finished, as you have
already read in the dev list.


Sarcasm?
Yes...

moving on





Re: Spam from compromised web mails

2009-12-15 Thread Yet Another Ninja

On 12/15/2009 4:07 PM, Rajkumar S wrote:

On Tue, Dec 15, 2009 at 8:29 PM, Matt Garretson
ma...@assembly.state.ny.us wrote:

Do you use Bayes?  Bogofilter (another bayesian filter) catches
those here.  The one you posted scored 0.94 here and would have
been dropped.


I am not using bayes as of now, SA is site wide and so proper training
is a problem.


even when used site-wide, autolearning will help your detection a LOT.
Don't underestimate it...



Re: Site-wide Bayes

2009-12-15 Thread Yet Another Ninja

On 12/15/2009 5:49 PM, Charles Gregory wrote:

On Tue, 15 Dec 2009, Matt Garretson wrote:
Heartily agreed. Site-wide bayes here (single database for 2000+ 
users) catches 40% of the spam here.


But what is the FP rate? Is it safe for an ISP with a widely varied user 
base to use site-wide Bayes?


from my experience, yes.

the auto-fodder is just as diverse making Bayes very rugged and 
effective. You just need a good amount of ham traffic...




Re: Spam from compromised web mails

2009-12-15 Thread Yet Another Ninja

On 12/16/2009 8:24 AM, Rajkumar S wrote:

On Tue, Dec 15, 2009 at 9:07 PM, Yet Another Ninja sa-l...@alexb.ch wrote:

even using site wide, autolearning will help your detection a LOT.
Don't underestimate it...


When running site wide, how do you get ham to train bayes? I can
manage spam by spam reporting and such, but getting ham without
breaching the privacy of our users is my problem.

raj


I don't do any manual training, ever. SA's butler, autolearn, does 
it for me.


bayes_auto_learn  1
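
If you want more control over what autolearn feeds in, the thresholds can be
tuned as well (values below are the stock defaults, shown only as a sketch):

bayes_auto_learn                    1
bayes_auto_learn_threshold_nonspam  0.1
bayes_auto_learn_threshold_spam     12.0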

h2h

Axb


Re: [sa] RE: emailreg.org - tainted white list

2009-12-14 Thread Yet Another Ninja

On 12/14/2009 10:23 PM, Martin Gregorie wrote:

May I suggest that handling whitelist or blacklist rules and any
associated plugins by packaging them as separately installable modules
may be of benefit to SA maintainers. The idea is to reduce the SA dev
workload by handing off responsibility for maintaining and bugfixing
such modules to external developers. These may, as at present, be the
person who independently develops the module or the people who are
responsible for the resources it queries. Here's a little more detail:

- exclude the modules from the default SA configuration and from SA
  updates.
- create a library of downloadable modules, one for each external
  resource. Each module consists of:

  - a .cf file and a .pm file, if required, that should be installed by
putting both in /etc/mail/spamassassin
  - version info
  - installation and configuration instructions
  - attributions: author, the author's affiliations, etc
  - a disclaimer saying that SA distributes the module as is and without
liability or responsibility for its correctness

- anybody, including whitelist owners, can supply a module and will be
  solely responsible for maintaining it.
- modules MUST be accompanied by regression test data in the form of
  messages that demonstrate hits, misses and corner tests.
- SA devs should review the documentation and verify module operation
  using the supplied test data to show that the module does what it says
  on the tin and doesn't crash SA or interfere with other rules/plugins
  before accepting a module for publication. 
- the modules should be included in regression tests for new SA

  versions. If a module fails a regression test it is excluded from the
  library and its author notified. This way unmaintained modules will
  eventually disappear with minimal work from SA devs apart from
  removing the module from the distribution library and adding it to a
  list of no longer supported modules. 

  
There may be problems with this approach that I'm not aware of, but I'm

floating it because AFAIK nobody else has suggested it and it may defang
some of the discussions around whitelists, etc. by making the use of
such rules and modules independent of the SA project.


your modules are all there already and much of this is already managed as
you suggest: they're called rules.. you can even switch them on or off,
or add your own plugins/modules.
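
e.g., in broad strokes (the plugin names are stock ones; file locations depend
on your install, and SOME_SHIPPED_RULE is a placeholder):

# a *.pre file in the site config dir decides which plugins get loaded
loadplugin   Mail::SpamAssassin::Plugin::URIDNSBL
#loadplugin  Mail::SpamAssassin::Plugin::Razor2    # left commented out = disabled

# local.cf: setting a rule's score to 0 switches that rule off
score  SOME_SHIPPED_RULE  0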


SA provides an Open Source FRAMEWORK which caters to many millions of 
systems - if it doesn't fit your needs, use it as you wish and/or fork it.

Many do that with the ruleset - many don't

SA devs are volunteers. What's stopping you from actively contributing 
to the development?


Get familiar with the Wiki, checkout SVN, look at the masscheck code,
bathe in the Wiki.


Following a comprehensive set of standards, anybody can contribute 
patches/fixes/etc.


h2h

Axb


Re: [sa] RE: emailreg.org - tainted white list

2009-12-14 Thread Yet Another Ninja

On 12/14/2009 10:55 PM, Daniel J McDonald wrote:

I'd love to have the clamav unofficial signature families scored.  I
have a fine guess as to how relevant they are, but it is just that - a
guess.  


someone, somewhere is already converting ClamAV signatures to HUGE (slow)
rule files, forgot where I saw them. Google around...








Re: Interesting low scoring phish

2009-12-07 Thread Yet Another Ninja

On 12/7/2009 3:42 PM, rich...@buzzhost.co.uk wrote:

http://pastebin.com/m7c1c17d

Interesting insofar as it appears to be whitelisted??? Is this some kind
of well known US email or hosting service?

Sane missed it, the dnsbl's have missed it and the content filtering has
missed it. So it's a tasty morsel of spam :-)


By clicking on the link you agree to download...

where's the URL?
html part missing in sample?


rule_du_jour: AXB_CID_YARIGHT

2009-12-06 Thread Yet Another Ninja

this rule won't work for long :-)

ifplugin Mail::SpamAssassin::Plugin::MIMEHeader
mimeheader AXB_CID_YARIGHT  Content-ID =~ /^\00\{DIGIT2\}/
score  AXB_CID_YARIGHT  3.0
endif


score higher if you wish...

have a {ENJOY_VAR} Sunday!


Re: HABEAS_ACCREDITED WHY BY DEFAULT?

2009-12-04 Thread Yet Another Ninja

On 12/4/2009 10:57 AM, rich...@buzzhost.co.uk wrote:
  FINAL

This is not a social club, it's a question and issues list for
Spamassassin. My question and issue is why, by default, does
Spamassassin use the HABEAS white list, and why is it out of the box set
with a score to favour delivery of their junk? It's a fair question. The
answer 'just change the score' is not the correct answer. 


the answer is totally correct. SA is a framework, which luckily allows 
YOU to do whatever you want with it, so please do whatever YOU want (that
does not include beating a dead horse on the list) and move on.



The correct answer will be precisely why this state of affairs exists.


- because developers think/have thought its a good idea.

- because nobody other than you makes such a noise about it. And YOU, who
are so against it, have you submitted a bug to have whatever reconsidered?


EOT







seek-phrases-in-log pattern length

2009-11-26 Thread Yet Another Ninja
Is there a way to limit the pattern size in rules created by 
seek-phrases-in-log ?


I'd like to avoid creating rules using patterns with +200 characters.

hints very appreciated.

Axb


Re: masscheck Dumptext.pm line 26.

2009-11-25 Thread Yet Another Ninja

On 11/25/2009 3:56 AM, John Hardin wrote:

On Tue, 24 Nov 2009, Justin Mason wrote:


that's normal.  can be ignored

On Tue, Nov 24, 2009 at 21:04, Yet Another Ninja sa-l...@alexb.ch 
wrote:



When running masscheck calling:

/home/mc/masscheck/spamassassin/trunk/masses  nice ./mass-check \
 --cf='loadplugin Dumptext plugins/Dumptext.pm' \
 --cf='loadplugin Mail::SpamAssassin::Plugin::Check' \
 -j=2 -n -o --rules='^(?!JM_SOUGHT)(?!T_JM_SOUGHT)' \
 spam:dir:/home/mc/Maildir/.SPAM/cur \
 /home/mc/masscheck/seekrules/w.s )

I get this output and am totally stumped:

Wide character in print at
/home/mc/masscheck/spamassassin/trunk/masses/plugins/Dumptext.pm line 
26.


anybody any ideas?


I did open a bug about masscheck stalling on multibyte characters when 
run with multiple threads, and offered a patch:


   https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6226

Does that fix it for you?


patched - ran routine..

no change -

Wide character in print at 
/home/mc/masscheck/spamassassin/trunk/masses/plugins/Dumptext.pm line 26.


seems here to stay :-)

thx

:-(


Re: well, isnt that special...

2009-11-25 Thread Yet Another Ninja

On 11/25/2009 11:29 PM, Alex wrote:

iptables -A FIREWALL -s 127.0.0.0/8 -j DROP


Very good. That was nearly funny :-) Why don't you add:
iptables -A FIREWALL -s 0.0.0.0/0 -j DROP and enjoy the silence :-)


Trouble is that you have to be the one that drives to the colo to
eventually undo the rules :-)

Speaking of fw rules, has anyone considered something to automate the
SANS top 10?

http://isc.sans.org/top10.html

Would that be effective?


not relevant to Spamassassin, is it?

if you have to go way off topic, please be considerate and add an OT:
tag to the subject.. > /dev/null


or try:  http://spam-l.com/mailman/listinfo


Re: emailBL devel ?

2009-11-24 Thread Yet Another Ninja

On 11/24/2009 6:22 PM, R-Elists wrote:

didn't anyone think that the emailBL project was good enough in adding an
extra factor of protection to continue development?

 - rh


Freemail.pm plugin does it pretty well without the overhead and cron'd 
replication lag...
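
A minimal sketch of what wiring it up can look like (the domain list and score
are placeholders; the plugin and its check_freemail_from() eval ship with SA
3.3.x - on older installs you load Freemail.pm from wherever you dropped it):

loadplugin Mail::SpamAssassin::Plugin::FreeMail
freemail_domains  gmail.com hotmail.com yahoo.com

header    LOCAL_FREEMAIL_FROM  eval:check_freemail_from()
describe  LOCAL_FREEMAIL_FROM  Sender address is in a freemail domain
score     LOCAL_FREEMAIL_FROM  0.5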


Re: emailBL devel ?

2009-11-24 Thread Yet Another Ninja

On 11/24/2009 6:34 PM, Benny Pedersen wrote:

On tir 24 nov 2009 18:30:15 CET, Yet Another Ninja wrote
Freemail.pm plugin does it pretty well without the overhead and cron'd 
replication lag...


just one problem with freemail: it should list all domains as freemail by
default, unless there is a clear sign of payment to get it


the other way around is too easy for spammers



you mean a rhsdnsbl?
iirc, there's a bunch of them around.

seems simpler than adding 1 domains to freemail's config .-)


Re: emailBL devel ?

2009-11-24 Thread Yet Another Ninja

On 11/24/2009 7:10 PM, Benny Pedersen wrote:

On tir 24 nov 2009 19:02:29 CET, Yet Another Ninja wrote

seems simpler than adding 1 domains to freemail's config .-)


that's why i'd like to change it to be paidmail.pm with lists of paid domains

got it now ? :)

spammers can get any free domain and it can continue as a freemail, but 
when some have to pay, oh well


interesting part is that i see most spam comes from freemail domains, and
when i see spam from paid domains it's the content in the body that is spam,
not the sender domain; so far i see a pattern there




sounds like a great idea.. let us know when you're ready so we can start 
testing...


masscheck Dumptext.pm line 26.

2009-11-24 Thread Yet Another Ninja

When running masscheck calling:

/home/mc/masscheck/spamassassin/trunk/masses  nice ./mass-check \
  --cf='loadplugin Dumptext plugins/Dumptext.pm' \
  --cf='loadplugin Mail::SpamAssassin::Plugin::Check' \
  -j=2 -n -o --rules='^(?!JM_SOUGHT)(?!T_JM_SOUGHT)' \
  spam:dir:/home/mc/Maildir/.SPAM/cur \
   /home/mc/masscheck/seekrules/w.s )

I get this output and am totally stumped:

Wide character in print at 
/home/mc/masscheck/spamassassin/trunk/masses/plugins/Dumptext.pm line 26.


anybody any ideas?

thx
Axb


Re: masscheck Dumptext.pm line 26.

2009-11-24 Thread Yet Another Ninja

On 11/25/2009 3:56 AM, John Hardin wrote:

On Tue, 24 Nov 2009, Justin Mason wrote:


that's normal.  can be ignored

On Tue, Nov 24, 2009 at 21:04, Yet Another Ninja sa-l...@alexb.ch 
wrote:



When running masscheck calling:

/home/mc/masscheck/spamassassin/trunk/masses  nice ./mass-check \
 --cf='loadplugin Dumptext plugins/Dumptext.pm' \
 --cf='loadplugin Mail::SpamAssassin::Plugin::Check' \
 -j=2 -n -o --rules='^(?!JM_SOUGHT)(?!T_JM_SOUGHT)' \
 spam:dir:/home/mc/Maildir/.SPAM/cur \
 /home/mc/masscheck/seekrules/w.s )

I get this output and am totally stumped:

Wide character in print at
/home/mc/masscheck/spamassassin/trunk/masses/plugins/Dumptext.pm line 
26.


anybody any ideas?


I did open a bug about masscheck stalling on multibyte characters when 
run with multiple threads, and offered a patch:


   https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6226

Does that fix it for you?


Thanks John
Will test...


possible Kerio msg-id bork

2009-11-11 Thread Yet Another Ninja
Anybody here using some flavour of Kerio Mail Server... pls get back to 
me, offlist!


thanks

AXB


Re: sought rules

2009-11-04 Thread Yet Another Ninja

On 11/4/2009 5:22 PM, Charles Gregory wrote:

On Wed, 4 Nov 2009, Bowie Bailey wrote:

The SA core rules are not updated very often.  For the most part, they
just work.  If you are not already doing so, you may want to consider
Justin's Sought ruleset.  It is dynamically generated and updated every
4 hours or so.

http://wiki.apache.org/spamassassin/SoughtRules


Is there a way to examine the sought rules *before* installing them into
my spamassassin? Or at least a 'readme' so that if I download them via 
sa-update I can know which files will be created and how to remove them.
I have a number of custom rules and want to vet the auto-generated rules 
for overlap


- Charles


sa-update --help will show you how



Re: [SPAM:15.8]

2009-11-03 Thread Yet Another Ninja

On 11/3/2009 12:40 PM, rich...@buzzhost.co.uk wrote:

On Tue, 2009-11-03 at 10:55 +, Ned Slider wrote:

rich...@buzzhost.co.uk wrote:

RUSSIAN_LINKS BODY: link to .ru

Appears to miss the example:
http://pastebin.com/m7ae0f8ec

Unless I'm missing something ?


Well, let's see your RUSSIAN_LINKS rule as it hits fine on my narod and
ru tld rules.


uri       LOCAL_URI_RU        m{https?://.{1,40}\.ru\b}
describe  LOCAL_URI_RU        contains link to Russian domain

uri       LOCAL_URI_NAROD_RU  m{https?://.{1,40}\.narod\.ru\b}
describe  LOCAL_URI_NAROD_RU  contains link to http://foo.narod.ru




Now here is the funny thing, mine look *exactly like that*.
{now :-P}


bet its formatting is borked...
try using a rawbody rule...
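
i.e. something along these lines (a sketch reusing the pattern from the
thread; the score is a placeholder):

rawbody   LOCAL_RAW_URI_RU  m{https?://.{1,40}\.ru\b}i
describe  LOCAL_RAW_URI_RU  raw (undecoded) body contains a link to a .ru host
score     LOCAL_RAW_URI_RU  0.5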


Re: bringing clamav into the loop?

2009-10-31 Thread Yet Another Ninja

On 10/31/2009 2:33 PM, Gene Heskett wrote:

On Saturday 31 October 2009, Yet Another Ninja wrote:

On 10/31/2009 2:16 PM, Gene Heskett wrote:

Greetings;

Does anyone have a procmail recipe that incorporates clamav into the
checks, and one that handles the clamav output to /dev/null the viri etc?

At least I assume clamav doesn't auto-delete, I've not yet studied all
the docs, but do have freshclam running apparently ok.

this works for me:
:0cW
:
|clamdscan --no-summary --stdout -

CLAMAV_CODE=$?

:0

* CLAMAV_CODE ?? 1
/dev/null

This looks like what I had in mind.  But since I don't have that part checked 
out yet, would it then delete the mail because clamdscan had an error?  I'll 
enable the second after the first is working. :)


it will only delete the msg if clamdscan returns code 1
if it errors out, it won't return code 1

running only the first part will only show it did something if you 
enable procmail logging


Re: bringing clamav into the loop?

2009-10-31 Thread Yet Another Ninja

On 10/31/2009 2:33 PM, Gene Heskett wrote:

On Saturday 31 October 2009, Yet Another Ninja wrote:

On 10/31/2009 2:16 PM, Gene Heskett wrote:

Greetings;

Does anyone have a procmail recipe that incorporates clamav into the
checks, and one that handles the clamav output to /dev/null the viri etc?

At least I assume clamav doesn't auto-delete, I've not yet studied all
the docs, but do have freshclam running apparently ok.

this works for me:
:0cW
:
|clamdscan --no-summary --stdout -

CLAMAV_CODE=$?

:0

* CLAMAV_CODE ?? 1
/dev/null

This looks like what I had in mind.  But since I don't have that part checked 
out yet, would it then delete the mail because clamdscan had an error?  I'll 
enable the second after the first is working. :)


my recipe was stolen from this

see
http://wiki.clamav.net/bin/view/Main/ClamAndProcmail


Re: Constant Contact

2009-10-16 Thread Yet Another Ninja

On 10/16/2009 10:25 PM, Adam Katz wrote:
  I suppose it's possible that your customer base is large enough that

there aren't any repeat offenders and that each case is unique ...
digging through my archives, I don't see more than 2x of any message
from a CC customer.


look at it this way: some snowshoe IPs, CC snowshoes customers





Re: Spam filtering on outgoing email

2009-10-10 Thread Yet Another Ninja

On 10/10/2009 10:32 PM, Warren Togami wrote:

On 10/10/2009 11:27 AM, Marc Perkel wrote:

I'm thinking about starting a service to filter spam on outgoing email.
I was wondering if anyone has any experience doing this and has some
advice on how to do it. These customers will be businesses, not freemail
customers, and one of the only real threats is if someone gets hacked or
has some kind of web form that gets abused.

The advantages for customers would be that many of them have dynamic IPs
or static IP names that look dynamic and are worried about being
blacklisted. And I'm hoping that by tracking who they send email to that
I can match up replies and white list them.

Does this sound like a good idea? I'd like to hear from someone who is
doing this or people who have ideas about it.

Thanks in advance for your advice.



Some customers might find other aspects useful as well:

* If suspicious looking mail is outgoing, then it is highly likely an 
infected host on their network.  Part of a useful service would like an 
intrusion detection system telling the customer the addresses of their 
problem hosts.

* If outgoing mail is below a certain score you could auto-sign with DKIM.


Guys

this is totally offtopic and unrelated to SA.

If you insist, pls use an OFF-TOPIC tag or move to some other list.
seems like something for spam-l.

Warren, for general brainstorming sessions you might want to subscribe to
spam-l.com's list.


y'all have a good on-topic weekend

Axb



Re: Harvested Fresh .cn URIBL

2009-10-07 Thread Yet Another Ninja

On 10/7/2009 5:00 PM, Warren Togami wrote:
  It seems then the only way to feed a URIBL fresh .cn domains would be a
spam trap.  This proposed URIBL would be extremely easy to build on the 
infrastructure of existing trap-based DNSBL's like PSBL, HOSTKARMA or 
SEM.  My own volume of spam is too small to do this.


you haven't really looked into Spamhaus data & SA rules, have you?
have you looked into SURBL/URIBL's data & datafeeds?

what's the deal here? do you not represent RedHat?
have no access to RH spam data?

just can't imagine RH can't provide itself or the community with plenty 
of spam data.



Opinions of this proposal?


sorry - imo, reinventing the wheel some time too late




Re: Harvested Fresh .cn URIBL

2009-10-07 Thread Yet Another Ninja

On 10/7/2009 8:01 PM, Rob McEwen wrote:

Blaine Fleming wrote:

I know my users never see .cn domains in their inbox
and if I didn't run a blacklist I wouldn't either.


Which brings up an interesting idea. I wonder how many legit non-spam
..cn domains exist? Surely it is a fraction of a percent of the # of .cn
domains used for spam purposes, correct?


nope.. there are zillions and growing.
same thought could apply to .com or .org or .net for chinese mom & pop
users.


and it's still pretty off-topic in the SA users list and better placed in
spam-l.com


Re: Uppercase E-mail in Latin America

2009-10-06 Thread Yet Another Ninja

On 10/6/2009 2:33 AM, Warren Togami wrote:

Please excuse me, I used faulty logic.

I wasn't asking you anything further.  I meant I asked this friend for 
more details and it seems non-technical users are the most likely
type of people to type legitimate mail in all caps.


Warren


so what score is being added to this uppercase stuff?

score UPPERCASE_50_75 0.001 0.490 0.001 0.001
score UPPERCASE_75_100 2.402 1.930 1.127 1.528

reminder: with default SA scores, one rule on its own won't tag something as
spam.
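
(and if they still get in the way locally, zeroing them in local.cf switches
those rules off entirely - a sketch using the rule names quoted above:)

score UPPERCASE_50_75   0
score UPPERCASE_75_100  0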



where's the problem? what's the worry?

