Re: problems with TVD_SPACE_RATIO

2009-05-27 Thread mouss
Karsten Bräckelmann wrote:
 On Tue, 2009-05-26 at 22:12 +0200, mouss wrote:
 Karsten Bräckelmann wrote:
 
 Bug 6119 has been opened already. Please attach additional samples
 there, rather than opening a new bug for every sample.  Thanks!

   https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6119
 I've attached a few. If more is needed, just ask...
 
 No more auto-generated content, please. :)  Real, human written samples
 on the other hand...
 

4454 and 4455 are human written. (if people who spend a lot of time in
front of a terminal are still considered human ;)

and 4454 is a one line message, but the signature causes the hit.


 See comment 7.
 
   guenther
 



Re: Plugin for URL shorteners / redirects

2009-05-27 Thread Henrik K
On Tue, May 26, 2009 at 06:10:34PM -0700, John Hardin wrote:
 On Wed, 27 May 2009, Jason Haar wrote:

 Why can't SURBL be expanded to support full URLs instead of just the  
 hostname? That way you could blacklist a.bad.domain as well as  
 xttx://tinyurl . com/redirect-to-bad-domain? Some form of BASE64  
 encoding would be needed of course, but why not?

 I'd suggest hex- or base64-encoding the MD5 hash of the URI, as is being  
 done in Email BL.

You won't see paths on URIBL or SURBL, as they haven't even adopted
wildcards (try googling the discussions from years ago). Better to leave the
path cases to publicly rsyncable data like Sanesecurity.
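
As a rough illustration of the hashing idea John describes above (a sketch only, not how SURBL, URIBL, or any existing list works): the full URI is hashed with MD5 and the digest is rendered as hex or base64, so the lookup key has a fixed length whatever the URI looks like. The function name `uri_lookup_keys` and the example URI are made up for illustration.

```python
import base64
import hashlib


def uri_lookup_keys(uri: str) -> dict:
    """Return hex and base64 renderings of the MD5 digest of a URI.

    Hypothetical helper: no blacklist mentioned in this thread is known
    to accept keys in this form.
    """
    digest = hashlib.md5(uri.encode("utf-8")).digest()  # 16 raw bytes
    return {
        "hex": digest.hex(),  # 32 hex characters
        "base64": base64.b64encode(digest).decode("ascii"),  # 24 characters
    }


if __name__ == "__main__":
    keys = uri_lookup_keys("http://shortener.example/abc123")
    print(keys["hex"])
    print(keys["base64"])
```

The hex form would fit in a single DNS label; base64 is case-sensitive and can contain '/' and '+', so it suits DNS names less well.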



Re: problems with TVD_SPACE_RATIO

2009-05-27 Thread Michael Monnerie
On Wednesday 27 May 2009 mouss wrote:
 and 4454 is a one line message, but the signature causes the hit.

And my messages are just one-liners without .sig that should never hit 
this rule at all.

I don't have other examples in original format, but just a few days ago 
I got an FP report where this rule hit a normal, German, human-typed mail.
I'll restore the original score now to see if I get more reports.

mfg zmi
-- 
// Michael Monnerie, Ing.BSc.
--
Sorcerers have their magic wands:
  powerful, potentially dangerous tools with a life of their own.
Witches have their familiars:
  creatures disguised as household beasts that could,
  if they choose, wreak the witches' havoc.
Mystics have their golems:
  beings built of wood and tin brought to life to do their
  masters' bidding.
I have Linux.
--



Re: Plugin for URL shorteners / redirects

2009-05-27 Thread Justin Mason
Yes.  It immediately exposes a backchannel from the spam to the spammer,
thereby enabling a number of interesting security holes.

--j.

On Wed, May 27, 2009 at 05:25, Rob McEwen r...@invaluement.com wrote:
 Jason Haar wrote:
 Why can't SURBL be expanded to support
 full URLs instead of just the hostname? That way you could blacklist
 a.bad.domain as well as xttx://tinyurl . com/redirect-to-bad-domain?
 Some form of BASE64 encoding would be needed of course, but why not?

 Because spammers could easily generate a unique URL for each individual
 spam. They could then map this back to listings in URI blacklists and
 use that as a very cheap and effective way to listwash. And they only
 need to add a single wildcard (asterisk) hostname in their DNS server to
 accomplish this. As a result of this and similar tactics, URI lists
 would bloat exponentially and this would slow down the propagation of
 the data to rsync users and to DNS mirrors, as well as bringing the
 backend processing to its knees. Finally, there is some amount of
 reputation and registration (even if hidden) associated with a domain
 due to the fact that a domain *requires* ownership. URLs and subdomains
 are more ambiguous, which then also makes removal requests an extremely
 subjective and murky process.

 --
 Rob McEwen
 http://dnsbl.invaluement.com/
 r...@invaluement.com
 +1 (478) 475-9032





Re: Plugin for URL shorteners / redirects

2009-05-27 Thread Jeff Chan
On Wednesday, May 27, 2009, 1:39:11 AM, Justin Mason wrote:
 Yes.  it immediately exposes a backchannel from the spam to the spammer,
 thereby enabling a number of interesting security holes.

 --j.

Yes, it's impractical for some of the reasons Rob mentions, and
it would also allow any of the following:

1.  Listwashing
2.  Mapping out of spam traps
3.  Poisoning of spam traps
4.  Confirming delivery of spams and email addresses
etc.

Jeff C.





Re: Plugin for URL shorteners / redirects

2009-05-27 Thread Jeff Chan
On Tuesday, May 26, 2009, 6:20:13 PM, Jason Haar wrote:
 John Hardin wrote:

 Better still, the tinyurl-esque services should vet the URLs people
 submit against SURBL...

 They actually do. When I was trying to test Jonas URLredirect plugin, it
 was actually hard to get tinyurl.com to generate a link for some known
 spam URLs. I suspect they are indeed doing SURBL lookups. Hope I didn't
 end up blacklisting myself :-}

Yes, tinyurl and several other URL shortening services use SURBL
data to fight abuse of their services:

  http://www.surbl.org/redirect.html

Jeff C.
-- 
Jeff Chan
mailto:je...@surbl.org
http://www.surbl.org/



Re: problems with TVD_SPACE_RATIO

2009-05-27 Thread Karsten Bräckelmann
On Wed, 2009-05-27 at 09:21 +0200, Michael Monnerie wrote:
 On Wednesday 27 May 2009 mouss wrote:
  and 4454 is a one line message, but the signature causes the hit.

The fact that the mailing-list footer is forced onto the message with no
newline causes it. And the second hardly counts as human generated. ;)

 And my messages are just one-liners without .sig that should never hit 
 this rule at all.

Checked those samples from both of you. Lots more analysis of this eval
function added to the bug report.

See comment 12. Smells kinda fishy to me, and probably broke at some
point since its original introduction. :/


 I don't have other examples in original format, but just a few days ago 
 I got an FP report where this rule hit a normal, German, human-typed mail.
 I'll restore the original score now to see if I get more reports.

Hmm, I'd love to see that one. Any *human-typed* mail featuring a real
sentence should not trigger this. Unless it's followed directly by a
huge machine-generated paste or something, without an empty line...


-- 
char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4;
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1:
(c=*++x); c128  (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: Plugin for URL shorteners / redirects

2009-05-27 Thread Yet Another Ninja

On 5/27/2009 11:00 AM, Jeff Chan wrote:

On Tuesday, May 26, 2009, 6:20:13 PM, Jason Haar wrote:

John Hardin wrote:



Better still, the tinyurl-esque services should vet the URLs people
submit against SURBL...


They actually do. When I was trying to test Jonas URLredirect plugin, it
was actually hard to get tinyurl.com to generate a link for some known
spam URLs. I suspect they are indeed doing SURBL lookups. Hope I didn't
end up blacklisting myself :-}


Yes, tinyurl and several other URL shortening services use SURBL
data to fight abuse of their services:

  http://www.surbl.org/redirect.html



http://bit.ly/hnds8

This resource has permanently moved to
http://alvinabeate.narod.ru/pages.html.


Even if SURBL saw that URL, it doesn't list hosts under 2LTLDs, so
alvinabeate.narod.ru will not be listed.


Other URI blacklists won't ever see alvinabeate.narod.ru either, as it's 
not in the mail flow.






upgrad spamassassin

2009-05-27 Thread hateSpam

Dear All,
I have SpamAssassin 3.1.9 and I want to upgrade it to the new version, 3.2.5.
I would appreciate it if anyone could tell me how to upgrade it. Where should
I put the new version's files, and which command should I run?

I looked at the upgrade documentation but I didn't understand what I should do.

Thanks in advance

HateSpam
-- 
View this message in context: 
http://www.nabble.com/upgrad-spamassassin-tp23743952p23743952.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Re: upgrad spamassassin

2009-05-27 Thread McDonald, Dan
On Wed, 2009-05-27 at 07:44 -0700, hateSpam wrote:
 Dear All,
 I have spamassassin 3.1.9 

Running on... 
 [] Redhat linux version 6.0
 [] Minix
 [] OpenVMS
 [] Sun/OS 2.0
 [] Timex Sinclair ZX81
 [] Windows NT 3.02B
 [] Something else?
Installed using...
 [] tarball install
 [] CPAN
 [] RPM
 [] deb
 [] emerge
 [] part of cpanel
 [] something else?

 I want to upgraded it with new version 3.2.5

-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
www.austinenergy.com




Re: upgrad spamassassin

2009-05-27 Thread Bowie Bailey

hateSpam wrote:

Dear All,
I have spamassassin 3.1.9 I want to upgraded it with new version 3.2.5
I will appreciate if any one can tell me how can I upgrade it? Where should
i put new version files and which command should I run?

I looked at upgrade documentation but I didn't understand what should id do.

Thanks in advance

HateSpam
  


How did you install 3.1.9?  You need to install the new version the same 
way to avoid problems.


--
Bowie


Re: upgrad spamassassin

2009-05-27 Thread hateSpam

Thanks for the reply.
Sorry, I should have said that before. I am using CentOS Linux 5. For mail
delivery we are using Postfix version 2.3.3 and Procmail. 







Re: upgrad spamassassin

2009-05-27 Thread Rick Macdougall

McDonald, Dan wrote:
On Wed, 2009-05-27 at 07:44 -0700, hateSpam wrote:

Running on...
 [ ] Redhat linux version 6.0
 [ ] Minix
 [ ] OpenVMS
 [ ] Sun/OS 2.0
 [X] Timex Sinclair ZX81
 [ ] Windows NT 3.02B
 [ ] Something else?

Installed using...
 [ ] tarball install
 [ ] CPAN
 [ ] RPM
 [ ] deb
 [ ] emerge
 [ ] part of cpanel
 [X] Sub-Space channel to V'ger
 [ ] something else?

Sorry, Couldn't resist.

Regards,

Rick


Re: upgrad spamassassin

2009-05-27 Thread hateSpam

I didn't install it; it was already installed. Do you mean I should delete
the current SpamAssassin and install the new version?




Re: upgrad spamassassin

2009-05-27 Thread Rick Macdougall

hateSpam wrote:

Thanks for reply.
Sorry, I should say that before. I am using CentOS Linux 5. For mailing
delivery we are using Postfix version 2.3.3 and Procmail. 





Yes, but the most important question still remains.

How did you originally install SpamAssassin ?

Regards,

Rick




Re: upgrad spamassassin

2009-05-27 Thread Bowie Bailey

hateSpam wrote:

I haven't installed it. it was already installed. Do you mean I should delete
the current spamassassin and reinstall new version?
  


To upgrade SpamAssassin, you normally just install the new version on 
top of the old one.  But if you install with different settings or a 
different install method than the old one, you can end up with two 
versions on the system or pieces of the two in different places.  This 
can cause some really annoying problems.


You can try to figure out where the old one came from.

If it was installed via CPAN, this command will show the install dates:
$ perldoc -t perllocal | grep SpamAssassin

If it was installed via rpm, you can look for it this way:
$ yum list installed 'spamassassin*'

If neither of those two commands gives any output, then it was probably 
installed from source.  You can poke around the system to see if the 
source for the old version is still around anywhere.  If so, you can 
look in the config.log file to see what the configure command looked 
like and build the new one with the same command.


If in doubt, the safest thing to do is to remove the old one and then 
install the new one from scratch via whichever method you prefer.  If 
the old one was an rpm, you can simply use the 'yum remove' command to 
get rid of it.  Otherwise, you'll need to dig into Perl's module 
directories and the system binary directories (/usr/bin, /usr/local/bin, 
etc) and remove it yourself.
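
The checks above can be sketched as a small decision helper. It only looks at text you paste in from the two commands; it does not run anything itself. The function name and the return labels are invented for illustration, and the source-install answer is a guess, just as it is above.

```python
def guess_install_methods(perllocal_output: str, yum_output: str) -> list:
    """Guess how SpamAssassin was installed from the two commands' output.

    perllocal_output: output of  perldoc -t perllocal | grep SpamAssassin
    yum_output:       output of  yum list installed 'spamassassin*'
    """
    methods = []
    if "SpamAssassin" in perllocal_output:
        methods.append("cpan")
    # Ignore yum's status lines; only a package line counts as a hit.
    if any(line.lower().startswith("spamassassin")
           for line in yum_output.splitlines()):
        methods.append("rpm")
    # Neither command reported anything: probably built from source.
    return methods or ["source (probably)"]


if __name__ == "__main__":
    sample = "Installed Packages\nspamassassin.x86_64  3.1.9-1.el5  installed\n"
    print(guess_install_methods("", sample))
```

If both labels come back, two copies may be present on the system, which is exactly the situation to sort out before upgrading.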


--
Bowie


Re: upgrad spamassassin

2009-05-27 Thread hateSpam

I don't know, because somebody else installed it. He is no longer here, and
unfortunately there is no documentation of how he installed it.

Regards
hateSpam

 




Re: upgrad spamassassin

2009-05-27 Thread Ned Slider

hateSpam wrote:

Thanks for reply.
Sorry, I should say that before. I am using CentOS Linux 5. For mailing
delivery we are using Postfix version 2.3.3 and Procmail. 





In that case you need to update your CentOS system as CentOS is using 
the latest SpamAssassin.


'yum update' should do the trick.






 







Re: upgrad spamassassin

2009-05-27 Thread Bowie Bailey

Ned Slider wrote:

hateSpam wrote:

Thanks for reply.
Sorry, I should say that before. I am using CentOS Linux 5. For mailing
delivery we are using Postfix version 2.3.3 and Procmail.


In that case you need to update your CentOS system as CentOS is using 
the latest SpamAssassin.


'yum update' should do the trick.


Assuming that he is using the SpamAssassin rpm.  My CentOS box has SA 
installed from CPAN.


This command will confirm if the package is installed via yum/rpm:
$ yum list installed 'spamassassin*'

and if so, 'yum update' will update it along with everything else on the 
system.


--
Bowie


Re: upgrad spamassassin

2009-05-27 Thread hateSpam

Thanks for the reply. I ran the (yum list installed 'spamassassin*') command
and I got:

Loading installonlyn plugin
Installed Packages
spamassassin.x86_64  3.1.9-1.el5  installed


Does this mean it was installed via rpm?

Thanks
hateSpam





Re: upgrad spamassassin

2009-05-27 Thread Bowie Bailey

hateSpam wrote:

Thanks for reply i used (yum list installed 'spamassassin*') command and I
got 


Loading installonlyn plugin
Installed Packages
spamassassin.x86_64  3.1.9-1.el5  installed



Does it mean it has installed via rpm?

  


Exactly.

It should update along with everything else when you run 'yum update'.  
(You should be doing this on a regular basis anyway to get OS security 
updates.)


--
Bowie


Re: my AWL messed up?

2009-05-27 Thread Bowie Bailey

Linda Walsh wrote:


I got a really poorly scored piece of spam -- one thing that stood out
as weird was report claimed the sender was in my AWL.


Any sender who has sent mail to you previously will be in your AWL.  
This is probably the most misunderstood component of SA.  Read the wiki.


http://wiki.apache.org/spamassassin/AutoWhitelist

--
Bowie


AWL functionality messed up?

2009-05-27 Thread Linda Walsh

Bowie Bailey wrote:

Linda Walsh wrote:


I got a really poorly scored piece of spam -- one thing that stood out
as weird was report claimed the sender was in my AWL.


Any sender who has sent mail to you previously will be in your AWL.  
This is probably the most misunderstood component of SA.  Read the wiki.


http://wiki.apache.org/spamassassin/AutoWhitelist




At face value, this seems very counterproductive.

If I get spam from 1000 senders, they all end up in my
AWL???

WTF?

AWL should only be added to by emails judged to be 'ham' via
the feedback mechanisms -- spammers shouldn't get bonuses for
being repeat senders...

How do I delete spammer addresses from my 'auto-white-list'?

(That's just insane..whitelisting spammers?!?!)




83MB auto-whitelist?

2009-05-27 Thread Henry Kwan
Just noticed that my AWL is up to 83MB.  Not sure if it should be that large so
I ran check_whitelist and it removed the single entries but did not compact the
file.  I then checked the SA site and it said to use sa-awlUtil but I can't find
this utility on my system.  Was it included in the standard 3.2.5 tarball?

Thanks.




Re: my AWL messed up?

2009-05-27 Thread Linda Walsh


Bowie Bailey wrote:

Linda Walsh wrote:


I got a really poorly scored piece of spam -- one thing that stood out
as weird was report claimed the sender was in my AWL.


Any sender who has sent mail to you previously will be in your AWL.  
This is probably the most misunderstood component of SA.  Read the wiki.


http://wiki.apache.org/spamassassin/AutoWhitelist


---
To be clear about what is being white listed, would it
hurt if the 'brief report for the AWL', instead of :
-1.3 AWL  AWL: From: address is in the auto white-list

it had
-1.3 AWL  AWL: 'From: 518501.com' addr is in auto white-list

So I can see what domain it is flagging with a 'white' value?

I don't know of any emails from '518501.com' that wouldn't have
been classified spam, so none should have a 'negative value'.



Re: AWL functionality messed up?

2009-05-27 Thread Bowie Bailey

Linda Walsh wrote:

Bowie Bailey wrote:

Linda Walsh wrote:


I got a really poorly scored piece of spam -- one thing that stood out
as weird was report claimed the sender was in my AWL.


Any sender who has sent mail to you previously will be in your AWL.  
This is probably the most misunderstood component of SA.  Read the wiki.


http://wiki.apache.org/spamassassin/AutoWhitelist




At face value, this seems very counter productive.

If I get spam from 1000 senders, they all end up in my
AWL???

WTF?

AWL should only be added to by emails judged to be 'ham' via
the feed back mechanisms --, spammers shouldn't get bonuses for
being repeat senders...

How do I delete spammer addresses from my 'auto-white-list'?

(That's just insane..whitelisting spammers?!?!)


Did you read the wiki link that I gave you???

Despite its name, this is NOT a simple whitelist.  It is a score 
averaging system.  It will attempt to adjust a sender's score based on 
their past history.  So when a friend who normally sends low-scoring 
emails forwards you something that matches a bunch of spam rules, this 
will push the score back down towards his previous average.  Similarly, 
when a spammer sends something that doesn't match many rules, the score 
gets pushed back up towards his previous average.


Spammers don't get bonuses for being repeat senders, they get 
penalized.  Take another look:


http://wiki.apache.org/spamassassin/AutoWhitelist

and also:

http://wiki.apache.org/spamassassin/AwlWrongWay

--
Bowie


Re: my AWL messed up?

2009-05-27 Thread Bowie Bailey

Linda Walsh wrote:


Bowie Bailey wrote:

Linda Walsh wrote:


I got a really poorly scored piece of spam -- one thing that stood out
as weird was report claimed the sender was in my AWL.


Any sender who has sent mail to you previously will be in your AWL.  
This is probably the most misunderstood component of SA.  Read the wiki.


http://wiki.apache.org/spamassassin/AutoWhitelist


---
To be clear about what is being white listed, would it
hurt if the 'brief report for the AWL', instead of :
-1.3 AWL  AWL: From: address is in the auto white-list

it had
-1.3 AWL  AWL: 'From: 518501.com' addr is in auto white-list

So I can see what domain it is flagging with a 'white' value?

I don't know of any emails from '518501.com' that wouldn't have
been classified spam, so none should have a 'negative value'.


If the AWL is assigning a -1.3 score, that means that the previous 
messages from this sender averaged 2.6 points lower than this email.  
Exactly how that works out depends on what other rules hit on this 
message.  Was the -1.3 score enough to prevent this message from being 
marked as spam?  It is normal for the AWL scores to be either positive 
or negative for any mail (ham or spam), but they should not be high 
enough to change the spam determination of the message unless it is 
significantly different from the sender's past messages.
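
To make the arithmetic above concrete, here is a minimal sketch. It assumes the AWL moves the score halfway towards the sender's historical mean (a factor of 0.5, which is what the 2.6 versus 1.3 relationship above implies); the function name and the example scores are invented for illustration.

```python
def awl_adjustment(current_score: float, historical_mean: float,
                   factor: float = 0.5) -> float:
    """Points the AWL would add to (or subtract from) the current score.

    Assumption: the adjustment is a fixed fraction (factor) of the
    distance between this message's score and the sender's past mean.
    """
    return (historical_mean - current_score) * factor


if __name__ == "__main__":
    # Hypothetical numbers: past mail averaged 2.6 points lower than this one.
    before_awl = 6.0
    past_mean = 3.4
    delta = awl_adjustment(before_awl, past_mean)
    print(round(delta, 1))               # AWL contribution
    print(round(before_awl + delta, 1))  # score after the AWL is applied
```

The same formula pushes a low-scoring message from a sender with a high average upwards, which is the case where a repeat spammer is penalized.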


--
Bowie


Re: AWL functionality messed up?

2009-05-27 Thread Jeff Mincy
   From: Linda Walsh sa-u...@tlinx.org
   Date: Wed, 27 May 2009 12:48:43 -0700
   
   Bowie Bailey wrote:
Linda Walsh wrote:
   
I got a really poorly scored piece of spam -- one thing that stood out
as weird was report claimed the sender was in my AWL.

Any sender who has sent mail to you previously will be in your AWL.  
This is probably the most misunderstood component of SA.  Read the wiki.

http://wiki.apache.org/spamassassin/AutoWhitelist
   
   
   At face value, this seems very counter productive.
   
You still aren't understanding the wiki or the AWL scoring or what AWL
is trying to do.

   If I get spam from 1000 senders, they all end up in my
   AWL???
   
Yes.  Every email+IP address pair that sends you email winds up in
your AWL with an average score for that pair.  This is OK.

   WTF?
   
   AWL should only be added to by emails judged to be 'ham' via
   the feed back mechanisms --, spammers shouldn't get bonuses for
   being repeat senders...
   
You are getting too attached to the 'whitelist' part of the name.
Pretend AWL stands for average weighting list.

   How do I delete spammer addresses from my 'auto-white-list'?
   
   (That's just insane..whitelisting spammers?!?!)

AWL isn't whitelisting spammers.   It is pushing the score to the
average for that sender.   The sender can have a high average or a low
average.   

If the previous email from a particular sender was FP or FN then AWL
will have an incorrect average and will wind up doing or trying to do
the wrong thing with subsequent email for that sender.

You can remove addresses using spamassassin --remove-from-whitelist

-jeff


Re: 83MB auto-whitelist?

2009-05-27 Thread LuKreme

On 27-May-2009, at 13:49, Henry Kwan wrote:
Just noticed that my AWL is up to 83MB.  Not sure if it should be  
that large so
I ran check_whitelist and it removed the single entries but did not  
compact the
file.  I then checked the SA site and it said to use sa-awlUtil but  
I can't find
this utility on my system.  Was it included in the standard 3.25  
tarball?


Doesn't appear to be.  It's not part of my 3.2.5 install (in fact, it  
doesn't appear anywhere on my server). It also does not appear in the  
ports tree as a separate program.  Where do you get it?  Got me; I  
searched several ways on Google and only found references to other  
people asking where it was.


The page in question does list, under ToDo, "upload sa-awlUtil".


--
...I started playing Myst at 4:30 in the afternoon and looked up
suddenly and realized it was February.



Re: AWL functionality messed up?

2009-05-27 Thread LuKreme

On 27-May-2009, at 13:48, Linda Walsh wrote:

Bowie Bailey wrote:

Linda Walsh wrote:


I got a really poorly scored piece of spam -- one thing that stood out
as weird was report claimed the sender was in my AWL.

Any sender who has sent mail to you previously will be in your
AWL.  This is probably the most misunderstood component of SA.
Read the wiki.

http://wiki.apache.org/spamassassin/AutoWhitelist



At face value, this seems very counter productive.


At face value, you still haven't read the docs, have you?


If I get spam from 1000 senders, they all end up in my
AWL???


Yep.


WTF?


Read the docs.


AWL should only be added to by emails judged to be 'ham' via


No, you are confused. This is common, lots of people are confused  
about this. This is why many people think the name needs to be changed  
to Averaged Weight List or something similar.



the feed back mechanisms --, spammers shouldn't get bonuses for
being repeat senders...


That's not how the AWL works.  In fact, spammers get MORE points for  
being repeat senders.



How do I delete spammer addresses from my 'auto-white-list'?


That's a very bad idea.

--
++?++ Out of Cheese Error. Redo From Start.



Re: 83MB auto-whitelist?

2009-05-27 Thread Karsten Bräckelmann
On Wed, 2009-05-27 at 19:49 +, Henry Kwan wrote:
 Just noticed that my AWL is up to 83MB.  Not sure if it should be that large 
 so

Well, it keeps track of all sender addresses it sees, including the
forged ones which usually hit your site once only...

 I ran check_whitelist and it removed the single entries but did not compact 
 the
 file.  I then checked the SA site and it said to use sa-awlUtil but I can't 
 find
 this utility on my system.  Was it included in the standard 3.25 tarball?

I use Kris Deugau's trim_whitelist -- though very infrequently, just
when I spot the DB getting out of hand. That hack seems to work perfectly
for me.

For more info and the script's link see
  http://markmail.org/message/qqsm35q5bqpbb3in

Also see the first link to a list archive in that file.


-- 
char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4;
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1:
(c=*++x); c128  (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: New spamassassin OCR plugin

2009-05-27 Thread decoder

alex k wrote:


If only FuzzyOCR's developer would read that ;)
Unfortunately he doesn't seem to be interested in his project anymore.
Maybe you could take care of this orphaned code.

  


Dear Alex,


I am reading exactly everything you write ;)


The code is not orphaned, but it is also not being extended at the moment. The 
SVN version runs stable in all SA 3.2.x releases. I answer tickets 
and questions via email.



I am planning a new release, but my schedule is tight.


Best regards,


Chris




Re: New spamassassin OCR plugin

2009-05-27 Thread decoder

LuKreme wrote:

On 24-May-2009, at 18:40, Henrik K wrote:
I don't know why users are so afraid of words like SVN. You have to 
look at the project, not version numbers.



I don't have FuzzyOCR installed, and it's not because of the SVN. 
First, I don't think my server can take the processing hit and second 
it requires so much to be installed that I'm SURE my server can't take 
the hit.




May I ask how many mails you process per day? Please note that

a) FuzzyOcr runs last if properly installed
b) it doesn't do anything if the score exceeds a configurable threshold
c) it supports hashes and other things that make processing faster


Cheers,


Chris




RBL triggered?

2009-05-27 Thread Charles Gregory

Hello!

Quick question: Does Spamassassin's RCVD tests also check headers
labelled X-Originating-IP?

In particular, I received the message below from Hotmail with hits 
on RCVD_IN_BL_SPAMCOP_NET and RCVD_IN_SORBS_WEB. Neither of the
Hotmail IPs is found in *any* RBL listed at mailabuse.org's multi-check.
The X-Originating-IP shows up in the SORBS RBL but not the SpamCop one.
Is this a case where Hotmail got an FP corrected in 12 hours? Or is there 
something else going on to trigger these tests?
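
For anyone who wants to re-check an address by hand: DNS blocklists are queried by reversing the IPv4 octets and appending the list's zone. The sketch below only builds that query name and performs no lookup; `bl.spamcop.net` is assumed here to be the zone behind RCVD_IN_BL_SPAMCOP_NET, and the function name is made up.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    # The X-Originating-IP and the Hotmail relay from the headers below.
    for ip in ("66.110.6.119", "65.55.34.91"):
        print(dnsbl_query_name(ip, "bl.spamcop.net"))
```

An A record in the answer means the address is listed; NXDOMAIN means it is not.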


Return-Path: __...@sympatico.ca
Received: by barton.hwcn.org (Postfix, from userid 110)
id A4B4EF3EF8; Tue, 26 May 2009 17:04:28 -0400 (EDT)
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on barton.hwcn.org
X-Spam-Level: *
X-Spam-Status: No, hits=5.6 required=10.0 autolearn=disabled
tests=HTML_MESSAGE=0.001,RCVD_IN_BL_SPAMCOP_NET=4.5,RCVD_IN_SORBS_WEB=1.117
Received: from col0-omc2-s17.col0.hotmail.com (col0-omc2-s17.col0.hotmail.com
[65.55.34.91])
by barton.hwcn.org with SMTP id 2nv8k5uzsjhw4rtthhp9guzsha;
for off...@hwcn.org;
Tue, 26 May 2009 17:04:21 -0400 (EDT)
(envelope-from culs...@sympatico.ca)
Received-SPF: Pass; receiver=barton.hwcn.org; client-ip=65.55.34.91;
envelope-from=culs...@sympatico.ca; helo=col0-omc2-s17.col0.hotmail.com;
mechanism=include:hotmail.com (include:spf-a.hotmail.com (ip4:65.52.0.0/14
- pass) - pass)
X-Avenger: version=0.7.9; receiver=barton.hwcn.org; client-ip=65.55.34.91;
client-port=25067; syn-fingerprint=65535:112:1:48:M1460,N,N,S Windows 2000
SP4, XP SP1; data-bytes=0; network-path=208.65.246.17 208.72.120.5
74.205.221.2 38.104.159.125 38.20.41.73 154.54.28.33 154.54.27.165
154.54.7.30 207.46.33.29 154.54.27.206 207.46.33.29 207.46.43.153
207.46.43.153 10.22.12.134 10.22.12.134 207.46.41.209;
network-path-time=1243371861
Received: from COL104-W8 ([65.55.34.72]) by col0-omc2-s17.col0.hotmail.com with
Microsoft SMTPSVC(6.0.3790.3959);
 Tue, 26 May 2009 14:04:38 -0700
Message-ID: col104-w8d40e0023b93e83b4ffcbc6...@phx.gbl
Content-Type: multipart/alternative;
boundary=_be3ff754-56a4-49ca-a500-6d9290a4f246_
X-Originating-IP: [66.110.6.119]
From: ___...@sympatico.ca
To: off...@hwcn.org
Date: Tue, 26 May 2009 21:04:38 +
Importance: Normal
MIME-Version: 1.0
X-OriginalArrivalTime: 26 May 2009 21:04:39.0020 (UTC)
FILETIME=[94F89AC0:01C9DE45]
Subject: DSL rates

(body snipped)

--


Re: generate message with a specific score

2009-05-27 Thread Rudy Gevaert
Hi Matus,

On Mon, May 25, 2009 at 10:48:25PM +0200, Matus UHLAR - fantomas wrote:
 On 25.05.09 17:12, Rudy Gevaert wrote:
  Is it possible to generate a rule that when it applies gives the message 
  that specific score? If so, how do I do it?
 
 every rule gives a specific score when it applies... What do you mean?
 do you need the whole message to have specific score? If so, why?

To test the different levels in Amavis.  Amavis uses the score provided
by SA, so I need to set the score to a specific number to test it...
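
One possible way to get there, sketched in Python: generate a few local rules that each fire on a marker in the Subject and carry a fixed score. The rule names, the `SATEST-nnn` markers and the helper `build_score_rules` are made up for this example. Only the `header`, `describe` and `score` directives are standard SpamAssassin configuration.

```python
def build_score_rules(scores, prefix="LOCAL_TEST_SCORE"):
    """Build SpamAssassin rule lines, one Subject-matching rule per score."""
    lines = []
    for i, score in enumerate(scores, start=1):
        name = f"{prefix}_{i}"
        token = f"SATEST-{i:03d}"  # marker to put in the Subject of a test mail
        lines.append(f"header   {name} Subject =~ /\\b{token}\\b/")
        lines.append(f"describe {name} Test rule for checking Amavis levels")
        lines.append(f"score    {name} {score:.1f}")
    return "\n".join(lines)

# Paste the output into a local .cf file on a test system.
print(build_score_rules([2.0, 6.5, 15.0]))
```

Note that such a rule only adds its score to whatever else hits, so the test mail should be otherwise clean if the total is meant to land on a specific value.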

-- 
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Rudy Gevaert  rudy.geva...@ugent.be  tel:+32 9 264 4734
Directie ICT, afd. Infrastructuur  Direction ICT, Infrastructure dept.
Groep Systemen Systems group
Universiteit Gent  Ghent University
Krijgslaan 281, gebouw S9, 9000 Gent, Belgie   www.UGent.be
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 


Re: problems with TVD_SPACE_RATIO

2009-05-27 Thread mouss
Karsten Bräckelmann a écrit :
 On Wed, 2009-05-27 at 09:21 +0200, Michael Monnerie wrote:
 On Mittwoch 27 Mai 2009 mouss wrote:
 and 4454 is a one line message, but the signature causes the hit.
 
 The fact that mailing-list footer is forced onto the message with no
 newline causes it. And the second hardly counts as human generated. ;)
 

true, the guy replied without adding any personal comments. but these
are things that happen:

- I have a problem with foobar
- what does /sbin/joe show
- $joe_output

now, the question is: what should really be caught?

If I can suggest anything, I would like to propose the following:
when a rule is designed:
- document what it should catch
- give examples of things it catches and things it shouldn't catch (so
that if someone modifies the rule, he has some hints on what he can do)


of course, if there's work to do, count me in (well, subject to my
availability...).

 And my messages are just one-liners without .sig that should never hit 
 this rule at all.
 
 Checked those samples from both of you. Lots more analysis of this eval
 function added to the bug report.
 
 See comment 12. Smells kinda fishy to me, and probably broke at some
 point since its original introduction. :/
 
 
 I don't have other examples in original format, but just a few days ago 
 got a FP report where this rule hit a normal, german, human-typed mail.
 I'll restore the original score now to see if I get more reports.
 
 Hmm, I'd love to see that one. Any *human-typed* mail featuring a real
 sentence should not trigger this. Unless it's followed directly by a
 huge machine-generated paste or something, without an empty line...
 


I'm not sure. but I'll have to dig in my mail before I can see anything
real.



FuzzyOcr 3.6.0 released

2009-05-27 Thread decoder

Hello all,


after quite some time, I've decided to release another version of 
FuzzyOcr. This version is only a tag from SVN revision 135 (+ a patch 
provided recently which fixes something in one of the sql utilities) 
that has been used quite some time with SA 3.2.x and is included in some 
major distributions already. If you are using FuzzyOcr from SVN (rev 
135), then there is no need for you to upgrade.


Since image spam seems on the rise again, lots of people have contacted 
me in the last 2 months, and I have been asked many times to release 
another tarball... So I hope someone will find it useful. No new 
features are added in this release, as I decided to first tag the 
version that is working without known problems for those that seemed to 
have a problem with checking out the version from SVN. The major version 
number increase is due to the fact that it breaks compatibility with SA 
3.1.x and now requires SA 3.2.x.


See http://fuzzyocr.own-hero.net/wiki/Downloads for more details.

Although I still can't invest that much time into the project at this 
point, there are some features I'd like to add though in the near 
future, such as regex support. I also considered rewriting the scoring 
engine because some people share the opinion that it is too sensitive 
(as opposed to others who consider it to be good).




Best regards,



Chris




Re: RBL triggered?

2009-05-27 Thread mouss
Charles Gregory a écrit :
 Hello!
 
 Quick question: Does Spamassassin's RCVD tests also check headers
 labelled X-Originating-IP?

yes.

 
 In particular, I received the below message from hotmail with hits on
 RCVD_IN_BL_SPAMCOP_NET and RCVD_IN_SORBS_WEB. Neither of the
 hotmail IP's is found in *any* RBL listed at mailabuse.org's multi-check.
 The X-originating-IP shows up in the sorbs RBL but not the spamcop one.
 Is this a case where hotmail got a FP corrected in 12 hours? Or is there
 something else going on to trigger these tests?
 

66.110.6.119 is listed in CBL, SORBS, BRBL (Barracuda), ...
so this IP is owned (compromised) or whatever, and in any case, it sends
spam. thus any mail that was sent from or via this IP is suspicious and
deserves some points.
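
For anyone who wants to check such an address by hand, here is a minimal Python sketch of how a DNSBL lookup name is formed: the IPv4 octets are reversed and the list's zone is appended. The zone names used below (`bl.spamcop.net`, `dnsbl.sorbs.net`) are my assumption about which zones sit behind the two rules, and `is_listed` needs working DNS to return anything useful.

```python
import ipaddress
import socket

def dnsbl_query_name(ip: str, zone: str) -> str:
    """Reverse the IPv4 octets and append the DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def is_listed(ip: str, zone: str) -> bool:
    """True if the zone answers with an A record for the reversed address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:  # lookup failed: not listed, or DNS not reachable
        return False

for zone in ("bl.spamcop.net", "dnsbl.sorbs.net"):
    # Only builds the names; call is_listed() to actually query.
    print(dnsbl_query_name("66.110.6.119", zone))
```

A listing can come and go within hours, so a lookup done the next day may well disagree with what was seen at scan time.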


 Return-Path: __...@sympatico.ca
 Received: by barton.hwcn.org (Postfix, from userid 110)
 id A4B4EF3EF8; Tue, 26 May 2009 17:04:28 -0400 (EDT)
 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on barton.hwcn.org
 X-Spam-Level: *
 X-Spam-Status: No, hits=5.6 required=10.0 autolearn=disabled

 tests=HTML_MESSAGE=0.001,RCVD_IN_BL_SPAMCOP_NET=4.5,RCVD_IN_SORBS_WEB=1.117
 Received: from col0-omc2-s17.col0.hotmail.com
 (col0-omc2-s17.col0.hotmail.com
 [65.55.34.91])
 by barton.hwcn.org with SMTP id 2nv8k5uzsjhw4rtthhp9guzsha;
 for off...@hwcn.org;
 Tue, 26 May 2009 17:04:21 -0400 (EDT)
 (envelope-from culs...@sympatico.ca)
 Received-SPF: Pass; receiver=barton.hwcn.org; client-ip=65.55.34.91;
 envelope-from=culs...@sympatico.ca;
 helo=col0-omc2-s17.col0.hotmail.com;
 mechanism=include:hotmail.com (include:spf-a.hotmail.com
 (ip4:65.52.0.0/14
 - pass) - pass)
 X-Avenger: version=0.7.9; receiver=barton.hwcn.org; client-ip=65.55.34.91;
 client-port=25067; syn-fingerprint=65535:112:1:48:M1460,N,N,S
 Windows 2000
 SP4, XP SP1; data-bytes=0; network-path=208.65.246.17 208.72.120.5
 74.205.221.2 38.104.159.125 38.20.41.73 154.54.28.33 154.54.27.165
 154.54.7.30 207.46.33.29 154.54.27.206 207.46.33.29 207.46.43.153
 207.46.43.153 10.22.12.134 10.22.12.134 207.46.41.209;
 network-path-time=1243371861
 Received: from COL104-W8 ([65.55.34.72]) by
 col0-omc2-s17.col0.hotmail.com with
 Microsoft SMTPSVC(6.0.3790.3959);
  Tue, 26 May 2009 14:04:38 -0700
 Message-ID: col104-w8d40e0023b93e83b4ffcbc6...@phx.gbl
 Content-Type: multipart/alternative;
 boundary=_be3ff754-56a4-49ca-a500-6d9290a4f246_
 X-Originating-IP: [66.110.6.119]
 From: ___...@sympatico.ca
 To: off...@hwcn.org
 Date: Tue, 26 May 2009 21:04:38 +
 Importance: Normal
 MIME-Version: 1.0
 X-OriginalArrivalTime: 26 May 2009 21:04:39.0020 (UTC)
 FILETIME=[94F89AC0:01C9DE45]
 Subject: DSL rates
 
 (body snipped)
 
 -- 



Re: FuzzyOcr 3.6.0 released

2009-05-27 Thread René Berber
decoder wrote:

[snip]
 See http://fuzzyocr.own-hero.net/wiki/Downloads for more details.
 
 Although I still can't invest that much time into the project at this
 point, there are some features I'd like to add though in the near
 future, such as regex support. I also considered rewriting the scoring
 engine because some people share the opinion that it is too sensitive
 (as opposed to others who consider it to be good).

You should mention that scores and sensitivity are configurable (even
individually for each word / term; I use low scores for short words, for
instance).  And all is documented on the fuzzy perl library used.

Thanks for a great SA plugin.
-- 
René Berber



Re: AWL functionality messed up?

2009-05-27 Thread Linda Walsh

Jeff Mincy wrote:

   From: Linda Walsh sa-u...@tlinx.org
   Date: Wed, 27 May 2009 12:48:43 -0700
   
   Bowie Bailey wrote:  

   At face value, this seems very counter productive.
   
You still aren't understanding the wiki or the AWL scoring or what AWL
is trying to do.


Ah, but it only seems I'm daft, today...:-)


   If I get spam from 1000 senders, they all end up in my
   AWL???
   
yes.   every email+ip address pair that sends you email winds up in
your AWL with an average score for that pair.  This is ok.


GRRR...not so ok in my mindset, but ... and ... errr..
well that only makes it more confusing, in a way...since I was
only 99% certain that I'd never gotten any HAM from hostname
'518501.com' (thinking for a short period that AWL might be classifying
things by hosts as reliable or not, instead of, or in addition to
by email-addr), but I'm 99.97% certain I've never gotten any HAM
from user 'paypal.notify' (at) hostname '5185



   AWL should only be added to by emails judged to be 'ham' via
   the feed back mechanisms --, spammers shouldn't get bonuses for
   being repeat senders...
   
You are getting too attached to the 'whitelist' part of the name.

Pretend AWL stands for average weighting list.

=
Aw...come on.  Isn't the world difficult enough without
changing white to black or white to weighing?  I mean, we humans
have enough trouble agreeing on what our symbols, words mean in
relation to concepts and all without ya goin' and redefining perfectly
good acceptable symbols to mean something else completely and still
claim it to be some semblance of English.   No wonder most of the
non-techno-literate humans on this world regard us techies with
a hint of suspicion regarding the difficulty of problems.  We go around
redefining words to suit reality and catch the heat when the rest of
the world doesn't understand our meaning:

Pointy-Haired Boss: Well, how long did you say it would take?

Geek: Well, I said it was 3-4 weeks worth of work.

PHB: Then why has it been 6 weeks with no product? I told you
  anything over 4 weeks was unacceptable!

G: 6 weeks, but...to get under 4 weeks, I assumed you were talking
168-hour pure-programming time weeks -- not CALENDAR weeks!



AWL isn't whitelisting spammers.   It is pushing the score to the
average for that sender.   The sender can have a high average or a low
average.   

---
An average?  So it keeps the scores of all the past emails of every email we
ever got sent?  Must just store a weighted average -- otherwise
the space (hmm...someone said something about 80MB+ auto-whitelist DB
files?)

Why not call it the Historically Based Score Normalizer or
HBSN module?  Db file could be historical-norms or something.



If the previous email from a particular sender was FP or FN then AWL
will have an incorrect average and will wind up doing or trying to do
the wrong thing with subsequent email for that sender.


Maybe it shouldn't add in the 'average' unless it exceeds
the 'auto-learning threshold'??  I.e. something like the
'bayes_auto_learn_threshold_nonspam' for HAM and the
'bayes_auto_learn_threshold_spam' for SPAM.  Assuming it doesn't
already do such a thing, it would make a little sense...so as
not to train it on 'bad data'...

When I run sa-learn --spam email over a message, can I
assume (or is it the case) that telling SA, a message was 'spam'
would assign a sufficiently large value to the 'HBSN' value for that
sender to reduce any effect of having falsely (if it is likely to happen)
incorrect value?

Or might I at least assume that each sa-learn over a message
will modify its AWL score appropriately?



You can remove addresses using spamassassin --remove-from-whitelist


Yes...saw that after visiting the wiki.  Is there a
--show-whitelist-with-current-scores-and-their-weight switch as well
(as opposed to one that only showed the addr's in the white list, or only
showed the non-weighted scores)?


Thanks...and um...
How difficult would it be to have the name of the module reflect
what it's actually doing?  maybe roll out a name change with the next
.dot release of SA?  (3.3? 3.4?)  Might alleviate some amount of
confusion(?)...

Does the AWL also keep track of when it last saw an 'email' addr
so it can 'expire' the oldest entries so the db doesn't grow to eventually
consume all forms of matter and energy in the universe?  :-)

Thanks for the clarification and info!!

-linda


Re: FuzzyOcr 3.6.0 released

2009-05-27 Thread RW
On Wed, 27 May 2009 19:10:57 -0500
René Berber r.ber...@computer.org wrote:

 You should mention that scores and sensitivity are configurable (even
 individually for each word / term; I use low scores for short words,
 for instance).  And all is documented on the fuzzy perl library used.

AFAIK though it isn't possible to place a cap on the FuzzyOCR score. I
don't want to, but I detune it purely to reduce the likelihood of
something hitting my discard threshold by OCR alone.



Re: FuzzyOcr 3.6.0 released

2009-05-27 Thread René Berber
RW wrote:

 AFAIK though it isn't possible to place a cap on the FuzzyOCR score. I
 don't want to, but I detune it purely to reduce the likelihood of
 something hitting my discard threshold by OCR alone.

Isn't that done by setting focr_add_score to 0.0?  The total score in
this case should always be focr_base_score, unless one of the other rules
gets a hit (wrong content-type, wrong file extension, etc.)

I haven't seen the problem of hitting discard threshold by OCR alone,
FPs are simple to isolate by using the image size thresholds or other
means (usually they are specific cases that can be white listed by
recipient or sender).
-- 
René Berber



Re: AWL functionality messed up?

2009-05-27 Thread Spiro Harvey
Linda Walsh sa-u...@tlinx.org wrote:
 We go
 around redefining words to suit reality and catch the heat when the
 rest of the world doesn't understand our meaning:

Please repeat after me:

AWL is not an auto whitelist
AWL is not an auto whitelist
AWL is not an auto whitelist

It's one of those funny jokes, like GNU. Feel free to click your heels
together while repeating this affirmation, just whatever you do, DON'T
say it in front of a mirror. Seriously, there's a crater somewhere in
Mexico where a data warehouse used to sit the last time someone tried
that.

   An average?  So it keeps the scores of all the past emails of
 every email we ever got sent?  Must just store a weighted average --
 otherwise the space (hmm...someone said something about 80MB+
 auto-whitelist DB files?)

Time to upgrade those 80MB drives, huh?

   How difficult would it be to have the name of the module
 reflect what it's actually doing?  maybe roll out a name change with
 the next .dot release of SA?  (3.3? 3.4?)  Might alleviate some
 amount of confusion(?)...

Why? It's not broken. Just pretend it stands for Averaged Weight List,
and then you'll be able to sleep at night.

Oh, and there's no need to reply to all. You're on a mailing list, so
anybody who sent you a message from it is already on the list,
and will get your replies.

-- 
Top-posting is the computer equivalent of mailing a letter glued
to the *outside* of an envelope, with a stamp attached via paper clip.
-- Xcott Craver




Re: FuzzyOcr 3.6.0 released

2009-05-27 Thread RW
On Wed, 27 May 2009 21:19:58 -0500
René Berber r.ber...@computer.org wrote:

 RW wrote:
 
  AFAIK though it isn't possible to place a cap on the FuzzyOCR
  score. I don't want to, but I detune it purely to reduce the
  likelihood of something hitting my discard threshold by OCR alone.
 
 Isn't that done by setting focr_add_score to 0.0?  The total score in
 this case should always be focr_base_score, unless one of the other rules
 gets a hit (wrong content-type, wrong file extension, etc.)


No, I want the score to increase for each extra word, I just don't want
it to rise to a huge score like 25 where it might go over a second
threshold for outright discarding. 

e.g. for a threshold of 5 and a discard level of 20 you might have 

focr_base_score 4.5
focr_add_score  0.5
focr_max_score  14
focr_autodisable_score 20 
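
To make the intent of those numbers concrete, here is a small Python sketch of the capped scoring I have in mind. It assumes the score is the base for the first matched word plus the add score for each further word, which simplifies how the plugin really counts matches, and `focr_max_score` (the `cap` argument here) is the wished-for setting, not an existing one.

```python
def ocr_score(matched_words, base=4.5, add=0.5, cap=14.0):
    """Base score for the first match, `add` per extra match, limited to `cap`."""
    if matched_words <= 0:
        return 0.0
    return min(base + add * (matched_words - 1), cap)

for n in (1, 5, 20, 42):
    uncapped = 4.5 + 0.5 * (n - 1)
    print(f"{n:2d} words: uncapped {uncapped:4.1f}, capped {ocr_score(n):4.1f}")
```

With these values 42 matched words would give 25 uncapped but stay at 14 with the cap, so OCR alone can tag a message as spam but never push it over the discard level of 20.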





Re: AWL functionality messed up?

2009-05-27 Thread Benny Pedersen

On Wed, May 27, 2009 21:48, Linda Walsh wrote:
 http://wiki.apache.org/spamassassin/AutoWhitelist
 At face value, this seems very counter productive.

read the docs one more time

 If I get spam from 1000 senders, they all end up in my
 AWL???

yes

 WTF?

not here please

 AWL should only be added to by emails judged to be 'ham' via
 the feed back mechanisms --, spammers shouldn't get bonuses for
 being repeat senders...

they don't either, AWL also tracks the sender ip

but i agree it's silly doing it with a fuzz of /16

 How do I delete spammer addresses from my 'auto-white-list'?

perldoc Mail::SpamAssassin::Conf

 (That's just insane..whitelisting spammers?!?!)

its NOT a whitelist

-- 
http://localhost/ 100% uptime and 100% mirrored :)



Re: New spamassassin OCR plugin

2009-05-27 Thread Benny Pedersen

On Wed, May 27, 2009 23:43, decoder wrote:
 I am planning a new release, but my schedule is tough.

super, i posted a new thread with subject FuzzyOcr wordlist

new words to be added for latest spams

-- 
http://localhost/ 100% uptime and 100% mirrored :)



Re: AWL functionality messed up?

2009-05-27 Thread Matt Kettler
Linda Walsh wrote:
 Bowie Bailey wrote:
 Linda Walsh wrote:

 I got a really poorly scored piece of spam -- one thing that stood out
 as weird was report claimed the sender was in my AWL.

 Any sender who has sent mail to you previously will be in your AWL. 
 This is probably the most misunderstood component of SA.  Read the wiki.

 http://wiki.apache.org/spamassassin/AutoWhitelist


 
 At face value, this seems very counter productive.
It's obvious you're taking it at face value and you've not read the
URL above.

You're seeing whitelist in the name, and believing it. Sorry the name
is misleading, but the AWL is not a whitelist.

 If I get spam from 1000 senders, they all end up in my
 AWL???

 WTF?
You're leaping to wildly incorrect conclusions, mostly because you're
assuming the AWL is a whitelist. It's not.

*READ* the URL above. No, really READ IT. You don't understand the AWL yet.

 AWL should only be added to by emails judged to be 'ham' via
 the feed back mechanisms --, spammers shouldn't get bonuses for
 being repeat senders...
Who says they get bonuses just for being a repeat sender?? They get
bonuses or penalties, all depending.

The AWL isn't a whitelist Linda. It's an averager. It can whitelist or
blacklist messages. If they send a message that scores less than their
previous average, they get a positive AWL score (blacklisting). If they
send one that's higher they get a negative score (whitelisting).

HOWEVER, in the AWL, a simple look at the positive or negative sign on
the score doesn't really tell you much.

Take this example: Pre-AWL score +12, AWL -3, Final score +9. What
did the AWL think of this sender based on history? +6, spammer.

If the same sender instead sent: Pre-AWL score +4 the AWL would hit at
+1.0 resulting in Final score +5.0.

End result: same sender, different messages, different signs on the AWL,
but both are still tagged as spam. And in one example, a false negative
was avoided based on their history.
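
The arithmetic behind this can be sketched in a few lines of Python (SpamAssassin itself is Perl, this is only an illustration). It assumes the adjustment is the distance from the message score to the sender's historical mean, multiplied by auto_whitelist_factor, with the factor at what I believe is its default of 0.5. The function name is made up.

```python
def awl_adjustment(pre_awl_score, historical_mean, factor=0.5):
    """Points the AWL adds: pulls the score part of the way toward the mean."""
    return (historical_mean - pre_awl_score) * factor

# Same sender (historical mean +6), two different messages.
for pre in (10.0, 4.0):
    adj = awl_adjustment(pre, historical_mean=6.0)
    print(f"pre-AWL {pre:+.1f}  AWL {adj:+.1f}  final {pre + adj:+.1f}")
```

The message scoring above the mean gets a negative AWL and the one scoring below it gets a positive AWL, yet both finals end up between the message's own score and the sender's history.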


 How do I delete spammer addresses from my 'auto-white-list'?
spamassassin --remove-addr-from-whitelist=...@example.com


 (That's just insane..whitelisting spammers?!?!)
No, it's insane to have the AWL named AWL, because it's not a white list.

It's really a "history-based score averaging system with automatic
whitelisting and blacklisting effects". However, AHBSASWAWB is an awfully
long name.

I *REALLY* suggest you read up on how the AWL works, for real, before
jumping to conclusions about what it is, and what it does. It really
doesn't work the way you think.








Re: my AWL messed up?

2009-05-27 Thread Matt Kettler
Linda Walsh wrote:
 To be clear about what is being white listed, would it
 hurt if the 'brief report for the AWL', instead of :
 -1.3 AWL    AWL: From: address is in the auto white-list

 it had
 -1.3 AWL    AWL: 'From: 518501.com' addr is in auto white-list

 So I can see what domain it is flagging with a 'white' value?

 I don't know of any emails from '518501.com' that wouldn't have
 been classified spam, so none should have a 'negative value'.


What was the final message score in this example? Looking at the AWL
score alone is meaningless, and doesn't show what the AWL thinks the
historical average is.

If the final score was over 6.3, the AWL still thought the sender was
a spammer. It's just splitting the averages.
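
Where the 6.3 comes from, as a Python sketch. It assumes the AWL adjustment is half the distance between the message's pre-AWL score and the sender's historical mean (a factor of 0.5) and a spam threshold of 5.0. The function name is invented for the example.

```python
def historical_mean(final_score, awl_score, factor=0.5):
    """Work back from a report line to the mean the AWL had on file."""
    pre_awl = final_score - awl_score      # score before the AWL touched it
    return pre_awl + awl_score / factor    # undo (mean - pre_awl) * factor

for final in (4.0, 6.3, 9.0):
    mean = historical_mean(final, awl_score=-1.3)
    print(f"final {final:.1f}, AWL -1.3 -> historical mean {mean:.1f}")
```

So an AWL of -1.3 on a message that ends at 6.3 means the stored mean for that sender was right at 5.0, and anything ending higher means the history was above the threshold.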