Re: SA-learn (spamassassin)

2009-08-03 Thread monolit

Good morning. The output of sa-learn --dump magic after Bayes learning shows
nspam/nham increased by 1. I tried the command several times. I tried writing
a mail with Subject: viagra and body: viagra, and sent it from my first
account to my second account (score 0.4). Then I ran sa-learn --spam on this
mail. I wrote the same mail and sent it from the first account to the second
again. The mail got a higher score, 2.4. I took this mail and ran
sa-learn --spam on it. I wrote the same mail and repeated the sending (from
the first account to the second). The score was higher again, 3.4. I tried
this several more times, but the score didn't grow any further...
That was my small experiment with Bayes scoring.

My spamd process runs as root. I ran sa-learn as root. BUT there is a
database in the /root directory and the same database in the /home/spamfilter
directory. spamfilter is the user that is specified in master.cf. In the
SpamAssassin configuration (local.cf) I have an entry for the Bayes database,
and its path is /home/spamfilter... When I ran sa-learn as root, I checked
the modification times of the databases. The database under the spamfilter
user is updated correctly (the one under root is not updated).

I know it is strange and confusing ... to use two users for this. I wish
everything ran under one user, but I don't know how to start spamd as
spamfilter. I am not sure whether that is right... maybe spamd should
run as root.
Here is my modification to master.cf (Postfix). This modification is
recommended on the SpamAssassin web pages.

smtp  inet  n   -   n   -   -   smtpd
 -o content_filter=spamfilter:dummy


# Interfaces to non-Postfix software. Be sure to examine the manual
# pages of the non-Postfix software to find out what options it wants.
# 
spamfilter unix -   n   n   -   -   pipe
 flags=Rq user=spamfilter argv=/usr/local/bin/spamfilter -f ${sender} -- ${recipient}

Thank you for explaining how Bayes works and for the time you have devoted
to me.
-- 
View this message in context: 
http://www.nabble.com/SA-learn-%28spamassassin%29-tp24773517p24786173.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Re: Parallelizing Spam Assassin

2009-08-03 Thread Dan Schaefer
This whole time I thought the subject line was "Paralyzing Spam
Assassin" and the original poster was having trouble with SA locking up.
Oops. ;-)


--
Dan Schaefer
Web Developer/Systems Analyst
Performance Administration Corp.



Re: SA-learn (spamassassin)

2009-08-03 Thread Karsten Bräckelmann
On Sun, 2009-08-02 at 23:50 -0700, monolit wrote:
 Good morning. The output of sa-learn --dump magic after Bayes learning shows
 nspam/nham increased by 1. I tried the command several times. I tried writing
 a mail with

I did not ask for the difference. I asked for the output of the command.

 Subject: viagra and body: viagra, and sent it from my first account to my
 second account (score 0.4). Then I ran sa-learn --spam on this mail. I wrote

As I told you before, there are *lots* of other tokens. Which differ
greatly between your self-written messages and spam. Measuring Bayes by
observing a single token is broken.
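
To make the point about other tokens concrete, here is a minimal sketch. It assumes a simplified naive-Bayes style combination of per-token spam probabilities; SpamAssassin's own Bayes code selects and combines tokens differently, and the probabilities below are invented for illustration.

```python
from math import prod

def combine(token_probs):
    """Naive-Bayes style combination of per-token spam probabilities.

    Simplified illustration only; this is not SpamAssassin's own
    combining code, and the probabilities used below are invented.
    """
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)

# One strongly spammy body token, as in the experiment.
only_viagra = [0.99]

# The same token plus tokens from headers (sender, relay, mailer, ...)
# that the database has mostly seen in ham.
hand_written = [0.99, 0.2, 0.2, 0.2, 0.2, 0.2]

print(f"single token alone:      {combine(only_viagra):.2f}")
print(f"with five hammy tokens:  {combine(hand_written):.2f}")
```

Under these assumed numbers the hand-written message ends up well below 0.5 even though it contains the spammy word, which is why watching one token tells you little.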

 the same mail and sent it from the first account to the second again. The
 mail got a higher score, 2.4. I took this mail and ran sa-learn --spam on it.
 I wrote the same mail and repeated the sending (from the first account to the
 second). The score was higher again, 3.4. I tried this several more times,
 but the score didn't grow any further...
 That was my small experiment with Bayes scoring.

Without looking at the headers and the SA rules hit, there's no evidence
Bayes did anything at all. As described, this easily could be AWL, too.

Oh my, this horse is dead anyway.


 Thank you for explaining how Bayes works and for the time you have devoted
 to me.

-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: Parallelizing Spam Assassin

2009-08-03 Thread jp
I would run a tcpdump on the ethernet interface while doing this, just 
in case there are network tests happening that you are not aware of.

On Thu, Jul 30, 2009 at 11:55:21PM -0700, poifgh wrote:
 
 Hi
 
 I was measuring how quickly could SA [spam assassin] process spams when
 several SA processes are run in parallel over separate mbox files. I used a
 8 core machine. Below are the numbers when I forked different number of
 processes.
 
 Fork = 8;
 Rate = 57 msgs/sec
 
 Fork = 4;
 Rate = 44 msgs/sec
 
 Fork = 1;
 Rate = 22 msgs/sec
 
 
 I ran freshly build SA with Bayes and DNSBL turned off. Why am I not seeing
 a linear increase in the throughput? Is a file locking creating the
 bottleneck? If yes, which particular file is being locked? If no, what could
 be the reason for this?
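
The quoted rates can be turned into speedup and efficiency figures. The sketch below takes the Fork = 1 run as the baseline and uses the Karp-Flatt metric to estimate the apparent serial fraction. It only quantifies the shortfall; it does not say which shared resource (a lock, disk I/O, memory bandwidth, something else) is responsible.

```python
# Throughput figures quoted in the message (messages per second).
rates = {1: 22.0, 4: 44.0, 8: 57.0}

def serial_fraction(speedup, n):
    """Karp-Flatt metric: apparent serial fraction for a speedup on n workers."""
    return (1.0 / speedup - 1.0 / n) / (1.0 - 1.0 / n)

base = rates[1]
results = {}
for n in sorted(rates):
    speedup = rates[n] / base
    efficiency = speedup / n
    frac = serial_fraction(speedup, n) if n > 1 else None
    results[n] = (speedup, efficiency, frac)
    frac_text = "n/a" if frac is None else f"{frac:.2f}"
    print(f"forks={n}: speedup={speedup:.2f} "
          f"efficiency={efficiency:.2f} serial fraction={frac_text}")
```

A serial fraction that stays roughly constant as the process count grows would be consistent with a fixed non-parallel share of the work rather than overhead that grows with each added process.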
 
 thnx
 -- 
 View this message in context: 
 http://www.nabble.com/Parallelizing-Spam-Assassin-tp24751958p24751958.html

-- 
/*
Jason Philbrook   |   Midcoast Internet Solutions - Wireless and DSL
KB1IOJ            |   Broadband Internet Access, Dialup, and Hosting
 http://f64.nu/   |   for Midcoast Maine http://www.midcoast.com/
*/


Re: SA-learn (spamassassin)

2009-08-03 Thread monolit

I gave you the output of the command sa-learn --dump magic. As for the end of
your reply: it could not be AWL, because I have AWL disabled. I was lucky
today ... my boss was busy. I will present my solution tomorrow :)
-- 
View this message in context: 
http://www.nabble.com/SA-learn-%28spamassassin%29-tp24773517p24793498.html



Re: blacklisting a forger; summary; /* end

2009-08-03 Thread Dennis G German
Summary:

Problem:

Observing backscatter from many different sites coming to vari...@mydomain.com.

These are NDRs (Non-Delivery Reports) for messages sent from the forger or
infected system:

59.184.51.13 aka triband-mum-59.184.51.13.mtnl.net.in

It is already blacklisted on many Realtime Black Lists, as seen via
http://www.mxtoolbox.com/blacklists.aspx

The various sites that are sending NDRs should be checking one of the RBLs
and ignoring the initial email.

My email is configured to accept all vari...@mydomain.com, so it does not
contribute to network traffic by sending NDRs.

First forwarder: relay1.sea.eschelon.com (66.213.193.108) should

Thanks to all for comments and suggestions.


Re: Timed Out

2009-08-03 Thread Sasa

..Sorry, but perhaps this isn't a problem with SA?
Thanks.

--

  Salvatore.


- Original Message - 
From: Sasa s...@shoponweb.it

To: users@spamassassin.apache.org
Sent: Thursday, July 30, 2009 4:13 PM
Subject: Timed Out


Hi, in log file I have this error with SA-3.2.5 and MySQL-5.0.77 (with 
amavisd-new, postfix, maia):


Jul 23 11:03:35 mail amavis[6329]: (06329-02-2) SA TIMED OUT, backtrace: 
at /usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/BayesStore/MySQL.pm 
line 492\n\teval {...} called at 
/usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/BayesStore/MySQL.pm 
line 
492\n\tMail::SpamAssassin::BayesStore::MySQL::tok_touch_all('Mail::SpamAssassin::BayesStore::MySQL=HASH(0xb8bfcec)', 
'ARRAY(0xb19ce24)', 1248337394) called at 
/usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/Bayes.pm line 
1284\n\tMail::SpamAssassin::Bayes::scan('Mail::SpamAssassin::Bayes=HASH(0xb6a92bc)', 
'Mail::SpamAssassin::PerMsgStatus=HASH(0xcfda70c)', 
'Mail::SpamAssassin::Message=HASH(0xd1bb9dc)') called at 
/usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/Plugin/Bayes.pm line 
50\n\tMail::SpamAssassin::Plugin::Bayes::check_bayes('Mail::SpamAssassin::Plugin::Bayes=HASH(0xb7647cc)', 
'Mail::SpamAssassin::PerMsgStatus=HASH(0xcfda70c)', 'ARRAY(0xba621dc)', 
0.99, 1.00) called at (ev...


..this error appears only occasionally, but several times a day.
Is this error caused by SA or MySQL?
Thanks.

--

  Salvatore.








Re: SA-learn (spamassassin)

2009-08-03 Thread Matus UHLAR - fantomas
On 03.08.09 09:20, monolit wrote:
 I gave you the output of the command sa-learn --dump magic. As for the end of
 your reply: it could not be AWL, because I have AWL disabled. I was lucky
 today ... my boss was busy. I will present my solution tomorrow :)

Better not to present it until you finally understand how it works...

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Two words: Windows survives. - Craig Mundie, Microsoft senior strategist
So does syphilis. Good thing we have penicillin. - Matthew Alton


Re: SA-learn (spamassassin)

2009-08-03 Thread Karsten Bräckelmann
On Mon, 2009-08-03 at 09:20 -0700, an anonymous Nabble user wrote:
 I gave you the output of the command sa-learn --dump magic.

No, you did NOT provide the output.  But hey, there's no point in
arguing over this or further following up with this thread anyway.





Re: SA-learn (spamassassin)

2009-08-03 Thread monolit

If you are so clever (since I am a poor English speaker), you could explain
this topic to me by mail in Slovak. Is that a problem for you? I haven't
found enough good material on this subject in Czech.
-- 
View this message in context: 
http://www.nabble.com/SA-learn-%28spamassassin%29-tp24773517p24794082.html



Re: SA-learn (spamassassin)

2009-08-03 Thread monolit

Dear Karl, I don't know which command output you need. You said sa-learn
--dump magic. What am I supposed to make of your request? I am totally
confused by you ...
-- 
View this message in context: 
http://www.nabble.com/SA-learn-%28spamassassin%29-tp24773517p24794182.html



Re: SA-learn (spamassassin)

2009-08-03 Thread Benny Pedersen
On Mon, 3 Aug 2009 09:43:49 -0700 (PDT), monolit xmull...@gmail.com
wrote:
 If you are so clever (since I am a poor English speaker), you could explain
 this topic to me by mail in Slovak. Is that a problem for you? I haven't
 found enough good material on this subject in Czech.

It's hard to help if we don't both understand what to do. It's not your bad
Danish that is the problem, either.

-- 
Benny Pedersen


Re: SA-learn (spamassassin)

2009-08-03 Thread Karsten Bräckelmann
On Mon, 2009-08-03 at 09:49 -0700, an annoying Nabble user wrote:
 Dear Karl, I don't know which command output you need. You said sa-learn
 --dump magic. What am I supposed to make of your request? I am totally
 confused by you ...

Karl?


Dear anonymous Nabble user,

what I requested from you is the output of that command. The actual
output you get when running the command. Not a statement by you, that
you have run the command -- but the output, copied and pasted. You do
know how to copy-n-paste, don't you?

Oh, yeah, you do. We established that before. After all, this entire
thread started as a copy-n-paste from an Ubuntu forum.


Anyway, end-of-thread for me.  Don't bother sending the output now.





Re: SA-learn (spamassassin)

2009-08-03 Thread monolit

I am really sorry, I am tired from work. I made a mistake with your name.
This task is serious, so please don't joke ("Oh, yeah, you do. We established
that before. After all, this entire thread started as a copy-n-paste from an
Ubuntu forum."). I started the thread here and then copied it to the Ubuntu
forum. As I said, I really need help. ... And I am not anonymous, I have a
nick...
The command didn't run!

 [r...@localhost 3.002005]# sa-learn --dump magic  //start
Unrecognized escape \g passed through in regex; marked by <-- HERE in
m/(?i)\g <-- HERE irls\b/ at
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/Conf/Parser.pm line 1173.
0.000          0          3          0  non-token data: bayes db version
0.000          0         67          0  non-token data: nspam
0.000          0         29          0  non-token data: nham
0.000          0       1588          0  non-token data: ntokens
0.000          0 1247338497          0  non-token data: oldest atime
0.000          0 1249317365          0  non-token data: newest atime
0.000          0 1249317143          0  non-token data: last journal sync atime
0.000          0          0          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
[r...@localhost 3.002005]#  //stop

This is the output... the command isn't running. But I gave almost the same
output to the forum. The only difference was that my first post had no
prompt.
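
The counters in that output can be read mechanically. The sketch below parses the "non-token data" lines and compares nspam and nham against 200 each, which is assumed here to be the default of the bayes_min_spam_num and bayes_min_ham_num settings; check your own configuration before relying on that figure.

```python
from datetime import datetime, timezone

# Counter lines copied from the output above (warning and prompt left out).
DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0         67          0  non-token data: nspam
0.000          0         29          0  non-token data: nham
0.000          0       1588          0  non-token data: ntokens
0.000          0 1247338497          0  non-token data: oldest atime
0.000          0 1249317365          0  non-token data: newest atime
"""

# Assumed defaults of bayes_min_spam_num / bayes_min_ham_num.
MIN_SPAM = 200
MIN_HAM = 200

def parse_magic(text):
    """Return {name: value} for every 'non-token data' line."""
    values = {}
    for line in text.splitlines():
        fields, sep, name = line.partition("non-token data:")
        if not sep:
            continue  # not a counter line (e.g. a Perl warning)
        values[name.strip()] = int(fields.split()[2])
    return values

magic = parse_magic(DUMP)
bayes_active = magic["nspam"] >= MIN_SPAM and magic["nham"] >= MIN_HAM
newest = datetime.fromtimestamp(magic["newest atime"], tz=timezone.utc)

print("nspam:", magic["nspam"], "nham:", magic["nham"])
print("enough training for Bayes to score:", bayes_active)
print("newest token access (UTC):", newest.isoformat())
```

If the assumed thresholds apply, a database with 67 spam and 29 ham messages is still below the minimum, so the score changes seen in the experiment may not come from Bayes at all.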

-- 
View this message in context: 
http://www.nabble.com/SA-learn-%28spamassassin%29-tp24773517p24795226.html



Re: privacy policy updates?

2009-08-03 Thread J.D. Falk

LuKreme wrote:

 I haven't gone to any of the sites, and it could all be coincidence, but
 it seemed a little suspicious to me.

 Over-reaction?

I'd be suspicious, too, but there are regulations (in some jurisdictions, 
for some industries) stating that companies have to alert you when their 
privacy policy changes.  These still hold true after the company gets bought 
out or changes names, and may even apply to info which was harvested or 
purchased from shady list brokers.


What you're describing sounds like they may have even outsourced the 
notification process to some other company, and this 3rd party doesn't know 
how to make their mail look less phishy.


(This isn't to say that the mail isn't spam, of course.)

--
J.D. Falk
Return Path Inc
http://www.returnpath.net/


Re: Parallelizing Spam Assassin

2009-08-03 Thread poifgh

I did that - with DNSBL off there are no port 53 communications from SA

--


Jason Philbrook wrote:
 
 I would run a tcpdump on the ethernet interface while doing this, just 
 in case there are network tests happening that you are not aware of.
 

-- 
View this message in context: 
http://www.nabble.com/Parallelizing-Spam-Assassin-tp24751958p24796555.html



Re: Timed Out

2009-08-03 Thread Mark Martinec
Sasa,

 Hi, in log file I have this error with SA-3.2.5 and MySQL-5.0.77 (with
 amavisd-new, postfix, maia):

 Jul 23 11:03:35 mail amavis[6329]: (06329-02-2) SA TIMED OUT, backtrace: at
 /usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/BayesStore/MySQL.pm
 line 492\n\teval {...} called at
 /usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/BayesStore/MySQL.pm
 line
 492\n\tMail::SpamAssassin::BayesStore::MySQL::tok_touch_all('Mail::SpamAssassin::BayesStore::MySQL=HASH(0xb8bfcec)',
 'ARRAY(0xb19ce24)', 1248337394) called at
 /usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/Bayes.pm line
 1284\n\tMail::SpamAssassin::Bayes::scan('Mail::SpamAssassin::Bayes=HASH(0xb6a92bc)',
 'Mail::SpamAssassin::PerMsgStatus=HASH(0xcfda70c)',
 'Mail::SpamAssassin::Message=HASH(0xd1bb9dc)') called at
 /usr/lib/perl5/vendor_perl/5.10.0/Mail/SpamAssassin/Plugin/Bayes.pm line
 50\n\tMail::SpamAssassin::Plugin::Bayes::check_bayes('Mail::SpamAssassin::Plugin::Bayes=HASH(0xb7647cc)',
 'Mail::SpamAssassin::PerMsgStatus=HASH(0xcfda70c)', 'ARRAY(0xba621dc)',
 0.99, 1.00) called at (ev...

 ..this error appears only occasionally, but several times a day.
 Is this error caused by SA or MySQL?

Amavisd sets a timer before calling SpamAssassin, typically to 30 seconds.
If the timer expires, it logs a backtrace, although execution is typically not
interrupted. The backtrace shows where the program was at the time
of the interrupt - which may or may not be the actual trouble spot guilty
of consuming excessive time. If this happens repeatedly and the backtrace
points to a similar place in the code, it is likely this actually _is_ the
trouble spot. If it happens very rarely, you can ignore it and treat it
as a warning only.
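
The mechanism can be imitated in a few lines. This is a minimal sketch, not amavisd's own code (which is Perl): it assumes a Unix system where SIGALRM is available, and the names on_timer and slow_scan are made up for the illustration. The handler only records a backtrace and returns, so the interrupted work carries on.

```python
import signal
import time
import traceback

backtraces = []

def on_timer(signum, frame):
    # Record where the main program was when the timer fired; returning
    # from the handler lets the interrupted code carry on.
    backtraces.append("".join(traceback.format_stack(frame)))

def slow_scan(duration):
    """Stand-in for a long-running scan: spin for `duration` seconds."""
    end = time.monotonic() + duration
    loops = 0
    while time.monotonic() < end:
        loops += 1
    return loops

signal.signal(signal.SIGALRM, on_timer)
signal.setitimer(signal.ITIMER_REAL, 0.05)   # 30 seconds in the setup described
completed = slow_scan(0.2) > 0                # the scan is not aborted
signal.setitimer(signal.ITIMER_REAL, 0)       # cancel the timer

print("scan completed:", completed)
print("backtrace logged:", bool(backtraces))
print(backtraces[0] if backtraces else "")
```

The logged backtrace names whatever function happened to be running when the timer fired, which is why a single occurrence proves little and a repeated location is a much stronger hint.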

  Mark


Re: blacklisting a forger; summary; /* end

2009-08-03 Thread LuKreme

On 3-Aug-2009, at 10:21, Dennis G German wrote:

Content-Type: text/html;
charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

html xmlns:o=3Durn:schemas-microsoft-com:office:office =
xmlns:w=3Durn:schemas-microsoft-com:office:word =
xmlns=3Dhttp://www.w3.org/TR/REC-html40;


p class=3DMsoNormalfont size=3D2 face=3DArialspan =
style=3D'font-size:10.0pt;
font-family:Arial'Problem:o:p/o:p/span/font/p




Yes, there IS a problem.

What the hell?

--
Behind every great man there's a woman with a vibrator
-- Hawkeye Pierce



Backscatter.org used as RBL??

2009-08-03 Thread Dennis G German
Is Backscatter.org http://www.backscatterer.org/index.php used by any
rules?

I looked but did not find any.

Dennis G German



Re: blacklisting a forger; summary; /* end

2009-08-03 Thread Matt Kettler
LuKreme wrote:
 On 3-Aug-2009, at 10:21, Dennis G German wrote:
 Content-Type: text/html;
 charset=US-ASCII
 Content-Transfer-Encoding: quoted-printable


 Yes, there IS a problem.

 What the hell?

The message was multipart/alternative. You are more than capable of
reading the text/plain part.

html-only messages are strongly discouraged on the list, but so is
complaining about multipart/alternative.







large unicode email nails CPU

2009-08-03 Thread Jason Haar

Hi there

We've got a few people subscribed to Serbian mailing lists, and one in
particular is having difficulty getting email to us - spamc/spamd times
out and is never able to process the message. While it is running, spamd
takes 100% of the CPU for 1.5+ minutes.


Here's an example: http://pastebin.com/m75f39d72

strace shows spamd running around looking for unicore/lib/gc_sc files - 
which is related to unicode stuff. I don't know if that's the problem 
- but that's all I could find. spamassassin -D  doesn't show anything 
strange other than massively long times to process DNSBLs. They are not 
believable: the slow DNSBLs change from invocation to invocation (of 
the same message), and dig shows no such issues - the DNSBLs SA says 
are taking 99sec to complete return instantly via dig (and yes, local 
caching DNS). Also, it is specifically a problem with these emails - in 
general we are not seeing any problems with any other email.


We've also got SA in several countries, all on CentOS5 servers 
(perl-5.8.8,spamassassin-3.2.5-1) and they all show the same symptoms - 
so I don't think it's network related but rather CPU: basically these 
emails nail SA and it's slow to finish for them?


Any ideas? Thanks!

--
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1



Re: large unicode email nails CPU

2009-08-03 Thread Michael Scheidell



Jason Haar wrote:

 Hi there

 We've got a few people subscribed to Serbian mailing lists, and one in
 particular is having difficulty getting email to us - spamc/spamd
 times out and is never able to process the message. While it is
 running, spamd takes 100% of the CPU for 1.5+ minutes.

 Here's an example: http://pastebin.com/m75f39d72

Pretty cool. It does a similar thing here on a 64-bit AMD core running
FreeBSD, Perl 5.8.9 (but it only took 52 seconds).
But that's about 45 seconds more than it should. We are averaging 3 to 7
seconds per email to parse them. Are you running compiled rules?


--
Michael Scheidell, CTO
Phone: 561-999-5000, x 1259
 *| *SECNAP Network Security Corporation

   * Certified SNORT Integrator
   * 2008-9 Hot Company Award Winner, World Executive Alliance
   * Five-Star Partner Program 2009, VARBusiness
   * Best Anti-Spam Product 2008, Network Products Guide
   * King of Spam Filters, SC Magazine 2008



Re: Backscatter.org used as RBL??

2009-08-03 Thread LuKreme

On 3-Aug-2009, at 18:36, Dennis G German wrote:
 Is Backscatter.org http://www.backscatterer.org/index.php used by any
 rules?

Pretty sure not. The way to use that RBL is as an RBL. Don't accept
the backscatter in the first place.



--
I got a question. If you guys know so much about women, how come
you're here at like the Gas 'n' Sip on a Saturday night
completely alone drinking beers with no women anywhere?



Re: large unicode email nails CPU

2009-08-03 Thread Jason Haar

On 08/04/2009 02:03 PM, Michael Scheidell wrote:

 Here's an example: http://pastebin.com/m75f39d72

 Pretty cool. It does a similar thing here on a 64-bit AMD core running
 FreeBSD, Perl 5.8.9 (but it only took 52 seconds).
 But that's about 45 seconds more than it should. We are averaging 3 to
 7 seconds per email to parse them. Are you running compiled rules?


Nope.

--
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1



Bayes training

2009-08-03 Thread MySQL Student
Hi,

We have accumulated quite a large list of whitelisted users, primarily
because they were previously tagged incorrectly. I've extracted a copy
of all whitelisted mail into a separate mbox.

Certainly there is some spam in there as well, but assuming I only
learn the ham, would it make sense to train bayes using the emails
from this folder? It's all business-related, but I'm concerned that it
may have things in the email that caused it to be tagged in the first
place, like excessive HTML, sent from a host with no reverse DNS, etc.
-- all the reasons for it being whitelisted in the first place.

Looking at the logs before the addresses were added to the whitelist,
I see quite a few that were BAYES_99, probably because they resemble
mailing lists, such as those from networkworld, for example. IOW, I
wouldn't want to whitelist an email from networkworld.com, but one of
the company's partners could send the company an email that had many
of those characteristics.

Someone may also send them a one-line email with a small GIF as an
attachment, such as their corporate logo in their signature. This
would be a valid email, but also very much resembles the
characteristics of a typical spam.

This is all being done to hopefully train bayes to better recognize
corporate email, and hopefully cut down on the number of whitelisted
senders that must be added in the future (or, corporate email that
gets tagged then must be whitelisted).
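
One way to keep the known spam out of the training set is to copy only the reviewed ham into a second mbox first. The sketch below uses Python's standard mailbox module; the file names, the sample messages and the set of hand-identified spam Message-IDs are all hypothetical, and the review of which messages are spam still has to be done by a person.

```python
import mailbox
import os
import tempfile

def copy_ham(src_path, dst_path, known_spam_ids):
    """Copy every message whose Message-ID is not in known_spam_ids."""
    src = mailbox.mbox(src_path)
    dst = mailbox.mbox(dst_path)
    kept = 0
    try:
        for msg in src:
            if msg.get("Message-ID") in known_spam_ids:
                continue  # hand-identified spam: keep it out of the ham set
            dst.add(msg)
            kept += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return kept

# Tiny demonstration with two made-up messages.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "whitelisted.mbox")
dst_path = os.path.join(workdir, "ham-only.mbox")

box = mailbox.mbox(src_path)
box.add("Message-ID: <1@example.com>\nFrom: partner@example.com\n"
        "Subject: quote\n\nPlease find the quote attached.\n")
box.add("Message-ID: <2@example.net>\nFrom: junk@example.net\n"
        "Subject: offer\n\nBuy now.\n")
box.flush()
box.close()

kept = copy_ham(src_path, dst_path, {"<2@example.net>"})
print("messages kept as ham:", kept)
```

The resulting file could then be given to sa-learn with its --ham and --mbox options. Whether learning whitelisted mail as ham is a good idea at all is the open question of this message, and the sketch does not answer it.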

Ideas greatly appreciated.
Thanks,
Alex