Re: Uppercase E-mail in Latin America

2009-10-06 Thread Yet Another Ninja

On 10/6/2009 2:33 AM, Warren Togami wrote:

Please excuse me, I used faulty logic.

I wasn't asking you anything further.  I meant I asked this friend for 
more details, and it seems that non-technical users are the most likely 
type of people to type legitimate mail in all caps.


Warren


so what score is being added to this uppercase stuff?

score UPPERCASE_50_75 0.001 0.490 0.001 0.001
score UPPERCASE_75_100 2.402 1.930 1.127 1.528

reminder: SA adds up rule scores, and one rule alone won't, by default, 
tag something as spam.



where's the problem? what's the worry?
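
To make the arithmetic concrete, here is a minimal Python sketch (not SpamAssassin code). It assumes the documented meaning of the four columns of a `score` line (Bayes off/network off, Bayes off/network on, Bayes on/network off, Bayes on/network on) and the default required score of 5.0. The function names are made up for illustration.

```python
# The two score lines quoted above, as (set 0, set 1, set 2, set 3).
# Score sets per the SpamAssassin `score` directive documentation:
#   0: Bayes off, network tests off    1: Bayes off, network tests on
#   2: Bayes on,  network tests off    3: Bayes on,  network tests on
SCORES = {
    "UPPERCASE_50_75": (0.001, 0.490, 0.001, 0.001),
    "UPPERCASE_75_100": (2.402, 1.930, 1.127, 1.528),
}

REQUIRED_SCORE = 5.0  # SpamAssassin's default threshold


def score_set(bayes: bool, network: bool) -> int:
    """Return the index of the score column that applies."""
    return (2 if bayes else 0) + (1 if network else 0)


def total_score(hits, bayes: bool, network: bool) -> float:
    """Add up the scores of every rule that hit (hypothetical helper)."""
    idx = score_set(bayes, network)
    return sum(SCORES[name][idx] for name in hits)


def is_spam(hits, bayes=True, network=True, required=REQUIRED_SCORE) -> bool:
    """A message is tagged only when the total reaches the threshold."""
    return total_score(hits, bayes, network) >= required
```

Even in the worst case (score set 0, 2.402 points), UPPERCASE_75_100 on its own stays well below 5.0, so other rules have to hit as well before the message is tagged.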


Re: OT bad news

2009-10-06 Thread ram

On Mon, 2009-10-05 at 15:05 -0700, Quanah Gibson-Mount wrote:
 --On Monday, October 05, 2009 11:50 PM +0200 mouss mo...@ml.netoyen.net 
 wrote:
 
  Thomas Mullins wrote:
  We have been running Spamassassin for maybe eight years now.  But, my
  coworkers do not like OpenSource.  So they have finally complained
  enough that my boss is going to replace our reliable
  FreeBSD/Spamassassin boxes.  They are planning on purchasing something
  that runs ON Exchange.  What a bummer.
 
 
 
  and the problem is?
 
  if they want exchange, give them exchange. don't fight (directly), watch
  instead. take pleasure of the situation, get fun as you can. I
  personally took fun all day long in windows-only (and believe it or not,
  in linux-only) environments.
 
 
  that said, you can still try to explain that exchange should not be
  exposed to the internet. you still need a relay (such as freebsd/postfix).
 
 
 And once exchange falls over, show them Zimbra. ;)  Which uses 
 postfix/SA/amavis, etc, and looks a lot like exchange... only better. ;)
 

Isn't Zimbra dead yet? Yahoo deliberately messed it up, I believe, and
now looks to dump it.

Anyway, I think people run away from open source because it is
unsupported. Management doesn't want to have an indispensable IT
team, so that they can always recruit some cheap M$$-trained guy from
the market to do a dirty job. 

There is also security in question. If something goes wrong with your
linux/BSD box *you* will be blamed. If something goes wrong with m$ box
(as usual) they would claim that that is how it is supposed to work :-).
After all it is from the leading software makers. 

Never mind that the management also get sponsored International holidays
for putting their entire budget in worthless stuff. 




 --Quanah
 
 --
 
 Quanah Gibson-Mount
 Principal Software Engineer
 Zimbra, Inc
 
 Zimbra ::  the leader in open source messaging and collaboration



Re: Problems with whitelist_from_rcvd

2009-10-06 Thread Igor Bogomazov
 Ignore the text immediately after the from, in this case 
 SUB.MYDOMAIN.MAIL. That is _not_ rDNS data, that is whatever the
 client sent in its SMTP HELO, and can be _anything_. If you see the
 correct hostname there it just means that computer is sending its
 correct hostname when it says HELO.
 
 To illustrate, I pulled this out of your message to the list, it is
 not edited in any way:
 
 Received: from localhost (unknown [213.108.33.133])
  by highlink.ru (Postfix) with ESMTP id 37F236A818D
  for users@spamassassin.apache.org; Mon,  5 Oct 2009 10:28:48
 +0400 (MSD)
 
 I'm pretty sure 213.108.33.133's rDNS does not say localhost.
 
 The (unknown [12.12.12.12]) is the DNS data about the client as
 your MTA sees it, and the fact that it says unknown means that for
 some reason it cannot perform rDNS on that IP address, or perhaps its
 rDNS is explicitly set to unknown. If rDNS was working you'd see
 something like:
 
 Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
  by ga.impsec.org (8.13.7/8.13.7) with SMTP id n956Tp8L020518
  for jhar...@impsec.org; Sun, 4 Oct 2009 23:29:55 -0700
 
 Exactly how are you checking the rDNS of that IP address? Can you 
 demonstrate? For example, here are rDNS lookups on the two IP
 addresses from my examples above:
 
 jhar...@dendarii ~ $ host 213.108.33.133
 133.33.108.213.in-addr.arpa domain name pointer 133.33.108.213.hl.ru.
 jhar...@dendarii ~ $ host 140.211.11.3
 3.11.211.140.in-addr.arpa domain name pointer hermes.apache.org.
 
 I note that the first does have an rDNS, even though the Received:
 header from the MTA in the example above says unknown.
 
 Are you performing your rDNS tests on the MTA computer? It looks to
 me like the DNS setup on it is misconfigured somehow and it can't
 perform rDNS queries successfully.
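
The three parts of the "from" clause described above (HELO name, rDNS name, client IP) can be pulled apart with a short Python sketch. It assumes the Postfix/Sendmail-style layout shown in the two example headers; other MTAs format Received: lines differently, and `parse_received` is a made-up helper name.

```python
import re

# Matches: from <HELO> (<rDNS or "unknown"> [<client IP>])
RECEIVED_FROM = re.compile(
    r"from\s+(?P<helo>\S+)\s+"
    r"\((?P<rdns>\S+)\s+\[(?P<ip>[0-9a-fA-F.:]+)\]\)"
)


def parse_received(header: str):
    """Return the HELO name, rDNS name and IP, or None if the layout differs."""
    match = RECEIVED_FROM.search(header)
    return match.groupdict() if match else None


if __name__ == "__main__":
    line = "Received: from localhost (unknown [213.108.33.133])"
    # helo is what the client claimed; rdns is what the MTA looked up itself.
    print(parse_received(line))
```

Only the `rdns` and `ip` fields come from the receiving MTA; the `helo` field is whatever the client chose to send.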
 

What I do (all commands on the mail-server, where SA is installed):

# host SUB.MYDOMAIN.MAIL
SUB.MYDOMAIN.MAIL has address 12.12.12.12

# host 12.12.12.12
12.12.12.12.in-addr.arpa domain name pointer SUB.MYDOMAIN.MAIL.

host does not produce anything else but a single row

-- 
Best regards,

Igor Bogomazov
Chief Technical Specialist
HighLink Ltd. St-Petersburg, Russia
8(812)334-12-12 [ext. 220]
8(963)344-44-38 (Beeline)
http://www.hl.ru





Re: Hostkarma White list Updated and Improved

2009-10-06 Thread Marc Perkel



Jon Trulson wrote:

On Mon, 5 Oct 2009, Marc Perkel wrote:




John Hardin wrote:

On Mon, 5 Oct 2009, Marc Perkel wrote:

Our white list is supposed to be a source of pure good email. So if 
spam comes for any of the white listed IPs then it's an error.


Whose? Yours or theirs?

Meaning: is a single spam reason for an IP to be dropped from the 
hostkarma whitelist?


It depends on what kind of spam it is. If it is a virus generated 
spam - then yes. If it's a spam determined by message content - no.




  Sorry if I missed this in the thread, but how do you determine
  whether a spam originates from a bot-net vs. a 'lone wolf'?


A combination of several factors including hitting my tarbaby server AND 
not using QUIT to close the connection AND some HELO sins. I'm catching 
near 100% of botnet spam.






Re: OT bad news

2009-10-06 Thread Dan Schaefer

Ted Mittelstaedt wrote:

Gary Smith wrote:

 Let them have as much Windows stuff as they want.  Just plead the 
case to supplement. 


I'll have to repeat, for the original poster this isn't a technology
vs technology argument.  If it was, his coworkers would be listing
specific things Exchange does that FreeBSD/SA does not do.

This is a political battle.  He is essentially in the position of
a mechanic that someone brings their car to for repair, then sits
there telling the mechanic what tools he should be using to repair
their car.  If the car gets repaired the owner claims that they
knew how to repair the car better than the mechanic and the
mechanic was an idiot.  If the car repair fails the owner claims
the mechanic is incompetent and an idiot.  Either way, once your
boss starts micromanaging, you're going to be screwed whether you
do a good job or not.

He's tried rescuing the situation for 8 years, now you're giving
advice to help him rescue the situation more.  If he helps them
by keeping the BSD server in reserve, and they fall flat on their
face and he rescues them, then it just is teaching them what to
fix on their Exchange setup.  They will try it again - perhaps
falling flat again - and this will continue over and over with
them putting more powerful hardware and more expensive add-on software
on their exchange box until eventually they will figure it out, make
him get rid of the BSD box - then they won't fall flat anymore.

Then they will claim how much better Exchange works, completely
ignoring the fact that he helped them troubleshoot their exchange
setup.

There is absolutely no fix for these types other than to let them
fail and not help them back up - just let them be fired for
incompetence.  Trust me - even if that happened to these coworkers
they will just go to the next employer that's a Windows only shop
and will never once believe that the Windows solution is worse.

It's just like the people who believe in Apple.  They will go spend
$1K on an iMac and accessories and get -exactly- the same thing that
I can build with FreeBSD and a whitebox clone for a quarter of the
cost - but will never believe that they overpaid for what they have.


Ted

(Standing ovation on both emails)

--
Dan Schaefer
Web Developer/Systems Analyst
Performance Administration Corp.



Re: OT bad news

2009-10-06 Thread LuKreme

On 5-Oct-2009, at 14:49, Thomas Mullins wrote:

I will pull out our BSD box, and I will let them connect the  
Exchange box straight to the Net.


It's a shame that, living in Denver, I will be *just* out of range of  
hearing the screams as the mailspools fill with viruses, malware, and  
massive payloads of Spanish Prisoner spams.


Really, it should be fun.

Personally, I would NOT keep the old setup standing by unless  
specifically told to. I would rebuild it, slowly. Take a few days,  
maybe a week, to get it all back online. After all, once the panic  
starts, it's worth it to teach them a lesson. Also, there's going to be  
some advantage to building a nice new install with updated everything,  
right?


It's not like you'd be being VINDICTIVE, just cautious, right?

--
Battlemage? That's not a profession. It barely qualifies as a
hobby. 'Battlemage' is about impressive a title as 'Lord of the
Dance'. PAUSE I'm adding Lord of the Dance to my titles.



Re: OT bad news

2009-10-06 Thread LuKreme

On 5-Oct-2009, at 16:58, Ted Mittelstaedt wrote:


It's just like the people who believe in Apple.  They will go spend
$1K on an iMac and accessories and get -exactly- the same thing that
I can build with FreeBSD and a whitebox clone for a quarter of the
cost - but will never believe that they overpaid for what they have.


Now if that were true then a lot of Unix admins would not have Macs  
for their personal machines. If you're buying a machine to be a  
mailserver then buying an iMac is silly. If you're buying a machine to  
use and to administer unix server, then a Mac is a fine choice  
(Probably not an iMac, a MacBookPro).


The question is are there things you want your computer to do outside  
of the command line?



--
Mac OS X, because making Unix user-friendly was
easier than fixing Windows



Re: Problems with whitelist_from_rcvd

2009-10-06 Thread John Hardin

On Tue, 6 Oct 2009, Igor Bogomazov wrote:


Exactly how are you checking the rDNS of that IP address? Can you
demonstrate?

Are you performing your rDNS tests on the MTA computer? It looks to
me like the DNS setup on it is misconfigured somehow and it can't
perform rDNS queries successfully.


What I do (all commands on the mail-server, where SA is installed):

# host SUB.MYDOMAIN.MAIL
SUB.MYDOMAIN.MAIL has address 12.12.12.12

# host 12.12.12.12
12.12.12.12.in-addr.arpa domain name pointer SUB.MYDOMAIN.MAIL.

host does not produce anything else but a single row


Okay, good. That proves that host's rDNS is properly set up.

Can you run that command on the same computer that your _MTA_ is running 
on? The MTA is what is doing the rDNS lookups for the Received: header.
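
For anyone who wants to repeat the check from a script on the MTA host rather than with host(1), here is a minimal Python sketch. `ptr_name` only builds the in-addr.arpa name that the resolver queries; `rdns` asks the local resolver, so its result depends on the DNS setup of the machine it runs on. Both function names are made up for illustration.

```python
import ipaddress
import socket


def ptr_name(ip: str) -> str:
    """Build the reverse-DNS query name, e.g. 3.11.211.140.in-addr.arpa."""
    return ipaddress.ip_address(ip).reverse_pointer


def rdns(ip: str):
    """Ask the local resolver for the PTR name; None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except (socket.herror, socket.gaierror):
        return None


if __name__ == "__main__":
    for ip in ("213.108.33.133", "140.211.11.3"):
        print(ip, ptr_name(ip), rdns(ip))
```

If `rdns` returns None on the MTA machine but a name elsewhere, that points at the MTA machine's resolver configuration rather than at the PTR record.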


--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  If healthcare is a Right means that the government is obligated
  to provide the people with hospitals, physicians, treatments and
  medications at low or no cost, then the right to free speech means
  the government is obligated to provide the people with printing
  presses and public address systems, the right to freedom of
  religion means the government is obligated to build churches for the
  people, and the right to keep and bear arms means the government is
  obligated to provide the people with guns, all at low or no cost.
---
 5 days since a sunspot last seen - EPA blames CO2 emissions


Re: OT bad news

2009-10-06 Thread --[ UxBoD ]--
- Quanah Gibson-Mount qua...@zimbra.com wrote:

| --On Monday, October 05, 2009 11:50 PM +0200 mouss
| mo...@ml.netoyen.net 
| wrote:
| 
|  Thomas Mullins wrote:
|  We have been running Spamassassin for maybe eight years now.  But,
| my
|  coworkers do not like OpenSource.  So they have finally complained
|  enough that my boss is going to replace our reliable
|  FreeBSD/Spamassassin boxes.  They are planning on purchasing
| something
|  that runs ON Exchange.  What a bummer.
| 
| 
| 
|  and the problem is?
| 
|  if they want exchange, give them exchange. don't fight (directly),
| watch
|  instead. take pleasure of the situation, get fun as you can. I
|  personally took fun all day long in windows-only (and believe it or
| not,
|  in linux-only) environments.
| 
| 
|  that said, you can still try to explain that exchange should not be
|  exposed to the internet. you still need a relay (such as
| freebsd/postfix).
| 
| 
| And once exchange falls over, show them Zimbra. ;)  Which uses 
| postfix/SA/amavis, etc, and looks a lot like exchange... only better.
| ;)
| 
| --Quanah
|
Seconded :)


Best Regards,

-- 

SplatNIX IT Services :: Innovation through collaboration



Re: Uppercase E-mail in Latin America

2009-10-06 Thread LuKreme

On 5-Oct-2009, at 12:53, René Berber wrote:

Warren Togami wrote:

On 10/05/2009 02:30 PM, René Berber wrote:

Warren Togami wrote:

I heard an interesting story from a friend who was working in Mexico for
the past few months.  Apparently in some Latin American countries,
uppercase legitimate person-to-person e-mail is common because it is
seen as a sign of respect.  This apparently is due to historical
telegraph messages being in uppercase.


Not true.


Could you provide some context?  Where are you from?  What kind of
industry or people are you exposed to?


I am Mexican, living in México City.


I grew up in Guadalajara and still have friends there, and in 'el De  
Effe' as well as scattered around a few other places in Mexico and I  
can confirm this is simply not true. No one uses all caps as a sign of  
respect.


I can't speak to other Latin American countries. Perhaps this is true  
in Guatemala, or Nicaragua? I doubt it though.


--
Everybody hates a tourist, especially one who thinks it's all such
a laugh. Yeah, and the chip stains and grease will come out in the
bath. You will never understand how it feels to live your life
with no meaning or control, and with nowhere left to go. You
are amazed that they exist, and they burn so bright whilst you
can only wonder why.



Re: SIGCHLD query

2009-10-06 Thread Per Jessen
Martin Gregorie wrote:

 What causes a spamd 3.2.5 child process to be terminated by receiving
 a SIGCHLD signal?
 

A parent process receives a SIGCHLD when a child process terminates. 

 My last month's logs show 7 of them and I can't work out what caused
 them to be sent. However, Jose Luis Marin Perez' system is seeing a
 lot of them - on the order of 10% of messages scanned are getting hit
 by them, though his seem to be connected with very long running scans.

A timeout in the child perhaps?


/Per Jessen, Zürich



RE: OT bad news

2009-10-06 Thread R-Elists
 

 I have no explanation,
  
 Their supposed complaint is, they don't know *nix.  But my 
 coworker and I manage those boxes, so even if one of us left, 
 there would be at least one person to run those boxes.
  
 SA/ClamAV has been working great.  Our BSD box sits in front 
 of the Exchange, hands off clean mail, what more could you 
 ask for.  We have two boxes, in case we need to take one down 
 for an upgrade. 
  
 I will pull out our BSD box, and I will let them connect the 
 Exchange box straight to the Net.  
  
 Shane

Shane,

you have probably already thought of and done this yet just in case...

document the entire history of these boxes and save the configs of course...

plus compile as many of the functional statistics as you can over the life
(logs) of those servers re: how much total email and how much malware and
ham and spam and rejected and delivered email qty etc etc...

that way, when the doodie hits the fan and end users are screaming over the
huge increase in spam, you have hard stats that tell the real story and
write the one page paper about it...

whether now, or later, possibly consider distributing it to people that
seriously need to know.

 - rh



RE: Uppercase E-mail in Latin America

2009-10-06 Thread R-Elists
 

 
 I grew up in Guadalajara and still have friends there, and in 
 'el De Effe' as well as scattered around a few other places 
 in Mexico and I can confirm this is simply not true. No one 
 uses all caps as a sign of respect.
 
 I can't speak to other Latin American countries. Perhaps this 
 is true in Guatemala, or Nicaragua? I doubt it though.
 

hm

doesn't it appear to everyone else that this has the (slim to none) makings
of a new urban legend?

i mean, if all caps was a sign of respect on that continent, then wouldn't
all of the advertising be in all caps out of respect?

a few days ago when this was posted it was almost believable, for like 3
seconds of pondering.

 - rh



Re: SIGCHLD query

2009-10-06 Thread Martin Gregorie
On Tue, 2009-10-06 at 16:46 +0200, Per Jessen wrote:
 Martin Gregorie wrote:
 
  What causes a spamd 3.2.5 child process to be terminated by receiving
  a SIGCHLD signal?
  
 
 A timeout in the child perhaps?
 
I thought that might be the reason. It certainly seems to apply when a
child runs longer than the time set by --timeout-child, but there are a
few cases where a SIGCHLD is sent when the child has only run for a
second or two. It's a pity the log message doesn't include the reason why
the SIGCHLD was sent.


Martin




RE: OT bad news

2009-10-06 Thread Gary Smith
 (Standing ovation on both emails)
 
 --
 Dan Schaefer
 Web Developer/Systems Analyst
 Performance Administration Corp.

I feel beat down now :( 

j/k




Re: OT bad news

2009-10-06 Thread MySQL Student
Hi,

 It's a shame that, living in Denver, I will be *just* out of range of
 hearing the screams as the mailspools fill with viruses, malware, and
 massive payloads of Spanish Prisoner spams.

Aw, c'mon now. Yes, I agree SA is a better solution, but Microsoft
didn't get to be a multi-billion-dollar company solely because of its
marketing. Certainly a competent admin following some SANS guides can
secure an Exchange box to sufficiently avoid it getting hacked, and a
properly-installed version of Symantec will keep most spam away.

It /is/ possible, I suppose :-)

I'd bet that if he kept the FreeBSD box in place and just told his
boss he upgraded to Exchange, they'd never even know :-)

Regards,
Alex


Re: Uppercase E-mail in Latin America

2009-10-06 Thread MySQL Student
Hi,

 doesnt it appear to everyone else that this has the (slim to none) makings
 of a new urban legend?

I have to admit that when Warren posted this, I went to snopes to
check, and there was nothing there :-)

Regards,
Alex


SpamAssassin Ruleset Generation

2009-10-06 Thread poifgh

I have a question about how rulesets are generated for
SpamAssassin.

For example - consider the rule in 20_drugs.cf : 
header SUBJECT_DRUG_GAP_C   Subject =~ /\bc.{0,2}i.{0,2}a.{0,2}l.{0,2}i.{0,2}s\b/i
describe SUBJECT_DRUG_GAP_C Subject contains a gappy version of 'cialis'

Who generated the regular expression
/\bc.{0,2}i.{0,2}a.{0,2}l.{0,2}i.{0,2}s\b/i

a. Is it done manually, with people writing regexes and seeing how efficiently
they capture spam?
b. Is there an algorithm that examines a large corpus of spam and then comes
up with these regexes on its own?
c. Is it a combination of (a) and (b)?

I know scores for rules are generated using a neural network trained with
error back propagation
http://wiki.apache.org/spamassassin/HowScoresAreAssigned

But how are the rules generated themselves? 

Thnx
-- 
View this message in context: 
http://www.nabble.com/SpamAssassin-Ruleset-Generation-tp25773508p25773508.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



Re: SpamAssassin Ruleset Generation

2009-10-06 Thread RW
On Tue, 6 Oct 2009 11:08:28 -0700 (PDT)
poifgh abhinav.pat...@gmail.com wrote:

 
 I have a question about - understanding how are rulesets generated for
 ...
 a. Is it done manually with people writing regex to see how
 efficiently they capture spams?
 b. Is there an algorithm that identifies large corpus of spam and the
 comes up with these regex'es on its own?
 c. Is it a combination of (a), (b)?

The optional sought rules are autogenerated, the rest are manual.


Re: SpamAssassin Ruleset Generation

2009-10-06 Thread poifgh



RW-15 wrote:
 
 On Tue, 6 Oct 2009 11:08:28 -0700 (PDT)
 poifgh abhinav.pat...@gmail.com wrote:
 
 
 I have a question about - understanding how are rulesets generated for
 ...
 a. Is it done manually with people writing regex to see how
 efficiently they capture spams?
 b. Is there an algorithm that identifies large corpus of spam and the
 comes up with these regex'es on its own?
 c. Is it a combination of (a), (b)?
 
 The optional sought rules are autogenerated, the rest are manual.
 
 

Thnx - What are optional sought rules?




Re: SpamAssassin Ruleset Generation

2009-10-06 Thread Bowie Bailey
poifgh wrote:

 RW-15 wrote:
   
 On Tue, 6 Oct 2009 11:08:28 -0700 (PDT)
 poifgh abhinav.pat...@gmail.com wrote:

 
 I have a question about - understanding how are rulesets generated for
 ...
 a. Is it done manually with people writing regex to see how
 efficiently they capture spams?
 b. Is there an algorithm that identifies large corpus of spam and the
 comes up with these regex'es on its own?
 c. Is it a combination of (a), (b)?
   
 The optional sought rules are autogenerated, the rest are manual.
 

 Thnx - What are optional sought rules?
   

http://www.google.com/search?q=spamassassin+sought

-- 
Bowie


Re: SpamAssassin Ruleset Generation

2009-10-06 Thread poifgh



Bowie Bailey wrote:
 
 
 
 http://www.google.com/search?q=spamassassin+sought
 
:-D - Thnx




Re: SpamAssassin Ruleset Generation

2009-10-06 Thread poifgh



poifgh wrote:
 
 
 
 Bowie Bailey wrote:
 
 
 
 http://www.google.com/search?q=spamassassin+sought
 
 :-D - Thnx
 
 

Other than the sought rules, are all the rules manually generated? Are there
any statistics on how frequently new rules/regexes are adopted by
SpamAssassin? Who are the people who write them? Any details related to it?

thnx



Re: SIGCHLD query

2009-10-06 Thread Per Jessen
Martin Gregorie wrote:

 On Tue, 2009-10-06 at 16:46 +0200, Per Jessen wrote:
 Martin Gregorie wrote:
 
  What causes a spamd 3.2.5 child process to be terminated by
  receiving a SIGCHLD signal?
  
 
 A timeout in the child perhaps?
 
 That thought that may be the reason. It certainly seems to apply when
 a
 child runs longer than the time set by --timeout-child  but there are
 a few cases where a SIGCHLD is sent when the child has only run for a
 second or two. Its a pity the log message doesn't include the reason
 why the SIGCHLD was sent.

Martin, generally speaking, the parent can only report the signal and
that the child has gone away.  The child would have to report on why. 
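
As a rough illustration of that point, here is a minimal Python sketch (not spamd's Perl code). After a child terminates, the parent can read the wait status and tell a normal exit from death by a signal, but it learns nothing about why. `describe_exit` and `run_and_report` are made-up names; the negative-returncode convention is the one Python's subprocess module uses on POSIX.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into the little the parent can know."""
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


def run_and_report(argv) -> str:
    """Run a child, wait for it, and describe how it went away."""
    proc = subprocess.run(argv)
    return describe_exit(proc.returncode)


if __name__ == "__main__":
    # A child that exits on its own with status 3.
    print(run_and_report([sys.executable, "-c", "import sys; sys.exit(3)"]))
```

Anything beyond "exited" or "killed by signal X" (a timeout, a crash in a plugin, and so on) has to be logged by the child itself before it goes away.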


/Per Jessen, Zürich



Re: SpamAssassin Ruleset Generation

2009-10-06 Thread MySQL Student
Hi,

 Other than the sought rules, all the rules are manually generated? Is there
 any statistics on how frequently are new rules/regex adopted by
 spamassasssin? Who are the people who write them? Any details related to

Information on Justin Mason's SOUGHT rules is here:

http://taint.org/2007/08/15/004348a.html

Use sa-update to update your SA rules once or twice per day with the
new stuff. His ongoing development work is here:

http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/sandbox/jm/?sortby=date

HTH,
Alex


Re: SpamAssassin Ruleset Generation

2009-10-06 Thread John Hardin

On Tue, 6 Oct 2009, poifgh wrote:

Other than the sought rules, all the rules are manually generated? Is 
there any statistics on how frequently are new rules/regex adopted by 
spamassasssin? Who are the people who write them? Any details related to 
it?


Most of the rules are manually written by contributors such as myself. 
Some meta rules are generated by various means from existing rules - for 
example, the ADVANCE_FEE rules are generated using genetic algorithms to 
find effective combinations of simpler subrules that were manually 
generated.


New rules are added whenever a contributor works on them, and this is 
generally based on when they have time to do so, when they have new ideas, 
and when new forms of spam appear. Indirect contributors will post rules 
to the users list and a contributor may add them to the rules sandbox for 
testing and eventual inclusion in the base ruleset.


The CREDITS file in the sources should list all of the contributors. Some 
contributors may not have added their names to that file, though.


--
 John Hardin KA7OHZhttp://www.impsec.org/~jhardin/
 jhar...@impsec.orgFALaholic #11174 pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
 5 days since a sunspot last seen - EPA blames CO2 emissions


Re: OT bad news

2009-10-06 Thread mouss
Quanah Gibson-Mount wrote:
 --On Monday, October 05, 2009 11:50 PM +0200 mouss
 mo...@ml.netoyen.net wrote:
 
 Thomas Mullins wrote:
 We have been running Spamassassin for maybe eight years now.  But, my
 coworkers do not like OpenSource.  So they have finally complained
 enough that my boss is going to replace our reliable
 FreeBSD/Spamassassin boxes.  They are planning on purchasing something
 that runs ON Exchange.  What a bummer.



 and the problem is?

 if they want exchange, give them exchange. don't fight (directly), watch
 instead. take pleasure of the situation, get fun as you can. I
 personally took fun all day long in windows-only (and believe it or not,
 in linux-only) environments.


 that said, you can still try to explain that exchange should not be
 exposed to the internet. you still need a relay (such as
 freebsd/postfix).

 
 And once exchange falls over, show them Zimbra. ;)  Which uses
 postfix/SA/amavis, etc, and looks a lot like exchange... only better. ;)
 

If I have to choose between Zimbra and Exchange, I'll go for Exchange. But I
don't need to choose between the two. I want different components for
different tasks, and for many things, I go for web-oriented solutions.


Re: Spam Eating Monkey?

2009-10-06 Thread Warren Togami

On 10/04/2009 09:32 PM, Blaine Fleming wrote:

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Warren Togami wrote:

http://spameatingmonkey.com

Anyone have any experience using these DNSBL and URIBL's?

Is anyone from this site on this list?

I wonder if we should add these rules to the sandbox for masschecks as
well.


Since someone is bound to ask I figure I'll state right now that I have
no objections to the SEM lists being included in the masschecks.  In
fact, I'm quite curious.

I would also recommend adding AnonWhois.org to the list.



I'll add your existing rules to the Sandbox for testing.

But have you considered putting all the DNSBL's and URIBL's into 
aggregated zones so you can cut down on redundant queries?


http://wiki.junkemailfilter.com/index.php/Spam_DNS_Lists
For example, one DNSBL lookup here can respond with 127.0.0.[1-5] 
depending on which list it is.


Warren Togami
wtog...@redhat.com


Re: SpamAssassin Ruleset Generation

2009-10-06 Thread Matt Kettler
poifgh wrote:
 I have a question about - understanding how are rulesets generated for
 spamassassin.

 For example - consider the rule in 20_drugs.cf : 
 header SUBJECT_DRUG_GAP_C   Subject =~ /\bc.{0,2}i.{0,2}a.{0,2}l.{0,2}i.{0,2}s\b/i
 describe SUBJECT_DRUG_GAP_C Subject contains a gappy version of 'cialis'

 Who generated the regular expression
 /\bc.{0,2}i.{0,2}a.{0,2}l.{0,2}i.{0,2}s\b/i
   
Man, that's a good question. I wrote a large chunk of the rules in
20_drugs.cf, but not that one. ( I wrote the stuff near the bottom that
uses meta rules. ie:  __DRUGS_ERECTILE1 through DRUGS_MANYKINDS,
originally distributed as a separate set called antidrug.cf). As I
recall, there were 2 other people making drug rules, but it's been a
LONG time, and I forget who did it. Those rules were written in the
2004-2006 time frame when pharmacy spams were just hammering the heck
outa everyone.

 a. Is it done manually with people writing regex to see how efficiently they
 capture spams?
   
Yes. Many hours of reading spams, studying them, testing various regex
tweaks, checking for false positives, etc, etc.

mass-check is your friend for this kind of stuff.

One post from when I was developing this as a stand-alone set:

http://mail-archives.apache.org/mod_mbox/spamassassin-users/200404.mbox/%3c6.0.0.22.0.20040428132346.029d9...@opal.evi-inc.com%3e

Note: the comcast link mentioned in that message should be considered
DEAD. The antidrug set is no longer maintained separately from the
mainline ruleset, and hasn't been for years.


If you want to break the rules down a bit, here's some tips:

The rules are in general designed to detect common methods of obscuring
text by inserting spaces, punctuation, etc. between letters, and possibly
substituting other similar-looking characters for some of the letters
(W4R3Z style, etc).

The simple format would be to think of it in groupings. You end up using
a repeating pattern of (some representation of a character)(some kind of
gap sequence)(character)(gap)...etc.

.{0,2} is a gap sequence, although not one I prefer. I prefer
[_\W]{0,3} in most cases because it's a bit less FP-prone, but risks
missing things using small lower-case letters to gap.
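
The trade-off between the two gap sequences can be tried out with a minimal Python sketch. Python's `re` is close enough to Perl syntax for these patterns; `gappy` is a made-up helper, not part of SpamAssassin or mass-check.

```python
import re


def gappy(word: str, gap: str):
    """Join the letters of `word` with a gap sequence, inside word boundaries."""
    body = gap.join(re.escape(ch) for ch in word)
    return re.compile(r"\b" + body + r"\b", re.IGNORECASE)


# Same regex as SUBJECT_DRUG_GAP_C: any 0-2 characters between letters.
LOOSE = gappy("cialis", r".{0,2}")
# The stricter gap: only underscore or non-word characters, 0-3 of them.
STRICT = gappy("cialis", r"[_\W]{0,3}")

if __name__ == "__main__":
    for subject in ("c.i.a.l.i.s", "C_I_A_L_I_S", "cxixaxlxixs"):
        print(subject, bool(LOOSE.search(subject)), bool(STRICT.search(subject)))
```

The loose gap matches all three subjects. The strict gap matches the first two but not "cxixaxlxixs", which is the "lower-case letters to gap" case it gives up in exchange for fewer false positives.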

You also get replacements for characters in some of those, like [A4]
instead of just A. Or, more elaborately..  [a4\xe0-\...@]

So this mess:

body __DRUGS_ERECTILE1  
/(?:\b|\s)[_\W]{0,3}(?:\\\/|V)[_\W]{0,3}[ij1!|l\xEC\xED\xEE\xEF][_\W]{0,3}[a40\xe0-\...@][_\w]{0,3}[xyz]?[gj][_\W]{0,3}r[_\W]{0,3}[a40\xe0-\...@][_\w]{0,3}x?[_\W]{0,3}(?:\b|\s)/i


Could be broken down:

(?:\b|\s)                 - preamble, detecting space or word boundary.
[_\W]{0,3}                - gap
(?:\\\/|V)                - V
[_\W]{0,3}                - gap
[ij1!|l\xEC\xED\xEE\xEF]  - I
[_\W]{0,3}                - gap
[a40\xe0-\...@]           - A
[_\W]{0,3}                - gap
[xyz]?[gj]                - G (with optional extra garbage before it)
[_\W]{0,3}                - gap
r                         - just R :-)
[_\W]{0,3}                - gap
[a40\xe0-\...@]           - A
[_\W]{0,3}                - gap
x?                        - optional garbage
[_\W]{0,3}                - gap
(?:\b|\s)                 - suffix, detecting space or word boundary.

Which detects weird spacings and substitutions in the word Viagra.
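
The same character/gap/character grouping can be assembled programmatically. What follows is a minimal Python sketch, not the shipped rule. The "A" class is garbled in this archive copy, so `A_CLASS` below is a simplified stand-in (an assumption), and all names are made up for illustration.

```python
import re

GAP = r"[_\W]{0,3}"   # gap sequence between characters
EDGE = r"(?:\b|\s)"   # preamble/suffix: space or word boundary

# Assumption: simplified stand-in for the "A" class, which is garbled
# in the archived message and cannot be recovered from it.
A_CLASS = r"[a4@]"

PIECES = [
    r"(?:\\\/|V)",                # V, or \/ drawn with two slashes
    r"[ij1!|l\xEC\xED\xEE\xEF]",  # I and look-alikes
    A_CLASS,                      # A (stand-in)
    r"[xyz]?[gj]",                # G, with optional garbage before it
    r"r",                         # just R
    A_CLASS,                      # A (stand-in)
    r"x?",                        # optional garbage
]

# EDGE gap piece gap piece ... gap EDGE, as in the breakdown above.
PATTERN = re.compile(EDGE + GAP + GAP.join(PIECES) + GAP + EDGE, re.IGNORECASE)

if __name__ == "__main__":
    for text in ("V.1.A.G.R.A", "buy \\/iagra now", "vinegar"):
        print(repr(text), bool(PATTERN.search(text)))
```

Keeping the pieces in a list makes it easier to tweak one character class at a time and re-run a corpus check after each change.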


 But how are the rules generated themselves? 
   
Mostly meatware, except the sought rules others have mentioned.
 Thnx
   



Re: OT bad news

2009-10-06 Thread Royce Williams
On Tue, Oct 6, 2009 at 4:12 AM, Dan Schaefer d...@performanceadmin.com wrote:

 I'll have to repeat, for the original poster this isn't a technology
 vs technology argument.  If it was, his coworkers would be listing
 specific things Exchange does that FreeBSD/SA does not do.

 (Standing ovation on both emails)

Uncloaking to vigorously second.  Ted is so painfully right on that I
wish that it were otherwise (out of sympathy for the OP).

Royce


Re: Spam Eating Monkey?

2009-10-06 Thread Blaine Fleming
Warren Togami wrote:
 I'll add your existing rules to the Sandbox for testing.

Thank you!

 But have you considered putting all the DNSBL's and URIBL's into
 aggregated zones so you can cut down on redundant queries?

Actually, the uri red list is an aggregate zone of my uri black, red and
yellow lists.  The main reason I haven't merged the black list with any
of the other IP zones is because I haven't had enough user response on
the other lists yet.

Basically, the relevant zones are the SEM-URIRED and SEM-BLACK and each
of them needs to be its own query because of the two completely
different datasets.

--Blaine



Re: Spam Eating Monkey?

2009-10-06 Thread Warren Togami

On 10/06/2009 11:15 PM, Blaine Fleming wrote:

Warren Togami wrote:

I'll add your existing rules to the Sandbox for testing.


Thank you!


But have you considered putting all the DNSBL's and URIBL's into
aggregated zones so you can cut down on redundant queries?


Actually, the uri red list is an aggregate zone of my uri black, red and
yellow lists.  The main reason I haven't merged the black list with any
of the other IP zones is because I haven't had enough user response on
the other lists yet.


You are misunderstanding the question.  A single DNS query could return
different numbers, each meaning a hit on a different list.  Your lists
that are subsets or supersets of other lists can easily use this.  The
querying software need only know what each result means.
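
A minimal sketch of what the querying side might look like, in Python. The zone name and the answer-to-list mapping are hypothetical, invented for the example; no real DNSBL is assumed to use these values, and the DNS lookup itself is left out:

```python
import ipaddress

# Hypothetical mapping from the A-record answer to the list that was hit.
ANSWER_MEANINGS = {
    "127.0.0.2": "black",
    "127.0.0.3": "red",
    "127.0.0.4": "yellow",
}


def dnsbl_query_name(ip, zone):
    """Reverse the IPv4 octets and append the zone (usual DNSBL query form)."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def lists_hit(answers, meanings=ANSWER_MEANINGS):
    """Translate the A records returned by one query into list names."""
    return sorted(meanings[a] for a in answers if a in meanings)
```

For example, dnsbl_query_name("192.0.2.99", "combined.dnsbl.example") builds the name to look up, and lists_hit() is then applied to whatever A records that one lookup returns.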


Warren


consolidating DNSBLs into a single query (was Spam Eating Monkey?)

2009-10-06 Thread Rob McEwen
Warren Togami wrote:
 You are misunderstanding the question.  A single DNS query could
 return different numbers, each meaning a hit on a different list.
 Your lists that are subsets or supersets of other lists can easily use
 this.  The querying software need only know what each result means.

Not saying that this is a bad idea, but it does have its limitations.
For example, some lists run to hundreds of megabytes, and
getting the whole file rsynced and updated can take several
minutes or more. Often, such lists update only once or twice per hour, if even
that often.

In contrast, some lists are smaller and faster reacting and update every
few minutes.

Trying to merge all such lists into a single list every few minutes
is no trivial task in terms of having enough CPU cycles and RAM to get
that done correctly and within a reasonably short time.

Likewise, doing the merge hourly loses the benefit of some of the
smaller-footprint faster-reacting lists which can react to emerging spam
threats faster.

Not saying such a consolidation can't be done... and maybe a few
tradeoffs here are worthwhile? But if these issues are not dealt with
smartly and competently, then one could easily find that the
all-in-one comprehensive DNSBL is not as effective as querying the
lists separately.

Also, this loses the ability to *score* on multiple lists... unless you
use a bitmasked scoring system whereby one list gets assigned .2,
another .4, another .8, on to .128. But that leaves a maximum of
only 7 lists. Sure, you can add more than 7 by employing other octets in
the answer IP, but that only severely complicates matters.

And as it stands, you'd also have the complexity of getting the spam
filter to parse, understand, and react properly to those bitmasks.
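
To make the bitmask idea concrete, here is a small Python sketch of the decoding step. The list names and scores are hypothetical; only the .2 through .128 bit layout in the last octet comes from the description above:

```python
# Hypothetical bit assignments and scores. Bit value 1 is left unused,
# matching the .2 through .128 scheme described above.
LIST_BITS = {
    2: ("list_a", 1.0),
    4: ("list_b", 0.5),
    8: ("list_c", 2.0),
}


def decode_bitmask(answer):
    """Return (names, total_score) for an answer such as '127.0.0.6'."""
    last_octet = int(answer.rsplit(".", 1)[1])
    hits = [
        (name, score)
        for bit, (name, score) in sorted(LIST_BITS.items())
        if last_octet & bit
    ]
    names = [name for name, _ in hits]
    total = sum(score for _, score in hits)
    return names, total
```

An answer of 127.0.0.6 has bits 2 and 4 set, so it decodes to list_a and list_b with their scores added together.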

-- 
Rob McEwen
http://dnsbl.invaluement.com/
r...@invaluement.com
+1 (478) 475-9032




Re: consolidating DNSBLs into a single query (was Spam Eating Monkey?)

2009-10-06 Thread Royce Williams
On Tue, Oct 6, 2009 at 8:19 PM, Rob McEwen r...@invaluement.com wrote:
 Warren Togami wrote:
 You are misunderstanding the question.  A single DNS query could
 return different numbers, each meaning a hit on a different list.
 Your lists that are subsets or supersets of other lists can easily use
 this.  The querying software need only know what each result means.

 Not saying that this is a bad idea, but it does have its limitations.
 For example, some lists run to hundreds of megabytes, and
 getting the whole file rsynced and updated can take several
 minutes or more. Often, such lists update only once or twice per hour, if even
 that often.

Hmm ... interesting.  If implemented via rbldnsd, each list could be
maintained in a separate file, and since rbldnsd can be configured to
build a single zone using multiple files on the back end, different
lists could be refreshed at different rates.
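
A toy Python model of that arrangement, to show why separate back-end data helps. This is not how rbldnsd is implemented or configured; CombinedZone and the list names are invented for the sketch, and the bit values are assumptions:

```python
class CombinedZone:
    """Keep each list as its own set so it can be replaced independently."""

    def __init__(self, bits):
        self.bits = dict(bits)  # list name -> bit value in the answer
        self.entries = {name: set() for name in self.bits}

    def refresh(self, name, new_entries):
        """Swap in fresh data for one list; the others are untouched."""
        self.entries[name] = set(new_entries)

    def lookup(self, ip):
        """OR together the bits of every list containing ip (0 = not listed)."""
        mask = 0
        for name, bit in self.bits.items():
            if ip in self.entries[name]:
                mask |= bit
        return mask
```

A large, slow list and a small, fast one can then each call refresh() on their own schedule, and lookup() always answers from whatever data is current for each.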

Your comments about tradeoffs and bitmasking still stand, of course.

Royce