Re: Non English Spam

2006-10-21 Thread Warren Block

On Fri, 20 Oct 2006, Erik Norgaard wrote:


You can't check the white list before using RBL in Sendmail?


Yes, you can, with entries in access.db marked with OK.

-Warren Block * Rapid City, South Dakota USA

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Non English Spam

2006-10-20 Thread Ted Mittelstaedt

- Original Message - 
From: Erik Norgaard [EMAIL PROTECTED]
To: Ted Mittelstaedt [EMAIL PROTECTED]
Cc: Beech Rintoul [EMAIL PROTECTED];
freebsd-questions@freebsd.org
Sent: Tuesday, October 17, 2006 2:22 AM
Subject: Re: Non English Spam



 Also this means that later filtering on the first Received field is
 double work: You already accepted the mail based on that information.

 In short: Writing header filtering rules for the Received field is
 simply a waste of time and proof of inefficiency.


I agree with this but unfortunately the real world often screws this up.

For example, SpamCop is one of the most effective blacklists on the
Internet because of its high user participation.  Unfortunately, it
repeatedly blocks yahoomail, craigslist, and ebay because spammers
hate it and try to stuff it up so as to get people to stop using it.

As a result, you cannot use Spamcop to reject at the HELO.  Instead
you have to post-filter the mail and do your spamcop lookups, so
you can exempt domains like ebay that are legitimate.
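
A minimal Python sketch of that ordering (whitelist first, blacklist lookup second), for illustration only. The zone name bl.spamcop.net, the function names, and the idea of passing the DNS lookup in as a callable are assumptions of this sketch, not anyone's actual setup in this thread. Note too that a sender domain can be forged, so a real exemption list would more likely key on the connecting IP.

```python
import ipaddress


def dnsbl_query_name(ip, zone="bl.spamcop.net"):
    """Build the DNS name queried for an IPv4 address in a DNSBL zone."""
    # DNSBLs are queried with the octets reversed, followed by the zone.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_whitelisted(sender_domain, whitelist):
    """True if the domain or any of its parent domains is whitelisted."""
    labels = sender_domain.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in whitelist for i in range(len(labels)))


def classify(ip, sender_domain, whitelist, listed):
    """Check the whitelist first; only then consult the blacklist.

    `listed` is a hypothetical callable that takes a DNSBL query name and
    returns True when that name resolves (i.e. the address is listed).
    """
    if is_whitelisted(sender_domain, whitelist):
        return "accept"
    return "reject" if listed(dnsbl_query_name(ip)) else "accept"
```

In a live filter the `listed` callable might wrap socket.gethostbyname() and treat a successful resolution as "listed"; it is kept pluggable here so the logic can be tried without network access.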

 Just like Sendmail, Postfix is not designed for spam filtering. Postfix
 provides simple filtering mechanisms; by keeping it simple, Postfix remains
 an effective and reliable MTA that doesn't suffer the track record of
 security bugs Sendmail does.

 When the native filters do not suffice you can combine them with any number
 of policy services: external filtering mechanisms such as postgrey,
 SpamAssassin etc. This design is clean, reliable and easy to manage.


Same for Sendmail, you can use milters to add all this stuff in.  Or you
can do it in the local delivery agent.


 OP requested a way to filter out the spam in foreign character sets
 because for some reason it was not caught by SpamAssassin or
 procmail. I gave a solution that solves that problem, and I mentioned
 the problem of false positives for this list.

 Rather than get pissed, do try to offer an alternative solution to a
 real problem.


There really is no solution.  Fundamentally, well written spam is
not distinguishable from non-spam by a computer.  What has saved our asses
so far is that there's not a spammer alive who has been able to resist the
temptation to use bold, colors, blinking text, hot phrases, and other
attention-getting devices in their spams.  Since you can program a computer
to look for the attention-getting stuff, what has happened is a little
social engineering.

Most people today have abandoned the use of attention-getting devices
in their e-mails because when they use HTMLized text and such,
their mails tend to get blocked as spam by everyone and their dog.
So, the spam content filters can still distinguish spam from non-spam
by looking for these differences.

But it is only a matter of time before the spammers all wake up and
smell the coffee, and start using standard ASCII pure text for their
spams, and then all these charset filters you're loving will go gurgling
down the drain.

Granted, that might make their spams less effective so they might
get fewer respondents.

Frankly, I think there is no technical solution, I think there are only
political solutions.  We've already made spam illegal in the US, and
the CAN-SPAM act defines the advertised party in the spams
also as a spammer, in addition to the actual spammer sending the
stuff.

It would be child's play for the FBI to work with the major ISPs
to create thousands of dummy e-mail addresses and use these
to capture spam runs.  Then they just go arrest the people in the
company that is being advertised and hang a few of them high.
There's no need to even go after the actual spammers themselves.

When this happens enough times, the supply of companies that
are willing to pay spammers to send spam will dry up, and the
spammers will go find some other criminal activity to engage in.

But, the FBI isn't doing this because many of the companies that
are hiring spammers have lots of money, and that gives them lots of
political power.  So, the will to curb spam just isn't there even though
money earned by spammers is undoubtedly going into organized
crime, feeding terror cells, and other more nasty stuff.


 I asked politely if there was any consensus or best practices etc. on
 this issue. You have the regular mail on how to get the best results;
 there are recommendations on how to use this list, which are not enforced
 but only serve as guidelines.

 I don't try to force people to use particular character sets, I merely
 ask whether such a recommendation exists for the best results when using
 the list, in which case filtering on charsets may be the least
 imperfect solution (until you share your perfect filter, that is).


You're continuing to try to muddy the issue by implying that personal
filters are the same as requirements to post.

You snipped all my explanation of what the differences are and responded
with a snotty request for a perfect filter, when I never said I ever had
one.

As I already stated, what people do on their own mailserver

Re: Non English Spam

2006-10-20 Thread Erik Norgaard

Ted Mittelstaedt wrote:


Also this means that later filtering on the first Received field is
double work: You already accepted the mail based on that information.

In short: Writing header filtering rules for the Received field is
simply a waste of time and proof of inefficiency.


I agree with this but unfortunately the real world often screws this up.

For example, SpamCop is one of the most effective blacklists on the
Internet because of its high user participation.  Unfortunately, it
repeatedly blocks yahoomail, craigslist, and ebay because spammers
hate it and try to stuff it up so as to get people to stop using it.


You can't check the white list before using RBL in Sendmail? Well, you
can with Postfix; you can even control whether checks are done when the
entire envelope has been received or when the connection is established.
Maybe Postfix isn't that crappy after all :)


Of course, maintaining white lists is only practically possible for a 
limited number of hosts.



OP requested a way to filter out the spam in foreign character sets
because for some reason it was not caught by SpamAssassin or
procmail. I gave a solution that solves that problem, and I mentioned
the problem of false positives for this list.

Rather than get pissed, do try to offer an alternative solution to a
real problem.


There really is no solution.  Fundamentally, well written spam is
not distinguishable from non-spam by a computer.  What has saved our asses
so far is that there's not a spammer alive who has been able to resist the
temptation to use bold, colors, blinking text, hot phrases, and other
attention-getting devices in their spams.  Since you can program a computer
to look for the attention-getting stuff, what has happened is a little
social engineering.


True - or the reverse: novice users will send their birthday
invitations with flags and colors etc., so you can't naively reject HTML mail.


Frankly, I think there is no technical solution, I think there are only
political solutions.  We've already made spam illegal in the US, and
the CAN-SPAM act defines the advertised party in the spams
also as a spammer, in addition to the actual spammer sending the
stuff.


Actually, I do think there is a technical solution, but the problem is
that the cost of implementation is at the sender's end, and the cost of
spam is at the recipient's end.

The political action needed is to move the cost onto the sender's end -
I'm not talking about adding a cost for sending individual mails but
moving liability: You are responsible for what you send.


Basically, it's like with cars: You have insurance for your car, and even
if a thief steals it, your insurance covers accidents that the car may be
involved in.


Once liability moves to the source, anyone upstream in the mail
delivery chain will make sure that they can pass on liability to someone
further up, and if they can't, they will implement the controls to limit
illicit mailing to reduce the risk.



I asked politely if there was any consensus or best practices etc. on
this issue. You have the regular mail on how to get the best results;
there are recommendations on how to use this list, which are not enforced
but only serve as guidelines.

I don't try to force people to use particular character sets, I merely
ask whether such a recommendation exists for the best results when using
the list, in which case filtering on charsets may be the least
imperfect solution (until you share your perfect filter, that is).


You're continuing to try to muddy the issue by implying that personal
filters are the same as requirements to post.


No, my idea is that if there is consensus that subscribers should post
in, say, ASCII for the best results, then one could more reasonably filter
other character sets because these are unlikely to occur. And, since
foreign character sets are associated with language, other subscribers
sharing that language could take care of that off list - just as if someone
writes in a foreign language.



You snipped all my explanation of what the differences are and responded
with a snotty request for a perfect filter, when I never said I ever had
one.


I snipped, not to be rude, but because I felt you were getting emotional.


As I already stated, what people do on their own mailserver is their
business.  If they want to filter Asian charsets, then fine.  Go ahead.
But, telling people they can't use them when posting to the list is
crossing the line.

Certainly a "best results when using the list" document is a good thing.
But, that is a recommendation, not a requirement.  The response that
got me pissed was speculating that the list server should filter on Asian
charsets, and we should order, not recommend, that people not use Asian
charsets.  I'm glad to see you're backwatering from that.


I never intended to imply that the FreeBSD list server should filter
messages more than is done now. If you go back to my first post, I asked:

What is the recommended 

Re: Non English Spam

2006-10-17 Thread Ted Mittelstaedt

- Original Message - 
From: Erik Norgaard [EMAIL PROTECTED]
To: Ted Mittelstaedt [EMAIL PROTECTED]
Cc: Beech Rintoul [EMAIL PROTECTED];
freebsd-questions@freebsd.org
Sent: Sunday, October 15, 2006 3:47 AM
Subject: Re: Non English Spam


 Ted Mittelstaedt wrote:

  I have noted, however, that some subscribers to this list write English
  encoded in one of the above character sets, I don't know enough about
  the character set definition, but it seems that English characters are
  a subset of any character set?
 
  What is the recommended policy here? Should subscribers be advised to
  change character set when posting to the list?
 
  No.  It's the responsibility of the person doing the filtering - in this
  case you - to exempt any known good e-mail sender from your filters.

  You know damn well that legitimate mailing list mail comes from
 
  mx2.freebsd.org (mx2.freebsd.org [216.136.204.119])
 
  it's right in the headers of the messages on the list.

 First: You know all too well that filtering based on Received header
 fields is not reliable - any decent spammer knows how to forge that.

Spammers cannot forge the Received header that your own mailserver
puts into the received message.  The first Received line of the message
is always legitimate.  You can also turn on the Sendmail flag to put in
the envelope address if you have multiple aliases to a mailbox that
you want to see.

 Accepting mail from a particular host should be done even before the
 mail delivery starts.


Don't know what you're talking about here.

 Second: If you know postfix, you also know that header filtering is
 independent of other checks; even the results of filtering on individual
 header lines are independent.

 So the ideal you mention is not an option until a complete public list
 of authorized mail servers is available and all mail relayed through
 these requires authentication.


I don't know Postfix.  So what you're saying is Postfix is so defective
that you can't use it for filtering?  No wonder I never bothered to
deal with it.

And, this isn't true anyway.  You can easily tell with a little sleuthing
what all of the mail emitters are for the FreeBSD mailing lists.  Many
mailing list managers, in fact, go to the trouble of posting publicly
what their mailservers are.  And if the transmitting domain really
has their shit together, they will have published SPF records in their
DNS that will tell you what the authorized mailservers for that domain
are.  Sendmail has an SPF milter and I believe SpamAssassin can also
use these for weighting.  (I'm too lazy to check for sure right now)
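
To make the SPF idea concrete, here is a rough Python sketch that checks a connecting address against the literal ip4:/ip6: mechanisms of an SPF record. It is a deliberately partial illustration: the function name and the sample record are made up, the record text is assumed to have been fetched from DNS already, and mechanisms such as include:, a, mx and redirect= are simply ignored, so it is no substitute for a real SPF library or milter.

```python
import ipaddress


def spf_permits(record, client_ip):
    """Return True if client_ip matches an ip4:/ip6: mechanism in `record`.

    Only literal address mechanisms with the default "+" qualifier are
    handled; everything else in the record is skipped in this sketch.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return False
    addr = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        term = term.lstrip("+")
        kind, _, value = term.partition(":")
        if kind.lower() in ("ip4", "ip6") and value:
            network = ipaddress.ip_network(value, strict=False)
            if addr.version == network.version and addr in network:
                return True
    return False


# Hypothetical record using documentation address ranges.
EXAMPLE_RECORD = "v=spf1 ip4:192.0.2.0/24 ip6:2001:db8::/32 -all"
```

A whitelist built this way only tells you which hosts a domain claims as its senders; whether to exempt them from charset filtering is still a local policy decision.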

 Or do you have the solution that does not imply accepting any of a
 myriad of character sets?

 I'd be happy to implement that, but I don't want to open my mail server
 to receive mail I have no means of reading and understanding just
 because it is RFC compliant.


You open your mailserver to known, whitelisted, legitimate sending
servers, and let everyone else deal with the charset filtering.  You know
you're going to accept mail from freebsd.org (or you tell your users to
tell you if they are) and you exempt these servers from filtering.

  You have no right to
  force other people to conform to what you feel is acceptable formatting
  of their message as long as they meet the SMTP rfc standards.  That's
  why we have RFC's.

 You know perfectly well that content filtering is not based on the
 RFC's on SMTP but rather on the Internet Message Format and various
 RFC's on MIME - but I assume that you meant to refer to these.


Content filtering and message charsets aren't the same thing.  A content
filter checks for "Make Money FAST" and other obvious spam content.
You don't like Viagra?  That's a content filter that takes care of that.

However, marking a non-spam message as spam solely because it's written in
another language that you don't read - that's not a content filter.  There's
nothing in the content of that message that is spam.  Thus you have no moral
right to force mailing list users to conform to a specific language.
Certainly, you can say "I don't know Spanish so I will just set up a filter
to delete anything I get written in Spanish", but you're crossing the line
when you start telling people they can't post Spanish to a mailing list.

 Basically what you say here is that spammers have every right to flood
 mail servers as long as they do so compliant with the RFC's?


I'm saying that you don't have the right to force other people to modify
their content on messages that AREN'T spam just because your spam
filters are too piss-poor to differentiate between an Asian charset message
that is spam, and an Asian charset message that is a legitimate message.

 I don't force anyone to conform to any arbitrary standards that I decide
 upon, but I have every legitimate right to reject anything that doesn't
 conform to my arbitrary standards.


No argument there - but you're crossing the line (or the other poster is
crossing the line

Re: Non English Spam

2006-10-17 Thread Erik Norgaard

Ted Mittelstaedt wrote:


Spammers cannot forge the Received header that your own mailserver
puts into the received message.  The first Received line of the message
is always legitimate.


Please read my reply to Ian, who made exactly the same comment. The
Received headers are useless for filtering.



Accepting mail from a particular host should be done even before the
mail delivery starts.


Don't know what you're talking about here.


The first Received header line, which as you correctly mention is (the
only) reliable one, is inserted by your own server based on the info from
the connection being established and the HELO command.


In this case you can decide to accept or reject the mail before 
accepting the DATA. This is more efficient as you don't waste bandwidth 
receiving data you will later reject.


Also this means that later filtering on the first Received field is 
double work: You already accepted the mail based on that information.


In short: Writing header filtering rules for the Received field is
simply a waste of time and proof of inefficiency.



Second: If you know postfix, you also know that header filtering is
independent of other checks; even the results of filtering on individual
header lines are independent.


I don't know Postfix.  So what you're saying is Postfix is so defective
that you can't use it for filtering?  No wonder I never bothered to
deal with it.


Just like Sendmail, Postfix is not designed for spam filtering. Postfix
provides simple filtering mechanisms; by keeping it simple, Postfix remains
an effective and reliable MTA that doesn't suffer the track record of
security bugs Sendmail does.


When the native filters do not suffice you can combine them with any number
of policy services: external filtering mechanisms such as postgrey,
SpamAssassin etc. This design is clean, reliable and easy to manage.


I mentioned a solution using the mechanisms supported natively by
Postfix. OP's problem was that SpamAssassin and procmail did not catch
these mails.



Basically what you say here is that spammers have every right to flood
mail servers as long as they do so compliant with the RFC's?


I'm saying that you don't have the right to force other people to modify
their content on messages that AREN'T spam just because your spam
filters are too piss-poor to differentiate between an Asian charset message
that is spam, and an Asian charset message that is a legitimate message.


Call it piss-poor, but it is very effective, and simple to implement. If 
you have an effective alternative please do share.


OP requested a way to filter out the spam in foreign character sets
because for some reason it was not caught by SpamAssassin or
procmail. I gave a solution that solves that problem, and I mentioned
the problem of false positives for this list.


Rather than get pissed, do try to offer an alternative solution to a 
real problem.



I don't force anyone to conform to any arbitrary standards that I decide
upon, but I have every legitimate right to reject anything that doesn't
conform to my arbitrary standards.


No argument there - but you're crossing the line (or the other poster is
crossing the line) when you're talking about telling list subscribers to
change charsets when they post.


I think you misread my original post. I brought up the issue exactly 
because filtering on charsets causes false positives whichever way you 
do it.


I don't have a particular desire to throw away legitimate mail, in fact
I'd like to solve that problem (and I think OP wants that too), but so
far you have not contributed a working alternative.


I asked politely if there was any consensus or best practices etc. on
this issue. You have the regular mail on how to get the best results;
there are recommendations on how to use this list, which are not enforced
but only serve as guidelines.


I don't try to force people to use particular character sets, I merely
ask whether such a recommendation exists for the best results when using
the list, in which case filtering on charsets may be the least
imperfect solution (until you share your perfect filter, that is).


Cheers, Erik
--
Ph: +34.666334818  web: http://www.locolomo.org
X.509 Certificate: http://www.locolomo.org/crt/8D03551FFCE04F0C.crt
Key ID: 69:79:B8:2C:E3:8F:E7:BE:5D:C3:C3:B1:74:62:B8:3F:9F:1F:69:B9


Re: Non English Spam

2006-10-15 Thread Beech Rintoul
On Saturday 14 October 2006 16:58, Ted Mittelstaedt wrote:
 - Original Message -
 From: Erik Norgaard [EMAIL PROTECTED]
 To: Beech Rintoul [EMAIL PROTECTED]
 Cc: freebsd-questions@freebsd.org
 Sent: Saturday, October 14, 2006 5:38 AM
 Subject: Re: Non English Spam

  I have noted, however, that some subscribers to this list write English
  encoded in one of the above character sets, I don't know enough about
  the character set definition, but it seems that English characters are a
  subset of any character set?
 
  What is the recommended policy here? Should subscribers be advised to
  change character set when posting to the list?

 No.  It's the responsibility of the person doing the filtering - in this
 case you - to exempt any known good e-mail sender from your filters.

 You know damn well that legitimate mailing list mail comes from

 mx2.freebsd.org (mx2.freebsd.org [216.136.204.119])

 it's right in the headers of the messages on the list.  You have no right
 to force other people to conform to what you feel is acceptable formatting
 of their message as long as they meet the SMTP rfc standards.  That's why
 we have RFC's.

 If everyone did what you're proposing then senders would have hundreds
 of different rules they would have to follow, over and above the normal
 RFCs.

Ted, thank you for the bit of sanity. 

As for me, Dr. Seaman's suggestions (earlier in this thread) have brought
things back to tolerable levels. I have had many responses and most probably 
work. But, I need a solution I can install on client machines (and my own) 
that doesn't require exotic scripts.

I thoroughly parsed my maillog and so far nothing important has landed
in /dev/null. 

Once again, thanks to everyone who responded.

Beech


-- 

---
Beech Rintoul - Sys. Administrator - [EMAIL PROTECTED]
/\   ASCII Ribbon Campaign  | Alaska Paradise
\ / - NO HTML/RTF in e-mail  | 201 East 9Th Avenue Ste.310
 X  - NO Word docs in e-mail | Anchorage, AK 99501
/ \  - Please visit Alaska Paradise - http://www.alaskaparadise.com
---


Re: Non English Spam

2006-10-15 Thread Erik Norgaard

Ted Mittelstaedt wrote:


I have noted, however, that some subscribers to this list write English
encoded in one of the above character sets, I don't know enough about
the character set definition, but it seems that English characters are a
subset of any character set?

What is the recommended policy here? Should subscribers be advised to
change character set when posting to the list?


No.  It's the responsibility of the person doing the filtering - in this
case you - to exempt any known good e-mail sender from your filters.



You know damn well that legitimate mailing list mail comes from

mx2.freebsd.org (mx2.freebsd.org [216.136.204.119])

it's right in the headers of the messages on the list.


First: You know all too well that filtering based on Received header 
fields is not reliable - any decent spammer knows how to forge that.
Accepting mail from a particular host should be done even before the 
mail delivery starts.


Second: If you know postfix, you also know that header filtering is 
independent of other checks; even the results of filtering on individual
header lines are independent.


So the ideal you mention is not an option until a complete public list 
of authorized mail servers is available and all mail relayed through 
these requires authentication.


Or do you have the solution that does not imply accepting any of a 
myriad of character sets?


I'd be happy to implement that, but I don't want to open my mail server 
to receive mail I have no means of reading and understanding just 
because it is RFC compliant.



You have no right to
force other people to conform to what you feel is acceptable formatting
of their message as long as they meet the SMTP rfc standards.  That's
why we have RFC's.


You know perfectly well that content filtering is not based on the
RFC's on SMTP but rather on the Internet Message Format and various 
RFC's on MIME - but I assume that you meant to refer to these.


Basically what you say here is that spammers have every right to flood 
mail servers as long as they do so compliant with the RFC's?


I don't force anyone to conform to any arbitrary standards that I decide 
upon, but I have every legitimate right to reject anything that doesn't 
conform to my arbitrary standards.


Yet, it is somewhat implicit that this is an English language list; anyone
writing in a different language may be lucky to find someone who can
respond in their language, but is just as often referred to one of the
language specific lists - if their message is not simply ignored.


So we do actually impose some arbitrary rule on subscribers, namely to 
write in English. Given that we find it reasonable to impose such a 
rule, then why is it unreasonable to impose that they should abstain 
from obscure non-English character sets?


I was hoping to find a way that we can all get along, I find it kind of 
useless to waste my resources on mail written in languages that I have 
no means of interpreting.



If everyone did what you're proposing then senders would have hundreds
of different rules they would have to follow, over and above the normal
RFCs.


Well, in real life as well as on-line we have thousands of rules and 
customs, implicit or written, on communication and gestures.


There are best practices on how to communicate in e-mail and on mailing
lists, usage of smileys and other types of mood-expression, and 
proclaimed best practices on how to quote.


You regularly see people complaining about top posting. Then there is line
wrapping, or people who don't delete the trailing message part that they
don't reply to etc.


I don't see a recommendation on character sets as much different.

Cheers, Erik


Re: Non English Spam

2006-10-15 Thread Ian Smith
On Sun, 15 Oct 2006 [EMAIL PROTECTED] wrote:
  Message: 2
  Date: Sun, 15 Oct 2006 12:47:37 +0200
  From: Erik Norgaard [EMAIL PROTECTED]

  Ted Mittelstaedt wrote:
  
   I have noted, however, that some subscribers to this list write English
   encoded in one of the above character sets, I don't know enough about
   the character set definition, but it seems that English characters are a
   subset of any character set?
  
   What is the recommended policy here? Should subscribers be advised to
   change character set when posting to the list?
   
   No.  It's the responsibility of the person doing the filtering - in this
   case you - to exempt any known good e-mail sender from your filters.
  
   You know damn well that legitimate mailing list mail comes from
   
   mx2.freebsd.org (mx2.freebsd.org [216.136.204.119])
   
   it's right in the headers of the messages on the list.
  
  First: You know all too well that filtering based on Received header 
  fields is not reliable - any decent spammer knows how to forge that.
  Accepting mail from a particular host should be done even before the 
  mail delivery starts.

Ted's talking about the _first_ Received header, see mine below.  It's
the only one you _can_ rely on, assuming your mailserver isn't lying to
you.  Subsequent headers, sure, all can be faked, trust no one .. :)

  Received: from mx2.freebsd.org (mx2.freebsd.org [216.136.204.119])
   by gaia.nimnet.asn.au (8.8.8/8.8.8R1.4) with ESMTP id WAA18000
   for [EMAIL PROTECTED]; Sun, 15 Oct 2006 22:02:19 +1000 (EST)
   (envelope-from [EMAIL PROTECTED])

There's the verified IP address of the connecting peer mailserver, that
IP's reverse resolution from DNS, and the HELO presented.  Any and all
of which can be analysed, looked up in maps, blacklisted, whitelisted,
or filtered any way you want, no? 
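
For what it's worth, pulling those three pieces out of the topmost Received header can be sketched in a few lines of Python. This is only an illustration: the regular expression assumes the Sendmail-style layout shown above ("from HELO (rDNS [IP])"), other MTAs write the line differently, and the function names are invented for this example.

```python
import re
from email import message_from_string

# Matches the Sendmail-style "from HELO (rDNS [IP])" prefix of a Received value.
RECEIVED_RE = re.compile(
    r"from\s+(?P<helo>\S+)\s+"
    r"\((?P<rdns>[^\s\[\]()]+)?\s*\[(?P<ip>[0-9A-Fa-f.:]+)\]\)"
)


def first_received(raw_message):
    """Return the topmost Received value, the one added by your own server."""
    values = message_from_string(raw_message).get_all("Received") or []
    return values[0] if values else None


def parse_received(header_value):
    """Extract HELO name, reverse-DNS name and peer IP, or None if no match."""
    unfolded = " ".join(header_value.split())  # undo header line folding
    match = RECEIVED_RE.search(unfolded)
    return match.groupdict() if match else None
```

The resulting dict (keys helo, rdns, ip) can then be fed to whatever whitelist, blacklist or map lookup you prefer.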

  Second: If you know postfix, you also know that header filtering is 
  independent of other checks; even the results of filtering on individual
  header lines are independent.

Does that mean you can't black/grey/whitelist by connecting mailserver?

  So the ideal you mention is not an option until a complete public list 
  of authorized mail servers is available and all mail relayed through 
  these requires authentication.

That's the 'solution' the mega players appear to be proposing.  And who
then authorises whom to run mailservers?  What about, er, us?  Shudder. 

  Or do you have the solution that does not imply accepting any of a 
  myriad of character sets?
  
  I'd be happy to implement that, but I don't want to open my mail server 
  to receive mail I have no means of reading and understanding just 
  because it is RFC compliant.

Like anyone, you can reject any mail you don't fancy, for whatever
reason you don't want it.  That doesn't require proposing that others
should do likewise, as in wanting to specify 'standards' for lists.

As Ted pointed out, various people often post perfectly intelligible
messages in English in the various FreeBSD lists, reporting non-Roman
charsets.  I could mention one regular poster (and committer) whose
messages provide no charset information at all :)
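
A small Python sketch shows why the declared charset alone is a weak signal: it reports the charset a message declares and, separately, whether the body actually contains anything beyond plain ASCII. It assumes a single-part message and the function name is invented for this example. ASCII is a subset of most of the charsets in question (EUC-KR, Big5, the ISO-8859 family, UTF-8), though not of every encoding in existence.

```python
from email import message_from_bytes


def charset_report(raw_message):
    """Return (declared_charset, body_is_pure_ascii) for a single-part message."""
    msg = message_from_bytes(raw_message)
    declared = msg.get_content_charset()  # None when no charset is declared
    body = msg.get_payload(decode=True) or b""  # None for multipart messages
    return declared, body.isascii()
```

A message declared as euc-kr whose body is pure ASCII is, for reading purposes, indistinguishable from one declared as us-ascii, which is the case of English posts carrying a non-Roman charset label.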

   You have no right to
   force other people to conform to what you feel is acceptable formatting
   of their message as long as they meet the SMTP rfc standards.  That's
   why we have RFC's.
  
  You know perfectly well that content filtering is not based on the
  RFC's on SMTP but rather on the Internet Message Format and various 
  RFC's on MIME - but I assume that you meant to refer to these.
  
  Basically what you say here is that spammers have every right to flood 
  mail servers as long as they do so compliant with the RFC's?

Have you noticed a lot of non-Roman charset spam on the FreeBSD lists?

  I don't force anyone to conform to any arbitrary standards that I decide 
  upon, but I have every legitimate right to reject anything that doesn't 
  conform to my arbitrary standards.

Of course.

  Yet, it is somewhat implicit that this is an English language list; anyone
  writing in a different language may be lucky to find someone who can
  respond in their language, but is just as often referred to one of the
  language specific lists - if their message is not simply ignored.

We're not - with respect to suggesting 'rules' for these lists - talking
about non English language messages.  As you say, they get dealt with,
often offlist, by someone helpful who knows that language.  So this is
about whether to 'enforce' particular charsets for messages in English.

  So we do actually impose some arbitrary rule on subscribers, namely to 
  write in English. Given that we find it reasonable to impose such a 
  rule, then why is it unreasonable to impose that they should abstain 
  from obscure non-English character sets?

Because it's unnecessary, as well as arbitrary, to filter list messages
by charset alone as an unassociated variable.  Sure, it might be a hint
in the mix to give 

Re: Non English Spam

2006-10-15 Thread Erik Norgaard

Ian Smith wrote:


Ted's talking about the _first_ Received header, see mine below.  It's
the only one you _can_ rely on, assuming your mailserver isn't lying to
you.  Subsequent headers, sure, all can be faked, trust no one .. :)


Filtering on the Received header entries is a waste of time: Only the
first line is reliable, inserted by your own mail server, but in that
case you can filter on the connect or HELO, which is much better because
you don't waste bandwidth receiving the entire mail.


I actually had spammers DDoS my connection because I didn't reject the
large bulk part early enough. I temporarily had to block any connection 
from China and Korea.



  Received: from mx2.freebsd.org (mx2.freebsd.org [216.136.204.119])
      by gaia.nimnet.asn.au (8.8.8/8.8.8R1.4) with ESMTP id WAA18000
      for [EMAIL PROTECTED]; Sun, 15 Oct 2006 22:02:19 +1000 (EST)
      (envelope-from [EMAIL PROTECTED])

There's the verified IP address of the connecting peer mailserver, that
IP's reverse resolution from DNS, and the HELO presented.  Any and all
of which can be analysed, looked up in maps, blacklisted, whitelisted,
or filtered any way you want, no? 
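For illustration, here is a minimal Python sketch of pulling those three values (HELO name, reverse DNS name, peer IP) out of a first Received header. The regular expression assumes the sendmail-style layout shown in the sample above; other MTAs format this field differently, and the function name is made up for this example.

```python
import re

# Matches the "from HELO (RDNS [IP])" part of a sendmail-style Received line.
# The layout is an assumption based on the sample header in this thread.
RECEIVED_FROM = re.compile(
    r"from\s+(?P<helo>\S+)\s+\((?P<rdns>\S+)\s+\[(?P<ip>[0-9.]+)\]\)"
)


def parse_first_received(header_value):
    """Return a dict with helo, rdns and ip, or None if the layout differs."""
    match = RECEIVED_FROM.search(header_value)
    return match.groupdict() if match else None


sample = (
    "from mx2.freebsd.org (mx2.freebsd.org [216.136.204.119]) "
    "by gaia.nimnet.asn.au (8.8.8/8.8.8R1.4) with ESMTP id WAA18000"
)
info = parse_first_received(sample)
print(info)
```

Each of the extracted values could then be fed to whatever map, blacklist or whitelist lookup is in use.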


Maybe I didn't make clear how the filtering in Postfix works? Each 
header line is unwrapped and then filtered independent of the others. 
There is no info as to whether that is the first or last Received line.


I can make a rule to reject the mail. And I can make a rule that accepts
a given header line, but the remaining header will still be filtered and 
possibly rejected.


I can't make a header check for Received cause checks for content-type 
to be skipped.


Nor can I make incoming mail from white listed servers skip the header 
checks. The two things are independent: The first applies when 
establishing the connection: HELO, MAIL FROM, RCPT TO etc. The header 
checks are invoked if the initial delivery request was accepted.


Yes, that sucks, but that's how Postfix works.

  Second: If you know postfix, you also know that header filtering is 
  independent of other checks, even the result of filtering on individual 
  header lines are independent.


Does that mean you can't black/grey/whitelist by connecting mailserver?


No, I'm only referring to the built in header filtering capabilities.

I have Postgrey too, and I do have FreeBSD white listed. Postgrey uses
the MAIL FROM and RCPT TO, so it takes effect even before the DATA command.


  So the ideal you mention is not an option until a complete public list 
  of authorized mail servers is available and all mail relayed through 
  these requires authentication.


That's the 'solution' the mega players appear to be proposing.  And who
then authorises whom to run mailservers?  What about, er, us?  Shudder. 


Anarchy is great, but it assumes that everyone is good. Evidently
this is not the case - unfortunately.


I'm one of 'us' and honestly, I don't see why it should be OK to set up 
a mail server without any possibility of identifying the owner or
responsible party, nor do I see this as a big problem:


You either relay mail through your provider's mail server (which 
requires you to authenticate) or register your mail server with the 
provider. The provider can then add your info to the whois database and 
open your connection out.


This should be trivial to implement, but currently there is no legal 
requirement or economic benefit for those capable to take action. For 
the latter, the problem is that implementing such controls only benefits 
everyone else.



As Ted pointed out, various people often post perfectly intelligible
messages in English in the various FreeBSD lists, reporting non-Roman
charsets. 


Which was exactly the problem I mentioned to OP - I mean not that 
intelligible messages are posted :), but they are encoded in different 
character sets.



I could mention one regular poster (and committer) whose
messages provide no charset information at all :)


Well, his messages would be accepted since there is no character set to 
reject :)


I absolutely would prefer not to reject any mail on the FreeBSD list, 
but the effect would be to accept non-FreeBSD mail that is obviously spam.


If you have a solution at hand that would not open the gates to spam, 
please do share.



Have you noticed a lot of non-Roman charset spam on the FreeBSD lists?


No, but as mentioned before: Distinguishing non-Roman charset FreeBSD 
mail from non-Roman non-FreeBSD spam is the problem.



Because it's unnecessary, as well as arbitrary, to filter list messages
by charset alone as an unassociated variable.  Sure, it might be a hint
in the mix to give some points.  The FreeBSD lists are mostly incredibly
spam free, but I doubt that much of that filtering is based on charsets.


As mentioned in my original post, the previous and above: The problem is 
that filtering mail by charset while in many cases will reject what can 
positively be identified as spam, in certain cases also rejects 
legitimate mail sent to this 

Re: Non English Spam

2006-10-15 Thread Gerard Seibert
On Sunday October 15, 2006 at 03:21:37 (PM) Erik Norgaard wrote:


 Ian Smith wrote:

[...]

 Maybe I didn't make clear how the filtering in Postfix works? Each 
 header line is unwrapped and then filtered independent of the others. 
 There is no info as to whether that is the first or last Received line.
 
 I can make a rule to reject the mail. And I can make a rule that accepts
 a given header line, but the remaining header will still be filtered and 
 possibly rejected.
 
 I can't make a header check for Received cause checks for content-type 
 to be skipped.
 
 Nor can I make incoming mail from white listed servers skip the header 
 checks. The two things are independent: The first applies when 
 establishing the connection: HELO, MAIL FROM, RCPT TO etc. The header 
 checks are invoked if the initial delivery request was accepted.
 
 Yes, that sucks, but that's how Postfix works.

Are you sure about that? I use Postfix myself and that does not appear
to be correct, although it might be. Have you ever posted this question
on the postfix forum? [EMAIL PROTECTED] There are some pretty
sharp individuals there who might be able to give you some advice.

[...]

-- 
Gerard

An optimist thinks that this is the best possible world. A pessimist
fears that this is true.

 Anonymous


Re: Non English Spam

2006-10-15 Thread Erik Norgaard

Gerard Seibert wrote:

On Sunday October 15, 2006 at 03:21:37 (PM) Erik Norgaard wrote:



Ian Smith wrote:


[...]

Maybe I didn't make clear how the filtering in Postfix works? Each 
header line is unwrapped and then filtered independent of the others. 
There is no info as to whether that is the first or last Received line.


I can make a rule to reject the mail. And I can make a rule that accepts
a given header line, but the remaining header will still be filtered and 
possibly rejected.


I can't make a header check for Received cause checks for content-type 
to be skipped.


Nor can I make incoming mail from white listed servers skip the header 
checks. The two things are independent: The first applies when 
establishing the connection: HELO, MAIL FROM, RCPT TO etc. The header 
checks are invoked if the initial delivery request was accepted.


Yes, that sucks, but that's how Postfix works.


Are you sure about that? I use Postfix myself and that does not appear
to be correct, although it might be. Have you ever posted this question
on the postfix forum? [EMAIL PROTECTED] There are some pretty
sharp individuals there who might be able to give you some advice.


I am certain that:

1) header/body checks are independent of the smtpd_restrictions - I can 
send a mail that is rejected even though I have authenticated and permit 
authenticated connections.


2) OK when a header line is matched does not affect the parsing of other
header lines, and if you think about it you wouldn't want that: Then it
would be possible to include a secret keyword or forged header line at
the top of the header to get by the other rules.


Basically, the only line that you can trust is the first Received which
our server inserted - which as mentioned is a waste to check. So, no
header check in itself should allow an entire mail.


There is a FILTER keyword which you can use to tag a mail for further 
content filtering. That action is taken after all the header checks have 
been done.
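The per-line independence described above can be modelled in a few lines of Python. This is only a sketch of the behaviour as Erik describes it, not Postfix code: the rule table, the action names and the helper functions are all hypothetical.

```python
import re

# Hypothetical rule table: (pattern, action), tried in order for each line.
RULES = [
    (re.compile(r'^Received:.*mx2\.freebsd\.org', re.I), "OK"),
    (re.compile(r'^Content-Type:.*charset\s*=\s*"?big5', re.I), "REJECT"),
]


def check_line(line):
    """The first matching rule decides the action for this one line only."""
    for pattern, action in RULES:
        if pattern.search(line):
            return action
    return "DUNNO"


def check_headers(lines):
    """Reject the message if any single header line is rejected.

    An OK on one line does not stop the remaining lines from being checked.
    """
    actions = [check_line(line) for line in lines]
    return "REJECT" if "REJECT" in actions else "ACCEPT"


message = [
    "Received: from mx2.freebsd.org (mx2.freebsd.org [216.136.204.119])",
    'Content-Type: text/plain; charset="Big5"',
]
verdict = check_headers(message)
print(verdict)
```

In this model the OK on the Received line does not rescue the message: the Content-Type line is still evaluated on its own and triggers the rejection.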


Cheers, Erik

--
Ph: +34.666334818  web: http://www.locolomo.org
X.509 Certificate: http://www.locolomo.org/crt/8D03551FFCE04F0C.crt
Key ID: 69:79:B8:2C:E3:8F:E7:BE:5D:C3:C3:B1:74:62:B8:3F:9F:1F:69:B9


Re: Non English Spam

2006-10-15 Thread Erik Norgaard

Erik Norgaard wrote:

Ian Smith wrote:
  So the ideal you mention is not an option until a complete public list 
  of authorized mail servers is available and all mail relayed through 
  these requires authentication.


That's the 'solution' the mega players appear to be proposing.  And who
then authorises whom to run mailservers?  What about, er, us?  Shudder. 


I'm one of 'us' and honestly, I don't see why it should be OK to set up 
a mail server without any possibility of identifying the owner or
responsible party, nor do I see this as a big problem:


Ironically, as if to stress the point, my reply to you got rejected 
(well you can find it in the archives), because my server is not on your 
(arbitrary) white list and the mail was not relayed through an 
authorized relay (mx2.freebsd.org).


And I even pay extra to have a static IP that resolves to a PTR
containing the word static, according to the IETF draft. And I actually
accept connections from any server that plays by the RFC (the SMTP -
strict) because I don't want to reject the large group of people who
want to set up their own server...


- so who is 'us'?

Well, anyway, this only serves to highlight another problem: That even
if you find the solution to rejecting non-Roman non-FreeBSD mail while 
accepting everything from the list, people replying in those character 
sets will see their mail rejected because their mail doesn't go through 
the FreeBSD server.


To avoid the above, we should recommend subscribers to the list to 
change their reply to when writing to the list, or configure their 
subscription such that mx2 will send mail regardless of the recipient 
being in the To/Cc header, or recommending users only to include the 
list as recipient... but we were against imposing rules - right?


Wouldn't it be nice if there was a reliable way to determine legitimate 
sources...?


Cheers, Erik
--
Ph: +34.666334818  web: http://www.locolomo.org
X.509 Certificate: http://www.locolomo.org/crt/8D03551FFCE04F0C.crt
Key ID: 69:79:B8:2C:E3:8F:E7:BE:5D:C3:C3:B1:74:62:B8:3F:9F:1F:69:B9


Re: Non English Spam

2006-10-15 Thread Beech Rintoul
On Sunday 15 October 2006 13:08, Erik Norgaard wrote:

(SNIP)
 Well, anyway, this only serves to highlight another problem: That even
 if you find the solution to rejecting non-Roman non-FreeBSD mail while
 accepting everything from the list, people replying in those character
 sets will see their mail rejected because their mail doesn't go through
 the FreeBSD server.

 To avoid the above, we should recommend subscribers to the list to
 change their reply to when writing to the list, or configure their
 subscription such that mx2 will send mail regardless of the recipient
 being in the To/Cc header, or recommending users only to include the
 list as recipient... but we were against imposing rules - right?

 Wouldn't it be nice if there was a reliable way to determine legitimate
 sources...?

The freebsd-current@ list is doing that after a fashion. If I forget to change 
my mail identity to [EMAIL PROTECTED] com, I get sent to the moderator.

The FreeBSD lists are almost spam free, and I would love to see exactly how
they are doing it. Do any of you know if it's documented anywhere? 

Beech
-- 

---
Beech Rintoul - Sys. Administrator - [EMAIL PROTECTED]
/\   ASCII Ribbon Campaign  | Alaska Paradise
\ / - NO HTML/RTF in e-mail  | 201 East 9Th Avenue Ste.310
 X  - NO Word docs in e-mail | Anchorage, AK 99501
/ \  - Please visit Alaska Paradise - http://www.alaskaparadise.com
---


Re: Non English Spam

2006-10-15 Thread Olivier Nicole
 I'm getting a ton of spam every day  that comes from China, Japan and Korea.
 Spam Assassin completely ignores it because it has all non-english characters
 and slows kmail to a crawl loading. Is there a way to filter on non-english

In /usr/local/etc/mail/spamassassin/v310.pre I enabled the language
guesser plugin:

# TextCat - language guesser
#
loadplugin Mail::SpamAssassin::Plugin::TextCat

It puts a little bit of stress on SpamAssassin, but I think it works
pretty well.

Bests,

Olivier


Re: Non English Spam

2006-10-14 Thread Anders Gulden Olstad

Beech Rintoul wrote:
 I'm getting a ton of spam every day  that comes from China, Japan and Korea. 
 Spam Assassin completely ignores it because it has all non-english characters 
 and slows kmail to a crawl loading. Is there a way to filter on non-english 
 either using Spam Assassin or procmail? 
 
 Suggestions would be appreciated.
 
 Beech

This procmail rule catches all of my non-english spam

# Trap misc charset mail in header and body
:0HB
* charset=.*BIG5.*|\
  charset=.*GB2312.*|\
  charset=.*DEFAULT_CHARSET.*|\
  charset=.*ks_c_5601-1987.*|\
  charset=.*euc-kr.*|\
  charset=.*ISO-2022-KR.*|\
  ^Subject:.*BIG5.*|\
  ^Subject:.*GB2312.*
/dev/null







Re: Non English Spam

2006-10-14 Thread Erik Norgaard

Beech Rintoul wrote:
I'm getting a ton of spam every day  that comes from China, Japan and Korea. 
Spam Assassin completely ignores it because it has all non-english characters 
and slows kmail to a crawl loading. Is there a way to filter on non-english 
either using Spam Assassin or procmail? 


I get none after adding simple filter rules for postfix:

# Accepted mime headers: (ASCII, UTF-8 and ISO-8859-X)
/^Content-Type:.*?charset\s*=\s*"?(us-ascii|iso-8859-\d+|utf-8)"?/
    OK HDR2000 Accepted charset: $1

Strictly speaking you could reject every other character set, but I chose
to make it explicit:


# Reject specific character sets
# Chinese, Japanese and Korean
/^Content-Type:.*?charset\s*=\s*"?(Big5|gb2312|euc-cn)"?/
    REJECT HDR2100: Unaccepted character set: $1
/^Content-Type:.*?charset\s*=\s*"?(euc-kr|iso-2022-kr)"?/
    REJECT HDR2110: Unaccepted character set: $1
/^Content-Type:.*?charset\s*=\s*"?(iso-2022-\w+|euc-jp|shift_jis)"?/
    REJECT HDR2120: Unaccepted character set: $1
# Cyrillic character sets: Russian/Ukrainian
/^Content-Type:.*?charset\s*=\s*"?(koi8-(?:r|u))"?/
    REJECT HDR2200: Unaccepted character set: $1
/^Content-Type:.*?charset\s*=\s*"?(windows-(?:1250|1251))"?/
    REJECT HDR2210: Unaccepted character set: $1

And then you may want a catch-all rule to catch unknown character sets.

/^Content-Type:.*?charset\s*=\s*"?([\w-]+)"?/
    WARN   HDR2299: Unknown character set: $1

You may change WARN to REJECT.
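As a way to try the same accept / reject / warn policy outside the mail server, here is a rough Python sketch using only the standard library `email` package. It is not a Postfix feature: the function name and the two tuples are invented for this example, and it only looks at the top-level Content-Type, not at individual MIME parts.

```python
from email import message_from_string

# Charset lists mirror the rules above; the names are hypothetical.
ACCEPTED = ("us-ascii", "utf-8")
REJECTED = (
    "big5", "gb2312", "euc-cn", "euc-kr", "iso-2022-kr",
    "euc-jp", "shift_jis", "koi8-r", "koi8-u",
    "windows-1250", "windows-1251",
)


def classify_charset(raw_message):
    """Return OK, REJECT or WARN for the top-level charset of a message."""
    msg = message_from_string(raw_message)
    charset = msg.get_content_charset()  # lower-cased, or None if absent
    if charset is None:
        return "OK"  # nothing declared, so nothing to reject
    if charset in ACCEPTED or charset.startswith("iso-8859-"):
        return "OK"
    if charset in REJECTED or charset.startswith("iso-2022-"):
        return "REJECT"
    return "WARN"  # unknown charset: log it, or tighten to REJECT later


print(classify_charset('Content-Type: text/plain; charset="Big5"\n\nbody'))
print(classify_charset("Content-Type: text/plain; charset=utf-8\n\nbody"))
```

Running a saved mailbox through something like this before tightening the server rules gives a feel for how much legitimate mail would be hit.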

I have noticed, however, that some subscribers to this list write English
encoded in one of the above character sets. I don't know enough about
the character set definitions, but it seems that English characters are a
subset of any character set?
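A quick way to check that hunch for some of the character sets listed above is to encode a plain English sentence with each codec and compare the bytes with the ASCII encoding. The Python sketch below does that for a handful of codecs from the standard library; the sentence and the codec selection are arbitrary choices for the example. The ISO-2022 family is left out on purpose because it is stateful and its byte stream can carry escape sequences.

```python
# Plain English text, restricted to ordinary ASCII letters and punctuation.
text = "Does FreeBSD 6.1 support this network card?"

# Python codec names for some of the charsets mentioned in the rules above.
codecs_to_try = ["big5", "gb2312", "euc_kr", "euc_jp", "koi8_r", "cp1251"]

ascii_bytes = text.encode("ascii")

# True means the codec produces exactly the same bytes as plain ASCII.
same = {name: text.encode(name) == ascii_bytes for name in codecs_to_try}
print(same)
```

For these codecs the sentence comes out byte-for-byte identical to ASCII, which is why an English message can legitimately arrive labelled as Big5 or KOI8-R.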


What is the recommended policy here? Should subscribers be advised to 
change character set when posting to the list?


Cheers, Erik
--
Ph: +34.666334818  web: http://www.locolomo.org
X.509 Certificate: http://www.locolomo.org/crt/8D03551FFCE04F0C.crt
Key ID: 69:79:B8:2C:E3:8F:E7:BE:5D:C3:C3:B1:74:62:B8:3F:9F:1F:69:B9


Re: Non English Spam

2006-10-14 Thread Beech Rintoul
On Friday 13 October 2006 19:09, Matthew Seaman wrote:
 Beech Rintoul wrote:
  I'm getting a ton of spam every day  that comes from China, Japan and
  Korea. Spam Assassin completely ignores it because it has all non-english
  characters and slows kmail to a crawl loading. Is there a way to filter
  on non-english either using Spam Assassin or procmail?
 
  Suggestions would be appreciated.

 Install the IP::Country perl modules (port: net/p5-IP-Country) and
 uncomment the lines in /usr/local/etc/mail/spamassassin/init.pre to
 enable Mail::SpamAssassin::Plugin::RelayCountry plugin, which causes
 the Bayesian filters to learn which countries relay most spam to you.

 Look for the discussion on 'ok_locales' in the Mail::SpamAssassin::Conf
 perldoc.  Set that to 'en' and messages in character sets other than
 anything based on the Latin (and possibly Greek) alphabet will get a
 higher spam score.  You can put that into
 /usr/local/etc/mail/spamassassin/local.cf for a site-wide effect or into
 per-user ~/.spamassassin/user_prefs config files.

Thank you. Your suggestion appears to be working. I was getting 75 or more
non-English spam messages daily and it was becoming a real pain in the
backside to deal with. Now spamassassin is tagging those with a higher score
and procmail is sending them to /dev/null. Looking at the log, all my normal
mail (like this list) is getting through. Hopefully spam will now be down to a
tolerable level.

Beech 
-- 

---
Beech Rintoul - Sys. Administrator - [EMAIL PROTECTED]
/\   ASCII Ribbon Campaign  | Alaska Paradise
\ / - NO HTML/RTF in e-mail  | 201 East 9Th Avenue Ste.310
 X  - NO Word docs in e-mail | Anchorage, AK 99501
/ \  - Please visit Alaska Paradise - http://www.alaskaparadise.com
---


Re: Non English Spam

2006-10-14 Thread Beech Rintoul
On Saturday 14 October 2006 05:12, Gerard Seibert wrote:
 On Saturday 14 October 2006 09:04, Beech Rintoul wrote:
  Thank you. Your suggestion appears to be working. I was getting 75 or
  more non-English spam messages daily and it was becoming a real pain in
  the backside to deal with. Now spamassassin is tagging those with a higher
  score and procmail is sending them to /dev/null. Looking at the log, all
  my normal mail (like this list) is getting through. Hopefully spam
  will now be down to a tolerable level.

 It seems to me that those restrictions might be a tad too tight, but that
 is just my opinion.

They don't seem to be. I'm going  to watch the log for a couple of days, but 
so far everything legitimate is getting through.

Beech

-- 

---
Beech Rintoul - Sys. Administrator - [EMAIL PROTECTED]
/\   ASCII Ribbon Campaign  | Alaska Paradise
\ / - NO HTML/RTF in e-mail  | 201 East 9Th Avenue Ste.310
 X  - NO Word docs in e-mail | Anchorage, AK 99501
/ \  - Please visit Alaska Paradise - http://www.alaskaparadise.com
---


Re: Non English Spam

2006-10-14 Thread Robert Huff

In checking this out, I came across this in man spamassassin:

 ok_locales xx [ yy zz ... ]     (default: all)
   This option is used to specify which locales are considered OK for
   incoming mail.  Mail using the character sets that are allowed by
   this option will not be marked as possibly being spam in a foreign
   language.

   If you receive lots of spam in foreign languages, and never get any
   non-spam in these languages, this may help.  Note that all
   ISO-8859-* character sets, and Windows code page character sets,
   are always permitted by default.

   Set this to "all" to allow all character sets.  This is the
   default.

   The rules CHARSET_FARAWAY, CHARSET_FARAWAY_BODY, and
   CHARSET_FARAWAY_HEADERS are triggered based on how this is set.

   Examples:

     ok_locales all         (allow all locales)
     ok_locales en          (only allow English)
     ok_locales en ja zh    (allow English, Japanese, and Chinese)

   Note: if there are multiple ok_locales lines, only the last one is
   used.

   Select the locales to allow from the list below:

   en   - Western character sets in general
   ja   - Japanese character sets
   ko   - Korean character sets
   ru   - Cyrillic character sets
   th   - Thai character sets
   zh   - Chinese (both simplified and traditional) character sets



Robert Huff
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Non English Spam

2006-10-14 Thread Ted Mittelstaedt

- Original Message - 
From: Erik Norgaard [EMAIL PROTECTED]
To: Beech Rintoul [EMAIL PROTECTED]
Cc: freebsd-questions@freebsd.org
Sent: Saturday, October 14, 2006 5:38 AM
Subject: Re: Non English Spam


 I have noticed, however, that some subscribers to this list write English
 encoded in one of the above character sets, I don't know enough about
 the character set definition, but it seems that English characters are a
 subset of any character set?

 What is the recommended policy here? Should subscribers be advised to
 change character set when posting to the list?


No.  It's the responsibility of the person doing the filtering - in this
case you - to exempt any known good e-mail sender from your filters.

You know damn well that legitimate mailing list mail comes from

mx2.freebsd.org (mx2.freebsd.org [216.136.204.119])

it's right in the headers of the messages on the list.  You have no right to
force other people to conform to what you feel is acceptable formatting
of their message as long as they meet the SMTP rfc standards.  That's
why we have RFC's.

If everyone did what you're proposing then senders would have hundreds
of different rules they would have to follow, over and above the normal
RFCs.

Ted



Non English Spam

2006-10-13 Thread Beech Rintoul
I'm getting a ton of spam every day  that comes from China, Japan and Korea. 
Spam Assassin completely ignores it because it has all non-english characters 
and slows kmail to a crawl loading. Is there a way to filter on non-english 
either using Spam Assassin or procmail? 

Suggestions would be appreciated.

Beech
-- 

---
Beech Rintoul - Sys. Administrator - [EMAIL PROTECTED]
/\   ASCII Ribbon Campaign  | Alaska Paradise
\ / - NO HTML/RTF in e-mail  | 201 East 9Th Avenue Ste.310
 X  - NO Word docs in e-mail | Anchorage, AK 99501
/ \  - Please visit Alaska Paradise - http://www.alaskaparadise.com
---


Re: Non English Spam

2006-10-13 Thread [EMAIL PROTECTED]

Beech Rintoul wrote:
I'm getting a ton of spam every day  that comes from China, Japan and Korea. 
Spam Assassin completely ignores it because it has all non-english characters 
and slows kmail to a crawl loading. Is there a way to filter on non-english 
either using Spam Assassin or procmail? 


Suggestions would be appreciated.

Beech


Maybe it is not exactly an answer to your question, but we started to
use real time black lists with postfix and it works pretty well (though
some spam comes through). There have been no complaints of false positives
so far (almost 6 months).
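For readers who have not used a real time black list: the lookup itself is an ordinary DNS query for the connecting IP address with its octets reversed, placed under the list's zone. The Python sketch below only builds that query name and does not touch the network; the zone `bl.example.org` is a placeholder, not a real list, and the function name is made up for this example.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name that would be looked up for an IPv4 client address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# bl.example.org is a placeholder zone used only for this illustration.
name = dnsbl_query_name("216.136.204.119", "bl.example.org")
print(name)
```

If the list answers that name with an address record, the client is listed; a "no such name" reply means it is not. The mail server performs this check at connect time, before any message data is transferred.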


I noticed how well it works when we switched it off today for 2 hours 
because of an error and the anti-virus programs started to jump.


And we very seldom get non-English spam.

I do not know if procmail has something similar.

Iv


Re: Non English Spam

2006-10-13 Thread Paul Schmehl
--On October 13, 2006 5:12:27 PM -0800 Beech Rintoul 
[EMAIL PROTECTED] wrote:



I'm getting a ton of spam every day  that comes from China, Japan and
Korea.  Spam Assassin completely ignores it because it has all
non-english characters  and slows kmail to a crawl loading. Is there a
way to filter on non-english  either using Spam Assassin or procmail?

Suggestions would be appreciated.


/usr/ports/mail/postfix-policyd-weight/

Your troubles will be over.

Paul Schmehl ([EMAIL PROTECTED])
Adjunct Information Security Officer
The University of Texas at Dallas
http://www.utdallas.edu/ir/security/


Re: Non English Spam

2006-10-13 Thread Eric

[EMAIL PROTECTED] wrote:

Beech Rintoul wrote:
I'm getting a ton of spam every day  that comes from China, Japan and 
Korea. Spam Assassin completely ignores it because it has all 
non-english characters and slows kmail to a crawl loading. Is there a 
way to filter on non-english either using Spam Assassin or procmail?

Suggestions would be appreciated.

Beech


Maybe it is not exactly an answer to your question, but we started to
use real time black lists with postfix and it works pretty well (though
some spam comes through). There have been no complaints of false positives
so far (almost 6 months).


I noticed how well it works when we switched it off today for 2 hours 
because of an error and the anti-virus programs started to jump.


And we very seldom get non-English spam.

I do not know if procmail has something similar.



I found postgrey to be a fantastic addition to my antispam arsenal.
Check it out. It really works well.
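For anyone unfamiliar with the idea behind greylisting, here is a toy Python model of it. This is a sketch of the general technique, not postgrey's actual code: the class, the in-memory dictionary and the 300-second delay are all assumptions made for the example. A first delivery attempt for an unseen (client IP, sender, recipient) triplet is deferred, and a retry after the delay is accepted, which relies on legitimate servers retrying while much spam software does not.

```python
import time


class Greylist:
    """Toy greylist keyed on (client IP, envelope sender, envelope recipient)."""

    def __init__(self, delay_seconds=300, clock=time.time):
        self.delay = delay_seconds   # assumed delay, chosen for illustration
        self.clock = clock           # injectable clock, handy for testing
        self.first_seen = {}         # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient):
        """Return DEFER for new or too-early triplets, ACCEPT otherwise."""
        triplet = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        first = self.first_seen.setdefault(triplet, now)
        if now - first >= self.delay:
            return "ACCEPT"
        return "DEFER"


greylist = Greylist()
# First attempt from an unknown triplet is deferred with a temporary error.
print(greylist.check("192.0.2.10", "alice@example.org", "bob@example.net"))
```

Because the decision needs only the envelope, it can be made before the DATA command, so a deferred message costs almost no bandwidth.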


Eric


Re: Non English Spam

2006-10-13 Thread Brian

Paul Schmehl wrote:
--On October 13, 2006 5:12:27 PM -0800 Beech Rintoul 
[EMAIL PROTECTED] wrote:



I'm getting a ton of spam every day  that comes from China, Japan and
Korea.  Spam Assassin completely ignores it because it has all
non-english characters  and slows kmail to a crawl loading. Is there a
way to filter on non-english  either using Spam Assassin or procmail?

Suggestions would be appreciated.


/usr/ports/mail/postfix-policyd-weight/

Your troubles will be over.

Paul Schmehl ([EMAIL PROTECTED])
Adjunct Information Security Officer
The University of Texas at Dallas
http://www.utdallas.edu/ir/security/

I didn't catch if this was sendmail or not, but SpamAssassin kept updated
and a lower "could be spam" score gets me very little spam; the stock
stuff is about all that occasionally gets through for me.


Brian



Re: Non English Spam

2006-10-13 Thread Chad Leigh -- Shire.Net LLC


On Oct 13, 2006, at 7:12 PM, Beech Rintoul wrote:

I'm getting a ton of spam every day  that comes from China, Japan and Korea.
Spam Assassin completely ignores it because it has all non-english characters


I don't know what settings affect this but SpamAssassin actually  
catches most of the Japanese and Chinese language spam we get (have  
not seen Korean).  (I have whitelisted a couple of Japan email  
addresses that send us legit email in Japanese but others that are  
not spam do not get flagged that often as spam -- don't ask me how it  
works).


Chad

and slows kmail to a crawl loading. Is there a way to filter on non-english
either using Spam Assassin or procmail?

Suggestions would be appreciated.

Beech
--


---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net





Re: Non English Spam

2006-10-13 Thread Matthew Seaman
Beech Rintoul wrote:
 I'm getting a ton of spam every day  that comes from China, Japan and Korea. 
 Spam Assassin completely ignores it because it has all non-english characters 
 and slows kmail to a crawl loading. Is there a way to filter on non-english 
 either using Spam Assassin or procmail? 
 
 Suggestions would be appreciated.

Install the IP::Country perl modules (port: net/p5-IP-Country) and
uncomment the lines in /usr/local/etc/mail/spamassassin/init.pre to
enable Mail::SpamAssassin::Plugin::RelayCountry plugin, which causes
the Bayesian filters to learn which countries relay most spam to you.

Look for the discussion on 'ok_locales' in the Mail::SpamAssassin::Conf
perldoc.  Set that to 'en' and messages in character sets other than
anything based on the Latin (and possibly Greek) alphabet will get a
higher spam score.  You can put that into
/usr/local/etc/mail/spamassassin/local.cf for a site-wide effect or into
per-user ~/.spamassassin/user_prefs config files.

Cheers,

Matthew

-- 
Dr Matthew J Seaman MA, D.Phil.   7 Priory Courtyard
  Flat 3
PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate
  Kent, CT11 9PW


