RE: Re[2]: [sniffer] Charset

2004-08-19 Thread Michiel Prins
Pete, even your message had a charset header:

Content-Type: text/plain; charset=us-ascii

I think you'll generate more FPs if you do something like that than the FNs
you might have now. Aren't there SpamAssassin config files that detect this
spam?


Kind regards,

ing. Michiel Prins
SOS Small Office Solutions / REJECT
Wannepad 27
1066 HW Amsterdam
tel. 020-4082627
fax. 020-4082628
[EMAIL PROTECTED]


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Pete McNeil
Sent: Friday, 20 August 2004 4:58
To: Jorge Asch
Subject: Re[2]: [sniffer] Charset

On Thursday, August 19, 2004, 10:45:37 PM, Jorge wrote:

JA> Could a filter be created that will tag as spam any messages
JA> containing NON-ascii characters? I mean allow only CHRS 1 through 255.

JA> I believe this will filter out all these foreign character sets, and
JA> let regular old plain messages through...

JA> Of course such a rule will only apply for most of us on the western 
JA> hemisphere...

In theory this could be done, but it would be a tricky gadget - probably
best done as something programmatic... There are a lot of opportunities for
false positives.

I will think about this...

Then again - why not simply block on anything that says charset= ? If it's
plain old ascii, then there's no need for charset. (Lots of FPs with this,
but then I would never use a filter like that...) It might be very close to
what you are looking for.

The other way to do it would be to build patterns that match all of the
known character sets -- or at least the majority. That would be a chunk of
work but doable - especially with a few well placed wildcards and a good
comprehensive list.

_M



This E-Mail came from the Message Sniffer mailing list. For information and
(un)subscription instructions go to
http://www.sortmonster.com/MessageSniffer/Help/Help.html





Re[2]: [sniffer] Charset

2004-08-19 Thread Pete McNeil
On Thursday, August 19, 2004, 10:45:37 PM, Jorge wrote:

JA> Could a filter be created that will tag as spam any messages
JA> containing NON-ascii characters? I mean allow only CHRS 1 through 255.

JA> I believe this will filter out all these foreign character sets, and let
JA> regular old plain messages through...

JA> Of course such a rule will only apply for most of us on the western
JA> hemisphere...

In theory this could be done, but it would be a tricky gadget -
probably best done as something programmatic... There are a lot of
opportunities for false positives.
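
As a rough illustration of the "programmatic" route, here is a minimal Python sketch that measures how much of a message's decoded text is made of high-bit bytes (values above 127) and flags the message past a cut-off. The function names and the 30% threshold are assumptions made up for this example, not anything Message Sniffer provides. Decoding the transfer encoding first matters, because quoted-printable and base64 bodies look like pure ASCII on the wire.

```python
from email import message_from_bytes

# Hypothetical cut-off: flag a message when more than 30% of its decoded
# text bytes have the high bit set. Tune against real mail before relying on it.
THRESHOLD = 0.30


def high_bit_ratio(raw_message: bytes) -> float:
    """Fraction of decoded text-part bytes whose value is above 127."""
    msg = message_from_bytes(raw_message)
    total = 0
    high = 0
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue  # skip multipart containers and attachments
        # decode=True undoes quoted-printable / base64 and returns bytes
        payload = part.get_payload(decode=True) or b""
        total += len(payload)
        high += sum(1 for byte in payload if byte > 127)
    return high / total if total else 0.0


def looks_non_ascii(raw_message: bytes) -> bool:
    """True when the share of high-bit bytes exceeds the assumed threshold."""
    return high_bit_ratio(raw_message) > THRESHOLD
```

A ratio rather than a plain yes/no is used here because a single accented name or currency sign in an otherwise English message would otherwise be one of the false positives mentioned above.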

I will think about this...

Then again - why not simply block on anything that says charset= ? If
it's plain old ascii, then there's no need for charset. (Lots of FPs
with this, but then I would never use a filter like that...) It might
be very close to what you are looking for.

The other way to do it would be to build patterns that match all of
the known character sets -- or at least the majority. That would be a
chunk of work but doable - especially with a few well placed
wildcards and a good comprehensive list.
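
A small Python sketch of the pattern idea, assuming shell-style wildcards matched against the charset named in the Content-Type header. The pattern list is illustrative and far from comprehensive, and `matching_pattern` is a hypothetical helper name. Because only listed families match, a header such as charset=us-ascii is left alone.

```python
import re
from fnmatch import fnmatchcase

# Illustrative, incomplete list of charset families to block (an assumption
# for this sketch). "*" is a shell-style wildcard.
CHARSET_PATTERNS = [
    "windows-1251",
    "koi8-*",
    "iso-2022-*",
    "euc-*",
    "gb2312",
    "big5",
]

# Pulls the value out of charset=..., with or without surrounding quotes.
CHARSET_RE = re.compile(r'charset\s*=\s*"?([^\s";]+)', re.IGNORECASE)


def matching_pattern(header_text: str):
    """Return the first pattern that matches a declared charset, else None."""
    for name in CHARSET_RE.findall(header_text):
        name = name.lower()
        for pattern in CHARSET_PATTERNS:
            if fnmatchcase(name, pattern):
                return pattern
    return None
```

For example, `matching_pattern('Content-Type: text/plain; charset="KOI8-R"')` picks the `koi8-*` entry, while a us-ascii header yields `None`.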

_M





Re: [sniffer] Charset

2004-08-19 Thread Jorge Asch

> Well... if you really wanted to do it, then it could be done.
> Create a set of rules that look for any of the most common Spanish
> words - especially any that use high-bit characters. With enough of
> these it should be broad enough to catch most... The trick is to
> include words that are also not common in normal conversation on the
> local system.
 

Could a filter be created that will tag as spam any messages
containing NON-ascii characters? I mean allow only CHRS 1 through 255.

I believe this will filter out all these foreign character sets, and let
regular old plain messages through...

Of course such a rule will only apply for most of us on the western 
hemisphere...

--
Jorge Asch Revilla
CONEXION DCR
www.conexion.co.cr
800-CONEXION 




Re[2]: [sniffer] Charset

2004-08-19 Thread Pete McNeil
On Thursday, August 19, 2004, 3:54:20 PM, Jorge wrote:


>>We could then turn on or off the languages we didn't want.
>>From my foray with dealing with Chinese, it's certainly much
>>easier said than done. Chinese was doable, I've had no luck
>>stopping my Spanish spam.
>>Then again, you might be better at it than I.
>>
JA> Problem with Spanish is that we use the same western character set as
JA> you do... so it makes it harder to detect...

Well... if you really wanted to do it, then it could be done.

Create a set of rules that look for any of the most common Spanish
words - especially any that use high-bit characters. With enough of
these it should be broad enough to catch most... The trick is to
include words that are also not common in normal conversation on the
local system.
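
To make the word-rule idea concrete, here is a minimal Python sketch that counts how many distinct marker words appear in the decoded message text. The word list and the two-hit minimum are assumptions chosen for illustration; a real rule set would need a much longer list, checked against what is normal on the local system.

```python
import re

# Hypothetical marker words: common Spanish words that carry accented
# (high-bit) characters and are unlikely in ordinary English mail.
SPANISH_MARKERS = [
    "información",
    "también",
    "más",
    "aquí",
    "teléfono",
    "año",
    "envío",
]

MARKER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in SPANISH_MARKERS) + r")\b",
    re.IGNORECASE,
)

# Assumed minimum number of distinct marker words before flagging.
MIN_HITS = 2


def spanish_score(text: str) -> int:
    """Number of distinct marker words found in already-decoded text."""
    return len({match.lower() for match in MARKER_RE.findall(text)})


def looks_spanish(text: str) -> bool:
    """True when at least MIN_HITS distinct marker words are present."""
    return spanish_score(text) >= MIN_HITS
```

Counting distinct words rather than total matches keeps one repeated word from tripping the rule by itself.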

That would be an awfully aggressive filter though - and a bunch of
work. Of course we can contract to code any ruleset that's possible. I
suspect there aren't many systems out there that can afford to be so
aggressive - but that's just my guess.

_M






Re: [sniffer] Charset

2004-08-19 Thread Jorge Asch

> We could then turn on or off the languages we didn't want.
> From my foray with dealing with Chinese, it's certainly much easier said
> than done. Chinese was doable, I've had no luck stopping my Spanish spam.
> Then again, you might be better at it than I.

Problem with Spanish is that we use the same western character set as
you do... so it makes it harder to detect...

--
Jorge Asch Revilla
CONEXION DCR
www.conexion.co.cr
800-CONEXION



Re: Re[2]: [sniffer] Charset

2004-08-19 Thread Scott Fisher
I'll chime in on the subject too.
I've finally managed to get the spam in Chinese under control on my system, but for a
while I really wished Message Sniffer had language-based filters. For example:
Result 40 Chinese
Result 41 Cyrillic
Result 42 Spanish
Result 43 German

We could then turn on or off the languages we didn't want.
From my foray with dealing with Chinese, it's certainly much easier said than done.
Chinese was doable, but I've had no luck stopping my Spanish spam.
Then again, you might be better at it than I.
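
The on/off switch wished for above could look roughly like this in Python. The result codes 40 to 43 are the ones proposed in this message, not codes Message Sniffer is known to emit, and the blocked set and `should_reject` name are hypothetical.

```python
# Result codes as proposed in this message; a wish, not an existing feature.
LANGUAGE_RESULTS = {
    40: "Chinese",
    41: "Cyrillic",
    42: "Spanish",
    43: "German",
}

# Hypothetical per-site switch: the languages this server chooses to reject.
BLOCKED_LANGUAGES = {"Chinese", "Cyrillic"}


def should_reject(result_code: int) -> bool:
    """True when the result code maps to a language the site has turned off."""
    language = LANGUAGE_RESULTS.get(result_code)
    return language in BLOCKED_LANGUAGES
```

A site in Costa Rica would leave Spanish out of the blocked set, while a US-only site might add it, which is the per-site flexibility the idea is after.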

<<< [EMAIL PROTECTED]  8/19  9:52a >>>
On Thursday, August 19, 2004, 10:11:45 AM, Jorge wrote:

JA> Michiel Prins wrote:

>>Can't you use the content filter of your mail server to detect if the
>>charset is used? 
>>
JA> I've tried, but it's not 100% effective

I recall the earlier conversations about this. We have not had a lot
of call for generally blocking foreign character sets so that project
has not received much attention.

Another issue with this is that many of our customers are not in the
US and so defining "foreign" is often problematic.

We can more easily establish local black rules for you.

When you have an example of a character set you would like to block,
please send us a note to support@ with your license ID in the subject
line and the words "Local black rule please"

Explain in your note that you want us to block the character set(s) in
the message.

Attach the message to your note.

We will verify your license ID and then create local black rules for
the character sets we find in the message.

Over a short time this should have the effect you are looking for.

Hope this helps,
_M

PS: We do filter "foreign" spam that is submitted to us at spam@ using
the same rules that we follow for other messages. That is, we don't
treat them as "foreign" - only as spam in general. Russian spam in
particular has rapidly become heavily obfuscated - though there are
usually patterns that can be found to block the messages.





Re[2]: [sniffer] Charset

2004-08-19 Thread Pete McNeil
On Thursday, August 19, 2004, 10:11:45 AM, Jorge wrote:

JA> Michiel Prins wrote:

>>Can't you use the content filter of your mail server to detect if the
>>charset is used? 
>>
JA> I've tried, but it's not 100% effective

I recall the earlier conversations about this. We have not had a lot
of call for generally blocking foreign character sets so that project
has not received much attention.

Another issue with this is that many of our customers are not in the
US and so defining "foreign" is often problematic.

We can more easily establish local black rules for you.

When you have an example of a character set you would like to block,
please send us a note to support@ with your license ID in the subject
line and the words "Local black rule please"

Explain in your note that you want us to block the character set(s) in
the message.

Attach the message to your note.

We will verify your license ID and then create local black rules for
the character sets we find in the message.

Over a short time this should have the effect you are looking for.

Hope this helps,
_M

PS: We do filter "foreign" spam that is submitted to us at spam@ using
the same rules that we follow for other messages. That is, we don't
treat them as "foreign" - only as spam in general. Russian spam in
particular has rapidly become heavily obfuscated - though there are
usually patterns that can be found to block the messages.





Re: [sniffer] Charset

2004-08-19 Thread Jorge Asch
Michiel Prins wrote:
> Can't you use the content filter of your mail server to detect if the
> charset is used?

I've tried, but it's not 100% effective
--
Jorge Asch Revilla
CONEXION DCR
www.conexion.co.cr
800-CONEXION



RE: [sniffer] Charset

2004-08-19 Thread Michiel Prins
Can't you use the content filter of your mail server to detect if the
charset is used? 


Kind regards,

ing. Michiel Prins
SOS Small Office Solutions / REJECT
Wannepad 27
1066 HW Amsterdam
tel. 020-4082627
fax. 020-4082628
[EMAIL PROTECTED]


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Jorge Asch
Sent: Thursday, 19 August 2004 15:16
To: [EMAIL PROTECTED]
Subject: [sniffer] Charset

I asked about this about a year ago, with no luck... Is there any way Message
Sniffer could be used to block certain messages, depending on their
charset (in the Content-Type header)?

For example, I would like to block all Windows-1251 (Cyrillic) messages from
my server. I know SpamAssassin has such a feature, but I would rather do it
with Message Sniffer.

Is such a thing possible now? How about in the future? I am getting
bombarded with messages in foreign languages, and Message Sniffer does
*not* detect them (and it seems forwarding them to [EMAIL PROTECTED] is
pointless, since they are still coming in... it seems that there's no easy
way to create a rulebase for them).

-- 
Jorge Asch Revilla
CONEXION DCR
www.conexion.co.cr
800-CONEXION 





[sniffer] Charset

2004-08-19 Thread Jorge Asch
I asked about this about a year ago, with no luck... Is there any way
Message Sniffer could be used to block certain messages, depending on
their charset (in the Content-Type header)?

For example, I would like to block all Windows-1251 (Cyrillic) messages
from my server. I know SpamAssassin has such a feature, but I would
rather do it with Message Sniffer.
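
For reference, the kind of check being asked for can be sketched outside Message Sniffer in a few lines of Python using the standard `email` module. This is only an illustration of the idea: the blocklist holds just the Windows-1251 example from this message, and `declared_charsets` and `is_blocked` are made-up helper names.

```python
from email import message_from_string

# Hypothetical blocklist; Windows-1251 is the only charset named in this
# message. Names are kept lowercase to match get_content_charset().
BLOCKED_CHARSETS = {"windows-1251"}


def declared_charsets(raw_message: str) -> set:
    """Collect the charset declared by every MIME part, lowercased."""
    msg = message_from_string(raw_message)
    found = set()
    for part in msg.walk():
        # Returns None when the part's Content-Type has no charset parameter.
        charset = part.get_content_charset()
        if charset:
            found.add(charset)
    return found


def is_blocked(raw_message: str) -> bool:
    """True when any part declares a charset on the blocklist."""
    return bool(declared_charsets(raw_message) & BLOCKED_CHARSETS)
```

This only looks at what the headers declare, so a message that mislabels or omits its charset would slip through, which may be why header-based filtering is not 100% effective in practice.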

Is such a thing possible now? How about in the future? I am getting
bombarded with messages in foreign languages, and Message Sniffer does
*not* detect them (and it seems forwarding them to [EMAIL PROTECTED]
is pointless, since they are still coming in... it seems that there's no
easy way to create a rulebase for them).

--
Jorge Asch Revilla
CONEXION DCR
www.conexion.co.cr
800-CONEXION 

