[\x{400}-\x{52f}] is awesome! It catches approx. 92% of the test
messages I gave it. Now to see if I can actually get it working
properly with SpamAssassin (SA)...
Thanks!
I'll ping back with the rule I come up with.
-CMP
On 1/28/07, Daniel Sterling <[EMAIL PROTECTED]> wrote:
Cristóbal Palmer wrote:
> We're already using content checks... and other techniques.
Excellent! I hate to be repetitive, but please keep using statistical
analysis! I run SpamAssassin with Bayes *off*. Spam that
SpamAssassin misses is filtered by Thunderbird's built-in statistical
analysis. I have a silly setup like this mostly because it works and I
am too lazy to change it.
Anyway, my Thunderbird's filters are catching the Cyrillic spam. I
noticed that the following fun keyword is in mine:
charset="windows-1251"
windows-1251 is a Cyrillic encoding (the Windows code page for Cyrillic).
If you never expect legitimate Cyrillic mail, you can definitely trash
messages with that string.
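
A rough sketch of that charset check in Python, assuming the raw message
is available as text. The function name declares_windows_1251 is made up
for illustration. This is not a SpamAssassin rule, just the same idea
expressed with the standard email module.

```python
from email import message_from_string


def declares_windows_1251(raw_message: str) -> bool:
    """Return True if any MIME part declares charset windows-1251."""
    msg = message_from_string(raw_message)
    # walk() visits the message itself and every nested MIME part.
    for part in msg.walk():
        # get_content_charset() returns the charset parameter of the
        # Content-Type header in lower case, or None if it is absent.
        if part.get_content_charset() == "windows-1251":
            return True
    return False
```

Checking the parsed header rather than grepping for the literal string
also catches unquoted or differently cased variants such as
charset=Windows-1251.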
Also, you may or may not have good luck with the following bit of regex:
[\x{400}-\x{52f}] -- let me know! (I suppose it mostly depends on whether or
not the string to be matched against is using byte or character semantics.)
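
To illustrate the byte vs. character caveat, here is a small Python
sketch (Python's re module writes the range as \u0400-\u052f rather than
Perl's \x{400}-\x{52f}). The helper names has_cyrillic and
has_cyrillic_utf8_bytes are hypothetical, and the byte-level pattern
assumes the data is UTF-8. How SpamAssassin hands text to a rule depends
on its version and configuration, so treat this only as a picture of the
difference.

```python
import re

# Character semantics: one code point in U+0400..U+052F
# (the Cyrillic and Cyrillic Supplement blocks).
CYRILLIC_CHARS = re.compile("[\u0400-\u052f]")

# Byte semantics, assuming UTF-8: the same range is encoded as a lead
# byte 0xD0..0xD3 plus any continuation byte, or 0xD4 plus 0x80..0xAF.
CYRILLIC_UTF8 = re.compile(rb"[\xd0-\xd3][\x80-\xbf]|\xd4[\x80-\xaf]")


def has_cyrillic(text: str) -> bool:
    """Match against decoded text (character semantics)."""
    return CYRILLIC_CHARS.search(text) is not None


def has_cyrillic_utf8_bytes(data: bytes) -> bool:
    """Match against raw UTF-8 bytes (byte semantics)."""
    return CYRILLIC_UTF8.search(data) is not None


def has_cyrillic_in_encoded(data: bytes, charset: str) -> bool:
    """Decode with the declared charset first, then match characters."""
    return has_cyrillic(data.decode(charset, errors="replace"))
```

The character pattern only works once the bytes have been decoded. A
windows-1251 body, for example, stores each Cyrillic letter as a single
byte, so it has to be decoded with that charset before the range can
match.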
-- Dan
--
TriLUG mailing list : http://www.trilug.org/mailman/listinfo/trilug
TriLUG Organizational FAQ : http://trilug.org/faq/
TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
--
Cristóbal M. Palmer
UNC-CH SILS Student -- ils.unc.edu/~cmpalmer
TriLUG Vice Chair
"There are many roads to enlightenment, and thus many roads back to
the One True Debian" --crimsun