On Wed, 25 Feb 2004, Ilan Aisic wrote:

>  Hi Everyone,
>
>    In /etc/mail/spamassassin/local.cf I have the following line:
>
>       ok_languages en he
>
>   This supposedly allows English and Hebrew.
>   However, I get lots of false positives on Hebrew letters because it wrongly
> identifies Hebrew letters as "raw illegal characters" and gives them high
> scores.
>   You can see 7 points are contributed to the score just from these 2 lines
> copied from the reports:
>
>        4.3 FROM_ILLEGAL_CHARS     From contains too many raw illegal characters
>        2.7 SUBJ_ILLEGAL_CHARS     Subject contains too many raw illegal characters
>
>  Can anyone advise on this?

No, that isn't a false positive; that's an appropriate hit against a
bad e-mail client that is violating Internet standards.

Internet standard RFC 2822 (section 2.2) states that e-mail HEADERS MUST
contain only 7-bit US-ASCII characters. If you want to represent
non-7-bit characters in a header (such as 'From' or 'Subject'), you must
use an encoding (the RFC 2047 "encoded-word" forms, 'Q' for
quoted-printable style or 'B' for Base64), not the "raw" data.
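
To make the difference concrete, here is a minimal sketch using the
`email.header` module from Python's standard library. It is only an
illustration of header encoding, not what SpamAssassin itself runs, and
the function names (`has_raw_8bit`, `encode_header_value`,
`decode_header_value`) are made up for this example.

```python
from email.header import Header, decode_header


def has_raw_8bit(raw: bytes) -> bool:
    """Return True if the raw header bytes contain any non-7-bit byte."""
    return any(b > 127 for b in raw)


def encode_header_value(text: str, charset: str = "utf-8") -> str:
    """Encode text as an ASCII-only header value using encoded-words."""
    return Header(text, charset).encode()


def decode_header_value(value: str) -> str:
    """Decode a header value that may contain encoded-words."""
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            # Encoded-words come back as bytes plus their declared charset.
            parts.append(data.decode(charset or "ascii"))
        else:
            # Plain unencoded text comes back as str.
            parts.append(data)
    return "".join(parts)


if __name__ == "__main__":
    subject = "שלום"  # a short Hebrew subject, chosen for illustration

    # What a broken client sends: the raw 8-bit bytes in the header.
    print(has_raw_8bit(subject.encode("utf-8")))

    # What a compliant client sends: an ASCII-only encoded-word.
    encoded = encode_header_value(subject)
    print(encoded)
    print(has_raw_8bit(encoded.encode("ascii")))

    # The receiving side can still recover the original text.
    print(decode_header_value(encoded) == subject)
```

A header built the second way carries the same Hebrew text but contains
only 7-bit characters on the wire, so it should not be flagged as raw
illegal characters.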

Good e-mail client programs follow RFC standards and would not generate
such messages. Spammers usually are not concerned with following
standards and are more likely to generate such garbage.

The "ok_languages" option sets the languages that you will
accept, after they're decoded (i.e., which kinds of character sets
may arrive in encoded form).

Thus, if somebody were to send you a message with an encoded Korean
subject, your SA would score against that.


-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
