[EMAIL PROTECTED] (Justin Mason) writes:

> - language detection is called here (if "ok_languages" != "all") and the
>   language token is added as a metadatum called "X-Language".  (TODO:
>   this should be conditional, because language rec is a slow process,

Language detection will be faster if and when we add the XS implementation.

>   but is ("ok_languages" != "all") the right way to enable it?)

Yes, but someone might want to set "ok_languages all" and still send
the detected languages to Bayes.

I'd suggest we add a new option "use_language_detection" to
enable/disable the test for 3.0.
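
As a rough sketch of the idea (Python purely for illustration, not
SpamAssassin code): "use_language_detection" is the option proposed
above, while the function names, the config dict, and the stub detector
are all hypothetical. The point is only that detection and the
"ok_languages" filter are gated separately.

```python
def detect_languages(text):
    """Stand-in for the real (slow) language detector; hypothetical."""
    return ["en"] if text.strip() else []


def language_metadata(text, conf):
    """Return the X-Language metadatum, or None when detection is off.

    Detection is gated only by the proposed use_language_detection
    option, so "ok_languages all" no longer disables it implicitly.
    """
    if not conf.get("use_language_detection", False):
        return None
    return " ".join(detect_languages(text))


def language_is_ok(languages, conf):
    """Apply the ok_languages filter separately from detection."""
    ok = conf.get("ok_languages", "all").split()
    if "all" in ok:
        return True
    return any(lang in ok for lang in languages.split())
```

With this split, a setup with "ok_languages all" and detection enabled
still produces the language token for Bayes, and the filter accepts
every message.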

> - In addition, the MsgMetadata class holds some parsing/rendering code; it
>   calls the HTML renderer and holds the HTML features hash, and also now
>   holds the functions that make the "decoded"/"rendered" text arrays.
>   
>   Note that there's an open question as to whether rendered data, and
>   features discovered during that rendering, are really "metadata".  These
>   may be more appropriate to put in another class, either in the root
>   MsgContainer or another class off that.

I don't really care, but the data is just used for eval tests.

There may be some utility in attaching some metadata on a per-HTML-part
basis, but the code isn't really consistent on that point (per-HTML-part
versus per-message). We used to render everything as one big blob if we
detected HTML anywhere in the message; now we render HTML parts
individually.
 
A reasonable guess is that some HTML metadata will want to remain
per-HTML-part and some will be per-message, but I'm not really sure at
this point.
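
One possible shape for that split, sketched in Python for illustration
only: each HTML part keeps its own feature hash, and a per-message view
is derived from the parts. The class names, the feature names in the
usage note, and the merge rule are assumptions, not the existing code.

```python
from dataclasses import dataclass, field


@dataclass
class HtmlPart:
    """One rendered HTML part with its own feature hash."""
    rendered: str
    features: dict = field(default_factory=dict)


@dataclass
class Message:
    """A message holding zero or more rendered HTML parts."""
    parts: list = field(default_factory=list)

    def message_features(self):
        """Roll per-part features up to a per-message view.

        Assumed merge rule: boolean flags are OR-ed across parts,
        numeric counts are summed.
        """
        merged = {}
        for part in self.parts:
            for name, value in part.features.items():
                # Check bool first: bool is a subclass of int in Python.
                if isinstance(value, bool):
                    merged[name] = merged.get(name, False) or value
                else:
                    merged[name] = merged.get(name, 0) + value
        return merged
```

For example, two parts with {"img_count": 2, "has_form": False} and
{"img_count": 1, "has_form": True} would roll up to a per-message hash
of {"img_count": 3, "has_form": True}, while eval tests that care about
a single part can still read that part's hash directly.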

-- 
Daniel Quinlan                     anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/    and open source consulting
