[EMAIL PROTECTED] (Justin Mason) writes:
> - language detection is called here (if "ok_languages" != "all") and the
> language token is added as a metadatum called "X-Language". (TODO:
> this should be conditional, because language rec is a slow process,

Language detection will be faster if and when we add the XS implementation.

> but is ("ok_languages" != "all") the right way to enable it?)

Yes, but someone might want to set "ok_languages all" and still send the
detected languages to Bayes.

I'd suggest we add a new option, "use_language_detection", to
enable/disable the test for 3.0.
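
As a rough illustration of that separation, here is a minimal Python
sketch. It is not SpamAssassin code. Only the option names
"use_language_detection" and "ok_languages" and the "X-Language"
metadatum come from the discussion above. The function names, the stub
detector, and the choice to let a message pass when nothing was detected
are assumptions made for the example.

```python
# Illustrative sketch only; not SpamAssassin code.

def detect_languages(text):
    """Hypothetical stand-in for the (slow) language detector."""
    return ["en"]


def extract_metadata(text, conf):
    """Build message metadata, running detection only when it is enabled."""
    metadata = {}
    # Proposed switch: independent of "ok_languages", so detection can
    # feed Bayes even when "ok_languages" is set to "all".
    if conf.get("use_language_detection", False):
        metadata["X-Language"] = " ".join(detect_languages(text))
    return metadata


def language_is_ok(metadata, conf):
    """Language test: consults "ok_languages" only, never the switch."""
    ok = conf.get("ok_languages", "all").split()
    if "all" in ok:
        return True
    detected = metadata.get("X-Language", "").split()
    if not detected:
        # Assumption: with no detection result there is nothing to judge.
        return True
    return any(lang in ok for lang in detected)
```

With "use_language_detection" on and "ok_languages all", the metadatum
is still produced for Bayes while the language test never fires.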
> - In addition, the MsgMetadata class holds some parsing/rendering code; it
> calls the HTML renderer and holds the HTML features hash, and also now
> holds the functions that make the "decoded"/"rendered" text arrays.
>
> Note that there's an open question as to whether rendered data, and
> features discovered during that rendering, are really "metadata". These
> may be more appropriate to put in another class, either in the root
> MsgContainer or another class off that.

I don't really care, but the data is just used for eval tests.

There may be some utility to attaching some metadata on a per-HTML-part
basis, but the code isn't really consistent on that point (per-HTML-part
or per-message), because we used to render everything as one big blob if
we detected HTML somewhere in the message, and now we render HTML parts
precisely.

A reasonable guess is that some HTML metadata will want to remain
per-HTML-part and some will be per-message, but I'm not really sure at
this point.
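
To make the per-HTML-part versus per-message distinction concrete, here
is a small Python sketch. It is not the MsgMetadata or MsgContainer
design. The class names, field names, and the "tags" feature are all
hypothetical, chosen only to show how both levels could coexist.

```python
# Illustrative sketch only; class and field names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class HtmlPart:
    rendered: str
    # Features discovered while rendering this one HTML part.
    features: dict = field(default_factory=dict)


@dataclass
class Message:
    parts: list = field(default_factory=list)
    # Metadata that describes the message as a whole.
    metadata: dict = field(default_factory=dict)

    def html_feature_total(self, name):
        """Roll a numeric per-part feature up to a per-message value."""
        return sum(part.features.get(name, 0) for part in self.parts)
```

An eval test that needs a per-message figure can aggregate over the
parts, while a test that cares about a single part reads that part's
own features.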
--
Daniel Quinlan anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/ and open source consulting