Instead of this, I'd prefer a "bayes_hash_tokens" option (default = on)
that affects all storage methods:

 * on: hash tokens
 * off: no hashing or truncation of tokens

Simple and easy to understand.  It would probably require hooks in
some storage methods to change the format of the tokens, maybe a per-DB
format flag.  The code would not need to (and probably should not)
support both formats at once.
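
To make the idea concrete, here is a rough sketch in Python (not
SpamAssassin code).  Everything in it is hypothetical: the names
encode_token and TokenDB, the in-memory dict standing in for a storage
method, and the choice of a 5-byte truncated SHA-1 as the hash.  It only
illustrates one switch controlling the token format, plus a per-DB
format flag so that a database is never read in the wrong format.

```python
import hashlib

# Hypothetical per-DB format flag values.
HASHED = "hashed"
RAW = "raw"


def encode_token(token, hash_tokens=True):
    """Return the key under which a token would be stored.

    on:  hash the token (truncated SHA-1, an arbitrary choice here)
    off: no hashing or truncation, the token bytes are stored as-is
    """
    raw = token.encode("utf-8")
    if not hash_tokens:
        return raw
    return hashlib.sha1(raw).digest()[:5]


class TokenDB:
    """Toy stand-in for a storage method; a dict replaces the real backend."""

    def __init__(self, hash_tokens=True):
        # The format is fixed when the DB is created and recorded with it.
        self.token_format = HASHED if hash_tokens else RAW
        self.counts = {}

    def check_format(self, hash_tokens):
        """Refuse to use a DB whose format differs from the configured option."""
        wanted = HASHED if hash_tokens else RAW
        if wanted != self.token_format:
            raise ValueError(
                "DB format is %r but option asks for %r"
                % (self.token_format, wanted)
            )

    def add(self, token):
        """Count one occurrence of a token and return the key used."""
        key = encode_token(token, self.token_format == HASHED)
        self.counts[key] = self.counts.get(key, 0) + 1
        return key
```

With this shape, each storage method only ever sees one format per
database, and a mismatch between the option and the stored flag is an
error rather than something the code tries to handle transparently.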

-- 
Daniel Quinlan                     anti-spam (SpamAssassin), Linux,
http://www.pathname.com/~quinlan/    and open source consulting
