Giampaolo Tomassoni wrote:
> Dears,
>
> actually, I see the Bayes database in SA can be either per-user or
> system-wide.
>
> I would like to have a way to put bayes tokens on a per-user basis,
> and fetch them on a more system-wide (or perhaps domain-wide) way.
>
Without going much further, you can fake domain-level bayes if you
want. You just have to set bayes_sql_override_username in domain-based
SQL user_prefs.

> My intention is to have each user's bayes to contribute to scoring
> every other user's incoming mail, while still let each user's db be
> prominent in scoring mails delivered to the user's mailbox.
>
> To accomplish this, I would probably need to write my own version of
> the BayesStore::(My|Pg)SQL, but I'm facing a problem: how can I get
> the message's destinators from a subclass of BayesStore::SQL? I see a
> $self::_userid defined, but it seems to be meant to store the
> username used to access the db (and it is a scalar, not an array or
> something like that which may be needed if the message is targeted to
> multiple destinators).

It's not the username used to connect to the database; it's the
username that will be used for all lookups in the database. It's
private and calculated from the username variable in the main
SpamAssassin object.

What you're asking for is data that is not user specific, so you would
have to obtain it elsewhere. SpamAssassin has no knowledge of who the
recipients of a message are. The best you could do is parse the message
itself and look at To/Cc, but we all know that won't really work.

The way you want to gather data is a real departure from how things are
done now; you would most likely have to throw out just about everything
and replace it with a modified version.

>
> Also, I have a question which is loosely related to this. Why tokens
> get hashed before storing to/retrieving by the db? Wasn't it better
> to have them in clear, just to allow, in example, an easy
> identification of the "really spam words" which could be used to
> build rules further penalizing spam messages?

Sidney touched briefly on this. I'll add a little more. Indeed, it's
all about speed and performance. Lots of discussion happened around
this when I made the change.
I tried to keep the option of allowing clear text as well but, and my
memory might be failing me here, allowing that option cost about a 12%
drop in performance. The compromise was several plugin hooks that let
you build a separate database of the clear-text tokens. I believe I
posted a proof-of-concept plugin at the time to show that it would
work.

Also, in PostgreSQL the token column is BYTEA because the hashed token
is binary data; with any other type you get token corruption.

Hope that helps.

Michael

>
> Thanks,
>
> -----------------------------------
> Giampaolo Tomassoni - IT Consultant
> Piazza VIII Aprile 1948, 4
> I-53044 Chiusi (SI) - Italy
> Ph: +39-0578-21100
>
> NEVER send an e-mail to:
> [EMAIL PROTECTED]
>
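
For illustration only, the hashing and the separate clear-text token
database discussed above can be sketched in Python. This is a rough
sketch, not SpamAssassin's code: the function names are hypothetical,
and the digest scheme (the last 5 bytes of a SHA-1 digest) is an
assumption about how tokens are shortened before storage.

```python
import hashlib


def hash_token(token: str) -> bytes:
    """Return a short binary digest for a token.

    Assumption: the stored form is the last 5 bytes of SHA-1.
    """
    return hashlib.sha1(token.encode("utf-8")).digest()[-5:]


def build_cleartext_index(tokens):
    """Map hashed token -> clear-text token (the 'separate database' idea)."""
    return {hash_token(t): t for t in tokens}


# Hypothetical sample tokens, not taken from any real Bayes database.
index = build_cleartext_index(["viagra", "mortgage", "meeting"])

hashed = hash_token("viagra")
# The digest is raw bytes, not text, which is why a binary column type
# such as BYTEA is the natural fit for storing it.
print(len(hashed), index[hashed])  # prints "5 viagra"
```

Fixed-width binary keys are cheaper to index and compare than
variable-length words, and a side index like the one above is what you
would consult to map a high-scoring hash back to a readable token.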
