Giampaolo Tomassoni wrote, On 14/12/06 12:35 AM:
> Also, I have a question which is loosely related to this. Why do tokens
> get hashed before being stored to / retrieved from the db?
I'm not qualified to answer your first questions, but I can deal with this one.

When tokens were stored as plain text and we made the decision to change that, the average size of a token was 12 bytes. We now use a 40-bit hash of the token, stored as a CHAR(5) field in the database, which takes up much less space than a variable-length field averaging 12 bytes. The database is much smaller and access is faster, but the tradeoff is that we can no longer dump the Bayes token database in plain text.

-- sidney
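
[Editorial sketch, for illustration only: the idea being described is reducing a variable-length token string to a fixed 5-byte (40-bit) value that fits a CHAR(5) column. The example below is Python rather than SpamAssassin's Perl, and the choice of SHA-1 and of which five bytes to keep are assumptions made for the sketch, not necessarily the exact scheme used in the code.]

    import hashlib

    def token_hash_40bit(token: str) -> bytes:
        """Map a variable-length token to a fixed 5-byte (40-bit) key.

        Truncating a cryptographic digest keeps the key small and uniform;
        collisions are possible but rare enough for Bayes statistics.
        """
        digest = hashlib.sha1(token.encode("utf-8")).digest()  # 20 bytes
        return digest[:5]  # keep only 40 bits for the CHAR(5) column

    if __name__ == "__main__":
        # Tokens of very different lengths all map to 5-byte keys.
        for tok in ("viagra", "meeting", "somereallylongheadertoken"):
            print(tok, "->", token_hash_40bit(tok).hex())

The point of the sketch is the storage tradeoff sidney describes: every key is exactly 5 bytes regardless of the original token length, but the original text cannot be recovered from the stored hash.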
