> -----Original Message-----
> From: Federico Giannici [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 15 November 2006 10:31
> To: users@spamassassin.apache.org
> Subject: Bayes column 'token'
>
> Last week we migrated our bayes DB from DBM to MySQL.
> Now we have upgraded our MySQL server from version 4.0 to 4.1.
>
> Today I found a couple of duplicate index values in the
> "token" column of "bayes_token" table.
>
> This field is defined as char(5) with default collation
> (that is "latin1_swedish_ci"). Is it the correct one?

Well, bayes_mysql.sql does not specify a collation; so, like you said,
the column gets your MySQL server's default collation. And string
comparisons under that default ("latin1_swedish_ci") are
case-insensitive, so two tokens that differ only in case compare as
equal. It might indeed be a good idea to convert the column to
"latin1_bin" or some such.

There is, btw, now that I look at it, a small bug in:

    CREATE TABLE bayes_token (
      id int(11) NOT NULL default '0',
      token char(5) NOT NULL default '',
      spam_count int(11) NOT NULL default '0',
      ham_count int(11) NOT NULL default '0',
      atime int(11) NOT NULL default '0',
      PRIMARY KEY (id, token),
      INDEX bayes_token_idx1 (token),
      INDEX bayes_token_idx2 (id, atime)
    ) TYPE=MyISAM;

With a PRIMARY KEY on `id` and `token`, there should not be INDEXes on
`id` and `token` added, too.

- Mark
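
For reference, a minimal sketch of the "latin1_bin" conversion suggested in the reply. It assumes MySQL 4.1 or later and the stock `bayes_token` definition quoted there, and it has not been tested against a live Bayes database, so take a backup first.

```sql
-- Assumption: MySQL 4.1+ and the column definition from bayes_mysql.sql.
-- Inspect the collation currently in effect for each column.
SHOW FULL COLUMNS FROM bayes_token;

-- Switch `token` to a binary collation so comparisons are byte-wise
-- (case-sensitive) rather than case-insensitive.
ALTER TABLE bayes_token
  MODIFY token char(5) CHARACTER SET latin1 COLLATE latin1_bin
  NOT NULL default '';
```

Changing the collation only affects how tokens are compared from then on; it would not restore any token data that was already stored incorrectly under the case-insensitive collation.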