> -----Original Message----- > From: Mark [mailto:[EMAIL PROTECTED] > Sent: woensdag 15 november 2006 18:15 > To: 'users@spamassassin.apache.org' > Subject: RE: Bayes column 'token' > > > > Well, bayes_mysql.sql does not specify collation; so, like > > > you said, the collation will be your MySQL server-set default. And > > > searches in MySQL are case-insensitive by default. Might indeed > > > perhaps be a good idea to convert to "latin1_bin" or some such. > > > > There will be any problem if I convert the current data to the new > > collation? > > I see no indication (or reason) in the code that tokens are > to be handled in an case-insensitive manner. The opposite, ere. > So, I'm inclined to say that "latin1_bin" collation is better. > I don't wanna be responsible for messing up your database, though. :) > So I will test this a bit on my Vmware box.
Did the testing; and it works very smooth with latin1_bin. > > > PRIMARY for `id` and `token` should not have INDEX for `id` > > > and `token` added, too. > > > > I don't understand what you mean. > > The couple (id, token) is PRIMARY, not INDEX... > > Where exactly is the problem? > > PRIMARY, like UNIQUE, always implies INDEX, too. So, adding > an extra INDEX for `id` and `token` basically gives you a double > INDEX for them. There's a double INDEX for `atime` too. So, I'd say, in bayes_mysql.sql, replace this: CREATE TABLE bayes_token ( id int(11) NOT NULL default '0', token char(5) NOT NULL default '', spam_count int(11) NOT NULL default '0', ham_count int(11) NOT NULL default '0', atime int(11) NOT NULL default '0', PRIMARY KEY (id, token), INDEX bayes_token_idx1 (token), INDEX bayes_token_idx2 (id, atime) ) TYPE=MyISAM; With: CREATE TABLE bayes_token ( id int(11) NOT NULL default '0', token char(5) COLLATE latin1_bin NOT NULL default '', spam_count int(11) NOT NULL default '0', ham_count int(11) NOT NULL default '0', atime int(11) NOT NULL default '0', PRIMARY KEY (id, token), INDEX bayes_token_idx1 (atime) ) TYPE=MyISAM; - Mark