Larry M. Rosenbaum wrote on Mon, 06 Oct 2008 15:42:53 -0400:

> So I copied
> the database to a non-production MySQL server and tried to convert
> it there. It has taken 4 days to convert! I'm thinking something
> must be wrong.

Yes, converting a database with 100 million records will take that long or longer.

> So the config file says 500 thousand tokens, but the database has
> 105 million entries. Have I misunderstood something, or is expiry
> not working correctly?

Maybe. Check the bayes_vars table for the token count and then check how many tokens the database actually contains. The expiry code just takes the token count from bayes_vars and doesn't check the real record count of bayes_token. So if there's a mismatch, things like this can happen. For me it happened the other way around: after converting to SQL I removed all entries older than a year and then ran expiry without changing the token count value in bayes_vars. Because it thought I still had several million tokens, it slashed almost the complete database and I had to import everything again.

BTW: I'm not seeing output like this when I do an expire:

token frequency: 1-occurrence tokens: 0.13%
token frequency: less than 8 occurrences: 0.06%

Kai

-- 
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
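
The check described above (compare the token count recorded in bayes_vars with the real number of rows in bayes_token) could be sketched roughly like this. It is only a minimal Python sketch: it uses an in-memory SQLite database as a stand-in for the MySQL server, it assumes the stock SpamAssassin SQL layout (bayes_vars with id and token_count columns, bayes_token with one row per token keyed by the same id), and the function names and sample data are made up for illustration.

```python
import sqlite3


def token_count_mismatch(conn, user_id):
    """Return (recorded, actual) token counts for one Bayes user id.

    recorded: the token_count value stored in bayes_vars
    actual:   the number of rows really present in bayes_token
    """
    cur = conn.cursor()
    # SQLite uses "?" placeholders; MySQL drivers typically use "%s".
    cur.execute("SELECT token_count FROM bayes_vars WHERE id = ?", (user_id,))
    row = cur.fetchone()
    if row is None:
        raise LookupError("no bayes_vars row for id %r" % (user_id,))
    recorded = row[0]

    cur.execute("SELECT COUNT(*) FROM bayes_token WHERE id = ?", (user_id,))
    actual = cur.fetchone()[0]
    return recorded, actual


def make_demo_db():
    """Build a tiny stand-in database (hypothetical data, reduced schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE bayes_vars (
            id INTEGER PRIMARY KEY,
            username TEXT,
            token_count INTEGER
        );
        CREATE TABLE bayes_token (
            id INTEGER,
            token TEXT,
            atime INTEGER,
            PRIMARY KEY (id, token)
        );
        """
    )
    # bayes_vars claims 5 tokens, but only 3 rows exist in bayes_token.
    conn.execute("INSERT INTO bayes_vars VALUES (1, 'demo', 5)")
    conn.executemany(
        "INSERT INTO bayes_token VALUES (1, ?, 0)",
        [("tok%d" % i,) for i in range(3)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    recorded, actual = token_count_mismatch(conn, 1)
    if recorded != actual:
        print("mismatch: bayes_vars says %d, bayes_token has %d" % (recorded, actual))
    else:
        print("counts agree: %d" % recorded)
```

Against a real server the same two SELECT statements can simply be typed into the mysql client. If the two numbers differ, the value in bayes_vars is the one the expiry code will trust, so it is worth correcting before the next expire run.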