Hi list I have 5 tables:
words (word_id int auto_increment, word varbinary(40)) (has ~3.5M rows) with the keys: PRIMARY KEY (`word_id`),UNIQUE KEY `word_ind` (`word`) phrases (phrase_id int auto_increment, phrase varbinary(100)) (has ~11M rows) with the keys: PRIMARY KEY (`phrase_id`),UNIQUE KEY `phrase_ind` (`phrase`) phrase_words (phrase_id, word_id) (has ~31M rows) with: UNIQUE KEY `phrase_ind` (`phrase_id`,`word_id`) KEY `word` (`word_id`), KEY `phrase` (`phrase_id`) word_searches (word_id, search_date date, search hour char(2), amount smallint, type char(8), location char(2)) with: KEY `word_search` (`word_id`), KEY `id_search` (`search_date`), KEY `word_date` (`word_id`,`search_date`) (and a similar for phrase_searches, these two tables are merge tables with one table for each month, each table having 15-30M rows) phrases are built of "words" identified by phrase_words (these are not human language words and phrases but rather random bytes where some are human readable). Now, I'm trying to find out "how many times has word 1..n been searched for and how many times has phrases containing 1..n been searched for?" These queries take a really long time to execute, first I select for the words: explain sELECT w.word as word, w.word_id as word_id, sum(ws.amount) as amount FROM words w, word_searches ws WHERE ws.word_id=w.word_id AND w.word IN ("ppppp", "xyz", "zzz", "abc") AND ws.search_date >= '2006-07-17' AND ws.search_date <= '2006-08-16' group by ws.word_id; +----+-------------+-------+-------+---------------------------------+-------------+---------+----------------------+------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------------------------+-------------+---------+----------------------+------+----------------------------------------------+ | 1 | SIMPLE | w | range | PRIMARY,word_ind | word_ind | 42 | NULL | 4 | Using where; Using temporary; Using filesort | | 1 | SIMPLE | ws | ref | word_search,id_search,word_date | word_search | 4 | statistics.w.word_id | 15 | Using where | +----+-------------+-------+-------+---------------------------------+-------------+---------+----------------------+------+----------------------------------------------+ and then for phrases: explain SELECT pw.word_id as word_id, sum(ps.amount) as amount FROM phrase_words pw, phrase_searches ps WHERE pw.word_id IN (966,1,1250,1741) AND pw.phrase_id = ps.phrase_id AND ps.search_date >= '2006-07-17' AND ps.search_date <= '2006-08-16' GROUP by pw.word_id; +----+-------------+-------+-------+-------------------------------------+---------------+---------+-------------------------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+-------------------------------------+---------------+---------+-------------------------+--------+-------------+ | 1 | SIMPLE | pw | range | phrase_ind,word,phrase | word | 4 | NULL | 226847 | Using where | | 1 | SIMPLE | ps | ref | phrase_search,id_search,phrase_date | phrase_search | 4 | statistics.pw.phrase_id | 15 | Using where | +----+-------------+-------+-------+-------------------------------------+---------------+---------+-------------------------+--------+-------------+ The queries takes 40s-several minutes on a dual xeon 3GHz with 4GB ram only running Msql. Can someone see something I've done wrong? I have the same data in flat files with one word and phrase on each row and one file for each day and doing grep/sort/uniq -c in all thoose files is quicker on a slower server with a lot of other procesess and with the files nfs mounted. mysqladmin status doesn't show any slow queries: Uptime: 1215323 Threads: 2 Questions: 2191970 Slow queries: 0 Opens: 0 Flush tables: 1 Open tables: 64 Queries per second avg: 1.804 Thanks in advance /Jon -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]