Hum,
Well, I'm back with another one... When adding a
join to the previous query, it sloows down once again
even though it retrieves less datat. Here's the info :
mysql> explain SELECT ti.posi, ti.docid, d.filename,
ti.id, c.name
FROM corpus_documents cd, corpus c, documents d,
tokens_ins ti, tokens t, tokens t2, tokens_ins ti2
WHERE c.corpusid=4
AND cd.corpusid=c.corpusid
AND cd.docid=d.docid
AND t.docid=d.docid
AND ti.id=t.id
AND ti2.id=t2.id
AND t.docid=ti.docid
AND t2.docid=ti2.docid
AND t.docid=t2.docid
AND t.word='faire'
AND t2.word='pr�sence'
AND ti2.posi>=ti.posi-5
AND ti2.posi<=ti.posi+5;
+-------+------+--------------------------------------------------------------+-------------------+---------+---------------+------+--------------------------+
| table | type | possible_keys
| key | key_len |
ref | rows | Extra |
+-------+------+--------------------------------------------------------------+-------------------+---------+---------------+------+--------------------------+
| c | ref | PRIMARY,corpus_corpusid
| PRIMARY | 3 |
const | 1 | Using where; Using index |
| cd | ref | PRIMARY
| PRIMARY | 3 |
const | 2 | Using where; Using index |
| d | ref | PRIMARY,documents_docid
| PRIMARY | 3 |
cd.docid | 3 | Using index |
| t | ref |
PRIMARY,tokens_id,tokens_docid,tokens_word,tokens_word_docid
| tokens_word_docid | 33 | const,d.docid | 1 |
Using where |
| t2 | ref |
PRIMARY,tokens_id,tokens_docid,tokens_word,tokens_word_docid
| tokens_word_docid | 33 | const,t.docid | 1 |
Using where |
| ti | ref |
PRIMARY,tokens_ins_id,tokens_ins_docid
| tokens_ins_id | 4 | t.id |
96 | Using where |
| ti2 | ref |
PRIMARY,tokens_ins_id,tokens_ins_posi,tokens_ins_docid
| tokens_ins_id | 4 | t2.id |
96 | Using where |
+-------+------+--------------------------------------------------------------+-------------------+---------+---------------+------+--------------------------+
7 rows in set (0.00 sec)
Seems OK to me. But here's the query once run:
mysql> SELECT ti.posi, ti.docid, d.filename, ti.id,
c.name FROM corpus_documents cd, corpus c, documents
d, tokens_ins ti, tokens t, tokens t2, tokens_ins ti2
WHERE c.corpusid=4 AND cd.corpusid=c.corpusid AND
cd.docid=d.docid AND t.docid=d.docid AND ti.id=t.id
AND ti2.id=t2.id AND t.docid=ti.docid AND
t2.docid=ti2.docid AND t.docid=t2.docid AND
t.word='faire' AND t2.word='pr�sence' AND
ti2.posi>=ti.posi-5 AND ti2.posi<=ti.posi+5;
+----------+-------+-----------------+------+---------------+
| posi | docid | filename | id | name
|
+----------+-------+-----------------+------+---------------+
| 2798734 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 4036372 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 4324477 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 5167534 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 5535744 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 6796883 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 7484157 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 9410455 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 10251037 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 11391576 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 11656957 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 12069808 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 15313652 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 15663061 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 18024245 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 18341956 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 20943027 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
| 21282138 | 366 | LeMonde2002.tag | 1110 | Le Monde
2002 |
+----------+-------+-----------------+------+---------------+
18 rows in set (2 min 53.60 sec)
Victor, you were asking about cardinality, here's the
output of the indexes. I'm not sure I understand what
cardinality is about and how it would interfere:
mysql> show index from tokens;
+--------+------------+-------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name |
Seq_in_index | Column_name | Collation | Cardinality |
Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+-------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| tokens | 0 | PRIMARY |
1 | word | A | 249541 | NULL |
NULL | | BTREE | |
| tokens | 0 | PRIMARY |
2 | pos | A | 374312 | NULL |
NULL | | BTREE | |
| tokens | 0 | PRIMARY |
3 | docid | A | 748624 | NULL |
NULL | | BTREE | |
| tokens | 1 | tokens_id |
1 | id | A | 249541 | NULL |
NULL | | BTREE | |
| tokens | 1 | tokens_docid |
1 | docid | A | 246 | NULL |
NULL | | BTREE | |
| tokens | 1 | tokens_word |
1 | word | A | 249541 | NULL |
NULL | | BTREE | |
| tokens | 1 | tokens_asbfreq |
1 | absfreq | A | 2478 | NULL |
NULL | | BTREE | |
| tokens | 1 | tokens_word_docid |
1 | word | A | 249541 | NULL |
NULL | | BTREE | |
| tokens | 1 | tokens_word_docid |
2 | docid | A | 748624 | NULL |
NULL | | BTREE | |
+--------+------------+-------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
9 rows in set (0.00 sec)
mysql> show index from tokens_ins;
+------------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name |
Seq_in_index | Column_name | Collation | Cardinality |
Sub_part | Packed | Null | Index_type | Comment |
+------------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| tokens_ins | 0 | PRIMARY |
1 | docid | A | 246 |
NULL | NULL | | BTREE | |
| tokens_ins | 0 | PRIMARY |
2 | posi | A | 24935821 |
NULL | NULL | | BTREE | |
| tokens_ins | 1 | tokens_ins_id |
1 | id | A | 259748 |
NULL | NULL | | BTREE | |
| tokens_ins | 1 | tokens_ins_posi |
1 | posi | A | 24935821 |
NULL | NULL | | BTREE | |
| tokens_ins | 1 | tokens_ins_docid |
1 | docid | A | 246 |
NULL | NULL | | BTREE | |
+------------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
5 rows in set (0.00 sec)
Any help would be more than appreciated.
__________________________________
Do you Yahoo!?
Read only the mail you want - Yahoo! Mail SpamGuard.
http://promotions.yahoo.com/new_mail
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]