Hi, I am working with 5.0.3 snapshot on x86_64. I am now experiencing with features of MySQL, so please ignore that in my example I use self join . I know that this particular example could be done much easier but eventually I am going to join multiple tables with similar structure (one preprocessing table and one final table). My query is:
mysql> insert into log select t1.uniprot_primary_ac from uniprot_attributes as t1 left join uniprot_attributes as t2 on (t1.attribute_name=t2.attribute_name and t1.attribute_value=t2.attribute_value) where t2.uniprot_primary_ac="P15017"; Query OK, 748 rows affected (16.31 sec) Records: 748 Duplicates: 0 Warnings: 0 mysql> explain select t1.uniprot_primary_ac from uniprot_attributes as t1 left join uniprot_attributes as t2 on (t1.attribute_name=t2.attribute_name and t1.attribute_value=t2.attribute_value) where t2.uniprot_primary_ac="P15017"; -> ? +----+-------------+-------+------ +----------------------------------------------+----------+--------- +------------------------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+------ +----------------------------------------------+----------+--------- +------------------------+-------+-------------+ | 1 | SIMPLE | t2 | ref | up_att_1,up_att_3,up_att_4,up_att_5,up_att_2 | up_att_3 | 23 | const | 9 | Using where | | 1 | SIMPLE | t1 | ref | up_att_4,up_att_5,up_att_2 | up_att_4 | 8 | test.t2.attribute_name | 35655 | Using where | +----+-------------+-------+------ +----------------------------------------------+----------+--------- +------------------------+-------+-------------+ Why does it take 16 seconds to join 9 rows to 35655 rows on a x86_64 machine with 2GB RAM nearly all processor capacity available? Why is it using where, when there are indexes for all possible combinations? I indexed the table almost all possible ways to find a way to optimize the query but no help: mysql> show index from uniprot_attributes; -> ? +--------------------+------------+----------+-------------- +--------------------+-----------+-------------+----------+-------- +------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +--------------------+------------+----------+-------------- +--------------------+-----------+-------------+----------+-------- +------+------------+---------+ | uniprot_attributes | 1 | up_att_1 | 1 | uniprot_primary_ac | A | 170522 | NULL | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_1 | 2 | attribute_name | A | 1961014 | 4 | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_3 | 1 | uniprot_primary_ac | A | 170522 | NULL | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_3 | 2 | attribute_name | A | 1961014 | 5 | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_3 | 3 | attribute_value | A | 3922028 | 7 | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_4 | 1 | attribute_name | A | 109 | 5 | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_4 | 2 | attribute_value | A | 980507 | 7 | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_4 | 3 | uniprot_primary_ac | A | 3922028 | NULL | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_5 | 1 | attribute_value | A | 980507 | 7 | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_5 | 2 | attribute_name | A | 980507 | 5 | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_5 | 3 | uniprot_primary_ac | A | 3922028 | NULL | NULL | YES | BTREE | | | uniprot_attributes | 1 | up_att_2 | 1 | attribute_value | NULL | 1 | NULL | NULL | YES | FULLTEXT | | +--------------------+------------+----------+-------------- +--------------------+-----------+-------------+----------+-------- +------+------------+---------+ 12 rows in set (0.01 sec) Moreover I created a stored procedure that does exactly the same as the select statement above: mysql> create procedure test_performance(IN uniprot_id VARCHAR(20) ) -> begin -> declare v_attribute_name VARCHAR(20); -> declare v_attribute_value TEXT; -> declare done int default 0; -> declare cur1 cursor for select attribute_name, attribute_value from uniprot_attributes where uniprot_primary_ac=uniprot_id; -> declare continue handler for sqlstate '02000' set done=1; -> open cur1; -> mainloop: repeat -> fetch cur1 into v_attribute_name, v_attribute_value; -> if done=0 then -> insert into log select distinct uniprot_primary_ac from uniprot_attributes where attribute_name=v_attribute_name and attribute_value=v_attribute_value; -> end if; -> until done end repeat mainloop; -> close cur1; -> end -> ? Query OK, 0 rows affected (0.00 sec) mysql> call test_performance("P15017");? Query OK, 1 row affected (0.13 sec) The stored procedure does exactly the same as the query but 120x faster!!!! How is it possible? Is there a workaround to optimize the query? Please dont come with "use not null columns, drop redundant indexes" because it doesnt address the core of the problem. Why is not the select statement as efficient as the stored procedure? It is important because I dont want to rely on stored procedures too much because I found there is a memory bottleneck problem with SPs ( see "Are stored procedures merely code generators?" posting), if there are too many action statements, the stored procedures rapidly uses up all memory without doing anything. Thanks, -- Bereczki Gabor <[EMAIL PROTECTED]> -- MySQL General Mailing List For list archives: http://lists.mysql.com/mysql To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]