Hi,
I'm running MySQL 5.0.51a-24+lenny3-log on a Debian Lenny box with
2Gb of RAM and a pair of dual core 2.6Ghz CPUs. I'm using a pair of
40Gb disks mirrored using MD (Linux software RAID) for both the OS and
the storage of the mysql DBs.
My problem child is doing this simple query:
mysql> select distinct Call_No from Newspaper_Issues
mysql> WHERE BIB_ID = 464;
+----------+
| Call_No |
+----------+
| News |
| NewsD CT |
+----------+
2 rows in set (2.98 sec)
The Newspaper_Issues table has 1.3 million rows, and has a structure
like this:
mysql> describe Newspaper_Issues;
+----------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+----------------+
| Record_No | int(11) | NO | PRI | NULL | auto_increment |
| BIB_ID | varchar(38) | NO | MUL | NULL | |
| Issue_Date | date | NO | MUL | NULL | |
| Type_Code | char(1) | NO | | r | |
| Condition_Code | char(1) | NO | | o | |
| Format_Code | char(1) | NO | | p | |
| Edition_Code | char(1) | NO | | n | |
| Date_Type_Code | char(1) | NO | | n | |
| Ed_Type | tinyint(1) | NO | | 1 | |
| RS_Code | char(1) | NO | | c | |
| Call_No | varchar(36) | YES | MUL | NULL | |
| Printed_Date | varchar(10) | YES | | NULL | |
| Update_Date | date | NO | | NULL | |
+----------------+-------------+------+-----+---------+----------------+
13 rows in set (0.00 sec)
I've tried adding various indexes, and reading up on howto optimize
DISTINCT or GROUP BY queries, but I'm hitting a wall here. My current indexes
are:
mysql> show index from Newspaper_Issues;
+------------------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index |
Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type |
Comment |
+------------------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Newspaper_Issues | 0 | PRIMARY | 1 |
Record_No | A | 1333298 | NULL | NULL | | BTREE |
|
| Newspaper_Issues | 1 | BIB_ID | 1 | BIB_ID
| A | 14980 | NULL | NULL | | BTREE |
|
| Newspaper_Issues | 1 | Call_No | 1 | Call_No
| A | 927 | NULL | NULL | YES | BTREE |
|
| Newspaper_Issues | 1 | Issue_Date | 1 |
Issue_Date | A | 49381 | NULL | NULL | | BTREE |
|
| Newspaper_Issues | 1 | BIB_ID_Issue_Date | 1 | BIB_ID
| A | 14980 | NULL | NULL | | BTREE |
|
| Newspaper_Issues | 1 | BIB_ID_Issue_Date | 2 |
Issue_Date | A | 1333298 | NULL | NULL | | BTREE |
|
| Newspaper_Issues | 1 | call_no_short | 1 | Call_No
| A | 30 | 6 | NULL | YES | BTREE |
|
| Newspaper_Issues | 1 | BIB_ID_call_no_short | 1 | BIB_ID
| A | 14980 | NULL | NULL | | BTREE |
|
| Newspaper_Issues | 1 | BIB_ID_call_no_short | 2 | Call_No
| A | 15503 | NULL | NULL | YES | BTREE |
|
| Newspaper_Issues | 1 | call_no_bib_id | 1 | Call_No
| A | 927 | NULL | NULL | YES | BTREE |
|
| Newspaper_Issues | 1 | call_no_bib_id | 2 | BIB_ID
| A | 15503 | NULL | NULL | | BTREE |
|
+------------------+------------+----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
11 rows in set (0.00 sec)
So now when I do an explain on my query I get:
mysql> explain select distinct(Call_No) from Newspaper_Issues WHERE BIB_ID
= 464;
+----+-------------+------------------+-------+-----------------------------------------------+----------------+---------+------+---------+--------------------------+
| id | select_type | table | type | possible_keys
| key | key_len | ref | rows | Extra
|
+----+-------------+------------------+-------+-----------------------------------------------+----------------+---------+------+---------+--------------------------+
| 1 | SIMPLE | Newspaper_Issues | index |
BIB_ID,BIB_ID_Issue_Date,BIB_ID_call_no_short | call_no_bib_id | 227 | NULL
| 1333298 | Using where; Using index |
+----+-------------+------------------+-------+-----------------------------------------------+----------------+---------+------+---------+--------------------------+
1 row in set (0.00 sec)
Interestingly enough, I get much better performance if I just drop the WHERE
clause, but that doesn't help me get what I want though. *grin*
mysql> explain select distinct(Call_No) from Newspaper_Issues;
+----+-------------+------------------+-------+---------------+---------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key |
key_len | ref | rows | Extra |
+----+-------------+------------------+-------+---------------+---------+---------+------+------+--------------------------+
| 1 | SIMPLE | Newspaper_Issues | range | NULL | Call_No |
111 | NULL | 928 | Using index for group-by |
+----+-------------+------------------+-------+---------------+---------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
Would it make sense to split the Call_No data off into it's own table, and put
in a proper numeric ID, instead of the current VARCHAR(36) it uses? So in
Newspaper_Issues I'd just have a Call_No_ID and a new Call_No table with:
Call_No_ID INT
Call_No Char(36)
which would hopefully index better? I only have 928 distinct Call_No strings,
so I'm not afraid of wasting space or anything.
Thanks,
John
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[email protected]