My name is Konrad Lorincz, a graduate student in computer science at Harvard
University. I am doing a class project in which I am comparing the
performance of queries between MySQL, grep, and FlatSQL (an SQL like
language
that we implemented using awk and perl, that performs queries on flat
files). I am writing because I discovered a strange behavior in MySQL when
performing joins involving three columns over non-indexed data.
When the number of tuples is lager than a certain size, it take a very long
time (at first I thought it entered an infinite loop) for operation to
complete. Our implementation FlatSQL on the other time completes in a
reasonable time.
I wonder if MySQL does something funny like try to build an index over the
non-indexed data that it's causing it to be so slow. Please let me know if
you have any ideas of what's causing this behavior and if there is a way
around it. Below, are some additional details
Here is a sample of the date. It's a collection of NFS traces from the file
system. The second column (time) is indexed in MySQL. Each line is of the
form:
Op | Time | XID | ClientID | FH | Offset | Size
write 1003809600.416227 29caeba9 8cf7216b.03ce 1 229376 8192
write 1003809600.417535 2bcaeba9 8cf7216b.03ce 1 237568 1142
read 1003809601.072451 22897eaa 8cf7216c.03e8 2 0 8192
and the query is
SELECT a.XID, a.ClientID, b.Offset, c.Size
FROM 2_10__3_4 AS a, 2_10__3_6 AS b, 2_10__3_7 AS c
WHERE a.XID=b.XID AND a.XID=c.XID;
where 2_10__3_4 is a table with 2^10=1024 tuples (lines) of data having
columns 3 and 4
*** MySQL timing ***
NbrLinesInTable | Time to complete Joins (sec)
2_06 lines 0:00.08 sec
2_07 lines 0:00.12 sec
2_08 lines 0:00.36 sec
2_09 lines 2:26.41 sec
2_10 lines 20:16.59 sec
*** FlatSQL ***
2_06 lines 0:01.18 sec
2_07 lines 0:02.20 sec
2_08 lines 0:04.28 sec
2_09 lines 0:09.15 sec
2_10 lines 0:21.94 sec
As we can see, something strange happens when the table is larger than 2^8
lines (tuples) the time to complete jumped from 0.36sec to 2min 26 sec and
then to 20min 16sec. For FlatSQL, we have a steady increase.
I created the tables in MySQL using the following
my $TABLE_TO_DESC="CREATE TABLE ${TABLE_TO} ( \
Op VARCHAR(10) NOT NULL default '', \
Time DECIMAL(15, 5), \
XID VARCHAR(10) default NULL, \
ClientID VARCHAR(15) default NULL, \
FH INT UNSIGNED, \
Offset INT UNSIGNED, \
Size INT UNSIGNED, \
KEY Time (Time) \
) TYPE=MyISAM";
MySQL version
mysql Ver 11.18 Distrib 3.23.52, for pc-linux-gnu (i686)
I appreciate your help,
-Konrad
************************************************************
** Konrad Lorincz
** graduate student at
** Harvard University
**
** Tel: (617) 493-4956
** E-Mail: [EMAIL PROTECTED]
** URL: http://www.eecs.harvard.edu/~konrad
**
**
** "For here we are not afraid to follow truth
** wherever it may lead, nor to tolerate any error
** so long as reason is left free to combat it."
** - Thomas Jefferson
**
** ... to infinity and beyond ...
************************************************************
---------------------------------------------------------------------
Before posting, please check:
http://www.mysql.com/manual.php (the manual)
http://lists.mysql.com/ (the list archive)
To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php