Re: Slow multi-column index

DL Neil Wed, 24 Oct 2001 05:07:13 -0700

Hi Phil,
I suspect you have already answered your own question!
Warning: high ignorance factor - am currently researching indexes for myself - if I 
can't figure it out for myself, you
can help me with that later...


When the SELECT can, it will answer a query from a scan-of/data-retrieval-from the 
index file(s) alone. Under such
conditions it will respond v.quickly - this applies to two of your examples:

   SELECT count(*) FROM MAIN WHERE...
   SELECT fileDate...

the latter being fulfilled from your

   INDEX (fileDate),

- which incidentally looks superfluous because wouldn't any query on the fileDate 
index be equally well covered under
the previously defined

   INDEX (fileDate, filePath, mediaType),

You point to the above being fine and dandy, but performance problems resulting 
whenever the actual data table is
required/accessed to provide the data. In this there is only evidence from one 
data-selecting query (ie a SELECT that
cannot be entirely satisfied by data held in the indexes). Do you experience 
equivalently long response times from other
such queries?

Another part that you may have self-answered was to do with variable length 
fields/rows and rate of inserts. Is it worth
trying a reorg (ie to produce a 'state' of zero inserts) and then comparing 'before' 
and 'after' response times? [really
high ig-factor here! such an action is usually designed to improve the b-trees in the 
indexes, and I'm not at all sure
of its effect on the layout of a data table with heavy use of variable length fields]

Because of exactly this 'fear' (if not 'issue') a device I have used in the past (but 
not in MySQL) was to move the
variable length stuff to a separate table, and keep the rest/not too much variable-len 
in the 'main table'. Ok the
absolute one-to-one relationship means that this involves de-normalising the database, 
but such would be 'legitimised'
if it helps cope with a performance bind (don't tell the RDB-police though!)

Make any sense?
=dn


> > >>So, is it *really* the variable length row lookup that's taking all the
> time
> > here ?<<
> >
> > I don't see how. 18000 record is nothing and varchar searches are pretty
> fast.
> > On top of that, you first search is on the date which would be extremelly
> > fast and narrow the next searches to many less records.
> >
> > Are you sure it is the database that is slow to retrieve your result or
> could
> > it be with other code you have around the database but within you time
> > markers?
>
> Hi - yes, it really is the SELECT statement that's taking the time. The
> timings I gave are directly from the mysql command line, ie. reported by
> mysql.exe itself, not my code. By the way, only around 60 rows are returned
> by the query.
>
> I'm sure you're right, that VARCHAR searches are fast, since the 2nd query
> (SELECT fileDate...) is near instantaneous and is searching on the same
> VARCHAR columns as the slow one.
> The difference between the slow & fast queries seems to be the fact that the
> slow one is having to retrieve the row data from the table rather than the
> index itself.
>
> When an index 'returns' a query match, does it also contain some sort of
> lookup into the table to allow the row itself to be retrieved ? Or, if rows
> are variable length, does MySQL have to do a linear scan through the
> database to find the row ? That's what I'm wondering because of the
> reference manual warning about variable length rows.
>
> Thanks
>
> >
> >
> > On Wednesday 24 October 2001 06:52 am, Philip White wrote:
> > > Hello all
> > >
> > > Grateful for any help on this -
> > >
> > > I have a table (definition for MAIN below) with a number of indexes
> > > corresponding to various frequently used queries.
> > > A SELECT statement using one of the multi-column indexes is running very
> > > slow if it retrieves data that is not in in the index itself. For
> example,
> > > a "SELECT count(*) FROM MAIN WHERE..." takes 0.04 seconds, but "SELECT
> > > fileName from MAIN WHERE..." takes 20 seconds. The table currently has
> > > around 18000 rows, and the DB is around 400MB.
> > >
> > > I'm wondering whether this has something to do with the following words
> of
> > > wisdom in the reference manual:
> > > "For tables that changes a lot you should try to avoid all VARCHAR or
> BLOB
> > > columns. You will get dynamic row length as soon as you are using a
> single
> > > VARCHAR or BLOB columns."
> > >
> > > but I don't see why this should matter. Why does it matter that row
> lengths
> > > are dynamic ? Surely the index doesn't just specify a row "index" that
> is
> > > multiplied by the row length ? Surely it contains some sort of row data
> > > offset to allow for variable row lengths.
> > >
> > > If any use, here's the table definition:
> > >
> > > CREATE TABLE MAIN
> > > (
> > >  id    INT(4) AUTO_INCREMENT PRIMARY KEY,
> > >  addedDateTime DATETIME NOT NULL,
> > >  filePath  VARCHAR(100) NOT NULL,
> > >  fileDateTime DATETIME NOT NULL,
> > >  fileDate  DATE NOT NULL,
> > >  fileName  VARCHAR(255) NOT NULL,
> > >  mediaType  VARCHAR(100) NOT NULL,
> > >  key1   CHAR(32),
> > >  key2   CHAR(32),
> > >  language  CHAR(4),
> > >  headline  BLOB,
> > >  enStory   TEXT,
> > >  searchSlug  TEXT,
> > >  searchCodes  TEXT,
> > >  fileData  LONGBLOB,
> > >
> > >  INDEX (filePath, mediaType),
> > >  INDEX (addedDateTime, fileDateTime, fileDate, mediaType, filePath),
> > >  INDEX (fileDate, filePath, mediaType),
> > >  INDEX (fileDate),
> > >  INDEX (fileName),
> > >  INDEX (mediaType),
> > >  INDEX (key1),
> > >  INDEX (key2),
> > >  FULLTEXT (enStory),
> > >  FULLTEXT (searchSlug),
> > >  FULLTEXT (searchCodes)
> > > ) TYPE=MYISAM;
> > >
> > > and the slow query is something like:
> > >
> > > SELECT fileName FROM MAIN WHERE (fileDate='20011021') AND
> > > (filePath='OLWWPICGLSP) AND (mediaType='TEXT');
> > > (~20 seconds)
> > >
> > > EXPLAIN says this will use the 3rd index (fileDate, filePath, mediaType)
> > > which looks fine. If the query is changed to:
> > >
> > > SELECT fileDate FROM MAIN WHERE (fileDate='20011021') AND
> > > (filePath='OLWWPICGLSP) AND (mediaType='TEXT');
> > > (0.02 seconds)
> > >
> > > This is fine because it can get 'fileDate' from the index itself.
> > >
> > > So, is it *really* the variable length row lookup that's taking all the
> > > time here ?
> > >
> > > Thanks for any help on this,
> > > Phil White
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > Before posting, please check:
> > >    http://www.mysql.com/manual.php   (the manual)
> > >    http://lists.mysql.com/           (the list archive)
> > >
> > > To request this thread, e-mail <[EMAIL PROTECTED]>
> > > To unsubscribe, e-mail
> <[EMAIL PROTECTED]>
> > > Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
> >
> > ---------------------------------------------------------------------
> > Before posting, please check:
> >    http://www.mysql.com/manual.php   (the manual)
> >    http://lists.mysql.com/           (the list archive)
> >
> > To request this thread, e-mail <[EMAIL PROTECTED]>
> > To unsubscribe, e-mail
> <[EMAIL PROTECTED]>
> > Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
> >
>
>
> ---------------------------------------------------------------------
> Before posting, please check:
>    http://www.mysql.com/manual.php   (the manual)
>    http://lists.mysql.com/           (the list archive)
>
> To request this thread, e-mail <[EMAIL PROTECTED]>
> To unsubscribe, e-mail 
><[EMAIL PROTECTED]>
> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
>
>


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Re: Slow multi-column index

Reply via email to