On Sun, 2005-08-21 at 11:36 -0400, Jason Tackaberry wrote: 
> Assuming indexed queries, it's the number of rows returned that matters,
> not the number of selects.  A select which uses an indexed column which
> returns 0 rows is very, very fast.  It's practically a no-op.

Attached is some (very ugly) test code which supports this statement.
Here's the program output:
        
        *** Creating single table ...
        Created table with 300000 rows.
        Query for known dir_id returned 150000 rows, took 1.6794 seconds.
        Query for unknown dir_id returned 0 rows, took 0.0718 seconds.
        
        *** Creating multi table ...
        Created table with 30000 rows.
        Query for known dir_id returned 15000 rows, took 0.3160 seconds.
        Query for unknown dir_id returned 0 rows, took 0.0007 seconds.
        

There are 30000 files in total, split across 2 directories with 15000
files in each.  Each file has 10 attributes, hence the single table has
10 times as many rows as the multi table.
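
For anyone who doesn't want to open the attachment, the two layouts can
be sketched roughly as follows.  This is not the attached test code,
just a minimal illustration that assumes SQLite via Python's sqlite3
module.  The table names, column names and attribute values are made
up, the file count is scaled down to 3000, and the database is held in
memory, so the timings will not match the output above.

```python
import sqlite3
import time

N_FILES = 3000   # scaled down from the 30000 files in the test above
N_DIRS = 2       # files are split evenly across the directories
N_ATTRS = 10     # attributes per file

def build_single(conn):
    # Single-table layout (hypothetical schema): one row per (file, attribute).
    conn.execute("CREATE TABLE single "
                 "(file_id INTEGER, dir_id INTEGER, attr TEXT, value TEXT)")
    conn.execute("CREATE INDEX single_dir_idx ON single (dir_id)")
    rows = ((f, f % N_DIRS, "attr%d" % a, "value")
            for f in range(N_FILES) for a in range(N_ATTRS))
    conn.executemany("INSERT INTO single VALUES (?, ?, ?, ?)", rows)

def build_multi(conn):
    # Column-per-attribute layout (hypothetical schema): one row per file.
    cols = ", ".join("attr%d TEXT" % a for a in range(N_ATTRS))
    conn.execute("CREATE TABLE multi "
                 "(file_id INTEGER, dir_id INTEGER, %s)" % cols)
    conn.execute("CREATE INDEX multi_dir_idx ON multi (dir_id)")
    placeholders = ", ".join("?" * (2 + N_ATTRS))
    rows = ((f, f % N_DIRS) + ("value",) * N_ATTRS for f in range(N_FILES))
    conn.executemany("INSERT INTO multi VALUES (%s)" % placeholders, rows)

def timed_query(conn, table, dir_id):
    # Return (number of rows fetched, elapsed seconds) for one indexed lookup.
    start = time.perf_counter()
    rows = conn.execute("SELECT * FROM %s WHERE dir_id = ?" % table,
                        (dir_id,)).fetchall()
    return len(rows), time.perf_counter() - start

conn = sqlite3.connect(":memory:")
build_single(conn)
build_multi(conn)
for table in ("single", "multi"):
    for label, dir_id in (("known", 0), ("unknown", 99)):
        count, secs = timed_query(conn, table, dir_id)
        print("%s: query for %s dir_id returned %d rows, took %.4f seconds"
              % (table, label, count, secs))
```

Because everything here lives in memory, any disk I/O effect will not
show up; pointing sqlite3.connect() at a file instead would be closer
to the real test.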

Actually I'm a bit surprised that the query for an unknown dir_id took
as long as it did in the single table test.  At 0.0718 seconds versus
0.0007 seconds it's roughly 100 times slower than the multi table test,
far more than the 10x difference in row count would explain, which
suggests there might be some extra I/O involved.  I'm not sure what's
happening there.
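
One thing that could be checked first is whether the lookup really goes
through the index.  The sketch below assumes SQLite and uses its
EXPLAIN QUERY PLAN statement; the table and index names are
hypothetical, and the wording of the plan text varies between SQLite
versions, so it only looks for the index name in the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table layout with an index on dir_id.
conn.execute("CREATE TABLE single "
             "(file_id INTEGER, dir_id INTEGER, attr TEXT, value TEXT)")
conn.execute("CREATE INDEX single_dir_idx ON single (dir_id)")

# Ask SQLite how it would run the lookup; the last column of each
# returned row holds the human-readable description of that step.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM single WHERE dir_id = ?",
    (12345,)).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)

# True if the planner chose the dir_id index rather than a full scan.
uses_index = "single_dir_idx" in plan_text
print("uses index:", uses_index)
```

If the index is being used, the extra time is more likely to come from
reading index pages off disk than from scanning the table.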

But at any rate, I think this convincingly demonstrates that the
table-per-file-type, column-per-attribute approach is going to yield
appreciably better performance.

Cheers,
Jason.
