T K wrote:
I have a table with a blob and 13 rows in it; in the following, please
notice the execution times:
1) Selecting the primary key:
SELECT primkey from sometable
Query 1 of 1 elapsed time (seconds) - Total: 0.032, SQL query: 0.016,
Building output: 0.016
2) Select the primary key with WHERE clause:
SELECT primkey from sometable WHERE someString = 'someValue'
Query 1 of 1 elapsed time (seconds) - Total: 0.843, SQL query: 0.828,
Building output: 0.015
3) Here's how many rows I have:
select count(*) from sometable
13
Does the sheer fact that the table has a blob, albeit not selected,
affect query execution THAT much? Any suggestions to improve it?
Thanks
Derby stores blobs in the same space as the other data, and without
indexes the only way that derby can find the "next" row is by reading
through all the pages after the current row. It stores indexes in
separate files where the structure of the index lets it do direct
lookups in the base table. The Blobs are stored
as linked lists of pages. So performance is dependent on how the rows
get laid out in the table. The worst case would be 1 row on page one
followed by 2 gig of blob data, and then followed by another row.
When an index is used Derby can jump directly to the page containing
the beginning of the row. And when processing a single row it need not
read the blob if it is not part of the query.
In the case of query 1 it probably never even looked at the base table,
just used the index which given the number of rows is probably a one
page cached index.
For query 2 it would be interesting to know if it is picking to use the
index to do probes or not.
printing the query plan for both would tell exactly what it is doing.