Dear Knut,

Thanks for your answer.

The accessed data has higher locality in this query than in the previous
query, and therefore the page cache is used more efficiently.

The previous query used the indexes on big.name and bigref.bigid, which
are ordered differently. So even if the index on big.name was scanned
sequentially, the index on bigref.bigid was accessed almost completely
randomly, and a large number of pages all over the table had to be
visited.
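
For illustration, the previous query probably had roughly the following
shape (the actual statement is not quoted in this thread, so the selected
column and the predicate below are only assumptions):

  -- Hypothetical shape of the previous query: filter on BIG.NAME, then
  -- join to BIGREF through BIG.ID = BIGREF.BIGID.
  SELECT BR.BIGID
  FROM BIG B JOIN BIGREF BR ON BR.BIGID = B.ID
  WHERE B.NAME LIKE 'abc%'

The index on BIG.NAME returns the matching rows in NAME order, but their
ID values are scattered, so the probes into the index on BIGREF.BIGID
jump around and touch pages all over that index.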

In the latest query, the indexes on big.id and bigref.bigid are
used. Those indexes are ordered the same way, and all the interesting
rows are located consecutively on a relatively small number of pages. So
even if the number of page accesses is higher, the number of different
pages accessed is probably much lower. And since the rows in both
indexes are accessed in the same order that they are stored, the
requested page is found in the page cache most of the time.
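
For comparison, the latest query presumably looks more like this (again a
hypothetical shape, since the real statement is not quoted here):

  -- Hypothetical shape of the latest query: drive the join through
  -- BIG.ID instead of BIG.NAME.
  SELECT BR.BIGID
  FROM BIG B JOIN BIGREF BR ON BR.BIGID = B.ID
  WHERE B.ID BETWEEN 1 AND 10000

The index on BIG.ID yields the ID values in ascending order, so the
probes into the index on BIGREF.BIGID also move forward through
consecutive pages, most of which are already in the page cache.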

Another difference is that the previous query had to access the base
table (big) to get the id column, whereas the latest query finds all the
columns it needs in the indexes. You may want to try to add an extra
index to speed up the previous query:

  CREATE UNIQUE INDEX BIGNAMEID ON BIG(NAME, ID)
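
If this is Apache Derby (an assumption on my part), one way to check that
the new index is actually chosen and that the base table is no longer
visited is to look at the runtime statistics for the statement:

  -- Assuming Apache Derby: turn on runtime statistics and timing,
  -- run the query, then print the execution plan that was used.
  CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);
  CALL SYSCS_UTIL.SYSCS_SET_STATISTICS_TIMING(1);
  -- ... run the query here ...
  VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();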

If I understand correctly, the creation of the new index (BIGNAMEID) will solve the second problem you mentioned: the id column of table big will be taken from the index BIGNAMEID instead of from the base table. But the first issue

"the index on bigref.bigid was accessed almost completely
randomly, and a large number of pages all over the table had to be
visited."


will remain? Is that correct?

For the record, the creation of the index you proposed speeds up the query spectacularly!

Is it correct to assume that the main source of the problem was the access to the base
table (big) to get the id column, and not the almost random access to the index on bigref.bigid?

Anyway, I don't want to become (even more :-)) annoying; thanks a lot for your answers.

Best regards,
George
