I can provide some info coming from our experience with SQLite 3.7.14:

Since most SQl processing is IO bound, the "estimated cost" should be the 
number of disk IO operations required to retrieve the rows. The later addition 
of "estimated rows" reflects to the fact that some virtual table 
implementations might use non-disk storage (e.g. process or shared memory), 
where the number of IO operations is determined by the resident set and the 
cost of paging/swapping.

Lets say you have 10000 records of 200 bytes with 50 bytes of key overhead 
stored in some kind of ISAM file, and a page size of 4k.

Performing a full table scan will take an estimated 10000 * 200 / 4096 ~= 489 
disk accesses, whereas looking up a single record will take about 3 (50 bytes 
per key in a 4096 byte page gives an estimated fan out of over 100, resulting 
in 2 pages to read from the index and 1 for the record itself). Performing a 
partial index scan that returns 100 records will take 2 acesses to locate the 
first record, 1 more if a second index page is required and anywhere between 5 
(if the records are contiguous) and 100 (if each is from a separate page) 
accesses to retrieve the records themselves.

Regarding the UNIQUE flag, this is quite different from the number of estimated 
rows, which may be 0 or 1 due to rounding errors on a non-unique index (e.g. 
the initials of a set of 100 people has a cardinality of 26*26=676, giving an 
average number of 0,1479 records per index entry, but there may still be 
duplicates).

-----Ursprüngliche Nachricht-----
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von dave
Gesendet: Donnerstag, 19. Oktober 2017 18:43
An: 'SQLite mailing list' <sqlite-users@mailinglists.sqlite.org>
Betreff: [EXTERNAL] [sqlite] virtual tables, and theTrue meaning of 
pIdxInfo->estimatedCost, pIdxInfo->estimatedRows, and pIdxInfo->idxFlags...

Hi folks,

I am trying to fully understand the impact and correct use of a few subtle 
features related to virtual tables' the xBestIndex mechanism, and their correct 
use.  Here are my current beliefs:

*  pIdxInfo->estimatedCost
  obviously the cost of the proposed plan; a metric of the 'viscosity' of the 
table when traversing through xNext relative to other tables and especially to 
filesystem access
*  pIdxInfo->estimatedRows
  obviously the approximate number of rows that a proposed plan will return.
But less obvious to me is how this materially affects the query plan, 
especially relative to pIdxInfo->estimatedCost
  and a little bit with respect to:
* pIdxInfo->idxFlags
  when the SQLITE_INDEX_SCAN_UNIQUE is set.  Isn't setting
pIdxInfo->estimatedRows to 0 or 1 enough to communicate this same
information?

Anyway, I am dutifully setting both estimatedRows and idxFlags in cases where I 
have a 0-or-1-result table (I have several of these), and I am also 
estimatedRows to LLONG_MAX along with estimatedCost to DBL_MAX in cases where a 
plan can never be executed (btw I would respectfully suggest perhaps using a 
bit in idxFlags to communicate 'never use this plan, it will never work').

I haven't had any ill effects doing the above, but wonder if that is 'correct'. 
 Also, it would be interesting just to know what the material effect of 
estimatedRows and idxFlags is, so that I can maybe use them more effectively.

Any thoughts or corrections to my thinking?  Thanks in advance; cheers!
-dave
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick | Software Engineer | Scientific Games International GmbH | 
Klitschgasse 2-4, A-1130 Vienna | FN 157284 a, HG Wien, DVR: 0430013 | (O) +43 
1 80100 - 0

May be privileged. May be confidential. Please delete if not the addressee.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to