Re: [sqlite] difference between 'ID IS NULL' and 'ID = NULL'

R Smith Mon, 08 Jan 2018 04:25:19 -0800

On 2018/01/08 12:39 PM, x wrote:

Thanks Cezary and Scott. I’m now a bit clearer as to what’s happening. I 
imagined the RowID as being a separate index which is the root of my confusion.

To elaborate a little - we often get people here asking "But why does it table-scan instead of using my nice Index?".

This stems from an often-held misconception that Indexes are God-sent magic to improve everything. The fact is that Indexes are costly mechanisms which allow fast lookup and which, only AFTER a certain critical size and for specific circumstances, become more efficient than a scan. The Query Planner has to do a lot of work to figure out what that "critical size and specific circumstances" are for any specific query, and it does get real fuzzy.

I think I've heard Richard or Dan explain it as follows (if memory serves, someone please point out if I'm mistaken):

You can think of an SQLite table as essentially a btree covering Index by itself with the Key being the Row_ID (or more recently, the PK for WITHOUT ROWID tables). This is why the rowid (or any column serving as an alias to it, or the PK for WITHOUT ROWID tables) cannot have NULL values, but any other primary key could (in SQLite).
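
The NULL behaviour described above can be checked with a small sketch using Python's built-in sqlite3 module. The table names (a, b, c) and columns are made up purely for illustration; the point is only to compare a rowid alias, an ordinary primary key on a rowid table, and the PK of a WITHOUT ROWID table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Case 1: INTEGER PRIMARY KEY is an alias for the rowid.
# Inserting NULL does not store NULL; SQLite assigns a rowid instead.
conn.execute("CREATE TABLE a (id INTEGER PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO a (id, v) VALUES (NULL, 'x')")
alias_id = conn.execute("SELECT id FROM a").fetchone()[0]

# Case 2: a non-alias primary key on an ordinary rowid table.
# SQLite accepts NULL here, because the real btree key is the hidden rowid.
conn.execute("CREATE TABLE b (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO b (k, v) VALUES (NULL, 'x')")
null_keys = conn.execute(
    "SELECT count(*) FROM b WHERE k IS NULL"
).fetchone()[0]

# Case 3: WITHOUT ROWID table. The PK is the btree key, so NULL is rejected.
conn.execute("CREATE TABLE c (k TEXT PRIMARY KEY, v TEXT) WITHOUT ROWID")
try:
    conn.execute("INSERT INTO c (k, v) VALUES (NULL, 'x')")
    without_rowid_rejected = False
except sqlite3.IntegrityError:
    without_rowid_rejected = True

print(alias_id, null_keys, without_rowid_rejected)
```

The lookup in case 2 uses IS NULL rather than = NULL, since a comparison with = NULL never evaluates to true.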

Being an Index by itself means that a Table-Scan is perhaps not as inefficient as one might think, and indeed using any other index means a round-trip: reading and hitting values in THAT index, then returning and looking up the hit result in the rowid table index, and then reading the page(s) from it and extracting the data - whereas during a table scan, all this round tripping is skipped.
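
One way to see which path the planner picks is EXPLAIN QUERY PLAN. The following is only a sketch: the table t, the index idx_t_a and the row counts are invented for the example, and the exact wording of the plan text differs between SQLite versions (older ones print "SCAN TABLE t", newer ones "SCAN t"), so it just prints whatever the local library reports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b TEXT)")
conn.executemany(
    "INSERT INTO t (a, b) VALUES (?, ?)",
    [(i % 100, f"row{i}") for i in range(1000)],
)
conn.execute("CREATE INDEX idx_t_a ON t (a)")


def plan(sql):
    # The human-readable plan description is the fourth column of each row.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# No usable constraint: walk the table btree itself.
full_scan = plan("SELECT * FROM t")

# Equality on an indexed column: search idx_t_a, then look up each hit
# in the table btree by rowid (the round-trip described above).
via_index = plan("SELECT * FROM t WHERE a = 5")

# Equality on the rowid alias: a direct search of the table btree.
by_rowid = plan("SELECT * FROM t WHERE id = 5")

for label, text in [("scan", full_scan), ("index", via_index), ("rowid", by_rowid)]:
    print(f"{label:6s} {text}")
```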

So unless any prospective candidate Index for any query offers a truly magnificent cost advantage, a table scan would probably be more efficient, and so be chosen. This is why running ANALYZE on large tables is needed, because it allows the QP to better deduce whether a prospective Index might in fact offer such a magnificent cost reduction or not. Another way is hinting at the QP (search "likelihood" in the docs).
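
Both options can be sketched roughly as follows. The table, the index name idx_t_flag and the 0.02 probability are assumptions chosen for illustration, not recommendations. ANALYZE stores its statistics in the sqlite_stat1 table, and likelihood(X, Y) returns X unchanged while telling the planner that X is true with probability of about Y, so it must never change the result of a query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, flag INTEGER, v TEXT)")
# Skewed data: flag = 1 for one row in fifty, flag = 0 for the rest.
conn.executemany(
    "INSERT INTO t (flag, v) VALUES (?, ?)",
    [(1 if i % 50 == 0 else 0, f"row{i}") for i in range(5000)],
)
conn.execute("CREATE INDEX idx_t_flag ON t (flag)")

# Option 1: gather statistics so the planner can estimate index selectivity.
conn.execute("ANALYZE")
stats = conn.execute(
    "SELECT tbl, idx, stat FROM sqlite_stat1 WHERE idx = 'idx_t_flag'"
).fetchall()

# Option 2: hint the planner directly. The second argument has to be a
# floating point constant between 0.0 and 1.0.
plain = conn.execute(
    "SELECT count(*) FROM t WHERE flag = 1"
).fetchone()[0]
hinted = conn.execute(
    "SELECT count(*) FROM t WHERE likelihood(flag = 1, 0.02)"
).fetchone()[0]

print(stats)
print(plain, hinted)
```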

This is why a non-rowid-alias Primary Key on a rowid table is also less efficient to scan than the table itself (often very non-intuitive) - or - why a covering index sometimes gets avoided in a JOIN when it seems to contain all needed data to fulfill the join obligation.

Also, often a great index is not used simply because the query planner does not know enough about it and its prospective cost to obtain a good estimate of its utility, and sometimes what feels intuitively to us as a great Index just isn't really. The QP is not infallible, but it is quite smart.



Cheers,
Ryan



