On 01/03/2013 8:08 PM, Stephen Chrzanowski wrote:
On Fri, Mar 1, 2013 at 11:48 AM, Ryan Johnson
<ryan.john...@cs.utoronto.ca>wrote:
On 01/03/2013 11:10 AM, Stephen Chrzanowski wrote:
***I'm waiting for the repair man to show up to fix my waterheater...
so...
I'm bored. This is going to be to the point at the beginning, but get
wordy
and technical near the end. ;) Super over kill..... ahem****
Nice explanation... just a couple of nitpicks:
Thanks gents. :]
You will ALWAYS incur slower speeds when using foreign keys in either a
join
or "ON [DELETE/UPDATE]". Additional look ups have to happen, which means
more
time spent, which typically is the millisecond range per lookup.
I can't think of any reason a foreign key constraint would impact the cost
of joins in any query. The cost is entirely at update time (when you have
to enforce the constraint).
Wouldn't you have to do a look up to procure the list of keys to see what
is actually part of the result set, or would that be decided within the
WHERE logic/filtering?
WHERE does that. Let's say you have table P and F, and the foreign key
is P.PK=F.FK. With no index at all, it will fetch all rows from P,
throwing away those that don't match their filters. Then it will fetch
all rows from F, also throwing away mismatches. Then it will do an
(expensive) join comparing all rows in P with all rows in F, returning
only the ones having P.PK = F.FK.
However, if (most likely) you've told sqlite3 that P.PK is a primary
key, the plan changes: Fetch and filter rows from F side, and for each
match, then use F.FK to "probe" the index sqlite3 automatically on P.PK.
That will give only rows for which F.FK=P.PK, and can be vastly faster.
If you create an additional index whose field(s) make it easier to apply
the WHERE filter on F, sqlite3 can use that instead of a scan. Think if
you had an index on (last-name,first-name). A query that filters for a
certain last name can use that index, but a query that filters by first
name can't (try looking up your first name in a phone book if you don't
believe me!).
If you created an index on F.FK (which doesn't happen automatically),
and created other indexes on fields involved in the filters for P, then
the engine *might* look at its stats and decide it's actually faster to
use the index to filter P, then probe F using that index. That would
happen if it thinks the filters on P allow fewer rows past than the
filters on F do. Choosing which index(es) to create is mostly black
magic: which ones are actually useful depends strongly on which query
and data you have, and how smart the optimizer is, and how up-to-date
your stats are (from ANALYZE).
Meanwhile, a foreign key constraint does the following:
- Whenever you insert or update F, it will blow up if there is not a
matching P.PK in the database.
- When you delete from F, it will find the matching P.PK (which must be
there, see above) and either delete them or give an error (depending on
what you asked for).
- Whenever you read from F, it does absolutely nothing at all, because
the constraint has already been enforced.
In other words, enforcing a FK constraint requires running queries very
similar to the one you're trying to run... on every single update. P.PK
also has the constraint that it must be unique, again requiring queries
on every insert/update. All those queries can use the PK index just like
yours do. That's one of the reasons sqlite3 creates that PK index
automatically, the performance hit of enforcing PK and FK constraints
would be ridiculous otherwise.
HTH,
Ryan
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users