Suppose I have two related tables: CREATE TABLE events ( id INTEGER PRIMARY KEY NOT NULL, place_id INTEGER );
CREATE TABLE places ( id INTEGER PRIMARY KEY NOT NULL, name TEXT ); INSERT INTO places (id, name) VALUES (1, 'Place A'); INSERT INTO places (id, name) VALUES (2, 'Place B'); INSERT INTO places (id, name) VALUES (3, 'Place C'); INSERT INTO events (id, place_id) VALUES (1, 1); INSERT INTO events (id, place_id) VALUES (2, 2); INSERT INTO events (id, place_id) VALUES (3, 1); Now, I want to count the number of 'events' at every 'place'. I started with a simple join and a group by clause to perform the count: SELECT name, count (*) AS count FROM events LEFT JOIN places ON places.id = events.place_id GROUP BY events.place_id; name|count Place A|2 Place B|1 It executes very fast (because the join can take advantage of the primary key) but does not produce the desired result. As you can see, not all places are included in the result: So I changed swapped the tables in the join: SELECT name, count (events.place_id) AS count FROM places LEFT JOIN events ON events.place_id = places.id GROUP BY places.id; name|count Place A|2 Place B|1 Place C|1 With this query, the correct result is returned, but the query runs much slower. Probably because the join cannot use the primary key (according to 'explain query plan', it is used for the sorting of the group by clause now). Adding an index improves the speed CREATE INDEX events_idx_place_id ON events (place_id); Is there a difference between using a primary key or another index for a join? And what is the advantage (if there is one) over not using a join at all: SELECT name, (SELECT count (*) FROM events WHERE events.place_id = places.id) AS count FROM places; ----------------------------------------------------------------------------- To unsubscribe, send email to [EMAIL PROTECTED] -----------------------------------------------------------------------------