This query will use the index you proposed. SELECT * FROM nodes NATURAL JOIN (SELECT parent AS id FROM edges WHERE parent NOT IN (SELECT child FROM edges));
Peter On Sun, Dec 31, 2017 at 6:14 PM, Shane Dev <devshan...@gmail.com> wrote: > Hello, > > I have a directed acyclic graph defined as follows - > > sqlite> .sch > CREATE TABLE nodes(id integer primary key, description text); > CREATE TABLE edges(parent not null references nodes, child not null > references nodes, primary key(parent, child)); > > Now I want to find the "roots" of the graph - i.e nodes which are not > children of other nodes - > > select * from nodes where not exists (select * from edges where child= > nodes.id); > > This works but is very slow when there are a million nodes and edges. > > Looking at the query plan - > > sqlite> explain query plan select * from nodes where not exists (select * > from edges where child=nodes.id); > selectid order from detail > 0 0 0 SCAN TABLE nodes > 0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1 > 1 0 0 SCAN TABLE edges > > I thought an index on edges(child) might help > > sqlite> CREATE INDEX iedges on edges(child); > > but it didn't - > > sqlite> explain query plan select * from nodes where not exists (select * > from edges where child=nodes.id); > selectid order from detail > 0 0 0 SCAN TABLE nodes > 0 0 0 EXECUTE CORRELATED SCALAR SUBQUERY 1 > 1 0 0 SCAN TABLE edges > > Is there any way to speed it up? > _______________________________________________ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users > _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users