On Nov 8, 2008, at 7:34 AM, BareFeet wrote: >>> I'd like to fetch all the rows in "code" that don't exist in >>> "companies": >>> >>> SELECT code.id FROM code,companies WHERE code.id IS NOT IN (SELECT >>> code FROM companies); >> >> select code.id from code >> where code.id not in (select code from companies); > > A technically better (eg faster) solution is to use "exists" (which > only looks for the first instance) rather than "in" (which scans the > whole table unnecessarily): > > select Code.ID from Code > where not exists (select 1 from Companies where Companies.Code = > Code.ID) > ;
That is a subjective statement. It could be true if there are a lot more rows in table 'Companies' than there are in 'Code' and there is no index on Companies.Code. In general I think the NOT IN(...) formulation will be faster though. If there is an index on Companies.Code, then the two methods are more or less equivalent. Otherwise, the NOT IN(...) method builds a transient index so that testing "X NOT IN (...)" is an O(log(N)) operation. The NOT EXISTS test will be O(N). If there are a lot of rows in table 'Code' the test will need to be run a lot of times so the expense of building the transient index is probably worth it. If there are few rows in 'Code' then the test will only have to run a few times. In this case the cost of building the index might outweigh the benefits of it. As always, if performance really matters, test both methods. Dan. _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users