Re: [sqlite] Deleting records from a large table
Thanks all for your input, it has really helped. In my real world application tmp_keep is a temporary table populated by examinining a number of other tables etc., and I suddenly realsied that it could even contain duplicate ids. Sloppy thinking on my part. I get the best results by creating another table: CREATE TEMPORARY TABLE tmp_keep_unique (id1 integer primary key); INSERT INTO tmp_keep_unique SELECT DISTINCT id1 from tmp_keep; It takes far longer to create an index on tmp_keep, than it save times on the above query with one. Then *with a primary key on both table1 and tmp_keep_unique*DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep_unique); is acceptably efficient. On SQLite NOT EXISTS is quicker than NOT IN, but I also need to use same SQL on a MySQL implementation, and it behaves the oppoiste. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Deleting records from a large table
Dave Blake wrote: > Yes the id1 are integer primary keys. In both tables? If yes, then there is not much you could do, and the problem probably are all the modifications done to the actual table and its indexes. Regards, Clemens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Deleting records from a large table
On 3 Mar 2017, at 10:53am, Dave Blakewrote: > DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep); > > But this becomes inefficient when tmp_keep is large. Do you have an appropriate index on tmp_keep ? CREATE INDEX tk_id1 ON tmp_keep (id1) then try it again. Simon. ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Deleting records from a large table
DELETE FROM table1 WHERE NOT EXISTS (SELECT 1 from tmp_keep WHERE id1 = table1.id1); Does it in a single pass by doing a correlated subquery on each row in table1 to see if the id is in tmp_keep. > -Original Message- > From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] > On Behalf Of Dave Blake > Sent: Friday, 3 March, 2017 03:53 > To: SQLite mailing list > Subject: [sqlite] Deleting records from a large table > > Say table1 has more then 50 records, and there is a second table > tmp_keep with the ids of the records in table1 to be kept, the rest need > to > be deleted. The number of records in tmp_keep can vary from 0 to all the > records in table1, with any values in between. > > What is the best strategy for doing the deletion? > > For deleting a large number of records (tmp_keep is small), this works > fine: > DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep); > > But this becomes inefficient when tmp_keep is large. > > Any suggestions? > ___ > sqlite-users mailing list > sqlite-users@mailinglists.sqlite.org > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Deleting records from a large table
Could be keep almost all the records so ~50, but it varies greatly so sometimes will be just keep 10. I can adjust approach depending on size if necessary. Yes the id1 are integer primary keys. Table1 has a number of indexes and views, so the create new table approach is less attractive ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Deleting records from a large table
Dave Blake wrote: > For deleting a large number of records (tmp_keep is small), this works > fine: > DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep); > > But this becomes inefficient when tmp_keep is large. SQLite usually creates a temporary index for the values in the IN clause. How large is "large"? Are the id1 values integers? Then you can make tmp_keep.id1 the INTEGER PRIMARY KEY. Regards, Clemens ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Re: [sqlite] Deleting records from a large table
On 2017/03/03 12:53 PM, Dave Blake wrote: Say table1 has more then 50 records, and there is a second table tmp_keep with the ids of the records in table1 to be kept, the rest need to be deleted. The number of records in tmp_keep can vary from 0 to all the records in table1, with any values in between. What is the best strategy for doing the deletion? For deleting a large number of records (tmp_keep is small), this works fine: DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep); But this becomes inefficient when tmp_keep is large. BEGIN TRANSACTION; ALTER TABLE table1 RENAME TO tmp1; CREATE TABLE table1 ( -- Your standard Table creation code here for table1... ); INSERT INTO table1 SELECT tmp1.* FROM tmp1 JOIN tmp_keep ON tmp_keep.id1 = tmp1.id1; DROP TABLE tmp1; COMMIT; ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
[sqlite] Deleting records from a large table
Say table1 has more then 50 records, and there is a second table tmp_keep with the ids of the records in table1 to be kept, the rest need to be deleted. The number of records in tmp_keep can vary from 0 to all the records in table1, with any values in between. What is the best strategy for doing the deletion? For deleting a large number of records (tmp_keep is small), this works fine: DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep); But this becomes inefficient when tmp_keep is large. Any suggestions? ___ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users