Re: [sqlite] Deleting records from a large table

2017-03-03 Thread Dave Blake
Thanks all for your input, it has really helped.

In my real world application tmp_keep is a temporary table populated by
examinining a number of other tables etc., and I suddenly realsied that it
could even contain duplicate ids.  Sloppy thinking on my part.

I get the best results by creating another table:
CREATE TEMPORARY TABLE tmp_keep_unique (id1 integer primary key);
INSERT INTO tmp_keep_unique SELECT DISTINCT id1 from tmp_keep;

It takes far longer to create an index on tmp_keep, than it save times on
the above query with one.

Then
*with a primary key on both table1 and tmp_keep_unique*DELETE FROM table1
WHERE id1 NOT IN (SELECT id1 FROM tmp_keep_unique);

is acceptably efficient.

On SQLite NOT EXISTS is quicker than NOT IN, but I also need to use same
SQL on a MySQL implementation, and it behaves the oppoiste.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deleting records from a large table

2017-03-03 Thread Clemens Ladisch
Dave Blake wrote:
> Yes the id1 are integer primary keys.

In both tables?  If yes, then there is not much you could do, and the
problem probably are all the modifications done to the actual table
and its indexes.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deleting records from a large table

2017-03-03 Thread Simon Slavin

On 3 Mar 2017, at 10:53am, Dave Blake  wrote:

> DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep);
> 
> But this becomes inefficient when tmp_keep is large.

Do you have an appropriate index on tmp_keep ?

CREATE INDEX tk_id1 ON tmp_keep (id1)

then try it again.

Simon.
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deleting records from a large table

2017-03-03 Thread Keith Medcalf

DELETE FROM table1 WHERE NOT EXISTS (SELECT 1 from tmp_keep WHERE id1 = 
table1.id1);

Does it in a single pass by doing a correlated subquery on each row in table1 
to see if the id is in tmp_keep.

> -Original Message-
> From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org]
> On Behalf Of Dave Blake
> Sent: Friday, 3 March, 2017 03:53
> To: SQLite mailing list
> Subject: [sqlite] Deleting records from a large table
> 
> Say table1 has more then 50 records, and there is a second table
> tmp_keep with the ids of the records in table1 to be kept, the rest need
> to
> be deleted. The number of records in tmp_keep can vary from 0 to all the
> records in table1, with any values in between.
> 
> What is the best strategy for doing the deletion?
> 
> For deleting a large number of records (tmp_keep is small), this works
> fine:
> DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep);
> 
> But this becomes inefficient when tmp_keep is large.
> 
> Any suggestions?
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users



___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deleting records from a large table

2017-03-03 Thread Dave Blake
Could be keep almost all the records so ~50, but it varies greatly so
sometimes will be just keep 10. I can adjust approach depending on size if
necessary.

Yes the id1 are integer primary keys.

Table1 has a number of indexes and views, so the create new table approach
is less attractive
​
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deleting records from a large table

2017-03-03 Thread Clemens Ladisch
Dave Blake wrote:
> For deleting a large number of records (tmp_keep is small), this works
> fine:
> DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep);
>
> But this becomes inefficient when tmp_keep is large.

SQLite usually creates a temporary index for the values in the IN clause.
How large is "large"?

Are the id1 values integers?  Then you can make tmp_keep.id1 the INTEGER
PRIMARY KEY.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Deleting records from a large table

2017-03-03 Thread R Smith


On 2017/03/03 12:53 PM, Dave Blake wrote:

Say table1 has more then 50 records, and there is a second table
tmp_keep with the ids of the records in table1 to be kept, the rest need to
be deleted. The number of records in tmp_keep can vary from 0 to all the
records in table1, with any values in between.

What is the best strategy for doing the deletion?

For deleting a large number of records (tmp_keep is small), this works
fine:
DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep);

But this becomes inefficient when tmp_keep is large.


BEGIN TRANSACTION;
ALTER TABLE table1 RENAME TO tmp1;
CREATE TABLE table1 (
   -- Your standard Table creation code here for table1...
);
INSERT INTO table1 SELECT tmp1.* FROM tmp1 JOIN tmp_keep ON tmp_keep.id1 
= tmp1.id1;

DROP TABLE tmp1;
COMMIT;

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Deleting records from a large table

2017-03-03 Thread Dave Blake
Say table1 has more then 50 records, and there is a second table
tmp_keep with the ids of the records in table1 to be kept, the rest need to
be deleted. The number of records in tmp_keep can vary from 0 to all the
records in table1, with any values in between.

What is the best strategy for doing the deletion?

For deleting a large number of records (tmp_keep is small), this works
fine:
DELETE FROM table1 WHERE id1 NOT IN (SELECT id1 FROM tmp_keep);

But this becomes inefficient when tmp_keep is large.

Any suggestions?
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users