Re: [sqlite] Performance problem with DELETE FROM/correlated subqueries

2020-02-07 Thread Jürgen Baier

Hi,

On 07.02.20 09:25, Clemens Ladisch wrote:

Jürgen Baier wrote:

   CREATE TABLE main ( ATT1 INT, ATT2 INT, PRIMARY KEY (ATT1,ATT2) );
   CREATE TABLE staging ( ATT1 INT, ATT2 INT );

Then I execute

   DELETE FROM main WHERE EXISTS (SELECT 1 FROM staging WHERE main.att1 = 
staging.att1 AND main.att2 = staging.att2)

which takes a very long time.


DELETE FROM main WHERE (att1, att2) IN (SELECT att1, att2 FROM staging);


Thank you very much.

I can confirm that this solves my problem and indeed scans the staging 
table and looks up the main table:


sqlite> EXPLAIN QUERY PLAN DELETE FROM main WHERE (att1, att2) IN 
(SELECT att1, att2 FROM staging);

QUERY PLAN
|--SEARCH TABLE main USING INDEX sqlite_autoindex_main_1 (ATT1=? AND ATT2=?)
`--LIST SUBQUERY
   `--SCAN TABLE staging

For reference: This syntax is not supported by Microsoft SQL Server 
(2017). But Microsoft SQL Server is relatively fast when using the 
original DELETE FROM query.


Thanks,

Jürgen




Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Performance problem with DELETE FROM/correlated subqueries

2020-02-07 Thread Rowan Worth
On Fri, 7 Feb 2020 at 16:25, Clemens Ladisch  wrote:

> Jürgen Baier wrote:
> >   CREATE TABLE main ( ATT1 INT, ATT2 INT, PRIMARY KEY (ATT1,ATT2) );
> >   CREATE TABLE staging ( ATT1 INT, ATT2 INT );
> >
> > Then I execute
> >
> >   DELETE FROM main WHERE EXISTS (SELECT 1 FROM staging WHERE main.att1 =
> staging.att1 AND main.att2 = staging.att2)
> >
> > which takes a very long time.
>
> DELETE FROM main WHERE (att1, att2) IN (SELECT att1, att2 FROM staging);
>

Note using row-values requires sqlite 3.15.0 or later -- which is three
years old at this point, but every version I have on hand still says
'Error: near ",": syntax error' so I thought I'd track down the details :)
-Rowan
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Performance problem with DELETE FROM/correlated subqueries

2020-02-07 Thread Clemens Ladisch
Jürgen Baier wrote:
>   CREATE TABLE main ( ATT1 INT, ATT2 INT, PRIMARY KEY (ATT1,ATT2) );
>   CREATE TABLE staging ( ATT1 INT, ATT2 INT );
>
> Then I execute
>
>   DELETE FROM main WHERE EXISTS (SELECT 1 FROM staging WHERE main.att1 = 
> staging.att1 AND main.att2 = staging.att2)
>
> which takes a very long time.

DELETE FROM main WHERE (att1, att2) IN (SELECT att1, att2 FROM staging);


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Performance problem with DELETE FROM/correlated subqueries

2020-02-07 Thread Jürgen Baier

Hi,

I have a question regarding the performance of DELETE FROM (or maybe 
better: correlated subqueries).


I have a table "main" and a table "staging". In "staging" I have a 
subset of "main". I want to delete all rows from "main" which are also 
in "staging".


  CREATE TABLE main ( ATT1 INT, ATT2 INT, PRIMARY KEY (ATT1,ATT2) );
  CREATE TABLE staging ( ATT1 INT, ATT2 INT );

Then I execute

  DELETE FROM main WHERE EXISTS (SELECT 1 FROM staging WHERE main.att1 
= staging.att1 AND main.att2 = staging.att2)


which takes a very long time. As far as I understand the query plan 
SQLite scans the full staging table for each row in "main":


  sqlite> EXPLAIN QUERY PLAN DELETE FROM main WHERE EXISTS (SELECT 1 
FROM staging WHERE main.att1 = staging.att1 AND main.att2 = staging.att2)

     ...> ;
  QUERY PLAN
  |--SCAN TABLE main
  `--CORRELATED SCALAR SUBQUERY
     `--SCAN TABLE staging

How do I speed this up? The idea is that the database should scan 
"staging" and lookup "main" (because an appropriate primary index exists).


But I'm open to any alternative approach. I just have the situation that 
I have a very large "main" table and a "staging" table which contains 
also a large number of tuples which should be deleted from "main".


Any ideas?

Thanks,

Jürgen

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users