>I don?t think I can always run an analyze on the TEMP table for each
>query. May ruin performance worse than trying my luck with the temp
>table.

>I think this boils down why a JOIN with a 500 row table (one column) is
>so much (several hundred times) slower than using an IN clause with 500
>values.

>I have not changed the code or query for a long time, so my assumption is
>that a change in one of the recent SQLite updates caused the changed
>behavior of the optimizer/query engine. I recall a similar problem maybe
>a year ago. I think only one of the SQLite developers may shed some light
>on this.

>For now I have increased the threshold for IN clauses (instead of TEMP
>tables) and use WHERE IN (SELECT ? from TEMP) instead of a JOIN.

Of course, when you use IN (...) the optimizer KNOWS that the RHS is unique -- 
whether you are using IN (...list of values...) or IN (SELECT x FROM y).

However, if you are using a temp table and using a JOIN, the optimizer only 
knows that the RHS join predicate is unique if you tell it that is the case.  
So for example, if you use:

create table temp_table (x integer);
insert into temp_table values (1),(2),(3) ...

then this query:

select ... from b join temp_table on b.col = temp_table.x;

is significantly different from:

select ... from b where col in (select x from temp_table);

but the following are equivalent to the later, and to using the list of values 
directly in the IN clause:

create table temp_table (x integer not null primary key);
insert into temp_table values (1),(2),(3) ...

select ... from b join temp_table where b.col = temp_table.x;

which is equivalent to:

select ... from b where exists (select 1 from temp_table where x = b.col);

and

select ... from b where col in (select x from temp_table);

and

select ... from b where col in temp_table;

I don't recall ever seeing your definition of your temp_table ...

---
Theory is when you know everything but nothing works.  Practice is when 
everything works but no one knows why.  Sometimes theory and practice are 
combined:  nothing works and no one knows why.





Reply via email to