Re: [sqlite] INTERSECT optimization, is it possible?

2010-01-10 Thread Max Vlasov
On Mon, Jan 11, 2010 at 12:56 AM, D. Richard Hipp  wrote:

>
> On Jan 10, 2010, at 4:50 AM, Max Vlasov wrote:
>
> > Documentation says that INTERSECT implemented with temporary tables
> > either
> > in memory or on disk. Is it always the case?
>
> No.
>
> If there is an ORDER BY clause, SQLite may run each subquery as a
> separate co-routine and merge the results.


Thanks a lot,
At the first place I tried to append ORDER BY in every query that led to
"ORDER BY clause should come after intersect not before" error, but I did
not read this message properly just to place the ORDER BY once at the end of
the query. The time improved dramatically from 50 seconds to 6 and it was an
extreme case so the real life queries dropped to below a second.
Needless to say, but I'm very impressed about how sqlite takes almost every
aspect into account
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INTERSECT optimization, is it possible?

2010-01-10 Thread D. Richard Hipp

On Jan 10, 2010, at 4:50 AM, Max Vlasov wrote:

> Documentation says that INTERSECT implemented with temporary tables  
> either
> in memory or on disk. Is it always the case?

No.

If there is an ORDER BY clause, SQLite may run each subquery as a  
separate co-routine and merge the results.  If the ORDER BY on both  
subqueries can be computed using indices, then the INTERSECT will run  
in either linear or logarithmic time (depending on what indices are  
available) and in constant space.  This is also true of UNION and  
EXCEPT.

If there is an ORDER BY clause but there does not exist indices needed  
to implement the ORDER BY for one or both subqueries, then the  
subquery might get evaluated into a temporary table, and sorted there,  
prior to the merge step.

I *think* it will always be the case that if an INTERSECT query has an  
ORDER BY clause and if sqlite3_stmt_status(db, SQLITE_STMTSTATUS_SORT,  
1) returns 0, then the query does not use temporary tables and runs in  
constant space.  But I might be wrong.  And in any event, that rule is  
subject to change in a future release if we decide we can get better  
performance by doing things differently.

D. Richard Hipp
d...@hwaci.com



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] INTERSECT optimization, is it possible?

2010-01-10 Thread darren
Considering that INTERSECT is logically nothing but a special case of
relational join (the exact opposite of cartesian product), where all
columns are involved in the join condition, you should just be able to
reuse any optimizations that exist for join, including primary/unique
keys/etc. -- Darren Duncan

> Documentation says that INTERSECT implemented with temporary tables either
> in memory or on disk. Is it always the case? The problem is that if I have
> several selects (six for example) when each produces thousands of results
> and the intersection is only hundreds the query takes about minute to
> execute. But all these selects are properly ordered to do intersection
> without extra files usage. For example I know that I can make it in my
> code
> by splitting these selects into separated prepared statements and making
> synchronous stepping outputting only if all steps have equal value(s). If
> there's no such optimization, is it possible to implement it in the sqlite
> engine? I know that having very complex query with many selects it would
> be
> a hard task to recognize such a specific case but maybe it is easier than
> it
> seems.


___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] INTERSECT optimization, is it possible?

2010-01-10 Thread Max Vlasov
Documentation says that INTERSECT implemented with temporary tables either
in memory or on disk. Is it always the case? The problem is that if I have
several selects (six for example) when each produces thousands of results
and the intersection is only hundreds the query takes about minute to
execute. But all these selects are properly ordered to do intersection
without extra files usage. For example I know that I can make it in my code
by splitting these selects into separated prepared statements and making
synchronous stepping outputting only if all steps have equal value(s). If
there's no such optimization, is it possible to implement it in the sqlite
engine? I know that having very complex query with many selects it would be
a hard task to recognize such a specific case but maybe it is easier than it
seems.

Thanks,

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users