[h2] Re: Performances of nested queries

Vitali Sun, 03 May 2015 13:07:31 -0700

I would add also my case here.

Seems a query like


SELECT * FROM SOMETABLE WHERE SOMETABLE_ID IN (SELECT   ID FROM 
SOMEFUNCTION( .. static input parameters.. ))

also is not scalable very well.  SOMETABLE_ID  is a primary key with an 
index. SOMEFUNCTION is a deterministic  function returning ResultSet, 
parameters are static
Let's say main table contains 80000 records,  SOMEFUNCTION  returns 3000 
IDs.

The query above works 4 seconds.
If I rewrite it as


SELECT * FROM SOMETABLE INNER JOIN (SELECT   ID FROM SOMEFUNCTION( .. 
static input parameters.. ))  ids ON ids.ID = SOMETABLE_ID

then it works 10ms.

More items are  in set  - performance drops exponentially.

I think this case is much more widespread than the case that Quentine 
reported :)

A variant of the case is a prepared statement query like:
SELECT * FROM SOMETABLE WHERE SOMETABLE_ID IN (SELECT   ID FROM TABLE(ID 
INT=?))
when   Set<Integer> , for example,  is passed to a prepared statement and  
time complexity should be O(n) iterating  over IDs (that even passed a Java 
collection) and primary key index is asked to retrieve a record.

Inner join also helps:
SELECT * FROM SOMETABLE INNER JOIN (SELECT   ID FROM TABLE(ID INT=?)) ids 
ON ids.ID = SOMETABLE_ID

There is no problem to use inner joins, but for applications generating SQL 
dynamically an approach with   SOMETABLE_ID IN (...) is preferable. Imagine 
UPDATE, DELETE queries where  joins are not possible directly e.g.

Vitali



-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.

[h2] Re: Performances of nested queries

Reply via email to