Well even with the best optimizer in the world this is likely to result in 
a complete table scan of table1, with each record then being checked 
against the results generated by the inner NOT IN select. 

When I do a simple test of having 2 tables in a clean database with just ID 
fields populated 1-1M and 1-100 this query takes 2.3s to run if I just 
request a count(*) back rather than the 999900 records that the original 
query will return. For me that 2.3s is basically disk I/O time, if you have 
records with more information a lot more information will be requested from 
the disk and so this will take longer. 

Try running the following at a h2 console after having restarted H2 to 
clear the cache and you will see the retrival time - if your cache is large 
enough the second time you run it should result in a near zero time taken.

select count(*) from 
(
   select  * from table1 
   where 
   ID 
   NOT IN 
   (
      select ID from table2 
   )
) 

The rest of the time will be the time spent dealing with the result set - 
can you move addtional processing to the database so ending up with a 
smaller result set?

Roger

On Tuesday, April 15, 2014 3:39:34 PM UTC+1, Devesh Mishra wrote:
>
> Hi,
>
> I am having performance problems while executing a query which has NOT IN 
> clause. Following is the query
>
> select  * from table1
> where 
> ID 
> NOT IN 
> (
> select ID from table2 
> ) 
>
> Table1 contains 1 million records
> table2 contains 100 records.
>
> expecting 1 million - 100 records as output. Both the columns are indexed.
>
> Please help, Thanks a lot in advance
>

-- 
You received this message because you are subscribed to the Google Groups "H2 
Database" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/h2-database.
For more options, visit https://groups.google.com/d/optout.

Reply via email to