Hello,
On May 27, 2008, at 9:07 PM, Stephen Oberholtzer wrote:
> Well, the first thing you should bring away from this experience is
> that the
> number of VM instructions isn't really an indicator of how efficient
> the
> query is :)
Good point :)
> Now, I'm not sure exactly why one is faster than the other,
> especially since
> you didn't post your exact schema and indices,
The DDL is rather straightforward:
create table if not exists token
(
id integer primary key not null,
name text not null
)
create unique index if not exists token_name on token( name )
http://dev.alt.textdrive.com/browser/HTTP/Finder.ddl#L60
> and I have no idea how many rows there are in either table.
The incoming data set size varies from 108 to 3345 rows, with a
average size of around 930 rows. The target table size is about 75,148
rows. Over time, most of the incoming rows will already exist in the
target table.
> But if I had to guess, it's because of the ORDER BY clause. In
> general, an
> ORDER BY means that SQLite needs to generate a temporary table with
> all the
> rows to be selected/inserted,
> then sort that temporary table. The INSERT OR IGNORE version has to
> unconditionally sort the entire 'stage' table; your second query
> only has to
> sort those rows in 'stage' that don't already exist in 'table'. If
> each
> table fits comfortably in your computer's disk cache, the extra pass
> won't
> matter so much.
Ah... yes... the order by clause... good point... indeed removing the
'order by' from the 'insert or ignore' statement brings down its
execution time a whisker away from the 'self join' version :)
Thanks for the explaination!
Cheers,
--
PA.
http://alt.textdrive.com/nanoki/
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users