Hi,
thanks for explaining your syntax in another post. Now about virtual
tables if you don't mind.

On Fri, Feb 28, 2014 at 8:24 PM, Eleytherios Stamatogiannakis
<[email protected]> wrote:
>
> If we load into SQLite, ....
>
> create table newtable as select * from READCOMPRESSEDFILE('ctable.rc');
>
> it takes: 55 sec
>
>
> If we create an external program ....
>
> it takes: 19 sec (~3x faster than using the virtual table API)
>
>

Looking at your numbers, as a user (and fan :) of virtual tables I
decided to do some tests.

I have a virtual table "all values", it was designed for enumeration
of all tables values to the one single virtual table, so finally it is
a long list of

  TableName, TableRowId, FieldName, Value

so you get the idea. As an example of what it may do, you may open
places.sqlite of mozilla browser and do

  Select * from AllValues where Value Like "%sqlite.org%"

and see actual results even not knowing how they planned their schema.

Internally this virtual table simply uses general selects for all
other tables met in sqlite_master. This is a good (but probably not
the best) test for measuring virtual tables performance, because

  SELECT * FROM AllValues

is equivalent to reading all conventional tables of this database.
Besides
- the tool I use has a tweaker implemented with VFS that allows
measuring speed and other characteristics of the query performed while
the query is in effect.
- I have an option that forces resetting windows cache for the
database file when it is reopened. So with it we exclude the windows
cache from consideration so pure I/O reading is used. Btw, when you do
your comparison, it's very important to reset system cache before
every measurement that involves I/O.


So I took a comparatively large (500 Mb) database consisting of
several small and one big table (Posts) and compared two queries.

(Query1)

  Select sum(length(Body) + length(Title)) from Posts

This ones effectively reads the table data and uses
- length() to force sqlite reading texts that don't fit into single db page
- sum() to exclude accumulating results on my side from comparison, so
we have a single row, single column result from the work completely
done by sqlite.

(Query2)

  Select Sum(Length(Value)) from AllValues

This one performs basically the same but using sqlite virtual tables
api. It also touches other tables, but since they're small, we can
forget about this.

Query1 (General):
  Read: 540MB,
  Time: 24.2 sec,
  CPU Time: 6 Sec (25%)
  Speed: 22.31 MB/Sec

Query2 (Virtual):
  Read: 540MB,
  Time: 27.3 Sec,
  CPU Time: 13 sec (51%)
  Speed: 20 MB/Sec

In my particular test the noticeable difference is at the part of the
CPU spent more with the virtual table. I assume this can be related to
my own implementation of this virtual table since I should retrieve,
store values temporary somewhere and talk to sqlite. But this also may
shed light on your performance drop. If your virtual implementation
spend much time processing a value, you may finally get a big drop.

You may tell that this test is not fair because it does not involve
creating a table from the values of a virtual table. Unfortunately I
can't create good enough test comparing Posts and AllValues table as
sources, because the destination geometry of the tables are different
( Posts have more columns, less rows, AllValue less columns, more
rows). The closest approximation was possible when I created an
intermediate physical table containing the results from AllValues and
compared table creation from this table and from virtual table. The
virtual one took longer, but the values - 56 seconds vs 43 second not
different enough to conclude something.

I'm not sure my tests defend sqlite virtual tables sufficiently, but
currently I don't have evidence of significant inefficiency either.

Max
_______________________________________________
sqlite-users mailing list
[email protected]
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to