Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Dimitris Bil
I see, so in the native implementation you already have the whole table in 
memory and only use the clustered b-tree index to search for tuples. So I would 
not  expect a large improvement from the virtual table implementation, but the 
virtual table being 5 times slower is strange. Maybe not the correct data 
structure used?


By the way, I had tried adding a virtual table in the sqlite amalgamation and I 
did not see observable difference. On the other hand, I have seen improvement 
in the virtual table utilization using the latest version of sqlite (in 
comparison to a release about a year ago).



From: sqlite-users  on behalf of 
Bob Friesenhahn 
Sent: Wednesday, February 8, 2017 4:09 PM
To: SQLite mailing list
Subject: Re: [sqlite] Virtual table vs real table query performance

On Wed, 8 Feb 2017, Dimitris Bil wrote:

> Do you perform the benchmark on the native database table using cold
> cache or warm cache? Also, can you briefly describe what the
> benchmark does and give the schema for the native database table?

My benchmark repeatedly reads all of the columns one by one given row
id and column name.  The table is read many (e.g. 100) times so this
is a warm cache test.

The schema is not terribly important but the table we are trying to
optimize (with 1800 or less rows) contains a 64-bit rowid, five
integer values, and two short text string values.

   int64, uint32, uint32, text[16], uint8, text[16], text[18], uint8,
   uint32

What I am looking for is expected average virtual table performance vs
native table performance for repeated column reads.

Due to being a generic implementation (supporting many virtual
tables), our virtual implementation uses programmed/dynamic
marshalling rather that compiled marshalling.  The schema definition
is also dynamically generated.

There are implementation overheads and it is useful to know what
performance is possible (e.g. compared to native table performance)
in order to know when the implementation is about as good as it can
be.

Bob
--

___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Dominique Devienne
On Wed, Feb 8, 2017 at 4:50 PM, Hick Gunter  wrote:

> This is with a generic, user definable table and index structure capable
> virtual table implementation; it is not a "one module per table" statically
> typed heavily optimized implementation.
>

Ah, that makes complete sense then. I didn't want the OP to think virtual
tables were slower
than native tables in the general case, especially since he mentioned
memory arrays in C code.

And indeed the virtual-table advantage I mentioned is with a different
statically-typed vtable impl/module per vtable,
with statically defined indexes, where the table structure is hard-coded in
the vtable impl itself, and corresponds
to a native "row" data structure. In that config one leverages the
"front-end" of SQLite (parser and VDBE engine)
and very little of the "back-end" (pager and btree), except when SQLite
decides to make temporary tables for
query processing I guess. FWIW. --DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Bob Friesenhahn

On Wed, 8 Feb 2017, Dimitris Bil wrote:

Do you perform the benchmark on the native database table using cold 
cache or warm cache? Also, can you briefly describe what the 
benchmark does and give the schema for the native database table?


My benchmark repeatedly reads all of the columns one by one given row 
id and column name.  The table is read many (e.g. 100) times so this 
is a warm cache test.


The schema is not terribly important but the table we are trying to 
optimize (with 1800 or less rows) contains a 64-bit rowid, five 
integer values, and two short text string values.


  int64, uint32, uint32, text[16], uint8, text[16], text[18], uint8,
  uint32

What I am looking for is expected average virtual table performance vs 
native table performance for repeated column reads.


Due to being a generic implementation (supporting many virtual 
tables), our virtual implementation uses programmed/dynamic 
marshalling rather that compiled marshalling.  The schema definition 
is also dynamically generated.


There are implementation overheads and it is useful to know what 
performance is possible (e.g. compared to native table performance) 
in order to know when the implementation is about as good as it can 
be.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Hick Gunter
This is with a generic, user definable table and index structure capable 
virtual table implementation; it is not a "one module per table" statically 
typed heavily optimized implementation.

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Dominique Devienne
Gesendet: Mittwoch, 08. Februar 2017 16:42
An: SQLite mailing list 
Betreff: Re: [sqlite] Virtual table vs real table query performance

On Wed, Feb 8, 2017 at 4:30 PM, Hick Gunter  wrote:

> Values are for retrieving 100.000 rows with a where clause not
> satisfiable from the index but true for alle rows
>
> asql> select count() from ;
> CPU Time: user 0.092986 sys 0.00
>
> asql> select count() from  where =4;
> CPU Time: user 0.189971 sys 0.00
> CPU Time: user 0.199969 sys 0.00
> CPU Time: user 0.199970 sys 0.00
>
> asql> select count() from  where =4;
> CPU Time: user 0.086987 sys 0.010998
> CPU Time: user 0.085987 sys 0.00
> CPU Time: user 0.076988 sys 0.002000
>


Frankly I'm surprised it's slower than "native" SQLite.

In bulk-insert, random lookup, and table-delete timings we did in 2009 between 
native in-memory SQLite, and pure-C++ virtual tables accessing pure C++ data 
structures (i.e. vm/reflection/introspection/dynamic lookup as in Python for 
example, but direct addressing of statically typed data), the virtual tables 
was always faster, and not by a small margin.

Admittedly it was a long time ago, and SQLite is getting faster all the time 
for sure, but you can't beat static typing of memory addressable structures, vs 
scanning pages of table data and dynamically/serially decoding variable sizes 
rows within those pages.

So something like "non-native" code or something "dynamic" is hiding in the 
virtual table impl, no? --DD ___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Dominique Devienne
On Wed, Feb 8, 2017 at 4:30 PM, Hick Gunter  wrote:

> Values are for retrieving 100.000 rows with a where clause not satisfiable
> from the index but true for alle rows
>
> asql> select count() from ;
> CPU Time: user 0.092986 sys 0.00
>
> asql> select count() from  where =4;
> CPU Time: user 0.189971 sys 0.00
> CPU Time: user 0.199969 sys 0.00
> CPU Time: user 0.199970 sys 0.00
>
> asql> select count() from  where =4;
> CPU Time: user 0.086987 sys 0.010998
> CPU Time: user 0.085987 sys 0.00
> CPU Time: user 0.076988 sys 0.002000
>


Frankly I'm surprised it's slower than "native" SQLite.

In bulk-insert, random lookup, and table-delete timings we did in 2009
between native in-memory SQLite, and pure-C++ virtual tables accessing
pure C++ data structures (i.e. vm/reflection/introspection/dynamic lookup
as in Python for example, but direct addressing of statically typed data),
the virtual tables was always faster, and not by a small margin.

Admittedly it was a long time ago, and SQLite is getting faster all the time
for sure, but you can't beat static typing of memory addressable structures,
vs scanning pages of table data and dynamically/serially decoding variable
sizes
rows within those pages.

So something like "non-native" code or something "dynamic" is hiding in the
virtual table impl, no? --DD
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Hick Gunter
Values are for retrieving 100.000 rows with a where clause not satisfiable from 
the index but true for alle rows

asql> select count() from ;
CPU Time: user 0.092986 sys 0.00

asql> select count() from  where =4;
CPU Time: user 0.189971 sys 0.00
CPU Time: user 0.199969 sys 0.00
CPU Time: user 0.199970 sys 0.00

asql> select count() from  where =4;
CPU Time: user 0.086987 sys 0.010998
CPU Time: user 0.085987 sys 0.00
CPU Time: user 0.076988 sys 0.002000

-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Bob Friesenhahn
Gesendet: Mittwoch, 08. Februar 2017 15:39
An: SQLite mailing list 
Betreff: Re: [sqlite] Virtual table vs real table query performance

On Wed, 8 Feb 2017, Hick Gunter wrote:

> Having imlemented a memory-based virtual table complete with indices,
> full table scan and direct access via rowid (which happens to be the
> memory address of the row) I can do a batch delete of
> 100.000 rows (in a table with 1 composite index) in about 2 seconds

The case I am interested is pure read performance of a single column element at 
a time given properly implemented xBestIndex and xFilter support.  Rows are not 
being added/removed using sqlite.

It is possible that native tables can be faster since the implementation is not 
limited to the rigid set of callback functions provided for virtual tables to 
use and of course the amalgamation is optimized by the compiler as one source 
module.

By tracing the callbacks, we do see that our implementation is not invoking the 
callbacks more times than necessary (which was not the case before xBestIndex 
and xFilter support was added).  Due to the requirements of the implementation, 
POSIX reader/writer locks are used so there is some low-contention locking 
overhead.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Dimitris Bil
Hello,


Do you perform the benchmark on the native database table using cold cache or 
warm cache? Also, can you briefly describe what the benchmark does and give the 
schema for the native database table?


thanks



From: sqlite-users  on behalf of 
Bob Friesenhahn 
Sent: Wednesday, February 8, 2017 2:39 PM
To: SQLite mailing list
Subject: Re: [sqlite] Virtual table vs real table query performance



The case I am interested is pure read performance of a single column
element at a time given properly implemented xBestIndex and xFilter
support.  Rows are not being added/removed using sqlite.

It is possible that native tables can be faster since the
implementation is not limited to the rigid set of callback functions
provided for virtual tables to use and of course the amalgamation is
optimized by the compiler as one source module.

By tracing the callbacks, we do see that our implementation is not
invoking the callbacks more times than necessary (which was not the
case before xBestIndex and xFilter support was added).  Due to the
requirements of the implementation, POSIX reader/writer locks are used
so there is some low-contention locking overhead.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
GraphicsMagick Image Processing System<http://www.graphicsmagick.org/>
www.graphicsmagick.org
GraphicsMagick is a robust collection of tools and libraries to read, write, 
and manipulate an image in any of the more popular image formats including GIF, 
JPEG, PNG ...


<http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Bob Friesenhahn

On Wed, 8 Feb 2017, Hick Gunter wrote:

Having imlemented a memory-based virtual table complete with 
indices, full table scan and direct access via rowid (which happens 
to be the memory address of the row) I can do a batch delete of 
100.000 rows (in a table with 1 composite index) in about 2 seconds


The case I am interested is pure read performance of a single column 
element at a time given properly implemented xBestIndex and xFilter 
support.  Rows are not being added/removed using sqlite.


It is possible that native tables can be faster since the 
implementation is not limited to the rigid set of callback functions 
provided for virtual tables to use and of course the amalgamation is 
optimized by the compiler as one source module.


By tracing the callbacks, we do see that our implementation is not 
invoking the callbacks more times than necessary (which was not the 
case before xBestIndex and xFilter support was added).  Due to the 
requirements of the implementation, POSIX reader/writer locks are used 
so there is some low-contention locking overhead.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Hick Gunter
Having imlemented a memory-based virtual table complete with indices, full 
table scan and direct access via rowid (which happens to be the memory address 
of the row) I can do a batch delete of 100.000 rows (in a table with 1 
composite index) in about 2 seconds (3.7 seconds with the condition) while 
running linux (RH 5.6 x86_64 VM) on a virtual machine. Deleting all rows of a 
native SQLite table (while checking for the value of a non-indexed field to 
avoid SQLite just dropping an re-creating the table) takes about 1 second.



Note that both operations require a full table scan to fill a „rowset“ (= 
SQLite internal temporary table) and that the virtual table function VUpdate 
expects the virtual table code to handle index deletetion which is explicitly 
coded in the native table case.



asql> explain delete from ;

addr  opcode p1p2p3p4 p5  comment

  -        -  --  -

0 Trace  0 0 000  NULL

1 Goto   0 18000  NULL

2 Integer0 1 000  NULL

3 Null   0 2 000  NULL

4 VOpen  0 0 0 vtab:187BE588:2ACC1FDC4990  00  NULL

5 Integer1 4 000  NULL

6 Integer0 5 000  NULL

7 VFilter0 12400  NULL

8 Rowid  0 3 000  NULL

9 RowSetAdd  2 3 000  NULL

10AddImm 1 1 000  NULL

11VNext  0 8 000  NULL

12Close  0 0 000  NULL

13RowSetRead 2 16300  NULL

14VUpdate0 1 3 vtab:187BE588:2ACC1FDC4990  02  NULL

15Goto   0 13000  NULL

16ResultRow  1 1 000  NULL

17Halt   0 0 000  NULL

18VBegin 0 0 0 vtab:187BE588:2ACC1FDC4990  00  NULL

19Goto   0 2 000  NULL



asql> explain delete from  where =4;

addr  opcode p1p2p3p4 p5  comment

  -        -  --  -

0 Trace  0 0 000  NULL

1 Goto   0 31000  NULL

2 Integer0 1 000  NULL

3 Null   0 2 000  NULL

4 OpenRead   0 215   0 7  00  

5 Rewind 0 13000  NULL

6 Column 0 6 400  .

7 Integer4 5 000  NULL

8 Ne 5 124   collseq(BINARY)  6c  NULL

9 Rowid  0 3 000  NULL

10RowSetAdd  2 3 000  NULL

11AddImm 1 1 000  NULL

12Next   0 6 001  NULL

13Close  0 0 000  NULL

14OpenWrite  0 215   0 8  00  

15OpenWrite  1 1362  0  Keyinfo(5,BINARY,BINARY)  00  

16RowSetRead 2 27300  NULL

17NotExists  0 26300  NULL

18Rowid  0 11000  NULL

19Column 0 1 600  .

20Column 0 2 700  .

21Column 0 3 800  .

22Column 0 4 900  .

23Column 0 5 10   00  .

24IdxDelete  1 6 600  NULL

25Delete 0 1 000  NULL

26Goto   0 16000  NULL

27Close  1 1362  000  NULL

28Close  0 0 000  NULL

29ResultRow  1 1 000  NULL

30Halt   0 0 000  NULL

31Transaction0 1 000  NULL

32VerifyCookie   0 1191  000  NULL

33TableLock  0 215   100  NULL

34Goto   0 2 000  NULL



-Ursprüngliche Nachricht-
Von: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] Im 
Auftrag von Bob Friesenhahn
Gesendet: Dienstag, 07. Februar 2017 22:06
An: SQLite mailing list 
Betreff: [sqlite] Virtual table vs real table query performance



We are trying to improve the query perfo

[sqlite] Virtual table vs real table query performance

2017-02-07 Thread Bob Friesenhahn
We are trying to improve the query performance of our virtual table 
implementation (which is implemented in C).  Due to requirements of 
external code, a specified column of a specified row (by rowid) is 
queried at a time (the least efficient means of access).  Our virtual 
table is accessing entries in a memory-based array.


I have implemented a benchmark script written in Python using the APSW 
wrapper.  The benchmark script reveals that access to a native 
database table is 5 times faster than access to our virtual table.


Intuitively, I would think that access to a memory-based virtual table 
could be faster than native tables.  Our developer has implemented 
xBestIndex and xFilter support which is intended to result in direct 
access to the requested row rather than scanning the whole table.


What is the expected performance of a properly implemented virtual 
table (assuming little additional overhead) vs a native table?


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users