Also, I want to note the rather clever "hack" from Hick Gunter, which works by passing a "proxy" for a row (instead of the row's actual values) up to SQLite. The proxy is then expanded, via the xColumn API, by a single virtual table sitting outside the whole query.

In this way, a query using multiple VTs needs to incur the xColumn performance penalty only once.
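The mechanics of that proxy trick can be sketched in plain Python (all the names here are invented for illustration; this is not SQLite code): the inner virtual tables hand an opaque proxy up the query tree, and only the outermost consumer expands it, so the costly per-column work happens exactly once per row.

```python
class RowProxy:
    """Opaque stand-in for a row; column values are materialized lazily."""
    def __init__(self, source_row):
        self._source = source_row
        self.expand_calls = 0  # track how often the real column access happens

    def expand(self):
        """Materialize the actual column values (the costly xColumn-style work)."""
        self.expand_calls += 1
        return list(self._source)

def inner_vt(rows):
    # The innermost virtual table wraps each row in a proxy.
    for row in rows:
        yield RowProxy(row)

def middle_vt(proxies):
    # Intermediate virtual tables pass proxies through without expanding them.
    for p in proxies:
        yield p

def outer_consumer(proxies):
    # Only the final consumer pays the per-column expansion cost.
    return [p.expand() for p in proxies]

proxies = list(middle_vt(inner_vt([(1, "a"), (2, "b")])))
result = outer_consumer(proxies)  # each proxy expanded exactly once
```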

Nevertheless, when people resort to clever hacks like that to work around the inefficiencies of an API, it is a hint that something might be problematic.

In a previous email of mine, I half-seriously, half-jokingly brought up the cost in watts of SQLite's worldwide usage. For something that is so widely used, even a few percent of efficiency improvement really makes a difference. It is not an "SQLite destroys/saves civilization" kind of difference, but IMHO it would be measurable in megawatts, if not gigawatts.

l.

On 4/3/2014 11:33 PM, Alek Paunov wrote:
On 04.03.2014 20:25, Eleytherios Stamatogiannakis wrote:
On 04/03/14 20:11, Alek Paunov wrote:
On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:
On 03/03/14 03:01, Alek Paunov wrote:
It seems that the easiest optimization for this (very common) VT use
case (bulk streaming) would be an SQLite add-on written in _C_,
implementing a vtable interface specialization containing an xNextPage
that "buffers" let's say 4K rows, or even better 16KB of data (in
addition to your initial proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an
extension to gain more traction in the future, it would probably be
better to choose a more standard format:

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some wide used encoding near to SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary
record streaming could be discussed and adopted across the SQLite
binding and add-on developers. The possible applications are not
limited to vtables ;-).

SQLite doesn't need any special format for the records that are passed
over to it. It already has the "bind" API, which would be very suitable
for the xNextRow function too.


It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?

In the link that you posted above, look for all the sqlite3_bind_xxxxx
functions. In SQLite, the bind API is used to pass parameters to prepared
statements. The way the bind API works is that you have a statement
parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API
"way" the program calls into SQLite to fill in a row's values.
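Python's stdlib sqlite3 module exposes this same push-style binding through statement parameters, which illustrates the model (the proposed xNextRow itself does not exist in SQLite):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b REAL)")

# The program pushes values into the prepared statement's parameter slots;
# under the hood the wrapper uses sqlite3_bind_int/sqlite3_bind_double,
# one call per column, rather than SQLite calling back per column.
conn.execute("INSERT INTO t VALUES (?, ?)", (10, 3.5))
conn.commit()

row = conn.execute("SELECT a, b FROM t").fetchone()
```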


Ah, OK. Let's see if I finally understood your idea:

- exec "insert into t select * from vt"
- VDBE calls xNextRow
- in the xNextRow implementation, the Python code calls something like
bind_xxx for each column, with the scalar addresses (allocated by you)
- xNextRow returns, the VDBE inserts the row, and you clean up on the next step
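A rough stdlib analogue of this flow (xNextRow is a proposed API, not a real one, so the virtual table is faked with a generator here) is feeding executemany from a Python generator: each iteration hands SQLite one complete row, which the wrapper then binds column by column.

```python
import sqlite3

def vt_rows():
    # Stand-in for the virtual table: each yield plays the role of one
    # "xNextRow" result, i.e. one complete row handed over at a time.
    for i in range(3):
        yield (i, i * 3.5)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b REAL)")
# executemany pulls a whole row per iteration and binds its columns itself,
# instead of SQLite calling back into Python once per column.
conn.executemany("INSERT INTO t VALUES (?, ?)", vt_rows())
conn.commit()
rows = conn.execute("SELECT a, b FROM t ORDER BY a").fetchall()
```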

Questions:

What stops you from making this wrapper right now (e.g. as an apsw patch,
or as a standalone SQLite add-on loaded via PyPy's FFI)?

How do you expect this model (managing one scalar allocation per cell
during the query) to perform in comparison with passing encoded row
pages (memory chunks) between SQLite and the script engine, especially
when the engine is not PyPy or LuaJIT?
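The row-page alternative described here can be sketched with the stdlib struct module (the page layout below is invented purely for illustration): many rows are packed into one contiguous buffer and decoded in a single pass, so one buffer crosses the language boundary instead of one scalar allocation per cell.

```python
import struct

ROW_FMT = "<id"                     # toy schema: one int32 + one float64 per row
ROW_SIZE = struct.calcsize(ROW_FMT)  # 12 bytes per row

def encode_page(rows):
    # Pack a batch of rows into one contiguous byte buffer (a "page").
    return b"".join(struct.pack(ROW_FMT, a, b) for a, b in rows)

def decode_page(page):
    # Decode the whole page in one pass on the consumer side; the producer
    # hands over a single buffer rather than per-cell scalars.
    return [struct.unpack_from(ROW_FMT, page, off)
            for off in range(0, len(page), ROW_SIZE)]

page = encode_page([(1, 0.5), (2, 1.5)])
rows = decode_page(page)
```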

Regards,
Alek


_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
