Re: [sqlite] Compilation problems after upgrading from v3.7.6.3 to v3.8.4

2014-03-04 Thread ioannis
It seems that the compilation errors I reported earlier

error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 73531,
etc.

only occurred in debug mode, and were caused by the flag:
Program Database for Edit And Continue (/ZI)

After changing the flag to:
Program Database (/Zi)

everything compiles normally again, in both debug and release mode.
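
For reference, a minimal sketch (assuming MSVC) of the pattern in sqlite3.c
that trips over /ZI:

  /* Under /ZI (Edit and Continue), MSVC does not treat __LINE__ as a
  ** compile-time constant, so this static initializer fails with C2099;
  ** under /Zi it compiles fine. */
  static const int iLn = __LINE__ + 4;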


Re: [sqlite] Weird out of memory problem at prepare

2014-03-04 Thread Simon Slavin

On 4 Mar 2014, at 8:05pm, Eduardo Morras  wrote:

> The tables have 4 rows each,

Hahahaha.  Hah.  That changes things.

You have something wrong.  Either the database is corrupt (check it with
PRAGMA integrity_check, or make another one) or your code is messed up somehow.

As a test, open that database with the SQLite shell tool and execute exactly 
the same command that gave you the memory error in your code.  I bet it works 
fine.
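
For example (a sketch; substitute your own database file and query):

  $ sqlite3 mydatabase.db
  sqlite> PRAGMA integrity_check;
  sqlite> SELECT r.name, s.content FROM resource AS r, static AS s
     ...>   WHERE (r.ids = s.ids AND r.name = 'somename');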

Simon.


Re: [sqlite] Virtual table API performance

2014-03-04 Thread Elefterios Stamatogiannakis
Also, I want to note the pretty clever "hack" from Hick Gunter that
works by passing a "proxy" for the row (instead of the whole row's
values) to SQLite. This proxy gets expanded using the xColumn API by a
virtual table outside the whole query.


In this way, a query using multiple VTs only needs to incur the xColumn
performance penalty once.


Nevertheless, having people do crazy/clever hacks like that to avoid the
inefficiencies of an API gives a hint that something might be problematic.


In a previous email of mine, I had said, half seriously/half jokingly,
something about the cost in watts of SQLite's worldwide usage. For something
that is so widely used, even a few percent of efficiency improvement really
makes a difference. It is not an "SQLite destroys/saves civilization" kind of
difference, but IMHO it would be measurable in gigawatts/megawatts.


l.

On 4/3/2014 11:33 μμ, Alek Paunov wrote:

On 04.03.2014 20:25, Eleytherios Stamatogiannakis wrote:

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very common) VT use case
(bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing an xNextPage callback
"buffering" let's say 4K rows, or even better 16KB of data (in addition
to your initial proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension
to gain more traction in the future, it would probably be better to
choose a more standard format.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite's types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed over
to it. It already has the "bind" API, which would be very suitable for
the xNextRow function too.



It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?


In the link that you posted above, look for all the sqlite3_bind_x
functions. In SQLite the bind API is used to pass parameters to prepared
statements. The way the bind API works is that you have a statement
parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API
"way", the program calls SQLite to fill in a row's values.



Ah, OK. Let's see if I finally understood your idea:

- exec "insert into t select * from vt"
- VDBE calls xNextRow
- in xNextRow implementation, the Python code calls something like
bind_xxx for each column with the scalar addresses (allocated by you)
- xNextRow returns, VDBE inserts the row, you clean on next step

Questions:

What stops you from making this wrapper right now (e.g. as an apsw patch or
a standalone sqlite add-on loaded by PyPy FFI)?

How do you expect this model (managing one scalar allocation per cell
during the query) to perform in comparison with passing encoded row
pages (memory chunks) between sqlite and the script engine, especially
when it is not PyPy or LuaJIT?

Regards,
Alek





Re: [sqlite] Virtual table API performance

2014-03-04 Thread Elefterios Stamatogiannakis

On 4/3/2014 11:33 μμ, Alek Paunov wrote:

On 04.03.2014 20:25, Eleytherios Stamatogiannakis wrote:

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very common) VT use case
(bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing an xNextPage callback
"buffering" let's say 4K rows, or even better 16KB of data (in addition
to your initial proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension
to gain more traction in the future, it would probably be better to
choose a more standard format.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite's types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed over
to it. It already has the "bind" API, which would be very suitable for
the xNextRow function too.



It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?


In the link that you posted above, look for all the sqlite3_bind_x
functions. In SQLite the bind API is used to pass parameters to prepared
statements. The way the bind API works is that you have a statement
parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API
"way", the program calls SQLite to fill in a row's values.



Ah, OK. Let's see if I finally understood your idea:

- exec "insert into t select * from vt"
- VDBE calls xNextRow
- in xNextRow implementation, the Python code calls something like
bind_xxx for each column with the scalar addresses (allocated by you)
- xNextRow returns, VDBE inserts the row, you clean on next step

Questions:

What stops you from making this wrapper right now (e.g. as an apsw patch or
a standalone sqlite add-on loaded by PyPy FFI)?


Yes, these things can be done. We have written our own APSW-compatible
FFI for PyPy, in which we have tried techniques like the one you
describe. The end result wasn't that much faster, because no matter how
many wrappers there are above the VT API, its efficiency characteristics
are still there.


How do you expect this model (managing one scalar allocation per cell
during the query) to perform in comparison with passing encoded row
pages (memory chunks) between sqlite and the script engine, especially
when it is not PyPy or LuaJIT?



Essentially, in my first post about the VT API's efficiency, the comparison
was between the bind API (similar to your "encoded pages" idea, but for a
single row) and the current xColumn-based VT API.


In our tests with Python, the difference between the two APIs was 3x.
I've seen another post here by somebody else who was using a natively
compiled VT implementation, saying that the difference was 2x. Both of
these observations say that there is some inefficiency in the current
situation with the VT API.


If this API were also used within SQLite's VM, maybe adding an xNextRow
opcode would speed up SQLite's VM too?


Best,

l.


Regards,
Alek





Re: [sqlite] Virtual table API performance

2014-03-04 Thread Alek Paunov

On 04.03.2014 20:25, Eleytherios Stamatogiannakis wrote:

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very common) VT use case
(bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing an xNextPage callback
"buffering" let's say 4K rows, or even better 16KB of data (in addition
to your initial proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension
to gain more traction in the future, it would probably be better to
choose a more standard format.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite's types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed over
to it. It already has the "bind" API, which would be very suitable for
the xNextRow function too.



It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?


In the link that you posted above, look for all the sqlite3_bind_x
functions. In SQLite the bind API is used to pass parameters to prepared
statements. The way the bind API works is that you have a statement
parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API
"way", the program calls SQLite to fill in a row's values.



Ah, OK. Let's see if I finally understood your idea:

- exec "insert into t select * from vt"
- VDBE calls xNextRow
- in xNextRow implementation, the Python code calls something like 
bind_xxx for each column with the scalar addresses (allocated by you)

- xNextRow returns, VDBE inserts the row, you clean on next step

Questions:

What stops you from making this wrapper right now (e.g. as an apsw patch or
a standalone sqlite add-on loaded by PyPy FFI)?


How do you expect this model (managing one scalar allocation per cell
during the query) to perform in comparison with passing encoded row
pages (memory chunks) between sqlite and the script engine, especially
when it is not PyPy or LuaJIT?


Regards,
Alek



Re: [sqlite] Compilation problems after upgrading from v3.7.6.3 to v3.8.4

2014-03-04 Thread Stephan Beal
On Tue, Mar 4, 2014 at 8:59 PM, ioannis  wrote:

> error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 73531
>

According to the MS docs, __LINE__ is a "decimal integer constant" in VS
2013:

http://msdn.microsoft.com/en-us/library/b0084kay.aspx

but what you're seeing seems to imply that __LINE__ is resolving to a
function call?

Weird.
-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf


Re: [sqlite] Weird out of memory problem at prepare

2014-03-04 Thread mm.w
Hello,

what's your SharedSection value? [heap limitation]

Best Regards.


On Tue, Mar 4, 2014 at 12:16 PM, Clemens Ladisch  wrote:

> Eduardo Morras wrote:
> > Trying a simple "SELECT r.name FROM resource AS r WHERE r.name = ?" gets
> > "Out of memory" too, when calling prepare_v2.
>
> This has nothing to do with the query itself.
>
> If you aren't doing something funny with the memory allocator, it's likely
> that SQLite's database object got corrupt by some other buggy code.
>
>
> Regards,
> Clemens


Re: [sqlite] Compilation problems after upgrading from v3.7.6.3 to v3.8.4

2014-03-04 Thread Richard Hipp
On Tue, Mar 4, 2014 at 2:59 PM, ioannis  wrote:

> I am having compilation problems after upgrading from v3.7.6.3 to v3.8.4,
> on the lines containing statements like:
>
> static const int iLn = __LINE__+4;
>

We have not had any trouble here.  Are you redefining __LINE__ somewhere in
your environment?



>
> Tried with VS2012 and VS2013, can someone help me fix this ?
>
> error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 73531
> error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 97602
> error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 97865
> error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 98568
> error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 106055
>
>
> Additionally, I was wondering whether the following is correct:
>  __LINE__+4 seems to point to the commented line !!!
>
> 73527  ** The sqlite3_blob_close() function finalizes the vdbe program,
> 73528  ** which closes the b-tree cursor and (possibly) commits the
> 73529  ** transaction.
> 73530  */
> 73531  static const int iLn = __LINE__+4;
> 73532  static const VdbeOpList openBlob[] = {
> 73533    /* {OP_Transaction, 0, 0, 0},  // 0: Inserted separately */
> 73534    {OP_TableLock, 0, 0, 0},   /* 1: Acquire a read or write lock */
> 73535    /* One of the following two instructions is replaced by an OP_Noop. */
> 73536    {OP_OpenRead, 0, 0, 0},    /* 2: Open cursor 0 for reading */
> 73537    {OP_OpenWrite, 0, 0, 0},   /* 3: Open cursor 0 for read/write */
> 73538    {OP_Variable, 1, 1, 1},    /* 4: Push the rowid to the stack */
>
> Thanks in advance!



-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Weird out of memory problem at prepare

2014-03-04 Thread Clemens Ladisch
Eduardo Morras wrote:
> Trying a simple "SELECT r.name FROM resource AS r WHERE r.name = ?" gets "Out
> of memory" too, when calling prepare_v2.

This has nothing to do with the query itself.

If you aren't doing something funny with the memory allocator, it's likely
that SQLite's database object got corrupt by some other buggy code.


Regards,
Clemens


Re: [sqlite] Weird out of memory problem at prepare

2014-03-04 Thread Eduardo Morras
On Tue, 4 Mar 2014 15:19:24 +
Simon Slavin  wrote:

> 
> On 4 Mar 2014, at 3:15pm, Simon Slavin  wrote:
> 
> > On 4 Mar 2014, at 3:09pm, Eduardo Morras  wrote:
> > 
> >> zSql= "SELECT r.name, s.content FROM resource AS r, static AS s
> >> WHERE (r.ids = s.ids AND r.name = ?);";
> >> 
> >> [snip]
> >> 
> >> Error on query: out of memory
> > 
> > I think this might require comparing every row in resource with
> > every row in static.  Which is a lot of temporary data to hold in
> > memory.
> > 
> > You might try something like
> > 
> > SELECT r.name, s.content FROM resource AS r JOIN static AS s ON
> > s.ids = r.ids  WHERE r.name = ?
> 
> and, of course, an index
> 
> CREATE INDEX i1 on resource (name,ids)
> 
> will make it run extremely quickly.

I'll reply both answers here, if you don't mind.

Thanks Simon, for the answers.

The tables have 4 rows each; that's why I was surprised by the Out of
Memory error. The biggest row is 12KB, and with the join I do, it shouldn't use
more than 200KB.

Changing the ',' to the join you propose gives Out of Memory too. It happens
in the prepare phase, before binding the ? to my data. The query didn't reach
the step call.

Trying a simple "SELECT r.name FROM resource AS r WHERE r.name = ?" gets "Out of
memory" too, when calling prepare_v2.

Surely something is rotten on my development platform...

> 
> Simon.

Thanks
---   ---
Eduardo Morras 


[sqlite] Compilation problems after upgrading from v3.7.6.3 to v3.8.4

2014-03-04 Thread ioannis
I am having compilation problems after upgrading from v3.7.6.3 to v3.8.4,
on the lines containing statements like:

static const int iLn = __LINE__+4;

Tried with VS2012 and VS2013; can someone help me fix this?

error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 73531
error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 97602
error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 97865
error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 98568
error C2099: initializer is not a constant ..\sqlite3\sqlite3.c 106055


Additionally, I was wondering whether the following is correct:
 __LINE__+4 seems to point to the commented line !!!

73527  ** The sqlite3_blob_close() function finalizes the vdbe program,
73528  ** which closes the b-tree cursor and (possibly) commits the
73529  ** transaction.
73530  */
73531  static const int iLn = __LINE__+4;
73532  static const VdbeOpList openBlob[] = {
73533    /* {OP_Transaction, 0, 0, 0},  // 0: Inserted separately */
73534    {OP_TableLock, 0, 0, 0},   /* 1: Acquire a read or write lock */
73535    /* One of the following two instructions is replaced by an OP_Noop. */
73536    {OP_OpenRead, 0, 0, 0},    /* 2: Open cursor 0 for reading */
73537    {OP_OpenWrite, 0, 0, 0},   /* 3: Open cursor 0 for read/write */
73538    {OP_Variable, 1, 1, 1},    /* 4: Push the rowid to the stack */

Thanks in advance!


Re: [sqlite] Virtual table API performance

2014-03-04 Thread Eleytherios Stamatogiannakis

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very common) VT use case
(bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing an xNextPage callback
"buffering" let's say 4K rows, or even better 16KB of data (in addition
to your initial proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension
to gain more traction in the future, it would probably be better to
choose a more standard format.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite's types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed over
to it. It already has the "bind" API, which would be very suitable for
the xNextRow function too.



It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?


In the link that you posted above, look for all the sqlite3_bind_x
functions. In SQLite the bind API is used to pass parameters to prepared
statements. The way the bind API works is that you have a statement
parameter "row", and you fill it by saying:


Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API
"way", the program calls SQLite to fill in a row's values.


Regards.

l.




For a paging API (which IMHO is too complex ATM), the bind API could be
extended with a row number parameter.






Re: [sqlite] Virtual table API performance

2014-03-04 Thread Alek Paunov

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very common) VT use case
(bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing an xNextPage callback
"buffering" let's say 4K rows, or even better 16KB of data (in addition
to your initial proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension
to gain more traction in the future, it would probably be better to
choose a more standard format.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite's types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed over
to it. It already has the "bind" API, which would be very suitable for
the xNextRow function too.



It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?


For a paging API (which IMHO is too complex ATM), the bind API could be
extended with a row number parameter.




Re: [sqlite] Virtual table API performance

2014-03-04 Thread Eleytherios Stamatogiannakis

Thank you for the explanation.

Answers inline.

On 04/03/14 16:16, Hick Gunter wrote:

Your VT1 table already has an xColumn implementation, possibly doing something 
like

 switch( p_column )
 {
 case 1: sqlite3_result_xxx( p_ctx, v_rec->f1, ...); break;
 ...
 case n: sqlite3_result_xxx( p_ctx, v_rec->fn, ...); break;
 }

This needs to have two cases added:

 case n+1: sqlite3_result_int64( p_ctx, (uintptr_t)v_rec ); break;
 case n+2: sqlite3_result_int64( p_ctx, (uintptr_t)func );  break;

where

 static int func( void *p_rec, sqlite3_context *p_ctx, int p_column );

calls

 xColumn( v_cursor, p_ctx, p_column );

with a dummy cursor structure as defined for your table.

The VT2 table can then prepare "select __rec,__func from VT1", and in its 
xColumn implementation it calls

 v_rec  = (void *)sqlite3_column_int64( v_stmt, 0 ); // can be stored and cleared in the xNext function
 v_func = (func *)sqlite3_column_int64( v_stmt, 1 ); // can be stored
 v_func( v_rec, p_ctx, p_column );


I see, so you do a trick similar to what we do when passing Python's
generators as values in SQLite.




As for your second example, as written it does not suffer from the effect 
because you are already selecting c1, c2 and c3 at the bottom level.

Rewritten as

Select processrow(c1,c2,c3) from VT2(select * from VT1);


Without knowing what VT2 will do, I don't think that this rewriting can
happen. For example, "processrow" might return generators (nested
tables) that get expanded by VT2. If you moved it outside VT2, then the
generators would not be expanded.


Regards,

l.



results in the VT1 xColumn function getting called (via the VT2 xColumn 
function) just 3 times per row.

Additionally, you may like to "select __func from VT1 limit 1" and store that in your 
xFilter implementation; and then "select __rec from VT1" in your xNext implementation to 
have sqlite3_result_int64() called half as often.

HTH

-Original Message-
From: Eleytherios Stamatogiannakis [mailto:est...@gmail.com]
Sent: Tuesday, 04 March 2014 14:15
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

Could you explain your solution some more?

Does it work in this following case:

select * from VT2(select * from VT1);

by directly passing the rows from VT1 to VT2 (short-circuiting SQLite)?

What would happen in the following case?:

select * from VT2(select processrow(c1,c2,c3) from VT1);

Regards,

l.

On 03/03/14 14:17, Hick Gunter wrote:

We have gotten around this problem by defining "virtual" fields that contain a reference 
to the "current record" and the entrypoint of a wrapper around the xColumn function. That 
way only two fields get passed upwards through the virtual table stack and the top level virtual 
table's xColumn implementation calls straight through to the bottom layer's wrapper.

It does take some care to avoid sorting in between the layers and 
re-preparation of statements on schema changes.

-Original Message-
From: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
Sent: Sunday, 02 March 2014 20:39
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

We have both input and output virtual tables that avoid hitting the hard disk 
and are also able to compress the incoming and outgoing data.

We have a virtual table that takes as input a query and sends the data to a port on another 
machine. This virtual table is called "OUTPUT". And another virtual table that takes as 
input data from another port and forwards it into SQLite. Let's call it "INPUT". A query
that uses these two virtual tables would look like this in madIS:

OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');

We actually use queries like above (actually we don't do it directly to ports 
but to buffered named pipes that are then forwarded via netcat) to run 
distributed queries on clusters, connecting all the local SQLite/madIS 
instances on the different machines together.

The main point that I want to make with the above explanation is that we don't view
SQLite only as a traditional database. We also view it as a data stream 
processing machine, that doesn't have the requirement for the data to be stored 
on a hard disk.

Under this view, the efficiency of the virtual table api is very important. 
Above query only uses 2 VTs in it, but we have other queries that use a lot 
more VTs than that.

estama


On 2/3/2014 9:34 μμ, Max Vlasov wrote:

On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis
 wrote:


Our main test case is TPCH, a standard DB benchmark. The "lineitem"
table of TPCH contains 16 columns, which for 10M rows would require
160M xColumn callbacks, to pass it through the virtual table API.
These callbacks are very expensive, especially when at the other end
sits a VM (CPython or PyPy) handling them.




Re: [sqlite] Virtual table API performance

2014-03-04 Thread Eleytherios Stamatogiannakis
In our tests, in the time that xColumn needs to acquire 2 values, the bind
API can acquire more than 20 values from the Python side in a single call.


Most of the cost is in the callback and not in submitting a row's values
through the bind API.


So with the exception of queries that need only 1 column, IMHO 
everything else should go through the xNextRow API.


To keep the complexity to a minimum, my proposal is to use the
xNextRow API only for queries that just "scan" over a virtual table (no
filtering).


l.

On 04/03/14 18:23, Hick Gunter wrote:

My guess: Yes.

It would require implementing a new opcode, either only for virtual tables or
for native tables too, that accepts a list of field numbers (currently
there are only 5 parameters possible for an opcode and some of them have fixed 
meanings).

And the logic to generate these opcodes based on the capabilities of the loaded table
module combined with the requirements of the subject query (fields required for JOIN are 
fetched separately from those required for the result set) and the result of the 
xBestIndex calls (where it is possible to set the "omit" flag to suppress 
generation of a comparison). This also adds to the complexity of register allocation.

Take for example a join that needs 3 fields for the comparison, 2 of which are 
also required for the result set of 7 fields total. Do you request all 10 
fields at once (wastes up to 9 fields worth of effort if the JOIN is not met)? 
Or the 3 fields first and the 5 others only if the join matches (must allocate 
consecutive registers to build a result set)? Or 3 first and then 7 (which 
approximates the current behavior, as the 2 common fields are fetched twice on 
a match)?

And a set of new sqlite3_result routines that specify which of the various 
requested fields' value is being set.

-Original Message-
From: J. Merrill [mailto:j.merr...@enlyton.com]
Sent: Tuesday, 04 March 2014 16:23
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

Eleytherios Stamatogiannakis wrote

Our main test case is TPCH, a standard DB benchmark. The "lineitem"
table of TPCH contains 16 columns, which for 10M rows would require
160M xColumn callbacks, to pass it through the virtual table API.
These callbacks are very expensive, especially when at the other end
sits a VM (CPython or PyPy) handling them.


Would it be very difficult to arrange for an option that would request that 
SQLite issue a single more-complex xMultiColumns (a sample name) callback 
request, with a way for multiple results to be returned, rather than many 
xColumn callbacks? This would reduce the number of calls across the VM boundary.

Applications that don't implement xMultiColumns (and request its use) would see 
no change; those that do would get the performance boost.

J. Merrill





---
Gunter Hick
Software Engineer

Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna,
Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at






Re: [sqlite] Virtual table API performance

2014-03-04 Thread Hick Gunter
My guess: Yes.

It would require implementing a new opcode, either only for virtual tables or
for native tables too, that accepts a list of field numbers (currently
there are only 5 parameters possible for an opcode and some of them have fixed 
meanings).

And the logic to generate these opcodes based on the capabilities of the
loaded table module combined with the requirements of the subject query (fields 
required for JOIN are fetched separately from those required for the result 
set) and the result of the xBestIndex calls (where it is possible to set the 
"omit" flag to suppress generation of a comparison). This also adds to the 
complexity of register allocation.

Take for example a join that needs 3 fields for the comparison, 2 of which are 
also required for the result set of 7 fields total. Do you request all 10 
fields at once (wastes up to 9 fields worth of effort if the JOIN is not met)? 
Or the 3 fields first and the 5 others only if the join matches (must allocate 
consecutive registers to build a result set)? Or 3 first and then 7 (which 
approximates the current behavior, as the 2 common fields are fetched twice on 
a match)?

And a set of new sqlite3_result routines that specify which of the various 
requested fields' value is being set.

-Original Message-
From: J. Merrill [mailto:j.merr...@enlyton.com]
Sent: Tuesday, 04 March 2014 16:23
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

Eleytherios Stamatogiannakis wrote
> Our main test case is TPCH, a standard DB benchmark. The "lineitem"
> table of TPCH contains 16 columns, which for 10M rows would require
> 160M xColumn callbacks, to pass it through the virtual table API.
> These callbacks are very expensive, especially when at the other end
> sits a VM (CPython or PyPy) handling them.

Would it be very difficult to arrange for an option that would request that 
SQLite issue a single more-complex xMultiColumns (a sample name) callback 
request, with a way for multiple results to be returned, rather than many 
xColumn callbacks? This would reduce the number of calls across the VM boundary.

Applications that don't implement xMultiColumns (and request its use) would see 
no change; those that do would get the performance boost.

J. Merrill





---
Gunter Hick
Software Engineer

Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna,
Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at



Re: [sqlite] Weird out of memory problem at prepare

2014-03-04 Thread Simon Slavin

On 4 Mar 2014, at 3:15pm, Simon Slavin  wrote:

> On 4 Mar 2014, at 3:09pm, Eduardo Morras  wrote:
> 
>> zSql= "SELECT r.name, s.content FROM resource AS r, static AS s WHERE (r.ids 
>> = s.ids AND r.name = ?);";
>> 
>> [snip]
>> 
>> Error on query: out of memory
> 
> I think this might require comparing every row in resource with every row in 
> static.  Which is a lot of temporary data to hold in memory.
> 
> You might try something like
> 
> SELECT r.name, s.content FROM resource AS r JOIN static AS s ON s.ids = r.ids 
>  WHERE r.name = ?

and, of course, an index

CREATE INDEX i1 on resource (name,ids)

will make it run extremely quickly.

Simon.


Re: [sqlite] Virtual table API performance

2014-03-04 Thread J. Merrill
Eleytherios Stamatogiannakis wrote
> Our main test case is TPCH, a standard DB benchmark. The "lineitem" 
> table of TPCH contains 16 columns, which for 10M rows would require 160M 
> xColumn callbacks, to pass it through the virtual table API. These 
> callbacks are very expensive, especially when at the other end sits a VM 
> (CPython or PyPy) handling them.

Would it be very difficult to arrange for an option that would request that
SQLite issue a single more-complex xMultiColumns (a sample name) callback
request, with a way for multiple results to be returned, rather than many
xColumn callbacks? This would reduce the number of calls across the VM
boundary.

Applications that don't implement xMultiColumns (and request its use) would
see no change; those that do would get the performance boost.
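
A sketch of what such a method's declaration might look like (entirely
hypothetical; xMultiColumns is not part of the SQLite API):

  /* Hypothetical vtable method: fill result contexts for several
  ** requested columns in one call, rather than one xColumn call each. */
  int (*xMultiColumns)(sqlite3_vtab_cursor *pCur,
                       sqlite3_context **apCtx, /* one context per column */
                       const int *aiCol,        /* requested column numbers */
                       int nCol);               /* how many columns */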

J. Merrill





Re: [sqlite] Weird out of memory problem at prepare

2014-03-04 Thread Simon Slavin

On 4 Mar 2014, at 3:09pm, Eduardo Morras  wrote:

>  zSql= "SELECT r.name, s.content FROM resource AS r, static AS s WHERE (r.ids 
> = s.ids AND r.name = ?);";
> 
> [snip]
> 
> Error on query: out of memory

I think this might require comparing every row in resource with every row in 
static.  Which is a lot of temporary data to hold in memory.

You might try something like

SELECT r.name, s.content FROM resource AS r JOIN static AS s ON s.ids = r.ids  
WHERE r.name = ?

Simon.


[sqlite] Handling SQLITE_PROTOCOL with WAL databases

2014-03-04 Thread Török Edwin
Hi,

I encounter the same issue as described here: 
https://www.mail-archive.com/sqlite-users%40sqlite.org/msg57092.html
usually once or twice a day, when running a performance test-suite for my 
application (using SQLite 3.8.1).

What is the correct way to handle the SQLITE_PROTOCOL errors that usually occur 
when using WAL databases from multiple processes?

If I retry the query (or run other queries) after an SQLITE_PROTOCOL error, is
it possible to get the database corrupted, as used to be the case with
SQLITE_BUSY?
(https://www.sqlite.org/cvstrac/wiki?p=CorruptionFollowingBusyError)
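
To be concrete, by "retry" I mean roughly this sketch (MAX_RETRIES is an
arbitrary bound of mine):

  int rc, retries = 0;
  do {
    rc = sqlite3_step(stmt);
    if (rc == SQLITE_PROTOCOL) sqlite3_reset(stmt);  /* reset before retrying */
  } while (rc == SQLITE_PROTOCOL && ++retries < MAX_RETRIES);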

I tend to think that the only safe way is to just close the database and quit 
the application when I encounter an SQLITE_PROTOCOL error.

Thanks,
--Edwin


[sqlite] Weird out of memory problem at prepare

2014-03-04 Thread Eduardo Morras

Hi,

I have this code that always fails, with the error output below:

=
  zSql= "SELECT r.name, s.content FROM resource AS r, static AS s WHERE (r.ids 
= s.ids AND r.name = ?);";

  stmt = NULL;
  rc = sqlite3_prepare_v2(db, zSql, strlen(zSql), &stmt, NULL);

  if ( rc != SQLITE_OK ){
    dprintf( log, "Error on query: %s\n", sqlite3_errmsg(db));
    dprintf( log, "Query : %s\n", zSql);
    dprintf( log, " at %s %d:\n", __func__, __LINE__);
    exit(1);
  }
=

Error on query: out of memory

Query : SELECT r.nombre, s.content FROM resource AS r, static AS s WHERE (r.ids 
= s.ids AND r.nombre = ?);
 at process_request 66:
=

The table schemas are:
"CREATE TABLE IF NOT EXISTS resource(\n"
"  rid INTEGER PRIMARY KEY NOT NULL,\n"  // Resource ID
"  type INTEGER NOT NULL,\n"             // Type of resource (music,video,+18)
"  ids INTEGER,\n"                       // FK Static Resource ID (if applicable)
"  sys INTEGER DEFAULT 0,\n"             // Is a system file (do not delete)
"  replicated INTEGER DEFAULT 0,\n"      // Is resource replicated somewhere (and safe)
"  nombre TEXT NOT NULL,\n"              // Resource name (filename or appropriate)
"  path TEXT\n"                          // Path to resource
");"

and

"CREATE TABLE IF NOT EXISTS static(\n"
"  ids INTEGER PRIMARY KEY,\n"// Static resource ID
"  desc TEXT,\n"  // Description
"  content BLOB\n"// Main content
");\n"

I don't know why I get an Out of memory error preparing the query. Vars are
initialized, db points to an open sqlite3 db, and stmt is used for the first
time in the code. I use similar code in other projects and it works with no
problems.

Any clue?

Thanks

---   ---
Eduardo Morras 


Re: [sqlite] Virtual table API performance

2014-03-04 Thread Hick Gunter
Your VT1 table already has an xColumn implementation, possibly doing something 
like

switch( p_column )
{
case 1: sqlite3_result_xxx( p_ctx, v_rec->f1, ...); break;
...
case n: sqlite3_result_xxx( p_ctx, v_rec->fn, ...); break;
}

This needs to have two cases added:

case n+1: sqlite3_result_int64( p_ctx, (uintptr_t)v_rec ); break;
case n+2: sqlite3_result_int64( p_ctx, (uintptr_t)func );  break;

where

static int func( void *p_rec, sqlite3_context *p_ctx, int p_column );

calls

xColumn( v_cursor, p_ctx, p_column );

with a dummy cursor structure as defined for your table.

The VT2 table can then prepare "select __rec,__func from VT1", and in its 
xColumn implementation it calls

  v_rec  = (void *)sqlite3_column_int64( v_stmt, 0 ); // can be stored and cleared in the xNext function
  v_func = (func *)sqlite3_column_int64( v_stmt, 1 ); // can be stored
  v_func( v_rec, p_ctx, p_column );


As for your second example, as written it does not suffer from the effect 
because you are already selecting c1, c2 and c3 at the bottom level.

Rewritten as

Select processrow(c1,c2,c3) from VT2(select * from VT1);

results in the VT1 xColumn function getting called (via the VT2 xColumn 
function) just 3 times per row.

Additionally, you may like to "select __func from VT1 limit 1" and store that 
in your xFilter implementation; and then "select __rec from VT1" in your xNext 
implementation to have sqlite3_result_int64() called half as often.

HTH

-Original Message-
From: Eleytherios Stamatogiannakis [mailto:est...@gmail.com]
Sent: Tuesday, 04 March 2014 14:15
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

Could you explain your solution some more?

Does it work in this following case:

select * from VT2(select * from VT1);

by directly passing the rows from VT1 to VT2 (short-circuiting SQLite)?

What would happen in the following case?:

select * from VT2(select processrow(c1,c2,c3) from VT1);

Regards,

l.

On 03/03/14 14:17, Hick Gunter wrote:
> We have gotten around this problem by defining "virtual" fields that contain 
> a reference to the "current record" and the entrypoint of a wrapper around 
> the xColumn function. That way only two fields get passed upwards through the 
> virtual table stack and the top level virtual table's xColumn implementation 
> calls straight through to the bottom layer's wrapper.
>
> It does take some care to avoid sorting in between the layers and 
> re-preparation of statements on schema changes.
>
> -Original Message-
> From: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
> Sent: Sunday, 02 March 2014 20:39
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Virtual table API performance
>
> We have both input and output virtual tables that avoid hitting the hard disk 
> and are also able to compress the incoming and outgoing data.
>
> We have a virtual table that takes as input a query and sends the data to a 
> port on another machine. This virtual table is called "OUTPUT". And another 
> virtual table that takes as input data from another port and forwards it into 
> SQLite. Let's call it "INPUT". A query that uses these two virtual tables
> would look like this in madIS:
>
> OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');
>
> We actually use queries like above (actually we don't do it directly to ports 
> but to buffered named pipes that are then forwarded via netcat) to run 
> distributed queries on clusters, connecting all the local SQLite/madIS 
> instances on the different machines together.
>
> The main point that I want to make with the above explanation is that we don't
> view SQLite only as a traditional database. We also view it as a data stream 
> processing machine, that doesn't have the requirement for the data to be 
> stored on a hard disk.
>
> Under this view, the efficiency of the virtual table api is very important. 
> Above query only uses 2 VTs in it, but we have other queries that use a lot 
> more VTs than that.
>
> estama
>
>
> On 2/3/2014 9:34 μμ, Max Vlasov wrote:
>> On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis
>>  wrote:
>>>
>>> Our main test case is TPCH, a standard DB benchmark. The "lineitem"
>>> table of TPCH contains 16 columns, which for 10M rows would require
>>> 160M xColumn callbacks, to pass it through the virtual table API.
>>> These callbacks are very expensive, especially when at the other end
>>> sits a VM (CPython or PyPy) handling them.
>>>
>>
>> Ok, not stating that the performance improvement is impossible, I will
>> explain why I'm a little sceptical about it.
>>
>> For every bulk insert we have a theoretical maximum we'd all be glad to
>> see sqlite perform at - the speed of simple file copying.
>> Sqlite can't be faster than that, but to be on par is a good goal.
>> This is not possible when 

Re: [sqlite] Virtual table API performance

2014-03-04 Thread Eleytherios Stamatogiannakis

Could you explain your solution some more?

Does it work in this following case:

select * from VT2(select * from VT1);

by directly passing the rows from VT1 to VT2 (short-circuiting SQLite)?

What would happen in the following case?:

select * from VT2(select processrow(c1,c2,c3) from VT1);

Regards,

l.

On 03/03/14 14:17, Hick Gunter wrote:

We have gotten around this problem by defining "virtual" fields that contain a reference 
to the "current record" and the entrypoint of a wrapper around the xColumn function. That 
way only two fields get passed upwards through the virtual table stack and the top level virtual 
table's xColumn implementation calls straight through to the bottom layer's wrapper.

It does take some care to avoid sorting in between the layers and 
re-preparation of statements on schema changes.

-Original Message-
From: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
Sent: Sunday, 02 March 2014 20:39
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

We have both input and output virtual tables that avoid hitting the hard disk 
and are also able to compress the incoming and outgoing data.

We have a virtual table that takes as input a query and sends the data to a port on another 
machine. This virtual table is called "OUTPUT". And another virtual table that takes as 
input data from another port and forwards it into SQLite. Let's call it "INPUT". A query
that uses these two virtual tables would look like this in madIS:

OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');

We actually use queries like above (actually we don't do it directly to ports 
but to buffered named pipes that are then forwarded via netcat) to run 
distributed queries on clusters, connecting all the local SQLite/madIS 
instances on the different machines together.

The main point that I want to make with the above explanation is that we don't view
SQLite only as a traditional database. We also view it as a data stream 
processing machine, that doesn't have the requirement for the data to be stored 
on a hard disk.

Under this view, the efficiency of the virtual table api is very important. 
Above query only uses 2 VTs in it, but we have other queries that use a lot 
more VTs than that.

estama


On 2/3/2014 9:34 μμ, Max Vlasov wrote:

On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis
 wrote:


Our main test case is TPCH, a standard DB benchmark. The "lineitem"
table of TPCH contains 16 columns, which for 10M rows would require
160M xColumn callbacks, to pass it through the virtual table API.
These callbacks are very expensive, especially when at the other end
sits a VM (CPython or PyPy) handling them.



Ok, not stating that the performance improvement is impossible, I will
explain why I'm a little sceptical about it.

For every bulk insert we have a theoretical maximum we'd all be glad to
see sqlite perform at - the speed of simple file copying.
Sqlite can't be faster than that, but to be on par is a good goal.
This is not possible when an insert also means modification of other
parts of the file, for example when there's an index involved. But
let's forget about it. Finally, when new data is added, sqlite should
write a number of database pages; the cost of this part is absolutely
in the hands of the media (driver) and OS (driver).  But for every
database page write there's also a price to pay in CPU units, for the many
actions sqlite must do before the actual value is translated from what
the developer provided to what actually appears on disk.

The illustration of the CPU price is the following example
   CREATE TABLE t(Value)

on my ssd drive, multiple inserts (thousands) of
   insert into t (Value) values ('123456689  // this string contains many symbols, for example 1024)
performed with the speed 30 MB/Sec

but the query
   insert into t (Value) values (10)  // this is a small integer value only
performed at 3 Mb/Sec

Both show almost full CPU load. Why such a difference? Because with the
latter query the system can do more than 30 MB of writes in 1 second,
but it has to wait for sqlite, which spends 10 seconds in preparations.
The former is better because the CPU cost of passing a large text value to
sqlite is comparatively low compared to the time spent in I/O in
writing this to disk.

So the CPU price to pay isn't avoidable, and notice that in this example it is
not the virtual table API, it is the bind API. I suppose that the price we
pay for CPU spent in the virtual table API is on par with the average price
paid in sqlite as a whole. This means that if I transform the above
queries into inserts from virtual tables, the final speed difference
will be similar. And this also means that for your comparison tests
(when you get the x3 difference), the CPU price sqlite pays inside the bind
API and in its code wrapping the xColumn call is probably similar. The
rest is the share your code pays.

Well, I know that there are differences in CPU architectures 

Re: [sqlite] Virtual table API performance

2014-03-04 Thread Eleytherios Stamatogiannakis

On 03/03/14 03:01, Alek Paunov wrote:

On 02.03.2014 21:38, Elefterios Stamatogiannakis wrote:

Under this view, the efficiency of the virtual table api is very
important. Above query only uses 2 VTs in it, but we have other queries
that use a lot more VTs than that.


Max's tests in C show 2x CPU work, but he explains that the test is not
very sound, so let's say somewhere between 1x-2x. Your tests: 3x time.

As you have already identified, the real reason probably is the million-
scale callback quantity across the VM barrier - I do not follow PyPy,
but see these notes [1] by Mike Pall - the LuaJIT author (LuaJIT is the
leading project in the trace compilers field):

[1] http://luajit.org/ext_ffi_semantics.html#callback_performance

Also from one of the dozens of threads touching the subject:

[2] http://www.freelists.org/post/luajit/Yielding-across-C-boundaries,3

```
Entering the VM needs a lot of state setup and leaving it isn't
free either. Constantly entering and leaving the VM via a callback
from C *to* Lua has a high overhead. For short callbacks, the
switching overhead between C and Lua may completely dominate the
total CPU time.

Calling an iterator written in C via the FFI *from* a Lua program
is much cheaper -- this compiles down to a simple call instruction.
```


I remember that I had seen the above quote from Mike Pall, but I couldn't
find it.


Thank you for unearthing it.


Unfortunately, for your "insert into t select * from vt" case the
callback/iterator transformation is not possible (we do not have a
repetitive _step call to invert the control somehow). What to do?

It seems that the easiest optimization for this (very common) VT use case
(bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing an xNextPage callback
"buffering" let's say 4K rows, or even better 16KB of data (in addition
to your initial proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension
to gain more traction in the future, it would probably be better to
choose a more standard format.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite's types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed over
to it. It already has the "bind" API, which would be very suitable for
the xNextRow function too.


For a paging API (which IMHO is too complex ATM), the bind API could be 
extended with a row number parameter.
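
To illustrate, such hypothetical bind variants with a row number might be
declared like this (a sketch; these functions do not exist in SQLite):

  /* Hypothetical: like sqlite3_bind_int()/sqlite3_bind_double(), but
  ** addressing one row within a page of rows handed to SQLite. */
  int sqlite3_bind_row_int(sqlite3_stmt *pRow, int iRowNum, int iCol, int v);
  int sqlite3_bind_row_double(sqlite3_stmt *pRow, int iRowNum, int iCol, double v);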


Regards,

estama



Kind regards,
Alek





Re: [sqlite] "INSERT" and "END TRANSACTION" time

2014-03-04 Thread Stephan Beal
On Tue, Mar 4, 2014 at 3:44 AM, Simon Slavin  wrote:

>
> On 4 Mar 2014, at 1:15am, romtek  wrote:
>
> > I have a question based on my observation. According to your numbers for
> a
> > 5400 RPM disk, one write op should take about 11 ms. However, it often
> > takes only about 1 ms on HostGator drives on its shared hosting servers.
> > Are there drives that are SO much faster than 5400 RPM ones?
>
> I'll bet that the hosting servers are virtual machines and the drives are
> virtual drives, not physical drives.  Everything is actually done in RAM
> and just flushed to physical disk every so often.
>

A related anecdote: I develop primarily on 64-bit hardware but very often
have a 32-bit VM open for portability testing, and in my experience my
sqlite-based apps _invariably_ run 20-30% faster in my 32-bit VM than on my
64-bit HW. While I cannot say with 100% certainty why that is, others have
suggested (quite possibly correctly) that this is probably due to relevant
parts of the VM's virtual drive container being cached within the host OS.

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf


Re: [sqlite] Using WHERE clauses with REAL data

2014-03-04 Thread Simon Slavin

On 4 Mar 2014, at 10:33am, Donald Shepherd  wrote:

> It's a nice idea, but those are just some sample values generated by an
> emulator.  I've compromised and am using round() to limit the values to a few
> digits after the decimal point when doing the comparison.

If you're using randomly generated floating point numbers, and they're 
apparently not comparing properly, then these two things are about equally 
probable:

(A) the comparison does not work properly for numbers which are close together
(B) the strings of text which represent the numbers are incorrect

Think about how you display a number as decimal: you normally do it using a lot 
of multiplying by ten, then subtraction of an integer from a floating point 
value.  Multiplying a floating point number by an integer then subtracting an 
integer from the result is just as likely to be inaccurate as a comparison of 
two floating point numbers.

Simon.


Re: [sqlite] "INSERT" and "END TRANSACTION" time

2014-03-04 Thread Simon Slavin

On 4 Mar 2014, at 5:06am, romtek  wrote:

> If that were true, then I wouldn't be getting a very large speed-up when
> enveloping write ops in an explicit transaction, would I?

Sorry, I can't tell.  It depends on how the virtual storage mechanism works.
But if your performance characteristics when using a third-party server
resemble the performance of using a :memory: database, then they're using
virtualised main storage.  If you're seeing the same thing on your own
computer, you may find that your main storage is actually a hybrid drive
which uses a large solid state cache.

Virtual storage is what hosting companies do these days: it uses less power,
the servers take up less room, and the lack of moving parts means fewer
failures.  But you can't count on durability unless your hosting company
guarantees ACID behaviour and minimal loss on power failure, which I don't
think I've seen in standard cheap hosting contracts.

The take-away messages from all of this, if I may be so bold, are

(A) Your use of transactions in SQL should reflect which database modifications 
go together: which ones should fail if they can't all be done.  Performance 
issues are secondary.
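
For example (a sketch; the table and values are made up):

  BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
  COMMIT;  -- both updates apply together, or neither does on ROLLBACK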

(B) Your program should run "fast enough", not "as fast as possible".

Simon.


Re: [sqlite] Using WHERE clauses with REAL data

2014-03-04 Thread Donald Shepherd
It's a nice idea, but those are just some sample values generated by an
emulator.  I've compromised and am using round() to limit the values to a few
digits after the decimal point when doing the comparison.
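
For example, something like this (a sketch; the table and column names are
placeholders):

  SELECT * FROM samples WHERE round(value, 4) = round(?, 4);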


On 4 March 2014 21:27, Simon Slavin  wrote:

>
> On 4 Mar 2014, at 4:14am, Donald Shepherd 
> wrote:
>
> > It appears that using equals on floating point (REAL) data in WHERE
> clauses
> > doesn't necessarily work, presumably because of rounding errors - see
> below
> > for an example.  Is this the case?  Do I need to use BETWEEN instead of =
> > as I expect to be the case?
>
> A glance at those numbers suggests that their units are precise to 1/100th
> of a unit.  The proper reply to your question is that you should be storing
> and manipulating those values multiplied by 100, using integer arithmetic
> and INTEGER columns in your database.  This will mean you never have to
> worry about rounding or slack.
>
> Simon.


Re: [sqlite] Using WHERE clauses with REAL data

2014-03-04 Thread Simon Slavin

On 4 Mar 2014, at 4:14am, Donald Shepherd  wrote:

> It appears that using equals on floating point (REAL) data in WHERE clauses
> doesn't necessarily work, presumably because of rounding errors - see below
> for an example.  Is this the case?  Do I need to use BETWEEN instead of =
> as I expect to be the case?

A glance at those numbers suggests that their units are precise to 1/100th of a 
unit.  The proper reply to your question is that you should be storing and 
manipulating those values multiplied by 100, using integer arithmetic and 
INTEGER columns in your database.  This will mean you never have to worry about 
rounding or slack.
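
For example (a sketch; a hypothetical table storing 12.34 as 1234):

  CREATE TABLE prices(id INTEGER PRIMARY KEY, hundredths INTEGER);
  INSERT INTO prices(hundredths) VALUES (1234);   -- i.e. 12.34
  SELECT * FROM prices WHERE hundredths = 1234;   -- exact integer comparison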

Simon.