Re: [sqlite] Virtual table function calls

2020-02-13 Thread David Jones
Jens Alfke asks:
>Maybe pass the column name as a string, i.e. `attr('H',3)`

Two problems with that idea. First, the first argument has to refer to a value in 
the virtual table in order to invoke the overridden version (overrides are per 
table, so I use the ppArg to bind the function invocation to the associated virtual 
table). Second, if I created a view that gives column H the name Total, I'd 
want the function call to be attr(Total,3). I don't know how to make the 
function figure out that attr('Total',3) really means column H in some table.

Dave
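For what it's worth, the view-aliasing half of the problem shows up even with an ordinary scalar function: SQL functions receive only argument values, never column identities. A minimal stdlib-Python sketch (the attr() body here is a hypothetical stand-in, not the real overridden implementation):

```python
import sqlite3

# Hypothetical stand-in for attr(): like any SQL scalar function, it sees
# only the value of its first argument, not which column produced it.
def attr(value, n):
    return f"value={value!r}"

con = sqlite3.connect(":memory:")
con.create_function("attr", 2, attr)
con.execute("CREATE TABLE summary (H INTEGER)")
con.execute("INSERT INTO summary VALUES (42)")
con.execute("CREATE VIEW v AS SELECT H AS Total FROM summary")

# The rename to 'Total' is invisible to the function: both calls deliver
# the same bare value 42, so a name string could not be resolved either.
base = con.execute("SELECT attr(H, 3) FROM summary").fetchone()[0]
aliased = con.execute("SELECT attr(Total, 3) FROM v").fetchone()[0]
print(base, aliased)  # value=42 value=42
```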
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table function calls

2020-02-13 Thread Jens Alfke


> On Feb 13, 2020, at 12:52 PM, David Jones  wrote:
> 
>   sqlite> select F,G,H,attr(H,3) from summary;   # show formula used 
> to calculate column H.

Maybe pass the column name as a string, i.e. `attr('H',3)`? It sounds like your 
`attr` function needs to know the _identity_ of the column, not its contents, 
and the name is basically the identity.

—Jens


Re: [sqlite] Virtual table OR constraint bug

2020-01-29 Thread Lalit Maganti
Thank you for the fast fix!

Best,
Lalit

On Wed, 29 Jan 2020 at 15:07, Dan Kennedy  wrote:

>
> On 29/1/63 20:09, Lalit Maganti wrote:
> > Hi folks,
> >
> > Just wanted to check up on this to see if this issue is something which is
> > being tracked and if there was a potential fix in the works?
>
> Thanks for the excellent bug report and minimal reproduction case. Now
> fixed here:
>
>https://www.sqlite.org/src/info/dcb4838757ca49cf
>
> None of us saw your post last month. We think it must have been filtered
> as spam by gmail. Sorry about that.
>
> Dan.
>
>
>
>
>
> >
> > Thanks,
> > Lalit
> >
> > On Thu, 2 Jan 2020 at 15:13, Lalit Maganti  wrote:
> >
> >> Hi all,
> >>
> >> I believe that I have found a bug in the virtual table bytecode generation
> >> when OR constraints are present and argvIndex is set in xBestIndex but the
> >> application does not actually filter fully.
> >>
> >> The problem seems to be in not setting/unsetting the non-null flag
> >> correctly (SQLITE_JUMPIFNULL) on the instruction which does the comparison;
> >> this leads to NULLs slipping through the filter.
> >>
> >> To fully repro the problem, please find attached a small C file which
> >> shows the unexpected behaviour. By adding an EXPLAIN in front of the SELECT
> >> query, you should be able to see the incorrect bytecode on the NE opcode.
> >>
> >> If the attachment doesn't come through for some reason, I've also put the
> >> same file as a GitHub gist
> >> .
> >>
> >> Thank you!
> >>
> >> Regards,
> >> Lalit
> >>


Re: [sqlite] Virtual table OR constraint bug

2020-01-29 Thread Dan Kennedy


On 29/1/63 20:09, Lalit Maganti wrote:

Hi folks,

Just wanted to check up on this to see if this issue is something which is
being tracked and if there was a potential fix in the works?


Thanks for the excellent bug report and minimal reproduction case. Now 
fixed here:


  https://www.sqlite.org/src/info/dcb4838757ca49cf

None of us saw your post last month. We think it must have been filtered 
as spam by gmail. Sorry about that.


Dan.







Thanks,
Lalit

On Thu, 2 Jan 2020 at 15:13, Lalit Maganti  wrote:


Hi all,

I believe that I have found a bug in the virtual table bytecode generation
when OR constraints are present and argvIndex is set in xBestIndex but the
application does not actually filter fully.

The problem seems to be in not setting/unsetting the non-null flag
correctly (SQLITE_JUMPIFNULL) on the instruction which does the comparison;
this leads to NULLs slipping through the filter.

To fully repro the problem, please find attached a small C file which
shows the unexpected behaviour. By adding an EXPLAIN in front of the SELECT
query, you should be able to see the incorrect bytecode on the NE opcode.

If the attachment doesn't come through for some reason, I've also put the
same file as a GitHub gist
.

Thank you!

Regards,
Lalit




Re: [sqlite] Virtual table OR constraint bug

2020-01-29 Thread Lalit Maganti
Hi folks,

Just wanted to check up on this to see if this issue is something which is
being tracked and if there was a potential fix in the works?

Thanks,
Lalit

On Thu, 2 Jan 2020 at 15:13, Lalit Maganti  wrote:

> Hi all,
>
> I believe that I have found a bug in the virtual table bytecode generation
> when OR constraints are present and argvIndex is set in xBestIndex but the
> application does not actually filter fully.
>
> The problem seems to be in not setting/unsetting the non-null flag
> correctly (SQLITE_JUMPIFNULL) on the instruction which does the comparison;
> this leads to NULLs slipping through the filter.
>
> To fully repro the problem, please find attached a small C file which
> shows the unexpected behaviour. By adding an EXPLAIN in front of the SELECT
> query, you should be able to see the incorrect bytecode on the NE opcode.
>
> If the attachment doesn't come through for some reason, I've also put the
> same file as a GitHub gist
> .
>
> Thank you!
>
> Regards,
> Lalit
>
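For context, the NULL handling at stake here is ordinary SQL three-valued logic: a comparison involving NULL is neither true nor false, so a row holding NULL must not satisfy either arm of an OR constraint. A small sketch against a native table (hypothetical schema) showing the expected behaviour that the incorrect bytecode violated:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (a, b)")
con.executemany("INSERT INTO t VALUES (?, ?)",
                [(1, 1), (None, 1), (4, None), (None, None)])

# a != 4 evaluates to NULL when a is NULL, and NULL OR false is NULL,
# which is not true -- so every row containing a NULL is filtered out.
rows = con.execute("SELECT a, b FROM t WHERE a != 4 OR b != 1").fetchall()
print(rows)  # [(1, 1)]
```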


Re: [sqlite] virtual table xBestIndex and SQLITE_CONSTRAINT andzErrMsg

2019-02-19 Thread dave
> On 2/19/19, dave  wrote:
> > addition, but I have lost a capability relative to the prior scheme of using
> > high query cost along with a special flag communicated in pIdxInfo->idxNum,
> > that being the ability to emit contextual info as to why the query failed.
> 
> Yeah.  There is no way to report an error out of xBestIndex.  And, in
> fact, you would not want to do that because one or more xBestIndex
> calls might actually work.  Or, there might be multiple xBestIndex
> calls that all fail for different reasons, in which case it is unclear
> which error should be reported.
> 
> I will ponder your request.  In the meantime, you can continue to use
> the old method, which still works like it always has.
> 
> -- 
> D. Richard Hipp


OK, well the theory being that the message would be emitted only when all
the candidate plans were tried and still no solution was found (I guess at
the same spot where the current message is emitted).  But maybe that is too
late, and any messages set along the way are already gone.

As for multiple messages, even just emitting an arbitrary one is useful.
These failures happen at design time, and the developer incrementally refines
his/her query until there are no such errors.  I'm not sure it is possible
for them to happen once a working query has been created.  I would think
that if you had defined a query that was demonstrably solvable once, any
subsequent executions would at worst gravitate to that known working solution
even if the planner tried to do things differently that time (maybe based on
data values).

OK, for now I will revert to the old method.

Cheers!

-dave




Re: [sqlite] virtual table xBestIndex and SQLITE_CONSTRAINT and zErrMsg

2019-02-19 Thread Richard Hipp
On 2/19/19, dave  wrote:
> I noticed that in 3.26 a feature was added whereby a proposed execution plan
> can be rejected in vtables by returning SQLITE_CONSTRAINT.  I welcome this
> addition, but I have lost a capability relative to the prior scheme of using
> high query cost along with a special flag communicated in pIdxInfo->idxNum,
> that being the ability to emit contextual info as to why the query failed.

Yeah.  There is no way to report an error out of xBestIndex.  And, in
fact, you would not want to do that because one or more xBestIndex
calls might actually work.  Or, there might be multiple xBestIndex
calls that all fail for different reasons, in which case it is unclear
which error should be reported.

I will ponder your request.  In the meantime, you can continue to use
the old method, which still works like it always has.

-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Virtual table row deletition on trigger

2017-12-09 Thread Tibor Balog

Hi,

I have realized that this question went awry, so I am giving it a second run.

My scenario:

   Environment:

DotNet, no 3rd-party library, just P/Invoke.
SQLite:
SQLite library version: 3022000
-COMPILER=msvc-1911
-ENABLE_FTS5
-LIKE_DOESNT_MATCH_BLOBS
-MAX_EXPR_DEPTH=0
-OMIT_DECLTYPE
-OMIT_DEPRECATED
-OMIT_PROGRESS_CALLBACK
-OMIT_SHARED_CACHE
-TEMP_STORE=3
-THREADSAFE=1

Using: synchronous connection.
"PRAGMA journal_mode=MEMORY"

   My assumption:
Having opened just one (Read/Write|Create) connection, SQLite should never 
signal busy to that connection after it has successfully acquired it.


Since you can use tight loops, and it is up to SQLite how long the individual 
command execution takes and how many threads it opens up to accomplish the job 
at hand, it should come back synchronously; otherwise it breaks the loop, 
forcing me to launch a signal-handling thread.


   My task setup:
A table "ART"
+
A virtual fts5 contentless table:
CREATE VIRTUAL TABLE "OCR" using fts5 (content='',FullText)
+
Trigger:
CREATE TRIGGER "ART_AD" AFTER DELETE ON "ART" BEGIN INSERT INTO "OCR" 
("OCR",rowid) VALUES('DELETE',old.rowid);END

to keep the virtual table in sync with row deletion from "ART".
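As an aside, for a contentless FTS5 table the special 'delete' INSERT must also supply the old values of the indexed columns, not just the rowid, so the index entries can be located and removed. A runnable sketch of the whole pattern (hypothetical ART schema, and it assumes FTS5 is compiled into the linked SQLite):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ART (id INTEGER PRIMARY KEY, FullText TEXT);
CREATE VIRTUAL TABLE OCR USING fts5(content='', FullText);
CREATE TRIGGER ART_AI AFTER INSERT ON ART BEGIN
  INSERT INTO OCR (rowid, FullText) VALUES (new.id, new.FullText);
END;
-- Contentless tables reject plain DELETE; the 'delete' command needs the
-- old column values as well as the rowid.
CREATE TRIGGER ART_AD AFTER DELETE ON ART BEGIN
  INSERT INTO OCR (OCR, rowid, FullText)
  VALUES ('delete', old.id, old.FullText);
END;
""")
con.execute("INSERT INTO ART VALUES (1, 'hello world')")
hits_before = con.execute(
    "SELECT rowid FROM OCR WHERE OCR MATCH 'hello'").fetchall()
con.execute("DELETE FROM ART WHERE id = 1")
hits_after = con.execute(
    "SELECT rowid FROM OCR WHERE OCR MATCH 'hello'").fetchall()
print(hits_before, hits_after)
```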

   My task:
0 Collect user input from the grid selection into a list of SQL DELETE range 
statements (BETWEEN value1 AND value2;)


1 Open connection
2 BEGIN TRANSACTION
3 The delete loop over the list.
4 COMMIT

SQLite breaks the loop with signal "busy" if one or more of the delete 
ranges is bigger than ~20.000 rows.


Without the transaction frame it seems to be OK.

   My question:
Is my assumption above wrong, and do I always have to take care of eventual 
interruptions from SQLite?
Is this a limitation because the trigger forces a separate thread for the 
virtual table, which is maybe not that tightly integrated?

Is this a bug?

KR,
Tibor
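(Not an answer to the root cause, but the usual mitigation when a statement starts reporting busy is to give SQLite a busy timeout so it retries internally instead of surfacing SQLITE_BUSY to the loop. A sketch of the batched range-delete with a timeout set, using a hypothetical ART schema and selection ranges:)

```python
import sqlite3

# timeout=5.0 maps to sqlite3_busy_timeout(5000): contended statements are
# retried internally for up to 5 s before SQLITE_BUSY reaches the caller.
con = sqlite3.connect(":memory:", timeout=5.0)
con.execute("PRAGMA busy_timeout = 5000")  # explicit equivalent

con.execute("CREATE TABLE ART (id INTEGER PRIMARY KEY)")
con.executemany("INSERT INTO ART VALUES (?)", [(i,) for i in range(1, 101)])

ranges = [(1, 20), (30, 60)]   # hypothetical grid selection
with con:                      # one BEGIN ... COMMIT around the whole loop
    for lo, hi in ranges:
        con.execute("DELETE FROM ART WHERE id BETWEEN ? AND ?", (lo, hi))

remaining = con.execute("SELECT count(*) FROM ART").fetchone()[0]
print(remaining)  # 100 - 20 - 31 = 49
```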



Re: [sqlite] Virtual table row deletition on trigger

2017-12-08 Thread Tibor Balog

All right,

I think I was missing a potentially important piece of info.

I am deleting rows as ranges.

It is up to SQLite how to do the deletion.

I am pushing ranges in a loop inside the transaction.

Then the loop breaks, signaling a "busy" or "overwhelmed" state.

Without the transaction frame it is working fine.

-----Original Message----- 
From: Tibor Balog

Sent: Friday, December 8, 2017 7:11 PM
To: SQLite
Subject: [sqlite] Virtual table row deletition on trigger

Hello,

I ran into a little annoyance regarding a virtual table.

I have a contentless table:

CREATE VIRTUAL TABLE "OCR" using fts5 (content='',FullText)

bound to a trigger for row deletion:

CREATE TRIGGER "ART_AD" AFTER DELETE ON "ART" BEGIN INSERT INTO "OCR" 
("OCR",rowid) VALUES('DELETE',old.rowid);END


It works as advertised; however, if I am deleting rows in a transaction, 
anything above ~20.000 rows puts SQLite itself in a “busy” state.


Is this a known limitation?
Does the virtual table row deletion, triggered by the normal table's deletion, 
interfere with the normal table in the transaction?


Info:
I'm not using WAL; the journal is in memory.

Latest build.

Thank You,
Tibor


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Dimitris Bil
I see, so in the native implementation you already have the whole table in 
memory and only use the clustered b-tree index to search for tuples. So I would 
not expect a large improvement from the virtual table implementation, but the 
virtual table being 5 times slower is strange. Maybe the wrong data structure 
is being used?


By the way, I had tried adding a virtual table to the sqlite amalgamation and I 
did not see an observable difference. On the other hand, I have seen improvement 
in virtual table utilization using the latest version of sqlite (in comparison 
to a release from about a year ago).



From: sqlite-users  on behalf of 
Bob Friesenhahn 
Sent: Wednesday, February 8, 2017 4:09 PM
To: SQLite mailing list
Subject: Re: [sqlite] Virtual table vs real table query performance

On Wed, 8 Feb 2017, Dimitris Bil wrote:

> Do you perform the benchmark on the native database table using cold
> cache or warm cache? Also, can you briefly describe what the
> benchmark does and give the schema for the native database table?

My benchmark repeatedly reads all of the columns one by one given row
id and column name.  The table is read many (e.g. 100) times so this
is a warm cache test.

The schema is not terribly important but the table we are trying to
optimize (with 1800 or less rows) contains a 64-bit rowid, five
integer values, and two short text string values.

   int64, uint32, uint32, text[16], uint8, text[16], text[18], uint8,
   uint32

What I am looking for is expected average virtual table performance vs
native table performance for repeated column reads.

Due to being a generic implementation (supporting many virtual 
tables), our virtual implementation uses programmed/dynamic 
marshalling rather than compiled marshalling.  The schema definition 
is also dynamically generated.

There are implementation overheads and it is useful to know what
performance is possible (e.g. compared to native table performance)
in order to know when the implementation is about as good as it can
be.
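The shape of such a benchmark, sketched against a native table in stdlib Python (the schema and pass counts here are illustrative, not the implementation under discussion):

```python
import sqlite3
import time

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b TEXT)")
con.executemany("INSERT INTO t VALUES (?, ?, ?)",
                [(i, i * 2, f"row{i}") for i in range(1800)])

# Warm-cache test: read one column value at a time by rowid, repeated
# over many passes, mirroring the access pattern described above.
passes, rows = 20, 1800
start = time.perf_counter()
for _ in range(passes):
    for i in range(rows):
        con.execute("SELECT a FROM t WHERE id = ?", (i,)).fetchone()
elapsed = time.perf_counter() - start
print(f"{passes * rows} single-value reads in {elapsed:.3f}s")
```

The same driver pointed at a virtual table with identical column layout gives a like-for-like comparison of per-read overhead.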

Bob
--



Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Dominique Devienne
On Wed, Feb 8, 2017 at 4:50 PM, Hick Gunter  wrote:

> This is with a generic, user definable table and index structure capable
> virtual table implementation; it is not a "one module per table" statically
> typed heavily optimized implementation.
>

Ah, that makes complete sense then. I didn't want the OP to think virtual
tables were slower
than native tables in the general case, especially since he mentioned
memory arrays in C code.

And indeed the virtual-table advantage I mentioned is with a different
statically-typed vtable impl/module per vtable,
with statically defined indexes, where the table structure is hard-coded in
the vtable impl itself, and corresponds
to a native "row" data structure. In that config one leverages the
"front-end" of SQLite (parser and VDBE engine)
and very little of the "back-end" (pager and btree), except when SQLite
decides to make temporary tables for
query processing I guess. FWIW. --DD


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Bob Friesenhahn

On Wed, 8 Feb 2017, Dimitris Bil wrote:

Do you perform the benchmark on the native database table using cold 
cache or warm cache? Also, can you briefly describe what the 
benchmark does and give the schema for the native database table?


My benchmark repeatedly reads all of the columns one by one given row 
id and column name.  The table is read many (e.g. 100) times so this 
is a warm cache test.


The schema is not terribly important but the table we are trying to 
optimize (with 1800 or less rows) contains a 64-bit rowid, five 
integer values, and two short text string values.


  int64, uint32, uint32, text[16], uint8, text[16], text[18], uint8,
  uint32

What I am looking for is expected average virtual table performance vs 
native table performance for repeated column reads.


Due to being a generic implementation (supporting many virtual 
tables), our virtual implementation uses programmed/dynamic 
marshalling rather than compiled marshalling.  The schema definition 
is also dynamically generated.


There are implementation overheads and it is useful to know what 
performance is possible (e.g. compared to native table performance) 
in order to know when the implementation is about as good as it can 
be.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Hick Gunter
This is with a generic, user definable table and index structure capable 
virtual table implementation; it is not a "one module per table" statically 
typed heavily optimized implementation.

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Dominique Devienne
Sent: Wednesday, 8 February 2017 16:42
To: SQLite mailing list 
Subject: Re: [sqlite] Virtual table vs real table query performance

On Wed, Feb 8, 2017 at 4:30 PM, Hick Gunter  wrote:

> Values are for retrieving 100.000 rows with a where clause not
> satisfiable from the index but true for all rows
>
> asql> select count() from ;
> CPU Time: user 0.092986 sys 0.00
>
> asql> select count() from  where =4;
> CPU Time: user 0.189971 sys 0.00
> CPU Time: user 0.199969 sys 0.00
> CPU Time: user 0.199970 sys 0.00
>
> asql> select count() from  where =4;
> CPU Time: user 0.086987 sys 0.010998
> CPU Time: user 0.085987 sys 0.00
> CPU Time: user 0.076988 sys 0.002000
>


Frankly I'm surprised it's slower than "native" SQLite.

In bulk-insert, random lookup, and table-delete timings we did in 2009 between 
native in-memory SQLite and pure-C++ virtual tables accessing pure C++ data 
structures (i.e. no vm/reflection/introspection/dynamic lookup as in Python, 
but direct addressing of statically typed data), the virtual tables were 
always faster, and not by a small margin.

Admittedly it was a long time ago, and SQLite is getting faster all the time 
for sure, but you can't beat static typing of memory-addressable structures vs 
scanning pages of table data and dynamically/serially decoding variable-size 
rows within those pages.

So something like "non-native" code or something "dynamic" is hiding in the 
virtual table impl, no? --DD


___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.




Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Dominique Devienne
On Wed, Feb 8, 2017 at 4:30 PM, Hick Gunter  wrote:

> Values are for retrieving 100.000 rows with a where clause not satisfiable
> from the index but true for all rows
>
> asql> select count() from ;
> CPU Time: user 0.092986 sys 0.00
>
> asql> select count() from  where =4;
> CPU Time: user 0.189971 sys 0.00
> CPU Time: user 0.199969 sys 0.00
> CPU Time: user 0.199970 sys 0.00
>
> asql> select count() from  where =4;
> CPU Time: user 0.086987 sys 0.010998
> CPU Time: user 0.085987 sys 0.00
> CPU Time: user 0.076988 sys 0.002000
>


Frankly I'm surprised it's slower than "native" SQLite.

In bulk-insert, random lookup, and table-delete timings we did in 2009
between native in-memory SQLite and pure-C++ virtual tables accessing
pure C++ data structures (i.e. no vm/reflection/introspection/dynamic lookup
as in Python, but direct addressing of statically typed data),
the virtual tables were always faster, and not by a small margin.

Admittedly it was a long time ago, and SQLite is getting faster all the time
for sure, but you can't beat static typing of memory-addressable structures,
vs scanning pages of table data and dynamically/serially decoding
variable-size rows within those pages.

So something like "non-native" code or something "dynamic" is hiding in the
virtual table impl, no? --DD


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Hick Gunter
Values are for retrieving 100.000 rows with a where clause not satisfiable from 
the index but true for all rows

asql> select count() from ;
CPU Time: user 0.092986 sys 0.00

asql> select count() from  where =4;
CPU Time: user 0.189971 sys 0.00
CPU Time: user 0.199969 sys 0.00
CPU Time: user 0.199970 sys 0.00

asql> select count() from  where =4;
CPU Time: user 0.086987 sys 0.010998
CPU Time: user 0.085987 sys 0.00
CPU Time: user 0.076988 sys 0.002000

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Bob Friesenhahn
Sent: Wednesday, 8 February 2017 15:39
To: SQLite mailing list 
Subject: Re: [sqlite] Virtual table vs real table query performance

On Wed, 8 Feb 2017, Hick Gunter wrote:

> Having implemented a memory-based virtual table complete with indices,
> full table scan and direct access via rowid (which happens to be the
> memory address of the row) I can do a batch delete of
> 100.000 rows (in a table with 1 composite index) in about 2 seconds

The case I am interested is pure read performance of a single column element at 
a time given properly implemented xBestIndex and xFilter support.  Rows are not 
being added/removed using sqlite.

It is possible that native tables can be faster since the implementation is not 
limited to the rigid set of callback functions provided for virtual tables to 
use and of course the amalgamation is optimized by the compiler as one source 
module.

By tracing the callbacks, we do see that our implementation is not invoking the 
callbacks more times than necessary (which was not the case before xBestIndex 
and xFilter support was added).  Due to the requirements of the implementation, 
POSIX reader/writer locks are used so there is some low-contention locking 
overhead.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/






Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Dimitris Bil
Hello,


Do you perform the benchmark on the native database table using cold cache or 
warm cache? Also, can you briefly describe what the benchmark does and give the 
schema for the native database table?


thanks



From: sqlite-users  on behalf of 
Bob Friesenhahn 
Sent: Wednesday, February 8, 2017 2:39 PM
To: SQLite mailing list
Subject: Re: [sqlite] Virtual table vs real table query performance



The case I am interested is pure read performance of a single column
element at a time given properly implemented xBestIndex and xFilter
support.  Rows are not being added/removed using sqlite.

It is possible that native tables can be faster since the
implementation is not limited to the rigid set of callback functions
provided for virtual tables to use and of course the amalgamation is
optimized by the compiler as one source module.

By tracing the callbacks, we do see that our implementation is not
invoking the callbacks more times than necessary (which was not the
case before xBestIndex and xFilter support was added).  Due to the
requirements of the implementation, POSIX reader/writer locks are used
so there is some low-contention locking overhead.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Bob Friesenhahn

On Wed, 8 Feb 2017, Hick Gunter wrote:

Having implemented a memory-based virtual table complete with 
indices, full table scan, and direct access via rowid (which happens 
to be the memory address of the row), I can do a batch delete of 
100.000 rows (in a table with 1 composite index) in about 2 seconds


The case I am interested is pure read performance of a single column 
element at a time given properly implemented xBestIndex and xFilter 
support.  Rows are not being added/removed using sqlite.


It is possible that native tables can be faster since the 
implementation is not limited to the rigid set of callback functions 
provided for virtual tables to use and of course the amalgamation is 
optimized by the compiler as one source module.


By tracing the callbacks, we do see that our implementation is not 
invoking the callbacks more times than necessary (which was not the 
case before xBestIndex and xFilter support was added).  Due to the 
requirements of the implementation, POSIX reader/writer locks are used 
so there is some low-contention locking overhead.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [sqlite] Virtual table vs real table query performance

2017-02-08 Thread Hick Gunter
Having implemented a memory-based virtual table complete with indices, full 
table scan, and direct access via rowid (which happens to be the memory address 
of the row), I can do a batch delete of 100.000 rows (in a table with 1 
composite index) in about 2 seconds (3.7 seconds with the condition) while 
running linux (RH 5.6 x86_64 VM) on a virtual machine. Deleting all rows of a 
native SQLite table (while checking for the value of a non-indexed field to 
avoid SQLite just dropping and re-creating the table) takes about 1 second.



Note that both operations require a full table scan to fill a "rowset" (= 
SQLite internal temporary table) and that the virtual table function VUpdate 
expects the virtual table code to handle index deletion, which is explicitly 
coded in the native table case.



asql> explain delete from ;

addr  opcode p1p2p3p4 p5  comment

  -        -  --  -

0 Trace  0 0 000  NULL

1 Goto   0 18000  NULL

2 Integer0 1 000  NULL

3 Null   0 2 000  NULL

4 VOpen  0 0 0 vtab:187BE588:2ACC1FDC4990  00  NULL

5 Integer1 4 000  NULL

6 Integer0 5 000  NULL

7 VFilter0 12400  NULL

8 Rowid  0 3 000  NULL

9 RowSetAdd  2 3 000  NULL

10AddImm 1 1 000  NULL

11VNext  0 8 000  NULL

12Close  0 0 000  NULL

13RowSetRead 2 16300  NULL

14VUpdate0 1 3 vtab:187BE588:2ACC1FDC4990  02  NULL

15Goto   0 13000  NULL

16ResultRow  1 1 000  NULL

17Halt   0 0 000  NULL

18VBegin 0 0 0 vtab:187BE588:2ACC1FDC4990  00  NULL

19Goto   0 2 000  NULL



asql> explain delete from  where =4;

addr  opcode p1p2p3p4 p5  comment

  -        -  --  -

0 Trace  0 0 000  NULL

1 Goto   0 31000  NULL

2 Integer0 1 000  NULL

3 Null   0 2 000  NULL

4 OpenRead   0 215   0 7  00  

5 Rewind 0 13000  NULL

6 Column 0 6 400  .

7 Integer4 5 000  NULL

8 Ne 5 124   collseq(BINARY)  6c  NULL

9 Rowid  0 3 000  NULL

10RowSetAdd  2 3 000  NULL

11AddImm 1 1 000  NULL

12Next   0 6 001  NULL

13Close  0 0 000  NULL

14OpenWrite  0 215   0 8  00  

15OpenWrite  1 1362  0  Keyinfo(5,BINARY,BINARY)  00  

16RowSetRead 2 27300  NULL

17NotExists  0 26300  NULL

18Rowid  0 11000  NULL

19Column 0 1 600  .

20Column 0 2 700  .

21Column 0 3 800  .

22Column 0 4 900  .

23Column 0 5 10   00  .

24IdxDelete  1 6 600  NULL

25Delete 0 1 000  NULL

26Goto   0 16000  NULL

27Close  1 1362  000  NULL

28Close  0 0 000  NULL

29ResultRow  1 1 000  NULL

30Halt   0 0 000  NULL

31Transaction0 1 000  NULL

32VerifyCookie   0 1191  000  NULL

33TableLock  0 215   100  NULL

34Goto   0 2 000  NULL



-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On 
Behalf Of Bob Friesenhahn
Sent: Tuesday, 7 February 2017 22:06
To: SQLite mailing list 
Subject: [sqlite] Virtual table vs real table query performance



We are trying to improve the query performance of

Re: [sqlite] Virtual table acting as wrapper of a regular table

2016-10-24 Thread Dimitris Bil
Hick,


sorry, I think I misunderstood your first reply.


Yes, of course I am calling each method twice when I use the wrapper: once in 
the wrapper and once in B, so an overhead is absolutely expected. But I would 
not expect this overhead to equal the time of the total execution without the 
wrapper, as the wrapper's methods do little more than call the methods of 
table B, which do the actual work. I would only expect the time to double if, 
apart from calling the wrapper's methods, I were also scanning table A twice 
and searching table B double the number of times I do in the normal execution.


Best,

Dimitris




Re: [sqlite] Virtual table acting as wrapper of a regular table

2016-10-24 Thread Hick Gunter
When you prepare() your statement, SQLite translates this into 2 nested loops:

Outer loop:
Rewind table A (retrieve first row)
Extract id from table A row
...
Retrieve next row from table A

Inner loop:
Set table B to A.id
Extract id from B
Check for match
(return result row)
Retrieve next row from table B

When you are using your wrapper, the inner loop gets replaced by:
Call xFilter() -> reset(), bind(), step(), column_value() -> Set table B to 
A.id, Extract id from table B row, check for match
Call xEof()
Call xColumn() -> return result value
Call xNext() -> step(), column_value() -> Retrieve next row from table B

This is about twice as much work.




___
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.

Re: [sqlite] Virtual table acting as wrapper of a regular table

2016-10-24 Thread Dimitris Bil
Indeed I am not checking argc. This code is just meant to serve as a test in 
order to make comparisons for execution speed. I have stripped it of any other 
functionality.


Regarding prepare, I am only preparing the statement in xConnect (which also 
acts as xCreate). This is only executed once, when connection to the virtual 
table is opened, right?

Regarding the step method, I am executing a step the first time inside xFilter 
in order to advance the cursor in the first result. Then each time the xNext is 
called, I am executing step one more time to continue advancing the cursor. I 
think this is in accordance with what the SQLite page says: "If the virtual table 
contains one or more rows that match the search criteria, then the cursor must 
be left point at the first row. Subsequent calls to xEof must return false 
(zero). If there are no rows match, then the cursor must be left in a state 
that will cause the xEof to return true (non-zero). The SQLite engine will use 
the xColumn and xRowid methods to access that row content. The xNext method 
will be used to advance to the next row."

Also, I am using sqlite3_result_int64 only one time per result tuple inside 
xColumn. In my case, as each result tuple is the same, this indeed could be 
avoided, but I don't think that this is the cause for the overhead.

thanks again,
Dimitris






Re: [sqlite] Virtual table acting as wrapper of a regular table

2016-10-24 Thread Hick Gunter
Your xFilter method is blindly assuming that there is always an argv[0] without 
checking argc first.

You are incurring an extra "prepare" in your xConnect method, an extra "step" 
and "column" in your xFilter/xNext methods and an extra "result" in your 
xColumn function. Doing twice as much work taking twice as long seems quite 
reasonable.

-----Original Message-----
From: sqlite-users [mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Dimitris Bil
Sent: Saturday, October 22, 2016 15:31
To: sqlite-users@mailinglists.sqlite.org
Subject: [sqlite] Virtual table acting as wrapper of a regular table

Hello,

I am trying to create a simple virtual table that acts as wrapper for a normal 
database table for a specific query. For example, consider that I have tables A 
and B in my database and the query:
select count(A.id) from A CROSS JOIN B where A.id=B.id

Now I am using a virtual table acting as wrapper for table B, so I can execute 
the query select count(A.id) from A CROSS JOIN w where A.id=w.id, where w is 
the virtual table instance, and get the same result. My problem is that the 
second query is about 2 times slower. I would normally expect some overhead, 
but this seems quite slow, so I was wondering if there is something wrong with 
my code.

I am using a prepared statement in connect method for query "select id from B 
where id=?1" and reset/bind/step in filter method and (if there are more 
results) step in next method.

Here's my code (I am just setting id=0 to denote eof) 
http://pastebin.com/ce8b4aLL

Do you think there's something wrong or it's an unavoidable overhead that comes 
with the virtual table usage? Are there any chances to improve performance?

thanks,
Dimitris





___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table

2016-09-18 Thread Eduardo Morras
On Wed, 31 Aug 2016 10:34:05 -0300
Maria de Jesus Philadelpho  wrote:

> Hi,
> 
> I implement the SQLite extension, virtcsv, which allows attaching a
> CSV file as a virtual table.  At a command line everything works just
> fine, why I can't see the result of a select * from table using the
> SQLitestudio 3.0.7?

Because SQLite Studio must load the SQLite CSV virtual table extension before
use, or be compiled with it.

Read the SQLite Studio manual on how to achieve that; I don't use SQLite Studio.

Perhaps you can get better answers from Sqlite Studio developers.

> See the attached snapshot.

The SQLite mailing list doesn't support attachments.

> regards,
> 
> Maria Azevedo
> --
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


---   ---
Eduardo Morras 
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table

2016-09-18 Thread John G
1) Can't see the 'attachment' - this list does not accept them.
Can you copy/paste sample code into an email?

John G



On 31 August 2016 at 14:34, Maria de Jesus Philadelpho <
jesus.ph...@gmail.com> wrote:

> Hi,
>
> I implement the SQLite extension, virtcsv, which allows attaching a CSV
> file as a virtual table.  At a command line everything works just fine, why
> I can't see the result of a select * from table using the SQLitestudio
> 3.0.7?
> See the attached snapshot.
>
> regards,
>
> Maria Azevedo
> --
> ___
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual Table BestIndex Being Passed LIKE constraints in SQLite 3.13.0?

2016-06-24 Thread Richard Hipp
On Sat, Jun 25, 2016 at 2:55 AM, Mike Nicolino
 wrote:
> Hey Everyone,
>
> I've got a system with virtual tables using System.Data.SQLite 
> 1.0.102.0/SQLite version 3.13.0, the LIKE constraint for queries is now being 
> passed to the virtual table BestIndex functions (BestIndex is getting 65 as 
> its constraint.op).
>
> Is the behavior in 3.13.0 a bug or has something changed that now makes it 
> possible for the LIKE constraint to be sent to virtual tables?
>

This is a feature introduced in version 3.10.0.
https://www.sqlite.org/releaselog/3_10_0.html


-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table implementation details

2014-11-20 Thread Alessandro Marzocchi
Sorry, but I did not make my question clear: my problem is not only restoring
the old data (easy, sqlite already does it at some point during rollback) but
also performing other operations during rollback (e.g. notifying the object
that its data has changed, so that the object will read it back). What I have
in mind is an architecture similar to model-view-controller, in which the
sqlite database acts as the model. So it is important for me which state the
db will be in when I query it during rollback; I hope that clarifies my
question enough. In the evening I'll try to write a more accurate description
of the full architecture I have in mind, to see if anybody has suggestions.
Thank you again

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table implementation details

2014-11-20 Thread Jay Kreibich

You seem to be forgetting one important detail…  if a Rollback is called, the 
underlying data and undo tables will be rolled back automatically.  They are, 
after all “real” tables in the database, and will be managed as such.  Unless 
you’re storing data outside the SQLite database, there is nothing for your 
function to do… the SQLite engine will rollback all the tables and put them 
into their prior state all on its own.

 -j


On Nov 20, 2014, at 3:26 AM, Alessandro Marzocchi 
 wrote:

> Good day,
> I'm implementing a serialization&undo/redo engine based on sqlite. My
> implementation (particulars omitted) will work by adding a layer of virtual
> tables over real sqlite data tables plus a physical table to store the list
> of modifications made to them (to implement undo functionality). Also a
> callback to the real object is provided to notify when values are changed
> (eg for a rollback or undo). My problem comes on implementing rollback
> function. When rollback is called my thought is to restore old data directly
> from the data table or the undo table. For example (still simplified)
> 
> 1) Virtual table: dataObjectA(id INTEGER PRIMARY KEY, Foo INTEGER)
>Data table: _dataObjectA(id INTEGER PRIMARY KEY, Foo INTEGER)
>Undo table: _undo(step INTEGER PRIMARY KEY, STRING class, id INTEGER,
> variable STRING, oldValue VARIANT, newValue VARIANT),
>objectA[27]->Foo=3; so _dataObjectA contains a single entry (27,3)
> 2) BEGIN
> 3) User calls objectA[27]->setFoo(4): c++ from this calls make an UPDATE
> dataObjectA SET Foo=4 WHERE id=27;
> 4) My xUpdate gets called. Following queries are made: UPDATE _dataObjectA
> SET Foo=4 WHERE id=27; INSERT INTO _undo VALUES (0 /*incremental step number
> */, 'objectA', 27, 'Foo', 3, 4)
> 5) ROLLBACK
> 6) My xRollback gets called and has to restore previous state (and notify it
> through the callback)
> 
> Now what database state will xRollback see? The one before (4) or the one
> after? Looking at code I see xRollback in sqlite3.c (3.8.7.1 amalgamation)
> is called with following call stack:
> 112789: static void callFinaliser(sqlite3 *db, int offset)
> 112838: sqlite3VtabRollback
> 
> Which itself may be called from:
> 125934 sqlite3RollbackAll /* After sqlite3BtreeRollback is called */
> 125744 sqlite3Close /* Not meaningful in my case */
> 
> So it seems that when my xRollback is called it will see the old data. Can
> anybody confirm that there are no corner cases in which something different
> could happen, and may I assume that this behavior will be kept in future
> releases too?
> 
> Also, even if I'm afraid the answer is no: is there a way to force
> "reloading" of virtual table(s)? In my implementation columns can be added
> (in virtual as in real data table) through another special virtual table
> (following the example before user may do a INSERT INTO classDef VALUES
> ('ObjectA', 'Bar', 'STRING') and this would add a column to _dataObjectA.
> Have I any ways to force (other than forced DROP TABLE/CREATE VIRTUAL TABLE)
> reloading so I can notify sqlite of new schema with sqlite3_declare_vtab?
> 
> Thank you for your help
> Alessandro
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

--  
Jay A. Kreibich < J A Y @ K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it, but showing 
it to the wrong people has the tendency to make them feel uncomfortable." -- 
Angela Johnson





___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table API performance

2014-03-05 Thread Alek Paunov




The whole thread so far is based on the OP's observations in mixed C/PyPy 
and apsw/CPython environments (as has been said already, we are suffering 
the noise in both cases).


To be helpful to the SQLite team, before proposing any changes, someone 
should show a well-designed, pure C vtable implementation demonstrating 
the possible vtable interface inefficiency.


Let's remember that all xNextRow, xNextPage optimizations are 
applicable only for "select *" cases, not in the general "select f(x), 
y" case.


Kind regards,
Alek

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table API performance

2014-03-05 Thread RSmith




I don't think it is rubbish at all, but maybe idealistic.  The biggest problem I can see from making API's pov is that you can at 
any time alter, update, change the way SQLIte (or any other API) works with the base check that the input values produce the same 
(or maybe more-correct) results.  Once you let the VT use the same API, any change is a potential change to how other people's 
programmed interfaces need to talk to - or get data from - the SQLite engine. This cannot simply change on a whim, so the levels of 
separation remain needed.


That said, I'm all for making a more efficient VT API, but it would probably need to be "new" functionality since I cannot see how 
the existing interface could implement any of the mentioned enhancements without breaking existing behaviour. The OP's xNextRow 
suggestion seems a good idea, but opens up a whole can of what-ifs which other posters have alluded to, but something to that effect 
might be worthwhile if the efficiency bonus is significant.


Cheers
Ryan
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table API performance

2014-03-05 Thread Dominique Devienne

The SQLite developers clearly care about performance, that much is clear.

But they also care about the "lite"-ness aspects too (and so does part
of its community, judging by this list).

They try to find the right balance for them, with ease of
implementation/testing/maintenance as the last variable of this
equation.

One thing that IMHO long term might improve the situation would be if
SQLite's own "native" tables would use the same Virtual Table API,
because then any bottleneck would apply to all "tables", not just the
VT ones, and would be more likely to be removed. And as recently
discussed, the VT API would be forced to acquire ways to communicate
with Explain Query Plan (for example) to notice Full Scans or Covering
Indexes not just for the "native" tables. VTs came after the "native"
tables, so the current situation is normal, but if all table accesses,
"native" or otherwise, could go thru the same API in the future (in
SQLite3 or Sqlite4), it would further decouple the "front-end" from
the "back-end", and ensure no performance differences between "native"
and "virtual" tables.

Of course, the above is a "naive" abstract reflection which ignores
the realities of VDBE, so it may sound rubbish to actual SQLite
developers. Apologies for that. --DD


Re: [sqlite] Virtual table API performance

2014-03-04 Thread Elefterios Stamatogiannakis
Also, I want to note the pretty clever "hack" from Hick Gunter that 
works by passing a "proxy" for the row (instead of the whole row's 
values) to SQLite. This proxy gets expanded, using the xColumn API, by a 
virtual table outside the whole query.


In this way a query using multiple VTs needs to incur the xColumn 
performance penalty only once.


Nevertheless, having people do crazy/clever hacks like that to avoid the 
inefficiencies of an API, gives a hint that something might be problematic.


In a previous email of mine, I had spoken half seriously/half jokingly about 
the cost in Watts of SQLite's worldwide usage. For something that is so 
widely used, even a few percent of efficiency improvement really makes a 
difference. It is not an "SQLite destroys/saves civilization" kind of 
difference, but IMHO it would be measurable in G/M Watts.


l.

On 4/3/2014 11:33 μμ, Alek Paunov wrote:

On 04.03.2014 20:25, Eleytherios Stamatogiannakis wrote:

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very frequent) VT use
case (bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing xNextPage "buffering", let's
say 4K rows or even better 16KB of data (in addition to your initial
proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension,
to gain more traction in the future, it would probably be better for a
more standard format to be chosen.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed
over
to it. It already has the "bind" API which would be very suitable for
the xNextRow function too.



It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?


In the link that you posted above, look for all the sqlite3_bind_x
functions. In SQLite the bind API is used to pass parameters to prepared
statements. The way the bind API works is that you have a statement
parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API
"way", the program calls SQLite to fill a row's values.



Ah, OK. Let's see if I finally understood your idea:

- exec "insert into t select * from vt"
- VDBE calls xNextRow
- in the xNextRow implementation, the Python code calls something like
bind_xxx for each column with the scalar addresses (allocated by you)
- xNextRow returns, VDBE inserts the row, you clean up on the next step

Questions:

What stops you from making this wrapper right now (e.g. as an apsw patch
or standalone sqlite add-on loaded by PyPy FFI)?

How do you expect this model (managing one scalar allocation per cell
during the query) to perform in comparison with passing encoded row
pages (memory chunks) between sqlite and the script engine, especially
when it is not PyPy or LuaJIT?

Regards,
Alek





Re: [sqlite] Virtual table API performance

2014-03-04 Thread Elefterios Stamatogiannakis

On 4/3/2014 11:33 μμ, Alek Paunov wrote:

On 04.03.2014 20:25, Eleytherios Stamatogiannakis wrote:

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very frequent) VT use
case (bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing xNextPage "buffering", let's
say 4K rows or even better 16KB of data (in addition to your initial
proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension,
to gain more traction in the future, it would probably be better for a
more standard format to be chosen.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed
over
to it. It already has the "bind" API which would be very suitable for
the xNextRow function too.



It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?


In the link that you posted above, look for all the sqlite3_bind_x
functions. In SQLite the bind API is used to pass parameters to prepared
statements. The way the bind API works is that you have a statement
parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API
"way", the program calls SQLite to fill a row's values.



Ah, OK. Let's see if I finally understood your idea:

- exec "insert into t select * from vt"
- VDBE calls xNextRow
- in the xNextRow implementation, the Python code calls something like
bind_xxx for each column with the scalar addresses (allocated by you)
- xNextRow returns, VDBE inserts the row, you clean up on the next step

Questions:

What stops you from making this wrapper right now (e.g. as an apsw patch
or standalone sqlite add-on loaded by PyPy FFI)?


Yes, these things can be done. We have written our own APSW-compatible 
FFI for PyPy, in which we have tried techniques like the one you 
describe. The end result wasn't that much faster, because no matter how 
many wrappers there are above the VT API, its efficiency characteristics 
are still there.


How do you expect this model (managing one scalar allocation per cell
during the query) to perform in comparison with passing encoded row
pages (memory chunks) between sqlite and the script engine, especially
when it is not PyPy or LuaJIT?



Essentially, in my first post about the VT API's efficiency, the comparison 
was between the bind API (similar to your "encoded pages" idea but for a 
single row) and the current xColumn-based VT API.


In our tests with Python, the difference between the two APIs was 3x. 
I've seen another post here by somebody else who was using a native 
compiled VT implementation saying that the difference was 2x. Both of 
these observations say that there is some inefficiency in the current 
situation with the VT API.


If this API were also used within SQLite's VM, maybe adding an xNextRow 
opcode would speed up SQLite's VM too?


Best,

l.


Regards,
Alek





Re: [sqlite] Virtual table API performance

2014-03-04 Thread Alek Paunov

On 04.03.2014 20:25, Eleytherios Stamatogiannakis wrote:

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very frequent) VT use
case (bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing xNextPage "buffering", let's
say 4K rows or even better 16KB of data (in addition to your initial
proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension,
to gain more traction in the future, it would probably be better for a
more standard format to be chosen.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed over
to it. It already has the "bind" API which would be very suitable for
the xNextRow function too.



It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?


In the link that you posted above, look for all the sqlite3_bind_x
functions. In SQLite the bind API is used to pass parameters to prepared
statements. The way the bind API works is that you have a statement
parameter "row", and you fill it by saying:

Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API
"way", the program calls SQLite to fill a row's values.



Ah, OK. Let's see if I finally understood your idea:

- exec "insert into t select * from vt"
- VDBE calls xNextRow
- in the xNextRow implementation, the Python code calls something like
bind_xxx for each column with the scalar addresses (allocated by you)
- xNextRow returns, VDBE inserts the row, you clean up on the next step

Questions:

What stops you from making this wrapper right now (e.g. as an apsw patch
or standalone sqlite add-on loaded by PyPy FFI)?

How do you expect this model (managing one scalar allocation per cell
during the query) to perform in comparison with passing encoded row
pages (memory chunks) between sqlite and the script engine, especially
when it is not PyPy or LuaJIT?


Regards,
Alek



Re: [sqlite] Virtual table API performance

2014-03-04 Thread Eleytherios Stamatogiannakis

On 04/03/14 20:11, Alek Paunov wrote:

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very frequent) VT use
case (bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing xNextPage "buffering", let's
say 4K rows or even better 16KB of data (in addition to your initial
proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension,
to gain more traction in the future, it would probably be better for a
more standard format to be chosen.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed over
to it. It already has the "bind" API which would be very suitable for
the xNextRow function too.



It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?


In the link that you posted above, look for all the sqlite3_bind_x 
functions. In SQLite the bind API is used to pass parameters to prepared 
statements. The way the bind API works is that you have a statement 
parameter "row", and you fill it by saying:


Set column 1 of statement parameter "row" to an int with value 10
Set column 2 of statement parameter "row" to a float with value 3.5
...

So instead of SQLite calling back for each column, in the bind API 
"way", the program calls SQLite to fill a row's values.


Regards.

l.




For a paging API (which IMHO is too complex ATM), the bind API could be
extended with a row number parameter.






Re: [sqlite] Virtual table API performance

2014-03-04 Thread Alek Paunov

On 04.03.2014 15:05, Eleytherios Stamatogiannakis wrote:

On 03/03/14 03:01, Alek Paunov wrote:

It seems that the easiest optimization for this (very frequent) VT use
case (bulk streaming) is an SQLite add-on written in _C_, implementing a
vtable interface specialization containing xNextPage "buffering", let's
say 4K rows or even better 16KB of data (in addition to your initial
proposal of xNextRow).

The technical question is: how should the rows be encoded? You said
initially that you use some compressed format. But for such an extension,
to gain more traction in the future, it would probably be better for a
more standard format to be chosen.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some widely used encoding close to SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, it would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed over
to it. It already has the "bind" API which would be very suitable for
the xNextRow function too.



It seems that I do not know the sqlite3_ API very well.

http://www.sqlite.org/c3ref/funclist.html

Would you like to point me to the "bind" API page?


For a paging API (which IMHO is too complex ATM), the bind API could be
extended with a row number parameter.




Re: [sqlite] Virtual table API performance

2014-03-04 Thread Eleytherios Stamatogiannakis

Thank you for the explanation.

Answers inline.

On 04/03/14 16:16, Hick Gunter wrote:

Your VT1 table already has an xColumn implementation, possibly doing something 
like

 switch( p_column )
 {
 case 1: sqlite3_result_xxx( p_ctx, v_rec->f1, ...); break;
 ...
 case n: sqlite3_result_xxx( p_ctx, v_rec->fn, ...); break;
 }

This needs to have two cases added:

 case n+1: sqlite3_result_int64( p_ctx, (uintptr_t)v_rec );
 case n+2: sqlite3_result_int64( p_ctx, (uintptr_t)func );

where

 static int func( p_rec, p_ctx, p_column );

calls

 xColumn( v_cursor, p_ctx, p_column );

with a dummy cursor structure as defined for your table.

The VT2 table can then prepare "select __rec,__func from VT1", and in its 
xColumn implementation it calls

   v_rec  = (void *)sqlite3_column_int64( v_stmt, 0 ); // this can be stored and cleared in the xNext function
 v_func = (func *)sqlite3_column_int64( v_stmt, 1 ); // this can be stored
 v_func( v_rec, p_ctx, p_column );


I see, so you do a similar trick to what we do when passing Python 
generators as values in SQLite.




As for your second example, as written it does not suffer from the effect 
because you are already selecting c1, c2 and c3 at the bottom level.

Rewritten as

Select processrow(c1,c2,c3) from VT2(select * from VT1);


Without knowing what VT2 will do, I don't think that this rewriting can 
happen. For example, "processrow" might return generators (nested 
tables) that get expanded by VT2. If you moved it outside VT2, then the 
generators would not be expanded.


Regards,

l.



results in the VT1 xColumn function getting called (via the VT2 xColumn 
function) just 3 times per row.

Additionally, you may like to "select __func from VT1 limit 1" and store that in your 
xFilter implementation; and then "select __rec from VT1" in your xNext implementation to 
have sqlite3_result_int64() called half as often.

HTH

-Original Message-
From: Eleytherios Stamatogiannakis [mailto:est...@gmail.com]
Sent: Tuesday, 04 March 2014 14:15
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

Could you explain some more your solution?

Does it work in this following case:

select * from VT2(select * from VT1);

by directly passing the rows from VT1 to VT2 (short-circuiting SQLite)?

What would happen in the following case?:

select * from VT2(select processrow(c1,c2,c3) from VT1);

Regards,

l.

On 03/03/14 14:17, Hick Gunter wrote:

We have gotten around this problem by defining "virtual" fields that contain a reference 
to the "current record" and the entrypoint of a wrapper around the xColumn function. That 
way only two fields get passed upwards through the virtual table stack and the top level virtual 
table's xColumn implementation calls straight through to the bottom layer's wrapper.

It does take some care to avoid sorting in between the layers and 
re-preparation of statements on schema changes.

-Original Message-
From: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
Sent: Sunday, 02 March 2014 20:39
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

We have both input and output virtual tables that avoid hitting the hard disk 
and are also able to compress the incoming and outgoing data.

We have a virtual table that takes as input a query and sends the data to a port on another 
machine. This virtual table is called "OUTPUT". And another virtual table that takes as 
input data from another port and forwards it into SQLite. Lets call it "INPUT". A query 
that uses these two virtual tables would look like this in madIS:

OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');

We actually use queries like above (actually we don't do it directly to ports 
but to buffered named pipes that are then forwarded via netcat) to run 
distributed queries on clusters, connecting all the local SQLite/madIS 
instances on the different machines together.

The main point that i want to make with above explanation is that we don't view 
SQLite only as a traditional database. We also view it as a data stream 
processing machine, that doesn't have the requirement for the data to be stored 
on a hard disk.

Under this view, the efficiency of the virtual table api is very important. 
Above query only uses 2 VTs in it, but we have other queries that use a lot 
more VTs than that.

estama


On 2/3/2014 9:34 μμ, Max Vlasov wrote:

On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis
 wrote:


Our main test case is TPCH, a standard DB benchmark. The "lineitem"
table of TPCH contains 16 columns, which for 10M rows would require
160M xColumn callbacks, to pass it through the virtual table API.

Re: [sqlite] Virtual table API performance

2014-03-04 Thread Eleytherios Stamatogiannakis
In our tests, the bind API can acquire more than 20 values from the 
Python side in a single call, in the time that xColumn acquires 2 values.


Most of the cost is in the callback and not in submitting a row's values 
through the bind API.


So with the exception of queries that need only 1 column, IMHO 
everything else should go through the xNextRow API.


To keep the complexity to a minimum, my proposal is to use the xNextRow 
API only for queries that just "scan" over a virtual table (no 
filtering).


l.

On 04/03/14 18:23, Hick Gunter wrote:

My guess: Yes.

It would require implementing a new opcode, either only for virtual tables or 
for native tables too, that accepts a list of field numbers (currently 
there are only 5 parameters possible for an opcode and some of them have fixed 
meanings).

And the logic to generate these opcodes based on the capabilities of the loaded table 
module combined with the requirements of the subject query (fields required for JOIN are 
fetched separately from those required for the result set) and the result of the 
xBestIndex calls (where it is possible to set the "omit" flag to suppress 
generation of a comparison). This also adds to the complexity of register allocation.

Take for example a join that needs 3 fields for the comparison, 2 of which are 
also required for the result set of 7 fields total. Do you request all 10 
fields at once (wastes up to 9 fields worth of effort if the JOIN is not met)? 
Or the 3 fields first and the 5 others only if the join matches (must allocate 
consecutive registers to build a result set)? Or 3 first and then 7 (which 
approximates the current behavior, as the 2 common fields are fetched twice on 
a match)?

And a set of new sqlite3_result routines that specify which of the various 
requested fields' value is being set.

-Original Message-
From: J. Merrill [mailto:j.merr...@enlyton.com]
Sent: Tuesday, 04 March 2014 16:23
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

Eleytherios Stamatogiannakis wrote

Our main test case is TPCH, a standard DB benchmark. The "lineitem"
table of TPCH contains 16 columns, which for 10M rows would require
160M xColumn callbacks, to pass it through the virtual table API.
These callbacks are very expensive, especially when at the other end
sits a VM (CPython or PyPy) handling them.


Would it be very difficult to arrange for an option that would request that 
SQLite issue a single more-complex xMultiColumns (a sample name) callback 
request, with a way for multiple results to be returned, rather than many 
xColumn callbacks? This would reduce the number of calls across the VM boundary.

Applications that don't implement xMultiColumns (and request its use) would see 
no change; those that do would get the performance boost.

J. Merrill



--
View this message in context: 
http://sqlite.1065341.n5.nabble.com/Why-does-type-affinity-declared-on-a-foreign-key-column-affect-join-speed-tp74183p74295.html
Sent from the SQLite mailing list archive at Nabble.com.


---
Gunter Hick
Software Engineer

Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna,
Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at

This e-mail is confidential and may well also be legally privileged. If you 
have received it in error, you are on notice as to its status and accordingly 
please notify us immediately by reply e-mail and then
delete this message from your system. Please do not copy it or use it for any 
purposes, or disclose its contents to any person as to do so could be a breach 
of confidence. Thank you for your cooperation.





Re: [sqlite] Virtual table API performance

2014-03-04 Thread Hick Gunter
My guess: Yes.

It would require implementing a new opcode, either only for virtual tables or 
for native tables too, that accepts a list of field numbers (currently 
there are only 5 parameters possible for an opcode and some of them have fixed 
meanings).

And the logic to generate these opcodes based on the capabilities of the 
loaded table module combined with the requirements of the subject query (fields 
required for JOIN are fetched separately from those required for the result 
set) and the result of the xBestIndex calls (where it is possible to set the 
"omit" flag to suppress generation of a comparison). This also adds to the 
complexity of register allocation.

Take for example a join that needs 3 fields for the comparison, 2 of which are 
also required for the result set of 7 fields total. Do you request all 10 
fields at once (wastes up to 9 fields worth of effort if the JOIN is not met)? 
Or the 3 fields first and the 5 others only if the join matches (must allocate 
consecutive registers to build a result set)? Or 3 first and then 7 (which 
approximates the current behavior, as the 2 common fields are fetched twice on 
a match)?

And a set of new sqlite3_result routines that specify which of the various 
requested fields' value is being set.

-Original Message-
From: J. Merrill [mailto:j.merr...@enlyton.com]
Sent: Tuesday, 04 March 2014 16:23
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

Eleytherios Stamatogiannakis wrote
> Our main test case is TPCH, a standard DB benchmark. The "lineitem"
> table of TPCH contains 16 columns, which for 10M rows would require
> 160M xColumn callbacks, to pass it through the virtual table API.
> These callbacks are very expensive, especially when at the other end
> sits a VM (CPython or PyPy) handling them.

Would it be very difficult to arrange for an option that would request that 
SQLite issue a single more-complex xMultiColumns (a sample name) callback 
request, with a way for multiple results to be returned, rather than many 
xColumn callbacks? This would reduce the number of calls across the VM boundary.

Applications that don't implement xMultiColumns (and request its use) would see 
no change; those that do would get the performance boost.

J. Merrill





---
Gunter Hick
Software Engineer

Scientific Games International GmbH
Klitschgasse 2 – 4, A - 1130 Vienna,
Austria
FN 157284 a, HG Wien
Tel: +43 1 80100 0
E-Mail: h...@scigames.at



Re: [sqlite] Virtual table API performance

2014-03-04 Thread J. Merrill
Eleytherios Stamatogiannakis wrote
> Our main test case is TPCH, a standard DB benchmark. The "lineitem" 
> table of TPCH contains 16 columns, which for 10M rows would require 160M 
> xColumn callbacks, to pass it through the virtual table API. These 
> callbacks are very expensive, especially when at the other end sits a VM 
> (CPython or PyPy) handling them.

Would it be very difficult to arrange for an option that would request that
SQLite issue a single more-complex xMultiColumns (a sample name) callback
request, with a way for multiple results to be returned, rather than many
xColumn callbacks? This would reduce the number of calls across the VM
boundary.

Applications that don't implement xMultiColumns (and request its use) would
see no change; those that do would get the performance boost.

J. Merrill





Re: [sqlite] Virtual table API performance

2014-03-04 Thread Hick Gunter
Your VT1 table already has an xColumn implementation, possibly doing something 
like

switch( p_column )
{
case 1: sqlite3_result_xxx( p_ctx, v_rec->f1, ...); break;
...
case n: sqlite3_result_xxx( p_ctx, v_rec->fn, ...); break;
}

This needs to have two cases added:

case n+1: sqlite3_result_int64( p_ctx, (uintptr_t)v_rec );
case n+2: sqlite3_result_int64( p_ctx, (uintptr_t)func );

where

static int func( p_rec, p_ctx, p_column );

calls

xColumn( v_cursor, p_ctx, p_column );

with a dummy cursor structure as defined for your table.

The VT2 table can then prepare "select __rec,__func from VT1", and in its 
xColumn implementation it calls

  v_rec  = (void *)sqlite3_column_int64( v_stmt, 0 ); // this can be stored and cleared in the xNext function
v_func = (func *)sqlite3_column_int64( v_stmt, 1 ); // this can be stored
v_func( v_rec, p_ctx, p_column );


As for your second example, as written it does not suffer from the effect 
because you are already selecting c1, c2 and c3 at the bottom level.

Rewritten as

Select processrow(c1,c2,c3) from VT2(select * from VT1);

results in the VT1 xColumn function getting called (via the VT2 xColumn 
function) just 3 times per row.

Additionally, you may like to "select __func from VT1 limit 1" and store that 
in your xFilter implementation; and then "select __rec from VT1" in your xNext 
implementation to have sqlite3_result_int64() called half as often.

HTH

-Original Message-
From: Eleytherios Stamatogiannakis [mailto:est...@gmail.com]
Sent: Tuesday, 04 March 2014 14:15
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

Could you explain some more your solution?

Does it work in this following case:

select * from VT2(select * from VT1);

by directly passing the rows from VT1 to VT2 (short-circuiting SQLite)?

What would happen in the following case?:

select * from VT2(select processrow(c1,c2,c3) from VT1);

Regards,

l.

On 03/03/14 14:17, Hick Gunter wrote:
> We have gotten around this problem by defining "virtual" fields that contain 
> a reference to the "current record" and the entrypoint of a wrapper around 
> the xColumn function. That way only two fields get passed upwards through the 
> virtual table stack and the top level virtual table's xColumn implementation 
> calls straight through to the bottom layer's wrapper.
>
> It does take some care to avoid sorting in between the layers and 
> re-preparation of statements on schema changes.
>
-Original Message-
From: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
Sent: Sunday, 02 March 2014 20:39
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance
>
> We have both input and output virtual tables that avoid hitting the hard disk 
> and are also able to compress the incoming and outgoing data.
>
> We have a virtual table that takes as input a query and sends the data to a 
> port on another machine. This virtual table is called "OUTPUT". And another 
> virtual table that takes as input data from another port and forwards it into 
> SQLite. Let's call it "INPUT". A query that uses these two virtual tables 
> would look like this in madIS:
>
> OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');
>
> We actually use queries like above (actually we don't do it directly to ports 
> but to buffered named pipes that are then forwarded via netcat) to run 
> distributed queries on clusters, connecting all the local SQLite/madIS 
> instances on the different machines together.
>
> The main point that I want to make with the above explanation is that we don't 
> view SQLite only as a traditional database. We also view it as a data stream 
> processing machine, that doesn't have the requirement for the data to be 
> stored on a hard disk.
>
> Under this view, the efficiency of the virtual table api is very important. 
> Above query only uses 2 VTs in it, but we have other queries that use a lot 
> more VTs than that.
>
> estama
>
>
> On 2/3/2014 9:34 μμ, Max Vlasov wrote:
>> On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis
>>  wrote:
>>>
>>> Our main test case is TPCH, a standard DB benchmark. The "lineitem"
>>> table of TPCH contains 16 columns, which for 10M rows would require
>>> 160M xColumn callbacks, to pass it through the virtual table API.
>>> These callbacks are very expensive, especially when at the other end
>>> sits a VM (CPython or PyPy) handling them.
>>>
>>
>> Ok, not stating that the performance 

Re: [sqlite] Virtual table API performance

2014-03-04 Thread Eleytherios Stamatogiannakis

Could you explain your solution some more?

Does it work in the following case:

select * from VT2(select * from VT1);

by directly passing the rows from VT1 to VT2 (short-circuiting SQLite)?

What would happen in the following case?:

select * from VT2(select processrow(c1,c2,c3) from VT1);

Regards,

l.

On 03/03/14 14:17, Hick Gunter wrote:

We have gotten around this problem by defining "virtual" fields that contain a reference 
to the "current record" and the entrypoint of a wrapper around the xColumn function. That 
way only two fields get passed upwards through the virtual table stack and the top level virtual 
table's xColumn implementation calls straight through to the bottom layer's wrapper.

It does take some care to avoid sorting in between the layers and 
re-preparation of statements on schema changes.

-Original Message-
From: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
Sent: Sunday, 02 March 2014 20:39
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

We have both input and output virtual tables that avoid hitting the hard disk 
and are also able to compress the incoming and outgoing data.

We have a virtual table that takes as input a query and sends the data to a port on another 
machine. This virtual table is called "OUTPUT". And another virtual table that takes as 
input data from another port and forwards it into SQLite. Let's call it "INPUT". A query 
that uses these two virtual tables would look like this in madIS:

OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');

We actually use queries like above (actually we don't do it directly to ports 
but to buffered named pipes that are then forwarded via netcat) to run 
distributed queries on clusters, connecting all the local SQLite/madIS 
instances on the different machines together.

The main point that I want to make with the above explanation is that we don't view 
SQLite only as a traditional database. We also view it as a data stream 
processing machine, that doesn't have the requirement for the data to be stored 
on a hard disk.

Under this view, the efficiency of the virtual table api is very important. 
Above query only uses 2 VTs in it, but we have other queries that use a lot 
more VTs than that.

estama


On 2/3/2014 9:34 μμ, Max Vlasov wrote:

On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis
 wrote:


Our main test case is TPCH, a standard DB benchmark. The "lineitem"
table of TPCH contains 16 columns, which for 10M rows would require
160M xColumn callbacks, to pass it through the virtual table API.
These callbacks are very expensive, especially when at the other end
sits a VM (CPython or PyPy) handling them.



Ok, not stating that the performance improvement is impossible, I will
explain why I'm a little sceptical about it.

For every bulk insert we have a theoretical maximum we'd all be glad to
see sqlite would perform with - the speed of simple file copying.
Sqlite can't be faster than that, but to be on par is a good goal.
This is not possible when an insert means also modification of other
parts of the file, for example when there's an index involved. But
let's forget about it. Finally when new data is added, sqlite should
write a number of database pages, the cost of this part is absolutely
in the hands of the media (driver) and OS (driver).  But for every
database page write there's also price to pay in CPU units, for many
actions sqlite should do before actual value is translated from what
the developer provided to what actually appears on disk.

The illustration of the CPU price is the following example
   CREATE TABLE t(Value)

on my ssd drive multiple inserts (thousands)
insert into t (Value) values ('123456689  // this string
contains many symbols, for example 1024) performed with the speed
30 MB/Sec

but the query
insert into t (Value) values (10)  // this is a small integer
value only
3 Mb/Sec

Both shows almost full cpu load. Why such difference? Because with
latter query the system can do more than 30 MB of writes in 1 second,
but it should wait for sqlite spending 10 seconds in preparations.
The former is better because CPU cost of passing a large text value to
sqlite is comparatively low compared to the time spent in I/O in
writing this on disk.

So CPU price to pay isn't avoidable and notice that in example this is
not virtual table API, this is bind API. I suppose that the price we
pay for CPU spent in virtual table API is on par with an average price
paid in sqlite as a whole. This means that if I transform the above
queries into inserts from virtual tables, the final speed difference
will be similar. And this also means that for your comparison tests
(when you get x3 difference), the CPU price sqlite pays inside bind
api and in its code wrapping xColumn call is probably similar. The rest is the share your code pays.

Re: [sqlite] Virtual table API performance

2014-03-04 Thread Eleytherios Stamatogiannakis

On 03/03/14 03:01, Alek Paunov wrote:

On 02.03.2014 21:38, Elefterios Stamatogiannakis wrote:

Under this view, the efficiency of the virtual table api is very
important. Above query only uses 2 VTs in it, but we have other queries
that use a lot more VTs than that.


Max's tests in C show 2x CPU work, but he explains that the test is not
very sound, so let's say somewhere between 1x-2x. Your tests - 3x time.

As you have already identified, the real reason probably is the million
scale callback quantity across the VM barrier - I do not follow PyPy,
but see these notes [1] by Mike Pall - the LuaJIT author (LuaJIT is the
leading project in the trace compilers field):

[1] http://luajit.org/ext_ffi_semantics.html#callback_performance

Also from one of the dozens of threads touching the subject:

[2] http://www.freelists.org/post/luajit/Yielding-across-C-boundaries,3

```
Entering the VM needs a lot of state setup and leaving it isn't
free either. Constantly entering and leaving the VM via a callback
from C *to* Lua has a high overhead. For short callbacks, the
switching overhead between C and Lua may completely dominate the
total CPU time.

Calling an iterator written in C via the FFI *from* a Lua program
is much cheaper -- this compiles down to a simple call instruction.
```


I remember that I had seen the above quote from Mike Pall, but I couldn't 
find it.

Thank you for unearthing it.


Unfortunately, for your "insert into t select * from vt" case the
callback/iterator transformation is not possible (we do not have a
repetitive _step call to invert the control somehow). What to do?

It seems that the easiest optimization for this (very common) VT use case
(bulk streaming) is for a SQLite add-on to be written in _C_, implementing
a vtable interface specialization containing xNextPage, "buffering" let's
say 4K rows or, even better, 16KB of data (in addition to your initial
proposal of xNextRow).

The technical question is: how are the rows to be encoded? You said
initially that you use some compressed format. But for such an extension,
to gain more traction in the future, it would probably be better to
choose a more standard format.

a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some wide used encoding near to SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, It would be very nice if some common conventions for binary record
streaming could be discussed and adopted across the SQLite binding and
add-on developers. The possible applications are not limited only to
vtables ;-).


SQLite doesn't need any special format for the records to be passed over 
to it. It already has the "bind" API which would be very suitable for 
the xNextRow function too.


For a paging API (which IMHO is too complex ATM), the bind API could be 
extended with a row number parameter.


Regards,

estama



Kind regards,
Alek



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table API performance

2014-03-03 Thread Hick Gunter
We have gotten around this problem by defining "virtual" fields that contain a 
reference to the "current record" and the entrypoint of a wrapper around the 
xColumn function. That way only two fields get passed upwards through the 
virtual table stack and the top level virtual table's xColumn implementation 
calls straight through to the bottom layer's wrapper.

It does take some care to avoid sorting in between the layers and 
re-preparation of statements on schema changes.

-Original Message-
From: Elefterios Stamatogiannakis [mailto:est...@gmail.com]
Sent: Sunday, 02 March 2014 20:39
To: sqlite-users@sqlite.org
Subject: Re: [sqlite] Virtual table API performance

We have both input and output virtual tables that avoid hitting the hard disk 
and are also able to compress the incoming and outgoing data.

We have a virtual table that takes as input a query and sends the data to a 
port on another machine. This virtual table is called "OUTPUT". And another 
virtual table that takes as input data from another port and forwards it into 
SQLite. Let's call it "INPUT". A query that uses these two virtual tables would 
look like this in madIS:

OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');

We actually use queries like above (actually we don't do it directly to ports 
but to buffered named pipes that are then forwarded via netcat) to run 
distributed queries on clusters, connecting all the local SQLite/madIS 
instances on the different machines together.

The main point that I want to make with the above explanation is that we don't view 
SQLite only as a traditional database. We also view it as a data stream 
processing machine, that doesn't have the requirement for the data to be stored 
on a hard disk.

Under this view, the efficiency of the virtual table api is very important. 
Above query only uses 2 VTs in it, but we have other queries that use a lot 
more VTs than that.

estama


On 2/3/2014 9:34 ìì, Max Vlasov wrote:
> On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis
>  wrote:
>>
>> Our main test case is TPCH, a standard DB benchmark. The "lineitem"
>> table of TPCH contains 16 columns, which for 10M rows would require
>> 160M xColumn callbacks, to pass it through the virtual table API.
>> These callbacks are very expensive, especially when at the other end
>> sits a VM (CPython or PyPy) handling them.
>>
>
> Ok, not stating that the performance improvement is impossible, I will
> explain why I'm a little sceptical about it.
>
> For every bulk insert we have a theoretical maximum we'd all be glad to
> see sqlite would perform with - the speed of simple file copying.
> Sqlite can't be faster than that, but to be on par is a good goal.
> This is not possible when an insert means also modification of other
> parts of the file, for example when there's an index involved. But
> let's forget about it. Finally when new data is added, sqlite should
> write a number of database pages, the cost of this part is absolutely
> in the hands of the media (driver) and OS (driver).  But for every
> database page write there's also price to pay in CPU units, for many
> actions sqlite should do before actual value is translated from what
> the developer provided to what actually appears on disk.
>
> The illustration of the CPU price is the following example
>   CREATE TABLE t(Value)
>
> on my ssd drive multiple inserts (thousands)
>insert into t (Value) values ('123456689  // this string
> contains many symbols, for example 1024) performed with the speed
>30 MB/Sec
>
> but the query
>insert into t (Value) values (10)  // this is a small integer
> value only
>3 Mb/Sec
>
> Both shows almost full cpu load. Why such difference? Because with
> latter query the system can do more than 30 MB of writes in 1 second,
> but it should wait for sqlite spending 10 seconds in preparations.
> The former is better because CPU cost of passing a large text value to
> sqlite is comparatively low compared to the time spent in I/O in
> writing this on disk.
>
> So CPU price to pay isn't avoidable and notice that in example this is
> not virtual table API, this is bind API. I suppose that the price we
> pay for CPU spent in virtual table API is on par with an average price
> paid in sqlite as a whole. This means that if I transform the above
> queries into inserts from virtual tables, the final speed difference
> will be similar. And this also means that for your comparison tests
> (when you get x3 difference), the CPU price sqlite pays inside bind
> api and in its code wrapping xColumn call is probably similar. The
> rest is the share your code pays.
>
> Wel

Re: [sqlite] Virtual table API performance

2014-03-02 Thread Alek Paunov

On 02.03.2014 21:38, Elefterios Stamatogiannakis wrote:

Under this view, the efficiency of the virtual table api is very
important. Above query only uses 2 VTs in it, but we have other queries
that use a lot more VTs than that.


Max's tests in C show 2x CPU work, but he explains that the test is not 
very sound, so let's say somewhere between 1x-2x. Your tests - 3x time.


As you have already identified, the real reason probably is the million 
scale callback quantity across the VM barrier - I do not follow PyPy, 
but see these notes [1] by Mike Pall - the LuaJIT author (LuaJIT is the 
leading project in the trace compilers field):


[1] http://luajit.org/ext_ffi_semantics.html#callback_performance

Also from one of the dozens of threads touching the subject:

[2] http://www.freelists.org/post/luajit/Yielding-across-C-boundaries,3

```
Entering the VM needs a lot of state setup and leaving it isn't
free either. Constantly entering and leaving the VM via a callback
from C *to* Lua has a high overhead. For short callbacks, the
switching overhead between C and Lua may completely dominate the
total CPU time.

Calling an iterator written in C via the FFI *from* a Lua program
is much cheaper -- this compiles down to a simple call instruction.
```

Unfortunately, for your "insert into t select * from vt" case the 
callback/iterator transformation is not possible (we do not have a 
repetitive _step call to invert the control somehow). What to do?


It seems that the easiest optimization for this (very common) VT use case 
(bulk streaming) is for a SQLite add-on to be written in _C_, implementing 
a vtable interface specialization containing xNextPage, "buffering" let's 
say 4K rows or, even better, 16KB of data (in addition to your initial 
proposal of xNextRow).


The technical question is: how are the rows to be encoded? You said 
initially that you use some compressed format. But for such an extension, 
to gain more traction in the future, it would probably be better to 
choose a more standard format.


a) Rows represented in native SQLite3 format [3]
b) ... native SQLite4 format
c) Some wide used encoding near to SQLite types [4]
d) ...

[3] http://www.sqlite.org/fileformat.html#record_format
[4] https://github.com/msgpack/msgpack/blob/master/spec.md#formats

IMHO, It would be very nice if some common conventions for binary record 
streaming could be discussed and adopted across the SQLite binding and 
add-on developers. The possible applications are not limited only to 
vtables ;-).
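As a toy illustration of option (c) — not SQLite's actual record format — here is a sketch of a tag-plus-length encoding near to SQLite's types, so that a whole batch of rows could cross the API boundary in one contiguous buffer. All names are invented for the example.

```python
import struct

# A toy row codec (invented for illustration, not the SQLite3/SQLite4
# record format): each value is a 1-byte tag followed by its payload,
# so many rows can be packed into one buffer and decoded on the far side.

def encode_row(values):
    out = bytearray()
    for v in values:
        if v is None:
            out += b"N"
        elif isinstance(v, int):
            out += b"I" + struct.pack("<q", v)
        elif isinstance(v, float):
            out += b"F" + struct.pack("<d", v)
        else:  # text
            b = str(v).encode("utf-8")
            out += b"T" + struct.pack("<I", len(b)) + b
    return bytes(out)

def decode_row(buf):
    vals, i = [], 0
    while i < len(buf):
        tag = buf[i:i+1]; i += 1
        if tag == b"N":
            vals.append(None)
        elif tag == b"I":
            vals.append(struct.unpack_from("<q", buf, i)[0]); i += 8
        elif tag == b"F":
            vals.append(struct.unpack_from("<d", buf, i)[0]); i += 8
        else:
            n = struct.unpack_from("<I", buf, i)[0]; i += 4
            vals.append(buf[i:i+n].decode("utf-8")); i += n
    return vals

row = [1, 2.5, "lineitem", None]
decoded = decode_row(encode_row(row))   # round-trips losslessly
```

A real convention would of course follow one of the formats [3]/[4] above; the sketch only shows the shape of the problem.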


Kind regards,
Alek



Re: [sqlite] Virtual table API performance

2014-03-02 Thread Elefterios Stamatogiannakis
We have both input and output virtual tables that avoid hitting the hard 
disk and are also able to compress the incoming and outgoing data.


We have a virtual table that takes as input a query and sends the data 
to a port on another machine. This virtual table is called "OUTPUT". And 
another virtual table that takes as input data from another port and 
forwards it into SQLite. Let's call it "INPUT". A query that uses these 
two virtual tables would look like this in madIS:


OUTPUT ip:192.168.0.1 port:8080 select * from INPUT('port:8081');

We actually use queries like above (actually we don't do it directly to 
ports but to buffered named pipes that are then forwarded via netcat) to 
run distributed queries on clusters, connecting all the local 
SQLite/madIS instances on the different machines together.


The main point that I want to make with the above explanation is that we 
don't view SQLite only as a traditional database. We also view it as a 
data stream processing machine, that doesn't have the requirement for 
the data to be stored on a hard disk.


Under this view, the efficiency of the virtual table api is very 
important. Above query only uses 2 VTs in it, but we have other queries 
that use a lot more VTs than that.


estama


On 2/3/2014 9:34 μμ, Max Vlasov wrote:

On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis
 wrote:


Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of
TPCH contains 16 columns, which for 10M rows would require 160M xColumn
callbacks, to pass it through the virtual table API. These callbacks are
very expensive, especially when at the other end sits a VM (CPython or PyPy)
handling them.



Ok, not stating that the performance improvement is impossible, I will
explain why I'm a little sceptical about it.

For every bulk insert we have a theoretical maximum we'd all be glad to
see sqlite would perform with - the speed of simple file copying.
Sqlite can't be faster than that, but to be on par is a good goal.
This is not possible when an insert means also modification of other
parts of the file, for example when there's an index involved. But
let's forget about it. Finally when new data is added, sqlite should
write a number of database pages, the cost of this part is absolutely
in the hands of the media (driver) and OS (driver).  But for every
database page write there's also price to pay in CPU units, for many
actions sqlite should do before actual value is translated from what
the developer provided to what actually appears on disk.

The illustration of the CPU price is the following example
  CREATE TABLE t(Value)

on my ssd drive multiple inserts (thousands)
   insert into t (Value) values ('123456689  // this string
contains many symbols, for example 1024)
performed with the speed
   30 MB/Sec

but the query
   insert into t (Value) values (10)  // this is a small integer value
only
   3 Mb/Sec

Both shows almost full cpu load. Why such difference? Because with
latter query the system can do more than 30 MB of writes in 1 second,
but it should wait for sqlite spending 10 seconds in preparations.
The former is better because CPU cost of passing a large text value to
sqlite is comparatively low compared to the time spent in I/O in
writing this on disk.

So CPU price to pay isn't avoidable and notice that in example this is
not virtual table API, this is bind API. I suppose that the price we
pay for CPU spent in virtual table API is on par with an average price
paid in sqlite as a whole. This means that if I transform the above
queries into inserts from virtual tables, the final speed difference
will be similar. And this also means that for your comparison tests
(when you get x3 difference), the CPU price sqlite pays inside bind
api and in its code wrapping xColumn call is probably similar. The
rest is the share your code pays.

Well, I know that there are differences in CPU architectures and
probably there are platforms where compiled code for bind api and
virtual tables api behaves a little differently, making the costs more
different. But imagine the hard task of fine tuning and refactoring
just to get a noticeable difference for a particular platform.


Max



___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual table API performance

2014-03-02 Thread Max Vlasov
On Sun, Mar 2, 2014 at 5:21 PM, Elefterios Stamatogiannakis
 wrote:
>
> Our main test case is TPCH, a standard DB benchmark. The "lineitem" table of
> TPCH contains 16 columns, which for 10M rows would require 160M xColumn
> callbacks, to pass it through the virtual table API. These callbacks are
> very expensive, especially when at the other end sits a VM (CPython or PyPy)
> handling them.
>

Ok, not stating that the performance improvement is impossible, I will
explain why I'm a little sceptical about it.

For every bulk insert we have a theoretical maximum we'd all be glad to
see sqlite would perform with - the speed of simple file copying.
Sqlite can't be faster than that, but to be on par is a good goal.
This is not possible when an insert means also modification of other
parts of the file, for example when there's an index involved. But
let's forget about it. Finally when new data is added, sqlite should
write a number of database pages, the cost of this part is absolutely
in the hands of the media (driver) and OS (driver).  But for every
database page write there's also price to pay in CPU units, for many
actions sqlite should do before actual value is translated from what
the developer provided to what actually appears on disk.

The illustration of the CPU price is the following example
 CREATE TABLE t(Value)

on my ssd drive multiple inserts (thousands)
  insert into t (Value) values ('123456689  // this string
contains many symbols, for example 1024)
performed with the speed
  30 MB/Sec

but the query
  insert into t (Value) values (10)  // this is a small integer value
only
  3 Mb/Sec

Both shows almost full cpu load. Why such difference? Because with
latter query the system can do more than 30 MB of writes in 1 second,
but it should wait for sqlite spending 10 seconds in preparations.
The former is better because CPU cost of passing a large text value to
sqlite is comparatively low compared to the time spent in I/O in
writing this on disk.
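The same illustration can be re-run with Python's stdlib sqlite3 module (the absolute numbers are machine-dependent; the only claim is the direction of the effect — per-row CPU overhead dominates when the values are small, so payload throughput collapses):

```python
import sqlite3
import time

N = 20000
big = "0123456789" * 100            # ~1 KB of text per row

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t(Value)")

t0 = time.perf_counter()
con.executemany("INSERT INTO t(Value) VALUES (?)", ((big,) for _ in range(N)))
t_big = time.perf_counter() - t0

t0 = time.perf_counter()
con.executemany("INSERT INTO t(Value) VALUES (?)", ((10,) for _ in range(N)))
t_small = time.perf_counter() - t0

# Payload throughput: large values amortize the fixed per-row CPU cost,
# small integer values do not, so MB/sec is far lower for the latter.
mb_big = N * len(big) / 1e6 / t_big
mb_small = N * 8 / 1e6 / t_small
```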

So CPU price to pay isn't avoidable and notice that in example this is
not virtual table API, this is bind API. I suppose that the price we
pay for CPU spent in virtual table API is on par with an average price
paid in sqlite as a whole. This means that if I transform the above
queries into inserts from virtual tables, the final speed difference
will be similar. And this also means that for your comparison tests
(when you get x3 difference), the CPU price sqlite pays inside bind
api and in its code wrapping xColumn call is probably similar. The
rest is the share your code pays.

Well, I know that there are differences in CPU architectures and
probably there are platforms where compiled code for bind api and
virtual tables api behaves a little differently, making the costs more
different. But imagine the hard task of fine tuning and refactoring
just to get a noticeable difference for a particular platform.


Max


Re: [sqlite] Virtual table API performance

2014-03-02 Thread Elefterios Stamatogiannakis
In our performance tests we try to work with data and queries that are 
representative of what we would find in a typical DB.


This means a lot of "small" values (ints, floats, small strings), and 
5-20 columns.


Our main test case is TPCH, a standard DB benchmark. The "lineitem" 
table of TPCH contains 16 columns, which for 10M rows would require 160M 
xColumn callbacks, to pass it through the virtual table API. These 
callbacks are very expensive, especially when at the other end sits a VM 
(CPython or PyPy) handling them.


For PyPy in particular, which is able to produce JIT-compiled and 
optimized UDF code, adapted to the "shape" (type and value distribution) 
of the data flows as they pass through it, every time it faces the 
virtual table API it slows down to a crawl, becoming more than 2x slower 
than interpreted Python. This happens because it cannot see the data/row 
"shape" from the many small, single-value xColumn callbacks that are 
unrelated to each other.


Changing the subject, I've seen some requests in previous emails from 
people asking for windowing functions to be added to SQLite. I want to 
propose an alternative that we have been using for years, and that is a lot 
more generic than adding specific functions for very "narrow" use cases 
in SQLite.


We have added the "EXPAND" VT function in madIS, which "emulates" nested 
tables in SQLite, enabling row and aggregate functions that 
return (in a streaming fashion) multiple values over multiple columns. The 
"EXPAND" function takes as input a table containing (in our 
case Python) generators as values, and then it calls the generators, "expanding" 
the input table to its final form. "EXPAND" is inserted automatically 
wherever it is required, so it isn't visible. An example follows:


> select strsplit('one and other');
one|and|other <-- 3 columns

or

> select strsplitV('one and other');
one
and    <-- 3 individual rows
other

So by adding a single VT function and some syntactic sugar (auto 
inserting EXPAND VT), we were able to have functionality that is not 
case specific, allowing us to run all kinds of analytics inside SQLite.


The performance of the above functionality is already very good. But it 
could be a lot better with a more efficient VT API.
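A pure-Python sketch of what EXPAND does conceptually (the names `strsplitV` and `expand` mirror the madIS examples above, but this is illustrative code, not madIS itself): a table cell may hold a generator, and EXPAND replaces each such row with the rows the generator yields, streaming them out.

```python
# Illustrative sketch, not madIS code: a "nested table" is a cell holding
# a generator; expand() streams out the rows that generator produces.

def strsplitV(text):
    # row-returning UDF: one value per output row
    for token in text.split():
        yield (token,)

def expand(rows):
    for row in rows:
        for cell in row:
            if hasattr(cell, "__next__"):   # a generator "nested table"
                for produced in cell:
                    yield produced
                break
        else:
            yield row                        # ordinary row, passed through

out = list(expand([(strsplitV("one and other"),)]))
```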


Regards,

estama

On 2/3/2014 9:15 πμ, Max Vlasov wrote:

Hi,
thanks for explaining your syntax in another post. Now about virtual
tables if you don't mind.

On Fri, Feb 28, 2014 at 8:24 PM, Eleytherios Stamatogiannakis
 wrote:


If we load into SQLite, 

create table newtable as select * from READCOMPRESSEDFILE('ctable.rc');

it takes: 55 sec


If we create an external program 

it takes: 19 sec (~3x faster than using the virtual table API)




Looking at your numbers, as a user (and fan :) of virtual tables I
decided to do some tests.

I have a virtual table "all values"; it was designed to enumerate
all table values into one single virtual table, so finally it is
a long list of

   TableName, TableRowId, FieldName, Value

so you get the idea. As an example of what it may do, you may open
places.sqlite of mozilla browser and do

   Select * from AllValues where Value Like "%sqlite.org%"

and see actual results even not knowing how they planned their schema.

Internally this virtual table simply uses general selects for all
other tables met in sqlite_master. This is a good (but probably not
the best) test for measuring virtual tables performance, because

   SELECT * FROM AllValues

is equivalent to reading all conventional tables of this database.
Besides
- the tool I use has a tweaker implemented with VFS that allows
measuring speed and other characteristics of the query performed while
the query is in effect.
- I have an option that forces resetting windows cache for the
database file when it is reopened. So with it we exclude the windows
cache from consideration so pure I/O reading is used. Btw, when you do
your comparison, it's very important to reset system cache before
every measurement that involves I/O.


So I took a comparatively large (500 Mb) database consisting of
several small and one big table (Posts) and compared two queries.

(Query1)

   Select sum(length(Body) + length(Title)) from Posts

This one effectively reads the table data and uses
- length() to force sqlite to read texts that don't fit into a single db page
- sum() to exclude accumulating results on my side from comparison, so
we have a single row, single column result from the work completely
done by sqlite.

(Query2)

   Select Sum(Length(Value)) from AllValues

This one performs basically the same but using sqlite virtual tables
api. It also touches other tables, but since they're small, we can
forget about this.

Query1 (General):
   Read: 540MB,
   Time: 24.2 sec,
   CPU Time: 6 Sec (25%)
   Speed: 22.31 MB/Sec

Query2 (Virtual):
   Read: 540MB,
   Time: 27.3 Sec,
   CPU Time: 13 sec (51%)
   Speed: 20 MB/Sec

In my particular test the noticeable difference is at the p

Re: [sqlite] Virtual table API performance

2014-03-01 Thread Max Vlasov
Hi,
thanks for explaining your syntax in another post. Now about virtual
tables if you don't mind.

On Fri, Feb 28, 2014 at 8:24 PM, Eleytherios Stamatogiannakis
 wrote:
>
> If we load into SQLite, 
>
> create table newtable as select * from READCOMPRESSEDFILE('ctable.rc');
>
> it takes: 55 sec
>
>
> If we create an external program 
>
> it takes: 19 sec (~3x faster than using the virtual table API)
>
>

Looking at your numbers, as a user (and fan :) of virtual tables I
decided to do some tests.

I have a virtual table "all values"; it was designed to enumerate
all table values into one single virtual table, so finally it is
a long list of

  TableName, TableRowId, FieldName, Value

so you get the idea. As an example of what it may do, you may open
places.sqlite of mozilla browser and do

  Select * from AllValues where Value Like "%sqlite.org%"

and see actual results even not knowing how they planned their schema.

Internally this virtual table simply uses general selects for all
other tables met in sqlite_master. This is a good (but probably not
the best) test for measuring virtual tables performance, because

  SELECT * FROM AllValues

is equivalent to reading all conventional tables of this database.
Besides
- the tool I use has a tweaker implemented with VFS that allows
measuring speed and other characteristics of the query performed while
the query is in effect.
- I have an option that forces resetting windows cache for the
database file when it is reopened. So with it we exclude the windows
cache from consideration so pure I/O reading is used. Btw, when you do
your comparison, it's very important to reset system cache before
every measurement that involves I/O.


So I took a comparatively large (500 Mb) database consisting of
several small and one big table (Posts) and compared two queries.

(Query1)

  Select sum(length(Body) + length(Title)) from Posts

This one effectively reads the table data and uses
- length() to force sqlite reading texts that don't fit into single db page
- sum() to exclude accumulating results on my side from comparison, so
we have a single row, single column result from the work completely
done by sqlite.
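As a small illustration (a toy stand-in for the 500 MB database, using
Python's built-in sqlite3 module; the table contents are invented), the
same aggregate shape returns a single row computed entirely inside SQLite:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Posts(Title TEXT, Body TEXT)")
# Two made-up rows; with real data the scan would cover the whole table.
con.executemany("INSERT INTO Posts VALUES(?, ?)",
                [("a", "bb"), ("ccc", "dddd")])
# length() forces the texts to be read; sum() collapses everything to one row.
total, = con.execute(
    "SELECT sum(length(Body) + length(Title)) FROM Posts").fetchone()
print(total)  # 10
```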

(Query2)

  Select Sum(Length(Value)) from AllValues

This one performs basically the same but using sqlite virtual tables
api. It also touches other tables, but since they're small, we can
forget about this.

Query1 (General):
  Read: 540MB,
  Time: 24.2 sec,
  CPU Time: 6 Sec (25%)
  Speed: 22.31 MB/Sec

Query2 (Virtual):
  Read: 540MB,
  Time: 27.3 Sec,
  CPU Time: 13 sec (51%)
  Speed: 20 MB/Sec

In my particular test the noticeable difference is the extra CPU time
spent with the virtual table. I assume this can be related to my own
implementation of this virtual table, since I have to retrieve values,
store them temporarily somewhere and talk to sqlite. But this may also
shed light on your performance drop. If your virtual implementation
spends much time processing a value, you may end up with a big drop.

You may say that this test is not fair because it does not involve
creating a table from the values of a virtual table. Unfortunately I
can't create a good enough test comparing the Posts and AllValues
tables as sources, because the geometry of the two tables is different
(Posts has more columns and fewer rows, AllValues has fewer columns
and more rows). The closest approximation was to create an
intermediate physical table containing the results from AllValues and
compare table creation from this table and from the virtual table. The
virtual one took longer, but the values (56 seconds vs 43 seconds) are
not different enough to conclude anything.

I'm not sure my tests defend sqlite virtual tables sufficiently, but
currently I don't have evidence of significant inefficiency either.

Max
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Virtual Table "Functions"

2014-03-01 Thread Elefterios Stamatogiannakis

Please excuse me for not explaining.

The syntax that you puzzle about is supported by madIS [*] which 
translates it into SQLite. Having been using madIS for many years (we 
created it in 2008), it comes naturally to me, and I forget that SQLite 
doesn't support it.


What essentially madIS does is create and destroy the virtual tables 
in a madIS/SQLite query. It also supports an "inverted" syntax that 
permits virtual table pipelines (like pipes in unix):


select * from (XMLPARSE select * from FILE("data.xml"));

The above query is the same as writing:

select * from XMLPARSE(' select * from FILE("data.xml") ')

but without the thorny quote escaping problems.

The "select * from " part is optional (it is autocompleted), so the 
above query could be rewritten as:


XMLPARSE FILE "data.xml";

Both XMLPARSE and FILE are regular SQLite virtual table functions coded 
in Python (the whole of madIS is in Python). Also, due to the quality of 
SQLite's virtual table API design, both functions are streaming (the XML 
parsing works on continuous multi-GB streams of XML input, coming from 
the internal query that uses FILE).


You may think that it is crazy to do this pipelining through a 
relational engine (SQLite), but the whole thing is very fast and very 
versatile. We have been processing hundreds of GB of data (in various 
forms) using such "enhanced" SQLite queries for many years.


Having said all of the above, I hope that you can see why we care so 
much about SQLite's virtual table API efficiency.


estama.

[*] https://code.google.com/p/madis/

On 1/3/2014 10:35 AM, Max Vlasov wrote:

On Fri, Feb 28, 2014 at 10:14 PM, Dominique Devienne
 wrote:

Can someone tell me how the statement below works?


Thanks for any help on this. This is really puzzling to me. --DD



Very puzzling for me too
For any statement like this

   select * from blablabla(123)

sqlite (3.8.3.1) first reports a syntax error...

near "(": syntax error

... before complaining about anything else.

I'd also be glad to see a shortcut where a virtual table is created just
for a select statement and dies automatically, but I suspect it is not
supported and that the OP was using their own modified version of sqlite.
I hope the OP reads this post too and has an answer for us; it would
probably also help if he provided more information about his issues.

Max


Re: [sqlite] Virtual Table "Functions"

2014-03-01 Thread Max Vlasov
On Fri, Feb 28, 2014 at 10:14 PM, Dominique Devienne
 wrote:
> Can someone tell me how the statement below works?
> 
>
> Thanks for any help on this. This is really puzzling to me. --DD


Very puzzling for me too
For any statement like this

  select * from blablabla(123)

sqlite (3.8.3.1) first reports a syntax error...

   near "(": syntax error

... before complaining about anything else.

I'd also be glad to see a shortcut where a virtual table is created just
for a select statement and dies automatically, but I suspect it is not
supported and that the OP was using their own modified version of sqlite.
I hope the OP reads this post too and has an answer for us; it would
probably also help if he provided more information about his issues.

Max


Re: [sqlite] Virtual Table xRowid

2014-02-05 Thread Kevin Martin

On 5 Feb 2014, at 18:40, Dan Kennedy  wrote:

> But is it the case that your virtual table doesn't handle
> anything other than full scans of the entire table contents? If
> so, it's probably not a problem if rowids are inconsistent.

Thanks for the heads up on the 'or' case, I didn't know that. I don't use 
xBestIndex, so the table only supports full scans.

I think I'll try it with returning SQLITE_ERROR in xRowid, because I'd rather 
see the error than have strange results happen silently, and I'll watch what 
happens. One of the columns in the underlying table/view is unique, so one 
solution is to order by that column when doing the internal select, and then 
the auto generated row ids will be consistent, but that just feels like it adds 
a lot of calculation I don't want to do, and won't be necessary most of the 
time.

Thanks,
Kevin


Re: [sqlite] Virtual Table xRowid

2014-02-05 Thread Dan Kennedy

On 02/06/2014 01:22 AM, Kevin Martin wrote:

Hi,

My questions are basically:

1) What causes sqlite3 to call xRowid
2) If I don't want to update my virtual table, or do a select ROWID, can I just 
use an incremental counter, increased on every call to xNext (bearing in mind 
the order of my data is not guaranteed, so this won't necessarily return the 
same rowid for the same row on separate calls to xFilter), or even better just 
return SQLITE_ERROR?


One more case. If you have a query like this:

  SELECT ... FROM vtable WHERE vtable.a = ? OR vtable.b = ?;

and the implementation of vtable can handle queries for both
(vtable.a=?) and (vtable.b=?) efficiently, then SQLite might
make two separate sets of calls to xFilter/xStep - one to obtain
the set of rows for (a=?) and a second to obtain the set that
match (b=?). It then merges the two sets together using the
rowids as a unique key. If the rowids in the two sets are not
consistent, then you might end up returning duplicate rows to
the user or omitting rows from the result.
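For what it's worth, the rowid-keyed merge can be observed on an ordinary
table (a sketch using Python's built-in sqlite3 module; the schema is
invented): a row matching both arms of the OR comes back exactly once,
because the two index scans are merged using the rowid as the unique key.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE v(a, b)")
# Indexes on both columns so SQLite can run the OR as two index scans.
con.execute("CREATE INDEX ia ON v(a)")
con.execute("CREATE INDEX ib ON v(b)")
con.execute("INSERT INTO v VALUES(1, 1)")   # satisfies both a=1 and b=1
rows = con.execute("SELECT * FROM v WHERE a = 1 OR b = 1").fetchall()
print(len(rows))  # 1 -- the duplicate match is eliminated by rowid
```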

But is it the case that your virtual table doesn't handle
anything other than full scans of the entire table contents? If
so, it's probably not a problem if rowids are inconsistent.

Dan.







The problem itself is detailed below.

Thanks,
Kevin

---

I am creating virtual table that unpacks the contents of another table/view.

As a simple example, consider the following:

create table t(week, performance_csv)

Where each row contains a csv file for the performance of the given week, and 
each csv file has two columns, person and percentage.

I want to create a virtual table as follows:

create virtual table t2 using unpack_performance(t)

This table will declare the table as follows

create table t2(week, person, percentage)

and it will iterate over each row of each csv in the table t.

Implementing xRowid is easy because I can base it off the ROWID of t and the 
number of lines in the csv file.
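One way to do that packing (a hypothetical sketch; the shift width and the
helper names are invented, and it assumes fewer than 2**20 lines per csv
file):

```python
LINE_BITS = 20  # assumed upper bound: < 2**20 lines per csv file

def make_rowid(base_rowid, line_no):
    # Pack the source row's ROWID and the csv line number into one integer.
    return (base_rowid << LINE_BITS) | line_no

def split_rowid(rowid):
    # Recover the (source ROWID, line number) pair for xRowid / debugging.
    return rowid >> LINE_BITS, rowid & ((1 << LINE_BITS) - 1)

print(split_rowid(make_rowid(3, 5)))  # (3, 5)
```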

However, the problem comes when t is a view; I've only just discovered that 
views don't have ROWIDs!

I can quite happily just increment a simple counter, but the view isn't 
necessarily ordered, so the rowid won't match on multiple filters.

In my actual application the view is the union of a few selects, so actually 
including the rowid in the view is going to be a complete pain, and is not an 
option I want to consider.


Re: [sqlite] Virtual Table: misuse error on sqlite3_declare_vtab()

2013-10-04 Thread Simon
GOT IT.

Well, I found my mistake that lead to all this.  I'm sorry to have wasted
your time.  Here's what happened for those who struggle with similar
issues...

I originally opted to make sqlite3_module a member of my C++ "module"
class.  My C++ constructor was therefore assigning all the xFunctions to
that member module's fields.  Later, without realizing the impact, my
implementation changed so my class derived from sqlite3_module.  However, I
failed to switch the xFunc assignments to this->xFunc = foo;  In the end,
it was exactly as if I had never assigned anything to them.

Thank you all very much!  :)

Simon


On Fri, Oct 4, 2013 at 3:14 PM, Simon  wrote:

> The option was already on for my own stuff but not sqlite.  However, I
> just had to recompile libSqlite3 with debug option to get more details and
> it's a bit scary...
>
> SQL query using sqlite3_exec() and stmt: create virtual table foo using
> vtable_module1;
> -Calling sqlite3_prepare_v2()...
> -Calling sqlite3_step()...
> staging.bin: sqlite3.c:103774: vtabCallConstructor: Assertion `xConstruct'
> failed.
>
> What's that xConstruct?!  I can't even grep that in the sqlite source
> files!  :(
>
> After having browsed around, I think it might actually be a macro
> (somehow) which aliases either xCreate or xConnect depending on the
> situation, but this is just a guess...  and still, it doesn't really help.
>  I reviewed my xCreate and xConnect and they look fine at first glance...
>
> Thanks,
>   Simon
>
>
> On Fri, Oct 4, 2013 at 2:26 PM, Richard Hipp  wrote:
>
>>
>>
>>
>> On Fri, Oct 4, 2013 at 2:22 PM, Simon  wrote:
>>
>>> Hi Richard,
>>>
>>> That helped a lot and got me past that point.  However, I didn't go
>>> very far from where I was...
>>>
>>> It now compiles fine, however the program segfaults within the
>>> sqlite3_step() of the "create virtual table foo using vtable_module1;"
>>> statement.
>>>
>>> Using valgrind, I get this interesting message:
>>>
>>> ==31748== Jump to the invalid address stated on the next line
>>> ==31748== at 0x0: ???
>>> ==31748== by 0x5A0D119: vtabCallConstructor (in
>>> /usr/lib64/libsqlite3.so.0.8.6)
>>> ==31748== by 0x5A3D499: sqlite3VdbeExec (in
>>> /usr/lib64/libsqlite3.so.0.8.6)
>>> ==31748== by 0x5A3E969: sqlite3_step (in /usr/lib64/libsqlite3.so.0.8.6)
>>> [...]
>>>
>>> This looks (to me) like one of the NULL function pointers in struct
>>> sqlite3_module is being called.  I added a valid xRename() function as I
>>> had initially thought it was optional, but it still didn't work.  I then
>>> tried creating a dummy function for every one of these pointers, but I get
>>> the same result.  I even tried changing the "create virtual..." statement
>>> to include module params (which my module ignores) but again it didn't
>>> change anything.
>>>
>>> Where should I look at next?
>>>
>>>
>> Recompile with line-number information (-g is it?) so that you know
>> exactly which line of code tried to jump to the NULL pointer.  Then you'll
>> know exactly which method you need to add.
>>
>>
>> --
>> D. Richard Hipp
>> d...@sqlite.org
>>
>
>


Re: [sqlite] Virtual Table: misuse error on sqlite3_declare_vtab()

2013-10-04 Thread Simon
The option was already on for my own stuff but not sqlite.  However, I just
had to recompile libSqlite3 with debug option to get more details and it's
a bit scary...

SQL query using sqlite3_exec() and stmt: create virtual table foo using
vtable_module1;
-Calling sqlite3_prepare_v2()...
-Calling sqlite3_step()...
staging.bin: sqlite3.c:103774: vtabCallConstructor: Assertion `xConstruct'
failed.

What's that xConstruct?!  I can't even grep that in the sqlite source
files!  :(

After having browsed around, I think it might actually be a macro (somehow)
which aliases either xCreate or xConnect depending on the situation, but
this is just a guess...  and still, it doesn't really help.  I reviewed my
xCreate and xConnect and they look fine at first glance...

Thanks,
  Simon


On Fri, Oct 4, 2013 at 2:26 PM, Richard Hipp  wrote:

>
>
>
> On Fri, Oct 4, 2013 at 2:22 PM, Simon  wrote:
>
>> Hi Richard,
>>
>> That helped a lot and got me past that point.  However, I didn't go
>> very far from where I was...
>>
>> It now compiles fine, however the program segfaults within the
>> sqlite3_step() of the "create virtual table foo using vtable_module1;"
>> statement.
>>
>> Using valgrind, I get this interesting message:
>>
>> ==31748== Jump to the invalid address stated on the next line
>> ==31748== at 0x0: ???
>> ==31748== by 0x5A0D119: vtabCallConstructor (in
>> /usr/lib64/libsqlite3.so.0.8.6)
>> ==31748== by 0x5A3D499: sqlite3VdbeExec (in
>> /usr/lib64/libsqlite3.so.0.8.6)
>> ==31748== by 0x5A3E969: sqlite3_step (in /usr/lib64/libsqlite3.so.0.8.6)
>> [...]
>>
>> This looks (to me) like one of the NULL function pointers in struct
>> sqlite3_module is being called.  I added a valid xRename() function as I
>> had initially thought it was optional, but it still didn't work.  I then
>> tried creating a dummy function for every one of these pointers, but I get
>> the same result.  I even tried changing the "create virtual..." statement
>> to include module params (which my module ignores) but again it didn't
>> change anything.
>>
>> Where should I look at next?
>>
>>
> Recompile with line-number information (-g is it?) so that you know
> exactly which line of code tried to jump to the NULL pointer.  Then you'll
> know exactly which method you need to add.
>
>
> --
> D. Richard Hipp
> d...@sqlite.org
>


Re: [sqlite] Virtual Table: misuse error on sqlite3_declare_vtab()

2013-10-04 Thread Richard Hipp
On Fri, Oct 4, 2013 at 2:22 PM, Simon  wrote:

> Hi Richard,
>
> That helped a lot and got me past that point.  However, I didn't go very
> far from where I was...
>
> It now compiles fine, however the program segfaults within the
> sqlite3_step() of the "create virtual table foo using vtable_module1;"
> statement.
>
> Using valgrind, I get this interesting message:
>
> ==31748== Jump to the invalid address stated on the next line
> ==31748== at 0x0: ???
> ==31748== by 0x5A0D119: vtabCallConstructor (in
> /usr/lib64/libsqlite3.so.0.8.6)
> ==31748== by 0x5A3D499: sqlite3VdbeExec (in /usr/lib64/libsqlite3.so.0.8.6)
> ==31748== by 0x5A3E969: sqlite3_step (in /usr/lib64/libsqlite3.so.0.8.6)
> [...]
>
> This looks (to me) like one of the NULL function pointers in struct
> sqlite3_module is being called.  I added a valid xRename() function as I
> had initially thought it was optional, but it still didn't work.  I then
> tried creating a dummy function for every one of these pointers, but I get
> the same result.  I even tried changing the "create virtual..." statement
> to include module params (which my module ignores) but again it didn't
> change anything.
>
> Where should I look at next?
>
>
Recompile with line-number information (-g is it?) so that you know exactly
which line of code tried to jump to the NULL pointer.  Then you'll know
exactly which method you need to add.


-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Virtual Table: misuse error on sqlite3_declare_vtab()

2013-10-04 Thread Simon
Hi Richard,

That helped a lot and got me past that point.  However, I didn't go very
far from where I was...

It now compiles fine, however the program segfaults within the
sqlite3_step() of the "create virtual table foo using vtable_module1;"
statement.

Using valgrind, I get this interesting message:

==31748== Jump to the invalid address stated on the next line
==31748== at 0x0: ???
==31748== by 0x5A0D119: vtabCallConstructor (in
/usr/lib64/libsqlite3.so.0.8.6)
==31748== by 0x5A3D499: sqlite3VdbeExec (in /usr/lib64/libsqlite3.so.0.8.6)
==31748== by 0x5A3E969: sqlite3_step (in /usr/lib64/libsqlite3.so.0.8.6)
[...]

This looks (to me) like one of the NULL function pointers in struct
sqlite3_module is being called.  I added a valid xRename() function as I
had initially thought it was optional, but it still didn't work.  I then
tried creating a dummy function for every one of these pointers, but I get
the same result.  I even tried changing the "create virtual..." statement
to include module params (which my module ignores) but again it didn't
change anything.

Where should I look at next?

Thanks again for your help,
  Simon



On Fri, Oct 4, 2013 at 10:54 AM, Richard Hipp  wrote:

>
>
>
> On Fri, Oct 4, 2013 at 10:07 AM, Simon  wrote:
>
>>
>> 3) I call sqlite3_create_module() with module name "vtable1"
>>
>> 4) I call sqlite3_declare_vtab() with this statement: "CREATE TABLE foo (
>> x
>> integer, y integer );"
>>
>
>
> The application must never invoke sqlite3_declare_vtab().  The
> sqlite3_declare_vtab() interface can only be called from within the xCreate
> and xConnect methods of the virtual table implementation.  If called from
> any other context, sqlite3_declare_vtab() returns the error you describe.
>
> When you run "CREATE VIRTUAL TABLE name USING module;"  the xCreate method
> is called to create the new virtual table.  The xCreate method needs to
> tell the SQLite core what the schema for the virtual table is and it uses
> sqlite3_declare_vtab() to do so.
>
> --
> D. Richard Hipp
> d...@sqlite.org
>


Re: [sqlite] Virtual Table: misuse error on sqlite3_declare_vtab()

2013-10-04 Thread Richard Hipp
On Fri, Oct 4, 2013 at 10:07 AM, Simon  wrote:

>
> 3) I call sqlite3_create_module() with module name "vtable1"
>
> 4) I call sqlite3_declare_vtab() with this statement: "CREATE TABLE foo ( x
> integer, y integer );"
>


The application must never invoke sqlite3_declare_vtab().  The
sqlite3_declare_vtab() interface can only be called from within the xCreate
and xConnect methods of the virtual table implementation.  If called from
any other context, sqlite3_declare_vtab() returns the error you describe.

When you run "CREATE VIRTUAL TABLE name USING module;"  the xCreate method
is called to create the new virtual table.  The xCreate method needs to
tell the SQLite core what the schema for the virtual table is and it uses
sqlite3_declare_vtab() to do so.

-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Virtual Table: misuse error on sqlite3_declare_vtab()

2013-10-04 Thread Max Vlasov
Simon,

don't know what exactly is wrong in your particular case, but I'd suggest
setting debugger breakpoints everywhere in your x-handlers and noticing the
moment after which the calls cease (or you get an error code).

Max


On Fri, Oct 4, 2013 at 6:07 PM, Simon  wrote:

> Hi there,
>
> I'm currently building my own virtual table implementation. I've built a
> dummy vtable that returns the string "1" to all queries for all columns
> but I'm having trouble getting started (getting a 'misuse' error). Can
> someone point me in the right direction?
>
>
> >
>
> 1) I first create an sqlite3* object in ":memory:". This one works fine.
>
> 2) Allocate and init my derived sqlite3_module* object.
> 2.1) Function pointers I use are Create,Destroy, Connect,Disconnect,
> Open,Close, BestIndex,Filter, Next,Eof, Column,Rowid.
> 2.2) NULL pointers for Update, Begin,Sync,Commit,Rollback,
> FindFunction,Rename, Savepoint,Release,RollbackTo.
> 2.3) *** ? *** Are there any other members not documented I'd need to init?
>
> 3) I call sqlite3_create_module() with module name "vtable1"
>
> 4) I call sqlite3_declare_vtab() with this statement: "CREATE TABLE foo ( x
> integer, y integer );"
>
> --> That one fails with ERROR 21 (lib misuse).
>
> <
>
>
> Did I forget a step? Did I do them in the wrong order? Does anything seem
> wrong to you? Where should I go to get more details and fix the issue?
> My current implementation is rather complex and involves C++ and templates
> a lot. Making a "small example" would be highly time-consuming, so I'd
> prefer to avoid going that way if possible.
>
> Anyone can tell me where I'd find a simple functionnal vtable "hello
> world"? The vtable documentation is great but is missing examples, imo.
> Personally, I think the vtable documentation page could benefit from one
> example project throughout and develop it. Something like a basic CSV
> vtable would probably help the masses a lot! :)
>
> Thanks for your help and time,
> Simon


Re: [sqlite] virtual table end of insert

2013-09-11 Thread Simon Slavin

On 12 Sep 2013, at 1:27am, Simon Slavin  wrote:

> I suspect that in your place I would explicitly execute BEGIN and COMMIT, and 
> use xCommit to tell that the update was finished.

Sorry, in case it's not obvious, ignore what I wrote and do whatever Dr Hipp 
wrote.  He knows much better than I.

Simon.


Re: [sqlite] virtual table end of insert

2013-09-11 Thread Simon Slavin

On 12 Sep 2013, at 1:16am, E. Timothy Uy  wrote:

> In a virtual table, an insert of multiple rows calls xUpdate multiple
> times. How can I tell when the entire insert is complete?
> 
> e.g.,
> INSERT INTO myvirtualtable (token, rank) VALUES (..,..), (..,..),(..,..)
> 
> will call xUpdate 3 times. But I would like to know when the whole thing is
> done so that I can do something

Does SQLite always call xBegin and xCommit even if you execute UPDATE without 
explicit BEGIN and COMMIT commands?  No, the documentation suggests it doesn't 
(but I may have misinterpreted).

> or is my only option sending a command?

I suspect that in your place I would explicitly execute BEGIN and COMMIT, and 
use xCommit to tell that the update was finished.

Simon.


Re: [sqlite] virtual table end of insert

2013-09-11 Thread Richard Hipp
On Wed, Sep 11, 2013 at 8:16 PM, E. Timothy Uy  wrote:

> In a virtual table, an insert of multiple rows calls xUpdate multiple
> times. How can I tell when the entire insert is complete?
>

I think the xRelease method of the virtual table object gets called when
the statement finishes.


>
> e.g.,
> INSERT INTO myvirtualtable (token, rank) VALUES (..,..), (..,..),(..,..)
>
> will call xUpdate 3 times. But I would like to know when the whole thing is
> done so that I can do something
>
> or is my only option sending a command?



-- 
D. Richard Hipp
d...@sqlite.org


Re: [sqlite] Virtual Table xBestIndex and NULL Search Conditions (Bug!)

2011-08-02 Thread Simon Slavin

On 2 Aug 2011, at 9:06am, Igor Sereda wrote:

> Thanks, Simon - it's a bug then. Hope it will get pulled into the bug
> tracker.

> Simon Slavin-3 wrote:
> 
>> The spec for '<=' should say that comparing any number with NULL always
>> gives a NULL result.  If SQLite is doing anything apart from that, it's a
>> bug.
>> 
>> Okay, here it is: SQL92 8.2 (1) (a):
>> 
>> "If XV or YV is the null value, then "X  Y" is unknown."
>> 
>> In this context, returning 'unknown' means returning NULL.

You know, I might have to withdraw the last part of that.  The SQL92 spec talks 
about truth functions having three possible answers: FALSE, TRUE, and UNKNOWN.  
It treats UNKNOWN as a reserved word in the same way as it treats TRUE, and 
contains statements like

"If "R   T" is neither true nor false,
  then it is unknown."

and the truth tables in section 8.12 give results for TRUE, FALSE, and UNKNOWN. 
 And nowhere can I find any statement that says UNKNOWN means NULL.  And I can 
see a distinction between the two: NULL means "I don't have that datum" whereas 
UNKNOWN means "I can't work that out".

But I don't think SQLite implements UNKNOWN.  So we need someone intimately 
familiar with the guts of SQLite, which I'm not.

Either way, I don't think it should be possible to come up with a well-ordering 
system that includes NULL, numbers and strings and stays within the SQL 
standard.  The big question is whether any of the big users of SQLite is 
depending on this behaviour.

Simon.


Re: [sqlite] Virtual Table xBestIndex and NULL Search Conditions (Bug!)

2011-08-02 Thread Igor Sereda

Thanks, Simon - it's a bug then. Hope it will get pulled into the bug
tracker.

Igor



Simon Slavin-3 wrote:
> 
> 
> On 2 Aug 2011, at 1:10am, Igor Sereda wrote:
> 
>> To my humble knowledge, operations with NULL have well-defined semantics,
>> both in SQL-you-name-it standards and in SQLite. "A < B" may have three
>> results - TRUE, FALSE and NULL. It doesn't matter whether you can make
>> any
>> sense of it - it's the spec ;)
> 
> The spec for '<=' should say that comparing any number with NULL always
> gives a NULL result.  If SQLite is doing anything apart from that, it's a
> bug.
> 
> Okay, here it is: SQL92 8.2 (1) (a):
> 
> "If XV or YV is the null value, then "X  Y" is unknown."
> 
> In this context, returning 'unknown' means returning NULL.
> 
> Simon.

-- 
View this message in context: 
http://old.nabble.com/Virtual-Table-xBestIndex-and-NULL-Search-Conditions-%28Bug-%29-tp32172549p32175828.html
Sent from the SQLite mailing list archive at Nabble.com.



Re: [sqlite] Virtual Table xBestIndex and NULL Search Conditions (Bug?)

2011-08-01 Thread Simon Slavin

On 2 Aug 2011, at 1:10am, Igor Sereda wrote:

> To my humble knowledge, operations with NULL have well-defined semantics,
> both in SQL-you-name-it standards and in SQLite. "A < B" may have three
> results - TRUE, FALSE and NULL. It doesn't matter whether you can make any
> sense of it - it's the spec ;)

The spec for '<=' should say that comparing any number with NULL always gives a 
NULL result.  If SQLite is doing anything apart from that, it's a bug.

Okay, here it is: SQL92 8.2 (1) (a):

"If XV or YV is the null value, then "X  Y" is unknown."

In this context, returning 'unknown' means returning NULL.

Simon.


Re: [sqlite] Virtual Table xBestIndex and NULL Search Conditions (Bug?)

2011-08-01 Thread Igor Sereda

Simon, Michael -

To my humble knowledge, operations with NULL have well-defined semantics,
both in SQL-you-name-it standards and in SQLite. "A < B" may have three
results - TRUE, FALSE and NULL. It doesn't matter whether you can make any
sense of it - it's the spec ;)

Therefore I'm trying to report a bug here according to guideline at
http://www.sqlite.org/src/wiki?name=Bug+Reports - I would very much like to
hear from SQLite developers whether this report makes sense or if additional
information is needed.

Cheers,
Igor




Simon Slavin-3 wrote:
> 
> 
> On 1 Aug 2011, at 10:45pm, Black, Michael (IS) wrote:
> 
>> If it's meaningless then shouldn't it be a syntax error?
> 
> It's about as meaningless as
> 
> X <= maxreal
> 
> so it would take quite a lot of processing time to identify it as
> meaningless.  Not sure if it's worth the processing time.  Any decent
> debugging effort should find the problem.
> 
> Simon.

-- 
View this message in context: 
http://old.nabble.com/Virtual-Table-xBestIndex-and-NULL-Search-Conditions-%28Bug-%29-tp32172549p32174172.html
Sent from the SQLite mailing list archive at Nabble.com.



Re: [sqlite] Virtual Table xBestIndex and NULL Search Conditions (Bug?)

2011-08-01 Thread Simon Slavin

On 1 Aug 2011, at 10:45pm, Black, Michael (IS) wrote:

> If it's meaningless then shouldn't it be a syntax error?

It's about as meaningless as

X <= maxreal

so it would take quite a lot of processing time to identify it as meaningless.  
Not sure if it's worth the processing time.  Any decent debugging effort 
should find the problem.

Simon.


Re: [sqlite] Virtual Table xBestIndex and NULL Search Conditions (Bug?)

2011-08-01 Thread Black, Michael (IS)
If it's meaningless then shouldn't it be a syntax error?





Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate

From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Simon Slavin [slav...@bigfraud.org]
Sent: Monday, August 01, 2011 4:43 PM
To: General Discussion of SQLite Database
Subject: EXT :Re: [sqlite] Virtual Table xBestIndex and NULL Search Conditions 
(Bug?)


On 1 Aug 2011, at 9:47pm, Igor Sereda wrote:

> So - who else thinks it's a bug?

The SQL standard says 'NULL' means 'I don't know' or 'value missing' or 
something of the kind.  So using a comparison like

X > NULL

doesn't mean anything, since there can't be a well-ordering principle for a 
missing value when there's nothing to compare it to.

By the way, the recommended way to do what the original line of code does seems 
to be to use 'typeof(X)'.  But I don't know how a beginner SQLite user is 
expected to know that.

Simon.


Re: [sqlite] Virtual Table xBestIndex and NULL Search Conditions (Bug?)

2011-08-01 Thread Simon Slavin

On 1 Aug 2011, at 9:47pm, Igor Sereda wrote:

> So - who else thinks it's a bug?

The SQL standard says 'NULL' means 'I don't know' or 'value missing' or 
something of the kind.  So using a comparison like

X > NULL

doesn't mean anything, since there can't be a well-ordering principle for a 
missing value since there's nothing to compare it to.

By the way, the recommended way to do what the original line of code does seems 
to be to use 'typeof(X)'.  But I don't know how a beginner SQLite user is 
expected to know that.

Simon.


Re: [sqlite] Virtual Table xBestIndex and NULL Search Conditions (Bug?)

2011-08-01 Thread Igor Sereda

Thanks Jay,

That's a good hint about the origin of the problem. 

However, you refer to the sort order, but the problem is with WHERE
statement. Since numeric comparison with NULL always evaluates to NULL (see
section 4.0 of the link you gave me), a statement like "SELECT * FROM table
WHERE value > NULL" would return an empty result set on any table --
*always* -- whereas "SELECT * FROM table WHERE value IS NOT NULL" would
return all rows with non-null value. 

That works on the normal tables, but it's probably broken on the virtual
tables in the latest version.

I've checked what's happening on 3.7.4: when parsing "SELECT value FROM
table WHERE value IS NOT NULL", the xBestIndex method receives no
constraints, which, I believe, is the correct thing:

pIdxInfo->nConstraint == 0

So - who else thinks it's a bug?

Cheers
Igor




Jay A. Kreibich-2 wrote:
> 
> On Mon, Aug 01, 2011 at 12:34:33PM -0700, Igor Sereda scratched on the
> wall:
>> 
>> Hello,
>> 
>> I'm seeing strange input given into xBestIndex method of my virtual
>> table.
>> 
>> I'm maintaining sqlite4java wrapper and I'm trying to upgrade it from
>> SQLite
>> 3.7.4 to 3.7.7.1. A couple of failed tests uncovered that there's a
>> problem
>> when searching a simple virtual table with constraints that contain NULL. 
>> 
>> More specifically, the virtual table is declared as follows in xCreate
>> method:
>>
>>  CREATE TABLE x(value INTEGER)
>> 
>> When the following SQL is executed:
>> 
>>  SELECT value FROM table WHERE value IS NOT NULL
>> 
>> , xBestIndex receives the following parameters:
>> 
>> pIdxInfo->nConstraint == 1
>> pIdxInfo->aConstraint[0].usable == 1
>> pIdxInfo->aConstraint[0].iColumn == 0
>> pIdxInfo->aConstraint[0].op == 4 (GT)
>> 
>> So basically the search is going to be for condition "value > ?".
>>
>> When xFilter is called, the value passed is NULL. So instead of searching
>> for "value IS NOT NULL" the module is instructed to search for "value >
>> NULL" - which gives the opposite result.  And when SQL executed is
>> "SELECT
>> value FROM table WHERE value > NULL", all the parameters are identical.
> 
>   All values in SQLite have a consistent sort order.  As section 3.1 of
>   http://sqlite.org/datatype3.html#comparisons shows, NULL is considered
>   to be the "smallest" value.  Hence, "value > NULL" is equivalent to
>   "value IS NOT NULL".
> 
>   It might not be the most obvious logic, but it is the logic used by
>   SQLite and the query optimizer, so it is the logic that needs to be
>   used by any virtual table.
> 
>> This problem did not exist in SQLite 3.7.4.
> 
>   What did earlier versions do?
> 
>> Do I miss something or is this a bug? 
> 
>   I assume it is a change in the query optimizer.  Since this is a
>   legit way to express an IS NOT NULL, it isn't exactly "wrong", just
>   different.
> 
>-j
> 
> -- 
> Jay A. Kreibich < J A Y  @  K R E I B I.C H >
> 
> "Intelligence is like underwear: it is important that you have it,
>  but showing it to the wrong people has the tendency to make them
>  feel uncomfortable." -- Angela Johnson
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Virtual-Table-xBestIndex-and-NULL-Search-Conditions-%28Bug-%29-tp32172549p32173021.html
Sent from the SQLite mailing list archive at Nabble.com.



Re: [sqlite] Virtual Table xBestIndex and NULL Search Conditions (Bug?)

2011-08-01 Thread Jay A. Kreibich
On Mon, Aug 01, 2011 at 12:34:33PM -0700, Igor Sereda scratched on the wall:
> 
> Hello,
> 
> I'm seeing strange input given into xBestIndex method of my virtual table.
> 
> I'm maintaining sqlite4java wrapper and I'm trying to upgrade it from SQLite
> 3.7.4 to 3.7.7.1. A couple of failed tests uncovered that there's a problem
> when searching a simple virtual table with constraints that contain NULL. 
> 
> More specifically, the virtual table is declared as follows in xCreate
> method:
>
>  CREATE TABLE x(value INTEGER)
> 
> When the following SQL is executed:
> 
>  SELECT value FROM table WHERE value IS NOT NULL
> 
> , xBestIndex receives the following parameters:
> 
> pIdxInfo->nConstraint == 1
> pIdxInfo->aConstraint[0].usable == 1
> pIdxInfo->aConstraint[0].iColumn == 0
> pIdxInfo->aConstraint[0].op == 4 (GT)
> 
> So basically the search is going to be for condition "value > ?".
>
> When xFilter is called, the value passed is NULL. So instead of searching
> for "value IS NOT NULL" the module is instructed to search for "value >
> NULL" - which gives the opposite result.  And when SQL executed is "SELECT
> value FROM table WHERE value > NULL", all the parameters are identical.

  All values in SQLite have a consistent sort order.  As section 3.1 of
  http://sqlite.org/datatype3.html#comparisons shows, NULL is considered
  to be the "smallest" value.  Hence, "value > NULL" is equivalent to
  "value IS NOT NULL".

  It might not be the most obvious logic, but it is the logic used by
  SQLite and the query optimizer, so it is the logic that needs to be
  used by any virtual table.
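
Given that ordering, a virtual table that wants to honor a GT constraint in
xFilter() has to treat a NULL right-hand side as matching every non-NULL row.
Below is a self-contained sketch of just that comparison logic; the helper name
and the bool/double modelling are illustrative assumptions, not SQLite API —
real code would check sqlite3_value_type(argv[0]) for SQLITE_NULL.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch only: models SQLite's total ordering, in which NULL sorts below
 * every other value, so "value > NULL" behaves like "value IS NOT NULL".
 * row_is_null / rhs_is_null flag NULL operands; non-NULL values compare
 * numerically.  Hypothetical helper, not part of the SQLite API. */
static bool vt_matches_gt(bool row_is_null, double row_value,
                          bool rhs_is_null, double rhs_value)
{
    if (row_is_null)
        return false;              /* NULL is never greater than anything */
    if (rhs_is_null)
        return true;               /* any non-NULL value sorts above NULL */
    return row_value > rhs_value;  /* ordinary numeric comparison */
}
```

With this rule, a GT constraint whose bound value arrives as NULL in xFilter()
simply degenerates into an IS NOT NULL filter, which is exactly the rewrite the
optimizer performed.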

> This problem did not exist in SQLite 3.7.4.

  What did earlier versions do?

> Do I miss something or is this a bug? 

  I assume it is a change in the query optimizer.  Since this is a
  legit way to express an IS NOT NULL, it isn't exactly "wrong", just
  different.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson


Re: [sqlite] Virtual Table and the xBestIndex

2011-02-26 Thread Dariusz Matkowski
Hi all,
Where can I get an example of xFilter implementation? The issue I am having 
here is that when I am trying to execute a "select * from tablex where value=1" 
all I get is the last result from the file (336cc2282716::1).


int my_filter( sqlite3_vtab_cursor* pCursor, int idxNum, const char *idxStr,
int argc, sqlite3_value **argv )
{
csv_cursor * cur = (csv_cursor *) pCursor;
csv_vtab   * tab = (csv_vtab *) cur->cursor.pVtab;

csvfile_rewind( tab->csv );
csvfile_read( tab->csv );   /* skip first row with column names */
cur->csvpos = csvfile_tell( tab->csv );
csvfile_read( tab->csv );   /* read first data row */

return SQLITE_OK;
}

***first row in the file not important
ed527dc53686::1
336cc2282716::1
41f955b0::0
00225828815a::1
00271368ace9::0
f0def10e9046::0
70b1e68e9e67::0
ed527dc53686::1
336cc2282716::1

When I do "select * from tablex" I get all the results fine. It is only when I 
use the where clause that I get strange results.


Any hints will be greatly appreciated.

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Jay A. Kreibich
Sent: Friday, February 25, 2011 11:00 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Virtual Table and the xBestIndex

On Fri, Feb 25, 2011 at 09:30:00PM -0500, Dariusz Matkowski scratched on the 
wall:

> I am trying to implement a simple virtual table that reads in a file
> with two columns and "::" separator. My question is: is it necessary
> to implement xBestIndex to run simple where clause queries?

  You must implement a xBestIndex() function, but it does not need to
  do anything other than return SQLITE_OK.  This will result in a full
  table scan for each query, but for something like a flat file (where
  you cannot easily jump to specific records) that's usually not a
  significant limitation.

-j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson

-
This transmission (including any attachments) may contain confidential 
information, privileged material (including material protected by the 
solicitor-client or other applicable privileges), or constitute non-public 
information. Any use of this information by anyone other than the intended 
recipient is prohibited. If you have received this transmission in error, 
please immediately reply to the sender and delete this information from your 
system. Use, dissemination, distribution, or reproduction of this transmission 
by unintended recipients is not authorized and may be unlawful.


Re: [sqlite] Virtual Table and the xBestIndex

2011-02-26 Thread Dariusz Matkowski
Thank you very much. Now that you've cleared that up for me, I need to look in my 
code to see why a simple query like "select * from table x" works fine but, as soon 
as I call "select * from table x where value=1", it returns only the last line from 
the file.

Thank you
Darek

-Original Message-
From: sqlite-users-boun...@sqlite.org [mailto:sqlite-users-boun...@sqlite.org] 
On Behalf Of Jay A. Kreibich
Sent: Friday, February 25, 2011 11:00 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Virtual Table and the xBestIndex

On Fri, Feb 25, 2011 at 09:30:00PM -0500, Dariusz Matkowski scratched on the 
wall:

> I am trying to implement a simple virtual table that reads in a file
> with two columns and "::" separator. My question is: is it necessary
> to implement xBestIndex to run simple where clause queries?

  You must implement a xBestIndex() function, but it does not need to
  do anything other than return SQLITE_OK.  This will result in a full
  table scan for each query, but for something like a flat file (where
  you cannot easily jump to specific records) that's usually not a
  significant limitation.

-j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson



Re: [sqlite] Virtual Table and the xBestIndex

2011-02-25 Thread Jay A. Kreibich
On Fri, Feb 25, 2011 at 09:30:00PM -0500, Dariusz Matkowski scratched on the 
wall:

> I am trying to implement a simple virtual table that reads in a file
> with two columns and "::" separator. My question is: is it necessary
> to implement xBestIndex to run simple where clause queries?

  You must implement a xBestIndex() function, but it does not need to
  do anything other than return SQLITE_OK.  This will result in a full
  table scan for each query, but for something like a flat file (where
  you cannot easily jump to specific records) that's usually not a
  significant limitation.
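
A do-nothing xBestIndex() of the kind described above can be sketched as
follows. The stub typedefs and the SQLITE_OK define stand in for the real
declarations in sqlite3.h so the fragment is self-contained; the function name
is a hypothetical example.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the real declarations in sqlite3.h (assumption for this
 * sketch only; real code would #include <sqlite3.h> instead). */
#define SQLITE_OK 0
typedef struct sqlite3_vtab sqlite3_vtab;
typedef struct sqlite3_index_info sqlite3_index_info;

/* Minimal xBestIndex: accept every query plan unchanged.  Consuming no
 * constraints and promising no ordering forces a full table scan, which
 * is acceptable for a flat file that cannot seek to specific records. */
static int csv_best_index(sqlite3_vtab *tab, sqlite3_index_info *info)
{
    (void)tab;
    (void)info;
    return SQLITE_OK;
}
```

All filtering then happens in the core: SQLite walks every row via
xNext()/xColumn() and evaluates the WHERE conditions itself.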

-j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson


Re: [sqlite] Virtual table optimization using xBestIndex/xFilter

2010-12-20 Thread Sören Brunk
Thanks Jay and Roger, for your explanations and ideas.

I didn't think of passing my own data structure to xFilter using idxStr before, 
but in my case that seems to be the easiest way.

In xBestIndex I'm allocating enough memory to hold information about all 
usable constraints, using sqlite3_malloc. Then I'm packing a struct containing 
iColumn and op into that memory for each usable constraint and I'm using
idxinfo->aConstraintUsage[x].argvIndex to get the right-hand side value for 
each constraint in the same order as the constraint information in xFilter.
Finally I'm setting needToFreeIdxStr to make sure all memory is freed.
It seems to work fine that way.
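
The flat-buffer pattern described above can be sketched as a single allocation
holding a count header plus one fixed-size record per usable constraint, so
that SQLite can release an abandoned plan with one sqlite3_free() when
needToFreeIdxStr is set. Plain malloc() and the struct layout below are
illustrative assumptions standing in for sqlite3_malloc() and whatever fields
a real module needs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One record per usable constraint, copied from sqlite3_index_info. */
typedef struct {
    int iColumn;          /* aConstraint[i].iColumn */
    unsigned char op;     /* aConstraint[i].op, e.g. 4 for GT */
} PackedConstraint;

/* xBestIndex side: pack a count header plus the records into one flat
 * buffer.  Real code would use sqlite3_malloc() and assign the result
 * to pIdxInfo->idxStr with needToFreeIdxStr = 1. */
static char *pack_constraints(const PackedConstraint *src, int n)
{
    size_t size = sizeof(int) + (size_t)n * sizeof(PackedConstraint);
    char *buf = malloc(size);
    if (!buf) return NULL;
    memcpy(buf, &n, sizeof(int));                 /* header: record count */
    memcpy(buf + sizeof(int), src, (size_t)n * sizeof(PackedConstraint));
    return buf;
}

/* xFilter side: recover the count, then read each record in the same
 * order the argvIndex values were assigned. */
static int unpack_count(const char *idxStr)
{
    int n;
    memcpy(&n, idxStr, sizeof(int));
    return n;
}
```

Keeping the buffer "flat" (no internal pointers) matters because SQLite only
frees the top-level idxStr pointer.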

Regards,
Sören


Re: [sqlite] Virtual table optimization using xBestIndex/xFilter

2010-12-19 Thread Jay A. Kreibich
On Sun, Dec 19, 2010 at 04:43:05AM -0800, Roger Binns scratched on the wall:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA1
> 
> On 12/19/2010 04:21 AM, Sören Brunk wrote:
> > I'm wondering if there is any way to pass that additional information to
> > xFilter besides encoding it into idxNum/idxStr somehow.
> 
> That is the mechanism to use.  Remember that internally SQLite uses only
> one index, hence xBestIndex to find "the" index and xFilter not needing
> much extra information to work with the selected index.

  The "one-index-per-table-per-query" rule of thumb is a limitation of
  the B-Tree storage system and standard indexes, not of SQL or SQLite
  inherently.

  Because virtual tables use their own storage systems (by layering
  together multiple standard tables, or using some external data source),
  the rule does not apply unless the VT chooses to make it apply in its 
  xBestIndex() function.  For example, the whole point of the R-Tree
  virtual table is that it can use multiple internal "indexes" to
  quickly find what it is looking for.

  The name of the xBestIndex() function is also a bit confusing.  It is
  really about processing query constraints, not indexes.  It just
  happens that, within SQL, the standard tool to efficiently process
  constraints is to use indexes.  But it is really about constraints,
  and if you're dealing with a VT that might have multiple shadow tables
  (possibly with multiple indexes) or is using an external data store,
  there is really no say about how many "indexes" may or may not be used
  to process a series of constraints within a single query upon a VT.

  What it all boils down to is using constraint information to
  short-cut as much data processing as possible, which is more or less
  saying that every performance-related VT needs to implement its own
  custom optimizer.  That can get pretty complex, but that's why many
  VTs tend to target a very specific style of data storage and
  query, and why the best known VTs are products like FTS and R-Tree.
  
   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson


Re: [sqlite] Virtual table optimization using xBestIndex/xFilter

2010-12-19 Thread Jay A. Kreibich
On Sun, Dec 19, 2010 at 01:21:03PM +0100, Sören Brunk scratched on the wall:

> In xBestIndex, each constraint contains the column on the left-hand side
> (iColumn) and the operator (op). If I understood correctly, xFilter gets
> the value on the right-hand side of each constraint you chose to use.

  Correct.

> But I need to know not only that value in xFilter, but also the column
> on the left-hand side and the operator.

  Column indexes are not passed into xFilter().  From inside xFilter(),
  there is no way to match a given value to a specific column.  You
  have to know the order in which the column values are passed in.

  That ordering is determined by your code in xBestIndex().  Inside
  xBestIndex(), the values you set in 
  idxinfo->aConstraintUsage[x].argvIndex will determine where the
  idxinfo->aConstraint[x].iColumn value is passed into xFilter().

  So the two functions must agree on how and where all those values are
  setup, or you must somehow convey the information via idxNum/idxStr.

> I'm wondering if there is any way to pass that additional information to
> xFilter besides encoding it into idxNum/idxStr somehow.

  Not easily.

  There may be multiple tables in-flight at a given time, so any kind
  of global storage outside the function parameters can get quite
  tricky.  By far, the easiest thing is to pack everything into those
  two parameters.
  
  xBestIndex() can be called multiple times to test multiple query
  plans.  This means anything you allocate inside of xBestIndex() may
  get abandoned.  Again, without some complex logic to deal with
  external memory stores, the safest thing to do is single allocations
  with sqlite3_malloc() that are passed via idxStr.  If the memory is
  allocated with sqlite3_malloc(), then SQLite will properly deallocate
  any abandoned query plans.  But the allocation must be one big chunk
  (it cannot be a multi-level or "deep" data structure with pointers to
  other data structures) because SQLite will only call sqlite3_free()
  on the top level idxStr pointer.

  As explained on page 267, the most typical usage is for "internal"
  modules that are using normal SQL statements on "shadow tables" to
  implement the virtual table.  In that case, xBestIndex() can build
  SQL statement strings, complete with indexed SQL statement
  parameters, such as "?1", "?2", etc.  These SQL command strings can
  be passed via idxStr, and the parameter indexes can be matched to
  table column indexes by setting the argvIndex numbers.  Then,
  xFilter() only need to prepare the given idxStr command strings and
  bind argv[0] to parameter index 1, argv[1] to parameter index 2,
  and so on (remember that parameter indexes start with 1), and run
  the statements.  In that situation, most of the actual logic happens
  in xBestIndex().
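
The "internal" module pattern Jay describes boils down to string assembly:
xBestIndex() emits one "<column> <op> ?N" term per consumed constraint, with
the ?N parameter numbers matching the gap-free argvIndex values. A
self-contained sketch of just that assembly step is below; the column-name
table, operator strings, and fixed-size output buffer are illustrative
assumptions, not SQLite API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a WHERE clause for xFilter() to prepare against a shadow table.
 * cols[i] is the constrained column index, ops[i] its operator text, and
 * parameter ?i+1 will be bound to argv[i] inside xFilter(). */
static void build_where(char *out, size_t cap,
                        const char *const *colnames,
                        const int *cols, const char *const *ops, int n)
{
    size_t len = 0;
    out[0] = '\0';
    for (int i = 0; i < n; i++) {
        len += (size_t)snprintf(out + len, cap - len, "%s%s %s ?%d",
                                i ? " AND " : "WHERE ",
                                colnames[cols[i]], ops[i], i + 1);
    }
}
```

xBestIndex() would pass the finished string via idxStr; xFilter() then only
needs to prepare it, bind argv[0] to parameter 1, argv[1] to parameter 2, and
so on.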

  This doesn't work so well with "external" modules, where you often
  need to pass a great deal more state than a simple string.  If you
  need to pass a data structure, you can use the idxStr pointer.  Just
  remember that you should allocate the structure as a single call to
  sqlite3_malloc().  That means keeping the data structure "flat",
  either by using static sizes or hand-packing the memory.

  You also can't bind every column value to its own column index via the
  argvIndex (that is, make all the argv[] indexes match the column
  indexes), because the argvIndex values are not allowed to have gaps.

  I'd love to point you at some examples, but I'm not sure there are
  any significant "external" style modules that provide solid examples
  of xBestIndex() and xFilter().  External modules are typically fairly
  custom in nature, and I'm not sure there is a very established design
  pattern, as there is with "internal" style modules.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"Intelligence is like underwear: it is important that you have it,
 but showing it to the wrong people has the tendency to make them
 feel uncomfortable." -- Angela Johnson


Re: [sqlite] Virtual table optimization using xBestIndex/xFilter

2010-12-19 Thread Roger Binns
On 12/19/2010 04:21 AM, Sören Brunk wrote:
> I'm wondering if there is any way to pass that additional information to
> xFilter besides encoding it into idxNum/idxStr somehow.

That is the mechanism to use.  Remember that internally SQLite uses only
one index, hence xBestIndex to find "the" index and xFilter not needing
much extra information to work with the selected index.

You could probably do something like have the hex address of a data
structure as the string.

Roger


Re: [sqlite] Virtual table sample code

2008-06-24 Thread Alex Katebi
In sqlite source code. src/test_schema.c

On Tue, Jun 24, 2008 at 4:57 PM, <[EMAIL PROTECTED]> wrote:

> Is there any example source code available that demonstrates a working
> virtual table implementation?
>
> Thanks
> Dan Winslow, GamePlan
> 402-991-5875 x219
> [EMAIL PROTECTED]
>
> Third Nerd from the left,Technology Defenestration Office
>
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


Re: [sqlite] Virtual table used to query big external database

2008-04-03 Thread Scott Hess
You can _generally_ work around this kind of problem using sub-selects. So 
instead of:

SELECT x FROM fts_table WHERE fts_table MATCH "y" OR fts_table MATCH "z";

You might write:

SELECT x FROM fts_table WHERE rowid IN (SELECT rowid FROM fts_table WHERE 
fts_table MATCH "y") OR rowid IN (SELECT rowid FROM fts_table WHERE 
fts_table MATCH "z");

Of course, you could probably phrase it eleven different ways, I'm not 
suggesting that the above is the most efficient way, just that you can 
sometimes perturb your query to make things work better :-). Note that the 
above change would work best if you don't expect many hits.

Also, I'm making no representation about the state of SQLite's ability to 
optimize the original query (you'd rather have written it fts_table MATCH "y 
OR z" from the get-go). But if the original poster is already thinking of 
using a virtual table to give access to masses of external data, writing 
queries in this somewhat more complicated fashion may make a lot of sense.

-scott



2008/4/1 Ben Harper <[EMAIL PROTECTED]>:
> The only limitation imposed by SQL that I can think of would be
> inefficiencies in its query plan builder. That is the part that
> figures out how to use the indexes available in the database in order
> to execute the SQL query most efficiently. So it really depends on
> what type of SQL queries you are going to be running against this huge
> DB.
> The dangers are easy to evaluate:
> Create a quick-and-dirty dummy virtual table mechanism, and respond to
> the xBestIndex/xFilter functions. You will notice that for certain
> queries, xBestIndex/xFilter does not get used. That means that the
> SQLite engine is going to have to walk through your entire table,
> evaluating the conditions on each field. This is obviously what you
> wish to avoid. As an example, I noticed briefly (I did not investigate
> thoroughly) that having an OR condition in a query would prevent the
> indexes from being used. That was some time ago, and it was before the
> rewrite of the SQL VM, so I don't know if that still applies. You'll
> have to investigate your potential queries yourself. A simple query
> such as "WHERE myvalue > 100" should definitely invoke the use of your
> own indexes.
> 
> Ben
> 
> 
> 
> 
> On Tue, Apr 1, 2008 at 10:12 PM, Aladdin Lampé <[EMAIL PROTECTED]> 
wrote:
> >
> > Hi all!
> >
> > Very often, when people ask this list why they have trouble managing in 
sqlite a "big table" (50 million lines or more than 10 GB), they are told 
that sqlite is an embedded database and is not meant to be used for very big 
databases/tables.
> >
> > I'm currently in the process of designing a specific, read-only, sqlite 
"virtual table" in order to enable sqlite to access data stored in an 
external database which is specially designed to handle very big tables.
> >
> > My final objective is to be able to easily query a big external table 
(stored in another database) through the - excellent - sqlite interface.
> >
> > Now I have this terrible doubt: will the existing sqlite "limitations" 
for big sqlite tables also apply to my read-only virtual tables?
> >
> > Thus... am I currently losing my time developing such a "virtual table" 
with this objective in mind? Or is there a better way to achieve my 
objective?
> >
> > Thank you for your help!
> >
> > _
> > Votre contact a choisi Hotmail, l'e-mail ultra sécurisé. Créez un compte 
gratuitement !
> > http://www.windowslive.fr/hotmail/default.asp
> > ___
> > sqlite-users mailing list
> > sqlite-users@sqlite.org
> > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> >
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> 


Re: [sqlite] Virtual table used to query big external database

2008-04-02 Thread Mike Owens
On Wed, Apr 2, 2008 at 11:59 AM, Evans, Mark (Tandem) <[EMAIL PROTECTED]> wrote:

>  One other consideration:  If the query or update has to walk a large
>  range of rows, there's no way for the core to tell the VTM
>  that it's done accessing a given row as it sweeps the cursor
>  forward.  You can end up with a huge number of virtual table
>  rows in memory.

I was under the impression that the very act of sweeping the cursor
forward via xNext() means that the previous row is no longer needed,
therefore the VT can safely deallocate any resources associated with
the previous row. I would think that a VT need not ever end up with
any old/stale rows in memory, unless of course its particular
implementation called for it (e.g. caching old rows).
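
Mike's point can be sketched concretely: the call to xNext() is the moment the
previous row's resources become dead, so a cursor can free them immediately
before loading the next row. The cursor struct and in-memory row source below
are mock stand-ins, not the SQLite cursor API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Mock cursor: `current` holds a heap copy of the row being scanned. */
typedef struct {
    const char *const *rows;  /* mock external data source */
    int nrow;
    int pos;                  /* -1 before the first row */
    char *current;
} MockCursor;

static char *dup_str(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p) memcpy(p, s, n);
    return p;
}

/* xNext-style advance: release the previous row, then load the next.
 * Returns 1 while rows remain, 0 at end of data (where xEof would
 * start reporting true). */
static int mock_next(MockCursor *cur)
{
    free(cur->current);       /* previous row is no longer needed */
    cur->current = NULL;
    cur->pos++;
    if (cur->pos >= cur->nrow)
        return 0;
    cur->current = dup_str(cur->rows[cur->pos]);
    return 1;
}
```

Freeing eagerly like this keeps at most one row's worth of state alive per
cursor, regardless of how many rows the query sweeps over.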

-- Mike



>  Regards,
>  Mark
>
>
>
>
>  > -Original Message-
>  > From: [EMAIL PROTECTED]
>  > [mailto:[EMAIL PROTECTED] On Behalf Of Ben Harper
>  > Sent: Tuesday, April 01, 2008 4:53 PM
>  > To: General Discussion of SQLite Database
>  > Subject: Re: [sqlite] Virtual table used to query big
>  > external database
>  >
>
>
> > The only limitation imposed by SQL that I can think of would
>  > be inefficiencies in its query plan builder. That is the part
>  > that figures out how to use the indexes available in the
>  > database in order to execute the SQL query most efficiently.
>  > So it really depends on what type of SQL queries you are
>  > going to be running against this huge DB.
>  > The dangers are easy to evaluate:
>  > Create a quick-and-dirty dummy virtual table mechanism, and
>  > respond to the xBestIndex/xFilter functions. You will notice
>  > that for certain queries, xBestIndex/xFilter does not get
>  > used. That means that the SQLite engine is going to have to
>  > walk through your entire table, evaluating the conditions on
>  > each field. This is obviously what you wish to avoid. As an
>  > example, I noticed briefly (I did not investigate
>  > thoroughly) that having an OR condition in a query would
>  > prevent the indexes from being used. That was some time ago,
>  > and it was before the rewrite of the SQL VM, so I don't know
>  > if that still applies. You'll have to investigate your
>  > potential queries yourself. A simple query such as "WHERE
>  > myvalue > 100" should definitely invoke the use of your own indexes.
>  >
>  > Ben
>  >
>  >
>  > On Tue, Apr 1, 2008 at 10:12 PM, Aladdin Lampé
>  > <[EMAIL PROTECTED]> wrote:
>  > >
>  > > Hi all!
>  > >
>  > > Very often, when people ask this list why they have trouble
>  > managing in sqlite a "big table" (50 million lines or more
> than 10 GB), they are told that sqlite is an embedded
>  > database and is not meant to be used for very big databases/tables.
>  > >
>  > > I'm currently in the process of designing a specific,
>  > read-only, sqlite "virtual table" in order to enable sqlite
>  > to access data stored in an external database which is
>  > specially designed to handle very big tables.
>  > >
>  > > My final objective is to be able to easily query a big
>  > external table (stored in another database) through the -
>  > excellent - sqlite interface.
>  > >
>  > > Now I have this terrible doubt: will the existing sqlite
>  > "limitations" for big sqlite tables also apply to my
>  > read-only virtual tables?
>  > >
>  > > Thus... am I currently losing my time developing such a
>  > "virtual table" with this objective in mind? Or is there a
>  > better way to achieve my objective?
>  > >
>  > > Thank you for your help!
>  > >
>  > > _
>  > > Votre contact a choisi Hotmail, l'e-mail ultra sécurisé.
>  > Créez un compte gratuitement !
>  > > http://www.windowslive.fr/hotmail/default.asp
>  > > ___
>  > > sqlite-users mailing list
>  > > sqlite-users@sqlite.org
>  > > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>  > >
>  > ___
>  > sqlite-users mailing list
>  > sqlite-users@sqlite.org
>  > http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>  >
>  ___
>  sqlite-users mailing list
>  sqlite-users@sqlite.org
>  http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>


Re: [sqlite] Virtual table used to query big external database

2008-04-02 Thread Evans, Mark (Tandem)
Hi Ben,

You said:   "You will notice
that for certain queries, xBestIndex/xFilter does not get
used. That means that the SQLite engine is going to have to
walk through your entire table, evaluating the conditions on
each field."

I am not aware of this behavior.  Could you cite an example?

I thought xOpen and xFilter always kick off a walk over a
range of virtual table rows.  Prerequisite of xOpen is index
id returned by xBestIndex.

So you can't just walk a virtual table without xBestIndex
having been called, unless I missed a boat somewhere

It is true that xOpen/xFilter/xClose can be called multiple
times for an index returned by xBestIndex.  So if you allocate
any objects in xBestIndex, you can't let go of them in xClose.
Memory leak is unavoidable unless you modify the virtual table
interface as I had to do to pass additional information in the
calls (prepared statement handle).

When I have more time, I'd like to suggest changes to the
virtual table call interface to facilitate resource management
in the VT  module.  This doesn't come up with the FTS
VT modules because they seem to always return a small index
to a static structure.

One other consideration:  If the query or update has to walk a large
range of rows, there's no way for the core to tell the VTM
that it's done accessing a given row as it sweeps the cursor
forward.  You can end up with a huge number of virtual table
rows in memory.

Regards,
Mark



> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Ben Harper
> Sent: Tuesday, April 01, 2008 4:53 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] Virtual table used to query big
> external database
>
> The only limitation imposed by SQL that I can think of would
> be inefficiencies in its query plan builder. That is the part
> that figures out how to use the indexes available in the
> database in order to execute the SQL query most efficiently.
> So it really depends on what type of SQL queries you are
> going to be running against this huge DB.
> The dangers are easy to evaluate:
> Create a quick-and-dirty dummy virtual table mechanism, and
> respond to the xBestIndex/xFilter functions. You will notice
> that for certain queries, xBestIndex/xFilter does not get
> used. That means that the SQLite engine is going to have to
> walk through your entire table, evaluating the conditions on
> each field. This is obviously what you wish to avoid. As an
> example, I noticed briefly (I did not investigate
> thoroughly) that having an OR condition in a query would
> prevent the indexes from being used. That was some time ago,
> and it was before the rewrite of the SQL VM, so I don't know
> if that still applies. You'll have to investigate your
> potential queries yourself. A simple query such as "WHERE
> myvalue > 100" should definitely invoke the use of your own indexes.
>
> Ben
>
>
> On Tue, Apr 1, 2008 at 10:12 PM, Aladdin Lampé
> <[EMAIL PROTECTED]> wrote:
> >
> > Hi all!
> >
> > Very often, when people ask this list why they have trouble
> managing in sqlite a "big table" (50 million lines or more
> than 10 GB), they are told that sqlite is an embedded
> database and is not meant to be used for very big databases/tables.
> >
> > I'm currently in the process of designing a specific,
> read-only, sqlite "virtual table" in order to enable sqlite
> to access data stored in an external database which is
> specially designed to handle very big tables.
> >
> > My final objective is to be able to easily query a big
> external table (stored in another database) through the -
> excellent - sqlite interface.
> >
> > Now I have this terrible doubt: will the existing sqlite
> "limitations" for big sqlite tables also apply to my
> read-only virtual tables?
> >
> > Thus... am I currently losing my time developing such a
> "virtual table" with this objective in mind? Or is there a
> better way to achieve my objective?
> >
> > Thank you for your help!
> >


Re: [sqlite] Virtual table used to query big external database

2008-04-02 Thread Mike Owens
Here's a link to the source:

ftp://66.77.27.238/sourcecode/ddj/2007/0712.zip

BTW, I totally missed the new bitvec recently introduced in the latest
version to track dirty pages. This pretty much removes the previous
constraints I mentioned on large databases, reducing memory
consumption to proportional to transaction size (actual dirty pages).
So this would seem to raise the practical limits significantly,
perhaps into the TBs for many operations (e.g. SELECTs).

-- Mike

On Wed, Apr 2, 2008 at 3:57 AM, Aladdin Lampé <[EMAIL PROTECTED]> wrote:
>
>  Thanks a lot Mike for your detailed answer! I've just read your article.
>  Do you know where I can get the files "fs.c" and "example.c" you mention? 
> I've been on the FTP (available from www.ddj.com/code/) but I was not able to 
> locate any file related to your article :( Could you please send them to me 
> by email if they are no longer available online? (genio570 -AT- hotmail DOT 
> fr).
>
>  It seems like I will be able to further develop the system I described in my 
> previous post with the information you mentioned.
>
>  Another quick question: I planned to develop my own indexing algorithm on 
> the "external database" side. But do you think I could also consider (as 
> another option) using the "CREATE INDEX" sqlite command (involving of course 
> a complete scan of the external table's column and then storing the index 
> inside the sqlite file)? What would you recommend me to do?
>
>  In a more general way, I wonder which sqlite SQL commands can involve 
> virtual tables and which cannot or should not (for instance for performance 
> reasons, etc.).
>
>  Thanks again for your help,
>  aladdin

Re: [sqlite] Virtual table used to query big external database

2008-04-02 Thread Aladdin Lampé

Thank you. That's exactly the mistake I made :-)


Re: [sqlite] Virtual table used to query big external database

2008-04-02 Thread Griggs, Donald
Hi, Aladin,

Regarding:  
   "Do you know where I can get the files "fs.c" and "example.c" you
mention? [ http://www.ddj.com/database/202802959 ] I've been on the FTP
(available from www.ddj.com/code/) but I was not able to locate..."



The code appears to be in the December 2007 (0712.zip) file, and not the
0711.zip as one might think.





This email and any attachments have been scanned for known viruses using 
multiple scanners. We believe that this email and any attachments are virus 
free, however the recipient must take full responsibility for virus checking. 
This email message is intended for the named recipient only. It may be 
privileged and/or confidential. If you are not the named recipient of this 
email please notify us immediately and do not copy it or use it for any 
purpose, nor disclose its contents to any other person.


Re: [sqlite] Virtual table used to query big external database

2008-04-02 Thread Aladdin Lampé

Thanks a lot Mike for your detailed answer! I've just read your article.
Do you know where I can get the files "fs.c" and "example.c" you mention? I've 
been on the FTP (available from www.ddj.com/code/) but I was not able to locate 
any file related to your article :( Could you please send them to me by email 
if they are no longer available online? (genio570 -AT- hotmail DOT fr).
 
It seems like I will be able to further develop the system I described in my 
previous post with the information you mentioned.
 
Another quick question: I planned to develop my own indexing algorithm on the 
"external database" side. But do you think I could also consider (as another 
option) using the "CREATE INDEX" sqlite command (involving of course a complete 
scan of the external table's column and then storing the index inside the 
sqlite file)? What would you recommend me to do?
 
In a more general way, I wonder which sqlite SQL commands can involve virtual 
tables and which cannot or should not (for instance for performance reasons, 
etc.).
 
Thanks again for your help,
aladdin

Re: [sqlite] Virtual table used to query big external database

2008-04-01 Thread Jay A. Kreibich
On Tue, Apr 01, 2008 at 08:17:54PM -0500, Mike Owens scratched on the wall:
> The main reason why SQLite's practical limit is in the 10s of GBs as
> opposed to TBs (theoretical) is due to how it tracks dirty pages. 
> [...]
> SQLite tracks dirty pages with a bitmap which is
> allocated before each transaction. The size of the bitmap is
> proportional to the size (not in rows but in pages) of the database

  Doesn't the "bitvec" structure introduced in 3.5.7 help with that
  somewhat?  I was under the impression it allowed for sparse
  bitmaps, which seems like it would be the common case for INSERTs,
  UPDATEs, and DELETEs.

  I suppose a huge delete or a DROP [TABLE|INDEX] might still cause the
  bitmap to get pretty big, but those are more unusual operations.

   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.' Jesus said that."
   - "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006"


Re: [sqlite] Virtual table used to query big external database

2008-04-01 Thread Mike Owens
The main reason why SQLite's practical limit is in the 10s of GBs as
opposed to TBs (theoretical) is due to how it tracks dirty pages. This
is described in the "Appropriate Uses" page
(http://www.sqlite.org/whentouse.html) but I'll rehash it here for
convenience. SQLite tracks dirty pages with a bitmap which is
allocated before each transaction. The size of the bitmap is
proportional to the size (not in rows but in pages) of the database
(256 bytes for every 1Mb of database), so as the database grows, the
amount of memory allocated before each transaction grows. When you get
into the GB range, you are starting to allocate in the MB range of
dirty page map memory per transaction, which starts to take its toll
on performance.

I could be wrong, but from what I know about virtual tables, there is
no such correlation between virtual table size and the dirty page
bitmap, as SQLite has no idea how big a virtual table is, nor does it
manage the data within the vtable. Furthermore, all SQLite really does
in a SELECT statement on a vtable is call your code to iterate over
it. So really the only performance issue is how long it takes your
code to iterate over your vtable. Thus, your table could be in the TB
range, and as long as you are fine with iterating over its contents,
there are no additional performance issues to speak of. There are ways
to implement virtual tables such that you can limit how much of the
table is scanned for certain queries, avoiding having to scan the
whole thing every time. I wrote an article that touches on this using
the match() function. It's available online:

http://www.ddj.com/database/202802959

IMO, virtual tables are one of the most powerful and unique features
of SQLite. There is a bit of a learning curve, but it's amazing what
you can do with them. It sounds like you're going to have to iterate over
your external table one way or the other. I see no reason why the
vtable approach would be any slower than any other approach that
iterates over the data.

Having said that, while iterating over a large vtable is not a big
deal (as your program will just step through it one row at a time),
you need to be careful about getting too fancy with your SQL as you
may end up triggering a lot of background IO. For example, if you tack
on an ORDER BY which sorts one of the columns of your vtable, SQLite
will end up essentially copying the vtable contents into a temporary
file and sorting it, which may or may not be a strain on your system
depending on how big your table is (e.g. your vtable is 30Gb and your
/tmp folder is on a 10Gb partition). So think through what you are
doing when going beyond a simple SELECT * from big_vtable.

-- Mike

On Tue, Apr 1, 2008 at 3:12 PM, Aladdin Lampé <[EMAIL PROTECTED]> wrote:
>
>  Hi all!
>
>  Very often, when people ask this list why they have trouble managing in 
> sqlite a "big table" (50 million lines or more than 10 GB), they are told 
> that sqlite is an embedded database and is not meant to be used for very big 
> databases/tables.
>
>  I'm currently in the process of designing a specific, read-only, sqlite 
> "virtual table" in order to enable sqlite to access data stored in an 
> external database which is specially designed to handle very big tables.
>
>  My final objective is to be able to easily query a big external table 
> (stored in another database) through the - excellent - sqlite interface.
>
>  Now I have this terrible doubt: will the existing sqlite "limitations" for 
> big sqlite tables also apply to my read-only virtual tables?
>
>  Thus... am I currently losing my time developing such a "virtual table" with 
> this objective in mind? Or is there a better way to achieve my objective?
>
>  Thank you for your help!
>
>


Re: [sqlite] Virtual table used to query big external database

2008-04-01 Thread Ben Harper
The only limitation imposed by SQL that I can think of would be
inefficiencies in its query plan builder. That is the part that
figures out how to use the indexes available in the database in order
to execute the SQL query most efficiently. So it really depends on
what type of SQL queries you are going to be running against this huge
DB.
The dangers are easy to evaluate:
Create a quick-and-dirty dummy virtual table mechanism, and respond to
the xBestIndex/xFilter functions. You will notice that for certain
queries, xBestIndex/xFilter does not get used. That means that the
SQLite engine is going to have to walk through your entire table,
evaluating the conditions on each field. This is obviously what you
wish to avoid. As an example, I noticed briefly (I did not investigate
thoroughly) that having an OR condition in a query would prevent the
indexes from being used. That was some time ago, and it was before the
rewrite of the SQL VM, so I don't know if that still applies. You'll
have to investigate your potential queries yourself. A simple query
such as "WHERE myvalue > 100" should definitely invoke the use of your
own indexes.

Ben


On Tue, Apr 1, 2008 at 10:12 PM, Aladdin Lampé <[EMAIL PROTECTED]> wrote:
>
> Hi all!
>
> Very often, when people ask this list why they have trouble managing in 
> sqlite a "big table" (50 million lines or more than 10 GB), they are told 
> that sqlite is an embedded database and is not meant to be used for very big 
> databases/tables.
>
> I'm currently in the process of designing a specific, read-only, sqlite 
> "virtual table" in order to enable sqlite to access data stored in an 
> external database which is specially designed to handle very big tables.
>
> My final objective is to be able to easily query a big external table (stored 
> in another database) through the - excellent - sqlite interface.
>
> Now I have this terrible doubt: will the existing sqlite "limitations" for 
> big sqlite tables also apply to my read-only virtual tables?
>
> Thus... am I currently losing my time developing such a "virtual table" with 
> this objective in mind? Or is there a better way to achieve my objective?
>
> Thank you for your help!
>


Re: [sqlite] Virtual table xBestIndex and when to free index resource

2008-02-11 Thread Dan

On Feb 12, 2008, at 7:05 AM, Evans, Mark (Tandem) wrote:

> SQLite experts:
>
> The xBestIndex method of the SQLite virtual table interface  
> implemented by the VT module returns an output to the core by  
> setting idxNum member var of struct sqlite3_index_info to a value  
> that is meaningful to the VT module.  Assume that a memory resource  
> was created in conjunction with the chosen index that will hold  
> information passed by xFilter.
>
> The question is:  How can VT module tell when it is safe to release  
> that resource?  I'm thinking, it's when the associated statement is  
> finalized.  But how does the VT module know that?  I have found  
> that xClose() call is not the answer because I have stumbled on a  
> test sequence that shows this to be unsafe:
>
>
> do_test update-1.0 {
> execsql {DELETE FROM t1}
> execsql {insert into t1 values(1,2,3)}
> execsql {SELECT * FROM t1 }
> execsql {UPDATE t1 SET y=3 WHERE x=1}
> execsql {SELECT * FROM t1 }
> } {1 2 3 1 3 3}
>
> After execution of the UPDATE, the VT module call sequence for the  
> next SELECT does not include xBestIndex as I was expecting.  It  
> calls xFilter with the idxNum that the previous SELECT created (I  
> think).  I crash and burn because I released the index resource in  
> the xClose call for the first SELECT.
>
> I'd be most appreciative if an expert could steer me in the right  
> direction.

Executing an SQL statement is broken into two parts: compilation
(sqlite3_prepare()) to virtual machine code and execution of that
virtual machine code (sqlite3_step()). The xBestIndex() method is
called as part of compilation, xFilter() is called as part of
execution. As is xClose().

The first time your SELECT statement is run the Tcl interface
calls sqlite3_prepare() to compile it, then
sqlite3_step()/sqlite3_reset() to execute it. The second time,
it is able to re-use the compiled statement. That is why xBestIndex
is not called for the second SELECT.

For passing context, you can also use the sqlite3_index_info.idxStr
variable. Set this to point at a string allocated by sqlite3_malloc()
and sqlite will automatically free it when it is no longer required.
This allows you to store a blob of context data instead of a single
integer.

If you need some resource that really does require a destructor (a
connection handle to some other database etc.), do not open it in
xBestIndex(). Open it in xFilter() and close it in xClose(). Each
xFilter() call should be matched by exactly one xClose().

Regards,
Dan.






RE: [sqlite] Virtual Table LIKE operator constaint?

2007-11-05 Thread Evans, Mark (Tandem)
To answer my own question:

The magic is in either 

SQLITE_CASE_SENSITIVE_LIKE 

compile time option or 

CASE_SENSITIVE_LIKE pragma.

I have seen this before, but it just didn't stick in my small memory
bank.
Mark

> -Original Message-
> From: Evans, Mark (Tandem) 
> Sent: Monday, November 05, 2007 4:34 PM
> To: sqlite-users@sqlite.org
> Subject: [sqlite] Virtual Table LIKE operator constaint?
> 
> Hi all,
> 
> In a virtual table context,  I expected a statement such as
> 
> select * from t where x like 'abc%'
> 
> to pass a constraint for the LIKE clause to xBestIndex method 
> of virtual table module but it doesn't get such a constraint. 
>  I expected SQLITE_INDEX_CONSTRAINT_MATCH.  Do I have to do 
> something special to enable this constraint to be passed or 
> does it just do a full-table scan when it sees the LIKE, 
> GLOB, etc. operator?
> 
> Thanks,
> Mark
> 
> 
> 
> 

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



RE: [sqlite] Virtual table xFilter argv object longevity

2007-10-28 Thread Evans, Mark (Tandem)
Thanks Dan 

> -Original Message-
> From: Dan Kennedy [mailto:[EMAIL PROTECTED] 
> Sent: Sunday, October 28, 2007 11:11 AM
> To: sqlite-users@sqlite.org
> Subject: Re: [sqlite] Virtual table xFilter argv object longevity
> 
> On Sun, 2007-10-28 at 14:03 +, Evans, Mark (Tandem) wrote:
> > The question is what is the lifetime of sqlite3_value 
> objects passed 
> > as argv array to the xFilter virtual table module call?  Can I save 
> > the pointers and reference the values for constraint testing in my 
> > implementation of xNext?
> 
> No. They are only good until the xFilter() call returns. The
> popStack() near the bottom of the OP_VFilter opcode in vdbe.c 
> will invalidate them.
> 
> Dan.
> 




Re: [sqlite] Virtual table xFilter argv object longevity

2007-10-28 Thread Dan Kennedy
On Sun, 2007-10-28 at 14:03 +, Evans, Mark (Tandem) wrote:
> The question is what is the lifetime of sqlite3_value objects passed as
> argv array to the xFilter virtual table module call?  Can I save the
> pointers and reference the values for constraint testing in my
> implementation of xNext?

No. They are only good until the xFilter() call returns. The
popStack() near the bottom of the OP_VFilter opcode in vdbe.c
will invalidate them.

Dan.






Re: [sqlite] Virtual Table questions

2007-10-23 Thread drh
"Evans, Mark (Tandem)" <[EMAIL PROTECTED]> wrote:
> Hi All,
> 
> I am exploring the use of the Virtual Table capability of SQLite 3.5.1
> for implementing alternative backing store and have a couple of
> questions for virtual table gurus:
> 
> Say I define a virtual table module registered as "custom" and then
> define virtual table T as follows:
> 
> CREATE VIRTUAL TABLE T USING custom ( args );
> 
> This results in call to custom.xcreate() which calls
> sqlite3_declare_vtab() with a second argument of, say:
> "CREATE TABLE xxx ( a, b, c )".
> 
> Now the sqlite3 shell meta command, .schema, outputs:
> 
> CREATE VIRTUAL TABLE T USING custom ( args );
> 
> but not:  CREATE TABLE xxx ( a, b, c );
> 
> How cosmic would it be to extend the .schema command to show both the
> CREATE VIRTUAL TABLE and corresponding CREATE TABLE statements?  This is
> just a nicety, for debug checking.
> 
> 
> Secondly, and more important, I was hoping that it would be possible to
> have CHECK constraints on a virtual table, by saying, for example:
> 
> "CREATE TABLE xxx ( a, b, c, CHECK( a < 10 ) )"
> 
> as second argument to sqlite3_declare_vtab().  But it appears that the
> code generator routine, sqlite3GenerateConstraintChecks((), does not get
> called for virtual tables.  I wonder how cosmic it would be to add a
> routine, sqlite3GenerateVirtualTableConstraintChecks(), that would
> generate constraint checking code for virtual tables.  Is the current
> virtual table module interface sufficient to support such a capability?
> 

Both changes would be exceedingly cosmic.

--
D. Richard Hipp <[EMAIL PROTECTED]>





Re: [sqlite] Virtual Table: xRowID shortcommings

2006-06-25 Thread Ralf Junker

>> 2. In case the virtual table implementation needs to allocate memory in 
>> order to uniquely
>> describe a row/item, this memory needs to be freed when no longer used. As I 
>> see it, there is no
>> such method in the Virtual Table implementation.
>
>Maybe the transaction part of the virtual table API is useful in
>this context (xBegin/xSync/xCommit/xRollback). SQLite will only store
>values retrieved via xRowid for the duration of a transaction. So
>if you need to create a mapping between the integer id's used by
>SQLite and the complex identifiers, you can throw the table away
>at the end of the transaction.
>
>A linked list and a willingness to cast pointers to 64-bit integers
>seems like it would do the trick :)

This willingness is certainly there, but with many thousand modifications in a 
single transaction, this would accumulate a couple thousand memory allocations 
and free them only after the transaction is committed. Very inefficient, in my 
opinion :(

>Of course that won't help with sqlite apps that expect the rowid
>field to remain persistent between queries that span multiple 
>transactions (i.e. MS Access style GUIs etc.).

Correct. Quite a few DBMS simply do not use or support the concept of integer 
rowids. The current xRowID implementation would not allow those to be accessed 
as virtual tables from SQLite.

To support those as well, I believe that the SQLite-type integer rowids could 
be made optional for virtual tables and another, more flexible approach could 
instead be made available (as proposed in my previous mailing). This would of 
course mean that those virtual tables would not have a rowid column and SQLite 
would return the usual 'No such column: rowid'. But I do not see any problems 
with this from a user's perspective. 



Re: [sqlite] Virtual Table: xRowID shortcomings

2006-06-24 Thread Dan Kennedy

> I have played with the new Virtual Table interface from CVS and found some 
> shortcomings for the
> current implementation of the xRowID method:
> 
>   int (*xRowid)(sqlite3_vtab_cursor*, sqlite_int64 *pRowid);
> 
> As far as I understand, this function is called by SQLite whenever it needs a 
> unique identifier
> for a row/item as a signed 64 bit integer type. SQLite uses this to pass it 
> back to other
> Virtual Table methods (most notably xUpdate) so the Virtual Table 
> implementation can use this
> identifier to locate a particular row/item for deletes, inserts, or updates.

Quite correct. It will also be accessible as the "rowid" field via SQL.
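
In practice, xRowid usually just copies a row counter that the cursor keeps
up to date as xNext advances. The sketch below is a self-contained
illustration, not taken from the original thread: it stubs the SQLite
declarations (sqlite_int64, sqlite3_vtab_cursor, SQLITE_OK) rather than
including sqlite3.h, and the my_cursor struct and myXRowid name are made up.
A real module embeds sqlite3_vtab_cursor as the first member of its cursor
struct so the pointer can be cast back and forth:

```c
#include <stdint.h>

/* Stubs standing in for the real declarations in sqlite3.h. */
typedef int64_t sqlite_int64;
typedef struct sqlite3_vtab_cursor { void *pVtab; } sqlite3_vtab_cursor;
enum { SQLITE_OK = 0 };

/* A module-specific cursor. The base struct must come first so SQLite
** can pass it around as a plain sqlite3_vtab_cursor*. */
typedef struct my_cursor {
  sqlite3_vtab_cursor base;   /* Base class - must be first */
  sqlite_int64 iRow;          /* Current row number, advanced by xNext */
} my_cursor;

/* xRowid: report the unique 64-bit id of the current row. */
static int myXRowid(sqlite3_vtab_cursor *cur, sqlite_int64 *pRowid){
  my_cursor *p = (my_cursor*)cur;
  *pRowid = p->iRow;
  return SQLITE_OK;
}
```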

> I can see two problems here:
> 
> 1. Using a signed 64 bit integer type works fine for SQLite, but might be 
> insufficient for other
> Virtual Table implementations. Imagine a Virtual Table which maps the files 
> of a disk/directoy.
> In Windows, for example, a unique ITEMIDLIST can be generated for a file or 
> folder. However,
> this ITEMIDLIST is not a signed 64 bit integer, so mapping an ITEMIDLIST to 
> the current
> implementation is not easily possible. I can even imagine other scenarios 
> where the virtual
> table implementation must allocate a special memory structure larger than 64 
> bit to uniquely
> identify a row/item.

> 2. In case the virtual table implementation needs to allocate memory in order 
> to uniquely
> describe a row/item, this memory needs to be freed when no longer used. As I 
> see it, there is no
> such method in the Virtual Table implementation.

Maybe the transaction part of the virtual table API is useful in
this context (xBegin/xSync/xCommit/xRollback). SQLite will only store
values retrieved via xRowid for the duration of a transaction. So
if you need to create a mapping between the integer id's used by
SQLite and the complex identifiers, you can throw the table away
at the end of the transaction.

A linked list and a willingness to cast pointers to 64-bit integers
seems like it would do the trick :)

Of course that won't help with sqlite apps that expect the rowid
field to remain persistent between queries that span multiple 
transactions (i.e. MS Access style GUIs etc.).
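
The linked-list idea can be sketched in plain C: each complex identifier
lives in a heap node, the node's address is what gets handed to SQLite as the
rowid, and the xCommit/xRollback callbacks walk the list and free everything.
The RowKey struct and the function names below are invented for illustration;
a real module would hang the list off its sqlite3_vtab subclass rather than a
global, and would map back to whatever identifier type it uses (ITEMIDLIST,
path string, etc.):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One node per row identifier handed out during the transaction. */
typedef struct RowKey {
  struct RowKey *pNext;   /* Next node in the per-transaction list */
  char zId[64];           /* Complex identifier, e.g. a file path */
} RowKey;

static RowKey *gKeyList = 0;   /* Head of the per-transaction list */

/* Allocate a node for zId and return its address as a 64-bit rowid. */
static int64_t rowidForId(const char *zId){
  RowKey *p = malloc(sizeof(RowKey));
  if( p==0 ) return 0;
  strncpy(p->zId, zId, sizeof(p->zId)-1);
  p->zId[sizeof(p->zId)-1] = 0;
  p->pNext = gKeyList;
  gKeyList = p;
  return (int64_t)(intptr_t)p;
}

/* Recover the identifier from a rowid previously handed out. */
static const char *idForRowid(int64_t rowid){
  return ((RowKey*)(intptr_t)rowid)->zId;
}

/* Called from xCommit/xRollback: the rowids are now dead, free them all. */
static void freeAllRowids(void){
  while( gKeyList ){
    RowKey *p = gKeyList;
    gKeyList = p->pNext;
    free(p);
  }
}
```

As noted above, this costs one allocation per row touched; an arena allocator
released wholesale at commit time would cut that overhead considerably.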

