[sqlite] What's the best way to pass function information to virtual table?

2015-06-23 Thread Mike Nicolino
I'll summarize what I've done; note that I don't consider myself an 'expert' in 
this area.  What I have works, but others may have suggestions on improvements. 
 (I'd actually be happy to hear any suggestions).

For my virtual tables I generate an additional 'special' column called 
'_MatchFilter' (the leading underscore is my convention identifying it as 
'special').  It's a string column that returns either empty string or the 
specified 'filter' from the where clause if specified.  This column is a 
placeholder and not useful, except when used in a where clause with 'match' or 
'='.  In these cases, my BestIndex prefers queries including _MatchFilter with 
'match' or '=' over all others and Filter takes the match/= compare value and 
passes it as a custom query to my 'table' so it can do things more efficiently. 
 Example:

select * from MyTable where _MatchFilter match "my custom query here"
select * from MyTable where _MatchFilter = "my custom query here"

You still have to handle cases where _MatchFilter gets 'and'ed with additional 
clauses of course in your BestIndex and Filter, though at worst you should be 
able to run your custom query and then in memory filter the results on the 
other clause (or vice versa).  But you can get really crazy with what you allow 
for the filter in this manner, passing it directly to your virtual table 
implementation.

What you want to do with Average is somewhat different; you really don't need a 
custom query, but a way to avoid having SQLite do the average for your table.  
One trick I can think of is adding your own special column to the table 
'_Average', that returns the calculated average.  Its somewhat strange in that 
ever 'row' would contain the average of course.  If it's still expensive to 
calculate, you might require something similar to my _MatchFilter' arrangement 
to cause '_Average' to generate and return '0' for it if not present in the 
where clause.

This is all somewhat hacky of course and non-standard if you're exposing SQL to 
users.

MikeN


-Original Message-
From: sqlite-users-bounces at mailinglists.sqlite.org 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Jerry
Sent: Monday, June 22, 2015 2:47 PM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] What's the best way to pass function information to 
virtual table?

Hi, MikeN.

Assuming I want to compute the average value of all keys, how to write the 
query for this using match?

For example, if I write SQL in this way

> SELECT avg(key) From table WHERE key MATCH('avg');

with MATCH info, the virtual table is able to know the query is looking for 
average value for the keys.
Now, assume the virtual table is able to compute the average faster than 
SQLite, after I get the average value, how can I return the result immediately 
without SQLite going through the aggregator avg()?

What are your queries like when you "pass arbitrary specialized queries 
directly to my virtual table"?

Thanks,
-C.Lin

2015-06-17 8:48 GMT-07:00 Mike Nicolino :

> The override of match() trick works pretty well for cases like this.  
> I've overridden match in my virtual table implementation to allow me 
> to pass arbitrary specialized queries directly to my virtual table 
> modules for cases that I know the virtual table can do a better job 
> that SQLite on that query.  Downside is if you're exposing the SQL to 
> users of course as using match in such a manner is non-standard.
>
> MikeN
>
>
> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:
> sqlite-users-bounces at mailinglists.sqlite.org] On Behalf Of Clemens 
> Ladisch
> Sent: Wednesday, June 17, 2015 1:23 AM
> To: sqlite-users at mailinglists.sqlite.org
> Subject: Re: [sqlite] What's the best way to pass function information 
> to virtual table?
>
> Jerry wrote:
> > With xBestIndex and xFilter, we can pass the constraint information 
> > (e.g., those from WHERE clause) to virtual table (through struct 
> > sqlite3_index_info), so that we can locate the cursor to narrow the 
> > search space.
> > However, it does not provide information about functions used in SQL 
> > queries.
> >
> > For example,
> >
> >> SELECT MAX(key) from Table tab;
> >
> > The virtual table has no way to know the function MAX is called
>
> SQLite has a special optimization for this particular query, and 
> rewrites it as "SELECT key FROM tab ORDER BY key DESC LIMIT 1".
>
> > The virtual table provides xFindFunction to override functions 
> > (maybe this can be used to pass some information).
>
> For example, the full-text search module overrides the match() 
> function; you could do something similar:
>   SELECT key FROM tab WHERE tab MATCH 'max(key)'
>
> > But it seems only general functions can be override -- it has not 
> > effect on aggregate functions.
>
> The virtual table interface does not allow access to all the internals 
> of the query optimizer.
>
> When there is an 

[sqlite] What's the best way to pass function information to virtual table?

2015-06-23 Thread Jerry
Thanks, Mike and Clemens. I got your solutions.

Seems there is no graceful way to do this.
It would be better if virtual table can provide more flexibilities,
although it might introduce some issues.

2015-06-23 9:19 GMT-07:00 Mike Nicolino :

> I'll summarize what I've done; note that I don't consider myself an
> 'expert' in this area.  What I have works, but others may have suggestions
> on improvements.  (I'd actually be happy to hear any suggestions).
>
> For my virtual tables I generate an additional 'special' column called
> '_MatchFilter' (the leading underscore is my convention identifying it as
> 'special').  It's a string column that returns either empty string or the
> specified 'filter' from the where clause if specified.  This column is a
> placeholder and not useful, except when used in a where clause with 'match'
> or '='.  In these cases, my BestIndex prefers queries including
> _MatchFilter with 'match' or '=' over all others and Filter takes the
> match/= compare value and passes it as a custom query to my 'table' so it
> can do things more efficiently.  Example:
>
> select * from MyTable where _MatchFilter match "my custom query
> here"
> select * from MyTable where _MatchFilter = "my custom query here"
>
> You still have to handle cases where _MatchFilter gets 'and'ed with
> additional clauses of course in your BestIndex and Filter, though at worst
> you should be able to run your custom query and then in memory filter the
> results on the other clause (or vice versa).  But you can get really crazy
> with what you allow for the filter in this manner, passing it directly to
> your virtual table implementation.
>
> What you want to do with Average is somewhat different; you really don't
> need a custom query, but a way to avoid having SQLite do the average for
> your table.  One trick I can think of is adding your own special column to
> the table '_Average', that returns the calculated average.  Its somewhat
> strange in that ever 'row' would contain the average of course.  If it's
> still expensive to calculate, you might require something similar to my
> _MatchFilter' arrangement to cause '_Average' to generate and return '0'
> for it if not present in the where clause.
>
> This is all somewhat hacky of course and non-standard if you're exposing
> SQL to users.
>
> MikeN
>
>
> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:
> sqlite-users-bounces at mailinglists.sqlite.org] On Behalf Of Jerry
> Sent: Monday, June 22, 2015 2:47 PM
> To: General Discussion of SQLite Database
> Subject: Re: [sqlite] What's the best way to pass function information to
> virtual table?
>
> Hi, MikeN.
>
> Assuming I want to compute the average value of all keys, how to write the
> query for this using match?
>
> For example, if I write SQL in this way
>
> > SELECT avg(key) From table WHERE key MATCH('avg');
>
> with MATCH info, the virtual table is able to know the query is looking
> for average value for the keys.
> Now, assume the virtual table is able to compute the average faster than
> SQLite, after I get the average value, how can I return the result
> immediately without SQLite going through the aggregator avg()?
>
> What are your queries like when you "pass arbitrary specialized queries
> directly to my virtual table"?
>
> Thanks,
> -C.Lin
>
> 2015-06-17 8:48 GMT-07:00 Mike Nicolino :
>
> > The override of match() trick works pretty well for cases like this.
> > I've overridden match in my virtual table implementation to allow me
> > to pass arbitrary specialized queries directly to my virtual table
> > modules for cases that I know the virtual table can do a better job
> > that SQLite on that query.  Downside is if you're exposing the SQL to
> > users of course as using match in such a manner is non-standard.
> >
> > MikeN
> >
> >
> > -Original Message-
> > From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:
> > sqlite-users-bounces at mailinglists.sqlite.org] On Behalf Of Clemens
> > Ladisch
> > Sent: Wednesday, June 17, 2015 1:23 AM
> > To: sqlite-users at mailinglists.sqlite.org
> > Subject: Re: [sqlite] What's the best way to pass function information
> > to virtual table?
> >
> > Jerry wrote:
> > > With xBestIndex and xFilter, we can pass the constraint information
> > > (e.g., those from WHERE clause) to virtual table (through struct
> > > sqlite3_index_info), so that we can locate the cursor to narrow the
> > > search space.
> > > However, it does not provide information about functions used in SQL
> > > queries.
> > >
> > > For example,
> > >
> > >> SELECT MAX(key) from Table tab;
> > >
> > > The virtual table has no way to know the function MAX is called
> >
> > SQLite has a special optimization for this particular query, and
> > rewrites it as "SELECT key FROM tab ORDER BY key DESC LIMIT 1".
> >
> > > The virtual table provides xFindFunction to override functions
> > > (maybe this can be used to pass some 

[sqlite] What's the best way to pass function information to virtual table?

2015-06-23 Thread Clemens Ladisch
Jerry wrote:
> Assuming I want to compute the average value of all keys, how to write the
> query for this using match?
>
> For example, if I write SQL in this way
>
>> SELECT avg(key) From table WHERE key MATCH('avg');
>
> with MATCH info, the virtual table is able to know the query is looking for
> average value for the keys.
> Now, assume the virtual table is able to compute the average faster than
> SQLite, after I get the average value, how can I return the result
> immediately without SQLite going through the aggregator avg()?

You could return a single row that already contains the average value.

> What are your queries like when you "pass arbitrary specialized queries
> directly to my virtual table"?

You would have to put everything into the MATCH operand.


The virtual table interface is not really designed for such things;
it works best when you have some custom index that merely reduces
the number of rows to be returned.


Regards,
Clemens


[sqlite] What's the best way to pass function information to virtual table?

2015-06-22 Thread Jerry
Hi, MikeN.

Assuming I want to compute the average value of all keys, how to write the
query for this using match?

For example, if I write SQL in this way

> SELECT avg(key) From table WHERE key MATCH('avg');

with MATCH info, the virtual table is able to know the query is looking for
average value for the keys.
Now, assume the virtual table is able to compute the average faster than
SQLite, after I get the average value, how can I return the result
immediately without SQLite going through the aggregator avg()?

What are your queries like when you "pass arbitrary specialized queries
directly to my virtual table"?

Thanks,
-C.Lin

2015-06-17 8:48 GMT-07:00 Mike Nicolino :

> The override of match() trick works pretty well for cases like this.  I've
> overridden match in my virtual table implementation to allow me to pass
> arbitrary specialized queries directly to my virtual table modules for
> cases that I know the virtual table can do a better job that SQLite on that
> query.  Downside is if you're exposing the SQL to users of course as using
> match in such a manner is non-standard.
>
> MikeN
>
>
> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:
> sqlite-users-bounces at mailinglists.sqlite.org] On Behalf Of Clemens Ladisch
> Sent: Wednesday, June 17, 2015 1:23 AM
> To: sqlite-users at mailinglists.sqlite.org
> Subject: Re: [sqlite] What's the best way to pass function information to
> virtual table?
>
> Jerry wrote:
> > With xBestIndex and xFilter, we can pass the constraint information
> > (e.g., those from WHERE clause) to virtual table (through struct
> > sqlite3_index_info), so that we can locate the cursor to narrow the
> > search space.
> > However, it does not provide information about functions used in SQL
> > queries.
> >
> > For example,
> >
> >> SELECT MAX(key) from Table tab;
> >
> > The virtual table has no way to know the function MAX is called
>
> SQLite has a special optimization for this particular query, and rewrites
> it as "SELECT key FROM tab ORDER BY key DESC LIMIT 1".
>
> > The virtual table provides xFindFunction to override functions (maybe
> > this can be used to pass some information).
>
> For example, the full-text search module overrides the match() function;
> you could do something similar:
>   SELECT key FROM tab WHERE tab MATCH 'max(key)'
>
> > But it seems only general functions can be override -- it has not
> > effect on aggregate functions.
>
> The virtual table interface does not allow access to all the internals of
> the query optimizer.
>
> When there is an aggregate function, you can filter the rows that will be
> given to it, but the actual aggregation is still done by SQLite.
>
> If you can compute aggregates more efficiently than SQLite, you could
> create a separate virtual table:
>   SELECT max_key FROM tab_agg
> but this would not work for more complex queries.
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] What's the best way to pass function information to virtual table?

2015-06-20 Thread Clemens Ladisch
Jerry wrote:
> 2015-06-17 1:23 GMT-07:00 Clemens Ladisch :
>> Jerry wrote:
>>> But it seems only general functions can be override -- it has not effect on
>>> aggregate functions.
>>
>> The virtual table interface does not allow access to all the internals
>> of the query optimizer.
>>
>> When there is an aggregate function, you can filter the rows that will
>> be given to it, but the actual aggregation is still done by SQLite.
>
> Can you tell more on this? According to my understanding, the constraints
> appeared in WHERE clause will be passed to virtual table (through
> xBestIndex() ), then xFilter() can use the information to narrow the search
> space. However, as for functions, virtual table does not know which
> function is called, so the filter is not able to filter rows based on the
> function.

Yes; you get only constraints of a form like "MyColumn = some_value", and
you don't get to know what that value is, or where it came from.

>> If you can compute aggregates more efficiently than SQLite, you could
>> create a separate virtual table:
>>   SELECT max_key FROM tab_agg
>> but this would not work for more complex queries.
>
> I think using separate tables might not be a good solution if the table
> keeps updating -- there might be consistency issues.

A _virtual_ table does not necessarily imply that the data is actually
stored in some real table; the values could be computed dynamically.

But a separate virtual table is useful only if the data has a different
structure from the data in the original table.  Otherwise, using MATCH
appears to be a better solution.


Regards,
Clemens


[sqlite] What's the best way to pass function information to virtual table?

2015-06-18 Thread Jerry
Thanks, MikeN. I will consider MATCH for now, although it is not standard.

2015-06-17 8:48 GMT-07:00 Mike Nicolino :

> The override of match() trick works pretty well for cases like this.  I've
> overridden match in my virtual table implementation to allow me to pass
> arbitrary specialized queries directly to my virtual table modules for
> cases that I know the virtual table can do a better job that SQLite on that
> query.  Downside is if you're exposing the SQL to users of course as using
> match in such a manner is non-standard.
>
> MikeN
>
>
> -Original Message-
> From: sqlite-users-bounces at mailinglists.sqlite.org [mailto:
> sqlite-users-bounces at mailinglists.sqlite.org] On Behalf Of Clemens Ladisch
> Sent: Wednesday, June 17, 2015 1:23 AM
> To: sqlite-users at mailinglists.sqlite.org
> Subject: Re: [sqlite] What's the best way to pass function information to
> virtual table?
>
> Jerry wrote:
> > With xBestIndex and xFilter, we can pass the constraint information
> > (e.g., those from WHERE clause) to virtual table (through struct
> > sqlite3_index_info), so that we can locate the cursor to narrow the
> > search space.
> > However, it does not provide information about functions used in SQL
> > queries.
> >
> > For example,
> >
> >> SELECT MAX(key) from Table tab;
> >
> > The virtual table has no way to know the function MAX is called
>
> SQLite has a special optimization for this particular query, and rewrites
> it as "SELECT key FROM tab ORDER BY key DESC LIMIT 1".
>
> > The virtual table provides xFindFunction to override functions (maybe
> > this can be used to pass some information).
>
> For example, the full-text search module overrides the match() function;
> you could do something similar:
>   SELECT key FROM tab WHERE tab MATCH 'max(key)'
>
> > But it seems only general functions can be override -- it has not
> > effect on aggregate functions.
>
> The virtual table interface does not allow access to all the internals of
> the query optimizer.
>
> When there is an aggregate function, you can filter the rows that will be
> given to it, but the actual aggregation is still done by SQLite.
>
> If you can compute aggregates more efficiently than SQLite, you could
> create a separate virtual table:
>   SELECT max_key FROM tab_agg
> but this would not work for more complex queries.
>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] What's the best way to pass function information to virtual table?

2015-06-18 Thread Jerry
Thanks, Clemens. Some of my comments as follows

2015-06-17 1:23 GMT-07:00 Clemens Ladisch :

> Jerry wrote:
> > With xBestIndex and xFilter, we can pass the constraint information
> (e.g.,
> > those from WHERE clause) to virtual table (through struct
> > sqlite3_index_info), so that we can locate the cursor to narrow the
> search
> > space.
> > However, it does not provide information about functions used in SQL
> > queries.
> >
> > For example,
> >
> >> SELECT MAX(key) from Table tab;
> >
> > The virtual table has no way to know the function MAX is called
>
> SQLite has a special optimization for this particular query, and
> rewrites it as "SELECT key FROM tab ORDER BY key DESC LIMIT 1".
>

Yes, I see. I am just using MAX for an example.


> > The virtual table provides xFindFunction to override functions (maybe
> this
> > can be used to pass some information).
>
> For example, the full-text search module overrides the match() function;
> you could do something similar:
>   SELECT key FROM tab WHERE tab MATCH 'max(key)'
>

Thanks, I think MATCH will be one workaround.


>
> > But it seems only general functions can be override -- it has not effect
> on
> > aggregate functions.
>
> The virtual table interface does not allow access to all the internals
> of the query optimizer.
>
> When there is an aggregate function, you can filter the rows that will
> be given to it, but the actual aggregation is still done by SQLite.
>

Can you tell more on this? According to my understanding, the constraints
appeared in WHERE clause will be passed to virtual table (through
xBestIndex() ), then xFilter() can use the information to narrow the search
space. However, as for functions, virtual table does not know which
function is called, so the filter is not able to filter rows based on the
function.  Let me know if I am not correct.


>
> If you can compute aggregates more efficiently than SQLite, you could
> create a separate virtual table:
>   SELECT max_key FROM tab_agg
> but this would not work for more complex queries.
>

I think using separate tables might not be a good solution if the table
keeps updating -- there might be consistency issues.


>
>
> Regards,
> Clemens
> ___
> sqlite-users mailing list
> sqlite-users at mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>


[sqlite] What's the best way to pass function information to virtual table?

2015-06-17 Thread Mike Nicolino
The override of match() trick works pretty well for cases like this.  I've 
overridden match in my virtual table implementation to allow me to pass 
arbitrary specialized queries directly to my virtual table modules for cases 
that I know the virtual table can do a better job that SQLite on that query.  
Downside is if you're exposing the SQL to users of course as using match in 
such a manner is non-standard.

MikeN


-Original Message-
From: sqlite-users-bounces at mailinglists.sqlite.org 
[mailto:sqlite-users-boun...@mailinglists.sqlite.org] On Behalf Of Clemens 
Ladisch
Sent: Wednesday, June 17, 2015 1:23 AM
To: sqlite-users at mailinglists.sqlite.org
Subject: Re: [sqlite] What's the best way to pass function information to 
virtual table?

Jerry wrote:
> With xBestIndex and xFilter, we can pass the constraint information 
> (e.g., those from WHERE clause) to virtual table (through struct 
> sqlite3_index_info), so that we can locate the cursor to narrow the 
> search space.
> However, it does not provide information about functions used in SQL 
> queries.
>
> For example,
>
>> SELECT MAX(key) from Table tab;
>
> The virtual table has no way to know the function MAX is called

SQLite has a special optimization for this particular query, and rewrites it as 
"SELECT key FROM tab ORDER BY key DESC LIMIT 1".

> The virtual table provides xFindFunction to override functions (maybe 
> this can be used to pass some information).

For example, the full-text search module overrides the match() function; you 
could do something similar:
  SELECT key FROM tab WHERE tab MATCH 'max(key)'

> But it seems only general functions can be override -- it has not 
> effect on aggregate functions.

The virtual table interface does not allow access to all the internals of the 
query optimizer.

When there is an aggregate function, you can filter the rows that will be given 
to it, but the actual aggregation is still done by SQLite.

If you can compute aggregates more efficiently than SQLite, you could create a 
separate virtual table:
  SELECT max_key FROM tab_agg
but this would not work for more complex queries.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] What's the best way to pass function information to virtual table?

2015-06-17 Thread Clemens Ladisch
Jerry wrote:
> With xBestIndex and xFilter, we can pass the constraint information (e.g.,
> those from WHERE clause) to virtual table (through struct
> sqlite3_index_info), so that we can locate the cursor to narrow the search
> space.
> However, it does not provide information about functions used in SQL
> queries.
>
> For example,
>
>> SELECT MAX(key) from Table tab;
>
> The virtual table has no way to know the function MAX is called

SQLite has a special optimization for this particular query, and
rewrites it as "SELECT key FROM tab ORDER BY key DESC LIMIT 1".

> The virtual table provides xFindFunction to override functions (maybe this
> can be used to pass some information).

For example, the full-text search module overrides the match() function;
you could do something similar:
  SELECT key FROM tab WHERE tab MATCH 'max(key)'

> But it seems only general functions can be override -- it has not effect on
> aggregate functions.

The virtual table interface does not allow access to all the internals
of the query optimizer.

When there is an aggregate function, you can filter the rows that will
be given to it, but the actual aggregation is still done by SQLite.

If you can compute aggregates more efficiently than SQLite, you could
create a separate virtual table:
  SELECT max_key FROM tab_agg
but this would not work for more complex queries.


Regards,
Clemens


[sqlite] What's the best way to pass function information to virtual table?

2015-06-16 Thread Jerry
Hi,

I am working on sqlite+rocksdb, i.e., using rocksdb as the back-end using
sqlite virtual table.
With xBestIndex and xFilter, we can pass the constraint information (e.g.,
those from WHERE clause) to virtual table (through struct
sqlite3_index_info), so that we can locate the cursor to narrow the search
space.
However, it does not provide information about functions used in SQL
queries.

For example,

> SELECT MAX(key) from Table tab;

The virtual table has no way to know the function MAX is called -- I want
this information, because I can get the max key quickly within the
underlying database. Similarly, I also need this for other certain types of
queries.

The virtual table provides xFindFunction to override functions (maybe this
can be used to pass some information).
But it seems only general functions can be override -- it has not effect on
aggregate functions.

Anyone knows how to pass function information to virtual table?
If sqlite does not provide this feature, what's the best way to tweak the
source code to implement this?

Thanks.

-C. Lin