Re: [sqlite] Keeping -wal and -shm files

2014-10-28 Thread Steinar Midtskogen
"James K. Lowden"  writes:

> and -o grpid changes the directory's *gid* effect, not setgid.  Are you
> sure that the directory is setgid?  
>
> $ ls -ld steinar
> drwxrwxr-x  2 jklowden  wheel  512 Oct 28 09:54 steinar
> $ chmod 2775 steinar
> $ ls -ld steinar
> drwxrwsr-x  2 jklowden  wheel  512 Oct 28 09:55 steinar
>  ^--- note "s" in permission bit

Aah.  Thanks for pointing this out.  The directory was drwsrwxr-x.  I
changed it to drwxrwsr-x and it works as expected without grpid.

> You may be more used to traditional BSD behavior, which -o grpid
> restores.  

No, I must have mixed up 4775 and 2775.  I'm unable to set g+s/2775 as
an ordinary user (even though I own the directory); it only works as
root, so perhaps I did 4775 when 2775 didn't do anything (while I've
been on Linux for more than 20 years, I rarely touch these flags, so I
usually don't remember which bit is which).  It took a while after I
set the permissions before I discovered that it didn't work, and I had
assumed that the flags were right.
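For reference, the bits that got mixed up here are easy to check numerically; a small sketch using Python's stat constants (nothing below is specific to this thread beyond the two modes discussed):

```python
import stat

# Numeric modes: 4000 = setuid, 2000 = setgid, 1000 = sticky.
# "drwsrwxr-x" corresponds to mode 4775 (setuid on the directory),
# while "drwxrwsr-x" is mode 2775 (setgid, which is what makes new
# files inherit the directory's group on Linux).
setuid_mode = 0o4775  # what was accidentally set
setgid_mode = 0o2775  # what was intended

print(bool(setuid_mode & stat.S_ISUID))  # True
print(bool(setuid_mode & stat.S_ISGID))  # False
print(bool(setgid_mode & stat.S_ISGID))  # True
```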

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Keeping -wal and -shm files

2014-10-27 Thread Steinar Midtskogen
"James K. Lowden"  writes:

> See -o grpid in mount(8).  I think that's what you want.  

Thanks.  It works!

-- 
Steinar


[sqlite] Keeping -wal and -shm files

2014-10-27 Thread Steinar Midtskogen
I have a database that is updated (written) by one user and read by
another user.  So I use WAL mode to get concurrency.  I'm running
Linux/ext4.

Both users are members of the same group.  The trouble is that Linux
ignores the setgid bit on the directory, so when a user accesses the
database, they will create the -wal and -shm files using their default
group, not the directory group of which they're both members.  So when
one is accessing the database, the other gets locked out because the
-wal and -shm files are created with the wrong group id.

Is there a way to prevent the -wal and -shm files from being deleted
after use, so that they always have the right group?  Or is
there a way to tell Linux to observe the setgid flag on a directory
(ext4)?

-- 
Steinar Midtskogen


Re: [sqlite] Full outer joins

2014-10-24 Thread Steinar Midtskogen
Richard Hipp  writes:

> In 14 years, you are the first person to ask for them.  That tells me that
> probably not many people would use them even if we did put them in.

I've only written one program using sqlite a few years ago, and I had to
make an ugly workaround using UNIONs and LEFT OUTER JOIN.  I left a
comment in the code that this should be replaced when sqlite supports
FULL OUTER JOIN because the workaround can become extremely slow.
Perhaps native support won't help speed, but sqlite might be in a better
position to optimise than me.  I would certainly try it out if it gets
supported.
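The workaround referred to is the standard emulation: a LEFT OUTER JOIN, UNION ALL'ed with the unmatched rows of the right-hand table. A minimal sketch (table and column names are invented for illustration, not from the thread):

```python
import sqlite3

# Emulate FULL OUTER JOIN of a and b on key k:
#  1) a LEFT JOIN b gives every row of a, matched where possible;
#  2) UNION ALL the rows of b that have no match in a.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE a(k INTEGER PRIMARY KEY, va);
    CREATE TABLE b(k INTEGER PRIMARY KEY, vb);
    INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO b VALUES (2, 'b2'), (3, 'b3');
""")
rows = db.execute("""
    SELECT a.k, a.va, b.vb FROM a LEFT JOIN b ON a.k = b.k
    UNION ALL
    SELECT b.k, a.va, b.vb FROM b LEFT JOIN a ON a.k = b.k
    WHERE a.k IS NULL
    ORDER BY 1
""").fetchall()
print(rows)  # [(1, 'a1', None), (2, 'a2', 'b2'), (3, None, 'b3')]
```

The second branch is what makes this slow on big tables: SQLite cannot always use an index for the `a.k IS NULL` anti-join, which is presumably the "extremely slow" case mentioned above.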

I sorely missed that join back then, but I didn't ask for it since it's
at the top of the list at https://www.sqlite.org/omitted.html

At that time the page said "Those features near the top of the list are
likely to be added in the near future".  So I was thinking it would be a
bit impolite or impatient to ask...

-- 
Steinar


[sqlite] Decimal point and localisation

2012-10-27 Thread Steinar Midtskogen
According to the documentation, "the "." character is always used as
the decimal point even if the locale setting specifies "," for this
role - the use of "," for the decimal point would result in syntactic
ambiguity".

But why can't sqlite observe LC_NUMERIC for the output - no ambiguity
would then arise?
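As things stand, localisation has to happen in the application layer. A minimal sketch of the idea (the formatting helper is mine, not part of SQLite):

```python
import sqlite3

# SQLite always emits '.' as the decimal point in SQL text and query
# results; a comma-decimal display has to be produced by the caller.
db = sqlite3.connect(":memory:")
(value,) = db.execute("SELECT 3.14").fetchone()

def format_decimal(x, decimal_point=","):
    # Format with '.' first, then substitute the desired separator.
    return f"{x:.2f}".replace(".", decimal_point)

print(format_decimal(value))  # 3,14
```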
-- 
Steinar Midtskogen


Re: [sqlite] Find first non-NULL values of several columns

2012-10-19 Thread Steinar Midtskogen
[Simon Slavin]

> On 20 Oct 2012, at 3:57am, Keith Medcalf  wrote:
>
>> While this is presently how SQLite works, it may not always be so.  It is 
>> possible that a future version may change the order of traversal.
>
> Or indeed
>
> <http://www.sqlite.org/pragma.html#pragma_reverse_unordered_selects>

Thanks.  That was quite useful, actually, and caught a missing "ORDER
BY" in my code.  Of course, I could stick with a specific SQLite
version, but if SQLite at some point will support NATURAL FULL OUTER
JOIN, I would want to upgrade as NATURAL FULL OUTER JOIN would
simplify my code and perhaps SQLite can optimise that better than my
current use of UNIONs and NATURAL LEFT OUTER JOIN.

-- 
Steinar Midtskogen


Re: [sqlite] Find first non-NULL values of several columns

2012-10-19 Thread Steinar Midtskogen
"Keith Medcalf"  writes:

> Alternatively, to implement your original question, write an aggregate 
> function which returns the first non-null value it comes across, and use the 
> negated unix timestamp as an explicit rowid, depending on the fact that a 
> table-scan does an in-order traversal of the table btree:

Except for the names, the code below is *exactly*, character by
character, what I first wrote, but then I realised that the order
of the values going to the step function might not be defined.  So can
I trust that I will get the order of my unix time primary key?  It
would be convenient if I can omit the timestamp argument in the
function.

>
> typedef struct fnnCtx fnnCtx;
> struct fnnCtx
> {
>     double fnn;
>     int flag;
> };
>
> SQLITE_PRIVATE void fnnStep(sqlite3_context *context, int argc, sqlite3_value **argv)
> {
>     fnnCtx* p = sqlite3_aggregate_context(context, sizeof(fnnCtx));
>     if (p && sqlite3_value_numeric_type(argv[0]) != SQLITE_NULL && p->flag == 0)
>     {
>         p->fnn = sqlite3_value_double(argv[0]);
>         p->flag = 1;
>     }
> }
>
> SQLITE_PRIVATE void fnnFinal(sqlite3_context *context)
> {
>     fnnCtx* p = sqlite3_aggregate_context(context, 0);
>     if (p && p->flag == 1)
>         sqlite3_result_double(context, p->fnn);
> }
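For comparison, the same idea expressed with Python's sqlite3 bindings (my sketch, not from the thread); it carries the same caveat that step() sees rows in whatever order the table scan produces, which in practice is rowid order:

```python
import sqlite3

# First-non-NULL aggregate, mirroring fnnStep/fnnFinal above:
# remember the first non-NULL value passed to step().
class FirstNonNull:
    def __init__(self):
        self.value = None
        self.seen = False

    def step(self, v):
        if v is not None and not self.seen:
            self.value = v
            self.seen = True

    def finalize(self):
        return self.value

db = sqlite3.connect(":memory:")
db.create_aggregate("fnn", 1, FirstNonNull)
db.executescript("""
    CREATE TABLE t(ts INTEGER PRIMARY KEY, v);
    INSERT INTO t VALUES (1, NULL), (2, 7.5), (3, 9.0);
""")
result = db.execute("SELECT fnn(v) FROM t").fetchone()[0]
print(result)  # 7.5
```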

-- 
Steinar Midtskogen


Re: [sqlite] Find first non-NULL values of several columns

2012-10-19 Thread Steinar Midtskogen
Ryan Johnson  writes:

> On 19/10/2012 3:09 PM, Steinar Midtskogen wrote:
>> I have tables with a timestamp (unix time) and columns containing
>> sensor readings which are inserted continuously.  I frequently need to
>> access the most recent values (or NULL if there is no value within the
>> latest, say, hour).  I would like to do something like:

> Is each column associated with a different sensor?

Yes.

> If so, do sensors routinely generate data at the same time and combine their 
> entries?

They usually generate data at the same time, but not always.

> Do sensors emit data frequently?

Usually every 5 minutes, but since the timestamp is unix time, 1
second frequency is supported.

> If any of the above is false, I'd definitely store each column in its
> own table, because the timestamp (the only possible thing to share) is
> replicated anyway.

I have many tables, but each has up to 30 or so values.  The table
division corresponds to the sensor source.  One source will typically
give a bunch of sensor values every 5 minutes (some values might be
NULL, though).  This makes INSERTs trivial.  I suppose I could put
everything in a single table by using UPDATEs, but since the different
sources have different timestamps and even intervals, I've been
thinking that it could make the database consist of mostly NULLs and
explode in size (say, if one table has readings every second, a single
table would have to have 3599 NULLs for each value in a column that
only has one value per hour).

The other extreme, one table for each sensor, has its problems as
well.  Quite often, I'd like to combine several values in one
expression (e.g. subtract one sensor value from another), and that may
be a bit tricky if they are located in different tables, especially if
they don't have the same timestamps.

Anyway, for this purpose I made a module which allows me to combine
several tables into a read-only virtual table similar to a natural
full outer join (I think).  Also, the virtual table will fill out all
missing values using interpolation.  I've also added functionality for
calculating time weighted averages.

It's pretty neat.  I wrote a perl wrapper as well which will look at
the columns that I want to access and create the necessary virtual
tables, so I don't have to worry about which tables are involved.

So if I want the outside temperature at 14 October 12:12:12 even if
there is no table entry for that time, I can do:

$ select.pl 'temp_out WHERE unix_time = strftime("%s", "2012-10-14 12:12:12")'
0.93384

And in this case the result is an interpolation using the nearest
entries at 12:10:00 and 12:15:00.  If I want the 24 hour mean around
that time I do:

$ select.pl 'temp_out__avg_24h WHERE unix_time = strftime("%s", "2012-10-14 12:12:12")'
1.70068

where the __avg_24h bit is parsed so that an appropriate virtual
table gets created that will give me the 24h average.  And I can
combine columns from different tables, as in this case:

$ select.pl 'temp_in, temp_out, temp_in - temp_out WHERE unix_time = strftime("%s", "2012-10-14 12:12:12")'
21.52852   0.93384   20.59468

where temp_in is in one table and temp_out is in another and neither
table has a row for 12:12:12.  And I can even use different averages
in the same expression, which will cause two virtual tables to become
created behind the scenes:

$ select.pl 'temp_out__avg_1y, temp_out__avg_24h, temp_out__avg_1y - temp_out__avg_24h WHERE unix_time = strftime("%s", "2010-01-01 00:00:00")'
4.40974   -7.61093   12.02068
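The interpolation itself can be sketched in a few lines: pick the nearest rows on either side of the requested timestamp and interpolate linearly. The schema and values below are illustrative, not the actual weather database:

```python
import sqlite3

# Linear interpolation between the bracketing samples of a timestamp.
# Assumes t lies within the sampled range.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE temp_out(unix_time INTEGER PRIMARY KEY, value REAL);
    INSERT INTO temp_out VALUES (600, 1.0), (900, 4.0);
""")

def interpolate(db, t):
    t0, v0 = db.execute(
        "SELECT unix_time, value FROM temp_out "
        "WHERE unix_time <= ? ORDER BY unix_time DESC LIMIT 1", (t,)).fetchone()
    t1, v1 = db.execute(
        "SELECT unix_time, value FROM temp_out "
        "WHERE unix_time >= ? ORDER BY unix_time ASC LIMIT 1", (t,)).fetchone()
    if t1 == t0:           # exact hit
        return v0
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

print(interpolate(db, 750))  # 2.5, halfway between the two samples
```

Both lookups are single index seeks on the primary key, so this stays fast on large tables.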

I'm getting off topic, but the essence is that I want to do flexible
queries with as little typing as possible.  If it can be done in plain
SQL, I'll do that.  If not, if it can be done by using user functions
or modules, I'll do that.  If I can't do that either, I'll add an
outer wrapper translating quasi SQL into SQLite commands.  For the
problem in this thread it seems that I can get away with adding new
aggregate functions.

-- 
Steinar Midtskogen


Re: [sqlite] Find first non-NULL values of several columns

2012-10-19 Thread Steinar Midtskogen
Thank you for all suggestions.

I will need to do such queries often, so it's just a matter of saving
the typing.  Unfortunately, views aren't going to be very practical
either, because there are a lot of tables and columns (100+), and new
ones will be added.

The actual use case is as follows:

I have tables with a timestamp (unix time) and columns containing
sensor readings which are inserted continuously.  I frequently need to
access the most recent values (or NULL if there is no value within the
latest, say, hour).  I would like to do something like:

SELECT coalesce(col_1), ..., coalesce(col_n) FROM v WHERE unix_time > strftime('%s', 'now', '-1 hour') ORDER BY unix_time DESC;

So I would typically want to access the last non-NULL value because of
the DESC keyword.  But if I understand things correctly, a statement
like above will never work because an aggregate function reads the
data in no particular order regardless of the ORDER BY statement.

I like Igor's suggestion.  Although not quite universal, it's clever.
Ryan's suggestion should work well, except that I will need a first(a,
b) and last(a, b) function (if I want to support both ascending and
descending order) and I can leave out the ORDER BY part.  So:

SELECT last(col_1, unix_time), ..., last(col_n, unix_time) FROM v WHERE unix_time > strftime('%s', 'now', '-1 hour');

Yes, it will have to run through the whole set, whereas a separate SELECT
col_x FROM v WHERE unix_time > strftime('%s', 'now', '-1 hour') AND
col_x IS NOT NULL ORDER BY unix_time DESC LIMIT 1 per column can stop early.
But this will not be a problem for me since I want to have a modest
upper limit (1 hour) anyway.
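A sketch of such a last(value, timestamp) aggregate using Python's sqlite3 bindings (my implementation, not from the thread; a C version would follow the fnnStep/fnnFinal pattern quoted elsewhere in the thread):

```python
import sqlite3

# last(value, ts): keep the non-NULL value with the greatest timestamp,
# so no ORDER BY (and no assumption about scan order) is needed.
class Last:
    def __init__(self):
        self.best_ts = None
        self.best = None

    def step(self, value, ts):
        if value is not None and (self.best_ts is None or ts > self.best_ts):
            self.best_ts = ts
            self.best = value

    def finalize(self):
        return self.best

db = sqlite3.connect(":memory:")
db.create_aggregate("last", 2, Last)
db.executescript("""
    CREATE TABLE v(unix_time INTEGER PRIMARY KEY, col_1);
    INSERT INTO v VALUES (100, 1.5), (200, NULL), (300, 2.5);
""")
latest = db.execute("SELECT last(col_1, unix_time) FROM v").fetchone()[0]
print(latest)  # 2.5
```

A first(value, ts) for ascending order is the same with the comparison reversed.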

-- 
Steinar Midtskogen


Re: [sqlite] Find first non-NULL values of several columns

2012-10-19 Thread Steinar Midtskogen
Simon Slavin  writes:

> Rows do not have an order.  Without an ORDER BY clause SELECT can return rows 
> in a random order if it wants.  If you would like to define 'order' for me I 
> can give you a SELECT which will find the first non-NULL value in a column, 
> probably something like
>
> SELECT c FROM v WHERE c IS NOT NULL ORDER BY rowid LIMIT 1

Ok, so let's say the table v (with "a" as the primary key) is:

a|b|c|d|e|f
0| | |2| |9
1|1| |3| |8
2|1| |4|4|7
3| |5|5|4|6
4|1|6|6| |5

Then the question becomes, is there a more convenient way to do:

SELECT * FROM (SELECT b FROM v WHERE b IS NOT NULL ORDER BY a LIMIT 1),
  (SELECT c FROM v WHERE c IS NOT NULL ORDER BY a LIMIT 1),
  (SELECT d FROM v WHERE d IS NOT NULL ORDER BY a LIMIT 1),
  (SELECT e FROM v WHERE e IS NOT NULL ORDER BY a LIMIT 1),
  (SELECT f FROM v WHERE f IS NOT NULL ORDER BY a LIMIT 1);

to get 1|5|2|4|9?
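For reference, a close variant of the above with scalar subqueries in the SELECT list produces exactly that row (sketch using Python's sqlite3 module; blanks in the table become NULLs):

```python
import sqlite3

# The table from the message, with NULL where cells are blank.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE v(a INTEGER PRIMARY KEY, b, c, d, e, f)")
db.executemany("INSERT INTO v VALUES (?,?,?,?,?,?)", [
    (0, None, None, 2, None, 9),
    (1, 1,    None, 3, None, 8),
    (2, 1,    None, 4, 4,    7),
    (3, None, 5,    5, 4,    6),
    (4, 1,    6,    6, None, 5),
])
# One ordered, NULL-filtered scalar subquery per column.
row = db.execute("""
    SELECT (SELECT b FROM v WHERE b IS NOT NULL ORDER BY a LIMIT 1),
           (SELECT c FROM v WHERE c IS NOT NULL ORDER BY a LIMIT 1),
           (SELECT d FROM v WHERE d IS NOT NULL ORDER BY a LIMIT 1),
           (SELECT e FROM v WHERE e IS NOT NULL ORDER BY a LIMIT 1),
           (SELECT f FROM v WHERE f IS NOT NULL ORDER BY a LIMIT 1)
""").fetchone()
print(row)  # (1, 5, 2, 4, 9)
```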

-- 
Steinar Midtskogen


[sqlite] Find first non-NULL values of several columns

2012-10-19 Thread Steinar Midtskogen
Suppose I have this table v:

a|b|c|d|e
 | |2| |9
1| |3| |8
1| |4|4|7
 |5|5|4|6
1|6|6| |5

And I would like to return the first non-NULL value of each column.  I
can do something like:

SELECT * FROM (SELECT a FROM v WHERE a IS NOT NULL LIMIT 1),
  (SELECT b FROM v WHERE b IS NOT NULL LIMIT 1),
  (SELECT c FROM v WHERE c IS NOT NULL LIMIT 1),
  (SELECT d FROM v WHERE d IS NOT NULL LIMIT 1),
  (SELECT e FROM v WHERE e IS NOT NULL LIMIT 1);

to get 1|5|2|4|9.  But is there a more convenient way?  I don't think
I can define an aggregate function (similar to coalesce) since the
order will not be defined.  But perhaps I'm missing the obvious
solution.

-- 
Steinar Midtskogen


[sqlite] xColumn called twice for a value

2012-10-19 Thread Steinar Midtskogen
Hi

Suppose I have a virtual table v with a column c.  Then, if I do:

  SELECT c, c+1, c-1 FROM v;

xColumn() will then only be called once per row.  Which is all good.

But if I add any constraints on c, e.g.:

  SELECT c, c+1, c-1 FROM v WHERE c IS NOT NULL;

Then xColumn() will get called a second time for c, which seems
wasteful if c takes time to compute.  Or very bad if the computation
has side effects (which it probably shouldn't have).

Is there any reason for this behaviour?

My SQLite version is 3.7.14.

-- 
Steinar Midtskogen


Re: [sqlite] Read-only media

2012-09-07 Thread Steinar Midtskogen
[Richard Hipp]

> If the last writer to the database file crashed and left a hot
> journal<http://www.sqlite.org/atomiccommit.html#section_4_2>then the
> next reader to come along must rollback that journal before it can
> start reading, and that will require write access.

Can similar things happen if you copy the database file (using
cp/tar/whatever) while an application wants to write to it?
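It can: a plain file copy taken while a writer is active may capture a half-committed state. SQLite's online backup API avoids this by copying a consistent snapshot; a sketch with Python's sqlite3 bindings (paths are illustrative):

```python
import sqlite3, tempfile, os

# Connection.backup() copies the database page by page under SQLite's
# locking rules, so the copy is consistent even if another connection
# writes concurrently (the backup restarts or catches up as needed).
src_path = os.path.join(tempfile.mkdtemp(), "live.db")
dst_path = src_path + ".bak"

src = sqlite3.connect(src_path)
src.execute("CREATE TABLE t(x)")
src.execute("INSERT INTO t VALUES (42)")
src.commit()

dst = sqlite3.connect(dst_path)
src.backup(dst)  # consistent snapshot, unlike cp on a live file
dst_value = dst.execute("SELECT x FROM t").fetchone()[0]
print(dst_value)  # 42
```

The sqlite3 command-line shell's `.backup` command uses the same API.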

-- 
Steinar Midtskogen


Re: [sqlite] how to update the Moving average value

2012-08-13 Thread Steinar Midtskogen
Earlier this year I was bothering this list with many questions about
the sqlite virtual table support, because I needed (among other
things) an efficient way to compute moving averages.  And here is the
result, free for anyone to play with:

 http://voksenlia.net/sqlite3/interpolate.c

Rather than to average the last X values, it assumes that a table has
a timestamp associated with the values that will be averaged, and
what's going to be calculated is the average for a given period.  The
average of angles is also supported.

It's a module which will take existing tables as input and create a
virtual table with new columns for the moving average.  It also allows
you to look up any timestamp regardless of whether that timestamp
exists in the underlying table(s).  The returned value will be
interpolated.

It was written to be efficient on big datasets, but may become slow if
used with virtual tables as input (or used recursively).
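The core of a time-weighted average over a period can be sketched independently of the module: integrate linearly between samples and divide by the period length (the sample data below is made up):

```python
# Time-weighted average via the trapezoidal rule: each segment between
# consecutive samples contributes its mean value times its duration.
samples = [(0, 10.0), (600, 10.0), (900, 40.0)]  # (unix_time, value)

def time_weighted_avg(samples):
    total = 0.0
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        total += (v0 + v1) / 2.0 * (t1 - t0)
    return total / (samples[-1][0] - samples[0][0])

print(time_weighted_avg(samples))  # 15.0
```

This weights values by how long they were in effect, which matters when samples arrive at irregular intervals as described above.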

My main use for this is to access weather data that I have stored.
For instance, I have temperature (temp_out) stored somewhat
irregularly at roughly 5 minute intervals, and I can now look up
"temp_out" for any timestamp.  Also, if I want the moving 24h
temperature, I specify that in the virtual table declaration, and I
can look up a new column "temp_out__avg" which will give me the moving
average and it will work even if there are gaps in the data.  More
details are described in the comments of the C file.  The module
allowed me to make a web interface for my weather data:
 
 http://voksenlia.net/met/data/plot.php   (Norwegian only)

-- 
Steinar Midtskogen


Re: [sqlite] Views Optimization

2012-06-03 Thread Steinar Midtskogen
"Black, Michael (IS)"  writes:

> Perhaps the query flattener should ignore any nondeterministic functions?  
> Are there any others besides random() or date/time functions?

User defined functions.  Or views on virtual tables.

-- 
Steinar


[sqlite] Force drop table

2012-05-31 Thread Steinar Midtskogen
Is there a way to force a virtual table to be deleted from a database
file (other than to rebuild the whole database without the table)?
When the interface of a virtual table changes, "drop table" on a table
created with the old interface might not work, since "drop table" will
call the constructor, and if the interface has changed, the constructor
might fail.
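One commonly suggested escape hatch, to be used with care and on a copy of the database, is to delete the table's row from sqlite_master with writable_schema enabled; that never invokes the module's constructor. A sketch (demonstrated on an ordinary table, since creating a virtual table here would itself require a loaded module; "broken" is a stand-in name):

```python
import sqlite3, tempfile, os

# CAUTION: writable_schema bypasses normal safety checks.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
db = sqlite3.connect(path)
db.execute("CREATE TABLE broken(x)")
db.commit()
db.close()

db = sqlite3.connect(path)
db.execute("PRAGMA writable_schema=ON")
db.execute("DELETE FROM sqlite_master WHERE type='table' AND name='broken'")
db.commit()
db.execute("PRAGMA writable_schema=OFF")
db.close()

db = sqlite3.connect(path)
db.execute("VACUUM")  # rebuild the file to reclaim the orphaned pages
names = [r[0] for r in db.execute("SELECT name FROM sqlite_master")]
print(names)  # []
```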

-- 
Steinar Midtskogen


[sqlite] FULL OUTER JOIN

2012-05-11 Thread Steinar Midtskogen
http://www.sqlite.org/omitted.html lists what's not supported and
"those features near the top of the list are likely to be added in the
near future".  The list has been unchanged for a few years now, and
RIGHT and FULL OUTER JOIN are at the top of the list.

I wonder, is it still likely that RIGHT and FULL OUTER JOIN will be
added?

Since SQLite is free software I hope asking doesn't make me sound
impatient or critical in any way regarding the progress.  I'm just
curious.

I know that these joins can be emulated by supported SQL, so having
SQLite support them is mainly about convenience and possibly speed, I
suppose.

-- 
Steinar


Re: [sqlite] db->pVtabCtx and xCreate accessing a virtual table

2012-04-26 Thread Steinar Midtskogen
Richard Hipp  writes:

> On Thu, Apr 26, 2012 at 5:07 PM, Steinar Midtskogen
> wrote:
>
>> My xCreate has to run some queries in order to build its declare
>> statement.  If a virtual table is queried inside xCreate causing a
>> another xCreate to be called, the following sqlite3_declare_vtab will
>> fail because pVtabCtx gets erased.
>>
>
> Fixed here:  http://www.sqlite.org/src/info/696a5a40bb

Thanks.  I can confirm that it fixes everything for me.

One other thing which perhaps is surprising, but it can be argued that
it is a feature rather than a bug, is that a virtual table can be
queried before its xCreate gets finished and its sqlite3_declare_vtab
is called.  So if someone says CREATE VIRTUAL TABLE x ... and its
xCreate has a SELECT ... FROM x, it won't get "no such table".  Rather
it will call itself and eventually crash unless xCreate takes care to
test for the recursion somehow.

-- 
Steinar


[sqlite] db->pVtabCtx and xCreate accessing a virtual table

2012-04-26 Thread Steinar Midtskogen
The sqlite3 struct has a pVtabCtx pointer.  It seems to me that it
will be shared between multiple xCreate calls active simultaneously and
in that case cause disaster.  This can happen even if there is no
thread concurrency going on using the same database connection.

My xCreate has to run some queries in order to build its declare
statement.  If a virtual table is queried inside xCreate, causing
another xCreate to be called, the following sqlite3_declare_vtab will
fail because pVtabCtx gets erased.

Is this a design flaw in sqlite, or is it intentionally designed this
way for a good reason?

The only workaround that I can think of is to fork() in xCreate having
the child build the statement for the parent's sqlite3_declare_vtab(),
pass that to the waiting parent and exit.  But in my case the child
would also have to pass a lot of other variable length information
back to the parent, which will involve a lot of IPC mess to implement.

-- 
Steinar


Re: [sqlite] Permissions

2012-04-23 Thread Steinar Midtskogen
[Richard Hipp]

> On Sun, Apr 22, 2012 at 12:40 PM, Steinar Midtskogen
> wrote:
>
>>
>> Any reason why sqlite doesn't use the same file permissions as the
>> database file when creating these extra files?
>>
>>
> There was a change in version 3.7.11 to do exactly that.
> http://www.sqlite.org/src/info/84b324606a

Oh, great!  I'm just at 3.7.9, the latest on CPAN.  This was the
fastest response to a feature request I've ever experienced.  Fixed a
few weeks before I asked the question!

-- 
Steinar


Re: [sqlite] Permissions

2012-04-22 Thread Steinar Midtskogen
[Simon Slavin]

> The solution I came up with is that the database file owner also
> uses Apache to look at it: I use web-facing database administration
> software rather than opening the database in another application.
> (I wrote a simple one myself in PHP and JavaScript.)  However this
> is unacceptable for some users.

That gave me an idea, which should solve the problem for me.  Only two
applications access the database: apache or the sqlite3 commandline
tool.  So I simply chowned the sqlite3 application and made it setuid
apache.

It doesn't solve the general case, though, where any application owned
by any user in a certain group should be able to access the database.

> You're using WAL mode.  DELETE mode is the default behaviour: when
> the last connection to the database is closed, the journal is
> deleted.  But you can change this to TRUNCATE or some other value
> that suits you.  That way, the files will not have to be remade.  So
> then you would …

I chose WAL since I'd like to have as much concurrency as possible.

If TRUNCATE means that the files will always be present, never
deleted, then I suppose that also could solve my problem, since the
file then could be made group writeable.
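That is indeed what TRUNCATE mode does for the rollback journal: the file is truncated to zero length at commit rather than deleted, so it keeps its owner and group between transactions. A quick sketch (file paths are illustrative):

```python
import sqlite3, tempfile, os

# In TRUNCATE journal mode the -journal file persists (at zero bytes)
# after each commit instead of being unlinked, so its permissions and
# group are set once rather than on every write transaction.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
db = sqlite3.connect(path)
mode = db.execute("PRAGMA journal_mode=TRUNCATE").fetchone()[0]
db.execute("CREATE TABLE t(x)")
db.execute("INSERT INTO t VALUES (1)")
db.commit()

print(mode)                               # truncate
print(os.path.exists(path + "-journal"))  # True: file kept on disk
```

Note this applies to rollback-journal modes; WAL's -wal/-shm files are a separate mechanism and are normally removed when the last connection closes.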


Any reason why sqlite doesn't use the same file permissions as the
database file when creating these extra files? 
-- 
Steinar


Re: [sqlite] Permissions

2012-04-22 Thread Steinar Midtskogen
Stephan Beal  writes:

> Try the sticky bit:
>
> chown user:apache theDir
> chmod 4775 theDir

I think the only effect of that is to restrict anyone but root or the
owner of a file from deleting or renaming an otherwise writeable file
in that directory.

-- 
Steinar


Re: [sqlite] Permissions

2012-04-22 Thread Steinar Midtskogen
Patrik Nilsson  writes:

> You can try setting your user as member of group apache.

That's already done, but the trouble is that when the shm and wal
files are created by and therefore owned by "apache", then "user"
can't change that file unless it's group writeable.  Having apache run
with umask 002 should fix this, but I wonder if there is another
workaround (and I haven't yet figured out how to configure apache to
do this, anyway).

-Steinar

>
> On 04/22/2012 10:31 AM, Steinar Midtskogen wrote:
>> This might be slightly off topic, but perhaps a common problem for
>> many sqlite users.
>> 
>> I have a database (wal mode) that apache (the web server) needs to
>> access, readonly.  Since it needs to be able to lock it for reading,
>> apache needs write access.  So the database has these permissions:
>> 
>> -rw-rw-r--  1 userapache  1837704192 2012-04-22 09:58 database.db
>> 
>> The directory is also group writeable.
>> 
>> The trouble is that when apache is accessing the database, the
>> database file owner can't access it, not even for reading.  The result
>> is "unable to open database file".  I believe that the cause is that
>> apache creates these files:
>> 
>> -rw-r--r--  1 apache  apache   32768 2012-04-22 10:15 database.db-shm
>> -rw-r--r--  1 apache  apache   0 2012-04-22 09:58 database.db-wal
>> 
>> which other users have no write access to.  So access to the database
>> is locked until sqlite remove these files.
>> 
>> Is there a way to work around this, other than to set umask 002 for
>> apache?
>> 


[sqlite] Permissions

2012-04-22 Thread Steinar Midtskogen
This might be slightly off topic, but perhaps a common problem for
many sqlite users.

I have a database (wal mode) that apache (the web server) needs to
access, readonly.  Since it needs to be able to lock it for reading,
apache needs write access.  So the database has these permissions:

-rw-rw-r--  1 userapache  1837704192 2012-04-22 09:58 database.db

The directory is also group writeable.

The trouble is that when apache is accessing the database, the
database file owner can't access it, not even for reading.  The result
is "unable to open database file".  I believe that the cause is that
apache creates these files:

-rw-r--r--  1 apache  apache   32768 2012-04-22 10:15 database.db-shm
-rw-r--r--  1 apache  apache   0 2012-04-22 09:58 database.db-wal

which other users have no write access to.  So access to the database
is locked until sqlite removes these files.

Is there a way to work around this, other than to set umask 002 for
apache?

-- 
Steinar


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-16 Thread Steinar Midtskogen
To answer my own question, whether there is an efficient way to find
max() of a monotonically increasing column in a virtual table: what is
needed is to make sure that xBestIndex sets orderByConsumed, and that
the module takes care of all sorting.

-- 
Steinar


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-15 Thread Steinar Midtskogen
[Gerry Snyder]

> At worst you could use another table to keep track of the maximum and
> minimum, and update it with triggers when something is added to or deleted
> from the virtual table.

My module knows what the maximum and minimum values are at all times.
It also knows that the column is sorted.  The trouble is that it
doesn't know that the values it can produce for that column will be
fed to a max() and min() function.  If it did, it could simply just
return one value.

I might be missing something, though.

-- 
Steinar


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-15 Thread Steinar Midtskogen
[Kit]

> 2012/4/15 Steinar Midtskogen :
>> So, is there really a way to create an index in a virtual table, or a
>> way to emulate this?
>
> Why? You don't need this. Use index on base tables.

My base tables are indexed.  Let's say I want to make a very simple
virtual table this way:

 create virtual table vtab using copy(indexed_table);

which simply maps any query for vtab to indexed_table and returns that.
So let's say that indexed_table has an integer column "key" which is
also a primary key.  So "select max(key) from indexed_table" will be
fast no matter how big it is and the module can find this value in a
blink.  What I would like is to have "select max(key) from vtab"
run fast as well, without having to run through the billion rows in
the index.

So what happens when I run "select max(key) from vtab"?  Well, all
xFilter will know is that it needs to produce the "key" column, and there
should be an "order by key" clause as well, but even if we can assume
that what we're dealing with is a sorted column, and xFilter could
look up the max in no time, xFilter doesn't know that the query is for
the max value.  Can my module do anything better than to produce all
the rows for sqlite to feed into the max aggregate function?

>> My xRowid function simply returns the value of the "unix_time" column,
>> but even "select max(rowid)" is equally slow.
>> Steinar
>
> Why you need "select max(rowid)"? Something is wrong in your data
> design. Use autoincrement.

I don't need it, but a virtual table must provide one.  I'm not sure
why.

-- 
Steinar


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-15 Thread Steinar Midtskogen
[Simon Slavin]

> On 15 Apr 2012, at 1:31pm, Steinar Midtskogen  wrote:
>
>> Another question about max()/min() optimisation.  Is there a way I can
>> implement a virtual table so that max()/min() of a sorted
>> (incrementing) column (which could be an integer primary key in a
>> regular table) gets fast?
>
> The max() and min() functions work instantly if the columns they're
> looking at are indexed.  They simply find the first or last entry in
> the index.  So when defining your virtual table routines, just make
> sure your key columns have an index, and that your xBestIndex method
> finds the right index.

According to the "Using SQLite" book, "you cannot create an index on
a view or on a virtual table".

Also, when declaring the virtual table using sqlite3_declare_vtab the
book says: "any constraints, default values, or key definitions within
the table definition are also ignored - this includes any definition of
INTEGER PRIMARY KEY as a ROWID alias".

So, is there really a way to create an index in a virtual table, or a
way to emulate this?

My xRowid function simply returns the value of the "unix_time" column,
but even "select max(rowid)" is equally slow.

-- 
Steinar


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-15 Thread Steinar Midtskogen
Hello again,

Another question about max()/min() optimisation.  Is there a way I can
implement a virtual table so that max()/min() of a sorted
(incrementing) column (which could be an integer primary key in a
regular table) gets fast?

For example,

sqlite> explain query plan select max(unix_time) from vtab;
0|0|0|SEARCH TABLE vtab VIRTUAL TABLE INDEX 0: (~1 rows)

Currently, "select max(unix_time) from vtab" causes SQLite to search
through millions of rows, which may take nearly half a minute for my
table, no faster than other non-sorted columns.

I've added special treatment of this sorted "unix_time" column in
xBestIndex, so that a query like:

 select max(unix_time) from vtab where unix_time > strftime("%s", "2012-04-14");

runs fast (i.e. then my table will only look through a few rows at the end).


Perhaps what I'm asking is whether it's possible to add a special
treatment for max() and min() in a virtual table.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Steinar Midtskogen
Puneet Kishor  writes:

> If you want the results in separate columns, you can do something like
>
> SELECT Min(a) minimum, 'none' maximum FROM t UNION ALL SELECT 'none' minimum, 
> Max(a) maximum FROM t;

Then it does a full scan again.

But Igor's suggestion "SELECT (SELECT min(unix_time) FROM table),
(SELECT max(unix_time) FROM table)" works fine, and means less code.

Thanks!

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Steinar Midtskogen
Alessandro Marzocchi  writes:

> What does EXPLAIN QUERY PLAN says?

sqlite> EXPLAIN QUERY PLAN SELECT min(unix_time) FROM table;
0|0|0|SEARCH TABLE table USING INTEGER PRIMARY KEY (~1 rows)

sqlite> EXPLAIN QUERY PLAN SELECT max(unix_time) FROM table;
0|0|0|SEARCH TABLE table USING INTEGER PRIMARY KEY (~1 rows)

sqlite> EXPLAIN QUERY PLAN SELECT min(unix_time), max(unix_time) FROM table;
0|0|0|SCAN TABLE table (~100 rows)

I suppose a query for a single min/max gets optimised, while a query
involving multiple columns doesn't.

I have a much bigger table as well, and on that one the speedup is in
the millions to run two SELECTs.  It's hard to guess that there will
be such a difference, but I suppose I should be happy that there is at
least an optimised way to get min and max for the integer primary key.
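For the record, a small runnable illustration of the two query shapes,
using Python's built-in sqlite3 module and a throwaway table (the table
and column names here are invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (unix_time INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

# Combined aggregates in one SELECT: SQLite falls back to a full scan.
print(conn.execute(
    "EXPLAIN QUERY PLAN SELECT min(unix_time), max(unix_time) FROM t"
).fetchall())

# Two scalar subqueries: each takes the optimised single-min/max path.
lo, hi = conn.execute(
    "SELECT (SELECT min(unix_time) FROM t), (SELECT max(unix_time) FROM t)"
).fetchone()
print(lo, hi)  # 0 999
```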

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Why are two select statements 2000 times faster than one?

2012-04-13 Thread Steinar Midtskogen
Hello,

I have a table with "unix_time" as primary key and I want to get the
minimum and maximum values of "unix_time".  When I do:

  SELECT min(unix_time), max(unix_time) from table;

it is very slow.  It takes about 250ms, nearly everything in the
step() call.

However, if I do:

  SELECT min(unix_time) FROM table; SELECT max(unix_time) FROM table;

to get the same values, it takes a fraction of the time.  The speedup
is more than 2000x.

Why?

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Select time('now')

2012-04-13 Thread Steinar Midtskogen
YAN HONG YE  writes:

> the current time is 15:15
> when I use this following command:
> sqlite> Select time('now');
> return 07:15:42
> not current time,why?

Read http://www.sqlite.org/lang_datefunc.html

"Format 11, the string 'now', is converted into the current date and
time as obtained from the xCurrentTime method of the sqlite3_vfs
object in use. Universal Coordinated Time (UTC) is used."
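If local time is wanted instead, the documented 'localtime' modifier
converts the UTC result; a quick check with Python's sqlite3 module:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
utc_now, local_now = conn.execute(
    "SELECT time('now'), time('now', 'localtime')"
).fetchone()
# Both are HH:MM:SS strings; they differ by the host's UTC offset.
print(utc_now, local_now)
```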

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] xCreate calling itself

2012-04-06 Thread Steinar Midtskogen
I ran gdb to find out more, and here's how sqlite3_declare_vtab()
exits:

102133if( !db->pVtabCtx || !(pTab = db->pVtabCtx->pTab) ){
(gdb) n
102134  sqlite3Error(db, SQLITE_MISUSE, 0);
(gdb) print db->pVtabCtx
$1 = (VtabCtx *) 0x0

To explain further what I'm doing: My virtual table basically takes a
number of tables as arguments and creates a new table like a natural
join of the input tables, except that where a natural join would
create NULL values, the virtual table will create interpolated
values.  The module does a number of other things as well, but I don't
think that's relevant here.  So I can do:

create virtual table v1 using interpolate(tab1, tab2, tab3);

But then this will fail:

create virtual table v2 using interpolate(v1, tab4);

When I run in gdb, I see that pVtabCtx is nonzero in xCreate for v2
until it runs sqlite3_prepare_v2() with a statement which selects from
v1 (causing xCreate to be called for v1).  After sqlite3_prepare_v2()
returns, pVtabCtx is 0.

Is this a bug, or is it intentional (possibly for a good reason)?

My xCreate function needs to run queries on the input tables to find
the column names needed for the vtab declaration.

-Steinar

Steinar Midtskogen  writes:

> Hello,
>
> In certain cases when I try to create a virtual table,
> sqlite3_declare_vtab() returns SQLITE_MISUSE.  Nothing appears to be
> wrong with the string I pass to the sqlite3_declare_vtab().  That is,
> if I execute the "create table" statement in that string, a regular
> table will be created - no error.
>
> Is there a way to get more hints why sqlite3_declare_vtab() fails?
>
> The only thing special when it fails is that xCreate, before the call
> to sqlite3_declare_vtab, has (successfully) run and finished
> statements (prepare/step/finalize) which involve a virtual table using
> the very same module.  Is it forbidden to nest virtual tables this
> way?  That is, having xCreate trigger a call to itself (using
> different arguments).  I'm pretty sure that I don't use non-const
> static variables which could mess up things.  Perhaps sqlite3 does?
>
> -- 
> Steinar
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] xCreate calling itself

2012-04-06 Thread Steinar Midtskogen
Hello,

In certain cases when I try to create a virtual table,
sqlite3_declare_vtab() returns SQLITE_MISUSE.  Nothing appears to be
wrong with the string I pass to the sqlite3_declare_vtab().  That is,
if I execute the "create table" statement in that string, a regular
table will be created - no error.

Is there a way to get more hints why sqlite3_declare_vtab() fails?

The only thing special when it fails is that xCreate, before the call
to sqlite3_declare_vtab, has (successfully) run and finished
statements (prepare/step/finalize) which involve a virtual table using
the very same module.  Is it forbidden to nest virtual tables this
way?  That is, having xCreate trigger a call to itself (using
different arguments).  I'm pretty sure that I don't use non-const
static variables which could mess up things.  Perhaps sqlite3 does?

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Selecting every nth row efficiently

2012-04-04 Thread Steinar Midtskogen
Let's say that I want to plot the entire column and the plot is 1000
pixels wide.  Then I only need 1000 samples, so I could do this:

SELECT timestamp, sample FROM mytable GROUP BY timestamp * 1000 / ((SELECT 
max(timestamp) FROM mytable) - (SELECT min(timestamp) FROM mytable));

(timestamp is the primary key)
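A runnable sketch of the fixed-bucket variant (one output row per time
slice, keeping min and max per slice as discussed further down the
thread), using Python's sqlite3 module with synthetic 1 Hz data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (timestamp INTEGER PRIMARY KEY, sample REAL)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)",
                 [(t, t * 0.1) for t in range(3600)])   # one hour at 1 Hz

# One row per 300-second slice; keeping min and max per slice preserves
# the extremes a single-sample-per-slice downsample would lose.
rows = conn.execute("""
    SELECT timestamp / 300 AS bucket, min(sample), max(sample)
    FROM mytable
    GROUP BY bucket
    ORDER BY bucket
""").fetchall()
print(len(rows))  # 12
```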

-Steinar

Steinar Midtskogen  writes:

> [Jean-Christophe Deschamps]
>
>> You're going to have at most one random sample in every slice of 320
>> s.  The GROUP BY clause will select only one for you and the query can
>> be as simple as:
>>
>> select sample from from mytable group by timestamp / 320 order by
>> timestamp;
>
> Ah.  I didn't think of that.  It's even better than getting every nth
> row, since I get one sample for a fixed period, which is what I really
> want.  And yet better, I suppose I could do something like SELECT
> min(sample), max(sample) FROM mytable GROUP BY timestamp / 3600 and
> use financebars or similar in gnuplot to avoid missing the extremes in
> the plot, making it appear more or less identical as if I had plotted
> every value.
>
> Thanks!
> -- 
> Steinar
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Selecting every nth row efficiently

2012-04-04 Thread Steinar Midtskogen
[Jean-Christophe Deschamps]

> You're going to have at most one random sample in every slice of 320
> s.  The GROUP BY clause will select only one for you and the query can
> be as simple as:
>
> select sample from from mytable group by timestamp / 320 order by
> timestamp;

Ah.  I didn't think of that.  It's even better than getting every nth
row, since I get one sample for a fixed period, which is what I really
want.  And yet better, I suppose I could do something like SELECT
min(sample), max(sample) FROM mytable GROUP BY timestamp / 3600 and
use financebars or similar in gnuplot to avoid missing the extremes in
the plot, making it appear more or less identical as if I had plotted
every value.

Thanks!
-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Selecting every nth row efficiently

2012-04-04 Thread Steinar Midtskogen
[Jean-Christophe Deschamps]

>>If your sampling is essentially regular, why not make it
>>
>>select ... where timestamp % N between min_interval and max_interval
>>
>> N being the typical time delta of your n rows above and interval
>> bounds reducing the possiblity of gross under- and over-sampling.
>> May need adjustment but the query should run faster than full table
>> load.
>
> Let's assume your timestamps are unix epoch and your device samples on
> an average 1-second basis (delays in processing could cause two
> samples with the same timestamp or no sample with a given
> epoch). You're certain that there is always at least one sample
> between seconds (say) 8 and 11 of every hour.  You want only one
> sample every hour, taken randomly(*) between seconds 8 and 11.
>
> select sample from from mytable where timestamp % 86400 between 8 and
> 11 group by timestamp / 86400;
>
> (*) SQL[ite] doesn't give you a chance to specify which row will
> represent each group, hence the randomly in 8..11
>
> Would that do what you want?

This is a good suggestion.  A drawback is that the interval can't be
too small otherwise there is a risk that a sample would be missed.  So
I will get more samples than I need.  In your example, if there is a
sample every second more or less, I would usually get 3-4 samples
every hour instead of just one which would suffice.

In reality I have around 5 samples every 300 seconds, and almost
certainly at least one sample within 320 seconds.  Then, most of the 5
samples will typically be within a few seconds, and rarely spread
evenly across the 300 second interval.  So the sampling is only
semi-regular.

I will mostly be selecting from a virtual table, so another option for
me could be to add a column which simply returns an incrementing
number, which I can use % on, or perhaps better a column containing a
flag which is set every time the timestamp has increased more than a
certain limit since the last time the flag was set.  But something
less ad hoc is better.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Selecting every nth row efficiently

2012-04-03 Thread Steinar Midtskogen
I have a big database with timestamps and sensor readings, which I
access with SELECT to have gnuplot draw graphs.  However, sometimes I
have readings every minute and want to plot several years of data, and
feeding everything to gnuplot is overkill.  In these cases it would be
sufficient to select only every nth row.  Is there a way to do this
efficiently (i.e. more efficient than just having gnuplot, which is
pretty fast, to plot everything)?

My rowid isn't increasing with something predictable, so I can't do
something like WHERE rowid % n = 0.  I can use WHERE random() % n = 0
giving me sort of what I want (better than row % n, but I still need
something better).

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] how to wrie "" char in my html file?

2012-03-31 Thread Steinar Midtskogen
[YAN HONG YE]

>>sqlite3 -html C:\mydatabase\mydzh.db "select ''">mm.html
> this command result is not  in the mm.html file, it's this following 
> text:
> 
> 
> not I wanted, how to wrie  char in my html file?

That would create invalid html.  "" will be displayed as "" by the
html parser.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Re: [sqlite] about sqlite3_exec function

[YAN HONG YE]

> sqlite3_exec( db, "???", 0, 0, &pErrMsg);
>
> I wanna add this following command into sqlite3_exec func:
> "sqlite3 -html -header mydzh.db \"select * from dhq where ph15>10;\" >mm.html"
>  like this:
> sqlite3_exec( db, "-html -header \"select * from dhq where ph15>10;\" 
> >mm.html", 0, 0, &pErrMsg);
>
> but it doesn't  work

sqlite3_exec() runs SQL statements, not shell commands.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] how to export to html file?

[YAN HONG YE]

> my sqlite database want to export to html file, I know the command 
> sqlite3 -html film.db "select * from film;" 
> could show the table in cmd window, but how to export to the html file  like 
> sqlite3 -html film.db  mm.html "select * from film;"

sqlite3 -html film.db "select * from film" > mm.html

That's Unix syntax.  I'm not sure about Windows, but I suppose you can
do something similar.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] User input checking

I'm planning to allow users to make database queries through a web
page.  I'm thinking of letting the user provide the string that goes
between SELECT and FROM using the sqlite3 command tool, but what kind
of input checking is then needed?

Obviously, I need to check that the input doesn't contain any
semicolons, otherwise the user could enter something like "; DROP
big_table;".  But is there anything else that needs checking?  I only
want to make sure that the user can't change anything.  If the query
is too big and will take forever, that's fine for now.

xkcd comes to mind: http://xkcd.com/327/  :)
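One defence-in-depth option beyond string checks: SQLite's authorizer
callback can veto anything that is not a read.  A hedged sketch with
Python's sqlite3 module (the deny-by-default policy and the table name
are illustrative, not a complete hardening recipe):

```python
import sqlite3

def read_only_authorizer(action, arg1, arg2, db_name, trigger_or_view):
    # Allow statement compilation for reads and plain function calls;
    # deny everything else (INSERT, UPDATE, DROP, PRAGMA, ATTACH, ...).
    if action in (sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ,
                  sqlite3.SQLITE_FUNCTION):
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (a)")
conn.set_authorizer(read_only_authorizer)

rows = conn.execute("SELECT a FROM big_table").fetchall()   # allowed
try:
    conn.execute("DROP TABLE big_table")                    # vetoed
    denied = False
except sqlite3.DatabaseError:
    denied = True
print(rows, denied)
```

The semicolon attack is also blunted at this driver level, since the
Python wrapper's execute() refuses multi-statement strings; the
authorizer then covers whatever slips past input checks.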

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite bug?

Roger Binns  writes:

> On 07/03/12 13:46, Steinar Midtskogen wrote:
>> I think indeed that this is a problem:
>
> I cannot reproduce it using my own virtual tables.  This strongly implies
> that it is something to do with the code for your virtual tables.  Doesn't
> your 'interpolate' module use existing tables?

xCreate()/xConnect() will run through the argument list and do a
PRAGMA table_info(%s) and then a couple of SELECT ... FROM %s.  When
created with itself it loops in the PRAGMA query.  When called with a
second table which points back to it, it seems to pass the PRAGMA, but
it loops in the first SELECT query.

> It is also worth pointing out that table names, column names and column
> types live in separate namespaces.  This works just fine:
>
>   sqlite> create table a(a a);
>   sqlite>

Yes, but the arguments of the virtual tables are names, so "CREATE
VIRTUAL TABLE table USING interpolate(table)" will make
xCreate()/xConnect() fire off the queries PRAGMA table_info(table) and
SELECT * FROM table.

Anyway, I managed to break this loop by requiring the argument to have
an integer primary key, which virtual tables don't have, so it can't
be called recursively either.  I'm not sure if this is a restriction
that I want, though.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite bug?

Steinar Midtskogen  writes:

> I can easily check that the argument doesn't match argv[2], but if
> it's still possible to get into a loop like this if virtual tables
> refer to eachother, then I don't think it's possible to detect this in
> the VT code.

I think indeed that this is a problem:

sqlite> create table a ( x INTEGER, PRIMARY KEY (x) );
sqlite> create virtual table b using interpolate(a);
sqlite> drop table a;
sqlite> create virtual table a using interpolate(b);

Program received signal SIGSEGV, Segmentation fault.

And the call backtrace goes on forever.

Yes, this is a user fault, but if the VT code must check this, it
would be nice to have a way to detect it.  It's possible to have some
kind of static counter in xCreate() which could be used for detecting
this, but that would limit the number of instances rather than the
level of nesting.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SQLite bug?

"Jay A. Kreibich"  writes:

> On Wed, Mar 07, 2012 at 08:58:27PM +0100, Steinar Midtskogen scratched on the 
> wall:
>> I've created a module which will take a table as an argument.  In
>> Xconnect it will run a query on that table.  Then I accidently used
>> the name of the virtual table in the argument list:
>> 
>>  CREATE VIRTUAL TABLE v USING my_module(v);
>> 
>> which seems to have triggered an infinite call loop.  I suppose what's
>> going on is that when Xconnect
>
>   It should be xCreate() if it is a new table.

Yes, probably.  xCreate() and xConnect() are the same function.  I
just happened to call it "connect".

>   I doubt that is what is happening.  More likely, the query is
>   blocking in some way.  SQLite would never automatically create a
>   non-existent table you tried to access.  How would it know what to
>   create?

I was guessing this having looked at the call trace in gdb.  It is:

#0  0x7fd0ed98d8a3 in sqlite3VXPrintf (pAccum=Cannot access memory at 
address 0x75644f08
) at sqlite3.c:19459
#1  0x7fd0ed98f97a in sqlite3VMPrintf (db=0x60e030, zFormat=0x7fd0eda087bf 
"%s", ap=0x75645120) at sqlite3.c:20096
#2  0x7fd0ed98fa9a in sqlite3MPrintf (db=0x60e030, zFormat=0x7fd0eda087bf 
"%s") at sqlite3.c:20112
#3  0x7fd0ed9f in vtabCallConstructor (db=0x60e030, pTab=0x6385e0, 
pMod=0x61eae0, xConstruct=0x7fd0ec345470 , 
pzErr=0x756452c0) at sqlite3.c:101732
#4  0x7fd0ed9f48e5 in sqlite3VtabCallConnect (pParse=0x19dd610, 
pTab=0x6385e0) at sqlite3.c:101848
#5  0x7fd0ed9d4b35 in sqlite3ViewGetColumnNames (pParse=0x19dd610, 
pTable=0x6385e0) at sqlite3.c:82813
#6  0x7fd0ed9e3fbd in sqlite3Pragma (pParse=0x19dd610, pId1=0x19dd938, 
pId2=0x19dd958, pValue=0x19dd998, minusFlag=0) at sqlite3.c:92812
#7  0x7fd0eda00cce in yy_reduce (yypParser=0x19dd8c0, yyruleno=257) at 
sqlite3.c:110533
#8  0x7fd0eda016af in sqlite3Parser (yyp=0x19dd8c0, yymajor=1, yyminor={z = 
0x19dd603 ")", n = 1}, pParse=0x19dd610) at sqlite3.c:110915
#9  0x7fd0eda024d7 in sqlite3RunParser (pParse=0x19dd610, zSql=0x19dd5f0 
"PRAGMA table_info(x)", pzErrMsg=0x75645888) at sqlite3.c:111752
#10 0x7fd0ed9e6b44 in sqlite3Prepare (db=0x60e030, zSql=0x19dd5f0 "PRAGMA 
table_info(x)", nBytes=-1, saveSqlFlag=1, pReprepare=0x0, 
ppStmt=0x75645a28, pzTail=0x0) at sqlite3.c:94079
#11 0x7fd0ed9e6e4f in sqlite3LockAndPrepare (db=0x60e030, zSql=0x19dd5f0 
"PRAGMA table_info(x)", nBytes=-1, saveSqlFlag=1, pOld=0x0, 
ppStmt=0x75645a28, pzTail=0x0) at sqlite3.c:94171
#12 0x7fd0ed9e7010 in sqlite3_prepare_v2 (db=0x60e030, zSql=0x19dd5f0 
"PRAGMA table_info(x)", nBytes=-1, ppStmt=0x75645a28, pzTail=0x0)
at sqlite3.c:94246
#13 0x7fd0ec34566e in interpolate_connect (db=0x60e030, pAux=, argc=, argv=0x638580, ppVtab=0x19dd4c0, 
pzErr=) at extension-functions.c:447
#14 0x7fd0ed9f44fa in vtabCallConstructor (db=0x60e030, pTab=0x6385e0, 
pMod=0x61eae0, xConstruct=0x7fd0ec345470 , 
pzErr=0x75645b40) at sqlite3.c:101752
#15 0x7fd0ed9f48e5 in sqlite3VtabCallConnect (pParse=0x19dc130, 
pTab=0x6385e0) at sqlite3.c:101848
#16 0x7fd0ed9d4b35 in sqlite3ViewGetColumnNames (pParse=0x19dc130, 
pTable=0x6385e0) at sqlite3.c:82813
#17 0x7fd0ed9e3fbd in sqlite3Pragma (pParse=0x19dc130, pId1=0x19dc458, 
pId2=0x19dc478, pValue=0x19dc4b8, minusFlag=0) at sqlite3.c:92812
#18 0x7fd0eda00cce in yy_reduce (yypParser=0x19dc3e0, yyruleno=257) at 
sqlite3.c:110533
#19 0x7fd0eda016af in sqlite3Parser (yyp=0x19dc3e0, yymajor=1, yyminor={z = 
0x19dc123 ")", n = 1}, pParse=0x19dc130) at sqlite3.c:110915
#20 0x7fd0eda024d7 in sqlite3RunParser (pParse=0x19dc130, zSql=0x19dc110 
"PRAGMA table_info(x)", pzErrMsg=0x75646108) at sqlite3.c:111752
#21 0x7fd0ed9e6b44 in sqlite3Prepare (db=0x60e030, zSql=0x19dc110 "PRAGMA 
table_info(x)", nBytes=-1, saveSqlFlag=1, pReprepare=0x0, 
ppStmt=0x756462a8, pzTail=0x0) at sqlite3.c:94079
#22 0x7fd0ed9e6e4f in sqlite3LockAndPrepare (db=0x60e030, zSql=0x19dc110 
"PRAGMA table_info(x)", nBytes=-1, saveSqlFlag=1, pOld=0x0, 
ppStmt=0x756462a8, pzTail=0x0) at sqlite3.c:94171
#23 0x7fd0ed9e7010 in sqlite3_prepare_v2 (db=0x60e030, zSql=0x19dc110 
"PRAGMA table_info(x)", nBytes=-1, ppStmt=0x756462a8, pzTail=0x0)
at sqlite3.c:94246
#24 0x7fd0ec34566e in interpolate_connect (db=0x60e030, pAux=, argc=, argv=0x638580, ppVtab=0x19dbfe0, 
pzErr=) at extension-functions.c:447
#25 0x7fd0ed9f44fa in vtabCallConstructor (db=0x60e030, pTab=0x6385e0, 
pMod=0x61eae0, xConstruct=0x7fd0ec345470 , 
pzErr=0x756463c0) at sqlite3.c:101752

... and so on, until:

#42352 0x7fd0ec34566e in interpolate_connect (db=0x60e030, pAux=, argc=, argv=0x638580,

[sqlite] SQLite bug?

I've created a module which will take a table as an argument.  In
Xconnect it will run a query on that table.  Then I accidently used
the name of the virtual table in the argument list:

 CREATE VIRTUAL TABLE v USING my_module(v);

which seems to have triggered an infinite call loop.  I suppose what's
going on is that when Xconnect tries to run a query on that table,
SQLite will try to create the virtual table again, and then loops and
quickly overflows the stack.

Is this a SQLite bug, or should the virtual table code somehow detect
that the user is trying to create a virtual table using itself?  Or
should the user be blamed?

I suspect that this kind of loop could be less obvious if several
virtual tables are involved in a loop.

In my opinion the cleanest approach would be that SQLite itself
determines that it has nested too deeply and gives an error.

This is version 3.7.10.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] xRowid doesn't get called

Hello,

I've noticed in my virtual table implementation that xRowid never gets
called.  This is a read-only table, so if xRowid is mainly used for
making inserts or changes to the table, that might be why.  But
shouldn't it also get called if I do SELECT rowid FROM VirtualTable?
The result is just NULLs and my xRowid function never gets called.  If
it never gets called for read-only tables, why then is it required?

-- 
Steinar Midtskogen
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] xFilter or xBestIndex needs to know which columns were selected

I did implement what I described below, which I believed to work well
unless it is the first row that needs the biggest buffer.  However, I
ran into a problem.  I keep getting incorrect results when I use the
coalesce() function.  Usually coalesce() simply returns its first
argument.  If that happens before I realloc my buffer down to only the
columns seen so far, I run into a problem when coalesce() later needs
to look past its first argument and that column no longer exists in my
buffer.

I believed that I would know which columns to care about after the
first row had been completed, but that's not a safe assumption.  On
the other hand, it would be a waste to compute values that coalesce()
will discard, so this behaviour makes sense.  I could run through
xColumn twice for the whole result table: once to determine which
columns to care about and once again to do the actual computations.
That will give me a tradeoff between speed and memory usage.

-Steinar

Steinar Midtskogen  writes:

> Thanks to Dan and Roger for the information and suggestions.  I have
> about 100 columns (and millions of rows), but usually queries will
> only ask for a few columns.
>
> I think I know a way to work around this for my case.  Basically, I
> don't know the exact size required for my lookahead buffer (currently
> allocated in xFilter), so I probably need to make that size dynamic
> anyway rather than to allocate the maximum possible size.  I don't
> know the size because I need to look a certain time ahead (in the
> timestamp column) and I don't know how many rows that will be.  So if
> xColumn finds out that it hasn't enough lookahead data to do its
> computation, I should resize and read more.  But at that time (unless
> it's the first row), I can know what the columns are that I'm going to
> need.
>
> So I can start out with a buffer with all the columns but few rows,
> and then increase the number of rows when needed and hopefully reduce
> it to only the interesting columns at the same time.
>
> It will require a bit of bookkeeping, but seems doable.
>
> -Steinar
>
> Roger Binns  writes:
>
>> On 23/02/12 23:02, Steinar Midtskogen wrote:
>>> I know that xColumn will only get called for these columns.
>>
>> As Dan said there isn't a direct way of knowing.  There is a reasonably
>> convenient workaround of having multiple virtual tables with different
>> subsets of the columns but all returning the same underlying data.
>>
>> Something else to look at are hidden columns.  See section 2.1.1 in
>> http://www.sqlite.org/vtab.html
>>
>> With that you can make the "cheap" columns regular and the expensive ones
>> hidden so that the SQL query explicitly has to ask for them.  In your vtab
>> implementation you can always start out by only getting the cheap columns
>> until xColumn tells you that one of the hidden/expensive ones is needed
>> and then go off and redo the underlying query.
>>
>> Roger
>> ___
>> sqlite-users mailing list
>> sqlite-users@sqlite.org
>> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] SELECT average timestamp to get average time of day?

[C M ]

> For example, the average I'd
> want from these three timestamps:
>
> '2012-02-18 22:00:00.00'
> '2012-02-19 23:00:00.00'
> '2012-02-28 01:00:00.00'
>
> Should be 11:20pm, as they are all within a few hours of each other at
> night.  I have not been able to find a query that produces this.

Sounds like you need to convert these timestamps into unix time
(seconds since Jan 1 1970), calculate the average, then convert it
back to its original format.
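A hedged sketch of that conversion with Python's sqlite3 module.  One
wrinkle the one-liner hides: for a time-of-day average near midnight,
early-morning times need a +24 h shift before averaging, or the 01:00
sample drags the mean toward noon (table and column names invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT)")   # hypothetical table
conn.executemany("INSERT INTO events VALUES (?)", [
    ("2012-02-18 22:00:00",),
    ("2012-02-19 23:00:00",),
    ("2012-02-28 01:00:00",),
])

# strftime('%s', ts) % 86400 -> seconds since midnight (UTC); shifting
# pre-noon times by +24h keeps a cluster that straddles midnight together.
(avg_time,) = conn.execute("""
    SELECT time(CAST(avg(sec) AS INTEGER) % 86400, 'unixepoch') FROM (
        SELECT strftime('%s', ts) % 86400
               + CASE WHEN strftime('%s', ts) % 86400 < 43200
                      THEN 86400 ELSE 0 END AS sec
        FROM events)
""").fetchone()
print(avg_time)  # 23:20:00
```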

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] xFilter or xBestIndex needs to know which columns were selected

Thanks to Dan and Roger for the information and suggestions.  I have
about 100 columns (and millions of rows), but usually queries will
only ask for a few columns.

I think I know a way to work around this for my case.  Basically, I
don't know the exact size required for my lookahead buffer (currently
allocated in xFilter), so I probably need to make that size dynamic
anyway rather than to allocate the maximum possible size.  I don't
know the size because I need to look a certain time ahead (in the
timestamp column) and I don't know how many rows that will be.  So if
xColumn finds out that it hasn't enough lookahead data to do its
computation, I should resize and read more.  But at that time (unless
it's the first row), I can know what the columns are that I'm going to
need.

So I can start out with a buffer with all the columns but few rows,
and then increase the number of rows when needed and hopefully reduce
it to only the interesting columns at the same time.

It will require a bit of bookkeeping, but seems doable.

-Steinar

Roger Binns  writes:

> On 23/02/12 23:02, Steinar Midtskogen wrote:
>> I know that xColumn will only get called for these columns.
>
> As Dan said there isn't a direct way of knowing.  There is a reasonably
> convenient workaround of having multiple virtual tables with different
> subsets of the columns but all returning the same underlying data.
>
> Something else to look at are hidden columns.  See section 2.1.1 in
> http://www.sqlite.org/vtab.html
>
> With that you can make the "cheap" columns regular and the expensive ones
> hidden so that the SQL query explicitly has to ask for them.  In your vtab
> implementation you can always start out by only getting the cheap columns
> until xColumn tells you that one of the hidden/expensive ones is needed
> and then go off and redo the underlying query.
>
> Roger
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] xFilter or xBestIndex needs to know which columns were selected

Hello

Is it possible to find out in xFilter or xBestIndex which columns were
selected?  That is, if I do "SELECT a, b, c FROM t" where t is a
virtual table, I would like to know in xFilter or xBestIndex that the
result will only consist of the columns a, b and c.

I know that xColumn will only get called for these columns.  The
reason why I would like to know before that, is that xFilter will do a
"SELECT *" on another table and it needs read a bunch of its rows
before the first xColumn can be called (I'm doing interpolation and
need to look ahead).  But in most cases its a huge waste of resources
to read every column ahead of xColumn.  I only need to read those
columns that xColumn will be called on.  I could potentially save
GBs of memory if I know what won't be needed in xFilter.

-- 
Steinar
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Lifetime of an sqlite_value object

Steinar Midtskogen  writes:

> Thank you.  A followup question: Is there a portable way to find the
> size of an sqlite_value object?

I had a quick look in the sqlite3 source code.  With knowledge of the
internal structures it seems possible to copy a value if everything
inside its object is allocated with sqlite3_malloc() which, I think,
keeps track of the size of the allocated memory, but it's probably
more complicated than what I'm trying to simplify.

-- 
Steinar Midtskogen


Re: [sqlite] Lifetime of an sqlite_value object

Larry Brasfield  writes:

> Steinar Midtskogen wrote:
>> Is it safe to assume that the object pointed to by
>> sqlite3_column_value() will exist until sqlite3_finalize() is called?
>> Or will it only be valid until the next call to sqlite3_step()?
>
> From the API doc titled "Result Values From A Query":
> The pointers returned are valid until a type conversion occurs as
> described above, or until sqlite3_step() or sqlite3_reset() or
> sqlite3_finalize() is called.

Thank you.  A follow-up question: Is there a portable way to find the
size of an sqlite3_value object?

-- 
Steinar Midtskogen


[sqlite] Lifetime of an sqlite_value object

Hello

Is it safe to assume that the object pointed to by
sqlite3_column_value() will exist until sqlite3_finalize() is called?
Or will it only be valid until the next call to sqlite3_step()?

I know I can call sqlite3_value_xxx() and make a copy which I can
store as long as I want to, but I'd like to keep the value as an
opaque sqlite3_value until I actually need it.

-- 
Steinar Midtskogen


Re: [sqlite] Function context

[Simon Slavin]

> On 13 Feb 2012, at 7:51pm, Steinar Midtskogen wrote:
>
>> One should think that the ability to calculate a moving average would
>> be a pretty common request.  But people do it in their application
>> code instead?
>
> Actually, my expectation is the other way.  I'm continually surprised by 
> posts on this list that people expect to be able to use SQLite with no 
> external programming at all.  I often see complicated compound JOIN and 
> sub-SELECT SQL commands here which can be replaced by four lines in any 
> programming language, yielding faster simpler code which would be easier to 
> document and debug.

Well, yes, doing things in C or similar when SQL becomes too unwieldy
is precisely what I want; the question is rather whether it's
feasible to do it within SQLite's framework for custom functions and
virtual tables.  This is the first time I've done anything with SQLite,
so my naive approach is that anything would be neater to have as custom
functions or virtual tables before doing it completely ad hoc in C.
Not just for the sake of using one of SQLite's key features, but to
keep things more reusable and to keep the core application as simple
as possible.

On the other side, I realise that if the API is to grant every wish
any programmer might have, it will break one of SQLite's other key
features: simplicity.

I think in this order:

1. Feasible using SQL?
2. Feasible using custom functions?
3. Feasible using virtual tables?
4. Ad hoc application code.

Perhaps number 4 will get me faster to my goal right now, but I'd like
to think that trying the approaches higher up first can give me
something back in the longer run.

-- 
Steinar


Re: [sqlite] Function context

[Scott Hess]

> I think you're making unwarranted assumptions about the order your
> custom function will be called.  Even if you added "ORDER BY" to the
> end of the query, that wouldn't necessarily order the calls to your
> custom function.  Even if you find a workaround which allows you to
> implement something in the current version of SQLite, it wouldn't
> necessarily work in a future version.

Yes, I'm fearing that.  For instance, non-aggregate functions would be
natural candidates for parallelisation on a multicore platform, and
then the order will surely be broken.  While the sqlite site declares
that "threads are evil", it seems likely that people will ask for more
performance on their 100-core CPUs, so it might be a dangerous bet to
assume that sqlite will never go down that path eventually.

> Unfortunately, I can't offhand think of a reasonable solution for you,
> I think I'd just use the SELECT to generate the data, while
> calculating the moving average in my application code.

Yes, but that reduces sqlite to just a way to store data.  It would be
nice to be able to use SQL and aggregate functions on the resulting
moving average (in particular max() and min()).

Perhaps the moving average can be implemented as a virtual table?

One should think that the ability to calculate a moving average would
be a pretty common request.  But people do it in their application
code instead?
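For what it's worth, one order-independent way to keep it in SQL is a
correlated subquery rather than a stateful function.  A sketch using
Python's bundled sqlite3 module (table name and values made up):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tab (value REAL, unix_time INTEGER PRIMARY KEY)")
con.executemany("INSERT INTO tab VALUES (?, ?)",
                [(1.0, 100), (3.0, 200), (5.0, 300)])

# Moving average over a trailing 150-second window, computed per row.
# No reliance on the order in which any custom function gets called.
rows = con.execute("""
    SELECT t1.unix_time,
           (SELECT avg(value) FROM tab t2
             WHERE t2.unix_time BETWEEN t1.unix_time - 150 AND t1.unix_time)
      FROM tab t1 ORDER BY t1.unix_time
""").fetchall()
print(rows)  # [(100, 1.0), (200, 2.0), (300, 4.0)]
```

Being plain SQL, the result can then be fed to max() and min() in an outer
query; the cost is an O(n) scan per row unless the timestamp is indexed.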

-- 
Steinar


Re: [sqlite] Function context

[Peter Aronson]

> (2) You can associate data with an argument to a regular user-defined
> function using sqlite3_set_auxdata() and sqlite3_get_auxdata() as long
> as the value of the argument is static.  If you don't normally have a
> static argument to your function, you can add one (say a string
> MAVG').  I actually used this approach with some application generated
> SQL in my current project at one point.

Thanks.  I'm intending to write a function so I can do:

SELECT unix_time, mavg(value, unix_time, <period>) FROM tab;

assuming:

CREATE TABLE tab (value REAL, unix_time INTEGER, PRIMARY KEY (unix_time));

So I assume that your second approach could work, since the third
argument to mavg() (the period, window size in seconds) is static,
e.g. mavg(value, unix_time, 86400) will give me the moving daily
average.

But will the data be private to a single query?  That is, if two
queries using the same period happen to run simultaneously, will it
still work?  The documentation wasn't clear.  In its example of using
this data to store a compiled regexp, it would actually be useful if
it were not strictly private.

-- 
Steinar


[sqlite] Function context

Hello

Is it possible to have a context for a custom SQL function that is NOT
an aggregate function?  It might sound silly, but if a SELECT
statement with this function causes the function to be called in a
defined order (as with the step function of an aggregate function),
this can be useful to calculate the moving average.  That is, in its
simplest form, to return the average of the N last values.

SQLite spirals into a crash if I call sqlite3_aggregate_context() when
I don't have a finalise function.

-- 
Steinar


[sqlite] SQLite libraries

Hello

Has anyone collected a library of extensions to SQLite, such as useful
aggregate functions, modules, etc?

There is http://www.sqlite.org/contrib and extension-functions.c, but
is there more out there?

-- 
Steinar


Re: [sqlite] Interpolation

[Simon Slavin]

> On 10 Feb 2012, at 3:24pm, Steinar Midtskogen wrote:
>
>> I feared that.  As it is, it takes 6 seconds to do a SELECT * FROM
>> Combined LIMIT 1 ("Combined" is a view representing the merged table).
>> If I add an ORDER BY, it takes 35 seconds.
>> 
>> Any way to speed up the ordering?
>
> Are you putting the ORDER BY in the VIEW definition or the SELECT definition 
> ?  Whichever you're doing, try the other one.  Also, is there an index which 
> provides a sorted list in an order which suits your ORDER BY clause ?

I created the view this way:

CREATE VIEW Combined AS
 SELECT strftime("%Y-%m-%d %H:%M:%S", unix_time, "unixepoch") AS time, * FROM
 (SELECT unix_time FROM Voksenlia1 UNION
  SELECT unix_time FROM Voksenlia2 UNION
  SELECT unix_time FROM Voksenlia3 UNION
  SELECT unix_time FROM Voksenlia4 UNION
  SELECT unix_time FROM Voksenlia5 UNION
  SELECT unix_time FROM Voksenlia6 UNION
  SELECT unix_time FROM Voksenlia8
)
LEFT NATURAL JOIN Voksenlia1
LEFT NATURAL JOIN Voksenlia2
LEFT NATURAL JOIN Voksenlia3
LEFT NATURAL JOIN Voksenlia4
LEFT NATURAL JOIN Voksenlia5
LEFT NATURAL JOIN Voksenlia6
LEFT NATURAL JOIN Voksenlia8 ORDER BY unix_time;

All 7 tables have a PRIMARY KEY (unix_time)

I tried then this:

CREATE VIEW Combined AS
 SELECT strftime("%Y-%m-%d %H:%M:%S", unix_time, "unixepoch") AS time, * FROM
 (SELECT unix_time FROM Voksenlia1 UNION
  SELECT unix_time FROM Voksenlia2 UNION
  SELECT unix_time FROM Voksenlia3 UNION
  SELECT unix_time FROM Voksenlia4 UNION
  SELECT unix_time FROM Voksenlia5 UNION
  SELECT unix_time FROM Voksenlia6 UNION
  SELECT unix_time FROM Voksenlia8 ORDER BY unix_time
)
LEFT NATURAL JOIN Voksenlia1
LEFT NATURAL JOIN Voksenlia2
LEFT NATURAL JOIN Voksenlia3
LEFT NATURAL JOIN Voksenlia4
LEFT NATURAL JOIN Voksenlia5
LEFT NATURAL JOIN Voksenlia6
LEFT NATURAL JOIN Voksenlia8;

And I got a big speedup.  Is this what you meant?

The combined view currently has 84 columns and 2,548,717 rows, so
doing things the wrong way makes a huge impact.
-- 
Steinar


Re: [sqlite] Interpolation

[Igor Tandetnik]

> If you need a particular order, it's best to add an explicit ORDER BY. 
> Otherwise, you are at the mercy of an implementation. Your current version of 
> SQLite chooses an execution plan that happens, by accident, to produce rows 
> in the desired order. Tomorrow you upgrade to a new version, and it chooses a 
> different execution plan that results in a different order.

I feared that.  As it is, it takes 6 seconds to do a SELECT * FROM
Combined LIMIT 1 ("Combined" is a view representing the merged table).
If I add an ORDER BY, it takes 35 seconds.

Any way to speed up the ordering?

I think I'll need the ordering to do interpolation.

-- 
Steinar


Re: [sqlite] Interpolation

[Igor Tandetnik]

> Steinar Midtskogen  wrote:
>> 
>> Thanks, I didn't think in that simple terms. :) I think about listing
>> all the values, so I got lost.

I lost a word there: "I didn't think about listing"...

>> 
>> But what if the tables share a timestamp, then I would get, say:
>> 
>> 1328873300|1|2| | | |
>> 1328873300| | |3| | |
>> 1328873300| | | |4|5|6
>> 
>> How can that get collapsed into:
>> 
>> 1328873300|1|2|3|4|5|6
>
> Try something like this:
>
> select timestamp, value1, ..., value6 from
> (select timestamp from tab1
>  union
>  select timestamp from tab2
>  union
>  select timestamp from tab3)
> left join tab1 using (timestamp)
> left join tab2 using (timestamp)
> left join tab3 using (timestamp);

Wonderful!  It also eliminates the need to list all the nulls and
values, and as a bonus the result comes ordered by timestamp, whereas
the previous solution required an explicit "ORDER BY timestamp", which
made everything slower than this solution (for some reason).
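Out of curiosity I tried a toy run of this shape with Python's bundled
sqlite3 module (tiny made-up tables); a shared timestamp does collapse
into a single row:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tab1 (timestamp INTEGER PRIMARY KEY, value1, value2)")
con.execute("CREATE TABLE tab2 (timestamp INTEGER PRIMARY KEY, value3)")
con.execute("INSERT INTO tab1 VALUES (1328873300, 1, 2)")
con.execute("INSERT INTO tab2 VALUES (1328873300, 3)")

# UNION of all timestamps, then LEFT JOIN each table back on,
# so every table's columns line up on the shared timestamp.
rows = con.execute("""
    SELECT timestamp, value1, value2, value3 FROM
      (SELECT timestamp FROM tab1 UNION SELECT timestamp FROM tab2)
    LEFT JOIN tab1 USING (timestamp)
    LEFT JOIN tab2 USING (timestamp)
""").fetchall()
print(rows)  # [(1328873300, 1, 2, 3)]
```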

-- 
Steinar


Re: [sqlite] Interpolation

[Igor Tandetnik]

>> timestamp|value1|value2|value3|value4|value5|value6
>> 1328873000|1|2| |  |  |
>> 1328873050| | |7|  |  |
>> 1328873075| | | |10|13|16
>> 1328873100|3|4| |  |  |
>> 1328873150| | |8|  |  |
>> 1328873175| | | |11|14|17
>> 1328873200|5|6| |  |  |
>> 1328873250| | |9|  |  |
>> 1328873275| | | |12|15|18
>> 
>> But, first things first, how can I merge my tables to get the combined
>> table with NULLs?
>
> select value1, value2, null, null, null, null from tab1
> union all
> select null, null, value3, null, null, null from tab2
> union all
> select null, null, null, value4, value5, value6 from tab3;

Thanks, I didn't think in that simple terms. :) I think about listing
all the values, so I got lost.

But what if the tables share a timestamp, then I would get, say:

1328873300|1|2| | | |
1328873300| | |3| | |
1328873300| | | |4|5|6

How can that get collapsed into:

1328873300|1|2|3|4|5|6

?

One way could be to fill in the missing values using interpolation
(which I want anyway) and then remove duplicate rows, but if there is
a simple way to collapse them before I attempt to interpolate, I
should probably do that first.

-- 
Steinar


Re: [sqlite] Interpolation

I've rethought the interpolation strategy.  It's not important to be
able to look up any timestamp, just the timestamps that actually have
values in at least one table.  Let's say I have N tables, each with a
timestamp as primary key.  For instance:

tab1:
timestamp|value1|value2
1328873000|1|2  
1328873100|3|4  
1328873200|5|6  

tab2:
timestamp|value3
1328873050|7
1328873150|8 
1328873250|9

tab3:
timestamp|value4|value5|value6
1328873075|10|13|16
1328873175|11|14|17
1328873275|12|15|18

First, I'd like to merge all tables to create one single table with
every timestamp and every value.  That is, it will have the timestamps
that I get by:

 SELECT timestamp FROM tab1 UNION
 SELECT timestamp FROM tab2 UNION
 SELECT timestamp FROM tab3;

So the resulting table should be:

timestamp|value1|value2|value3|value4|value5|value6
1328873000|1|2| |  |  |
1328873050| | |7|  |  |
1328873075| | | |10|13|16
1328873100|3|4| |  |  |
1328873150| | |8|  |  |
1328873175| | | |11|14|17
1328873200|5|6| |  |  |
1328873250| | |9|  |  |
1328873275| | | |12|15|18

The resulting table will have a lot of NULLs.  Next, I'd like to fill
out all NULLs by using linear interpolation.

But, first things first, how can I merge my tables to get the combined
table with NULLs?  I've been playing with JOIN and UNION, but I'm
afraid my SQL experience is very limited, so I got stuck at this one.

In reality I have 7 tables to be merged with a lot of columns (perhaps
100 in all) and up to a million rows, so if an SQL statement to do
this sounds unrealistic, that would be a good answer as well.

Thanks,
-- 
Steinar Midtskogen


Re: [sqlite] Inserts get slower and slower

Thanks to all for suggestions.

> My guesses:
> - Your unix_time values are not successive. In this case your first fast
> results are due to advantages of memory caching. The following slowness is
> the result of the internal fragmentation

All unix_time values should be successive, but in the particular file
used to create the inserts I discovered some corruption in one area:
random isolated bytes had been replaced with garbage, including bits
of the timestamps, so this was the cause.  Thanks for getting me on
the right track!

-- 
Steinar


[sqlite] Inserts get slower and slower

Hello

I'm having trouble with one table in my database.

When I build my database from scratch using millions of inserts, one
table causes problems.  Inserts get slower and slower.  I have about
830,000 inserts for that table.  It gets to 300,000 pretty fast, but
then it gets slower and slower, and eventually it will only do a few
inserts per second, and I then I have to kill sqlite3 as it will run
for hours if not days.  The -echo option reveals that it gets slower
and slower.  sqlite3 runs at 100% CPU.

I create other similar tables with 830,000 inserts the same way, but
inserts into them don't slow down.

The table in question is:

CREATE TABLE Voksenlia2 (
 temp_in REAL,
 pressure REAL,
 rh_in REAL,
 temp_in_2 REAL,
 temp_in_3 REAL,
 temp_in_4 REAL,
 temp_in_5 REAL,
 temp_ground_0cm REAL,
 temp_ground_10cm REAL,
 temp_ground_20cm REAL,
 temp_ground_50cm REAL,
 radiation INTEGER,
 radiation_2 INTEGER,
 uv REAL,
 temp_uv REAL,

 unix_time INTEGER, PRIMARY KEY (unix_time)
);

The commands start this way:
begin;
insert into Voksenlia2 values(25.010, 1011.260, null, null, null, null, null, 
null, null, null, null, null, null, null, null, 1072915200);
insert into Voksenlia2 values(25.010, 1011.260, null, null, null, null, null, 
null, null, null, null, null, null, null, null, 1075063152);

and eventually end in a "commit".  So only one transaction.

I've tried:

* PRAGMA synchronous=OFF.
* Create the database file in /dev/shm/ (RAM disk).
* Break up the inserts into several transactions.
* Create and fill this table before everything else.
* Upgrade from version 3.4.2 to 3.7.10.

Nothing has made any difference.  Any ideas?  Anything I could try or
any ways to debug this?

-- 
Steinar


Re: [sqlite] Interpolation

[Kit]

> SELECT a.temp+(b.temp-a.temp)/(b.time-a.time)*(strftime('%s','2012-02-08
> 11:37:00')-a.time) FROM
>   (select time, temp FROM tp
>    WHERE strftime('%s','2012-02-08 11:37:00')*1>=time
>    ORDER BY time DESC LIMIT 1) AS a,
>   (select time, temp FROM tp
>    WHERE strftime('%s','2012-02-08 11:37:00')*1<time ORDER BY time LIMIT 1) AS b;

Thanks!  I will try to rewrite this as a view.
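A quick check of the idea with Python's bundled sqlite3 module, using the
example values from earlier in the thread (5.0 at 00:00 and 10.0 at 00:05
should interpolate to 7.0 at 00:02); note the multiplication is done before
the division so SQLite doesn't truncate the integer timestamp ratio to 0:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tp (time INTEGER PRIMARY KEY, temp REAL)")
con.executemany("INSERT INTO tp VALUES (?, ?)",
                [(1325376000, 5.0), (1325376300, 10.0)])

t = 1325376120  # 00:02:00, between the two stored readings
(val,) = con.execute("""
    SELECT a.temp + (b.temp - a.temp) * (? - a.time) / (b.time - a.time)
      FROM (SELECT time, temp FROM tp WHERE time <= ?
             ORDER BY time DESC LIMIT 1) AS a,
           (SELECT time, temp FROM tp WHERE time > ?
             ORDER BY time LIMIT 1) AS b
""", (t, t, t)).fetchone()
print(val)  # 7.0
```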
-- 
Steinar


Re: [sqlite] Interpolation

[Kevin Martin]

> For the quick and dirty solution, I think you can use something like this to 
> create your view. You would need to index time, and even with the index, I'm 
> not too sure about speed.
>
> select x1.time, x1.value, x2.time from x as x1 left join x as x2 on
> x2.time=(select max(time) from x where time<x1.time)

> I would favor the virtual table approach, as I think the other
> solutions require more complicated queries to account for the fact
> that interpolation is going on.

Yes, but if creating views does the queries fast enough for me, I'll
be pragmatic about this. :)

-- 
Steinar


Re: [sqlite] Interpolation

Related to this thread, I wonder if it's possible to create a view
which can give me a value from the row immediately above.  E.g. given
the table:

unix_time  val
--+---
1325376000|val1
1325376300|val2
1325376600|val3
1325376900|val4

(the first column is a unix timestamp and unique)

can I create a view which gives me:

unix_time  val  prev_unix_time
--++--
1325376000|val1|
1325376300|val2|1325376000
1325376600|val3|1325376300
1325376900|val4|1325376600

Something like this will not work:

 create view new as select unix_time, val, (select unix_time from old where 
old.unix_time < new.unix_time order by unix_time desc limit 1) as 
prev_unix_time from old;

as I can't refer to new.unix_time inside the view that defines "new".

The idea is, if this is possible, then I should be able to get my
weighted average by something like this (not verified, but you get the
idea):

 select sum(val * (unix_time - prev_unix_time)) / sum(unix_time - 
prev_unix_time) from new;
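(For what it's worth, the self-reference problem seems to go away if the
outer scan gets an alias instead of being referred to by the view's own
name.  A sketch with Python's bundled sqlite3 module and made-up values:)

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE old (unix_time INTEGER PRIMARY KEY, val REAL)")
con.executemany("INSERT INTO old VALUES (?, ?)",
                [(1325376000, 2.0), (1325376300, 4.0), (1325376900, 1.0)])

# Alias the outer scan as o1 so the correlated subquery can refer to it;
# referring to the view's own name ("new.unix_time") is what fails.
con.execute("""
    CREATE VIEW new AS
    SELECT unix_time, val,
           (SELECT max(unix_time) FROM old o2
             WHERE o2.unix_time < o1.unix_time) AS prev_unix_time
      FROM old o1
""")

# Time-weighted average, as in the formula above; the first row has a
# NULL prev_unix_time and drops out of both sums.
avg = con.execute("""
    SELECT sum(val * (unix_time - prev_unix_time))
         / sum(unix_time - prev_unix_time)
      FROM new
""").fetchone()[0]
print(avg)  # (4.0*300 + 1.0*600) / 900 = 2.0
```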

-- 
Steinar


Re: [sqlite] Interpolation

[Roger Binns]

> I'd recommend you write code in your application first that knows how to
> calculate the values you want.  That way you can ensure the calculations
> are correct, you have something for test harnesses that produces "good"
> values and you have something to port to your final solution.
>
> I'd probably stop there.  Remember that your application code and the
> SQLite library are running in the same process.  It is almost certain that
> it is (better/more convenient/easier to develop and use) for this code to
> be app code than within SQLite.

OK.  But then I won't have an SQL interface for accessing interpolated
data.  It's acceptable.  An SQL interface is mostly a "nice to have",
so I could do some quick queries in SQL instead of having to write C
code for it.

> If someone will be doing queries expecting to match a row with second
> granularity then your SQLite side solutions are virtual tables and
> functions.  The former is well documented.  For the latter you can make a
> function like "reading" which behind the scenes calls your app code which
> prepares a statement, finds neighbouring readings and returns the
> interpolated result - eg `select reading("2012-01-01T012345")`

So, if I go for that approach, you'd recommend that I add functions,
such as "reading", and if I want a mean temperature, I should add a
function "mean" and not try to change what AVG will do?

Thanks,
-- 
Steinar


[sqlite] Interpolation

Hello

I have a large (> 1GB) collection of text files containing sensor
values associated with timestamps, and I want to switch to an sqlite
database.  For simplicity, let's assume that I have two tables, one
that stores temperatures and another that stores relative_humidity:

CREATE TABLE temperature (
  temp REAL,
  unix_time INTEGER,
  PRIMARY KEY (unix_time)
);

CREATE TABLE relative_humidity (
  rh REAL,
  unix_time INTEGER,
  PRIMARY KEY (unix_time)
);

What I need to solve is this:

1. I'd like to be able to look up any timestamp between the oldest and
the newest in the database, and if there is no value stored for that
timestamp, the value given should be an interpolation of the two
closest.  So, if the table has:

1325376000 (Jan 1 2012 00:00:00 UTC) | 5.0
1325376300 (Jan 1 2012 00:05:00 UTC) | 10.0

and I do "SELECT temp FROM temperature WHERE unix_time = 1325376120"
(00:02:00) I should get 7.0.

2. I'd like to calculate averages for certain periods.  Now, I could
use AVG if everything was stored in fixed intervals, but that's not
the case (which is also the reason why I want the interpolation above
- the user can't know what valid timestamps are).  For instance, let's
say I want the average temperature for two consecutive days.  The
first day was cold and has one value for every hour.  The second was
hot and has one value for every minute.  A plain AVG would not give me
what I'd expect, since the hot day would get far too much weight.  So
when calculating my average, I need to weigth the values depending on
the interval they'll represent.

3. Say that I want to know the dew point for a certain timestamp or
the average dew point for a whole day.  The dew point is calculated
from the temperature and the relative humidity.  The trouble is that
the two tables don't contain the same timestamps, so I can't look up
temp and rh in the tables using the timestamp and then do the
calculations (unless problem 1 has been solved).
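(As an aside, once matching temp and rh values exist, the dew point itself
is plain arithmetic; a sketch using one common approximation, the Magnus
formula, with constants from memory, so treat it as a sketch rather than a
reference implementation:)

```python
import math

def dew_point(temp_c, rh_percent, a=17.27, b=237.7):
    """Magnus-formula dew point approximation, in degrees Celsius."""
    gamma = math.log(rh_percent / 100.0) + a * temp_c / (b + temp_c)
    return b * gamma / (a - gamma)

# Sanity check: at 100% relative humidity the dew point equals the
# temperature.
print(round(dew_point(20.0, 100.0), 6))  # 20.0
```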

In short, if my tables had values for every second, all these problems
would go away.  I could even use AVG and get what I wanted.  Is it
possible to create some kind of virtual table which to the user
appears to have values for every second?


I'm not asking the list to solve these problems for me, but it would
greatly help if anyone could point me to the right direction.  Which
approach would work?

Is it possible to create a VIEW to do any of this?

Could I use sqlite's virtual table functionality?

Or is it best to leave the tables as they are and do what I want to do
in C/C++, that is, to abandon the idea that I can get what I want
using regular SQL statements.

I could, of course, write a program that does all the interpolation
for every second and store the interpolated values in the database,
that would be very simple, but that would also make the database way
too large and slow.

Thanks.

-- 
Steinar Midtskogen