Hubertus wrote:
This is the schema:
CREATE TABLE 'data'('nr' INTEGER PRIMARY KEY,
'year' INTEGER,
'month' INTEGER,
'day' INTEGER,
'sec' REAL,
'campId' TEXT,
'flightNr' INTEGER,
'1' REAL,
...
'71' REAL
);
You should use double quotes around column and table names. Single
quotes delimit literal strings in SQL.
CREATE INDEX sec on data(year,month,day,sec);
This index will not help with the queries you are testing.
I experience a big variability of time a query needs:
~database> time sqlite3 data.db 'select "14" from data where campId="polstar98" and
"14">-9999;' >/dev/null
real 0m3.115s
user 0m1.748s
sys 0m1.368s
~/database> time sqlite3 data.db 'select "14" from data where campId="polstar98" and
"14">-9999;' >/dev/null
real 0m3.139s
user 0m1.756s
sys 0m1.380s
~/database> time sqlite3 data.db 'select "42" from data where campId="stream94" and
"14">-9999;' >/dev/null
real 0m50.227s
user 0m4.692s
sys 0m3.028s
I've tried it with and without index and at different times. Most of the
time such simple queries take about 35 sec. Why is that so and what can be done?
I can live with 3 sec of response but not up to one minute and the
database is still not complete. Would pytables with hdf5 be an alternative?
Knowing that this is probably not the right place to ask...
You have your quotes mixed up in your queries as well.
select "42" from data where campId="stream94" and "14">-9999;
should be
select "42" from data where campId='stream94' and "14">-9999;
The value 'stream94' is a literal string and should be delimited with
single quotes. You will need to escape those single quotes on the bash
command line to get the correct query passed to sqlite.
Your query is scanning the entire table looking for records that match
your conditions. I suspect that a lot of your variability is due to the
relative number of records matching your first condition. To speed these
searches you need an index on the columns you are searching.
Create index dataCampId on data (campId);
Note that sqlite will only use a single index per table per query. In
your case you are searching based on two columns. You can either create
a compound index on both columns
Create index dataCompound1 on data (campId, "14");
or create multiple indices on the individual columns and then use the
analyze command to let sqlite gather the statistics that it needs to
select the best index to use for a given query. The compound index
approach will be the fastest if you use the same pair of columns as
search criterion in most of your queries. If your search criteria vary
you are probably best using single indexes and the analyze command.
HTH
Dennis Cote
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------