Hubertus wrote:

This is the schema:
CREATE TABLE 'data'('nr' INTEGER PRIMARY KEY,
  'year' INTEGER,
  'month' INTEGER,
  'day' INTEGER,
  'sec' REAL,
  'campId' TEXT,
  'flightNr' INTEGER,
  '1' REAL,
  ...
  '71' REAL
  );
You should use double quotes around column and table names. Single quotes delimit literal strings in SQL.

CREATE INDEX sec on data(year,month,day,sec);
This index will not help with the queries you are testing.

I experience a big variability of time a query needs:
~database> time sqlite3  data.db 'select "14" from data where campId="polstar98" and 
"14">-9999;' >/dev/null
real    0m3.115s
user    0m1.748s
sys     0m1.368s
~/database> time sqlite3  data.db 'select "14" from data where campId="polstar98" and 
"14">-9999;' >/dev/null
real    0m3.139s
user    0m1.756s
sys     0m1.380s
~/database> time sqlite3  data.db 'select "42" from data where campId="stream94" and 
"14">-9999;' >/dev/null
real    0m50.227s
user    0m4.692s
sys     0m3.028s

I've tried it with and without index and at different times. Most of the
time such simple queries take about 35 sec. Why is that so and what can be done?
I can live with 3 sec of response but not up to one minute and the
database is still not complete. Would pytables with hdf5 be an alternative? Knowing that this is probably not the right place to ask...


You have your quotes mixed up in your queries as well.

   select "42" from data where campId="stream94" and "14">-9999;

should be

   select "42" from data where campId='stream94' and "14">-9999;

The value 'stream94' is a literal string and should be delimited with single quotes. You will need to escape those single quotes on the bash command line to get the correct query passed to sqlite.

Your query is scanning the entire table looking for records that match your conditions. I suspect that a lot of your variability is due to the relative number of records matching your first condition. To speed these searches you need an index on the columns you are searching.

   Create index dataCampId on data (campId);

Note that sqlite will only use a single index per table per query. In your case you are searching based on two columns. You can either create a compound index on both columns

   Create index dataCompound1 on data (campId, "14");

or create multiple indices on the individual columns and then use the analyze command to let sqlite gather the statistics that it needs to select the best index to use for a given query. The compound index approach will be the fastest if you use the same pair of columns as search criterion in most of your queries. If your search criteria vary you are probably best using single indexes and the analyze command.

HTH
Dennis Cote


-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------

Reply via email to