> Maybe, if we had a column called 'published_date' and we did a query
> for data within a date range.. the fastest way to get the information
> back would be to have an index on that column.  Suppose we sorted
> all the data by date - would there be a way to use that information so
> that we don't have to create the index?

If you can make the column fit in a 64-bit int and make it unique, you
can use it as a fast index. The trick is to make it the table's rowid:
then the index *is* the structure of the table, and a lookup incurs
less overhead than going through a regular index.

Turning a non-unique date into a unique rowid is not trivial but it
isn't impossible. Store the published date in whatever numeric format
is convenient: YYYYMMDD, Julian date, seconds since the epoch, etc.
Let's pick YYYYMMDD as an example. That fits in 8 digits. A signed
64-bit int holds 19 digits, so if you put the date in the leftmost 8
then the other 11 can be a serial number to make a unique key.
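The packing scheme above can be sketched in a few lines of Python. The helper names (make_key, split_key) are mine, not from the post; only the 8-digit date / 11-digit serial split is:

```python
# Sketch: pack a YYYYMMDD date and a serial number into one 64-bit
# integer, with the date in the leftmost 8 digits so keys sort by date.

SERIAL_WIDTH = 10**11  # 11 digits reserved for the serial number

def make_key(yyyymmdd: int, serial: int) -> int:
    """Combine a date and a serial number into a single sortable key."""
    assert 0 <= serial < SERIAL_WIDTH, "serial must fit in 11 digits"
    return yyyymmdd * SERIAL_WIDTH + serial

def split_key(key: int) -> tuple[int, int]:
    """Recover (date, serial) from a packed key."""
    return divmod(key, SERIAL_WIDTH)

key = make_key(20100810, 42)
print(key)             # 2010081000000000042
print(split_key(key))  # (20100810, 42)
```

Note the largest such key, 99991231 followed by 11 nines, still fits comfortably under the signed 64-bit maximum of 9223372036854775807.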

Then create your table with an "integer primary key", e.g.

CREATE TABLE t(published_date_serial INTEGER PRIMARY KEY ASC, y, z);

After populating it, to select all records published between Aug 10
2010 and Sep 3 2010:

select
  published_date_serial/100000000000 as published_date, y, z
from t
where published_date_serial between 2010081000000000000 and 2010090399999999999;
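Here is the whole scheme end to end, as a hedged sketch using Python's built-in sqlite3 module. The table and column names follow the post; the sample rows are invented for illustration:

```python
import sqlite3

SERIAL_WIDTH = 10**11  # 11 serial digits, as described above

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t(published_date_serial INTEGER PRIMARY KEY ASC, y, z)")

# Invented sample data: (YYYYMMDD date, payload).
rows = [(20100801, "a"), (20100815, "b"), (20100901, "c"), (20100910, "d")]
for serial, (date, y) in enumerate(rows):
    conn.execute("INSERT INTO t VALUES (?, ?, ?)",
                 (date * SERIAL_WIDTH + serial, y, None))

# Range scan directly on the rowid: no separate index needed.
result = conn.execute(
    """SELECT published_date_serial / 100000000000 AS published_date, y
       FROM t
       WHERE published_date_serial
             BETWEEN 2010081000000000000 AND 2010090399999999999"""
).fetchall()
print(result)  # [(20100815, 'b'), (20100901, 'c')]
```

SQLite's `/` performs integer division when both operands are integers, so the SELECT recovers the plain YYYYMMDD date from the packed key.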

That is a bit of a digression.

The original question mirrors one I have: if I am creating a large
read-only database, and have control over the data I'm importing, will
the data import faster if it arrives in the order of the integer
primary key? I'll have to benchmark that.
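A minimal sketch of that benchmark, inserting the same keys in sorted versus shuffled order and comparing wall-clock time. The row count and timings are arbitrary; this only shows the methodology, not a result:

```python
import random
import sqlite3
import time

def time_insert(keys):
    """Insert the given keys as integer primary keys; return seconds taken."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t(k INTEGER PRIMARY KEY, v)")
    t0 = time.perf_counter()
    with conn:  # single transaction, so we measure insert order, not fsyncs
        conn.executemany("INSERT INTO t VALUES (?, ?)",
                         ((k, "x") for k in keys))
    return time.perf_counter() - t0

keys = list(range(200_000))
sorted_time = time_insert(keys)
random.shuffle(keys)
shuffled_time = time_insert(keys)
print(f"sorted:   {sorted_time:.3f}s")
print(f"shuffled: {shuffled_time:.3f}s")
```

Sorted-order inserts append to the rightmost b-tree page, so one would expect them to win; the margin is the thing to measure.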
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users