P Kishor wrote:
On 2/23/07, Dennis Cote <[EMAIL PROTECTED]> wrote:
P Kishor wrote:
>
>
> Most of the time I am looking at one site, so there is a speed gain by
> not plowing through other sites' data. This is what is causing me to
> pause before I rush forward.
>
If you have an index on the readings table by site_id (and perhaps
timestamp) sqlite will use the index to go directly to that site's
records and will not plow through all the other site's data. Adding a
second field, like the timestamp, to the index will let you do fast
searches or selections based on time within a single site as well.
P.S. I would recommend keeping the integer primary key ID fields.
Because sqlite uses them for the rowid (which each record in each table
must have anyway) they cost you nothing and make joins more direct and
obvious.
HTH
yes, Dennis, this does help. Thanks. I liked John Stanton's suggestion
as well, very elegant, but I think your approach is more scalable, as
once the prototype is done and working well, it will be migrated to
PostGres. A question -- you say
If you have an index on the readings table by site_id (and perhaps
timestamp) sqlite will use the index to go directly to that site's
records and will not plow through all the other site's data. Adding a
second field, like the timestamp, to the index will let you do fast
well, site_id is a PRIMARY KEY, so it is automatically indexed. I am
not creating another index on it. I am indexing timestamp, but are you
suggesting a compound site,timestamp index?
Puneet,
The column site_id is a primary key in the sites table only. It is an
unindexed column in the readings table. I was suggesting an index on
site_id in readings table. This will provide an index so sqlite can
start at the first site specific record when doing selects from the
reading table with a where clause that includes the site. This sill
eliminate the table scan of the readings table.
create index reading_site on readings(site)
select ... from readings where site = ? and ...
Yes, I was suggesting a compound index. If you use the following
compound index
create index reading_site_timestamp on reading(site, timestamp)
it will be used for site specific queries and will also let you order
and/or condition your search by time.
select ... from readings where site = ? and timestamp > ? order by
timestamp.
This will start at the first record that matches both the site and
timestamp criteria and scan only records that match those criteria.
Also, if I have two cols a and b, and I create separate indexes on a
and b, would it hurt if I also have an additional compound index on
a,b? In other words, does SQLite have the smarts to know when to use
what?
Yes and no. It will take more space to store the additional index, and
it also take longer to do inserts since sqlite must update the
additional indexes. And, sqlite will only use one of the indexes in any
particular query, regardless of how many might be useful. If you do an
analyze after the tables are populated, sqlite should pick the best
available index for a given query. It is not useful to have an index on
b if you never search by that column by itself. It can't be used in
conjunction with the index on a to satisfy a query like the second one
above, you need a compound index.
HTH
Dennis Cote
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------