Christian,

thanks for the reply.

> >i'm having a strange performance problem with sqlite-3.2.0. consider the
> >following table
> >
> > [snip]
> >
> >now the problem:
> >
> >1) if i do a select with an index it takes 27 sec. to get 333392 rows
> >
> >> time sqlite3 db 'select n2 from data where a <= 18234721' > /dev/null
> >25.95u 0.71s 0:27.02 98.6%
> >
> >> time sqlite3 db 'select n2 from data where a <= 18234721' > /dev/null
> >26.02u 0.66s 0:27.53 96.9%
> >
> >2) if i do a select with a sequential lookup it takes 1 min to get 9818210 rows
> >
> >> time sqlite3 db 'select n2 from data where a+0 <= 18234721' > /dev/null
> >49.54u 14.65s 1:04.98 98.7%
> >
> >> time sqlite3 db 'select n2 from data where a+0 <= 18234721' > /dev/null
> >49.80u 14.42s 1:05.03 98.7%
> >
> >- so how come it takes only twice as much time to get 30 times more rows?
> 
> When doing an index scan, you will be accessing the index as well as the
> table pages. For a sequential scan, only the table pages are touched,
> reducing thrashing of the cache.

i guess i can believe this. however, it's pretty disappointing to get only
a ~50% improvement when the result set is 30 times smaller :(
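
fwiw, this is roughly how i'd double-check which plan each query actually
uses (just a sketch of the commands, no output pasted; i'm assuming this
sqlite3 build already understands EXPLAIN QUERY PLAN, otherwise plain
EXPLAIN dumps the vdbe program instead):

> sqlite3 db 'explain query plan select n2 from data where a <= 18234721'
> sqlite3 db 'explain query plan select n2 from data where a+0 <= 18234721'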

but how do you explain this?

sqlite> .schema data
CREATE TABLE data
(
       a INTEGER,
       b INTEGER,
       c CHAR,
       d INTEGER,
       e INTEGER,
       n1 FLOAT,
       n2 FLOAT
);
CREATE INDEX data_by_a ON data (a);

> time sqlite3 db 'select n1 from data where a <= 18234721' > /dev/null
26.15u 0.59s 0:27.00 99.0%
> time sqlite3 db 'select n1 from data where a <= 18234721' > /dev/null
26.04u 0.61s 0:26.91 99.0%

and

> time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
12.22u 0.41s 0:12.67 99.6%
> time sqlite3 db 'select e from data where a <= 18234721' > /dev/null
12.01u 0.61s 0:12.64 99.8%

so just by selecting two different columns (which sit right next to each
other) i get another 50% time difference? sure, the column types are
different, and i can even understand that a FLOAT might be 8 bytes and an
INTEGER 4 bytes, but a 50% time difference? it just can't be that linear,
can it?
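
just to rule out surprises from sqlite's dynamic typing (the declared
FLOAT/INTEGER is only an affinity, not a fixed width), something like this
should show which storage class is actually used for those columns; a
sketch, not actual output:

sqlite> select typeof(n1), typeof(e) from data limit 5;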

do you think it is possible to get better results by issuing multiple
queries, each of which returns an even smaller dataset? and, no, i'm not
a database guy.
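
to make that concrete, i mean something along these lines (the cut point
is made up just to show the idea, i haven't measured whether it actually
helps):

> time sqlite3 db 'select n2 from data where a <= 9000000' > /dev/null
> time sqlite3 db 'select n2 from data where a > 9000000 and a <= 18234721' > /dev/null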

> Also, if the data is not in index order in the table, and/or dispersed
> across the database file, you may have to visit each page more than once
> when traversing in index order. In the full table scan, you'll read the
> rows in table order, hence only touching each page once before moving on
> to later pages, thus reducing cache thrashing even more.

all rows were inserted into the table in an order that matches the index.
i can't say whether the records on disk end up in the same order.

> >- and why is it taking 27 seconds to get 333392 rows anyway?
> 
> You think 12347 rows/s is bad?

because i have the same data set in a perl multilevel hash stored on
disk (Storable format), and it takes about 1 minute to perform the
same query on the same hardware. that time includes reading the data
from disk, traversing every single key on each hash level, etc. i was
hoping that sqlite would do the same thing in just a few (< 10)
seconds.

> >- is there any way to create an index in separate file?
> 
> No, SQLite is a single file embedded database. Keeps administration
> almost non-existent (by design.)

- so, if i have a large table (> 9 million records) with an index,
then sqlite would have to constantly seek back and forth between the
index and the data (within the same file), probably wasting lots of
disk bandwidth?

- if the above is correct, then can i force sqlite to load all (or as
much as possible) of the index into memory?
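
for example something like the following, if cache_size is even the right
knob; as far as i understand it is counted in pages (1 kbyte each by
default), so this would allow roughly 100 mbytes of cache:

sqlite> PRAGMA cache_size = 100000;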

> >the hardware is sun netra t1 running solaris 5.7. the db file size is
> >about 800 mbytes.
> >
> >just for the record i'd like to have at least 15 times more records in
> >the 'data' table.
> 
> If you can match SQLite for simple read-only throughput with another
> database, you'll be doing well.

i do not know about other databases, but compared to plain perl it's not
that impressive. i must be doing something wrong here.

thanks,
max
