Robert,

> [snip]
> 
> > i said i print these rows to /dev/null too in my perl code. plus the
> > perl code does some other things such as joining these rows with other
> > hashes and summing the numbers.
> 
> That's fine.  I was merely trying to account for the 50% speed difference
> between the two differing column tests, which has been accomplished.
> 
> > > As for the temp table ... I haven't tried this, but isn't
> > "temp" a reserved
> > > word in SQLite?  More importantly, you should be doing this
> > statement inside
> >
> > yes, it is. i really want to create 'temporary table' in memory. i was
> > really hoping it would speed things up.
> 
> I misread the statement, so ignore me on that part.  However, 339,000 rows
> into a temporary in-memory table ... I tried some experiments locally here
> and none of them took more than 2 seconds to execute.  Are you sure you're
> not using up all available memory, which is causing the system to hit the
> swapfile?  What does this same query look like when you drop the "temp" from
> the query?

the system has 1G of ram. i was "monitoring" sqlite3 memory usage with
'top'. the SIZE and RES did not exceed 30M. so i do not think the
memory is the issue here.

> time sqlite3 db 'create table foo as select * from data where a <= 18234721' > /dev/null
22.06u 1.39s 0:27.75 84.5%

so pretty much the same time without 'temp'.
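one thing worth ruling out first: by default sqlite may spill temp tables to a file on disk, and the temp_store pragma controls that. a minimal sketch of the check, assuming the standard sqlite3 shell (the sample data below is made up; table/column names follow the ones above):

```shell
# sketch: confirm temp tables actually live in memory via PRAGMA temp_store.
# the tiny sample data here stands in for the real 339k rows.
sqlite3 /tmp/demo.db <<'EOF'
CREATE TABLE data (a INTEGER, n1 INTEGER);
INSERT INTO data VALUES (1, 10), (18234721, 20), (99999999, 30);
PRAGMA temp_store = MEMORY;   -- 2 = MEMORY; set before the temp table is created
PRAGMA temp_store;            -- prints 2 if the setting took effect
CREATE TEMP TABLE foo AS SELECT * FROM data WHERE a <= 18234721;
SELECT count(*) FROM foo;     -- prints 2 for this sample data
EOF
rm -f /tmp/demo.db
```

if temp_store reports 0 (default) or 1 (file), the "in-memory" temp table may have been going to disk all along, which would explain why dropping 'temp' changes nothing.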

i'm starting to suspect disk. here is what i did. i created a separate
database with only one table. this table contains a subset of 333,392
rows from the original data table. it also has the same index on the
"a" column, i.e. i did

> sqlite3 db1
sqlite> attach db as s;
sqlite> create table data as select * from s.data where a <= 18234721;
sqlite> create index data_by_a on data (a);

full scan

> time sqlite3 db1 'select n1 from data' > /dev/null
17.19u 0.55s 0:19.06 93.0%

"bad" index scan, because it is guaranteed then the table only has
keys that match "where"

> time sqlite3 db1 'select n1 from data where a <= 18234721' > /dev/null
25.73u 0.59s 0:28.37 92.7%

+10 seconds! is this the overhead of an "indexed" scan? is this what it
really takes to seek back and forth between the index and the data?
what am i missing here?
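fwiw the extra time would be consistent with one seek from the index
b-tree back into the table b-tree per row, just to fetch n1. a sketch of
two checks, assuming the standard sqlite3 shell (the covering index name
data_by_a_n1 is made up, and the sample data stands in for the real rows):

```shell
# sketch: see which plan the query uses, then try a covering index on (a, n1)
# so n1 is stored in the index b-tree and no table seek is needed per row.
sqlite3 /tmp/demo1.db <<'EOF'
CREATE TABLE data (a INTEGER, n1 INTEGER);
INSERT INTO data VALUES (1, 10), (18234721, 20), (99999999, 30);
CREATE INDEX data_by_a ON data (a);
EXPLAIN QUERY PLAN SELECT n1 FROM data WHERE a <= 18234721;
-- hypothetical covering index: a two-column index that also carries n1
CREATE INDEX data_by_a_n1 ON data (a, n1);
EXPLAIN QUERY PLAN SELECT n1 FROM data WHERE a <= 18234721;
EOF
rm -f /tmp/demo1.db
```

if the second plan says "USING COVERING INDEX" and the timing drops back
toward the full-scan number, the +10 seconds really was the index-to-table
seek overhead.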

thanks,
max
