On Sat, Dec 19, 2009 at 10:18 AM, Simon Slavin <slav...@bigfraud.org> wrote:
> On 19 Dec 2009, at 9:27am, Robert Citek wrote:
>
>> Does anyone have any pointers on how I can speed up a sqlite3 query
>> that sorts a list of random integers?
>
> What are  you doing that sorts them?

An "order by" clause.

> Are they in an indexed column, or are you doing it an ORDER BY clause ?

Not an indexed column.

>  If you have an index, delete it; if you don't, make one.  See whether that 
> speeds it up.

Building an index works well for 1 MM records. The build+query time
(~40s=27+10) is about 6x faster than querying without an index
(~240s), but still 6x slower than querying and piping to the sort
command (~6s).

However, times take noticeably longer when the dataset size is doubled
to 2 MM records.  Querying+index (~160=136+20) took 4x longer than
with the 1 MM dataset, although it was still 4x faster than querying
without an index (~610s).  Piping to the sort command was still the
fastest at 12s, more than 13x faster than querying with an index and
50x faster than querying without an index.

Below is version information and the timing data along with the script
that creates and queries the database.

Regards,
- Robert

----- version info -----

$ ( set -x ; sqlite3 -version ; lsb_release -a ; ruby --version )
+ sqlite3 -version
3.4.2
+ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 8.04.3 LTS
Release:        8.04
Codename:       hardy
+ ruby --version
ruby 1.8.6 (2007-09-24 patchlevel 111) [i486-linux]


----- Timing data -----

#### 1 MM records:

=== load data
1000000
real 12.42
user 18.61
sys 0.74
=== sort random
1000000
real 238.73
user 19.60
sys 10.81
=== shell sort random
1000000
real 5.92
user 5.79
sys 0.34
=== build index
real 27.01
user 14.51
sys 5.44
=== sort random with index
1000000
real 9.77
user 6.84
sys 3.00

#### 2 MM records:

=== load data
2000000
real 21.86
user 36.94
sys 1.34
=== sort random
2000000
real 613.56
user 39.49
sys 23.59
=== shell sort random
2000000
real 12.10
user 11.90
sys 0.76
=== build index
real 136.45
user 31.30
sys 14.20
=== sort random with index
2000000
real 20.07
user 14.04
sys 6.21


---- script -----
#!/bin/bash
[ -f sample.db ] && \rm sample.db
echo "=== load data"
time -p ruby -e '
size=1000000
srand=1234
(1..size).each do |i|
  puts [i, rand(2**15)].join("\t")
end
' |
sqlite3 -init <(echo '
CREATE TABLE sample ( samp1 INTEGER, samp2 INTEGER)
;
.mod tabs
.imp "/dev/stdin" "sample"
select count(*) from sample
;
') sample.db .quit

echo "=== sort random"
time -p { grep -v ^# <<eof
.mode tabs
select *
from sample
order by samp2
;
eof
} | sqlite3 sample.db |
wc -l

echo "=== shell sort random"
time -p { grep -v ^# <<eof
.mode tabs
select *
from sample
;
eof
} | sqlite3 sample.db |
LC_ALL=C sort -k2,2n |
wc -l

echo "=== build index"
time -p { grep -v ^# <<eof
create index sample_samp1 on sample (samp2)
;
eof
} | sqlite3 sample.db

echo "=== sort random with index"
time -p { grep -v ^# <<eof
.mode tabs
select *
from sample
order by samp2
;
eof
} | sqlite3 sample.db |
wc -l
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to