On 08/15/2016 12:02 AM, Венцислав Русев wrote:
My computer has 4 cores. I have compile sqlite like this "gcc
-DSQLITE_MAX_WORKER_THREADS=4 -DSQLITE_DEFAULT_WORKER_THREADS=4
shell.c sqlite3.c -lpthread -ldl -o sqlite3". I made some tests and
found that "pragma threads = 4" doesn't decrease runtime of the query
that sorts 1 milion records.
SQLite version 3.8.8
sqlite> pragma threads;
4
sqlite> CREATE TABLE event (
ID INTEGER PRIMARY KEY NOT NULL,
date INTEGER NOT NULL,
value INTEGER NOT NULL );
sqlite> EXPLAIN QUERY PLAN SELECT ID FROM event ORDER BY date LIMIT 1;
0|0|0|SCAN TABLE event
0|0|0|USE TEMP B-TREE FOR ORDER BY
sqlite> SELECT ID FROM event ORDER BY date LIMIT 1;
4101021
Run Time: real 2.493 user 2.426000 sys 0.049000
sqlite> pragma threads = 0;
0
sqlite> SELECT ID FROM event ORDER BY date LIMIT 1;
4101021
Run Time: real 2.484 user 2.421000 sys 0.044000
To sum it up I have:
- multi core cpu;
- sqlite that is compiled to use "auxiliary threads";
- a table with 1 million records;
- a query that scans through the table, sorts all records and
outputs the first ID.
sqlite doesn't use these "auxiliary threads" that sqlite docs talks
about and the runtime of that query is the same with or without
"pragma threads = 4".
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
Some clarifications:
- OS: Ubuntu 14.04 LTS;
- "PRAGMA compile_options" gives "SYSTEM_MALLOC THREADSAFE=1";
- when sqlite shell is running and no query is running at the moment
the OS reports that it has only one thread;
On 09/16/2016 11:29 AM, Richard Hipp wrote:
The algorithm used for "ORDER BY ... LIMIT N" uses much less memory
than a full-up "ORDER BY" because is only keeps track of the top N
entries seen so far, discarding the rest. But it also only uses a
single thread. If you want multiple threads to be used, you'll need to
drop the LIMIT, though I imagine that would defeat your purpose, no?
Richard Hipp has a point. I searched the sources and found that the
sorting algorithm uses threads only if there isn't a LIMIT clause.
To be exact the sources say "This file contains code for the VdbeSorter
object, used in concert with a VdbeCursor to sort large numbers of keys
for CREATE INDEX statements or by SELECT statements with ORDER BY
clauses that cannot be satisfied using indexes and without LIMIT clauses."
But I'm wondering what if your query is like so "SELECT ID FROM event
ORDER BY date LIMIT 1 OFFSET 1000000;"? Isn't it better to sort the
results using available threads?
On 09/16/2016 12:13 PM, Adam Devita wrote:
Have you proven that the cpu is the bottleneck? Sorting a million rows
seems like a lot, but even older single core cpus may be capable of 2
billion ops per second. [I apologize if this has been sorted out
already.... I've got about 2 days of history on the thread] regards,
Adam DeVita
This was just a test to see the performance of "PRAGMA threads".
Actually I expect to sort 10-30 million rows or more. But after the
tests I may choose the index.
Just to let you know what other tests I've conducted.
sqlite> .output stdout
sqlite> pragma threads = 4;
sqlite> select ID from event order by date;
Run Time: real 2.880 user 4.469000 sys 0.147000
sqlite> pragma threads = 0;
sqlite> select ID from event order by date;
Run Time: real 3.870 user 3.771000 sys 0.075000
Besides the runtime difference I inspected the process information while
the query is running. When "pragma threads = 4;" sqlite shell starts new
threads for sorting and stops them after the query has finished.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users