On 08/15/2016 12:02 AM, Венцислав Русев wrote:
My computer has 4 cores. I have compile sqlite like this "gcc -DSQLITE_MAX_WORKER_THREADS=4 -DSQLITE_DEFAULT_WORKER_THREADS=4 shell.c sqlite3.c -lpthread -ldl -o sqlite3". I made some tests and found that "pragma threads = 4" doesn't decrease runtime of the query that sorts 1 milion records.

SQLite version 3.8.8
sqlite> pragma threads;
4
sqlite> CREATE TABLE event (
    ID             INTEGER PRIMARY KEY NOT NULL,
    date           INTEGER NOT NULL,
    value          INTEGER NOT NULL );
sqlite> EXPLAIN QUERY PLAN SELECT ID FROM event ORDER BY date LIMIT 1;
0|0|0|SCAN TABLE event
0|0|0|USE TEMP B-TREE FOR ORDER BY
sqlite> SELECT ID FROM event ORDER BY date LIMIT 1;
4101021
Run Time: real 2.493 user 2.426000 sys 0.049000
sqlite> pragma threads = 0;
0
sqlite> SELECT ID FROM event ORDER BY date LIMIT 1;
4101021
Run Time: real 2.484 user 2.421000 sys 0.044000

To sum it up I have:
  - multi core cpu;
  - sqlite that is compiled to use "auxiliary threads";
  - a table with 1 million records;
- a query that scans through the table, sorts all records and outputs the first ID.

sqlite doesn't use these "auxiliary threads" that sqlite docs talks about and the runtime of that query is the same with or without "pragma threads = 4".

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Some clarifications:
  - OS: Ubuntu 14.04 LTS;
  - "PRAGMA compile_options" gives "SYSTEM_MALLOC  THREADSAFE=1";
- when sqlite shell is running and no query is running at the moment the OS reports that it has only one thread;


On 09/16/2016 11:29 AM, Richard Hipp wrote:

The algorithm used for "ORDER BY ... LIMIT N" uses much less memory than a full-up "ORDER BY" because is only keeps track of the top N entries seen so far, discarding the rest. But it also only uses a single thread. If you want multiple threads to be used, you'll need to drop the LIMIT, though I imagine that would defeat your purpose, no?

Richard Hipp has a point. I searched the sources and found that the sorting algorithm uses threads only if there isn't a LIMIT clause. To be exact the sources say "This file contains code for the VdbeSorter object, used in concert with a VdbeCursor to sort large numbers of keys for CREATE INDEX statements or by SELECT statements with ORDER BY clauses that cannot be satisfied using indexes and without LIMIT clauses."

But I'm wondering what if your query is like so "SELECT ID FROM event ORDER BY date LIMIT 1 OFFSET 1000000;"? Isn't it better to sort the results using available threads?


On 09/16/2016 12:13 PM, Adam Devita wrote:

Have you proven that the cpu is the bottleneck? Sorting a million rows seems like a lot, but even older single core cpus may be capable of 2 billion ops per second. [I apologize if this has been sorted out already.... I've got about 2 days of history on the thread] regards, Adam DeVita

This was just a test to see the performance of "PRAGMA threads". Actually I expect to sort 10-30 million rows or more. But after the tests I may choose the index.


Just to let you know what other tests I've conducted.

sqlite> .output stdout

sqlite> pragma threads = 4;
sqlite> select ID from event order by date;
Run Time: real 2.880 user 4.469000 sys 0.147000

sqlite> pragma threads = 0;
sqlite> select ID from event order by date;
Run Time: real 3.870 user 3.771000 sys 0.075000

Besides the runtime difference I inspected the process information while the query is running. When "pragma threads = 4;" sqlite shell starts new threads for sorting and stops them after the query has finished.

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to