On 2017/05/31 9:31 PM, Thomas Flemming wrote:
Hi,
maybe, hopefully, I missed something, its still about this database:
http://files.qvgps.com/0-tom-public/Geonames_World_2017.zip
Copying just the ids from 12mio records ordered in a temp-table takes
60 seconds. There is a COLLATE NOCASE index on label.
Is this normal or can this also be done faster?
DROP TABLE IF EXISTS RowCursor;
CREATE TEMP TABLE RowCursor (Id int);
INSERT INTO RowCursor SELECT Id from Pois ORDER BY Label COLLATE NOCASE;
CREATE INDEX RowCursor_Id on RowCursor(Id);
At the end of the day it IS 12m records (or 11.42m to be more
precise)... so it should take a bit of time. That said, it could
probably be faster - My first attempt using your database included
changing the page-size from 1024 to 4096 using:
PRAGMA page_size = 4096; VACUUM; -- (Bytes) 4KB
(Ensure the drive with your TEMP folder has more than 5GB free else the
above may fail).
Then setting Synchronous mode from FULL to Normal using:
PRAGMA synchronous = 1; -- Normal
Then jacked up the cache size from 2000 bytes to 8000 pages using:
PRAGMA cache_size = 8000; -- (Pages)
Made sure Journal mode is DELETE,
Made sure Threads is set to 8.
Next I dropped the "CREATE INDEX..." bit at the end - it only consumed
circa 3 seconds, but it is not needed, your row-id is already indexed
and you only use the Id field for reference look-ups in the other table,
no need to index it here. The result is what seems to be about half your
time. I use a good processor but the DB itself sat on a platter drive,
so I doubt the gains are due to system differences, though some of it
might be. Herewith the result:
-- SQLite version 3.17.0 [ Release: 2017-02-13 ] on SQLitespeed
version 2.0.2.4.
--
================================================================================================
DROP TABLE IF EXISTS RowCursor;
CREATE TEMP TABLE RowCursor (Id int);
INSERT INTO RowCursor SELECT Id from Pois ORDER BY Label COLLATE NOCASE;
-- Item Stats: Item No: 3 Query Size
(Chars): 74
-- VM Work Steps: 102790606 Rows
Modified: 11421177
-- Full Query Time: 0d 00h 00m and 28.471s
-- Query Result: Success.
--
------------------------------------------------------------------------------------------------
-- Script Stats: Total Script Execution Time: 0d 00h 00m and
29.389s
-- Total Script Query Time: 0d 00h 00m and
28.522s
-- Total Database Rows Changed: 11421177
-- Total Virtual-Machine Steps: 205581259
-- Last executed Item Index: 4
-- Last Script Error:
--
================================================================================================
On another tangent: No person can ever look at 12 million records in one
sitting. You should limit the span of results to something plausible,
like 100K rows. It's still a lot, but it is conceivable a person can
spend a few hours paging from one page to the next traversing an actual
100K rows... Improbable, but possible. Scanning 12 mil or even 1 mil
records is just implausible.
If you are going to view the entire table in Alphabetical order, an even
faster tactic would be to just permanently keep that Table with all the
rows in alphabetical order as a kind-of index table, filling it with a
trigger from the main table when changes happen. That way you can at any
time search for any Label, and when found, just look up the Pois ID in
the RowCursor table, and start paging from that row_id. How your UI
will work and how you want it to work will of course dictate what is the
best solution.
Good luck!
Ryan
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users