A lot of speculation here. I am certainly no SQLite expert. Your input has 1g positions, taking 187gb, so averaging 187b/position. From your CREATE TABLE, it looks like to get that size most of your qfrom and qto are fairly long strings. I'm assuming there are a great many duplications in those positions. If so, put them into a separate position table { positionId INT, positionName TEXT }, with positionId as the primary key and positionName also being unique. This will be even more useful if you have a fixed set of possible positions, and you make it so that positionId is increasing whenever positionName is increasing. In your qmoves table, store positionId values, rather than postionName values. Saves a lot of space because no name is in the database more than once, and most of your space is related to names.
Space is important, because at a few hundred gb, your database is probably bigger than all of your available fast cache's, and you are probably storing your data on rotating storage. Writes to random positions might easily average 10ms, and the import of most of your records may involve one or more such a writes by the time indices are updated. Reducing sizes mean fewer such writes, because things are more likely to fit in the various caches. I believe that with SQLite, if you don't specify WITHOUT ROWID your "real" record order is based on rowid, so complete records are, perhaps, stored in import order. The PRIMARY KEY you specified is large (a name and an integer), so the index entries are likely about half as large as a complete record, and they don't fit in cache either. It is also likely that your input data was not in sorted order. That means that adding an entry to that index likely involves a write to a random position. 10ms * 1g = 10e6s, or about three months. Your qmoves_by_dest index is about as large as your primary (it also has a position and an integer), so similar timings might apply. I suggest trying timing your script on smaller inputs (1gb, 10gb, 20gb) and seeing if there is a size where things suddenly get worse (available caches are not big enough). See if my suggestions help those timings. See if WITHOUT ROWID helps those timings. In principle, indices can be created by writing the needed information (index key, record position) in the original order, and then sorting that into key-order. That can be done with many less random seeks (merge sorts involve mostly sequential reads and writes). I don't know if, or when, SQLite does that. Regards, Bill -----Original Message----- From: Kevin O'Gorman [mailto:kevinogorm...@gmail.com] Sent: Wednesday, August 03, 2016 10:00 PM To: sqlite-users Subject: [sqlite] newbie has waited days for a DB build to complete. what's up with this. I'm working on a hobby project, but the data has gotten a bit out of hand. I thought I'd put it in a real database rather than flat ASCII files. I've got a problem set of about 1 billion game positions and 187GB to work on (no, I won't have to solve them all) that took about 4 hours for a generator program just to write. I wrote code to turn them into something SQLite could import. Actually, it's import, build a non-primary index, and alter table to add a column, all in sqlite3. The database was growing for about 1-1/2 days. Then its journal disappeared, the file size dropped to zero, but sqlite3 is still running 100% CPU time, now for a total of 3800+ minutes (63+ hours). The database is still locked, but I have no idea what sqlite3 is doing, or if it will ever stop. All partitions still have lots of space left (most of this is running in a RAID partition of 11 TiB). Here's what I gave to sqlite3 on my Linux system: time sqlite3 qubic.db <<EOF BEGIN EXCLUSIVE TRANSACTION; DROP TABLE IF EXISTS qmoves; CREATE TABLE qmoves ( qfrom CHAR(64), qmove INT, qto CHAR(64), qweight INT, PRIMARY KEY (qfrom, qmove) ON CONFLICT ROLLBACK ); CREATE INDEX IF NOT EXISTS qmoves_by_dest ON qmoves ( qto, qweight ); CREATE TABLE IF NOT EXISTS qposn ( qposn CHAR(64) PRIMARY KEY ON CONFLICT ROLLBACK, qmaxval INT, qmove INT, qminval INT, qstatus INT ); .separator " " .import am.all qmoves ALTER TABLE qmoves ADD COLUMN qstatus INT DEFAULT NULL; .schema COMMIT TRANSACTION; EOF Any clues, hints, or advice? -- #define QUESTION ((bb) || (!bb)) /* Shakespeare */ ************************************************************************************** This e-mail and any attachments thereto may contain confidential information and/or information protected by intellectual property rights for the exclusive attention of the intended addressees named above. If you have received this transmission in error, please immediately notify the sender by return e-mail and delete this message and its attachments. Unauthorized use, copying or further full or partial distribution of this e-mail or its contents is prohibited. ************************************************************************************** _______________________________________________ sqlite-users mailing list sqlite-users@mailinglists.sqlite.org http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users