Lots of answers, so thanks all around.  Some more info:

1. All partitions have at least 3 GB free, and it's not changing.  /tmp is
3 TiB and empty.
2. I have a RAID partition, for size, but no RAID controller.  As a hobby
project, I don't have spare parts, and I fear the results of a failure of a
hardware RAID without a spare, so I use Linux mdadm to manage software RAID
across three 4-TB drives.
3. Positions are 64 bytes always, so your size guesses are right.  They are
in no particular order.  I like the suggestion of a separate position
table, because they're going to appear in multiple qmove records, with an
average of about 3 or 4 appearances I think.  Maybe more.

I'm going to retry, using some of the suggestions above: smaller
transactions, IGNORE, position table rowids in the moves table, smaller
subsets being loaded, developing ideas of how time depends on data size.

If it doesn't go well fairly quickly, I'll probably go back to flat files
and writing the queries the hard way.  At least I know what I'm dealing
with there.

QUESTIONS:
If it's best in general to build indexes as a separate step, does this also
apply to primary indexes?  Can a table without a primary index have one
added later?  Isn't ROWID the real primary, presuming it has ROWIDs?  And
if so, then is a primary index on a ROWID table just for compliance with
standard SQL, and really no better than any other index?  Obviously, I'm a
bit confused about this.

While I'm at it, I may as well ask if ROWID has any physical significance,
such that a VACUUM operation might change it.  Or is it just an arbitrary
ID inserted by SQLite and added to each record when they exist at all.

The current dataset is intended to solve one particular issue in the
overall project.  It looks like I'd want to build each such dataset
separately, as there will likely be a few hundred, and I gather that adding
to these tables will be pretty slow once the indexes have been built.  Or
is it sensible to drop indexes, add data and rebuild?

On Thu, Aug 4, 2016 at 7:27 AM, Jim Callahan <jim.callahan.orla...@gmail.com
> wrote:

> Temp Files
> Have you checked how much storage is available to the temporary file
> locations?
> The temporary file locations are different depending on the OS, build, VFS
> and PRAGMA settings.
> See the last section "5.0 Temporary File Storage Locations" of:
> https://www.sqlite.org/tempfiles.html
>
>
> The database was growing for about 1-1/2 days.  Then its journal
> > disappeared, the file size dropped to zero, but sqlite3 is still running
> > 100% CPU time, now for a total of 3800+ minutes (63+ hours).  The
> database
> > is still locked, but I have no idea what sqlite3 is doing, or if it will
> > ever stop.  All partitions still have lots of space left (most of this is
> > running in a RAID partition of 11 TiB).  Here's what I gave to sqlite3 on
> > my Linux system:
>
>
> You might have a huge storage allocation for the main file and log, but
> some other temp file might be being dumped
> to a more constrained storage location.
>
> RAM
> Since you are using RAID disk controller; I assume you have 64 bit CPU and
> more than 8 GB of RAM?
> If you have 8 GB or more of RAM would it help to use an in memory database?
>
> Transactions
> Are you using explicit or implicit transactions?
> https://www.sqlite.org/lang_transaction.html
>
>
> Steps
> Agree with Darren Duncan and Dr. Hipp you may want to have at least 3
> separate steps
> (each step should be a separate transaction):
>
> 1. Simple load
> 2. Create additional column
> 3. Create index
>
> Have you pre-defined the table you are loading data into? (step 0 CREATE
> TABLE)
>
> If "Step 1 Simple Load" does not complete; then may want to load a fixed
> number of rows into separate tables (per Darren Duncan)  and then combine
> using an APPEND
> or a UNION query (doing so before steps 2 and 3).
>
> HTH
>
> Jim Callahan
> Data Scientist
> Orlando, FL
>
>
>
>
> On Wed, Aug 3, 2016 at 11:00 PM, Kevin O'Gorman <kevinogorm...@gmail.com>
> wrote:
>
> > I'm working on a hobby project, but the data has gotten a bit out of
> hand.
> > I thought I'd put it in a real database rather than flat ASCII files.
> >
> > I've got a problem set of about 1 billion game positions and 187GB to
> work
> > on (no, I won't have to solve them all) that took about 4 hours for a
> > generator program just to write.  I wrote code to turn them into
> something
> > SQLite could import.  Actually, it's import, build a non-primary index,
> and
> > alter table to add a column, all in sqlite3.
> >
> > The database was growing for about 1-1/2 days.  Then its journal
> > disappeared, the file size dropped to zero, but sqlite3 is still running
> > 100% CPU time, now for a total of 3800+ minutes (63+ hours).  The
> database
> > is still locked, but I have no idea what sqlite3 is doing, or if it will
> > ever stop.  All partitions still have lots of space left (most of this is
> > running in a RAID partition of 11 TiB).  Here's what I gave to sqlite3 on
> > my Linux system:
> >
> > time sqlite3 qubic.db <<EOF
> > BEGIN EXCLUSIVE TRANSACTION;
> > DROP TABLE IF EXISTS qmoves;
> > CREATE TABLE qmoves (
> >   qfrom CHAR(64),
> >   qmove INT,
> >   qto   CHAR(64),
> >   qweight INT,
> >   PRIMARY KEY (qfrom, qmove) ON CONFLICT ROLLBACK
> > );
> > CREATE INDEX IF NOT EXISTS qmoves_by_dest ON qmoves (
> >   qto,
> >   qweight
> > );
> > CREATE TABLE IF NOT EXISTS qposn (
> >   qposn CHAR(64) PRIMARY KEY ON CONFLICT ROLLBACK,
> >   qmaxval INT,
> >   qmove INT,
> >   qminval INT,
> >   qstatus INT
> > );
> > .separator " "
> > .import am.all qmoves
> > ALTER TABLE qmoves ADD COLUMN qstatus INT DEFAULT NULL;
> > .schema
> > COMMIT TRANSACTION;
> >
> > EOF
> >
> > Any clues, hints, or advice?
> >
> >
> > --
> > #define QUESTION ((bb) || (!bb)) /* Shakespeare */
> > _______________________________________________
> > sqlite-users mailing list
> > sqlite-users@mailinglists.sqlite.org
> > http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
> >
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@mailinglists.sqlite.org
> http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
#define QUESTION ((bb) || (!bb)) /* Shakespeare */
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to