Re: [sqlite] build db in memory then dump to a persistent file.

2009-10-05 Thread Jeremy Hinegardner
Hi George,

On Sat, Oct 03, 2009 at 01:50:31PM -0700, George Hartzell wrote:
> 
> Hi all,
> 
> I use an SQLite database with the rtree extension to hold information
> about genetic polymorphisms (SNPs), based on UCSC's mapping work and
> their MySQL table dumps.  My database is written once and only read
> from then on.
> 
> One of the tables has almost 19 million rows.  This is the table on
> which I build my 2-D rtree index.
> 
> I read the data from tab-delimited MySQL dumps into the basic tables,
> then run a query that pulls the data I want to rtree-index out of its
> table, cleans up one of the columns, and then inserts it into the rtree
> index.  Finally I add a set of indices to several columns in the big
> table.
> 
> I tried using .import to move the data into the tables, but gave up
> after it ran well past the time it took to do the inserts.  That
> didn't seem to help with the index/rtree creation time either.
> 
> I'm turning synchronous off and doing the inserts inside of a
> transaction.
> 
> I was wondering if there was some way to populate the database inside
> a ':memory:' database and then dump/copy/... the results into a file.
> I've seen posts that suggest that I can select from the memory tables
> and insert into the persistent ones, but that seems like it'd take
> more work to get the indexes.
> 
> I'd be interested in any suggestions for making my loading go faster.

Have you tried working with the backup API?  If all the data will fit
in memory, do all the work in a :memory: database and then use the
backup API to dump it to disk.

http://sqlite.org/backup.html
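
Roughly, something along these lines (a minimal sketch: it assumes
mem_db is the already-populated in-memory connection, and error
handling is trimmed down):

    #include <sqlite3.h>
    #include <stdio.h>

    /* Copy everything in an open :memory: connection into a new file
     * on disk using the online backup API documented at the URL above. */
    int dump_memory_db_to_file(sqlite3 *mem_db, const char *filename)
    {
        sqlite3 *file_db = NULL;
        sqlite3_backup *bk;
        int rc = sqlite3_open(filename, &file_db);

        if (rc != SQLITE_OK) {
            fprintf(stderr, "open failed: %s\n", sqlite3_errmsg(file_db));
            sqlite3_close(file_db);
            return rc;
        }

        /* "main" names the primary database on both connections. */
        bk = sqlite3_backup_init(file_db, "main", mem_db, "main");
        if (bk) {
            sqlite3_backup_step(bk, -1);  /* -1 => copy all pages at once */
            sqlite3_backup_finish(bk);
        }
        rc = sqlite3_errcode(file_db);

        sqlite3_close(file_db);
        return rc;
    }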

enjoy,

-jeremy

-- 

 Jeremy Hinegardner  jer...@hinegardner.org 



Re: [sqlite] build db in memory then dump to a persistent file.

2009-10-03 Thread Simon Slavin

On 3 Oct 2009, at 9:50pm, George Hartzell wrote:

> One of the tables has almost 19 million rows.  This is the table on
> which I build my 2-D rtree index.
>
> I read the data from tab-delimited MySQL dumps into the basic tables,
> then run a query that pulls the data I want to rtree-index out of its
> table, cleans up one of the columns, and then inserts it into the rtree
> index.  Finally I add a set of indices to several columns in the big
> table.
>
> I tried using .import to move the data into the tables, but gave up
> after it ran well past the time it took to do the inserts.  That
> didn't seem to help with the index/rtree creation time either.
>
> I'm turning synchronous off and doing the inserts inside of a
> transaction.

One or the other will help a lot, but you shouldn't need both.
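
For concreteness, a tiny sketch of the two options being compared,
assuming an open connection db and leaving out error checks:

    #include <sqlite3.h>

    static void bulk_load_speedups(sqlite3 *db)
    {
        /* Option 1: don't wait for writes to reach the disk (no fsync). */
        sqlite3_exec(db, "PRAGMA synchronous = OFF;", 0, 0, 0);

        /* Option 2: one big transaction, so there is only one commit
         * instead of one per INSERT. */
        sqlite3_exec(db, "BEGIN;", 0, 0, 0);
        /* ... the millions of INSERTs go here ... */
        sqlite3_exec(db, "COMMIT;", 0, 0, 0);
    }

Either one on its own avoids paying a per-INSERT sync.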

> I was wondering if there was some way to populate the database inside
> a ':memory:' database and then dump/copy/... the results into a file.
> I've seen posts that suggest that I can select from the memory tables
> and insert into the persistent ones, but that seems like it'd take
> more work to get the indexes.
>
> I'd be interested in any suggestions for making my loading go faster.

I see you're already doing the important bit: using a transaction.
Possibly the next biggest influence is INSERTing the rows first and
making the indices later.  I don't think using a memory table will
help much.  I would probably write the INSERT commands to a text file,
then top and tail it with the other commands, or use the command-line
tool and type the other commands manually.  If you have two hard
disks, you might get the best results by putting the text file with
the INSERTs on a different hard disk, or you might not: it depends on
how your controllers work.  Try it with 100,000 records and see which
is faster.
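
A rough C sketch of that ordering, in case it helps (the table and
column names are only placeholders and error checking is omitted):

    #include <sqlite3.h>

    static void load_then_index(sqlite3 *db)
    {
        sqlite3_stmt *ins;

        /* Bulk-load everything inside one transaction... */
        sqlite3_exec(db, "BEGIN;", 0, 0, 0);
        sqlite3_prepare_v2(db,
            "INSERT INTO snp(name, chrom, chromStart, chromEnd)"
            " VALUES (?,?,?,?)", -1, &ins, 0);
        /* for each row of the tab-delimited dump:
         *   bind the four columns, sqlite3_step(ins), sqlite3_reset(ins) */
        sqlite3_finalize(ins);
        sqlite3_exec(db, "COMMIT;", 0, 0, 0);

        /* ...and only then create the indices on the loaded table. */
        sqlite3_exec(db,
            "CREATE INDEX snp_chrom_idx ON snp(chrom, chromStart);",
            0, 0, 0);
    }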

Simon.



[sqlite] build db in memory then dump to a persistent file.

2009-10-03 Thread George Hartzell

Hi all,

I use an SQLite database with the rtree extension to hold information
about genetic polymorphisms (SNPs), based on UCSC's mapping work and
their MySQL table dumps.  My database is written once and only read
from then on.

One of the tables has almost 19 million rows.  This is the table on
which I build my 2-D rtree index.

I read the data from tab-delimited MySQL dumps into the basic tables,
then run a query that pulls the data I want to rtree-index out of its
table, cleans up one of the columns, and then inserts it into the rtree
index.  Finally I add a set of indices to several columns in the big
table.
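
Roughly, the rtree step looks like the sketch below; the table and
column names are placeholders for the real schema, error checking is
omitted, and SQLite must be built with SQLITE_ENABLE_RTREE:

    #include <sqlite3.h>

    static void build_rtree(sqlite3 *db)
    {
        /* A 2-D rtree: an id plus one min/max pair per dimension. */
        sqlite3_exec(db,
            "CREATE VIRTUAL TABLE snp_rtree USING rtree("
            "  id, minChrom, maxChrom, minStart, maxEnd);", 0, 0, 0);

        /* Fill it straight from the big table with one INSERT...SELECT
         * inside a transaction; the SELECT is where the column
         * clean-up happens. */
        sqlite3_exec(db,
            "BEGIN;"
            "INSERT INTO snp_rtree"
            "  SELECT rowid, chrom_num, chrom_num, chromStart, chromEnd"
            "  FROM snp;"
            "COMMIT;", 0, 0, 0);
    }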

I tried using .import to move the data into the tables, but gave up
after it ran well past the time it took to do the inserts.  That
didn't seem to help with the index/rtree creation time either.

I'm turning synchronous off and doing the inserts inside of a
transaction.

I was wondering if there was some way to populate the database inside
a ':memory:' database and then dump/copy/... the results into a file.
I've seen posts that suggest that I can select from the memory tables
and insert into the persistent ones, but that seems like it'd take
more work to get the indexes.

I'd be interested in any suggestions for making my loading go faster.

Thanks,

g.