Hi
We are using SQLite for a fairly big (but simple) calculation, and
have some problems when creating an index on a database with
600.000.000 rows. Specifically it has not ended even after 5 days of
running. We have done it successfully on 25% of the full data base,
and are now wondering what we can do.
Details:
We have a very simple database, just two columns VAR1 and VAR2 which
are both integers. Every combination of (VAR1, VAR2) is unique. We
want to add an index by using the following command
CREATE UNIQUE INDEX on DATABASE (VAR1, VAR2);
After creating the index we have to do appr. 100.000.000 lookups on
(VAR1, VAR2). We have benchmarked the results on a small(er) database
with 145.000.000 rows where the indexing takes around 1 hour and the
speed of the lookup is _very_ acceptable. However when running the
indexing command on the full database it does not terminate even
after 5-6 days (unfortunately our server got rebooted while it was
running). We are wondering on the rather drastic increase in run time
when going from 145.000.000 rows to 600.000.000 rows.
?: does the unique keyword matter, ie. does it make lookups faster
and does it slow down the index creation.
?: does the order of the data matter a lot - right now it is
partially sorted in that all combinations involving a single value of
VAR2 (not VAR1) are next to each other.
?: is it possible to enable some kind of logging/debugging mode so we
can monitor progression and see how it is doing.
We run the program on an 2600mhz Opteron with 16GB of ram and so far
it seems we are not getting killed by I/O or RAM: using top
frequently shows around 99% cpu usage and less than 40% memory usage.
We are using the following program (including all pragmas):
sqlite3 blasthits.db "pragma default_cache_size = 12000000;"
sqlite3 blasthits.db "pragma synchronous = off;
pragma temp_store = 2;
create unique index probeseqIDX on blasthits (probeidx,prokMSA);"
(here blashits.db is the database name and probeidx, prokMSA are
VAR1, VAR2).
?: Is there anything we can do to speed it up further?
Thanks a lot for any help, even if it is "it does not seem to be
possible to improve the speed of this operation".
Jim and Kasper
-----------------------------------------------------------------------------
To unsubscribe, send email to [EMAIL PROTECTED]
-----------------------------------------------------------------------------