Re: [sqlite] How do I speed up CREATE INDEX ?

2006-12-03 Thread Michael Sizaki

Radzi,

are the ids of the Transaction table ordered when inserted?
I have discovered that it is very bad for performance of huge
tables, if the rows are inserted with random ids. If you use
an integer id (primary key )for such a table, SQLite uses the
ROWID column to store the integer primary key. SQLite will
put the records physically in the order you insert them but
logically in ROWID order.

Suppose you insert the following data:

id data
9  -- disk 1
6  -- disk 2
8  -- disk 3
1  -- disk 3
5  -- disk 5
2  -- disk 6
7  -- disk 7
4  -- disk 8
3  -- disk 9

The recorders are on disk in order 'disk 1' .. 'disk 9'.
But SQLite accesses the in id order. If the table is huge,
then the head of your hard disk jumps around like crazy.

When you create an index, SQLite uses the id order to access
your entries. This takes for ever.

If you can order the data on id before you insert should dramatically
speed up the indexing. If this is not possible, don't make the id column
primary key, but create an index for id instead.

I wonder how this would change the performance of your application


Michael

Thanks for the suggestion. I'm a bit lost now. I've tried to load 
80million rows now. It took 40 minutes to load into non-index tables; 
but creating index now take almost forever. It's already 12 hrs, not yet 
complete.


regards,
Radzi.

- Original Message - From: <[EMAIL PROTECTED]>
To: <sqlite-users@sqlite.org>
Sent: Sunday, December 03, 2006 8:21 PM
Subject: Re: [sqlite] How do I speed up CREATE INDEX ?



"Mohd Radzi Ibrahim" <[EMAIL PROTECTED]> wrote:

Hi,
I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 
million rows (with no index). But then when I run CREATE INDEX it 
took me 40 mins to do that. What could I do to speed up the indexing 
process ?




The reason index creation slows down when creating large
indices is a problem with locality of reference in your disk
cache.  I've learned a lot about dealing with locality
while working on full-text search, and I think I can
probably implement a CREATE INDEX that runs much faster
for a large table.  There are some plans in the works
that might permit me the time to do this in the spring.
But in the meantime, the only thing I can suggest is to
add more RAM to your machine so that you disk cache is
larger.  Or get a faster disk drive.
--
D. Richard Hipp  <[EMAIL PROTECTED]>





- 


To unsubscribe, send email to [EMAIL PROTECTED]
- 






-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] How do I speed up CREATE INDEX ?

2006-12-03 Thread Mohd Radzi Ibrahim
Thanks for the suggestion. I'm a bit lost now. I've tried to load 80million 
rows now. It took 40 minutes to load into non-index tables; but creating 
index now take almost forever. It's already 12 hrs, not yet complete.


regards,
Radzi.

- Original Message - 
From: <[EMAIL PROTECTED]>

To: <sqlite-users@sqlite.org>
Sent: Sunday, December 03, 2006 8:21 PM
Subject: Re: [sqlite] How do I speed up CREATE INDEX ?



"Mohd Radzi Ibrahim" <[EMAIL PROTECTED]> wrote:

Hi,
I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 
million rows (with no index). But then when I run CREATE INDEX it took me 
40 mins to do that. What could I do to speed up the indexing process ?




The reason index creation slows down when creating large
indices is a problem with locality of reference in your disk
cache.  I've learned a lot about dealing with locality
while working on full-text search, and I think I can
probably implement a CREATE INDEX that runs much faster
for a large table.  There are some plans in the works
that might permit me the time to do this in the spring.
But in the meantime, the only thing I can suggest is to
add more RAM to your machine so that you disk cache is
larger.  Or get a faster disk drive.
--
D. Richard Hipp  <[EMAIL PROTECTED]>





-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] How do I speed up CREATE INDEX ?

2006-12-03 Thread drh
"Mohd Radzi Ibrahim" <[EMAIL PROTECTED]> wrote:
> Hi,
> I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 million 
> rows (with no index). But then when I run CREATE INDEX it took me 40 mins to 
> do that. What could I do to speed up the indexing process ?
> 

The reason index creation slows down when creating large
indices is a problem with locality of reference in your disk
cache.  I've learned a lot about dealing with locality
while working on full-text search, and I think I can
probably implement a CREATE INDEX that runs much faster
for a large table.  There are some plans in the works 
that might permit me the time to do this in the spring.  
But in the meantime, the only thing I can suggest is to 
add more RAM to your machine so that you disk cache is 
larger.  Or get a faster disk drive.
--
D. Richard Hipp  <[EMAIL PROTECTED]>


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] How do I speed up CREATE INDEX ?

2006-12-02 Thread Mohd Radzi Ibrahim


- Original Message - 
From: "P Kishor" <[EMAIL PROTECTED]>

To: <sqlite-users@sqlite.org>
Sent: Sunday, December 03, 2006 10:42 AM
Subject: Re: [sqlite] How do I speed up CREATE INDEX ?



On 12/2/06, Mohd Radzi Ibrahim <[EMAIL PROTECTED]> wrote:


- Original Message -



Because I'm going to do it 24 times. I have 24 monthly files each of 
which

about the same size. A faster computer is not an answer at this time.


unless you are doing something really wrong (creating the wrong
indexes, for example), there is nothing wrong about CREATE INDEX
taking a relatively long time. That is the price you pay... you pay it
once, when creating the index. After that, your queries are quick
because they use the index. There really is no way around that.



I have 5 simple tables:
1. Transaction (id, NoOfPack, NoOfDX, NoOfOps, NoOfCure)
2. Cure(transID, cureID)
3. Diag(transID, diagID, Sequence)
4. Ops(transID, opsID)
5. Pack(transID, packID, qty, amt)

Tansaction has primary key on ID, and I've created index on each detail 
tables (Cure, Diag, Ops and Pack - one-to-many relationship with Trans) 
after loading the data. The indexes are Cure(CureID, TransID), Diag(diagID, 
transID), Ops(opsID, transID), Pack(packID, transID). Same goes with the 
other tables.


There are 6million rows in Pack, 2 million in Trans.

I was quite impressed with the loading part. Even faster than MSSQL Bulk 
Insert. But the indexing part is much slower than MSSQL. I got a better 
result (12 min) when using PRAGMA CACHE_SIZE=2. That probably match the 
time for MSSQL.  Increasing to 4 does not show any improvement.



regards,
Radzi. 




-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] How do I speed up CREATE INDEX ?

2006-12-02 Thread P Kishor

On 12/2/06, Mohd Radzi Ibrahim <[EMAIL PROTECTED]> wrote:


- Original Message -
From: "Jay Sprenkle" <[EMAIL PROTECTED]>
To: <sqlite-users@sqlite.org>
Sent: Sunday, December 03, 2006 9:12 AM
Subject: Re: [sqlite] How do I speed up CREATE INDEX ?

> On 12/2/06, Mohd Radzi Ibrahim <[EMAIL PROTECTED]> wrote:
>> Hi,
>> I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6
>> million rows (with no index). But then when I run CREATE INDEX it took me
>> 40 mins to do that. What could I do to speed up the indexing process ?
>
> Details of the schema would help.
> Assuming you didn't create more index than you need, a faster computer
> is the answer.
> It only has to be done once so why is that a problem?
>
Because I'm going to do it 24 times. I have 24 monthly files each of which
about the same size. A faster computer is not an answer at this time.


unless you are doing something really wrong (creating the wrong
indexes, for example), there is nothing wrong about CREATE INDEX
taking a relatively long time. That is the price you pay... you pay it
once, when creating the index. After that, your queries are quick
because they use the index. There really is no way around that.

By the way, related question -- I typically drop the index before
loading a huge amount of data. Once the data are loaded, I create the
index. Is that act of creating an index conceptually the same as
analyzing that table? Does analyze only end up rebuilding the indexes?
Or does it do some other magic?


--
Puneet Kishor http://punkish.eidesis.org/
Nelson Inst. for Env. Studies, UW-Madison http://www.nelson.wisc.edu/
Open Source Geospatial Foundation https://edu.osgeo.org/
-
collaborate, communicate, compete
=

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] How do I speed up CREATE INDEX ?

2006-12-02 Thread Mohd Radzi Ibrahim


- Original Message - 
From: "Jay Sprenkle" <[EMAIL PROTECTED]>

To: <sqlite-users@sqlite.org>
Sent: Sunday, December 03, 2006 9:12 AM
Subject: Re: [sqlite] How do I speed up CREATE INDEX ?



On 12/2/06, Mohd Radzi Ibrahim <[EMAIL PROTECTED]> wrote:

Hi,
I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 
million rows (with no index). But then when I run CREATE INDEX it took me 
40 mins to do that. What could I do to speed up the indexing process ?


Details of the schema would help.
Assuming you didn't create more index than you need, a faster computer
is the answer.
It only has to be done once so why is that a problem?



Because I'm going to do it 24 times. I have 24 monthly files each of which 
about the same size. A faster computer is not an answer at this time.





-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] How do I speed up CREATE INDEX ?

2006-12-02 Thread Jay Sprenkle

On 12/2/06, Mohd Radzi Ibrahim <[EMAIL PROTECTED]> wrote:

Hi,
I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 million 
rows (with no index). But then when I run CREATE INDEX it took me 40 mins to do 
that. What could I do to speed up the indexing process ?


Details of the schema would help.
Assuming you didn't create more index than you need, a faster computer
is the answer.
It only has to be done once so why is that a problem?

--
SqliteImporter and SqliteReplicator: Command line utilities for Sqlite
http://www.reddawn.net/~jsprenkl/Sqlite

Cthulhu Bucks!
http://www.cthulhubucks.com

-
To unsubscribe, send email to [EMAIL PROTECTED]
-