Re: [sqlite] How do I speed up CREATE INDEX ?
Radzi, are the ids of the Transaction table ordered when inserted? I have discovered that it is very bad for performance of huge tables, if the rows are inserted with random ids. If you use an integer id (primary key )for such a table, SQLite uses the ROWID column to store the integer primary key. SQLite will put the records physically in the order you insert them but logically in ROWID order. Suppose you insert the following data: id data 9 -- disk 1 6 -- disk 2 8 -- disk 3 1 -- disk 3 5 -- disk 5 2 -- disk 6 7 -- disk 7 4 -- disk 8 3 -- disk 9 The recorders are on disk in order 'disk 1' .. 'disk 9'. But SQLite accesses the in id order. If the table is huge, then the head of your hard disk jumps around like crazy. When you create an index, SQLite uses the id order to access your entries. This takes for ever. If you can order the data on id before you insert should dramatically speed up the indexing. If this is not possible, don't make the id column primary key, but create an index for id instead. I wonder how this would change the performance of your application Michael Thanks for the suggestion. I'm a bit lost now. I've tried to load 80million rows now. It took 40 minutes to load into non-index tables; but creating index now take almost forever. It's already 12 hrs, not yet complete. regards, Radzi. - Original Message - From: <[EMAIL PROTECTED]> To: <sqlite-users@sqlite.org> Sent: Sunday, December 03, 2006 8:21 PM Subject: Re: [sqlite] How do I speed up CREATE INDEX ? "Mohd Radzi Ibrahim" <[EMAIL PROTECTED]> wrote: Hi, I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 million rows (with no index). But then when I run CREATE INDEX it took me 40 mins to do that. What could I do to speed up the indexing process ? The reason index creation slows down when creating large indices is a problem with locality of reference in your disk cache. I've learned a lot about dealing with locality while working on full-text search, and I think I can probably implement a CREATE INDEX that runs much faster for a large table. There are some plans in the works that might permit me the time to do this in the spring. But in the meantime, the only thing I can suggest is to add more RAM to your machine so that you disk cache is larger. Or get a faster disk drive. -- D. Richard Hipp <[EMAIL PROTECTED]> - To unsubscribe, send email to [EMAIL PROTECTED] - - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] How do I speed up CREATE INDEX ?
Thanks for the suggestion. I'm a bit lost now. I've tried to load 80million rows now. It took 40 minutes to load into non-index tables; but creating index now take almost forever. It's already 12 hrs, not yet complete. regards, Radzi. - Original Message - From: <[EMAIL PROTECTED]> To: <sqlite-users@sqlite.org> Sent: Sunday, December 03, 2006 8:21 PM Subject: Re: [sqlite] How do I speed up CREATE INDEX ? "Mohd Radzi Ibrahim" <[EMAIL PROTECTED]> wrote: Hi, I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 million rows (with no index). But then when I run CREATE INDEX it took me 40 mins to do that. What could I do to speed up the indexing process ? The reason index creation slows down when creating large indices is a problem with locality of reference in your disk cache. I've learned a lot about dealing with locality while working on full-text search, and I think I can probably implement a CREATE INDEX that runs much faster for a large table. There are some plans in the works that might permit me the time to do this in the spring. But in the meantime, the only thing I can suggest is to add more RAM to your machine so that you disk cache is larger. Or get a faster disk drive. -- D. Richard Hipp <[EMAIL PROTECTED]> - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] How do I speed up CREATE INDEX ?
"Mohd Radzi Ibrahim" <[EMAIL PROTECTED]> wrote: > Hi, > I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 million > rows (with no index). But then when I run CREATE INDEX it took me 40 mins to > do that. What could I do to speed up the indexing process ? > The reason index creation slows down when creating large indices is a problem with locality of reference in your disk cache. I've learned a lot about dealing with locality while working on full-text search, and I think I can probably implement a CREATE INDEX that runs much faster for a large table. There are some plans in the works that might permit me the time to do this in the spring. But in the meantime, the only thing I can suggest is to add more RAM to your machine so that you disk cache is larger. Or get a faster disk drive. -- D. Richard Hipp <[EMAIL PROTECTED]> - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] How do I speed up CREATE INDEX ?
- Original Message - From: "P Kishor" <[EMAIL PROTECTED]> To: <sqlite-users@sqlite.org> Sent: Sunday, December 03, 2006 10:42 AM Subject: Re: [sqlite] How do I speed up CREATE INDEX ? On 12/2/06, Mohd Radzi Ibrahim <[EMAIL PROTECTED]> wrote: - Original Message - Because I'm going to do it 24 times. I have 24 monthly files each of which about the same size. A faster computer is not an answer at this time. unless you are doing something really wrong (creating the wrong indexes, for example), there is nothing wrong about CREATE INDEX taking a relatively long time. That is the price you pay... you pay it once, when creating the index. After that, your queries are quick because they use the index. There really is no way around that. I have 5 simple tables: 1. Transaction (id, NoOfPack, NoOfDX, NoOfOps, NoOfCure) 2. Cure(transID, cureID) 3. Diag(transID, diagID, Sequence) 4. Ops(transID, opsID) 5. Pack(transID, packID, qty, amt) Tansaction has primary key on ID, and I've created index on each detail tables (Cure, Diag, Ops and Pack - one-to-many relationship with Trans) after loading the data. The indexes are Cure(CureID, TransID), Diag(diagID, transID), Ops(opsID, transID), Pack(packID, transID). Same goes with the other tables. There are 6million rows in Pack, 2 million in Trans. I was quite impressed with the loading part. Even faster than MSSQL Bulk Insert. But the indexing part is much slower than MSSQL. I got a better result (12 min) when using PRAGMA CACHE_SIZE=2. That probably match the time for MSSQL. Increasing to 4 does not show any improvement. regards, Radzi. - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] How do I speed up CREATE INDEX ?
On 12/2/06, Mohd Radzi Ibrahim <[EMAIL PROTECTED]> wrote: - Original Message - From: "Jay Sprenkle" <[EMAIL PROTECTED]> To: <sqlite-users@sqlite.org> Sent: Sunday, December 03, 2006 9:12 AM Subject: Re: [sqlite] How do I speed up CREATE INDEX ? > On 12/2/06, Mohd Radzi Ibrahim <[EMAIL PROTECTED]> wrote: >> Hi, >> I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 >> million rows (with no index). But then when I run CREATE INDEX it took me >> 40 mins to do that. What could I do to speed up the indexing process ? > > Details of the schema would help. > Assuming you didn't create more index than you need, a faster computer > is the answer. > It only has to be done once so why is that a problem? > Because I'm going to do it 24 times. I have 24 monthly files each of which about the same size. A faster computer is not an answer at this time. unless you are doing something really wrong (creating the wrong indexes, for example), there is nothing wrong about CREATE INDEX taking a relatively long time. That is the price you pay... you pay it once, when creating the index. After that, your queries are quick because they use the index. There really is no way around that. By the way, related question -- I typically drop the index before loading a huge amount of data. Once the data are loaded, I create the index. Is that act of creating an index conceptually the same as analyzing that table? Does analyze only end up rebuilding the indexes? Or does it do some other magic? -- Puneet Kishor http://punkish.eidesis.org/ Nelson Inst. for Env. Studies, UW-Madison http://www.nelson.wisc.edu/ Open Source Geospatial Foundation https://edu.osgeo.org/ - collaborate, communicate, compete = - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] How do I speed up CREATE INDEX ?
- Original Message - From: "Jay Sprenkle" <[EMAIL PROTECTED]> To: <sqlite-users@sqlite.org> Sent: Sunday, December 03, 2006 9:12 AM Subject: Re: [sqlite] How do I speed up CREATE INDEX ? On 12/2/06, Mohd Radzi Ibrahim <[EMAIL PROTECTED]> wrote: Hi, I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 million rows (with no index). But then when I run CREATE INDEX it took me 40 mins to do that. What could I do to speed up the indexing process ? Details of the schema would help. Assuming you didn't create more index than you need, a faster computer is the answer. It only has to be done once so why is that a problem? Because I'm going to do it 24 times. I have 24 monthly files each of which about the same size. A faster computer is not an answer at this time. - To unsubscribe, send email to [EMAIL PROTECTED] -
Re: [sqlite] How do I speed up CREATE INDEX ?
On 12/2/06, Mohd Radzi Ibrahim <[EMAIL PROTECTED]> wrote: Hi, I was loading a file to sqlite (3.3.8), and it took 4 mins to load 6 million rows (with no index). But then when I run CREATE INDEX it took me 40 mins to do that. What could I do to speed up the indexing process ? Details of the schema would help. Assuming you didn't create more index than you need, a faster computer is the answer. It only has to be done once so why is that a problem? -- SqliteImporter and SqliteReplicator: Command line utilities for Sqlite http://www.reddawn.net/~jsprenkl/Sqlite Cthulhu Bucks! http://www.cthulhubucks.com - To unsubscribe, send email to [EMAIL PROTECTED] -