I know insert performance into an indexed table has been discussed here before, but I wanted to add to it and ask some questions.
I'm benchmarking loading insert statements, dumped from a 3.7GB file, into an empty database with sqlite.exe:

BEGIN TRANSACTION;
CREATE TABLE Articles(RID integer primary key, RefCnt integer, MessageID text, DomainID integer, Date integer, Lines integer, Flags integer, PartSize integer, FileSize integer, SetSize integer, CurPart integer, MaxPart integer, CurFile integer, MaxFile integer, Subject integer, Author integer, FileName integer, DlDir integer, PrevPart integer, NextPart integer, Score integer);
CREATE INDEX Articles_Index ON Articles (MessageID, DomainID);
INSERT INTO Articles VALUES(1,3,'dXOuc.4884$pX3.1156',1,817862976,2521,1328512,317646,10499202,0,2,33,0,0,1,1,1,0,109099,109100,0);
INSERT INTO Articles VALUES(2,3,'z3Puc.4938$pX3.1969',1,817863259,2524,1328512,318024,10523016,0,22,33,0,0,1,1,2,0,109676,109701,0);
-- etc --
INSERT INTO Articles VALUES(10602742,1,'8OWdnQHIA84fFSLdRVn-uA',11,818109133,3050,1327744,384300,384300,0,1,40,23,36,84580,3412,243311,0,0,10602743,0);
INSERT INTO Articles VALUES(10602743,1,'8OWdnQDIA84aFSLdRVn-uA',11,818109135,3051,1065600,384426,384426,0,2,40,23,36,84580,3412,243311,0,10602742,0,0);
COMMIT;

The operation starts off loading data at about 500K/sec, but drops to about 50K/sec over 30 minutes and stays around there (if it comes through, see the attached graph). At that point about 1.2GB of the 3.7GB has loaded, and the full 3.7GB takes about 16 hours to complete. Without the index, the whole 3.7GB loads in about 20 minutes.

CPU starts at 100% but soon comes down. I expect my hard drives to deliver about 50MB/sec sequentially, about 5MB/sec random access, with an 8-10 msec seek time. Memory allocation is stable, sqlite.exe is not paging, and the thread is either "running" or "waiting for executive". According to taskinfo2003, the only files with activity are the input sql file and the output sqlite DB file; the temp and journal files are quiet, with no movement.
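For anyone who wants to reproduce the shape of this comparison without the 3.7GB dump, here is a rough Python sketch of what I'm measuring: bulk inserts with the index in place versus creating the index after the load. The synthetic MessageIDs, the row count, and the in-memory database are all my stand-ins, not the real data.

```python
import random
import sqlite3
import string
import time

def load(create_index_first, n=20000):
    """Time a bulk insert, with the index built before or after the load."""
    # isolation_level=None gives autocommit, so BEGIN/COMMIT mirror the dump
    con = sqlite3.connect(":memory:", isolation_level=None)
    con.execute("CREATE TABLE Articles(RID integer primary key, "
                "MessageID text, DomainID integer)")
    if create_index_first:
        con.execute("CREATE INDEX Articles_Index ON Articles (MessageID, DomainID)")
    # Synthetic rows with random MessageIDs (stand-in for the real dump)
    rng = random.Random(42)
    rows = [(i, "".join(rng.choices(string.ascii_letters, k=20)), i % 16)
            for i in range(n)]
    t0 = time.perf_counter()
    con.execute("BEGIN")
    con.executemany("INSERT INTO Articles VALUES(?,?,?)", rows)
    con.execute("COMMIT")
    if not create_index_first:
        # Building the index after the load lets SQLite sort the keys in one pass
        con.execute("CREATE INDEX Articles_Index ON Articles (MessageID, DomainID)")
    elapsed = time.perf_counter() - t0
    count = con.execute("SELECT count(*) FROM Articles").fetchone()[0]
    con.close()
    return elapsed, count

print("index first: %.3fs" % load(True)[0])
print("index after: %.3fs" % load(False)[0])
```

At this toy scale everything fits in cache, so the gap is small; the pathology in my test only shows up once the index outgrows the page cache and inserts start seeking.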
The file position for the sql input file grows linearly toward its end. The file position for the sqlite DB file is another story: as the DB grows, you can see the localization deteriorate by watching it (Taskinfo can give 1/2-second samples). It starts out spending most of its time growing the file at the end. As time goes on, it starts jumping to positions back in the file, and after 30 minutes or so it is spending most of its time jumping around. I presume it is looking for where new records belong in the index ordering.

So, what is the limiting resource? The seek time? Can the index be configured to be kept more in memory? Can it be kept in a different file, and would this help (I suspect if and only if that file is on a different physical disk)?

TIA
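On the "more in memory" question, the only knob I know of is the page cache (PRAGMA cache_size, measured in pages). A minimal sketch of what I mean, using Python's sqlite3 module; the 200000-page figure is an arbitrary guess of mine, and I'm not certain this is the right lever:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # stand-in for the real DB file
# cache_size is in pages; the default is tiny relative to a multi-GB index,
# so a larger cache might keep more of the index's interior pages resident.
con.execute("PRAGMA cache_size = 200000")
print(con.execute("PRAGMA cache_size").fetchone()[0])  # prints 200000
con.close()
```

Note the pragma is per-connection, so it would have to be issued at the top of the dump before the inserts start.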