On Thu, 2008-03-06 at 18:40 +0100, [EMAIL PROTECTED] wrote:
> > > With a commit after every add: 30 min.
> > > With a commit after 100 add: 23 min.
> > > Only one commit: 20 min.
[...]
> I think it is a real world scenario because one always has to read the docs
> from somewhere and often has to store the index state somewhere else.

Very true, but the time it takes to create the documents varies greatly
between systems. I tried repeating your test by creating a simple 14 MB
index with 10,000 documents on my desktop machine. Each document was made
up of

- one non-tokenized unique stored indexed field
- one non-tokenized indexed stored field with one of 9 terms
- one tokenized field with 930 random characters, including space

With a commit after every add: 4 min, 46 sec.
With a commit after every 100 add: 12 sec.
Only one commit: 8 sec.

Guesstimating the amortized time spent on adding each document on such a
small corpus, by blatantly ignoring the overhead of creating the documents,
gives us the following:

With a commit after every add: (286 sec / 10,000 docs) 28.6 ms.
With a commit after every 100 add: (12 sec / 10,000 docs) 1.2 ms.
Only one commit: (8 sec / 10,000 docs) 0.8 ms.
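
For anyone who wants to check the arithmetic, here is a small Python sketch that recomputes the per-document figures from the totals I measured. The names (totals_sec, amortized_ms) are made up for illustration and are not part of any indexing library; the only inputs are the three timings and the document count given above.

```python
# Wall-clock totals from the test above, in seconds, for 10,000 documents.
DOCS = 10_000
totals_sec = {
    "commit after every add": 4 * 60 + 46,  # 4 min, 46 sec = 286 sec
    "commit after every 100 add": 12,
    "only one commit": 8,
}


def amortized_ms(total_sec, docs=DOCS):
    """Average time per document in milliseconds.

    Deliberately ignores the cost of creating the documents themselves,
    as in the estimate above.
    """
    return total_sec * 1000 / docs


per_doc_ms = {name: amortized_ms(sec) for name, sec in totals_sec.items()}

for name, ms in per_doc_ms.items():
    print(f"With a {name}: {ms:.1f} ms per document")
```

On these numbers, committing after every add costs roughly 36 times as much per document as a single commit at the end (286 sec versus 8 sec), while committing every 100 adds costs only about 1.5 times as much.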