Also, Linux has alternative file systems that might be better suited to this. We plan to try them. ReiserFS and XFS have good reputations. (Reiser himself, that's a different story :( )
Cheers,

Lance

-----Original Message-----
From: Mike Klaas [mailto:[EMAIL PROTECTED]
Sent: Monday, April 07, 2008 12:04 PM
To: solr-user@lucene.apache.org
Subject: Re: indexing slow, IO-bound?

On 5-Apr-08, at 7:09 AM, Britske wrote:
> Indexing of these documents takes a long time. Because of the size of
> the documents (because of the indexed fields) I am currently batching
> 50 documents at once which takes about 2 seconds. Without adding the
> 10000 indexed fields to the document, indexing flies at about 15 ms
> for these 50 documents. Indexing is done using SolrJ.
>
> This is on an Intel Core 2 6400 @ 2.13 GHz with 2 GB RAM.
>
> To speed this up I let 2 threads do the indexing in parallel. What
> happens is that solr just takes double the time (about 4 seconds) to
> complete these two jobs of 50 docs each in parallel. I figured because
> of the multi-core setup indexing should improve, which it doesn't.

Multiple processors really only help indexing speeds when there is heavy analysis.

> Does this perhaps indicate that the setup is IO-bound? What would be
> your best guess (given the fact that the schema has a big amount of
> indexed fields) to try next to improve indexing performance?

Use Lucene 2.3 with Solr 1.2, or simply try out Solr trunk. The indexing has been reworked to be considerably faster (it also makes better use of multiple processors by spawning a background merging thread).

-Mike
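
For anyone who wants to measure whether a second indexing thread actually buys anything, here is a minimal timing sketch. It is not SolrJ code: the class name ParallelIndexTimer, the method timeBatches, and the sleeping placeholder workload are all hypothetical, and you would swap the placeholder for whatever call sends one batch of 50 documents in your setup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Solr or SolrJ.
public class ParallelIndexTimer {

    /**
     * Runs indexBatch `batches` times on a pool of `threads` workers and
     * returns the wall-clock time in milliseconds.
     */
    public static long timeBatches(int threads, int batches, Runnable indexBatch) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long start = System.nanoTime();
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < batches; i++) {
                pending.add(pool.submit(indexBatch));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the batch and surface any failure
            }
            return (System.nanoTime() - start) / 1_000_000;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): replace with the real call
        // that indexes one batch of documents.
        Runnable fakeBatch = () -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        int batches = 8; // same total work for both runs
        System.out.println("1 thread:  " + timeBatches(1, batches, fakeBatch) + " ms");
        System.out.println("2 threads: " + timeBatches(2, batches, fakeBatch) + " ms");
    }
}
```

If the same number of batches takes about as long with two threads as with one, the client threads are probably waiting on something shared (disk, or a single-threaded stage in the indexer) rather than on CPU, which would be consistent with the numbers Britske reported.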