You can tune indexing performance by adjusting these parameters in
solrconfig.xml:

<mainIndex>
    <!-- lucene options specific to the main on-disk lucene index -->
    <useCompoundFile>false</useCompoundFile>
    <mergeFactor>10</mergeFactor>
    <maxBufferedDocs>1000</maxBufferedDocs>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>10000</maxFieldLength>
</mainIndex>
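
As a rough sketch only (the values below are illustrative assumptions, not
measured recommendations): a higher mergeFactor makes Lucene merge segments
less often while indexing, at the cost of more segment files and somewhat
slower searches until you optimize. A higher maxBufferedDocs keeps more
documents in RAM before a segment is flushed to disk. With documents as
large as yours and 2 GB of RAM, raise maxBufferedDocs carefully and watch
the heap. Note that maxFieldLength limits the number of terms indexed per
field, not the number of fields.

<mainIndex>
    <!-- illustrative values only (assumptions, not tested on your setup) -->
    <useCompoundFile>false</useCompoundFile>
    <!-- higher than the 10 shown above: fewer merges while indexing -->
    <mergeFactor>20</mergeFactor>
    <!-- more docs buffered in RAM per flush; watch the heap with large docs -->
    <maxBufferedDocs>2000</maxBufferedDocs>
    <maxMergeDocs>2147483647</maxMergeDocs>
    <maxFieldLength>10000</maxFieldLength>
</mainIndex>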


Jae

-----Original Message-----
From: Britske [mailto:[EMAIL PROTECTED]
Sent: Sat 4/5/2008 10:09 AM
To: solr-user@lucene.apache.org
Subject: indexing slow, IO-bound?
 

Hi, 

I have a schema with a lot of (about 10000) non-stored indexed fields, which
I use for sorting. (No, really, that is needed.) In addition, I have about 30
stored fields. 

Indexing these documents takes a long time. Because of the size of the
documents (due to the indexed fields) I am currently batching 50
documents at a time, which takes about 2 seconds. Without the 10000
indexed fields, indexing flies at about 15 ms for the same 50
documents. Indexing is done using SolrJ.

This is on an Intel Core 2 6400 @ 2.13 GHz with 2 GB RAM. 

To speed this up I let 2 threads do the indexing in parallel. What happens
is that Solr just takes double the time (about 4 seconds) to complete these
two jobs of 50 docs each in parallel. I figured that with a multi-core CPU
indexing throughput should improve, but it doesn't. 

Does this perhaps indicate that the setup is IO-bound? What would be your
best guess (given that the schema has a large number of indexed fields)
for what to try next to improve indexing performance? 

Geert-Jan
-- 
View this message in context: 
http://www.nabble.com/indexing-slow%2C-IO-bound--tp16513196p16513196.html
Sent from the Solr - User mailing list archive at Nabble.com.

