Charlie,

How's this:
* -Xmx2g
* ramBufferSizeMB 512
* mergeFactor 10 (default, but you could up it to 20, 30, if ulimit -n allows)
* ignore/delete maxBufferedDocs - not used if you set ramBufferSizeMB
* use StreamingUpdateSolrServer (with queue size/thread params matching your number 
of CPU cores), or send batches of say 1000 docs with another SolrServer impl using 
N threads (N = # of your CPU cores)
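
In solrconfig.xml terms, the first few settings above would look roughly like this 
(shown under indexDefaults as in Solr 1.4/3.x; section names may differ in your 
config, so treat this as a sketch):

```xml
<indexDefaults>
  <!-- flush the in-memory buffer to disk once it reaches 512 MB -->
  <ramBufferSizeMB>512</ramBufferSizeMB>
  <!-- 10 is the default; raising it means fewer merges but more open files -->
  <mergeFactor>10</mergeFactor>
  <!-- remove or comment out maxBufferedDocs; ramBufferSizeMB takes over -->
</indexDefaults>
```

(-Xmx2g goes on the JVM command line, not in solrconfig.xml.)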
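
The multi-threaded batching option can be sketched like this, using only the JDK: 
split the documents into batches of ~1000 and hand them to a fixed pool of N worker 
threads. BatchSink here is a hypothetical stand-in for the actual SolrJ call (e.g. 
SolrServer.add(Collection<SolrInputDocument>)), which is not shown:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BatchIndexer {
    static final int BATCH_SIZE = 1000;

    // Stand-in for the real indexing call against a SolrServer instance.
    interface BatchSink { void index(List<String> batch) throws Exception; }

    // Splits docs into BATCH_SIZE chunks and indexes them from `threads`
    // workers; returns the number of batches successfully submitted.
    static int indexAll(List<String> docs, int threads, BatchSink sink)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger batches = new AtomicInteger();
        for (int i = 0; i < docs.size(); i += BATCH_SIZE) {
            final List<String> batch =
                docs.subList(i, Math.min(i + BATCH_SIZE, docs.size()));
            pool.submit(() -> {
                try {
                    sink.index(batch);
                    batches.incrementAndGet();
                } catch (Exception e) {
                    e.printStackTrace();  // real code: retry or log properly
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return batches.get();
    }
}
```

With 100,000 files of 1,000 docs each you'd feed this file-by-file rather than 
holding everything in one list, but the thread/batch structure is the same.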

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Charles Wardell <charles.ward...@bcsolution.com>
> To: solr-user@lucene.apache.org
> Sent: Tue, April 26, 2011 2:32:29 PM
> Subject: Question on Batch process
> 
> I am sure that this question has been asked a few times, but I can't seem to 
>find the sweet spot for indexing.
> 
> I have about 100,000 files, each containing 1,000 xml documents, ready to be 
>posted to Solr. My desire is to have it index as quickly as possible; once 
>completed, the daily stream of ADDs will be small in comparison.
> 
> The individual documents are small. Essentially web postings from the net: 
>Title, postPostContent, date.
>
> 
> What would be the ideal configuration for ramBufferSizeMB, mergeFactor, 
>maxBufferedDocs, etc.?
> 
> My machine is a quad core hyper-threaded, so it shows up as 8 CPUs in top.
> I have 16GB of available ram.
> 
> 
> Thanks in advance.
> Charlie
