Configure Solr to give indexing as much RAM as you can afford, and to merge segments less often by raising mergeFactor. It's not clear (to me) from your explanation when you see 3000 docs/second and when only 100 docs/second.
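In solrconfig.xml those two knobs live under <indexDefaults>: ramBufferSizeMB (Solr 1.3+; earlier releases used maxBufferedDocs instead) and mergeFactor. A minimal sketch with placeholder values, not tuned recommendations:

  <indexDefaults>
    <!-- buffer more documents in RAM before flushing a segment;
         256 is a placeholder, size it against your JVM heap -->
    <ramBufferSizeMB>256</ramBufferSizeMB>
    <!-- higher values mean fewer merges during bulk loading -->
    <mergeFactor>25</mergeFactor>
  </indexDefaults>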
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Ian Connor <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Friday, August 1, 2008 3:36:13 PM
> Subject: fastest way to load documents
>
> I have a number of documents in files
>
> 1.xml
> 2.xml
> ...
> 17M.xml
>
> I have been using cat to join them all together:
>
> cat 1.xml 2.xml ... 1000.xml | grep -v '<\/add>' > /tmp/post.xml
>
> and posting them with curl:
>
> curl -d @/tmp/post.xml 'http://localhost:8983/solr/update' -H
> 'Content-Type: text/xml'
>
> Is there a faster way to load up these documents into a number of solr
> shards? I seem to be able to cover 3000/second just catting them
> together (2500 at a time is the sweet spot for me) - but this slows
> down to under 100/s once I try to do the post with curl.
>
> --
> Regards,
>
> Ian Connor
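One curl detail worth checking in the workflow above: -d @file strips carriage returns and newlines from the file before posting, while --data-binary sends it byte-for-byte, so --data-binary is the safer option for XML updates. Below is a sketch of a parallel posting loop; the /tmp/post-*.xml batch names, the parallelism of 4, and the single update URL are assumptions to adapt, not part of the original setup:

  # Post each pre-built batch file, up to 4 curls in parallel
  # (assumes the cat/grep step above wrote /tmp/post-1.xml, ...)
  ls /tmp/post-*.xml | xargs -P 4 -I {} \
      curl -s --data-binary '@{}' \
           -H 'Content-Type: text/xml' \
           'http://localhost:8983/solr/update'

  # Commit once at the end instead of after every batch
  curl --data-binary '<commit/>' -H 'Content-Type: text/xml' \
       'http://localhost:8983/solr/update'

Committing once per run rather than per batch also helps, since each commit flushes segments and opens a new searcher, which is relatively expensive.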