On Sep 14, 2007, at 8:19 AM, Thompson,Roger wrote:
I am embarking on re-engineering an application using Solr/Lucene (if you'd like to see the current manifestation, go to fictionfinder.oclc.org). The database for this application consists of approximately 1.4 million records of varying size for the "work" record, and another database of 1.9 million bibliographic records. I fear that loading this through HTTP will take several days, perhaps a week. Do any of you have a way to do a large batch load of the DB?
It won't take that long. Send multiple documents per POST and commit after every sizable batch. I ingested 3.8M binary MARC records in a pretty crude way in less than a day.
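Something along these lines works (an untested sketch, not what I actually ran; it assumes a local Solr at http://localhost:8983/solr and whatever field names your schema defines):

import urllib.request

SOLR_UPDATE = "http://localhost:8983/solr/update"
BATCH_SIZE = 1000  # documents per POST; tune to taste

def post_xml(xml):
    req = urllib.request.Request(
        SOLR_UPDATE,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    urllib.request.urlopen(req).read()

def escape(s):
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

def index(records):
    # records: an iterable of dicts, e.g. {"id": "1", "title": "Moby Dick"}
    batch = []
    for rec in records:
        fields = "".join(
            '<field name="%s">%s</field>' % (name, escape(str(value)))
            for name, value in rec.items()
        )
        batch.append("<doc>%s</doc>" % fields)
        if len(batch) >= BATCH_SIZE:
            post_xml("<add>%s</add>" % "".join(batch))
            post_xml("<commit/>")  # commit after each big batch
            batch = []
    if batch:
        post_xml("<add>%s</add>" % "".join(batch))
    post_xml("<commit/>")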
But the fastest way to ingest data into Solr out of the box, I think, is to use the CSV import capabilities. I've indexed 1.8M bibliographic-sized records in 18 minutes with the CSV uploader, pointing it to a local file.
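For example, something like the following hits the CSV handler and points it at a file on the Solr server itself (a sketch only; it assumes /update/csv is mapped in solrconfig.xml, that remote streaming is enabled so stream.file works, and the /data/bib_records.csv path is hypothetical):

import urllib.parse
import urllib.request

params = urllib.parse.urlencode({
    "stream.file": "/data/bib_records.csv",      # file readable by the Solr process
    "stream.contentType": "text/plain;charset=utf-8",
    "commit": "true",
})
url = "http://localhost:8983/solr/update/csv?" + params
print(urllib.request.urlopen(url).read())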
Erik