On Sep 14, 2007, at 8:19 AM, Thompson,Roger wrote:
I am embarking on re-engineering an application using Solr/Lucene (if
you'd like to see the current manifestation, go to:
fictionfinder.oclc.org). The database for this application consists of
approximately 1.4 million "work" records of varying size, plus another
database of 1.9 million bibliographic records. I fear that loading all
of this through HTTP will take several days, perhaps a week. Do any of
you have a way to do a large batch load of the DB?

It won't take that long. Send multiple documents per POST, and commit only after each sizable batch rather than after every document. I ingested 3.8M binary MARC records in a pretty crude way in less than a day.
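
As a rough sketch of what batched posting looks like (this assumes a default Solr install at localhost:8983 and uses made-up field names; it is not the exact script I used):

    import urllib.request
    from xml.sax.saxutils import escape

    SOLR_UPDATE = "http://localhost:8983/solr/update"  # assumed default Solr URL

    def post_xml(xml):
        # POST one XML update message to Solr
        req = urllib.request.Request(
            SOLR_UPDATE, data=xml.encode("utf-8"),
            headers={"Content-Type": "text/xml; charset=utf-8"})
        urllib.request.urlopen(req).read()

    def index_in_batches(records, batch_size=1000):
        # Put many <doc> elements in each <add>, and commit once per batch
        # instead of once per document.
        batch = []
        for rec in records:
            fields = "".join(
                '<field name="%s">%s</field>' % (name, escape(str(value)))
                for name, value in rec.items())
            batch.append("<doc>%s</doc>" % fields)
            if len(batch) >= batch_size:
                post_xml("<add>%s</add>" % "".join(batch))
                post_xml("<commit/>")
                batch = []
        if batch:
            post_xml("<add>%s</add>" % "".join(batch))
        post_xml("<commit/>")

    # hypothetical usage with made-up fields:
    # index_in_batches({"id": str(i), "title": "work %d" % i} for i in range(1400000))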

But the fastest way to ingest data into Solr out of the box, I think, is the CSV import capability. I've indexed 1.8M bibliographic-sized records in 18 minutes with the CSV uploader pointed at a local file.
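
Something along these lines (the /update/csv handler and its stream.file parameter are Solr's, but the host and file path here are assumptions, and reading a local file via stream.file requires remote streaming to be enabled in solrconfig.xml):

    import urllib.parse
    import urllib.request

    # Ask Solr to read the CSV straight from a file on its own machine
    # (stream.file), so the data never has to travel over HTTP.
    params = urllib.parse.urlencode({
        "stream.file": "/data/bib_records.csv",  # hypothetical path on the Solr host
        "stream.contentType": "text/plain;charset=utf-8",
        "commit": "true",
    })
    url = "http://localhost:8983/solr/update/csv?" + params  # assumed default Solr URL
    print(urllib.request.urlopen(url).read())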

        Erik
