On Sep 14, 2007, at 8:19 AM, Thompson,Roger wrote:
I am embarking on re-engineering an application using Solr/Lucene (if you'd like to see the current manifestation, go to fictionfinder.oclc.org). The database for this application consists of approximately 1.4 million records of varying size for the "work" record, and another database of 1.9 million bibliographic records. I fear that loading this through HTTP will take several days, perhaps a week. Do any of you have a way to do a large batch load of the DB?
It won't take that long. Send multiple documents per POST and commit after every sizable batch. I ingested 3.8M binary MARC records in a pretty crude way in less than a day.
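Something along these lines works (an untested sketch, not what I actually ran; it assumes a local Solr at http://localhost:8983/solr and whatever field names your schema defines):

import urllib.request

SOLR_UPDATE = "http://localhost:8983/solr/update"
BATCH_SIZE = 1000  # documents per POST; tune to taste

def post_xml(xml):
    req = urllib.request.Request(
        SOLR_UPDATE,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    urllib.request.urlopen(req).read()

def escape(s):
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

def index(records):
    # records: an iterable of dicts, e.g. {"id": "1", "title": "Moby Dick"}
    batch = []
    for rec in records:
        fields = "".join(
            '<field name="%s">%s</field>' % (name, escape(str(value)))
            for name, value in rec.items()
        )
        batch.append("<doc>%s</doc>" % fields)
        if len(batch) >= BATCH_SIZE:
            post_xml("<add>%s</add>" % "".join(batch))
            post_xml("<commit/>")  # commit after each big batch
            batch = []
    if batch:
        post_xml("<add>%s</add>" % "".join(batch))
    post_xml("<commit/>")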
But the fastest way to ingest data into Solr out of the box, I think, is to use the CSV import capabilities. I've indexed 1.8M bibliographic-sized records in 18 minutes with the CSV uploader, pointing it to a local file.
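For example, something like the following hits the CSV handler and points it at a file on the Solr server itself (a sketch only; it assumes /update/csv is mapped in solrconfig.xml, that remote streaming is enabled so stream.file works, and the /data/bib_records.csv path is hypothetical):

import urllib.parse
import urllib.request

params = urllib.parse.urlencode({
    "stream.file": "/data/bib_records.csv",      # file readable by the Solr process
    "stream.contentType": "text/plain;charset=utf-8",
    "commit": "true",
})
url = "http://localhost:8983/solr/update/csv?" + params
print(urllib.request.urlopen(url).read())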
Erik