From: [EMAIL PROTECTED]
Sent: 15 December 2004 18:43
To: Lucene Users List
Subject: RE: Indexing a large number of DB records
Note that this approach includes some extra steps you don't need.
You don't need a temp index: add everything to a single index using a
single IndexWriter instance, and there is no need to call addIndexes().
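Roughly this shape (a sketch only, assuming the Lucene.Net/dotLucene
1.3-era API - the namespaces, the Field factory methods, and the
SingleWriterIndexer/records names here are illustrative, not from this
thread):

    using System.Collections;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;

    public class SingleWriterIndexer
    {
        // One IndexWriter for the entire run: everything goes into one
        // index, so there is no temp index and no AddIndexes() at the end.
        public static void IndexAll(IDictionary records, string indexPath)
        {
            IndexWriter writer =
                new IndexWriter(indexPath, new StandardAnalyzer(), true);
            try
            {
                foreach (DictionaryEntry rec in records)
                {
                    Document doc = new Document();
                    doc.Add(Field.Keyword("id", (string) rec.Key));      // stored, not analyzed
                    doc.Add(Field.UnStored("body", (string) rec.Value)); // analyzed, not stored
                    writer.AddDocument(doc);
                }
                writer.Optimize(); // once, at the very end
            }
            finally
            {
                writer.Close();
            }
        }
    }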
Hello Homam,
The batches I was referring to were batches of DB rows.
Instead of one big SELECT * FROM table, page through it: SELECT * FROM
table ... LIMIT Y OFFSET X (the exact paging syntax depends on your
database).
Don't close the IndexWriter between batches - keep using the single
instance.
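For example (a sketch only: the docs table, its id/body columns, and the
connection string are invented, and I've used key-range paging because the
LIMIT/OFFSET spelling differs between databases - SQL Server of this era
has no OFFSET at all):

    using System.Data;
    using System.Data.SqlClient;
    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Documents;
    using Lucene.Net.Index;

    public class BatchIndexer
    {
        public static void Run(string connString, string indexPath)
        {
            // The writer stays open across all batches;
            // only the DB reads are chunked.
            IndexWriter writer =
                new IndexWriter(indexPath, new StandardAnalyzer(), true);
            SqlConnection conn = new SqlConnection(connString);
            conn.Open();
            try
            {
                int lastId = 0;
                bool more = true;
                while (more)
                {
                    // Key-range paging: same idea as LIMIT/OFFSET, but it
                    // stays fast deep into a big table because it seeks on
                    // the primary key instead of skipping rows.
                    SqlCommand cmd = new SqlCommand(
                        "SELECT TOP 1000 id, body FROM docs " +
                        "WHERE id > @lastId ORDER BY id", conn);
                    cmd.Parameters.Add("@lastId", SqlDbType.Int).Value = lastId;

                    more = false;
                    SqlDataReader reader = cmd.ExecuteReader();
                    try
                    {
                        while (reader.Read())
                        {
                            more = true;
                            lastId = reader.GetInt32(0);

                            Document doc = new Document();
                            doc.Add(Field.Keyword("id", lastId.ToString()));
                            doc.Add(Field.UnStored("body", reader.GetString(1)));
                            writer.AddDocument(doc); // same writer every batch
                        }
                    }
                    finally
                    {
                        reader.Close();
                    }
                }
                writer.Optimize();
            }
            finally
            {
                conn.Close();
                writer.Close();
            }
        }
    }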
There is no MakeStable()-like method in Lucene, but you can control the
number of in-memory documents the IndexWriter buffers before it flushes a
segment to disk.
Ensure that the documents go into the index sequentially and your problem
is solved; memory usage on mine (dotlucene 1.3) is low.
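In code that knob looks roughly like this (hedged: the exact member name
and casing depend on your port and version, as the comment notes):

    using Lucene.Net.Analysis.Standard;
    using Lucene.Net.Index;

    public class TunedWriter
    {
        public static IndexWriter Open(string indexPath)
        {
            IndexWriter writer =
                new IndexWriter(indexPath, new StandardAnalyzer(), true);

            // In 1.3-era Lucene, mergeFactor controls how many documents
            // are buffered in RAM before a segment is flushed to disk, and
            // how many segments get merged at once. Higher values index
            // faster but hold more in memory; 10 is the default. The member
            // name/casing varies by port (mergeFactor in the Java original;
            // later versions add minMergeDocs / maxBufferedDocs for finer
            // control of the RAM buffer).
            writer.mergeFactor = 10;

            return writer;
        }
    }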
Regards
Garrett
-Original Message-
From: Homam S.A. [mailto:[EMAIL PROTECTED]]
Sent: 15 December 2004 02:43
To: Lucene Users List
Subject: Indexing a large number of DB records
I'm trying to index a large number of records from the
DB (a few million). Each record will be stored as a
document with about 30 fields, most of them UnStored
and representing small strings or numbers. No huge DB
Text fields.
But I'm running out of memory very fast, and the
indexing is slowing down.
Hello,
There are a few things you can do:
1) Don't just pull all rows from the DB at once. Do it in batches.
2) If you can get a Reader from your SqlDataReader, consider passing it
straight to a Field.Text(name, reader)-style field so large values are
streamed to the analyzer instead of built up as strings - see the sketch
below.
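A sketch of the idea (ADO.NET gives you no TextReader directly, so in
practice you would wrap SqlDataReader.GetChars(...) in your own TextReader
subclass; that wrapper is left out here, and the names are illustrative):

    using System.IO;
    using Lucene.Net.Documents;

    public class StreamedField
    {
        // Field.Text(name, TextReader) makes Lucene pull characters from
        // the reader while tokenizing, instead of you materializing the
        // whole value as a string first. The resulting field is indexed
        // and tokenized but not stored.
        public static Document FromReader(string id, TextReader body)
        {
            Document doc = new Document();
            doc.Add(Field.Keyword("id", id));
            doc.Add(Field.Text("body", body));
            return doc;
        }
    }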
Thanks Otis!
What do you mean by building it in batches? Does it
mean I should close the IndexWriter every 1000 rows
and reopen it? Does that release references to the
document objects so that they can be
garbage-collected?
I'm calling optimize() only at the end.
I agree that 1500 documents is