RE: Indexing a large number of DB records

2004-12-16 Thread Garrett Heaver
Note that this really includes some extra steps. You don't need a temp index. Add everything to a single index using a single IndexWriter instance. No need to call addIndexes().
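A minimal sketch of that single-writer approach, using the Lucene 1.4-era Java API that was current at the time (the index path, field names, and row loop are placeholders):

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class SingleWriterIndexing {
        public static void main(String[] args) throws Exception {
            // One writer for the whole run: no temp index, and no
            // IndexWriter.addIndexes() merge step at the end.
            IndexWriter writer = new IndexWriter("/tmp/main-index",
                    new StandardAnalyzer(), true); // true = create a new index

            for (int i = 0; i < 1000; i++) { // stand-in for the real row loop
                Document doc = new Document();
                doc.add(Field.Keyword("id", Integer.toString(i)));
                doc.add(Field.UnStored("body", "row text " + i));
                writer.addDocument(doc); // goes straight into the one index
            }

            writer.optimize(); // once, at the very end
            writer.close();
        }
    }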

Re: Indexing a large number of DB records

2004-12-15 Thread Otis Gospodnetic
Hello Homam, The batches I was referring to were batches of DB rows. Instead of SELECT * FROM table... do SELECT * FROM table ... LIMIT Y OFFSET X. Don't close the IndexWriter - use a single instance. There is no MakeStable()-like method in Lucene, but you can control the number of in-memory documents.
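A sketch of that batching loop, assuming a JDBC source and a database that supports LIMIT/OFFSET (the connection URL, table, and columns are made up; a dotlucene/ADO.NET version would differ only in the data-access calls):

    import java.sql.*;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class BatchedDbIndexer {
        public static void main(String[] args) throws Exception {
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/mydb"); // hypothetical DB
            IndexWriter writer = new IndexWriter("/tmp/db-index",
                    new StandardAnalyzer(), true);

            int batchSize = 1000;
            for (int offset = 0; ; offset += batchSize) {
                // Pull one batch of rows instead of the whole table at once.
                Statement stmt = conn.createStatement();
                ResultSet rs = stmt.executeQuery(
                        "SELECT id, title, body FROM records ORDER BY id"
                        + " LIMIT " + batchSize + " OFFSET " + offset);
                int rows = 0;
                while (rs.next()) {
                    Document doc = new Document();
                    doc.add(Field.Keyword("id", rs.getString("id")));
                    doc.add(Field.Text("title", rs.getString("title")));
                    doc.add(Field.UnStored("body", rs.getString("body")));
                    writer.addDocument(doc); // same writer for every batch
                    rows++;
                }
                rs.close();
                stmt.close();
                if (rows < batchSize) break; // ran out of rows
            }

            writer.optimize(); // only once, after all batches
            writer.close();
            conn.close();
        }
    }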

RE: Indexing a large number of DB records

2004-12-15 Thread Garrett Heaver
Ensuring that the documents go into the index sequentially, your problem is solved, and memory usage on mine (dotlucene 1.3) is low. Regards, Garrett

RE: Indexing a large number of DB records

2004-12-15 Thread Otis Gospodnetic

Indexing a large number of DB records

2004-12-14 Thread Homam S.A.
I'm trying to index a large number of records from the DB (a few million). Each record will be stored as a document with about 30 fields, most of them UnStored, representing small strings or numbers. No huge DB Text fields. But I'm running out of memory very fast, and the indexing is slowing down.
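For context on the field types: in the Lucene 1.x API, an "UnStored" field is indexed and tokenized but not stored in the index. A document like the one described would be assembled roughly like this (the field names and ResultSet accessors are invented):

    // Hypothetical row-to-Document mapping for a ~30-field record.
    Document doc = new Document();
    doc.add(Field.Keyword("id", rs.getString("id")));        // stored, not tokenized
    doc.add(Field.UnStored("name", rs.getString("name")));   // indexed only, not stored
    doc.add(Field.UnStored("price", rs.getString("price"))); // numbers indexed as strings
    // ... and so on for the remaining small fields ...
    writer.addDocument(doc);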

Re: Indexing a large number of DB records

2004-12-14 Thread Otis Gospodnetic
Hello, There are a few things you can do: 1) Don't just pull all rows from the DB at once. Do that in batches. 2) If you can get a Reader from your SqlDataReader, consider this:
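The truncated suggestion presumably refers to Lucene's Reader-based field, which streams large text into the index without first materializing it as one big string. SqlDataReader is .NET; the Java/JDBC equivalent of the idea looks roughly like this:

    // Sketch: stream a large text column into an unstored, tokenized field
    // (Lucene 1.4-era API; the column names are hypothetical).
    java.io.Reader bodyReader = rs.getCharacterStream("body");
    Document doc = new Document();
    doc.add(Field.Keyword("id", rs.getString("id")));
    doc.add(Field.Text("body", bodyReader)); // tokenized from the Reader, not stored
    writer.addDocument(doc); // Lucene consumes the Reader during analysis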

Re: Indexing a large number of DB records

2004-12-14 Thread Homam S.A.
Thanks Otis! What do you mean by building it in batches? Does it mean I should close the IndexWriter every 1000 rows and reopen it? Does that release references to the document objects so that they can be garbage-collected? I'm calling optimize() only at the end. I agree that 1500 documents is
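Per the reply above, the answer is no: keep one IndexWriter open for the whole run. The writer does not hold on to your Document objects after addDocument() returns, so they can be garbage-collected without closing anything; memory is instead bounded by the writer's buffering settings. A sketch of those knobs as they existed around Lucene 1.4, where they were public fields (later versions moved to setters such as setMaxBufferedDocs()):

    IndexWriter writer = new IndexWriter("/tmp/db-index",
            new StandardAnalyzer(), true);

    // Lucene 1.4-era tuning knobs (public fields at the time):
    writer.minMergeDocs = 100; // docs buffered in RAM before flushing a segment
    writer.mergeFactor = 10;   // segments that accumulate before a merge

    // ... addDocument() in a loop: one writer, no close/reopen per batch ...

    writer.optimize(); // merge down once, at the end
    writer.close();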